Schema Inconsistency

Rushil09

We have multiple CSV files being read from Amazon S3, and we found that their schemas are inconsistent. How can we handle this in Dataiku?

As far as I know, it will only read the files whose schema matches that of the first file.


Operating system used: Ubuntu

AlexT
Dataiker

Hi @Rushil09,

In this case, it would be better to upload the files with different schemas to a Folder.

Then use the "Files in folder" dataset (+Dataset - Internal - Files in folder) to create separate datasets from the files, and stack them with the Stack recipe as needed.

With a "Files in folder" dataset you can also specify which file the schema is read from.
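If you would rather handle the mismatch in code (for example inside a Python recipe), the stacking idea can be sketched roughly as below. This is only an illustration using the standard `csv` module, not what the Stack recipe does internally: `stack_csv_texts`, `file_a` and `file_b` are hypothetical names, the sample data is made up, and reading the actual files from the folder is left out. It assumes the output columns should be the union of all headers, with missing values filled by an empty string.

```python
import csv
import io


def stack_csv_texts(csv_texts, fill_value=""):
    """Stack CSV documents whose headers differ into one table.

    Returns (columns, rows): columns is the union of all headers in
    first-seen order; rows are dicts where columns absent from a
    given file are set to fill_value.
    """
    columns = []
    records = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        # fieldnames is None for an empty document, hence the fallback.
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        records.extend(reader)
    rows = [
        {col: rec.get(col, fill_value) for col in columns}
        for rec in records
    ]
    return columns, rows


# Hypothetical sample files with inconsistent schemas.
file_a = "id,name\n1,alice\n2,bob\n"
file_b = "id,city,name\n3,Paris,carol\n"

columns, rows = stack_csv_texts([file_a, file_b])
```

Here `columns` ends up as `id`, `name`, `city`, and the rows coming from `file_a` get an empty `city`. Matching columns by name rather than by position is a design choice for this sketch; if your files have the same columns in a different order or with different names, you would need a renaming step first.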

 

Hope that helps!
