column names order change

Solved!
dialpemo
Level 2

I noticed that when I use flat files such as Excel, Dataiku does not pick up a change in the header order.

Is there a way for Dataiku to detect this change without manually dropping the schema?

The issue I have is this: suppose column 1 was currency 1 and column 2 was currency 2.

Someone then changes the column order in the Excel file (the names remain the same). When I update the file in Dataiku by deleting the old file and uploading the new one, the headers and values no longer match the input file.


Operating system used: linux

MiguelangelC
Dataiker

Hi dialpemo,

Schema updates caused by changes in the headers, but not in the data, should be done manually rather than automated. Automating this kind of schema change can hide problems with the data fed to the flow and potentially break it, though in this case I understand where you are coming from. Once the schema of a dataset is built, the only check available tests its consistency, not whether the headers match the new dataset. In your case both columns are possibly of the decimal type, so the schema integrity is not affected.
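To illustrate why a type-only consistency check cannot catch the swap, here is a toy model in plain Python. It is not Dataiku's implementation: the function names, the schema layout, and the assumption that stored column names are applied to values by position are all hypothetical, chosen only to mirror the behaviour described in this thread.

```python
def apply_schema_by_position(schema, row):
    """Label each value with the schema column at the same position."""
    return {col["name"]: value for col, value in zip(schema, row)}


def types_consistent(schema, row):
    """Toy consistency check: every value parses as its declared type."""
    parsers = {"double": float, "string": str}
    try:
        for col, value in zip(schema, row):
            parsers[col["type"]](value)
    except ValueError:
        return False
    return True


# Schema stored when the file was first uploaded (hypothetical names).
stored_schema = [
    {"name": "currency_1", "type": "double"},
    {"name": "currency_2", "type": "double"},
]

# New file: same column names, but in swapped order.
new_header = ["currency_2", "currency_1"]
new_row = ["0.85", "1.10"]

# The currency_2 value (0.85) ends up labelled as currency_1.
labelled = apply_schema_by_position(stored_schema, new_row)

# Both columns are numeric, so the type check still passes.
check_passes = types_consistent(stored_schema, new_row)

# Only an explicit header comparison reveals the problem.
header_matches = [c["name"] for c in stored_schema] == new_header
```

Because both columns share the same type, the only signal that something changed is the header row itself.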

If you are using the same recipe or managed folder for inputting the same data, its associated metadata should always be the same.

Still, you can always delete all the columns in your dataset under Settings > Schema. This will prompt the appearance of a new button "Reload Schema Using detected Data" to rebuild it considering the new column order.
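If you want a warning before deciding to reload the schema, one option is to compare the header row of the new file with the column order you expect. The following is a minimal sketch using only the Python standard library. The helper names and the expected column list are assumptions for illustration, and it reads CSV text to stay self-contained; for an Excel file you would need to read the first row with a spreadsheet library instead.

```python
import csv
import io


def read_header(text_stream):
    """Return the first row of a CSV stream as a list of column names."""
    return next(csv.reader(text_stream))


def header_status(expected, header):
    """Classify a header as 'match', 'reordered', or 'different'."""
    if header == expected:
        return "match"
    if sorted(header) == sorted(expected):
        # Same names, different positions: the case from this thread.
        return "reordered"
    return "different"


# Column order recorded in the existing schema (hypothetical names).
expected_columns = ["currency_1", "currency_2"]

# Stand-in for the newly uploaded file.
sample = io.StringIO("currency_2,currency_1\n0.85,1.10\n")

status = header_status(expected_columns, read_header(sample))
```

A "reordered" result indicates that the schema should be reloaded manually as described above, while "different" points to a larger change that deserves a closer look.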

dialpemo
Level 2
Author

Hi @MiguelangelC ,

I thought as much. Thank you for your prompt response.

Kind Regards,