Is it possible to disable Dataiku data type detection?

wvde · August 2023

Hello,

Is it possible to disable Dataiku's automatic data type detection? I find this feature to be more trouble than it is worth and would prefer to have everything read in and kept as a string unless I explicitly cast it to something else.

Some specific troubles that relate to this are:

(1) Auto-detecting ID columns as integers rather than strings for new files

(2) Detected types being determined from the first x records of a union, which happen to be nulls, so the type is forced to bigint rather than double.

Thanks,

Operating system used: Windows

Miasm1 · August 2023

Hello,

Yes, you can adjust Dataiku's automatic data type detection:

During data import, select "Advanced" and choose to read all columns as strings.

To address your concerns:

(1) Manually set ID columns as strings during import.

(2) Set types before union operations to prevent incorrect type inference.

Always review the schema after actions to ensure correctness.
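
As a rough illustration of why both points matter, here is a minimal sketch in plain pandas. It does not use Dataiku's own type detection, and the file contents and the column names customer_id and amount are made up for the example. It only shows the same class of problem: an ID column losing its leading zeros under default inference, and a union staying consistent once every input is read with the same explicit types.

```python
import io
import pandas as pd

# Hypothetical inputs: the first file has only empty values in "amount".
FILE_A = "customer_id,amount\n00123,\n00456,\n"
FILE_B = "customer_id,amount\n00789,12.5\n"

# Default inference: the ID column is read as an integer,
# so the leading zeros are lost.
inferred = pd.read_csv(io.StringIO(FILE_A))
print(inferred["customer_id"].tolist())

# Explicit types, applied to every input before the union.
schema = {"customer_id": str, "amount": "float64"}
frames = [
    pd.read_csv(io.StringIO(text), dtype=schema)
    for text in (FILE_A, FILE_B)
]
combined = pd.concat(frames, ignore_index=True)
print(combined["customer_id"].tolist())
print(combined["amount"].dtype)
```

With the explicit schema, the IDs stay as text and the amount column is a floating-point column in every input, so the union does not depend on what the first rows happen to contain.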

I hope this will help you!

Jason · September 2023

I have this problem as well, but it extends beyond just the initial data import. Recipes that use Python (and specifically pandas) sample the top of the table to determine data types. I have a field that contains item numbers, and in nearly all cases they are integers, but sometimes they have a letter suffix. The type detection in pandas then treats the field as an integer just long enough to force the schema, and when the rest of the data arrives, the database freaks out about the type mismatch. This occurs in several places, most infuriatingly in the time series resampling recipe.
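
The behaviour described above can be reproduced with a small sketch in plain pandas, outside Dataiku. The data and the column names item_number and qty are hypothetical. Inferring from only the first rows yields an integer type, while the full data contains a value with a letter suffix. Forcing every column to string avoids the mismatch.

```python
import io
import pandas as pd

# Hypothetical data: the third item number carries a letter suffix.
CSV_TEXT = "item_number,qty\n1001,5\n1002,3\n1003A,7\n"

# Inference from a sample of the first two rows only:
# item_number looks like an integer column.
head_sample = pd.read_csv(io.StringIO(CSV_TEXT), nrows=2)
print(head_sample["item_number"].dtype)

# Reading all rows with every column forced to string keeps
# the suffixed value intact and the type stable.
all_strings = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)
print(all_strings["item_number"].tolist())
```

A schema derived from head_sample would declare item_number as an integer, and the value 1003A would then fail when written to a typed database column.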

me2 · September 2023

I had a similar problem involving Dataiku's data type detection. It is definitely an area for improvement.

https://community.dataiku.com/t5/Using-Dataiku/unintended-data-filtering-from-Prepare-Recipe/m-p/37527#M13867

tgb417 · September 2023

I’ve been pointing out these column “duck typing” challenges for a while now. I’ve submitted two product ideas, and it would be great to get further feedback on them to the Dataiku team.

Please consider “voting” for either of these ideas, or adding your own product idea if neither of these covers your use case or suggestion.

https://community.dataiku.com/t5/Product-Ideas/The-ability-to-turn-off-Cell-level-quot-Duck-Typing-quot-within/idi-p/16792

https://community.dataiku.com/t5/Product-Ideas/Override-to-Standard-quot-Duck-Typing-quot-of-Variables-in-DSS/idi-p/31904

psvnm · August 2024

https://community.dataiku.com/discussion/comment/36944#Comment_36944

Hi, I am not able to find the option under Advanced that you mentioned. Could you please check again?
