Schema Change in Dataiku recipe Dataset

sunith992
Level 3

Hi Guys,

Is there any way to convert the storage type of a column to the required type? Dataiku detects the schema automatically when it creates the recipe's output dataset, and it is becoming harder day by day to work around these changes. Please share your suggestions. Before responding, please check what I have already tried, described below.

- For example, I have an int column that now shows as double (inferred automatically). I tried converting it back to int (the required type) by changing the schema in the Schema tab under Settings, which caused chaos: every value in the column became empty.

- I also checked the column view option on the dataset sample in a visual recipe. I do not see an option for INT, and the values do not show as expected after selecting text.

4 Replies
AlexT
Dataiker

Hi,
Indeed, some visual recipes will infer the type.

For example, in a Prepare recipe, you can change the type directly by clicking on the column itself and selecting the type. Can you try this option and see if you still see the same behavior?

The output dataset will then have int. If you want to propagate this change across the flow, you can use the schema propagation tool:
https://knowledge.dataiku.com/latest/data-preparation/pipelines/concept-schema-propagation.html

For a code recipe, if you want to keep the existing schema, you can pass infer_with_pandas=False when reading the dataset: https://doc.dataiku.com/dss/latest/python-api/datasets-data.html#typing-of-dataframes
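
To make the code recipe option concrete, here is a minimal sketch rather than a definitive implementation. The helper whole_floats_to_int and the wrapper run_recipe are hypothetical names of my own, and the dataset and column names are placeholders. It assumes the column holds whole numbers stored as double; how write_with_schema maps a nullable Int64 column to a DSS storage type is an assumption you should verify on your version.

```python
import pandas as pd


def whole_floats_to_int(df, column):
    """Return a copy of df with `column` cast from float to an integer dtype.

    Raises ValueError instead of silently truncating fractional values.
    """
    out = df.copy()
    values = out[column].astype("float64")
    present = values.dropna()
    if not (present == present.round()).all():
        raise ValueError(f"{column!r} holds non-integer values")
    # Plain int64 cannot hold missing values, so fall back to nullable Int64.
    target = "Int64" if values.isna().any() else "int64"
    out[column] = values.astype(target)
    return out


def run_recipe(input_name, output_name, column):
    """Hypothetical recipe body; only runs inside DSS, where `dataiku` exists."""
    import dataiku

    # infer_with_pandas=False keeps the dtypes implied by the dataset schema
    # instead of letting pandas guess them from the data.
    df = dataiku.Dataset(input_name).get_dataframe(infer_with_pandas=False)
    # Assumption: verify how your DSS version stores a nullable Int64 column.
    dataiku.Dataset(output_name).write_with_schema(whole_floats_to_int(df, column))
```

The cast refuses fractional values on purpose, so a column that really is a double does not get truncated without notice.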


sunith992
Level 3
Author

Hi Alex,

Thanks for the response. I tried the above (clicking on the column) but still no luck. It is the same as I mentioned in my first post: all the values are converted to empty values.

Please let me know how I should convert these values directly.

 

sunith992
Level 3
Author

I found one solution below.

- In the Prepare recipe, clicking on the column and using the "Round to integer" option converts it to integer, and it worked for me. Is that what I am supposed to use?

But I am really looking for a step in the recipe script where I can convert the type by specifying the input format and output format, so that we get the intended result whenever the recipe step runs (instead of clicking through the column options in the Explore/sample view).

AlexT
Dataiker

Hi,
Would a formula step using format help in your case?
https://knowledge.dataiku.com/latest/data-preparation/formulas/index.html
You should be able to convert double to int using format, similar to what you did with the round to integer option.

As for the ability to "convert the type by inputting/mentioning the input format and output format," I don't see this capability as currently available. I would suggest submitting it as a product idea: https://community.dataiku.com/t5/Product-Ideas/idb-p/Product_Ideas
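
In the meantime, a Python code recipe can approximate a declared per-column conversion. The following is only a sketch in plain pandas: apply_casts, the "int" and "double" labels, and the sample column names are all hypothetical and not part of any Dataiku API. Note that unparsable entries become missing values here, which may be the same effect you saw when the column turned empty.

```python
import pandas as pd


def apply_casts(df, casts):
    """Cast columns per a mapping of column name -> "int" or "double".

    The labels are hypothetical; they are not Dataiku type names.
    """
    out = df.copy()
    for column, target in casts.items():
        # Unparsable entries become missing values rather than raising.
        numeric = pd.to_numeric(out[column], errors="coerce")
        if target == "int":
            # Round first, then use nullable Int64 so missing values survive.
            out[column] = numeric.round().astype("Int64")
        elif target == "double":
            out[column] = numeric.astype("float64")
        else:
            raise ValueError(f"unsupported target type: {target}")
    return out


# Placeholder data standing in for a recipe input
sample = pd.DataFrame({"qty": ["1", "2.0", "n/a"], "price": [3, 4, 5]})
result = apply_casts(sample, {"qty": "int", "price": "double"})
print(result.dtypes)
```

Keeping the mapping in one dictionary means the conversion is applied the same way on every run, without clicking through the sample view.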

Thanks