Dataiku recipes don't keep the data types in output datasets

Solved!
acuesta
Level 2

DB: MySQL Server 5.6



Dataiku: 4.1



I have been trying to avoid the varchar(500) and bigint types that Dataiku generates in the output dataset as a consequence of applying a recipe to an input dataset.



My objective is: if in my input dataset a column name is of type varchar(25), after applying a recipe the output dataset should still have the same type varchar(25) instead of the varchar(500) that it generates.



I have not been able to figure out how to do this.



I tried manually setting the storage type for that column in the recipe, but it doesn't seem to work, even if I edit the maxlength in the JSON manually.



I know that I can edit the CREATE TABLE statement from the Advanced tab of the output dataset, but that is not something I would like to maintain.



The main problem I found is the following:





 





Thanks and regards,



Adrián

1 Solution
matthias_funke
Dataiker Alumni
Adrian, I can really only think of workarounds. What are your thoughts?

1. You already mentioned that you can edit the "create table" statement. You could potentially do that in Python code using the DSS API.

2. Could you use a DSS managed dataset instead of MySQL as the back end? You wouldn't have the same limitations there.

3. Could you use PostgreSQL instead of MySQL?
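
For workaround 1, here is a minimal sketch of how the "create table" statement could be generated from a column list instead of being maintained by hand. Everything in it is an assumption made for illustration: the helper names (`mysql_column_type`, `build_create_table`), the type mapping, and the column format (dicts with `name`, `type`, and an optional `maxLength`). It only builds the SQL text; pushing that text into the dataset settings through the DSS API is left out, because the exact setting names are not confirmed in this thread.

```python
# Illustrative mapping from storage type names to MySQL column types.
# This table is an assumption for the sketch, not a list taken from DSS.
TYPE_TO_MYSQL = {
    "string": "VARCHAR",
    "int": "INT",
    "bigint": "BIGINT",
    "double": "DOUBLE",
}

# Fallback length, matching the varchar(500) reported in the question.
DEFAULT_VARCHAR_LENGTH = 500


def mysql_column_type(column):
    """Return the MySQL type for one column dict (hypothetical format)."""
    storage_type = column["type"]
    if storage_type not in TYPE_TO_MYSQL:
        raise ValueError("No MySQL mapping for type %r" % storage_type)
    sql_type = TYPE_TO_MYSQL[storage_type]
    if sql_type == "VARCHAR":
        # Keep the declared length when there is one, e.g. varchar(25).
        length = column.get("maxLength")
        if not length or length <= 0:
            length = DEFAULT_VARCHAR_LENGTH
        return "VARCHAR(%d)" % length
    return sql_type


def build_create_table(table, columns):
    """Build a CREATE TABLE statement; names are assumed to contain no backticks."""
    lines = ["  `%s` %s" % (c["name"], mysql_column_type(c)) for c in columns]
    return "CREATE TABLE `%s` (\n%s\n)" % (table, ",\n".join(lines))


if __name__ == "__main__":
    columns = [
        {"name": "name", "type": "string", "maxLength": 25},
        {"name": "id", "type": "int"},
    ]
    print(build_create_table("customers", columns))
```

The idea is that the lengths come from the input dataset's column definitions, so the generated statement stays in sync with the input and nothing needs to be edited manually in the Advanced tab.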

4 Replies
matthias_funke
Dataiker Alumni
Adrian, one of the ideas underlying Dataiku is to abstract away some of the "messy details" of data management. So perhaps you could elaborate a bit on why you want to force the storage type? Do you mainly use visual recipes or code recipes? Understanding what you want to achieve will help in giving a better answer!
acuesta
Level 2
Author
I am using visual recipes.
I updated the description with the problem that led me to try to force the types.
acuesta
Level 2
Author
Bump!