Submit your inspiring success story or innovative use case to the 2022 Dataiku Frontrunner Awards! ENTER YOUR SUBMISSION

SQL: Automate schema adjustment should the job fail because of Data Truncation: Data too long error

0 Kudos

Automate schema adjustment should the job fail because of Data Truncation: Data too long error

My & my team work with text data, using MySQL db, and very often we run into the error related to Data Truncation: Data too long for column XYZ

By default, DSS is using varchar(500) as default for writing strings to MySQL:

AntonB_0-1654092477521.jpeg

 

then I get an error specifying the problem and the column name

AntonB_1-1654092477069.jpeg

 

I go into settings->advanced and change the schema to text or longtext

AntonB_2-1654092477224.jpeg

 

I run again and it works fine, Job succeeded 

 

Apparently, DSS is aware of the error type and also the field name, so from my point of view, it should be able to adjust the schema by itself from varchar(500) to text or longtext and load the data again without human intervention. It would save us loads of time to have this automated fix of schema implemented.

2 Comments
apichery
Dataiker
Dataiker

Hi Anton,

This is an interesting idea, thanks for submitting it! One thing to consider though is that switching from VARCHAR to LONGTEXT has some consequences as LONGTEXT is not stored "inline" with the other columns in the row by MySQL. It is stored separately and - as a consequence - has some non-negligible performance impacts on some queries.

So, in some cases it might more interesting to use for example VARCHAR(600)  instead of LONGTEXT. Not something that Dataiku can easily infer.

Anyway, we've logged your request and we'll see if/when/how we can schedule it!

CoreyS
Community Manager
Community Manager
Status changed to: In Backlog