SQL: Automate schema adjustment should the job fail because of Data Truncation: Data too long error

Automate schema adjustment should the job fail because of Data Truncation: Data too long error

My & my team work with text data, using MySQL db, and very often we run into the error related to Data Truncation: Data too long for column XYZ

By default, DSS is using varchar(500) as default for writing strings to MySQL:

AntonB_0-1654092477521.jpeg

 

then I get an error specifying the problem and the column name

AntonB_1-1654092477069.jpeg

 

I go into settings->advanced and change the schema to text or longtext

AntonB_2-1654092477224.jpeg

 

I run again and it works fine, Job succeeded 

 

Apparently, DSS is aware of the error type and also the field name, so from my point of view, it should be able to adjust the schema by itself from varchar(500) to text or longtext and load the data again without human intervention. It would save us loads of time to have this automated fix of schema implemented.

3 Comments
apichery
Dataiker

Hi Anton,

This is an interesting idea, thanks for submitting it! One thing to consider though is that switching from VARCHAR to LONGTEXT has some consequences as LONGTEXT is not stored "inline" with the other columns in the row by MySQL. It is stored separately and - as a consequence - has some non-negligible performance impacts on some queries.

So, in some cases it might more interesting to use for example VARCHAR(600)  instead of LONGTEXT. Not something that Dataiku can easily infer.

Anyway, we've logged your request and we'll see if/when/how we can schedule it!

Hi Anton,

This is an interesting idea, thanks for submitting it! One thing to consider though is that switching from VARCHAR to LONGTEXT has some consequences as LONGTEXT is not stored "inline" with the other columns in the row by MySQL. It is stored separately and - as a consequence - has some non-negligible performance impacts on some queries.

So, in some cases it might more interesting to use for example VARCHAR(600)  instead of LONGTEXT. Not something that Dataiku can easily infer.

Anyway, we've logged your request and we'll see if/when/how we can schedule it!

CoreyS
Dataiker Alumni
 
Looking for more resources to help you use Dataiku effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!
Status changed to: In Backlog
 
MichaelG
Community Manager
Community Manager
 
I hope I helped! Do you Know that if I was Useful to you or Did something Outstanding you can Show your appreciation by giving me a KUDOS?

Looking for more resources to help you use DSS effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!
Status changed to: In the Backlog