Dataset overwritten instead of error
When building datasets, I have noticed that when the schema produced by a recipe changes, the output dataset is fully overwritten, data and all. This means that when a recipe suddenly returns a different schema, all previous data is lost.
Previously we would get an error message in this case and would not lose any data unexpectedly, but after upgrading from 8.0 to 12.3 this no longer happens. Is there a setting we need to change to restore the old behaviour? It is becoming quite a problem: some 'dynamic' steps such as pivots and folds can quickly return a somewhat different schema depending on the input, which causes big problems for the flow as well as for historical data.
When we knowingly change the schema inside a recipe and save it, we do get an error message like the one below. But if we do not save explicitly and instead run the recipe, it gets saved anyway and the output dataset is dropped and recreated. Previously this message would also pop up when running the recipe; is it possible to get that behaviour back?
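For context, the check we are missing is essentially a schema-compatibility guard before the build. This is an illustrative sketch, not Dataiku's actual API: `check_schema_compatible` and the column-dict shape are assumptions, meant only to show the kind of comparison that would raise an error instead of silently dropping and recreating the dataset.

```python
# Hypothetical pre-build guard: compare the recipe's proposed output schema
# against the dataset's existing schema and fail loudly on any mismatch,
# rather than overwriting the dataset. Schemas are modeled as lists of
# {"name": ..., "type": ...} column dicts (an assumption for illustration).

def check_schema_compatible(existing, proposed):
    old = {c["name"]: c["type"] for c in existing}
    new = {c["name"]: c["type"] for c in proposed}
    dropped = sorted(set(old) - set(new))            # columns that would vanish
    added = sorted(set(new) - set(old))              # columns that would appear
    retyped = sorted(n for n in set(old) & set(new)  # columns changing type
                     if old[n] != new[n])
    if dropped or added or retyped:
        raise ValueError(
            f"Incompatible schema change: dropped={dropped}, "
            f"added={added}, retyped={retyped}"
        )

existing = [{"name": "id", "type": "bigint"}, {"name": "amount", "type": "double"}]
proposed = [{"name": "id", "type": "bigint"}, {"name": "amount", "type": "string"}]
# check_schema_compatible(existing, proposed) raises: retyped=['amount']
```

A guard like this run at build time (for example in a pre-build step of a scenario) would catch the pivot/fold schema drift described above before any data is lost.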
Answers
-
Turribeach
Administration ⇒ Settings ⇒ Other ⇒ Misc. ⇒ Schema incompatibilities ⇒ Auto-accept schema changes at end of flow
-
For me it's under Administration ⇒ Settings ⇒ Engines & Flow ⇒ Flow build ⇒ Schema incompatibilities ⇒ Auto-accept schema changes at end of flow. I am on 12.3.1, if that matters.
It was disabled originally, but enabling it did not change anything, unfortunately. Does DSS need to be restarted after changing this setting? Whether it is enabled or disabled, the data is still overwritten without any error message.