Make Dataiku Managed Datasets Less Opinionated (aka stop dropping my tables)

importthepandas

After 11.4.0 (or earlier, as we upgraded from 11.0.3), Dataiku now defaults to dropping and re-creating the table when using the Dataset Python APIs if, for some reason, the dataset schema and the underlying table do not match. It does this silently and the jobs pass; only later do we find out that we've lost our history in the base table (Snowflake in our case).

This is really scary default behavior. Dataiku should default to throwing errors and stopping jobs vs dropping tables and re-creating them. Even better, the user should be able to intelligently control this behavior.

"Don't do exotic things, importthepandas, and you'll be ok" - sure. However, if someone alters a data type in Snowflake and forgets to re-sync schemas in Dataiku, DSS should not drop my table.
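
To make "throw an error instead of dropping" concrete, here is a minimal sketch of the kind of guard I mean. It is not Dataiku's implementation. The function and exception names are hypothetical, and it assumes both schemas are already available as lists of `{"name": ..., "type": ...}` dicts with types normalized to the same vocabulary (I believe `dataiku.Dataset(...).read_schema()` returns the DSS side in roughly this shape; the Snowflake side would have to be fetched and mapped separately).

```python
class SchemaMismatchError(Exception):
    """Hypothetical error raised instead of letting a write drop the table."""


def schema_diff(dataset_columns, table_columns):
    """Return a list of human-readable differences between two schemas.

    Each argument is a list of dicts with "name" and "type" keys.
    Assumption: types on both sides use the same vocabulary already.
    """
    ds = {c["name"].lower(): c["type"].lower() for c in dataset_columns}
    tb = {c["name"].lower(): c["type"].lower() for c in table_columns}

    problems = []
    for name in sorted(ds.keys() - tb.keys()):
        problems.append(f"column {name!r} is missing from the table")
    for name in sorted(tb.keys() - ds.keys()):
        problems.append(f"column {name!r} is missing from the dataset schema")
    for name in sorted(ds.keys() & tb.keys()):
        if ds[name] != tb[name]:
            problems.append(
                f"column {name!r}: dataset says {ds[name]}, table says {tb[name]}"
            )
    return problems


def assert_schemas_match(dataset_columns, table_columns):
    """Stop the job loudly on any mismatch rather than rewriting the table."""
    problems = schema_diff(dataset_columns, table_columns)
    if problems:
        raise SchemaMismatchError("; ".join(problems))
```

Calling `assert_schemas_match(...)` before any write would fail the job in exactly the scenario above, where a column type was changed in Snowflake but the dataset schema was never re-synced.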

Give us more flexibility for the drop/re-create behavior.

3 votes

Released · Last Updated

Comments

  • tgb417

    This is particularly important when working with slow-to-acquire datasets. I've lost a few days to a dataset getting dropped in similar scenarios.

  • apichery (Dataiker)

    We fixed the issue in DSS 11.4.1 and above.

  • Turribeach

    Great to see this fixed!

  • importthepandas

    @apichery So the Python dataset methods no longer drop the table by default?
