Column descriptions lost with next recipe

Wuser92
Wuser92 Registered Posts: 20 ✭✭✭✭
Is there a way to keep column descriptions throughout a pipeline? If I add column descriptions at the beginning of the pipeline, it seems I need to add them again for every output dataset downstream. Is this expected, or am I doing something wrong?

Thanks,
Simon

Answers

  • Alex_Reutter
    Alex_Reutter Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer Posts: 105 ✭✭✭✭✭✭✭
    Hi,

    When you add a column description at the beginning of a pipeline, that's a change to the schema of that dataset. That schema change needs to be propagated to all datasets along the pipeline: https://answers.dataiku.com/1237/is-there-a-way-to-propagate-schema-changes-in-a-whole-flow
  • Wuser92
    Wuser92 Registered Posts: 20 ✭✭✭✭
    Thanks Alex. I've tried propagating it before, but all the schema checks say everything is already propagated. Even dropping the output datasets and deleting their schemas doesn't help. The only way I can get it to work is to replace every recipe along the flow with a new one and manually re-add all the steps. It seems the column descriptions never make it through an existing recipe: even a copy of an existing recipe drops them.
  • Alex_Reutter
    Alex_Reutter Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer Posts: 105 ✭✭✭✭✭✭✭
    After I propagate the schema changes, I run a smart reconstruction build of the final dataset in the pipeline, and then the descriptions I added in the first dataset show up.
  • Wuser92
    Wuser92 Registered Posts: 20 ✭✭✭✭
    Even a smart or forced rebuild doesn't work in my case. I'm adding the descriptions in the middle of the flow. Do the descriptions need to be at the very beginning of the flow?

    Here are some screenshots:
    soep_selected input dataset with column descriptions (right after adding them in the visual recipe): https://snag.gy/I5RFHV.jpg
    Flow with all schemas propagated: https://snag.gy/w95LRC.jpg
    soep_cleaned output dataset missing the descriptions: https://snag.gy/WZxuiH.jpg

    I'm using Dataiku Version 4.2.0.
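
If propagation through the UI doesn't carry the descriptions over, one possible workaround is to copy them between dataset schemas with a script. The sketch below is not a confirmed fix for this thread. It assumes a schema is a dict with a "columns" list and that each column's description lives in a "comment" field, which may not hold for every DSS version. The function name and the column names in the example are hypothetical.

```python
import copy


def copy_column_descriptions(source_schema, target_schema):
    """Return a copy of target_schema with descriptions filled in from source_schema.

    Columns are matched by name. A non-empty description already present
    in the target is kept as-is. Assumes descriptions are stored under
    the "comment" key of each column (an assumption, see lead-in).
    """
    descriptions = {
        col["name"]: col["comment"]
        for col in source_schema.get("columns", [])
        if col.get("comment")
    }
    updated = copy.deepcopy(target_schema)
    for col in updated.get("columns", []):
        if not col.get("comment") and col["name"] in descriptions:
            col["comment"] = descriptions[col["name"]]
    return updated


# Hypothetical schemas standing in for soep_selected and soep_cleaned.
selected = {"columns": [
    {"name": "person_id", "type": "string", "comment": "Unique person identifier"},
    {"name": "income", "type": "double", "comment": "Yearly income"},
]}
cleaned = {"columns": [
    {"name": "person_id", "type": "string"},
    {"name": "income", "type": "double"},
]}

patched = copy_column_descriptions(selected, cleaned)
for col in patched["columns"]:
    print(col["name"], "->", col.get("comment"))

# With the dataikuapi client this could be applied roughly as follows
# (host, API key and project key are placeholders):
#   import dataikuapi
#   client = dataikuapi.DSSClient("http://localhost:11200", "YOUR_API_KEY")
#   project = client.get_project("MYPROJECT")
#   source = project.get_dataset("soep_selected")
#   target = project.get_dataset("soep_cleaned")
#   target.set_schema(copy_column_descriptions(source.get_schema(), target.get_schema()))
```

Matching by name means renamed columns are skipped, so those would still need their descriptions added by hand.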