Is there a way to keep the column descriptions along the pipeline? If I add column descriptions at the beginning of the pipeline, it seems like I need to add them again for every output along the pipeline. Is this the case or am I doing something wrong?
When you add a column description at the beginning of a pipeline, that's a change to the schema of that dataset. That schema change needs to be propagated to all datasets along the pipeline: https://answers.dataiku.com/1237/is-there-a-way-to-propagate-schema-changes-in-a-whole-flow
Even a smart or forced rebuild doesn't work in my case. I'm adding the descriptions in the middle of the flow. Do the descriptions need to be at the very beginning of the flow?
Here are some screenshots: (1) soep_selected input dataset with column descriptions (right after adding them in the visual recipe): https://snag.gy/I5RFHV.jpg (2) Flow with all schemas propagated: https://snag.gy/w95LRC.jpg (3) soep_cleaned output dataset missing the descriptions: https://snag.gy/WZxuiH.jpg
Thanks Alex. I've tried propagating it before, but all the schema checks say everything is already propagated. Even dropping and deleting the schema of the output datasets doesn't help. The only way I can get it to work is to replace every recipe along the flow with a new one and manually re-add all the steps to the new recipes. It seems like the column descriptions somehow don't make it into the existing recipe, since even copying an existing recipe drops the column descriptions.
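In case it helps anyone hitting the same thing, here is a rough sketch of copying the descriptions over by column name instead of rebuilding recipes. It only works on plain dicts, so it runs outside DSS. It assumes a schema looks like {"columns": [{"name": ..., "type": ..., "comment": ...}]} with the description stored under "comment"; that layout, the helper name copy_column_descriptions, and the sample columns (pid, syear, age) are my assumptions for illustration, not something confirmed in this thread.

```python
from copy import deepcopy


def copy_column_descriptions(source_schema, target_schema, key="comment"):
    """Return a copy of target_schema whose empty descriptions are filled
    from source_schema, matching columns by name.

    Assumption: both schemas are dicts shaped like
    {"columns": [{"name": ..., "type": ..., "comment": ...}, ...]}.
    """
    # Collect the non-empty descriptions from the source, keyed by column name.
    descriptions = {
        col["name"]: col[key]
        for col in source_schema.get("columns", [])
        if col.get(key)
    }

    # Work on a copy so the caller's target schema is left untouched.
    merged = deepcopy(target_schema)
    for col in merged.get("columns", []):
        # Only fill in descriptions that are missing or empty in the target.
        if not col.get(key) and col["name"] in descriptions:
            col[key] = descriptions[col["name"]]
    return merged


# Hypothetical sample schemas; the column names are made up for illustration.
soep_selected = {"columns": [
    {"name": "pid", "type": "bigint", "comment": "Person identifier"},
    {"name": "syear", "type": "int", "comment": "Survey year"},
]}
soep_cleaned = {"columns": [
    {"name": "pid", "type": "bigint"},
    {"name": "syear", "type": "int", "comment": ""},
    {"name": "age", "type": "int"},
]}

merged = copy_column_descriptions(soep_selected, soep_cleaned)
```

Columns that only exist in the output (age here) are left without a description, and descriptions already set on the output are not overwritten. Reading the schema from the upstream dataset and writing the merged one back to the downstream dataset would still have to go through whatever schema read/write calls your DSS version exposes, which I have left out on purpose.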