How do I get my column descriptions to persist across the pipeline?
I have an uploaded dataset for which I defined descriptions for all of the columns. I then created two independent recipes on the original dataset: one that performs a simple prepare step, and another that filters the rows from the uploaded dataset. Both output datasets are created in HDFS.
When I first built the two new datasets, none of my descriptions appeared in the downstream datasets. I tried propagating the schema change, but still no descriptions appeared. Then I tried indexing all of the Hive datasets. After that, my descriptions showed up on the filtered dataset, but not on the prepared dataset.
Any idea how to make this work for all of the datasets?
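For reference, here is a minimal sketch of one possible stopgap: copying the descriptions from the source schema onto a downstream schema by column name. It assumes a schema is a dict with a `columns` list whose entries carry `name` and an optional `comment` (the description). The function name `copy_column_comments` and the dataset names in the commented usage are hypothetical, and the commented API calls are an assumption about the Dataiku Python client that should be checked against your version. It does not explain why propagation skips the prepared dataset.

```python
def copy_column_comments(source_schema, target_schema):
    """Return a copy of target_schema with missing comments filled from source_schema.

    Columns are matched by name. A comment already present on the target
    column is kept, and columns with no match in the source are left as-is.
    """
    # Map each described source column name to its description.
    comments = {
        col["name"]: col["comment"]
        for col in source_schema.get("columns", [])
        if col.get("comment")
    }
    updated = dict(target_schema)
    updated["columns"] = [
        {**col, "comment": comments[col["name"]]}
        if col["name"] in comments and not col.get("comment")
        else dict(col)
        for col in target_schema.get("columns", [])
    ]
    return updated


# Hypothetical usage inside DSS (assumed API, dataset names are placeholders):
# import dataiku
# project = dataiku.api_client().get_default_project()
# src = project.get_dataset("uploaded_dataset").get_schema()
# dst = project.get_dataset("prepared_dataset")
# dst.set_schema(copy_column_comments(src, dst.get_schema()))
```

Because the matching is by column name, a column renamed in the prepare step would not pick up its description and would need to be handled separately.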
Operating system used: RHEL 7.9