Missing column descriptions on downstream datasets

VickeyC Registered Posts: 27 ✭✭✭✭

How do I get my column descriptions to persist across the pipeline?

I have an uploaded dataset for which I defined descriptions on all of the columns. I then created two independent recipes on the original dataset: one performs a simple prepare step, and the other filters the rows from the uploaded dataset. Both output datasets are created in HDFS.

When I initially ran the build for the two new datasets, none of my descriptions appeared in the downstream datasets. I tried propagating the schema change, but still no descriptions appeared. Then I tried indexing all of the Hive datasets. After that, my descriptions showed up on the filtered dataset, but not on the prepared dataset.

Any idea how to make this work for all of the datasets?
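
One possible workaround, offered as a sketch and not as a confirmed fix: copy the descriptions from the upstream schema onto the downstream schema by column name. This assumes each schema column is a dict with a `name` key and that the description lives in a `comment` key. The helper name `copy_column_descriptions` and the dataset names in the commented usage are hypothetical.

```python
def copy_column_descriptions(source_columns, target_columns):
    """Return a copy of target_columns in which columns lacking a description
    receive the one from the same-named column in source_columns.

    Assumption: each column is a dict with a "name" key and an optional
    "comment" key holding the description.
    """
    # Map column name -> description for every described source column.
    comments = {c["name"]: c["comment"] for c in source_columns if c.get("comment")}

    merged = []
    for col in target_columns:
        col = dict(col)  # shallow copy so the input list is left untouched
        if not col.get("comment") and col["name"] in comments:
            col["comment"] = comments[col["name"]]
        merged.append(col)
    return merged


# Assumed usage inside a DSS Python notebook (dataset names are hypothetical):
#
#   import dataiku
#   src = dataiku.Dataset("uploaded_dataset")
#   dst = dataiku.Dataset("prepared_dataset")
#   dst.write_schema(copy_column_descriptions(src.read_schema(), dst.read_schema()))
```

Columns that the prepare step renamed would not match by name and would need to be mapped by hand. Descriptions already set on the downstream dataset are left as they are.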


Operating system used: RHEL 7.9


