Missing column header
Hi,
I'm trying to sync a dataset to Google Cloud Storage using a Sync recipe, but the output CSV doesn't have a header row. I found these two threads and followed them: https://community.dataiku.com/t5/Using-Dataiku-DSS/Missing-column-headers-when-syncing-a-table-to-GCS/m-p/9440 and https://community.dataiku.com/t5/Using-Dataiku-DSS/Sync-a-dataset-to-S3-with-headers-no-compression-and-custom-name/m-p/822. However, the output still doesn't have the header.
Any ideas?
Answers
-
Sarina (Dataiker)
Hi @NurulPutri,
The links you referenced do apply to GCP as well, and selecting the option "Parse next line as column headers" in your output dataset should resolve the missing header in the GCP CSV file. The only difference I can think of is that the Sync recipe needs to be re-run for the change to be reflected in your output dataset. Can you make sure that you've re-run the recipe after making the "Parse next line as column headers" adjustment to the output dataset? Also, do you have any partitioning or "append" mode enabled in your Sync recipe?
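After re-running the recipe, one quick way to confirm whether the header made it into the file is to download the CSV and compare its first row against the column names you expect. The snippet below is only a minimal sketch using the Python standard library: the function name `first_row_is_header` and the sample column names are made up for illustration, and it does not call any Dataiku or GCS API.

```python
import csv
import io


def first_row_is_header(csv_text, expected_columns):
    """Return True if the first CSV row equals the expected column names."""
    reader = csv.reader(io.StringIO(csv_text))
    first_row = next(reader, None)  # None when the file is empty
    if first_row is None:
        return False
    return [cell.strip() for cell in first_row] == list(expected_columns)


# Hypothetical file contents, standing in for the CSV downloaded from the bucket.
with_header = "id,name\n1,alice\n2,bob\n"
without_header = "1,alice\n2,bob\n"

print(first_row_is_header(with_header, ["id", "name"]))
print(first_row_is_header(without_header, ["id", "name"]))
```

If the check fails on a freshly rebuilt output, that points to the format setting or the write mode rather than to a stale file.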
Thanks,
Sarina -
I found that the "promote header" processor in a Prepare recipe worked better than doing it in a Sync recipe.
I may have been missing a vital step, but doing it in the Sync recipe forced me to delete the columns in order to promote the headers, and then the data rows didn't flow in, perhaps due to a schema issue.
I tried a few things but couldn't get it to work, whereas the "promote header" processor simply gave me a complete dataset without further fuss.
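For anyone unsure what promoting a header does to the data, the idea can be sketched in plain Python: the first row is taken as the column names and the remaining rows become the records. This is only an illustration of the concept, not how the Dataiku processor is implemented; the function name `promote_first_row` and the sample rows are hypothetical.

```python
def promote_first_row(rows):
    """Use the first row as column names and return (header, records)."""
    if not rows:
        return [], []
    header = [str(value).strip() for value in rows[0]]
    # Every remaining row becomes a dict keyed by the promoted column names.
    records = [dict(zip(header, row)) for row in rows[1:]]
    return header, records


# Hypothetical headerless data whose first row actually holds the column names.
raw_rows = [
    ["id", "name"],
    ["1", "alice"],
    ["2", "bob"],
]

header, records = promote_first_row(raw_rows)
print(header)
print(records)
```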