Missing column header

NurulPutri Registered Posts: 5 ✭✭✭

Hi,

I'm trying to sync a dataset to Google Cloud Storage using a Sync recipe, but the output CSV doesn't have a header. I found these two threads and followed them: https://community.dataiku.com/t5/Using-Dataiku-DSS/Missing-column-headers-when-syncing-a-table-to-GCS/m-p/9440 and https://community.dataiku.com/t5/Using-Dataiku-DSS/Sync-a-dataset-to-S3-with-headers-no-compression-and-custom-name/m-p/822

However, the output still doesn't have the header.

Any ideas?

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker

    Hi @NurulPutri,

    The links you referenced do apply to GCP as well, and selecting the option "Parse next line as column headers" in your output dataset should resolve the missing header in the GCP CSV file.

    [Screenshot: Screen Shot 2021-08-24 at 11.24.58 AM.png]

    The only difference I can think of is that the Sync recipe does need to be re-run in order for the change to be reflected in your output dataset. Can you make sure that you've re-run the recipe after making the "Parse next line as column headers" adjustment to the output dataset? Do you have any partitioning or "append" mode on in your Sync recipe?

    Thanks,
    Sarina
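
After re-running the recipe, one way to confirm whether the header made it into the exported file is to compare its first row against the expected column names. The following is only a minimal sketch using the Python standard library: the function name `has_expected_header`, the sample CSV text, and the column names are all hypothetical, and it assumes the file contents have already been downloaded from the bucket as text.

```python
import csv
import io


def has_expected_header(csv_text, expected_columns, delimiter=","):
    """Return True if the first row of csv_text equals expected_columns."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    first_row = next(reader, None)  # None when the file is empty
    if first_row is None:
        return False
    return [cell.strip() for cell in first_row] == list(expected_columns)


# Hypothetical sample contents; replace with the text of the exported file.
with_header = "id,name\n1,alice\n2,bob\n"
without_header = "1,alice\n2,bob\n"

print(has_expected_header(with_header, ["id", "name"]))
print(has_expected_header(without_header, ["id", "name"]))
```

If the check fails right after changing the output dataset's format settings, the file was most likely written before the change and the recipe still needs to be rebuilt.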

  • chi_wong
    chi_wong Registered Posts: 5 ✭✭✭✭

    I found that the 'promote header' processor in a Prepare recipe worked better than doing it in a Sync recipe.

    I may have been missing a vital step, but doing it in the Sync recipe forced me to delete the columns in order to promote the headers, and then the data rows didn't flow in, perhaps due to a schema issue.

    I tried a few things but couldn't get it to work, whereas the 'promote header' processor simply gave me a completed dataset without further fuss.
