Missing column headers when syncing a table to GCS

ben_p

Hi all,

As the title says, I am syncing a dataset to Google Cloud Storage using the sync recipe. The resulting file does not have any headers. How can I get them into the output?

Also, my file is being sharded, even when I select "Force single output file" in the advanced settings. Is a single output file something we cannot force?

One more thing: can we specify an output name or suffix for the file?

Ben

Best Answer

  • ben_p
    Answer ✓

    Hi all,

    I managed to work this out in the end. Going into the dataset settings and clicking on the "preview" table reveals some more advanced options, including parsing of headers. Interestingly, once I enabled this, the preview no longer showed the correct headers, but the output file was correct.

    [Screenshot: headers.PNG]

    I would suggest that having these settings here is a bit confusing, as it makes it read as though they only affect the preview rather than the actual output file. It would make more sense to have these settings as part of the sync job, since that seems to be when they should be applied.

    Ben
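
For anyone who would rather flip the header option from a script than through the settings screen, here is a minimal sketch of the idea. It only manipulates a plain dictionary standing in for the output dataset's raw settings. The key names `formatParams` and `parseHeaderRow`, and the example values, are assumptions about how the header option is stored, so check them against your own instance first.

```python
import copy


def with_header_row(raw_settings: dict, enabled: bool = True) -> dict:
    """Return a copy of a dataset's raw settings with the header flag set.

    ASSUMPTION: the flag lives at raw_settings["formatParams"]["parseHeaderRow"].
    This key name is not confirmed by the thread above; verify it before use.
    """
    updated = copy.deepcopy(raw_settings)
    # Create the formatParams section if it is missing, then set the flag.
    updated.setdefault("formatParams", {})["parseHeaderRow"] = enabled
    return updated


# Hypothetical stand-in for the settings of a CSV dataset written to GCS.
example = {
    "type": "GCS",
    "formatType": "csv",
    "formatParams": {"separator": ",", "parseHeaderRow": False},
}

patched = with_header_row(example)
print(patched["formatParams"]["parseHeaderRow"])  # True
```

The helper returns a copy so the original settings stay untouched until you decide to save. Against a real instance, I believe the public API client exposes the same dictionary via `get_settings()` and `get_raw()` on a dataset handle, with `save()` to persist the change, but treat that as something to confirm in the API reference rather than a tested recipe.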
