Replace headers with other row in table imported from Google Sheet

jjc_al Registered Posts: 5 ✭✭✭

Hi,

I have imported a table from a Google Sheet using Dataiku's Google Sheet plugin. What I would like to do now is set the third row (with 'horodate' at the far left) as the headers row, and delete all rows above the third line.

1. How can I achieve this using one of Dataiku's built-in recipes?

2. Is there a way to do so directly when importing the Google Sheet? When creating a new dataset, in the format cell, I only see "First row contains headers" but I cannot select the third row as headers.

Please find below an example with the third row as the row I would like to move up and turn into the column headers.

Thank you for your help,

jjc_al

Test_Dataiku_import GSheet.PNG

Best Answer

  • Jurre
    Jurre Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered, Dataiku DSS Developer, Neuron 2022 Posts: 114 ✭✭✭✭✭✭✭
    Answer ✓

    Hi @jjc_al,

    A suitable built-in recipe for this is probably the Prepare recipe, with its long list of processors. In this case, a filter processor (to drop the rows above the header line) and a column-renaming processor (under Misc.) would do the trick. A custom solution based on some Python might be more helpful; here is a kind-of-related post with some code examples in it.
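
    For the Python route, a minimal sketch with pandas could look like the one below. It is not taken from the linked post: the helper name promote_row_to_header and the sample data are made up for illustration, and it assumes the sheet was imported without "First row contains headers", so the wanted header line is simply the third data row (0-based index 2).

```python
import pandas as pd


def promote_row_to_header(df, header_row_index=2):
    """Use the row at header_row_index (0-based) as column names,
    then drop that row and every row above it.

    Hypothetical helper for illustration, not a Dataiku API.
    """
    # Read the future column names from the chosen row.
    new_columns = df.iloc[header_row_index].astype(str).str.strip().tolist()
    # Keep only the rows below the header row and renumber them from 0.
    result = df.iloc[header_row_index + 1:].reset_index(drop=True).copy()
    result.columns = new_columns
    return result


# Made-up sample mimicking a sheet with two junk rows above the real header.
raw = pd.DataFrame(
    [
        ["Some title", None, None],
        [None, None, None],
        ["horodate", "name", "value"],
        ["2023-01-01", "a", "1"],
        ["2023-01-02", "b", "2"],
    ]
)

clean = promote_row_to_header(raw, header_row_index=2)
print(clean)
```

    In a Python recipe, the input dataframe would typically come from the imported dataset rather than from a hand-built sample, and the result would be written to the recipe's output dataset. The dataset names and the rest of the recipe depend on your own flow.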

    Expanding on your question: when the source data is in Excel format and you upload the file, you can define a number of rows to skip and use the row that follows as the column headers. See attached screenshot.

    Cheers,

    Jurre
