Replace headers with other row in table imported from Google Sheet

Solved!
jjc_al
Level 2
Hi,

I have imported a table from a Google Sheet using Dataiku's Google Sheet plugin. What I would like to do now is set the third row (with 'horodate' at the far left) as the headers row, and delete all rows above the third line.

1. How can I achieve this using one of Dataiku's built-in recipes?

2. Is there a way to do so directly when importing the Google Sheet? When creating a new dataset, in the format settings, I only see "First row contains headers", and I cannot select the third row as headers.

Please find below an example with the third row as the row I would like to move up and turn into the column headers.

Thank you for your help,

jjc_al

Test_Dataiku_import GSheet.PNG

1 Solution
Jurre
Level 5

Hi @jjc_al ,

A suitable built-in recipe for this is probably the Prepare recipe, with its long list of processors. In this case, a filter processor to remove the unwanted rows and a column-renaming processor (under Misc.) would do the trick. A custom solution based on some Python might be more helpful; here is a kind-of-related post with some code examples in it.
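To give an idea of what the Python route could look like, here is a minimal pandas sketch. It is not taken from the related post; the function name `promote_row_to_header` and the sample data are made up for illustration, and it assumes the table arrives as a DataFrame in which the wanted header row is the third row of data.

```python
import pandas as pd


def promote_row_to_header(df: pd.DataFrame, header_row: int = 2) -> pd.DataFrame:
    """Use the row at position `header_row` (0-based) as the column names,
    then drop that row and every row above it."""
    # Read the future column names from the chosen row.
    new_columns = df.iloc[header_row].astype(str).str.strip().tolist()
    # Keep only the rows below the header row and renumber them from 0.
    result = df.iloc[header_row + 1:].reset_index(drop=True)
    result.columns = new_columns
    return result


# Hypothetical sample data: two junk rows, then the real header row.
raw = pd.DataFrame(
    [
        ["Export", None, None],
        ["some comment", None, None],
        ["horodate", "name", "value"],
        ["2021-01-01", "a", "1"],
        ["2021-01-02", "b", "2"],
    ],
    columns=["col_0", "col_1", "col_2"],
)

cleaned = promote_row_to_header(raw, header_row=2)
print(cleaned)
```

In a Dataiku Python recipe you would typically load the input with `dataiku.Dataset("your_input").get_dataframe()` and write the result back with `write_with_schema(cleaned)` on the output dataset (the dataset name here is a placeholder). Note that if the dataset was read with "First row contains headers" ticked, the first sheet row is already used as column names, so the third sheet row would sit at position 1 rather than 2.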

Expanding on your question: for source data in Excel format, you can define a number of rows to skip when uploading the file and use the row that follows as the column headers. See attached screenshot.

Cheers, 

Jurre
