Try your hand at analyzing royal sentiment in Dataiku DSS! Learn more

Nested json data Manuplation

Level 1
Nested json data Manuplation

Hello all,

my data is like 

info_tdata_t
["first_name","last_name","age","height"]["sam","john","35","6.5ft"]

 

and I would like the output as 

first_namelast_nameageheight
samjohn356.5ft

I have tried several steps such as "fold an array" and transpose the row to a column but I am able to perform this task for one column only (as there is no option to perform this task for multiple columns), and the information in data_t column repeats for every new column that I have created. The results that I am getting is similar to this 

first_namelast_nameageheight
["sam","john","35","6.5ft"]["sam","john","35","6.5ft"]["sam","john","35","6.5ft"]["sam","john","35","6.5ft"]

 

 

Thank you for your assistance.

Robel.

1 Reply
Dataiker
Dataiker

Hi,

Actually a valid JSON object would be more:

{"first_name":"sam","last_name":"john","age":"35","height":"6.5ft"}

If you cells were formatted as a JSON object you would then be able to use in a "Prepare recipe" the "Unnest object" step that would provide the expected output:

Capture d’écran 2020-05-04 à 09.43.26.png

So the simplest options I see would be to:

  1. transform your data to have directly a JSON object that can be unnested
    1. you could perform that transformation in a "Prepare recipe" using for example a "Formula" step
  2. use a "Python function" step in "row" mode where you would for each row extract the new columns and corresponding values. More info on this step can be found in the documentation: Python function

I hope this helps.

Regards,

Mickaël | R&D at Dataiku