Parsing or Folding XML file

KevinHart Registered Posts: 6 ✭✭✭

Hi All,

I uploaded an XML file to my environment but it seems like some fields need to be unnested/unfolded.

In Dataiku, my file looks like this:

Name | Keywords
Bruce Springsteen | [{"Keyword":[{"Type":"Musician","Category":"Primary ","Keyword":" Born in the USA","SubCategory":"Music","Since":"2002-01-09","Country":"USA","To":"2007-01-09"}]}]

However I would like to get the data in this format:

Name | Type | Category | Keyword | Subcategory | Since | Country | To
Bruce Springsteen | Musician | Primary | Born in the USA | Music | 2002-01-09 | USA | 2007-01-09

Can anyone point me in the right direction on which preprocessor I should use to unfold or unnest my data?

Thanks in advance!


Operating system used: Win


Answers

  • MatiasL Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 12

    Hi @KevinHart,

    While exploring your dataset, if you click on the 'Keywords' column, you should see an option to 'unnest' objects, as you can see below:

    (screenshot attached)

    Could you please try that and let me know whether it separates the different objects?

    Looking forward to your reply.
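
If the 'unnest' option does not give the exact layout, the same flattening can be sketched in plain Python, for example as a starting point for a code-based step. This is only a minimal sketch using the standard `json` module and the sample value from the question. The helper name `unnest_keywords` is hypothetical and is not part of Dataiku. It assumes the 'Keywords' cell is a JSON list of objects, each holding an inner "Keyword" list, and it trims stray spaces such as the one in "Primary ".

```python
import json


def unnest_keywords(name, keywords_cell):
    """Flatten one (Name, Keywords) row into one flat dict per keyword entry.

    Hypothetical helper: assumes keywords_cell is a JSON string shaped like
    [{"Keyword": [{...}, {...}]}], as in the sample from the question.
    """
    rows = []
    for group in json.loads(keywords_cell):        # outer list of groups
        for entry in group.get("Keyword", []):     # inner list of keyword objects
            row = {"Name": name}
            for key, value in entry.items():
                # Trim surrounding whitespace on string values only
                row[key] = value.strip() if isinstance(value, str) else value
            rows.append(row)
    return rows


cell = (
    '[{"Keyword":[{"Type":"Musician","Category":"Primary ",'
    '"Keyword":" Born in the USA","SubCategory":"Music",'
    '"Since":"2002-01-09","Country":"USA","To":"2007-01-09"}]}]'
)

rows = unnest_keywords("Bruce Springsteen", cell)
for row in rows:
    print(row)
```

Each inner keyword object becomes one output row, so a name with several keywords would produce several rows that repeat the Name value.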
