Parsing or Folding XML file
Hi All,
I uploaded an XML file to my environment but it seems like some fields need to be unnested/unfolded.
In Dataiku, my file looks like this:
Name | Keywords |
Bruce Springsteen | [{"Keyword":[{"Type":"Musician","Category":"Primary ","Keyword":" Born in the USA","SubCategory":"Music","Since":"2002-01-09","Country":"USA","To":"2007-01-09"}]}] |
However I would like to get the data in this format:
Name | Type | Category | Keyword | Subcategory | Since | Country | To |
Bruce Springsteen | Musician | Primary | Born in the USA | Music | 2002-01-09 | USA | 2007-01-09 |
Can anyone point me in the right direction as to which preprocessor I should use to unfold or unnest my data?
Thanks in advance!
Operating system used: Win
Answers
Matias Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 12 Dataiker
Hi @KevinHart,
While exploring your dataset, if you click on the 'Keywords' column, you should see an option to 'unnest' objects, as you can see below:
(screenshot attached)
Could you please try that and let me know whether it separates the different objects?
Looking forward to your reply
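
If the unnest option does not give exactly the layout from the question, the same flattening can also be sketched in plain Python. This is only an illustration: it assumes the 'Keywords' cell is a JSON string shaped like the sample above (a list of objects, each holding a "Keyword" list of flat objects), and the function name `flatten_keywords` is made up for this example. It uses only the standard `json` module and does not call any Dataiku API.

```python
import json


def flatten_keywords(name, keywords_json):
    """Return one flat dict per innermost keyword object (hypothetical helper)."""
    rows = []
    # Outer level: a list of objects, each with a "Keyword" list inside.
    for group in json.loads(keywords_json):
        for item in group.get("Keyword", []):
            row = {"Name": name}
            for key, value in item.items():
                # Trim stray spaces such as "Primary " or " Born in the USA".
                row[key] = value.strip() if isinstance(value, str) else value
            rows.append(row)
    return rows


# Sample cell copied from the question.
sample = (
    '[{"Keyword":[{"Type":"Musician","Category":"Primary ",'
    '"Keyword":" Born in the USA","SubCategory":"Music",'
    '"Since":"2002-01-09","Country":"USA","To":"2007-01-09"}]}]'
)

rows = flatten_keywords("Bruce Springsteen", sample)
for row in rows:
    print(row)
```

Each inner object becomes one row, so a name with several keywords would produce several rows, each repeating the 'Name' value. The whitespace trimming is an extra step added here because the sample values contain leading and trailing spaces.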