XML automation

jeanclaude_ho Registered Posts: 1

Hi,

I want to automate an XML process where, every time an XML file is dropped, a set of recipes processes it and generates a CSV at the end. However, my XML may vary: some tags and attributes may or may not be present from one file to another. How can I handle this? If I have an XML with tags A, B, and C that is converted into a dataframe with multiple columns (each tag/attribute pair is a column), what happens if my next file has a column D? Will it be ignored? Should I build my flow with an XML that has all possible tags to ensure the flow works correctly?

Thanks for your insights.


Operating system used: Windows 10

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,730 Neuron

    You will need to use a Python recipe with custom code to handle this. However, there is a limit to how much the source file can change before your code can no longer handle it.
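
As a rough illustration of the custom-code approach, here is a minimal sketch using only the Python standard library. It assumes each row lives in a repeating element (hypothetically named `record` here), that the tag/attribute pairs are named `tag@attribute`, and that the downstream recipes expect a fixed column list (`EXPECTED_COLUMNS`, also hypothetical). None of these names come from the original post, and reading from and writing to Dataiku datasets is deliberately left out. Missing tags are filled with empty values, and unexpected ones (such as D) are reported instead of silently changing the output schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical fixed schema: the columns the downstream recipes expect.
EXPECTED_COLUMNS = ["A", "A@id", "B", "C"]


def flatten_record(elem):
    """Flatten one record element into {column_name: value}.

    A child's text becomes column "<tag>"; each attribute of a child
    becomes column "<tag>@<attribute>".
    """
    row = {}
    for child in elem:
        text = (child.text or "").strip()
        if text:
            row[child.tag] = text
        for attr, value in child.attrib.items():
            row[f"{child.tag}@{attr}"] = value
    return row


def xml_to_rows(xml_text, record_tag="record", columns=EXPECTED_COLUMNS):
    """Return (rows, unexpected): rows aligned to `columns`, plus a sorted
    list of column names found in the XML but absent from `columns`."""
    root = ET.fromstring(xml_text)
    rows = []
    unexpected = set()
    for rec in root.iter(record_tag):
        flat = flatten_record(rec)
        unexpected.update(key for key in flat if key not in columns)
        # Missing tags/attributes become empty strings; extras are dropped.
        rows.append({col: flat.get(col, "") for col in columns})
    return rows, sorted(unexpected)


def rows_to_csv(rows, columns=EXPECTED_COLUMNS):
    """Serialize the aligned rows to CSV text with a stable header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = (
        "<root>"
        "<record><A id='1'>x</A><B>y</B></record>"
        "<record><A>z</A><D>new</D></record>"
        "</root>"
    )
    rows, unexpected = xml_to_rows(sample)
    print(rows_to_csv(rows))
    print("Unexpected columns:", unexpected)
```

Whether unexpected columns should be ignored, logged, or cause the run to fail is a design choice; with a fixed column list like this, the rest of the flow always sees the same schema regardless of which tags a given file contains.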
