Reading source files with different schemas

sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron

Hi,

I have one file system where all my log files are stored (every day).

These log files contain details about three usage types: SQL_CONNECTION, SQL_QUERY and LOCAL_PROCESS.

Each usage type has a different set of columns, and a single log file contains details about all three usage types.

I want to build a separate process for each usage type, so I used a Split recipe. But when I run it for LOCAL_PROCESS, the columns I am expecting do not come through, even though I can see them in the source log file.

Could you please help here?

Thanks in Advance

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker

    Hi @sj0071992,

    When DSS creates the schema, it may not find those columns in the sample of entries it analyzes to build the schema, so they are left out.

    One way you can approach this is to set "One record per line" and later use a Prepare recipe to split out the columns you need.

    [Image: Screenshot 2021-12-11 at 12.18.20.png]
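
    For illustration, here is a minimal sketch of how the second step could be done in a Python recipe instead of a Prepare recipe. The dataset names (raw_logs, local_process_logs), the single column name (line), the "|" delimiter, and the generated column names are all assumptions about your setup and log format, so adapt them as needed:

    ```python
    import dataiku

    # Input dataset read with "One record per line": each log line ends up
    # in a single string column, assumed here to be called "line".
    raw = dataiku.Dataset("raw_logs")            # hypothetical dataset name
    df = raw.get_dataframe()

    # Keep only the LOCAL_PROCESS entries (assumes the usage type
    # appears verbatim somewhere in the line).
    local = df[df["line"].str.contains("LOCAL_PROCESS", na=False)].copy()

    # Split each line into columns. The "|" delimiter is an assumption
    # about the log format; adapt it to the real layout.
    parts = local["line"].str.split("|", expand=True)
    parts.columns = ["col_" + str(i) for i in range(parts.shape[1])]

    # Write the result to the recipe's output dataset.
    out = dataiku.Dataset("local_process_logs")  # hypothetical dataset name
    out.write_with_schema(parts)
    ```

    The same logic can of course be expressed with a visual Prepare recipe (a filter step plus a "Split column" step), which is usually the simpler option.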

    Let me know if that approach works for you. If not, could you please share a small sample (obfuscated if needed) of the first few lines from your log file(s)?

    Thanks,
