Adding file name as a column

Nithin
Nithin Registered Posts: 2 ✭✭✭

Hi members,

My data comes from an Excel file titled "World Trade Matrix - 202204 - Polyethylene Film and Sheet.xlsx".

In the prepared data I need two columns containing the month and the product info.

Example: column1 = 202204

column2 = Polyethylene Film and Sheet

Is it possible to do this using a Prepare recipe?


Operating system used: Windows 10


Answers

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 296 Dataiker

    Hi @Nithin,

    Yes, you can extract the file name using the “Enrich records with context information” processor within the Prepare recipe. You will want to add a column header name under "output filename column". Please note that you will need to run the recipe in order to view the column values.


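    Then, you can use the “Find and replace” processor to output a column for Month, using the following regular expression as the pattern and leaving the replacement empty: [^0-9]. It matches every character that is not a digit, so for your file name only 202204 remains.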

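    To sanity-check the regular expression outside DSS, here is a minimal Python sketch of what the pattern does to the file name from the question. It only illustrates the regex and is not how the processor is implemented; the variable names are made up for the example.

```python
import re

# File name taken from the question.
filename = "World Trade Matrix - 202204 - Polyethylene Film and Sheet.xlsx"

# Replace every character that is not a digit with an empty string,
# which leaves only the digits of the file name.
month = re.sub(r"[^0-9]", "", filename)

print(month)
```

    Note that this only works while the month is the sole run of digits in the file name. Any other digit (for example a version number) would end up in the result as well.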

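    Lastly, you can use the “Split column” processor on the file name column to output a column for Product, using ‘-’ as the delimiter. The last piece holds the product name, still followed by the .xlsx extension, which another “Find and replace” step can remove.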

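    The split step can be sketched the same way in plain Python, assuming the file name always follows the "prefix - month - product.xlsx" layout from the question. This is only an illustration of the logic, not the processor's code, and the variable names are hypothetical.

```python
from pathlib import PurePath

# File name taken from the question.
filename = "World Trade Matrix - 202204 - Polyethylene Film and Sheet.xlsx"

# Drop the ".xlsx" extension first so it does not end up in the product name.
stem = PurePath(filename).stem

# Split on "-" and trim the spaces that surround each piece.
parts = [piece.strip() for piece in stem.split("-")]

# With the assumed layout, the last piece is the product.
product = parts[-1]

print(parts)
print(product)
```

    A product name that itself contains a hyphen would be split into extra pieces, so in that case a delimiter of " - " (with the surrounding spaces) would be the safer choice.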

    I hope that information is helpful. Please let me know if the steps above are not working as expected for you.

    Thanks!

    Jordan

  • Nithin
    Nithin Registered Posts: 2 ✭✭✭

    Thanks a lot Jordan! Let me try this.
