Adding file name as a column
Hi members,
My data comes from an Excel file titled "World Trade Matrix - 202204 - Polyethylene Film and Sheet.xlsx".
In the prepared data, I need two columns that hold the month and the product information.
Example: Column1 = 202204
Column2 = Polyethylene Film and Sheet
Is it possible to do this using a Prepare recipe?
Operating system used: Windows 10
Answers
-
JordanB (Dataiker)
Hi @Nithin,
Yes, you can extract the file name using the “Enrich records with context information” processor within the Prepare recipe. Enter a name for the new column under "output filename column". Please note that the column values only appear after you run the recipe.
Then, you can use the “Find and replace” processor to output a column for Month: match the regular expression [^0-9] and replace it with an empty string, so that only the digits (202204) remain.
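Lastly, you can use the “Split column” processor to output a column for Product using ‘-’ as the delimiter. The last piece holds the product name; it will still include the surrounding spaces and the ".xlsx" extension, so you may want to clean those up in a further step.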
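To check the logic of the steps above before building them in the recipe, here is a minimal sketch in plain Python. It is not the Dataiku processor API. The function name month_and_product is hypothetical, and the sketch assumes the file name always follows the "prefix - YYYYMM - product.xlsx" pattern from the question.

```python
import re
from typing import Tuple


def month_and_product(filename: str) -> Tuple[str, str]:
    """Hypothetical helper mirroring the Find-and-replace and Split steps."""
    # Like the [^0-9] find-and-replace: drop every non-digit character.
    month = re.sub(r"[^0-9]", "", filename)

    # Like splitting on '-': keep the last piece, which holds the product.
    last_piece = filename.split("-")[-1]

    # Trim the surrounding spaces and remove the .xlsx extension.
    product = re.sub(r"\.xlsx$", "", last_piece.strip(), flags=re.IGNORECASE)
    return month, product


name = "World Trade Matrix - 202204 - Polyethylene Film and Sheet.xlsx"
print(month_and_product(name))
# ('202204', 'Polyethylene Film and Sheet')
```

Note that this approach, like the recipe steps, would give wrong results if the product name itself contained digits or a hyphen.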
I hope that information is helpful. Please let me know if the steps above are not working as expected for you.
Thanks!
Jordan
-
Thanks a lot Jordan! Let me try this.