How do I extract filename of file uploaded using Dataset -> Upload your files

I have a csv that contains 2 datasets arranged vertically (one below the other) in it - 

1. Header

2. Body

After parsing these 2 datasets using prepare recipe, they need to be joined together.

However, there is no common key between these 2 datasets.

One way is to enrich both datasets with the csv filename during the prepare recipe step, and then join them using this filename as the key.

I am unable to find any option in DSS that can help identify or extract the uploaded file's name.

Please help.



In a prepare recipe you should be able to use Misc > Enrich record with context information, which lets you add the filename as a column and then join on it.

Please note there could be some limitations for file types other than txt or csv.

See :

Let me know if this would work for you. 





Unfortunately, I don't see this option in my version of DSS. Any other suggestions, please?

Dataiku DSS

Version 6.0.1


If you are unable to upgrade, one possible suggestion would be to upload all your files to a managed folder, then use a python recipe to add the file name to each file and write the result to another managed folder, from which you can create your datasets.

import dataiku
import pandas as pd

input_folder = dataiku.Folder("PAcVjikK")
output_folder = dataiku.Folder("MLpqB40C")

# Iterate through every file in the input folder, add its name as a new column,
# and write the enriched csv to the output managed folder.
for path in input_folder.list_paths_in_partition():
    with input_folder.get_download_stream(path) as f:
        data = pd.read_csv(f)
    # Paths start with "/", so drop the first character to get the file name.
    filename = path[1:]
    data['filename_column'] = filename
    output_folder.upload_stream(filename, data.to_csv(index=False).encode("utf-8"))

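Once each file carries its name in filename_column, the header and body parts can be joined on that column. The snippet below is only a minimal pandas sketch of that join, outside DSS. The two small DataFrames, their column names (report_date, item, amount) and the file name sales_jan.csv are made up for illustration and are not from the original files.

```python
import pandas as pd

# Hypothetical "header" part of the file: one row per uploaded csv.
header = pd.DataFrame({
    "report_date": ["2021-01-31"],
    "filename_column": ["sales_jan.csv"],
})

# Hypothetical "body" part of the same file: several rows per uploaded csv.
body = pd.DataFrame({
    "item": ["a", "b"],
    "amount": [10, 20],
    "filename_column": ["sales_jan.csv", "sales_jan.csv"],
})

# Left join so every body row picks up the header values from its own file.
joined = body.merge(header, on="filename_column", how="left")
print(joined)
```

In DSS itself the same thing can be done with a join recipe that uses filename_column as the key on both datasets.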


