I have my output as "Final_output" at the end of the flow. I want to export this to S3 as a CSV with the name "Final_output_$datetime.csv".
So every time the flow runs, it has to create a file with that timestamp. I tried using a variable, but it didn't work for the file name.
DSS doesn't let you control the names of the files it produces, so you need a Python recipe that writes to a managed folder. For example:
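# -*- coding: utf-8 -*-
import dataiku
import pandas as pd
import os

# Read the final dataset of the flow into a dataframe
ds = dataiku.Dataset("...the dataset name")
df = ds.get_dataframe()

# get_path() is only available for folders stored on the local filesystem
f = dataiku.Folder("...the folder id")
path = f.get_path()

# "datetime" must be defined as a custom variable
df.to_csv(os.path.join(path, "final_output_%s.csv" % dataiku.get_custom_variables()["datetime"]))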
Or use a first recipe, Export to folder, followed by a Python recipe that renames the file, like:
# -*- coding: utf-8 -*-
import dataiku

exported = dataiku.Folder("f")
final = dataiku.Folder("g")

# Find the CSV written by the Export to folder recipe
csv_in_folder = [x for x in exported.list_paths_in_partition() if x.endswith('.csv')]

# get_download_stream takes a single path, so use the first match
with exported.get_download_stream(csv_in_folder[0]) as s:
    data = s.read()

# Write the same bytes to the final folder under the timestamped name
final.upload_data("final_output_%s.csv" % dataiku.get_custom_variables()["datetime"], data)
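Both snippets assume a custom variable called "datetime" already exists. If it doesn't, one option is to build the timestamp inside the recipe itself. This is only a sketch: the helper name timestamped_name and the "%Y%m%d_%H%M%S" format are my own choices, not something DSS provides.

```python
from datetime import datetime


def timestamped_name(prefix, now=None):
    """Return '<prefix>_<YYYYmmdd_HHMMSS>.csv' (hypothetical helper, not a DSS API)."""
    # Default to the current local time when no timestamp is passed in
    if now is None:
        now = datetime.now()
    return "%s_%s.csv" % (prefix, now.strftime("%Y%m%d_%H%M%S"))


# Example: a fresh name for each run of the flow
print(timestamped_name("Final_output"))
```

The returned string could then replace the "final_output_%s.csv" % ... expression, for example as the first argument of final.upload_data(...).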