How to move files between managed folders with Python

Dear Community,

Does somebody know how to move files between managed folders with Python? Is it possible?

One idea is to read the file from one managed folder, write it into another managed folder, and then delete the original. But I can't find a method to delete files in a managed folder.

Could somebody help me? Many thanks! 🙂
1 Reply

Are you loading the initial files into Python as dataframes or something similar? If so, you can use any number of writers to write the data back out in the format of your choosing. Below is some code pieced together from a few projects, showing general methods to import files (you would need to tweak it to get the desired objects) and then export those objects (my example uses ExcelWriter, but you can substitute any appropriate writer). I am not sure how you would handle deletion of the files in the folder, but you may be able to find something in the Python or datasets API documentation.

import os

import dataiku
import pandas as pd

input_folder = dataiku.Folder("input_folder")
output_folder = dataiku.Folder("output_folder")

# declare input paths for files, output path for export
# note: get_path() only works when the folder lives on the local filesystem
in_files = input_folder.list_paths_in_partition()
out_path = os.path.join(output_folder.get_path(), 'filename.xlsx')

# read in .csv files to a single dataframe
# if you need multiple files, you can use the general method below but alter it to your needs
# for example, iterate to write a list of dataframes to send to an exporting loop below
frames = []
for path in in_files:
    with input_folder.get_download_stream(path) as f:
        new_data = pd.read_csv(f, header=0)
    new_data['source'] = path
    # the right-hand side was missing in the original post; the current timestamp is an assumption
    new_data['import_date'] = pd.Timestamp.now()
    frames.append(new_data)
raw_data = pd.concat(frames, ignore_index=True)

# write out one dataframe as an excel book
# substitute any writer from any package as needed here
# the with block saves and closes the workbook on exit
with pd.ExcelWriter(out_path, engine='xlsxwriter') as writer:
    raw_data.to_excel(writer, sheet_name="name")

# alternative: write out a list of dataframes to sheets in an excel book
# snippet below assumes you have a list of dfs (list_dfs) and a list of associated labels (labels)
# it reuses out_path, so it overwrites the workbook written above
with pd.ExcelWriter(out_path, engine='xlsxwriter') as writer:
    for n, df in enumerate(list_dfs):
        df.to_excel(writer, sheet_name=labels[n])
print("Export complete. File exported to:", out_path)
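
If you only need to move the files as they are, without parsing them into dataframes, the read, write, then delete idea from the question could be sketched roughly as below. This is a minimal sketch, not a tested recipe: it assumes your DSS version's dataiku.Folder offers upload_stream and delete_path alongside the get_download_stream and list_paths_in_partition calls used above, so please check the managed folder API reference for your version. The helper name move_files and the folder names are placeholders of mine.

```python
def move_files(source, target, paths=None):
    """Copy each file from source to target, then delete it from source.

    source and target are expected to behave like dataiku.Folder objects.
    Returns the list of paths that were moved.
    """
    if paths is None:
        # default to every file in the source folder
        paths = source.list_paths_in_partition()
    moved = []
    for path in paths:
        # stream the raw bytes across without loading them into a dataframe
        with source.get_download_stream(path) as stream:
            target.upload_stream(path, stream)
        # delete only once the upload call has returned without raising
        source.delete_path(path)
        moved.append(path)
    return moved


# Usage inside a DSS Python recipe or notebook (folder names are placeholders):
# import dataiku
# move_files(dataiku.Folder("input_folder"), dataiku.Folder("output_folder"))
```

Deleting inside the loop, right after each upload, means a failure partway through leaves the remaining files untouched in the source folder rather than half-moved.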

