How to move files between managed folders with Python

daida Registered Posts: 5 ✭✭✭✭
Dear Community,

Does anybody know how to move files between managed folders with Python? Is it possible?

One idea is to read the file from one managed folder, write it into another managed folder, and then delete the original. But I can't find a method to delete files in a managed folder.

Could somebody help me? Many thanks! :)

Answers

  • rtaylor
    rtaylor Registered Posts: 24 ✭✭✭✭✭
    edited July 18

    Are you loading the initial files into Python as dataframes or something similar? If so, you can use any number of writers to write the data back out in the format of your choosing.

    Below is some code bashed together from a few projects. It shows a general method to import files (you would need to tweak it to get the objects you want) and then export those objects. My example uses ExcelWriter, but you can substitute any appropriate writer.

    I am not sure how you would handle deletion of the files in the folder, but you may be able to find something in the Python or datasets API documentation.


    import datetime
    import os

    import dataiku
    import pandas as pd

    input_folder = dataiku.Folder("input_folder")
    output_folder = dataiku.Folder("output_folder")

    # declare input paths for files, output path for export
    # (get_path() only works for folders stored on the local filesystem)
    in_paths = input_folder.list_paths_in_partition()
    out_path = os.path.join(output_folder.get_path(), 'filename.xlsx')

    # read in .csv files to a single dataframe
    # if you need to keep the files as separate dataframes, adapt the loop below,
    # for example by appending each one to a list to send to the exporting loop further down
    raw_data = pd.DataFrame()
    for i in in_paths:
        with input_folder.get_download_stream(i) as f:
            new_data = pd.read_csv(f, header=0)
        new_data['source'] = i
        new_data['import_date'] = datetime.datetime.now()
        raw_data = pd.concat([raw_data, new_data])

    # write out one dataframe as an excel book
    # substitute any writer from any package as needed here
    # (the with-block saves and closes the workbook on exit)
    with pd.ExcelWriter(out_path, engine='xlsxwriter') as writer:
        raw_data.to_excel(writer, sheet_name="name")

    # alternatively, write out a list of dataframes to sheets in an excel book
    # snippet below assumes you have a list of dfs (list_dfs) and a list of associated labels (labels)
    with pd.ExcelWriter(out_path, engine='xlsxwriter') as writer:
        for n, df in enumerate(list_dfs):
            df.to_excel(writer, sheet_name=labels[n])
    print("Export complete. File exported to:")
    print(out_path)
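
The read, write, then delete idea from the question can be sketched roughly as follows. This is a minimal sketch under assumptions, not a confirmed answer from this thread: `move_file` and `move_all` are hypothetical helper names, and the code assumes the folder objects provide `get_download_stream`, `upload_stream`, `list_paths_in_partition` and `delete_path`. I believe `dataiku.Folder` offers all four, but please check the API documentation for your DSS version before relying on `delete_path`.

```python
import io


def move_file(src_folder, dst_folder, path):
    """Copy one file from src_folder to dst_folder, then delete the original.

    Both folders are assumed to behave like dataiku.Folder objects
    (get_download_stream, upload_stream, delete_path).
    """
    # Read the whole file into memory. Fine for small files; large files
    # would be better copied in chunks.
    with src_folder.get_download_stream(path) as stream:
        data = stream.read()

    # Write the bytes to the destination under the same relative path.
    dst_folder.upload_stream(path, io.BytesIO(data))

    # Delete the source only after the upload has returned without error.
    # ASSUMPTION: delete_path exists on the folder object.
    src_folder.delete_path(path)


def move_all(src_folder, dst_folder):
    """Move every file listed in src_folder and return the moved paths."""
    moved = []
    # Take a copy of the listing, since files are deleted while looping.
    for path in list(src_folder.list_paths_in_partition()):
        move_file(src_folder, dst_folder, path)
        moved.append(path)
    return moved
```

In a Dataiku Python recipe or notebook this would be called as `move_all(dataiku.Folder("input_folder"), dataiku.Folder("output_folder"))` after `import dataiku`. Because it works on raw bytes through the folder API rather than on local paths, it does not depend on `get_path()` or on parsing the files into dataframes.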
