How to create a subfolder inside an S3 bucket using a Python recipe in Dataiku?

ShubhamRai8080 Registered Posts: 3 ✭✭✭

Hi,

I need to store output CSVs in an S3 bucket, so I am looking for a way to create a subfolder inside the bucket and store the CSV files in that subfolder.
A new subfolder with a specific name will be created in each cycle.

Is there a solution for this?

Regards,

Shubham

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    edited July 17 Answer ✓

    Hi,

    You can use an approach similar to the one suggested here:

    https://community.dataiku.com/t5/General-Discussion/Writing-Data-to-s3-from-Dataiku/m-p/20511

    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu
    
    
    from datetime import datetime
    todays_date = datetime.today().strftime('%Y-%m-%d')
    
    # Replace with your dataset name
    
    my_dataset = dataiku.Dataset("orders")
    df = my_dataset.get_dataframe()
    
    # Replace with your managed folder ID
    managed_folder_id = "QEDQ4XeS"
    
    # Write recipe outputs
    
    sub_path = str(todays_date) + '/'
    output_folder = dataiku.Folder(managed_folder_id)
    output_folder.upload_stream(sub_path + "daily_orders.csv", df.to_csv(index=False).encode("utf-8"))


    This would write an object in a subfolder named with today's date (screenshot of the resulting folder contents omitted).

    Hope that helps!
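Since S3 has no real directories, the per-cycle "subfolder" is just a prefix in the object key: writing an object whose key contains `/` is what makes the folder appear in the bucket listing. A minimal sketch of building such a key per cycle, assuming a hypothetical helper `build_object_path` and cycle name (neither is part of the Dataiku API); the resulting path is what you would pass to `output_folder.upload_stream`:

```python
from datetime import datetime

def build_object_path(cycle_name, filename, when=None):
    """Build an S3-style object key of the form '<cycle_name>_<date>/<filename>'.

    The '/' in the key is what creates the apparent subfolder in S3.
    """
    when = when or datetime.today()
    sub_path = "{}_{}".format(cycle_name, when.strftime("%Y-%m-%d"))
    return sub_path + "/" + filename

# Example: a cycle named "batch42" run on 2022-06-09
path = build_object_path("batch42", "daily_orders.csv", datetime(2022, 6, 9))
print(path)  # batch42_2022-06-09/daily_orders.csv
```

Each run with a different cycle name (or date) then lands its CSV in its own subfolder, as the question requires.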
