How do I read a pickle file from a managed folder?

ssuhas76 Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 31

Hello there,

I am trying to read a pickle file from a managed folder in my Dataiku API endpoint code.

client = dataikuapi.DSSClient(host, apiKey)
client._session.verify = False
project = client.get_project("xxxxxx")
variables = project.get_variables()['standard']
managedfolder = project.get_managed_folder("xxxxxxxxxxxxxxx")

modelid = 'babababababababababab'
with managedfolder.get_file(f"models/{modelid}.pkl") as fd:
    loaded_model = pickle.load(open(f"models/{modelid}.pkl", 'rb'))

I am getting an error saying the file doesn't exist. Please help.

Best Answer

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 293 Dataiker
    edited July 17 Answer ✓

    Hi @ssuhas76,

    If the pickle file is in a non-local managed folder (e.g., one on an S3 connection), please use the following for saving and loading:

    import dataiku
    import pickle
    
    remote_folder = dataiku.Folder("pkl-models") # This is a managed folder on S3 connection
    
    # Save pickle
    with remote_folder.get_writer("/test-model.pkl") as writer:
        pickle.dump(clf, writer) # Assuming clf is a sklearn object
    
    # Load pickle
    with remote_folder.get_download_stream('test-model.pkl') as f:
        clf_loaded = pickle.load(f)

    If the pickle file is on a local managed folder, please try this:

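    import os
    import pickle
    import dataiku
    
    # get_path() is only available for folders stored on the local filesystem
    folder_path = dataiku.Folder("folderID").get_path()
    model_path = os.path.join(folder_path, "model.pkl")
    with open(model_path, 'rb') as file:
        clf = pickle.load(file)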

    Thanks!

    Jordan
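
A note on the snippet in the question: the call `open(f"models/{modelid}.pkl", 'rb')` looks for the file on the local disk of the process running the code and never uses the object returned by `get_file`, which would explain the "file doesn't exist" error. Below is a minimal sketch of unpickling from the downloaded bytes instead. The helper name `load_pickle_from_bytes` is hypothetical, and the commented usage assumes that `get_file` returns a requests-style response exposing the file contents as `.content`; please check this against your `dataikuapi` version.

```python
import io
import pickle


def load_pickle_from_bytes(payload: bytes):
    """Deserialize a pickled object that is held entirely in memory."""
    # Wrap the raw bytes in a file-like object so pickle.load can read them.
    return pickle.load(io.BytesIO(payload))


# Hypothetical usage with the objects from the question (not run here;
# assumes get_file returns a response object with a .content attribute):
#
#   response = managedfolder.get_file(f"models/{modelid}.pkl")
#   loaded_model = load_pickle_from_bytes(response.content)
```

This reads the whole file into memory, which is usually fine for a model file. As with any pickle, only load files that come from a source you trust.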
