How do I read a pickle file from a managed folder?

Solved!
ssuhas76
Level 3
Hello there,

I am trying to read a pickle file from a managed folder in my Dataiku API endpoint code.

client = dataikuapi.DSSClient(host, apiKey)
client._session.verify = False
project = client.get_project("xxxxxx")
variables = project.get_variables()['standard']
managedfolder = project.get_managed_folder("xxxxxxxxxxxxxxx")

 

modelid = 'babababababababababab'
with managedfolder.get_file(f"models/{modelid}.pkl") as fd:
    loaded_model = pickle.load(open(f"models/{modelid}.pkl", 'rb'))

 

I am getting an error saying the file doesn't exist. Please help.

1 Solution
JordanB
Dataiker

Hi @ssuhas76,

If the pickle file is in a non-local managed folder (e.g., on S3), please use the following for loading and saving:

import dataiku
import pickle

remote_folder = dataiku.Folder("pkl-models")  # A managed folder on an S3 connection

# Save pickle
with remote_folder.get_writer("/test-model.pkl") as writer:
    pickle.dump(clf, writer) # Assuming clf is a sklearn object

# Load pickle
with remote_folder.get_download_stream('test-model.pkl') as f:
    clf_loaded = pickle.load(f)
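
As far as I know, the stream-based pattern above is not limited to remote storage, so a small helper built on it could cover local folders too. Here is a minimal sketch, assuming only that the folder object exposes `get_download_stream(path)` as in the snippet above. `load_pickle_from_folder` is a hypothetical name, not part of the Dataiku API.

```python
import pickle


def load_pickle_from_folder(folder, path):
    """Unpickle `path` from a folder-like object exposing get_download_stream().

    Hypothetical helper: `folder` is assumed to behave like dataiku.Folder.
    """
    with folder.get_download_stream(path) as stream:
        # Read the whole file first so pickle receives plain bytes
        # rather than depending on the stream's file-like behaviour.
        return pickle.loads(stream.read())
```

With the folder from the example above, this would be called as `load_pickle_from_folder(remote_folder, "test-model.pkl")`.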

 

If the pickle file is on a local managed folder, please try this:

import pickle
import dataiku

folder = dataiku.Folder("folderID").get_path()
model_path = folder + "/model.pkl"
with open(model_path, 'rb') as file:
    clf = pickle.load(file)
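
A note on the snippet in the question: `open(f"models/{modelid}.pkl", 'rb')` looks for the file on the disk of the machine running the code, not in the managed folder, which would explain the "file doesn't exist" error. Below is a minimal sketch of one alternative that keeps the `dataikuapi` client. It assumes `managedfolder.get_file(...)` returns a requests-style HTTP response whose `content` attribute holds the file bytes. `load_pickle_from_response` is a hypothetical helper name.

```python
import pickle


def load_pickle_from_response(response):
    """Unpickle the body of a requests-style HTTP response.

    Assumes `response.content` holds the raw bytes of the pickle file.
    """
    return pickle.loads(response.content)


# Hypothetical usage with the objects from the question:
# response = managedfolder.get_file(f"models/{modelid}.pkl")
# loaded_model = load_pickle_from_response(response)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.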

Thanks!

Jordan
