How to export a saved model, as zip file, in a managed folder?
It would be great to have some help from the community on this.
It seems that I need to do it in two steps:
# Step 1: save the model as a zip file at the instance where the current project is running
# Step 2: upload the zip file from the instance location to the managed folder
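For reference, the two-step approach above might look like the following minimal sketch. It assumes `model_details` is the object that exposes `get_scoring_python(filename)` and that `folder` is a `dataiku.Folder` exposing `upload_file(path, file_path)`. The helper name `export_via_temp_file` is hypothetical and only used for illustration.

```python
import os
import tempfile


def export_via_temp_file(model_details, folder, file_name):
    """Two-step export: write the zip to local disk, then upload it.

    Assumes model_details has get_scoring_python(filename) and folder has
    upload_file(path, file_path). Both objects are passed in by the caller.
    """
    # The temporary directory (and the zip inside it) is deleted on exit
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, file_name)
        # Step 1: save the model as a zip file on the local instance
        model_details.get_scoring_python(local_path)
        # Step 2: upload the zip file to the managed folder
        folder.upload_file(file_name, local_path)
    return file_name
```

This still writes the archive to the instance's disk, if only briefly, which is exactly what the question below is trying to avoid.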
How can I do it in one step, without saving the file on the local instance?
The example in link 1 below uses the "get_scoring_python" method, which takes a 'filename' as input. One can pass a path plus filename to save the zip file in a specific location. I might be wrong, but it seems that it does not accept 'folder_path' as the path, only a path on the local instance.
Thank you.
Location of the project:
/home/dataiku/lib/
Location of the managed folder:
/home/dataiku/data/managed_folder/xxxx
I have explored the following links:
1. https://developer.dataiku.com/latest/concepts-and-examples/ml.html
2. https://doc.dataiku.com/dss/latest/machine-learning/models-export.html
I have also checked the following thread, where there was discussion along these lines, but could not find the exact solution I need:
https://community.dataiku.com/t5/Using-Dataiku/Using-Models-Outside-DSS/m-p/32693
Best Answers
-
You can try to combine the model's get_scoring_python_stream with the managed folder's upload_stream, e.g.
with model.get_scoring_python_stream() as s:
    folder.upload_stream(managedfolder_file_name, s)
-
radiantly
Thank you. The method 'get_scoring_python_stream' works. Adding my complete code in case someone needs it in the future.
import dataiku

client = dataiku.api_client()
project = client.get_default_project()
project_key = project.project_key

# Get the saved model id from here
project.list_saved_models()
sm_id = 'saved model id'

# Fetch the details of the active version of the saved model
saved_model = project.get_saved_model(sm_id)
version_id = saved_model.get_active_version()['id']
saved_model = saved_model.get_version_details(version_id=version_id)

folder = dataiku.Folder('your managed folder name')
managedfolder_file_name = 'model-archive.zip'

# Stream the zip straight into the managed folder, without a local file
with saved_model.get_scoring_python_stream() as s:
    folder.upload_stream(managedfolder_file_name, s)
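The same pattern can be wrapped in a small helper, sketched here under a few assumptions. `export_model_to_folder` is a hypothetical name. `model_details` is assumed to be the object returned by `get_version_details`, and `folder` a `dataiku.Folder`. The `hasattr` check is only a convenience that gives a clearer error when a different object (for example the saved model itself) is passed in by mistake.

```python
def export_model_to_folder(model_details, folder, file_name):
    """Stream the Python scoring archive straight into a managed folder.

    Assumes model_details exposes get_scoring_python_stream() and folder
    exposes upload_stream(path, f), as used earlier in this thread.
    """
    if not hasattr(model_details, "get_scoring_python_stream"):
        raise TypeError(
            type(model_details).__name__
            + " has no get_scoring_python_stream; pass the object "
            "returned by get_version_details()"
        )
    # No temporary file: the bytes go from the stream to the folder
    with model_details.get_scoring_python_stream() as stream:
        folder.upload_stream(file_name, stream)
    return file_name
```

With the objects from the code above, the call would be `export_model_to_folder(saved_model, folder, managedfolder_file_name)`.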
Answers
-
I have a similar use case, and tried to follow radiantly's approach. However, the saved_model I get does not have get_scoring_python_stream. Looking at the docs for that function, I see that "This works provided that you have the license to do so and that the model is compatible with Python scoring." What license is needed to access this and other scoring output functions?