How to export a saved model as a zip file to a managed folder?

radiantly Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 6

It would be great to have some help from the community:
"How to export a saved model, as zip file, in a managed folder?"

It seems that I need to do it in two steps:
# Step 1: save the model as a zip file on the instance where the current project is running
# Step 2: upload the zip file from that location on the instance to the managed folder

How can I do it in one step, without saving the file on the local instance?

The example provided in link #1 below uses the "get_scoring_python" method, which takes a 'filename' as input. One can provide a path plus a file name to save the zip file in a specific location. I might be wrong, but it seems that it does not accept a managed folder's 'folder_path' as the path, only a path on the local instance.
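For reference, the two-step route described above could look roughly like the sketch below. It is only an illustration under assumptions: `export_via_temp_file` is a hypothetical helper name, `model_details` stands for any object exposing `get_scoring_python(filename)`, and `folder` for a `dataiku.Folder`-like object exposing `upload_file(path, file_path)`. A temporary directory is used so that the intermediate file does not remain on the instance.

```python
import os
import tempfile


def export_via_temp_file(model_details, folder, file_name):
    """Two-step export: write the archive locally, then upload it.

    Hypothetical helper. Assumes model_details exposes
    get_scoring_python(filename) and folder exposes
    upload_file(path, file_path).
    """
    # The temporary directory is deleted when the block exits,
    # so the intermediate zip does not stay on the instance.
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, file_name)
        model_details.get_scoring_python(local_path)  # step 1: save locally
        folder.upload_file(file_name, local_path)     # step 2: upload to the folder
    return file_name
```

This still writes to the local disk for a moment, so it does not answer the one-step question; it only keeps the intermediate file from lingering.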


Thank you.

Location of the project:
/home/dataiku/lib/

Location of the managed folder:
/home/dataiku/data/managed_folder/xxxx

I have explored the following links:

1. https://developer.dataiku.com/latest/concepts-and-examples/ml.html

2. https://doc.dataiku.com/dss/latest/machine-learning/models-export.html

I have also checked the following threads, which discuss similar topics, but could not find the exact solution I need:

https://community.dataiku.com/t5/Using-Dataiku/Using-Models-Outside-DSS/m-p/32693

https://community.dataiku.com/t5/Using-Dataiku/Export-model-from-dataiku-jupyter-to-my-local-machine/m-p/21744

Best Answers

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker
    edited July 17 Answer ✓

    You can try to combine the model's get_scoring_python_stream with the managed folder's upload_stream, e.g.

    with model.get_scoring_python_stream() as s:
        folder.upload_stream(managedfolder_file_name, s)

  • radiantly
    radiantly Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 6
    edited July 17 Answer ✓

    Thank you. The method 'get_scoring_python_stream' works. Adding my complete code in case someone needs it in the future.

    import dataiku

    client = dataiku.api_client()
    project = client.get_default_project()

    # List the saved models to find the id of the one to export
    print(project.list_saved_models())
    sm_id = 'saved model id'
    saved_model = project.get_saved_model(sm_id)

    # Get the details of the active version, which expose the export methods
    version_id = saved_model.get_active_version()['id']
    model_details = saved_model.get_version_details(version_id)

    folder = dataiku.Folder('your managed folder name')

    managedfolder_file_name = 'model-archive.zip'

    # Stream the archive straight into the managed folder, no local file needed
    with model_details.get_scoring_python_stream() as s:
        folder.upload_stream(managedfolder_file_name, s)
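The streaming approach from the answers above can also be wrapped in a small reusable function, with an optional read-back check that the uploaded file is a readable zip. This is a sketch under assumptions rather than an official recipe: `export_scoring_zip` and `list_archive` are hypothetical helper names, `model_details` is expected to expose `get_scoring_python_stream()` as in the thread, and `folder` is expected to behave like `dataiku.Folder` (`upload_stream(path, file_like)` and `get_download_stream(path)`).

```python
import io
import zipfile


def export_scoring_zip(model_details, folder, file_name):
    """Stream the Python scoring archive into a managed folder (no local file)."""
    with model_details.get_scoring_python_stream() as stream:
        folder.upload_stream(file_name, stream)
    return file_name


def list_archive(folder, file_name):
    """Read the uploaded file back and return the names it contains.

    Raises zipfile.BadZipFile if the uploaded content is not a valid zip.
    """
    with folder.get_download_stream(file_name) as stream:
        data = stream.read()
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()


# Inside DSS this might be called as follows (names taken from the answer above):
#   details = saved_model.get_version_details(version_id)
#   folder = dataiku.Folder('your managed folder name')
#   export_scoring_zip(details, folder, 'model-archive.zip')
#   print(list_archive(folder, 'model-archive.zip'))
```

Because the helpers only rely on those method names, they can be exercised outside DSS with simple stand-in objects before being run against a real saved model.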

Answers

  • mhollenb
    mhollenb Registered Posts: 1

    I have a similar use case, and tried to follow radiantly's approach. However, the saved_model I get does not have get_scoring_python_stream. Looking at the docs for that function, I see that "This works provided that you have the license to do so and that the model is compatible with Python scoring." What license is needed to access this and other scoring output functions?
