Predict using a model in the Lab inside a notebook

Mohammed
Mohammed Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 43 ✭✭✭
edited July 2024 in Using Dataiku

I'm training a set of models as shown below. How do I access the best model I select at the end and use it to make predictions in the notebook? I can access saved models, but not models that exist only in the Lab.

import pandas as pd

if trained_model_MAPE > ERROR_THRESHOLD:
    # Wait for the ML task's guess phase to finish before editing settings
    mltask.wait_guess_complete()
    # Obtain settings and enable least-squares regression
    settings = mltask.get_settings()
    settings.set_algorithm_enabled("LEASTSQUARE_REGRESSION", True)
 
    # Collect every input feature that is not in current_features
    features_to_reject = []
    def handle_feature(feature_name, feature_params):
        if feature_name not in current_features and feature_params["role"] == 'INPUT':
            features_to_reject.append(feature_name)
        return feature_params
 
    settings.foreach_feature(handle_feature)
    # Use the wanted features and reject all the others
    for feature_name in current_features:
        settings.use_feature(feature_name)
    for feature_name in features_to_reject:
        settings.reject_feature(feature_name)
 
    settings.save()
    mltask.start_train()
    mltask.wait_train_complete()
    # Get the identifiers of the trained models and record each model's MAPE
    ids = mltask.get_trained_models_ids()
    mape_list = []
    for model_id in ids:
        details = mltask.get_trained_model_details(model_id)
        algorithm = details.get_modeling_settings()["algorithm"]
        mape = details.get_performance_metrics()["mape"]
        print(f"Algorithm={algorithm} MAPE={mape}")
        mape_list.append(mape)

    # Select the best model (lowest MAPE); kept inside the if block
    # because ids and mape_list are only defined here
    best_model_index = pd.Series(mape_list).idxmin()
    # Full ID of the best model, to deploy or to predict with
    model_to_deploy = ids[best_model_index]
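
The selection step at the end of the snippet can also be written without pandas. The following is only a minimal sketch: `pick_best_model` is a hypothetical helper, not part of the Dataiku API, and the IDs and MAPE values are made-up placeholders standing in for the output of `get_trained_models_ids()` and `get_performance_metrics()`.

```python
def pick_best_model(model_ids, mape_values):
    """Return (model_id, mape) for the entry with the lowest MAPE.

    Hypothetical helper for illustration; both lists must be aligned
    (same order, same length), as they are in the loop above.
    """
    if len(model_ids) != len(mape_values):
        raise ValueError("model_ids and mape_values must have the same length")
    if not model_ids:
        raise ValueError("no trained models to choose from")
    # Index of the smallest MAPE; ties resolve to the first occurrence
    best_index = min(range(len(mape_values)), key=mape_values.__getitem__)
    return model_ids[best_index], mape_values[best_index]


# Placeholder values, not real DSS model IDs
example_ids = ["model-a", "model-b", "model-c"]
example_mapes = [0.21, 0.08, 0.13]

best_id, best_mape = pick_best_model(example_ids, example_mapes)
print(f"Best model: {best_id} (MAPE={best_mape})")
```

Returning the ID together with its metric makes it easy to log which model was chosen and why.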


Operating system used: Windows



Answers

  • AdrienL
    AdrienL Dataiker, Alpha Tester Posts: 196 Dataiker
    edited July 2024

    There is no supported way to do this; it normally requires the model to be deployed to the Flow (as a Saved Model version).
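
The supported route mentioned here (deploy to the Flow, then predict with the saved model) might look roughly like the sketch below. It assumes that `mltask.deploy_to_flow(model_id, model_name, train_dataset)` returns a dict containing a `"savedModelId"` key, as described in the public API documentation. `deploy_lab_model` is a hypothetical wrapper, and `"best_model"` and `"train_data"` are placeholder names.

```python
def deploy_lab_model(mltask, model_id, model_name, train_dataset):
    """Deploy a Lab model to the Flow and return the new saved model's ID.

    Hypothetical wrapper; assumes deploy_to_flow returns a dict that
    contains a "savedModelId" entry.
    """
    result = mltask.deploy_to_flow(model_id, model_name, train_dataset)
    return result["savedModelId"]


# Inside a DSS notebook (not runnable outside DSS), usage could look like:
#
#   import dataiku
#   saved_model_id = deploy_lab_model(mltask, model_to_deploy,
#                                     "best_model", "train_data")
#   predictor = dataiku.Model(saved_model_id).get_predictor()
#   predictions = predictor.predict(df)  # df: pandas DataFrame of input rows
```

Keeping the deployment in a small function that receives `mltask` as an argument makes the step easy to reuse for whichever model ID is selected.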

    You can try the following, but it is unsupported: it may not work in some situations, or it may break in later versions of DSS.

    from dataiku.doctor.posttraining.model_information_handler import PredictionModelInformationHandler
    # model_id is the full ID of the Lab model, e.g. model_to_deploy from the question
    predictor = PredictionModelInformationHandler.from_full_model_id(model_id).get_predictor()

    You can then use the predictor as you would the predictor of a saved model, for example by calling predictor.predict(df) on a pandas DataFrame.
