How do I build a model on a partitioned dataset in Dataiku using the API?
I'm using the code below to develop the model. How do I modify it to build a partitioned model?
The "trainset" dataset is partitioned on the column "Market".
# client is a DSS API client
p = client.get_project("MYPROJECT")
# Create a new ML Task to predict the variable "target" from "trainset"
mltask = p.create_prediction_ml_task(
    input_dataset="trainset",
    target_variable="target",
    ml_backend_type='PY_MEMORY',  # ML backend to use
    guess_policy='DEFAULT'  # Template to use for setting default parameters
)
# Wait for the ML task to be ready
mltask.wait_guess_complete()
# Obtain settings, enable GBT, save settings
settings = mltask.get_settings()
settings.set_algorithm_enabled("GBT_CLASSIFICATION", True)
settings.save()
# Start train and wait for it to be complete
mltask.start_train()
mltask.wait_train_complete()
# Get the identifiers of the trained models
# There will be 3 of them because Logistic regression and Random forest were default enabled
ids = mltask.get_trained_models_ids()
for id in ids:
    details = mltask.get_trained_model_details(id)
    algorithm = details.get_modeling_settings()["algorithm"]
    auc = details.get_performance_metrics()["auc"]
    print("Algorithm=%s AUC=%s" % (algorithm, auc))
# Let's deploy the first model
model_to_deploy = ids[0]
ret = mltask.deploy_to_flow(model_to_deploy, "my_model", "trainset")
print("Deployed to saved model id = %s train recipe = %s" % (ret["savedModelId"], ret["trainRecipeName"]))
I appreciate any help you can provide.
Operating system used: Windows
Hi,
You can try the following before saving the mltask's settings:
settings.get_raw()['partitionedModel']['enabled'] = True
settings.save()
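To make the placement concrete, here is a minimal sketch of the same change wrapped in a small helper. The helper name `enable_partitioned_model` and the stand-in dict `example_raw` are hypothetical and only for illustration; the `partitionedModel` and `enabled` keys are the ones from the reply above. The `setdefault` guard is an assumption in case that section is missing from the raw settings.

```python
def enable_partitioned_model(raw_settings):
    """Switch on partitioned training in a raw ML task settings dict.

    Hypothetical helper: it only wraps the one-line change from the reply.
    """
    # setdefault is a defensive assumption; the reply indexes the key directly.
    partitioned = raw_settings.setdefault("partitionedModel", {})
    partitioned["enabled"] = True
    return raw_settings


# Stand-in for the dict returned by settings.get_raw(); the real one has many more keys.
example_raw = {"partitionedModel": {"enabled": False}}
enable_partitioned_model(example_raw)
print(example_raw["partitionedModel"]["enabled"])
```

In the original script this would presumably be called as `enable_partitioned_model(settings.get_raw())` right after `settings = mltask.get_settings()` and before `settings.save()`, so the flag is in place when `mltask.start_train()` runs.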