API equivalent of "Drop existing sets, recompute new ones" when retraining a model.

gnaldi62
gnaldi62 Partner, L2 Designer, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 79 Neuron

Hi,

Is there a way to apply, via the API and code, the same behaviour as the "Drop existing sets, recompute new ones" option shown in the image below? Thanks, regards.

Giuseppe

Immagine 2021-02-23 164200.png

Best Answer

  • arnaudde
    arnaudde Dataiker Posts: 52 Dataiker
    edited July 17 Answer ✓

    Hello Giuseppe,
    There is no supported way to do that.
    However, the following code snippet achieves the same goal. Beware that this code relies on DSS internals and might stop working in the future.
    I have added a request to expose this option in the Python API to our backlog.

    # client is an existing DSSClient handle (e.g. dataikuapi.DSSClient(host, api_key))
    p = client.get_project('MYPROJECT')
    ml_task = p.get_ml_task("KGmcIliw", "PW0l9Nm8")
    ml_task_settings = ml_task.get_settings()
    # Bumping this internal counter forces DSS to drop and recompute the train/test splits
    ml_task_settings.get_raw()['splitParams']['instanceIdRefresher'] += 1
    ml_task_settings.save()
    ml_task.start_train()

    Best,
    Arnaud

Answers

  • gnaldi62

    Many thanks. Regards.

    Giuseppe

  • gnaldi62

    Hi, sorry to be back again, but this doesn't seem to work properly... I print the value and see it increase by one, but the training fails. If I do the same from the UI ("Drop existing sets, recompute new ones"), the training works. The strange thing is that sometimes, after a few attempts, it starts working. Does the counter need a specific value to work?

    Txs. Rgds.

    Giuseppe

  • gnaldi62

    def train_deploy_models(all_saved_models, models_to_deploy):
        # this_project is a DSSProject handle obtained earlier via client.get_project(...)
        exit_status = 0
        for smod in all_saved_models:
            if smod['name'] in models_to_deploy:
                print("Training model %s" % smod['name'])
                # models_to_deploy alternates model name / algorithm to deploy
                algorithm_index = models_to_deploy.index(smod['name']) + 1
                algorithm_to_deploy = models_to_deploy[algorithm_index]
                current_model = smod['id']
                current_saved_model = this_project.get_saved_model(current_model)
                current_ml_task = current_saved_model.get_origin_ml_task()
                ml_task_settings = current_ml_task.get_settings()
                print(ml_task_settings.get_raw()['splitParams']['instanceIdRefresher'])
                ml_task_settings.get_raw()['splitParams']['instanceIdRefresher'] += 1
                print(ml_task_settings.get_raw()['splitParams']['instanceIdRefresher'])
                ml_task_settings.save()
                nr_attempts = 0
                while nr_attempts < 2:
                    try:
                        list_trained = current_ml_task.train()
                        for jj in list_trained:
                            current_algorithm = current_ml_task.get_trained_model_details(jj).get_raw()["modeling"]["algorithm"]
                            if current_algorithm == algorithm_to_deploy:
                                current_ml_task.redeploy_to_flow(jj, saved_model_id=current_model)
                                break
                        break  # training succeeded, stop retrying
                    except Exception:
                        exit_status = 1
                        nr_attempts += 1
                        # raise Exception("MOD-01: Error with training the model %s " % smod['name'])
                if exit_status > 0:
                    print("Error with model %s" % smod['name'])
        return exit_status

  • arnaudde

    Hello,
    Could you please share the error you get and attach the training logs?
    Thanks

  • gnaldi62
    edited July 17
    Hi, below is one of the logs. We know that there is a null column, but if we retrain the
    model with the "Drop existing sets, recompute new ones" checkbox ticked, the training runs successfully.
    So what we need is to do the same as the UI, but programmatically. Txs. Rgds. Giuseppe
    ...
    Traceback (most recent call last):
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/server.py", line 46, in serve
        ret = api_command(arg)
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/dkuapi.py", line 45, in aux
        return api(**kwargs)
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/commands.py", line 271, in train_prediction_models_nosave
        train_df = df_from_split_desc(split_desc, "train", preprocessing_params['per_feature'], core_params["prediction_type"])
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/utils/split.py", line 59, in df_from_split_desc
        df = df_from_split_desc_no_normalization(split_desc, split, feature_params, prediction_type)
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/utils/split.py", line 19, in df_from_split_desc_no_normalization
        return load_df_no_normalization(f, split_desc["schema"], feature_params, prediction_type)
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/utils/split.py", line 28, in load_df_no_normalization
        prediction_type=prediction_type)
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/utils/__init__.py", line 78, in ml_dtypes_from_dss_schema
        feature_params["role"], prediction_type)
      File "/mnt/disks/datadir/dataiku-dss-8.0.2/python/dataiku/doctor/utils/__init__.py", line 60, in ml_dtype_from_dss_column
        raise safe_exception(ValueError, u"Cannot treat column {} as numeric ({})".format(safe_unicode_str(schema_column["name"]), reason))
    ValueError: Cannot treat column BDFL_SAL_D_ASP_MTG_New_LAG12_PROXY_CONV as numeric (its type is string)

    [2021/02/24-11:40:08.361] [KNL-python-single-command-kernel-monitor-582175] [INFO] [dku.kernels] - Process done with code 0
    [2021/02/24-11:40:08.363] [KNL-python-single-command-kernel-monitor-582175] [INFO] [dip.tickets] - Destroying API ticket for analysis-ml-FR_SF_SCORE-9bdcDC3 on behalf of terrico
    [2021/02/24-11:40:08.363] [KNL-python-single-command-kernel-monitor-582175] [WARN] [dku.resource] - stat file for pid 31388 does not exist. Process died?
    [2021/02/24-11:40:08.364] [KNL-python-single-command-kernel-monitor-582175] [INFO] [dku.resourceusage] - Reporting completion of CRU:{"context":{"type":"ANALYSIS_ML_TRAIN","authIdentifier":"terrico","projectKey":"FR_SF_SCORE","analysisId":"xYiPqA2f","mlTaskId":"ZqCGSjjW","sessionId":"s21"},"type":"LOCAL_PROCESS","id":"0lKuN8MwsQuacNwm","startTime":1614166805429,"localProcess":{"pid":31388,"commandName":"/mnt/disks/datadir/dataiku_data/bin/python","cpuUserTimeMS":10,"cpuSystemTimeMS":0,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":10,"cpuCurrent":0.0,"vmSizeMB":21,"vmRSSMB":4,"vmHWMMB":4,"vmRSSAnonMB":2,"vmDataMB":2,"vmSizePeakMB":21,"vmRSSPeakMB":4,"vmRSSTotalMBS":0,"majorFaults":0,"childrenMajorFaults":0}}
    [2021/02/24-11:40:08.364] [MRT-582169] [INFO] [dku.kernels] - Getting kernel tail
    [2021/02/24-11:40:08.368] [MRT-582169] [INFO] [dku.kernels] - Trying to enrich exception: com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to train : <type 'exceptions.ValueError'> : Cannot treat column BDFL_SAL_D_ASP_MTG_New_LAG12_PROXY_CONV as numeric (its type is string) from kernel com.dataiku.dip.analysis.coreservices.AnalysisMLKernel@29e58718 process=null pid=?? retcode=0
    [2021/02/24-11:40:08.368] [MRT-582169] [WARN] [dku.analysis.ml.python] - Training failed
    com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to train : <type 'exceptions.ValueError'> : Cannot treat column BDFL_SAL_D_ASP_MTG_New_LAG12_PROXY_CONV as numeric (its type is string)
        at com.dataiku.dip.io.SocketBlockLinkInteraction.throwExceptionFromPython(SocketBlockLinkInteraction.java:302)
        at com.dataiku.dip.io.SocketBlockLinkInteraction$AsyncResult.checkException(SocketBlockLinkInteraction.java:215)
        at com.dataiku.dip.io.SocketBlockLinkInteraction$AsyncResult.get(SocketBlockLinkInteraction.java:190)
        at com.dataiku.dip.io.SingleCommandKernelLink$1.call(SingleCommandKernelLink.java:208)
        at com.dataiku.dip.analysis.ml.prediction.PredictionTrainAdditionalThread.process(PredictionTrainAdditionalThread.java:74)
        at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:143)
    [2021/02/24-11:40:10.878] [FT-TrainWorkThread-7zCgQOd3-582158] [INFO] [dku.analysis.ml.python] T-ZqCGSjjW - Processing thread joined ...
    [2021/02/24-11:40:10.879] [FT-TrainWorkThread-7zCgQOd3-582158] [INFO] [dku.analysis.ml.python] T-ZqCGSjjW - Joining processing thread ...
    [2021/02/24-11:40:10.880] [FT-TrainWorkThread-7zCgQOd3-582158] [INFO] [dku.analysis.ml.python] T-ZqCGSjjW - Processing thread joined ...
    [2021/02/24-11:40:10.880] [FT-TrainWorkThread-7zCgQOd3-582158] [INFO] [dku.analysis.prediction] T-ZqCGSjjW - Train done
    [2021/02/24-11:40:10.881] [FT-TrainWorkThread-7zCgQOd3-582158] [INFO] [dku.analysis.prediction] T-ZqCGSjjW - Train done
    [2021/02/24-11:40:10.889] [FT-TrainWorkThread-7zCgQOd3-582158] [INFO] [dku.analysis.prediction] T-ZqCGSjjW - Publishing mltask-train-done reflected event

  • gnaldi62

    And here is a snapshot from the analysis. The failed training session is the one run from the Python program (the function I sent you earlier), while the last good one ran afterwards from the UI with that checkbox ticked. GN

    AAAAAA.png

  • arnaudde

    Can you check and share the feature handling for the failing feature (i.e. BDFL_SAL_D_ASP_MTG_New_LAG12_PROXY_CONV), both in the current_ml_task variable in your code and in the UI? I suspect it has changed.

    When you retrain from the UI, you use the latest settings from the Design tab, whereas in your API training you take the settings that were used when you trained and deployed the original model. You should not expect the same behaviour if some design settings have changed.

    To make sure your splits were recomputed, you can check which split was used by going to the model result page > Training information > Train & test sets > Generated.

  • gnaldi62

    Hi, here is the configuration from the API (it's for another variable, but the problem is exactly the same). This is the configuration BEFORE running the snippet you originally sent:

    'BDFL_ACT_D_RR_ASP_3M_PROXY_CONV': {
    'generate_derivative': False,
    'numerical_handling': 'REGULAR',
    'missing_handling': 'IMPUTE',
    'missing_impute_with': 'MEAN',
    'impute_constant_value': 0.0,
    'rescaling': 'AVGSTD',
    'quantile_bin_nb_bins': 4,
    'binarize_threshold_mode': 'MEDIAN',
    'binarize_constant_threshold': 0.0,
    'role': 'INPUT',
    'type': 'NUMERIC',
    'customHandlingCode': '',
    'customProcessorWantsMatrix': False,
    'sendToInput': 'main'}

    and this is AFTER:

    'BDFL_ACT_D_RR_ASP_3M_PROXY_CONV': {
    'category_handling': 'DUMMIFY',
    'missing_handling': 'NONE',
    'missing_impute_with': 'MODE',
    'dummy_clip': 'MAX_NB_CATEGORIES',
    'cumulative_proportion': 0.95,
    'min_samples': 10,
    'max_nb_categories': 100,
    'max_cat_safety': 200,
    'nb_bins_hashing': 1048576,
    'dummy_drop': 'NONE',
    'role': 'REJECT',
    'type': 'CATEGORY',
    'customHandlingCode': '',
    'customProcessorWantsMatrix': False,
    'sendToInput': 'main'}

    And here what in the UI for the same model:

    FEAT_UI.png

    What is not clear to me is how to pick up the new UI settings from the API (we have thousands of such features). Rgds. Giuseppe
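    With thousands of features, the changed ones can be found by diffing the two per-feature settings dicts. This is a minimal pure-Python sketch, not part of the Dataiku API; it assumes the per-feature settings have been read from the ML task settings (e.g. from get_raw()['preprocessing']['per_feature']), and the example dicts below are trimmed copies of the BEFORE/AFTER settings above:

```python
def diff_feature_settings(before, after, keys=("role", "type", "missing_handling")):
    """Return {feature: {key: (old, new)}} for features whose handling changed."""
    changes = {}
    # Only compare features present in both snapshots
    for feat in before.keys() & after.keys():
        changed = {k: (before[feat].get(k), after[feat].get(k))
                   for k in keys
                   if before[feat].get(k) != after[feat].get(k)}
        if changed:
            changes[feat] = changed
    return changes

# Trimmed versions of the BEFORE/AFTER per-feature settings shown above
before = {'BDFL_ACT_D_RR_ASP_3M_PROXY_CONV':
          {'role': 'INPUT', 'type': 'NUMERIC', 'missing_handling': 'IMPUTE'}}
after = {'BDFL_ACT_D_RR_ASP_3M_PROXY_CONV':
         {'role': 'REJECT', 'type': 'CATEGORY', 'missing_handling': 'NONE'}}

print(diff_feature_settings(before, after))
```

    Each entry maps the changed key to an (old, new) pair, so a single pass over the two snapshots tells you which features the UI's guessing changed and how.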

  • gnaldi62

    Hi, it seems to be working now. I removed the failed sessions, added a sleep after saving the settings and retried training the model. I don't know which of these fixed the issue, but now all the models can be retrained and redeployed.

    Txs. Rgds

    Giuseppe

  • gnaldi62

    Just to let you know: we managed to make the code work, but a minimal manual intervention was needed. The steps we follow are:

    1) duplicate the project;

    2) from the code, populate the train and test datasets and retrain the saved models a first time (with the snippet you suggested);

    3) if some trainings fail, we go into the analysis and remove the failed sessions from the GUI;

    4) we then go back to the code and rerun the piece of code that retrains the saved models.

    This seems to work almost always (i.e. unless the model is really bad).

    Rgds. Giuseppe

  • arnaudde

    My guess is that your input dataset has changed and that the feature handling needed to be updated. The start_train method of the API will not update it automatically, whereas opening the UI and launching the training will.

    When creating an ML task on a dataset with an empty column, the column is automatically rejected and its role set to 'REJECT'. In the origin ML task settings you shared for 'BDFL_ACT_D_RR_ASP_3M_PROXY_CONV', the feature is not rejected, so the column was probably not mostly empty when you first trained. Your new settings show the feature is now rejected, which means the column is probably empty now. So I think your column data has changed significantly.
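    This empty-column scenario can be checked up front before retraining. A minimal pandas sketch (not a Dataiku API call; it assumes the training data is already available as a DataFrame, e.g. one read from the input dataset):

```python
import pandas as pd

def find_empty_columns(df):
    """Columns that are entirely null -- the kind the guessing would set to role 'REJECT'."""
    return [col for col in df.columns if df[col].isna().all()]

# Toy example: one fully empty column, one normal numeric column
df = pd.DataFrame({
    "BDFL_ACT_D_RR_ASP_3M_PROXY_CONV": [None, None, None],
    "other_feature": [1.0, 2.0, 3.0],
})

print(find_empty_columns(df))  # -> ['BDFL_ACT_D_RR_ASP_3M_PROXY_CONV']
```

    Running such a check on the refreshed input data would flag the columns whose handling is about to change before the API training fails on them.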

    When you use the UI, the guessing system is automatically called when the analysis is opened. This does not happen when calling ml_task.get_settings(). Therefore, if your dataset changed and you train from the UI you don't have any problem, but training from the API without updating the feature handling will fail.

    You can apply the guessing system with the API guess method.

    In the steps you just mentioned, I therefore think the step that makes it work is opening the analysis that failed (i.e. step 3).

    Best,
    Arnaud

  • gnaldi62

    OK. Is there a guess option that does not change the already-chosen algorithms? The doc mentions different levels, but it is not clear what each one does (when I applied it, it changed the algorithm selection). Thanks. Rgds, Giuseppe

  • arnaudde
    edited July 17

    There is no option that will keep your algorithm settings, but you can save the algorithm settings (and any other settings that should not change) and override the re-guessed ML task settings with them. Here is a code sample:

    ml_task = p.get_ml_task("2oftsf46", "ITJzkpmW")
    ml_task_settings = ml_task.get_settings()
    # Keep a copy of the algorithm settings before re-guessing
    algorithm_settings = ml_task_settings.get_raw()["modeling"]
    ml_task.guess()
    # Reload the guessed settings and restore the saved algorithm settings
    ml_task_settings = ml_task.get_settings()
    ml_task_settings.get_raw()["modeling"] = algorithm_settings
    ml_task_settings.save()

  • gnaldi62

    Great: it worked! (I just had to add a line fetching the settings with get_settings() before the assignment to algorithm_settings.)

    Many thanks. Regards,

    Giuseppe
