Selecting features in ML task


I'm training a set of models as shown below. I want to use only one variable, 'feature1', for training, but it appears that all the columns in the data are used. How do I include only this feature when training?


if trained_model_MAPE > ERROR_THRESHOLD:

    # Wait for the ML task to be ready
    mltask.wait_guess_complete()

    # Obtain settings, enable GBT, and save settings
    settings = mltask.get_settings()
    settings.set_algorithm_enabled("GBT_REGRESSION", True)
    settings.save()

    # Start training and wait for it to be complete
    mltask.start_train()
    mltask.wait_train_complete()

    # Get the identifiers of the trained models
    # There will be 3 of them because Logistic regression and Random forest were default enabled, plus GBT enabled above
    ids = mltask.get_trained_models_ids()
    mape_list = []
    for id in ids:
        details = mltask.get_trained_model_details(id)
        algorithm = details.get_modeling_settings()["algorithm"]
        mape = details.get_performance_metrics()["mape"]
        print(f"Algorithm={algorithm} MAPE={mape}")



Operating system used: Windows


As with algorithms, features are enabled by default. You can use reject_feature to disable the ones you don't want.

For instance, using foreach_feature to iterate on all features:



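features_to_use = ['feature1']
features_to_reject = []

def handle_feature(feature_name, feature_params):
    # Collect every input feature that is not explicitly kept; the target is left alone
    if feature_name not in features_to_use and feature_params["role"] == 'INPUT':
        features_to_reject.append(feature_name)
    # The callback must return the feature parameters
    return feature_params

settings.foreach_feature(handle_feature)

for feature_name in features_to_use:
    settings.use_feature(feature_name)
for feature_name in features_to_reject:
    settings.reject_feature(feature_name)

settings.save()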
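
To double-check the outcome before training, one option is to list each feature's role from the raw settings. This is only a sketch: it assumes the raw settings dict (for example, what settings.get_raw() returns) keeps per-feature settings under "preprocessing" / "per_feature" with a "role" key. roles_by_feature is a hypothetical helper, and sample_raw is made-up data, not real DSS output.

```python
def roles_by_feature(raw_settings):
    """Map each feature name to its role (e.g. INPUT, REJECT, TARGET)."""
    # Assumed layout: raw_settings["preprocessing"]["per_feature"][name]["role"]
    per_feature = raw_settings["preprocessing"]["per_feature"]
    return {name: params["role"] for name, params in per_feature.items()}


# Made-up stand-in for the raw settings of an ML task (illustration only)
sample_raw = {
    "preprocessing": {
        "per_feature": {
            "feature1": {"role": "INPUT"},
            "feature2": {"role": "REJECT"},
            "y": {"role": "TARGET"},
        }
    }
}

roles = roles_by_feature(sample_raw)
inputs = [name for name, role in roles.items() if role == "INPUT"]
print(inputs)
```

With a real ML task, you would pass the task's raw settings instead of sample_raw and check that only 'feature1' is listed as an input.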





@AdrienL I'm facing the following error with the above solution 

DataikuException: com.dataiku.dip.exceptions.DSSInternalErrorException: Internal error, caused by: NullPointerException: null

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/opt/dataiku-dss-12.5.1/python/dataikuapi/ in _perform_http(self, method, path, params, body, stream, files, raw_body, headers)
   1450                     headers=headers)
-> 1451             http_res.raise_for_status()
   1452             return http_res

/opt/dataiku-dss-12.5.1/python39.packages/requests/ in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022

HTTPError: 500 Server Error: Server Error for url:

During handling of the above exception, another exception occurred:

DataikuException                          Traceback (most recent call last)
<ipython-input-367-3e542e8df9de> in <module>
     17     settings.foreach_feature(handle_feature)
     18
---> 19
     20
     21     # Start training and wait for it to be complete

/opt/dataiku-dss-12.5.1/python/dataikuapi/dss/ in save(self)
    600         """
    601
--> 602         self.client._perform_empty(
    603                 "POST", "/projects/%s/models/lab/%s/%s/settings" % (self.project_key, self.analysis_id, self.mltask_id),
    604                 body = self.mltask_settings)

/opt/dataiku-dss-12.5.1/python/dataikuapi/ in _perform_empty(self, method, path, params, body, files, raw_body)
   1459
   1460     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):
-> 1461         self._perform_http(method, path, params=params, body=body, files=files, stream=False, raw_body=raw_body)
   1462
   1463     def _perform_text(self, method, path, params=None, body=None,files=None, raw_body=None):

/opt/dataiku-dss-12.5.1/python/dataikuapi/ in _perform_http(self, method, path, params, body, stream, files, raw_body, headers)
   1456             except ValueError:
   1457                 ex = {"message": http_res.text}
-> 1458             raise DataikuException("%s: %s" % (ex.get("errorType", "Unknown error"), ex.get("detailedMessage", ex.get("message", "No message"))))
   1459
   1460     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):

DataikuException: com.dataiku.dip.exceptions.DSSInternalErrorException: Internal error, caused by: NullPointerException: null


Yeah, I read the doc too fast: it states that the handle_feature function is supposed to return the feature parameters. Also, one should only reject input features; otherwise we risk rejecting the target (not a good idea). I rewrote the code above and rearranged it for clarity.
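
To illustrate why the return value matters, here is a small simulation in plain Python. It assumes that foreach_feature stores whatever the callback returns as the feature's new settings, which would explain the NullPointerException on save. apply_to_features is a hypothetical stand-in for that behavior, not the dataikuapi implementation, and the feature dicts are made up.

```python
def apply_to_features(per_feature, fn):
    """Hypothetical stand-in: replace each feature's params with fn's return value."""
    for name in list(per_feature):
        per_feature[name] = fn(name, per_feature[name])


def make_features():
    # Made-up per-feature settings: two inputs and a target
    return {
        "feature1": {"role": "INPUT"},
        "feature2": {"role": "INPUT"},
        "y": {"role": "TARGET"},
    }


def bad_handler(name, params):
    # Changes the role but forgets to return, so Python returns None
    if name != "feature1":
        params["role"] = "REJECT"


def good_handler(name, params):
    # Only touches input features, so the target is never rejected
    if name != "feature1" and params["role"] == "INPUT":
        params["role"] = "REJECT"
    return params


broken = make_features()
apply_to_features(broken, bad_handler)   # every entry becomes None

fixed = make_features()
apply_to_features(fixed, good_handler)   # feature2 rejected, target untouched

print(broken)
print(fixed)
```

Under that assumption, the first handler wipes out every feature's settings, while the second keeps them intact and leaves the target's role unchanged.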
