Model feature column reduction

bnichols

I have created a random forest model using the lasso regression feature reduction. This model took around 10 minutes to run. I need this model to run a lot faster and was wondering how to see what feature columns the lasso regression removed from the model, so I can remove these columns manually from my next model to achieve the same result, but a lot faster.

Any help would be appreciated.

Answers

  • JordanB (Dataiker)

    Hi @bnichols,

    After a model has been trained, you can check which features it used and which features were rejected under "Features".

    You can also retrieve the list of selected columns with the DSS Python API by first obtaining a handle to the ML task: https://developer.dataiku.com/latest/concepts-and-examples/ml.html#obtaining-a-handle-to-an-existing-ml-task
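
    As a rough sketch of that approach (not a confirmed recipe), the snippet below reads the per-feature roles stored with a trained model and groups the column names by role. The helper names `split_features_by_role` and `fetch_preprocessing` are hypothetical, the project key and IDs are placeholders, and the `"per_feature"` / `"role"` layout of the settings dictionary is an assumption. It is also not verified here that these roles reflect the columns dropped by lasso reduction, so the "Features" view of the trained model remains the reference.

```python
def split_features_by_role(preprocessing):
    """Group column names by the role recorded in a preprocessing-settings dict.

    Assumption: the dict has a "per_feature" mapping of
    column name -> {"role": "INPUT" | "REJECT" | "TARGET" | ...}.
    """
    per_feature = preprocessing.get("per_feature", {})
    by_role = {}
    for name, params in per_feature.items():
        by_role.setdefault(params.get("role", "UNKNOWN"), []).append(name)
    return {role: sorted(names) for role, names in by_role.items()}


def fetch_preprocessing(project_key, analysis_id, mltask_id, model_id=None):
    """Hypothetical helper: load the preprocessing settings of a trained model."""
    import dataiku  # imported here because it is only available inside DSS

    client = dataiku.api_client()
    project = client.get_project(project_key)
    mltask = project.get_ml_task(analysis_id, mltask_id)
    if model_id is None:
        # Fall back to the first trained model of the ML task.
        model_id = mltask.get_trained_models_ids()[0]
    details = mltask.get_trained_model_details(model_id)
    return details.get_preprocessing_settings()


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the result.
    sample = {
        "per_feature": {
            "age": {"role": "INPUT"},
            "customer_id": {"role": "REJECT"},
            "churned": {"role": "TARGET"},
        }
    }
    print(split_features_by_role(sample))
```

    Inside DSS, `split_features_by_role(fetch_preprocessing("MYPROJECT", "analysis_id", "mltask_id"))` would be the equivalent call, with the three arguments replaced by your own values.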

    Note that with feature reduction enabled, the model still expects all of the original features as input. Leaving them out will generally cause an error; for example, scoring a dataset without them fails with a missing-columns error. Also, if you retrain the model on new data, a different subset of features may be selected the next time. To freeze the variables used, you would need to select the desired features yourself and then retrain, but this will not work while feature reduction is selected.
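
    To illustrate the "select the features and retrain" route, here is a minimal sketch that keeps a fixed list of columns and rejects the rest. `freeze_features` is a hypothetical helper, the column names in the usage comment are made up, and the `"preprocessing"` / `"per_feature"` layout of the raw settings is an assumption. It presumes feature reduction has already been turned off in the model design, which is not shown here.

```python
def freeze_features(settings, keep):
    """Use only the columns in `keep` as inputs and reject every other one.

    `settings` is expected to behave like the object returned by
    mltask.get_settings(); target and weight columns are left untouched.
    Returns the sorted list of columns that were rejected.
    """
    keep = set(keep)
    # Assumption: raw settings store roles under preprocessing -> per_feature.
    per_feature = settings.get_raw()["preprocessing"]["per_feature"]
    rejected = []
    for name, params in per_feature.items():
        if params.get("role") in ("TARGET", "WEIGHT"):
            continue  # never reject the target or sample-weight column
        if name in keep:
            settings.use_feature(name)
        else:
            settings.reject_feature(name)
            rejected.append(name)
    return sorted(rejected)


# Possible usage inside DSS, given an ML task handle `mltask`:
#   settings = mltask.get_settings()
#   freeze_features(settings, ["age", "income"])  # hypothetical column names
#   settings.save()
#   mltask.start_train()
```

    Because the kept columns are fixed by hand, a later retrain will not pick a different subset, and the scoring dataset only needs those columns.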
