Can someone tell me how I can apply LabelEncoding to a categorical variable?
I'll borrow the answer from one of our devs.
- Create a file in your project library, under the python folder, and call it my_label_encoder.py
- Populate it with this modification of the scikit-learn LabelEncoder:
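    from sklearn import preprocessing

    class MyLabelEncoder(preprocessing.LabelEncoder):
        def transform(self, X):
            transformed_X = super(MyLabelEncoder, self).transform(X)
            # Turn the 1-D result of shape (n,) into a 2-D column of shape (n, 1)
            return transformed_X.reshape(transformed_X.shape[0], 1)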
- Then, in the Design tab of the ML task of your analysis, under feature handling, select the relevant feature and check that it's categorical.
- Under Category Handling, select "custom preprocessing" and enter this:
from my_label_encoder import MyLabelEncoder
processor = MyLabelEncoder()
The reason for doing it this way is that the vanilla encoder returns a one-dimensional array, but DSS expects a 2-D array. As you can see, the modified encoder just reshapes the output from shape (n,) to shape (n, 1).
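If you want to see the shape difference outside DSS, here is a small sketch you can run in a plain Python session. The sample values in `colors` are made up for illustration, and the subclass is redefined inline so the snippet is self-contained; it uses `reshape(-1, 1)`, which should be equivalent to the reshape in my_label_encoder.py.

```python
from sklearn import preprocessing


class MyLabelEncoder(preprocessing.LabelEncoder):
    def transform(self, X):
        transformed_X = super().transform(X)
        # -1 lets NumPy infer the row count; the result is a single column
        return transformed_X.reshape(-1, 1)


# Hypothetical sample data, not taken from any real dataset
colors = ["red", "green", "blue", "green"]

vanilla = preprocessing.LabelEncoder().fit(colors)
custom = MyLabelEncoder().fit(colors)

vanilla_out = vanilla.transform(colors)
custom_out = custom.transform(colors)

print(vanilla_out.shape)  # 1-D: (4,)
print(custom_out.shape)   # 2-D: (4, 1)
```

Note that only `transform` is overridden here, so this sketch says nothing about what `fit_transform` returns on the subclass.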