
Multiclass to two-class / binary classification problem

pal

I have a class imbalance in my data:

A - 59%

B - 33%

C and D each 4%

I want to convert this into a two-class problem and classify each row as A or not A (where not A covers B, C, and D).

I tried using "manually edit the mapping" on the Target tab in the model design. I assigned 1 to A and 0 to B, C, and D.

However, this does not work. Training still fails with this error:

Training failed

Read the logs
Failed to train : <class 'ValueError'> : This is not a binary classification, found 4 classes
 
The only solution I found was to create a new dataset, add a column populated with 1 and 0 as required, and then use that column as the target. This required a new recipe and a new dataset.
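For reference, the workaround described above amounts to something like the following pandas sketch. The column names `label` and `is_A` and the toy rows are hypothetical stand-ins, not taken from the actual dataset.

```python
import pandas as pd

# Toy data; "label" is a hypothetical name for the original 4-class target.
df = pd.DataFrame({"label": ["A", "B", "A", "C", "D", "A", "B"]})

# New binary target: 1 where the class is A, 0 for every other class (B, C, D).
df["is_A"] = (df["label"] == "A").astype(int)

# Show how many rows fall into each of the two classes.
print(df["is_A"].value_counts().to_dict())
```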
 
Is there a way to avoid this? Is there any setting on the design page that lets me map the multiclass values of the target to 0 and 1, so that I can proceed with two-class classification on the same table/dataset?
 
Many thanks
Pallavee

Operating system used: VM running on Windows
Emma
Dataiker

Hey @pal,

You can do this kind of data preparation for modelling from within the AutoML Lab, which eliminates the need for an additional recipe and output dataset. Navigate to the Script tab along the top bar, then use a 'Find and Replace' processor to map A to 1 and B, C, and D to 0.
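To make the mapping concrete, here is a rough Python sketch of the same replacement rules. It only illustrates the logic and is not how the processor is implemented; the `mapping` dict, the `to_binary` helper, and the sample values are all hypothetical.

```python
import pandas as pd

# Hypothetical rules mirroring the replacement: A -> 1, and B, C, D -> 0.
mapping = {"A": 1, "B": 0, "C": 0, "D": 0}

def to_binary(target: pd.Series) -> pd.Series:
    """Map each class label through `mapping`; fail loudly on unmapped labels."""
    mapped = target.map(mapping)
    if mapped.isna().any():
        unknown = target[mapped.isna()].unique().tolist()
        raise ValueError(f"Unmapped target values: {unknown}")
    return mapped.astype(int)

target = pd.Series(["A", "B", "C", "D", "A"], name="target")
binary = to_binary(target)

# After the mapping only two distinct classes remain.
print(binary.nunique())
```

Raising on unmapped labels is a deliberate choice in this sketch: a stray fifth class would otherwise slip through as a missing value instead of being noticed before training.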

Screenshots attached. 

Hope that helps,

Emma
