
Difference between 'evaluate' recipe and python functions

Hi,

I have a question about the metric output of the 'evaluate' recipe. I created a RF model and then make a prediction on 'new data'. Then I use the 'evaluate' recipe to create the extra 'prediction_correct' column and to get the output file with the metrics. In this file you can find the accuracy, precision, recall, AUC, etc. The scores were much higher than expected, and when I calculated the accuracy etc. in a Jupyter notebook the scores were completely different. What am I doing wrong?
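For reference, a minimal sketch of how such metrics can be recomputed in a notebook with scikit-learn; the label, prediction, and probability values below are made up purely for illustration:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score

# Made-up ground-truth labels, hard predictions, and predicted
# probabilities for the positive class (binary classification)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
y_proba = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("auc      :", roc_auc_score(y_true, y_proba))  # AUC needs probabilities, not hard labels
```

Note that AUC must be computed from the predicted probabilities; feeding it the hard 0/1 predictions is a common reason for a mismatch with a tool's reported AUC.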
Dataiker
EDIT: The bug has been fixed as of DSS 4.1.3.

We have identified a bug and are working to resolve the issue.
Dataiker
Hi,

The bug is now fixed in DSS 4.1.3. Thanks for reporting this!
Dataiker

Hi Tamara, could you explain what exactly you did when you created the model, and how you used Jupyter in this context?

I want to make sure you used the same model and the same input data.

You may even want to post your code here, if it does not contain anything too sensitive.

Author
I will try to give an overview of the steps I took. However, the data is sensitive, so I cannot post any screenshots or code.
- I created a model (RF) in the analyses menu
- I deployed the model to the flow (I selected a train set and gave the model a name)
- Then I went to the dataset I want to predict on, clicked on it, and selected the 'predict' recipe
- I chose the input dataset (the dataset I want to predict on) and then selected the name of the model. The recipe was created
- Then I clicked on the scored dataset and used the 'evaluate' recipe. I selected the model and used the scored dataset as input. It makes no difference whether I select the scored dataset or the original dataset here.
- The 'evaluate' recipe outputs two datasets, one containing the metrics (recall, accuracy, etc.)
- Against our expectations, these metrics were quite high. So I investigated the other dataset that the evaluate recipe produces. I loaded this dataset in a Jupyter notebook and used 'recall_score()', 'precision_score()', etc. from sklearn. The resulting scores differ from the evaluate recipe's metrics. This is also the case if I export the file to Excel and calculate the confusion matrix there.
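The cross-check described above can be sketched as follows; the column names 'target' and 'prediction' and the data are hypothetical, since the actual output schema of the scored dataset was not posted:

```python
import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical scored output: 'target' holds the true labels,
# 'prediction' the model's predicted labels
scored = pd.DataFrame({
    "target":     [1, 0, 1, 1, 0, 0],
    "prediction": [1, 0, 0, 1, 0, 1],
})

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(scored["target"], scored["prediction"])
print(cm)

# Per-class precision/recall/F1 in one report, for comparison
# against the metrics dataset produced by the evaluate recipe
print(classification_report(scored["target"], scored["prediction"]))
```

When comparing results like this, it is worth verifying that both sides use the same probability threshold and the same positive class, as either difference will change every derived metric.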

I hope you can help me 🙂