Difference between 'evaluate' recipe and python functions

Solved!
tamarapepping
Level 1
Hi,

I have a question about the metric output of the 'evaluate' recipe. I created an RF model and then made a prediction on new data. Then I used the 'evaluate' recipe to create the extra 'prediction_correct' column and to get the output file with the metrics. In this file you can find the accuracy, precision, recall, AUC, etc. The scores were much higher than expected, and when I calculated the accuracy etc. in a Jupyter notebook the scores were completely different. What am I doing wrong?
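
For illustration only (the original code was not shared), a minimal sketch of this kind of cross-check in a DSS Jupyter notebook; the dataset name 'evaluation_output' and the 'target'/'prediction' column names are placeholders, only 'prediction_correct' comes from the post above:

# Illustrative sketch only: recompute accuracy from the evaluate recipe's
# row-level output and compare it with the metrics dataset.
# "evaluation_output", "target" and "prediction" are placeholder names.
import dataiku

# Row-level output of the evaluate recipe (contains the prediction_correct flag)
df = dataiku.Dataset("evaluation_output").get_dataframe()

# Accuracy is the share of rows where the prediction matched the target,
# i.e. the mean of the boolean prediction_correct column
accuracy_from_flag = df["prediction_correct"].mean()

# Cross-check directly from the target and predicted columns
accuracy_direct = (df["target"] == df["prediction"]).mean()

print(accuracy_from_flag, accuracy_direct)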

4 Replies
matthias_funke
Dataiker Alumni

Hi Tamara, could you explain what exactly you did when you created the model, and how you used Jupyter in this context?

I want to make sure you used the same model and the same input data.

You may even want to post your code here, if it does not contain anything too sensitive.
tamarapepping
Level 1
Author
I will try to give an overview of the steps I took. However, the data is sensitive, so I cannot post any screenshots or code.
- I created a model (RF) in the analyses menu.
- I deployed the model to the flow (I selected a train set and gave the model a name).
- Then I went to the dataset I want to predict on, clicked on it and selected the 'predict' recipe.
- I chose the input dataset (the dataset I want to predict on) and then selected the name of the model. The recipe is created.
- Then I clicked on the scored dataset and used the 'evaluate' recipe. I selected the model and used the scored dataset as input. It makes no difference whether I select the scored dataset or the original dataset here.
- The 'evaluate' recipe produces two output datasets, one of which contains the metrics (recall, accuracy, etc.).
- Against our expectations, these metrics were quite high. So I investigated the other dataset that the evaluate recipe produces. I loaded this dataset into a Jupyter notebook and used recall_score(), precision_score(), etc. from sklearn (roughly along the lines of the sketch below). The scores are then different from the metrics. This is also the case if I export the file to Excel and calculate the confusion matrix there.
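
For illustration (the real code and column names were not shared), a minimal sketch of that notebook comparison; 'scored_export.csv', 'target' and 'prediction' are placeholder names:

# Illustrative sketch only: recompute the metrics from an export of the
# scored/evaluation dataset. Column names are placeholders.
import pandas as pd
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, confusion_matrix)

df = pd.read_csv("scored_export.csv")   # exported output of the evaluate recipe

y_true = df["target"]       # actual class labels
y_pred = df["prediction"]   # predicted class labels (not probabilities)

# For string labels, pass pos_label to precision_score/recall_score
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("confusion matrix:")
print(confusion_matrix(y_true, y_pred))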

I hope you can help me 🙂
matthias_funke
Dataiker Alumni
EDIT: The bug is now fixed as of DSS 4.1.3

We have identified a bug and are working to resolve the issue.
Clément_Stenac
Hi,

The bug is now fixed in DSS 4.1.3. Thanks for reporting this!
