
Pearson coefficients for Partitioned Models

tgb417

While working through the partitioned models training example in DSS, as described at this URL:

https://academy-content.dataiku.com/latest/courses/partitioned-models/partitioned-models.html#evalua...

I see that the Pearson coefficient for "All partitions" is very small compared to the value for each of the individual partitions. I don't know why this would be. If they were close, I'd get it. But a difference this big (0.19 for the All Partitions value vs. 0.90 for each independent partition) seems odd.

Dataiku DSS screenshot of the model summary page showing two partitions, florida and california, each with a Pearson coefficient of 0.90. However, the All Partitions value for the Pearson coefficient is showing ~0.19. There is an added text box asking "Why is the All Partitions, Pearson Coefficient for “All Partitions” so small in comparison to the values of each of the sub partitions?"

I'm wondering if I don't know the math well enough, or if I've discovered a bug.

This problem is evident in the image in the training materials.
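
On the math side, one thing that can be checked independently of DSS is whether a pooled Pearson coefficient can be much lower than the per-partition ones. The sketch below uses purely synthetic data (not the training dataset), and it is not a claim about how DSS computes the "All partitions" value. The partition names, offsets, noise level, and the helper names `pearson` and `make_partition` are all made up for illustration. It assumes predictions track actuals closely inside each partition, while the average levels of the two partitions are shifted in opposite directions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # rows per partition (arbitrary choice for the illustration)


def pearson(actual, predicted):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(actual, predicted)[0, 1])


def make_partition(actual_mean, predicted_mean):
    """Synthetic actuals and predictions that move together within the
    partition but sit at different average levels."""
    signal = rng.normal(0.0, 1.0, n)
    actual = actual_mean + signal
    predicted = predicted_mean + signal + rng.normal(0.0, 0.45, n)
    return actual, predicted


# Hypothetical offsets: the partition means are shifted in opposite directions.
fl_actual, fl_pred = make_partition(actual_mean=0.0, predicted_mean=1.7)
ca_actual, ca_pred = make_partition(actual_mean=1.7, predicted_mean=0.0)

r_florida = pearson(fl_actual, fl_pred)
r_california = pearson(ca_actual, ca_pred)

# Pooled coefficient: concatenate both partitions, then correlate once.
r_all = pearson(
    np.concatenate([fl_actual, ca_actual]),
    np.concatenate([fl_pred, ca_pred]),
)

print(f"florida:        {r_florida:.2f}")
print(f"california:     {r_california:.2f}")
print(f"all partitions: {r_all:.2f}")
```

In this synthetic setup the pooled coefficient comes out far below either per-partition coefficient, because the pooled value also reflects how the partition means line up, not just the fit inside each partition. So a gap like this is mathematically possible, but that alone does not tell me whether it explains the DSS numbers or whether the "All partitions" value is aggregated some other way.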

cc: @Alex_Reutter 

--Tom