Many small models

cwcomiskey

I am trying to create a Python recipe for anomaly detection using rrcf (the Robust Random Cut Forest package). I need many models (~20K, one per customer), each fit to a small time series (~200 observations). In Dataiku, what is the best way to fit many models to small time series datasets? One option is to partition on customer, but that seems like too many partitions.

Thanks.

1 Reply
pmasiphelps
Dataiker

Hi,

You could indeed partition on customer - see how it runs!

Or you could loop over the customers and fit a separate RRCF model to each customer's time series, one at a time.
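A rough sketch of the loop approach, in case it helps. The names `rrcf_scores` and `score_by_customer`, the `(customer_id, value)` row layout, and the choice of 40 trees are all assumptions for illustration, not anything from your project. It assumes the rows arrive in time order within each customer, and it uses rrcf's batch mode (`rrcf.RCTree(X)` plus `codisp`) rather than streaming.

```python
from collections import defaultdict


def rrcf_scores(values, num_trees=40):
    """Average CoDisp score per observation for one small series (batch mode)."""
    # Imported lazily so the grouping logic below can be tried without rrcf.
    import numpy as np
    import rrcf

    X = np.asarray(values, dtype=float).reshape(-1, 1)  # one feature per point
    n = len(X)
    totals = np.zeros(n)
    for _ in range(num_trees):
        tree = rrcf.RCTree(X)  # leaves are labeled 0..n-1 by default
        for i in range(n):
            totals[i] += tree.codisp(i)  # collusive displacement of point i
    return (totals / num_trees).tolist()


def score_by_customer(rows, scorer=rrcf_scores):
    """Group (customer_id, value) rows by customer, then score each series."""
    series = defaultdict(list)
    for customer_id, value in rows:
        series[customer_id].append(value)  # keeps the incoming (time) order
    # One independent model per customer; nothing is shared between them.
    return {cid: scorer(vals) for cid, vals in series.items()}
```

With ~200 points per customer each fit is cheap, so a plain loop over 20K customers may well be fast enough. Since the customers are independent, the dictionary comprehension is also the natural place to parallelize if it is not.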

Best,

Pat
