Many small models
cwcomiskey
I am trying to create a Python recipe for anomaly detection using rrcf (the Robust Random Cut Forest package). I need many models (~20K, one per customer), each fit to a small time series (~200 observations). In Dataiku, what is the best way to fit this many models to small time series datasets? For example, one option is to partition on customer, but ~20K seems like too many partitions.
Thanks.
Answers
Hi,
You could indeed partition on customer - see how it runs!
Or you could loop over the customers and fit a separate RRCF model to each customer's time series, one at a time.
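A minimal sketch of the looping option, not a definitive implementation. It assumes the input arrives as (customer_id, value) pairs already sorted by time within each customer; in a Python recipe you would typically get these from the input dataset's dataframe. The function names `rrcf_scores` and `score_by_customer`, and the choice of 40 trees, are made up for illustration. Each point is scored with the average collusive displacement (CoDisp) at the moment it is inserted, which is one of several ways to use rrcf.

```python
from collections import defaultdict


def rrcf_scores(values, num_trees=40):
    """Fit one small forest to a single series and return one score per point.

    Hypothetical helper: each point is inserted into every tree and scored
    immediately with its average CoDisp. With ~200 points per series, no
    points are forgotten, so each tree ends up holding the whole series.
    """
    import rrcf  # third-party package, imported here so the loop below stays generic

    forest = [rrcf.RCTree() for _ in range(num_trees)]
    scores = []
    for i, value in enumerate(values):
        total = 0.0
        for tree in forest:
            tree.insert_point([value], index=i)  # 1-dimensional point
            total += tree.codisp(i)
        scores.append(total / num_trees)
    return scores


def score_by_customer(rows, score_fn=rrcf_scores):
    """Group (customer_id, value) rows by customer and score each series.

    Assumes rows are already in time order within each customer.
    Returns {customer_id: [score, ...]} with one score per observation.
    """
    series = defaultdict(list)
    for customer_id, value in rows:
        series[customer_id].append(value)
    # One independent model per customer, fitted one at a time.
    return {cid: score_fn(vals) for cid, vals in series.items()}
```

Calling `score_by_customer(rows)` inside a single recipe handles all customers in one run, so there is no need for one partition per customer. Because each customer's model is independent, the loop could also be parallelized later if one-at-a-time fitting turns out to be too slow.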
Best,
Pat