Many small models

cwcomiskey Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 4 ✭✭✭

I am trying to create a Python recipe for anomaly detection using rrcf (the robust random cut forest package). I need many models (about 20K, one per customer), each fit to a small time series (about 200 observations). In Dataiku, what is the best way to fit MANY models to small (time series) datasets? For example, one option is to partition on customer, but that seems like too many partitions.

Thanks.

Answers

  • pmasiphelps Dataiker, Dataiku DSS Core Designer, Registered Posts: 33 Dataiker

    Hi,

    You could indeed partition on customer - see how it runs!

    Or you could loop through each customer's time series observations and fit a separate RRCF model for each one, one at a time.

    Best,

    Pat
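
For reference, the loop approach suggested above might look roughly like the sketch below. It is a sketch under assumptions, not a tested Dataiku recipe: the column names (customer_id, date, value), the helper names (rrcf_scores, score_all_customers) and the forest settings (40 trees of 64 points) are all hypothetical, and the scoring follows the batch pattern from the rrcf README (build several trees on random subsamples, then average the collusive displacement per point).

```python
import numpy as np
import pandas as pd


def rrcf_scores(values, num_trees=40, tree_size=64):
    """Average CoDisp per observation for ONE customer's series.

    num_trees and tree_size are illustrative defaults, not tuned values.
    """
    import rrcf  # imported here so the grouping helper below works without it

    X = np.asarray(values, dtype=float).reshape(-1, 1)
    n = len(X)
    rng = np.random.default_rng()
    total = np.zeros(n)   # summed CoDisp per observation
    counts = np.zeros(n)  # number of trees each observation appeared in
    for _ in range(num_trees):
        # Each tree is built on a random subsample of the series.
        ix = rng.choice(n, size=min(tree_size, n), replace=False)
        tree = rrcf.RCTree(X[ix], index_labels=ix)
        for label in tree.leaves:
            total[label] += tree.codisp(label)
            counts[label] += 1
    # Observations never sampled into any tree are left as NaN.
    return np.divide(total, counts, out=np.full(n, np.nan), where=counts > 0)


def score_all_customers(df, scorer=rrcf_scores, customer_col="customer_id",
                        time_col="date", value_col="value"):
    """Fit one small model per customer, one at a time, in a single pass."""
    df = df.sort_values([customer_col, time_col]).reset_index(drop=True)
    pieces = []
    for _, series in df.groupby(customer_col, sort=False):
        scored = series.copy()
        scored["anomaly_score"] = scorer(series[value_col].to_numpy())
        pieces.append(scored)
    return pd.concat(pieces, ignore_index=True)
```

In a Python recipe, df would typically come from dataiku.Dataset(...).get_dataframe() and the result would be written back with write_with_schema; the dataset names are up to you. With about 20K customers and 200 observations each, the whole input is about 4 million rows, so looping over groups in one recipe avoids creating 20K partitions. The scorer argument is there so the per-customer model can be swapped out or tested without rrcf installed.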
