I’m doing time-series demand forecasting on ~10K products with distinct behaviors, so I need one model per product.
In Spark I’d just do a groupby(productID).apply(modelCode).
What’s a way to do this in pandas in Dataiku that is both efficient to code and efficient to run?
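For concreteness, here is roughly what the groupby-apply pattern looks like in plain pandas. This is only a sketch: the column names (`productID`, `date`, `demand`) and the `fit_and_forecast` function are placeholders I made up for illustration, with a naive "mean of the last 3 observations" standing in for the real per-product model code.

```python
import pandas as pd

def fit_and_forecast(group: pd.DataFrame, horizon: int = 2) -> pd.DataFrame:
    """Hypothetical stand-in for modelCode: a flat forecast equal to the
    mean of the last 3 observations of one product's history."""
    history = group.sort_values("date")["demand"]
    level = history.tail(3).mean()
    return pd.DataFrame({
        "step": list(range(1, horizon + 1)),
        "forecast": [level] * horizon,
    })

# Toy data with assumed column names; the real table would come from Snowflake.
df = pd.DataFrame({
    "productID": ["A", "A", "A", "B", "B", "B"],
    "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
    "demand": [10.0, 12.0, 14.0, 100.0, 90.0, 80.0],
})

# One model per product: iterate over the groups and collect the outputs.
results = []
for product_id, group in df.groupby("productID", sort=False):
    out = fit_and_forecast(group)
    out.insert(0, "productID", product_id)
    results.append(out)

forecasts = pd.concat(results, ignore_index=True)
print(forecasts)
```

The explicit loop does the same job as `groupby(...).apply(...)`, but it runs the groups one after another in a single process, which is exactly the scaling concern with ~10K products.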
Is a partitioned model the best approach? The data sits in Snowflake, so for partitioning, should I cluster on productID? (Actually, each product is identified by a combination of 5 features.)
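Since a product is really a combination of 5 features, one option might be to collapse those columns into a single surrogate key before grouping or partitioning. A minimal sketch, where the five column names are hypothetical placeholders and not the real schema:

```python
import pandas as pd

# Hypothetical names for the 5 identifying features.
key_cols = ["brand", "category", "size", "color", "region"]

df = pd.DataFrame({
    "brand":    ["x", "y", "x"],
    "category": ["c1", "c1", "c1"],
    "size":     ["S", "M", "S"],
    "color":    ["red", "red", "red"],
    "region":   ["EU", "EU", "EU"],
    "demand":   [5.0, 7.0, 6.0],
})

# ngroup() assigns one integer per distinct combination of the key columns,
# so rows 0 and 2 (same five values) end up with the same product_key.
df["product_key"] = df.groupby(key_cols, sort=False).ngroup()
print(df[["product_key", "demand"]])
```

The same idea could be applied on the Snowflake side, for example by building the key in SQL, so that clustering or partitioning only has to deal with one column.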