Added on July 22, 2015 9:21PM
Likes: 7
Replies: 7
I’m doing time-series demand forecasting on ~10K products with distinct behaviors, so I need one model per product.
In Spark I’d just do a groupby(productID).apply(modelCode).
What’s a way to do this in pandas in Dataiku that is efficient both to code and to run?
Is a partitioned model the best approach? The data sits in Snowflake, so for partitioning, should I cluster on productID? (Actually, each product is identified by a combination of 5 features.)
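For reference, here is a rough pandas sketch of the Spark-style pattern I have in mind, grouping on a composite key instead of a single productID. The column names (`brand`, `sku`, `date`, `demand`) and the `fit_and_forecast` helper are made up for illustration, and the "model" is only a placeholder (mean of the last three observations) standing in for real per-product model code.

```python
import pandas as pd

# Hypothetical key columns; in my case the product key spans 5 features.
KEY_COLS = ["brand", "sku"]


def fit_and_forecast(group: pd.DataFrame) -> pd.Series:
    """Placeholder per-product model: mean of the last 3 observations."""
    history = group.sort_values("date")["demand"]
    return pd.Series({"forecast": history.tail(3).mean(), "n_obs": len(history)})


def forecast_all(df: pd.DataFrame) -> pd.DataFrame:
    # fit_and_forecast is called once per distinct key combination.
    # Only the non-key columns are passed to the function.
    return (
        df.groupby(KEY_COLS, sort=False)[["date", "demand"]]
        .apply(fit_and_forecast)
        .reset_index()
    )


# Tiny made-up dataset with two products.
demo = pd.DataFrame(
    {
        "brand": ["A", "A", "A", "A", "B", "B"],
        "sku": [1, 1, 1, 1, 2, 2],
        "date": pd.to_datetime(
            ["2015-01-01", "2015-01-02", "2015-01-03", "2015-01-04",
             "2015-01-01", "2015-01-02"]
        ),
        "demand": [10, 20, 30, 40, 5, 7],
    }
)

result = forecast_all(demo)
print(result)
```

This runs on a single machine and needs the whole frame in memory, so for ~10K products it probably only works if the data pulled from Snowflake fits in RAM. That is the reason I am asking about partitioning.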