How to Automate Clustering with Anomaly Detection for Each Partition in Dataiku?

raihanhd · January 2025

Hello Dataiku Community,

I’m working on a project where I’ve partitioned my dataset by category and year. For example, my partitions look like this:

Category A | 2021
Category A | 2022
Category A | 2023
Category A | 2024
Category B | 2021
Category B | 2022
Category B | 2023
Category B | 2024
Category C | 2021
Category C | 2022
Category C | 2023
Category C | 2024

Now, I want to apply anomaly detection clustering automatically for each partition (e.g., one clustering model for “Category A | 2021,” another for “Category B | 2022,” and so on).

My Questions:

Is it possible to automate clustering with anomaly detection for each partition directly in Dataiku without doing it manually for each combination of category and year?
If automation is possible, what’s the best approach to set this up? For example:
- Can I leverage the Partitioning feature for anomaly detection clustering?
- Are there specific plugins, visual recipes, or scripting options to streamline this process?

I’d appreciate any guidance or examples to help me efficiently cluster my data while handling multiple partitions.

Thank you in advance for your help!

Alexandru · December 2025

Hi,
Yes, using a partitioned model should work here. You can train one model per partition and later score each partition with its relevant model. Note the limitations listed in the documentation:

https://doc.dataiku.com/dss/latest/machine-learning/partitioned.html#limitations
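
To make the "one model per partition" idea concrete, here is a minimal sketch in plain Python. It does not use the Dataiku API and is not how DSS implements partitioned models. It swaps in a simple z-score detector for the clustering-based anomaly detection, purely for illustration. The function names (train_per_partition, score_rows), the column names (category, year, value), the sample values, and the threshold of 2.0 are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def train_per_partition(rows):
    """Fit one tiny 'model' (mean, population std) per (category, year) partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["category"], row["year"])].append(row["value"])
    return {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}


def score_rows(rows, models, threshold=2.0):
    """Score each row with the model of its own partition only."""
    scored = []
    for row in rows:
        mu, sigma = models[(row["category"], row["year"])]
        # Guard against constant partitions, where the std is zero.
        z = 0.0 if sigma == 0 else abs(row["value"] - mu) / sigma
        scored.append({**row, "z_score": z, "is_anomaly": z > threshold})
    return scored


# Hypothetical sample data: the value 50 is unusual in A|2021 but typical in B|2022.
rows = (
    [{"category": "A", "year": 2021, "value": v}
     for v in [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]]
    + [{"category": "B", "year": 2022, "value": v}
       for v in [48, 50, 52, 49, 51, 50, 47, 53]]
)

models = train_per_partition(rows)   # one model per partition
scored = score_rows(rows, models)    # each partition scored by its own model

for r in scored:
    if r["is_anomaly"]:
        print(r["category"], r["year"], r["value"], round(r["z_score"], 2))
```

The point of the sketch is that the same value can be normal in one partition and anomalous in another, which is why a single global model would not give the same result as per-partition models.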
