Hi all,
How do you handle missing values and outliers in Dataiku? Any plugins or workflows you'd recommend for efficient data cleaning?
Thanks for your tips!
Hi @Miasm1,
1) Missing values: you can handle these as part of feature handling when designing a model: https://doc.dataiku.com/dss/latest/machine-learning/features-handling/index.html#features-handling
See also: https://knowledge.dataiku.com/latest/ml-analytics/model-design/concept-feature-handling.html#handle-...
2) Outliers: for clustering models, see the outlier detection settings: https://doc.dataiku.com/dss/latest/machine-learning/unsupervised/settings.html#outliers-detection
You can also clip values in code, for example with a plugin recipe: https://developer.dataiku.com/latest/tutorials/plugins/recipes-clipping-dataset/index.html
Or use the clipping processor in a Prepare recipe:
https://doc.dataiku.com/dss/latest/preparation/processors/number-clipping.html
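If you go the code route, here is a minimal pandas sketch of the two steps. It is not taken from the Dataiku docs linked above: the function name `impute_and_clip`, the `amount` column, median imputation, and the 1.5 × IQR clipping rule are all assumptions chosen for illustration. In a Python recipe you would apply the same logic to the DataFrame read from your input dataset.

```python
import pandas as pd


def impute_and_clip(df, column, k=1.5):
    """Fill missing values with the median, then clip to [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration; the median and the IQR rule are
    assumptions, not Dataiku defaults.
    """
    out = df.copy()  # leave the caller's DataFrame untouched

    # Step 1: impute missing values with the column median.
    median = out[column].median()
    out[column] = out[column].fillna(median)

    # Step 2: clip values outside the IQR fences instead of dropping rows.
    q1 = out[column].quantile(0.25)
    q3 = out[column].quantile(0.75)
    iqr = q3 - q1
    out[column] = out[column].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return out


# Made-up sample data: one missing value and one obvious outlier.
df = pd.DataFrame({"amount": [10.0, 12.0, None, 11.0, 13.0, 500.0]})
cleaned = impute_and_clip(df, "amount")
print(cleaned)
```

Clipping keeps every row and only caps extreme values, which is usually safer than deleting rows when you are not sure the outliers are errors. Adjust `k` or swap the median for another strategy depending on your data.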
@AlexT It's a bot.