Added on August 24, 2023 8:02AM
Hi all,
How do you handle missing values and outliers in Dataiku? Any plugins or workflows you'd recommend for efficient data cleaning?
Thanks for your tips!
Hi @Miasm1,
1) Missing values: you can handle these as part of feature handling when designing a model. See:
https://doc.dataiku.com/dss/latest/machine-learning/features-handling/index.html#features-handling
https://knowledge.dataiku.com/latest/ml-analytics/model-design/concept-feature-handling.html#handle-missing-values
2) Outliers: for clustering models, see the outlier detection settings:
https://doc.dataiku.com/dss/latest/machine-learning/unsupervised/settings.html#outliers-detection
You can also clip values in code, as in this plugin tutorial:
https://developer.dataiku.com/latest/tutorials/plugins/recipes-clipping-dataset/index.html
Or use the number clipping processor in a Prepare recipe:
https://doc.dataiku.com/dss/latest/preparation/processors/number-clipping.html
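If you go the code route, here is a minimal pandas sketch of the two steps above (impute missing values, then clip outliers). It is only an illustration, not the code from the plugin tutorial: the function name `clean_numeric_column`, the `amount` column, the median imputation, and the 1.5 x IQR bounds are all assumptions made for the example, and it uses plain pandas rather than any Dataiku-specific API.

```python
import pandas as pd


def clean_numeric_column(df, column, iqr_factor=1.5):
    """Hypothetical helper: median-impute a numeric column, then clip it to IQR bounds."""
    out = df.copy()  # leave the caller's DataFrame untouched

    # Compute statistics from the observed (non-missing) values only.
    observed = out[column].dropna()
    median = observed.median()
    q1 = observed.quantile(0.25)
    q3 = observed.quantile(0.75)
    iqr = q3 - q1

    # Assumed outlier rule: anything outside [q1 - k*IQR, q3 + k*IQR] is clipped.
    lower = q1 - iqr_factor * iqr
    upper = q3 + iqr_factor * iqr

    out[column] = out[column].fillna(median).clip(lower=lower, upper=upper)
    return out


# Made-up sample data: one missing value and one obvious outlier.
df = pd.DataFrame({"amount": [10.0, 12.0, None, 11.0, 13.0, 500.0]})
cleaned = clean_numeric_column(df, "amount")
print(cleaned)
```

In a Python recipe you would apply something like this to the DataFrame read from your input dataset. Clipping keeps the rows and caps the extreme values; if you would rather drop the outlier rows, filter on the same bounds instead.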
@AlexT It's a bot.