
Isolation forest

seema
Level 1

I used anomaly detection from AutoML on my dataset to create a model. How should I interpret the metrics for isolation forest? For example, what do the values silhouette = 0.437 and inertia = 0 signify?

2 Replies
aggelitoo
Level 1

I would love an answer to this as well.

Googling how to interpret isolation forest anomaly scores in general suggests that values close to one are anomalous, but Dataiku's isolation forest for anomaly detection seems to return anomalies with values closer to zero. How does Dataiku's isolation forest method differ from the more general approach?

//August

CoreyS
Community Manager
These two metrics, silhouette and inertia, give a notion of the distance between clusters and within clusters. They are designed for more traditional clustering algorithms (like k-means). You can read more details about them in the links above.
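To make the two metrics concrete, here is a minimal sketch of how they are usually computed for k-means with scikit-learn. The synthetic data and parameter values are made up for illustration, and this is not necessarily how Dataiku computes the numbers shown in its UI.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data (assumption for illustration): two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(100, 2)),
    rng.normal(5.0, 0.5, size=(100, 2)),
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Inertia: sum of squared distances from each point to its cluster centre
# (a within-cluster measure; lower means tighter clusters).
inertia = km.inertia_

# Silhouette: ranges from -1 to 1; values near 1 mean points sit much closer
# to their own cluster than to the nearest other cluster.
silhouette = silhouette_score(X, km.labels_)

print(f"inertia={inertia:.2f}, silhouette={silhouette:.3f}")
```

Both values depend on cluster assignments and cluster centres, which is why they are natural for k-means and much less so for a model that only labels points as normal or anomalous.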
In the case of Isolation Forest, which is used for anomaly detection, the underlying idea is that an anomaly is easier to separate from the rest of the data with random split trees than a normal point is. Each sample is therefore scored by the number of splits needed to isolate it (fewer splits means more anomalous), and a threshold on that score determines whether or not it is an anomaly. We use the Isolation Forest implementation from scikit-learn, whose threshold is based on the contamination ratio (the expected proportion of anomalies in the data).
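As a rough illustration of the scoring and thresholding described above, here is a small sketch using scikit-learn's `IsolationForest` directly. The data and the `contamination=0.02` value are assumptions for the example, and exactly how Dataiku derives the score it displays from these values is not confirmed here. One point that may explain the sign confusion: scikit-learn's `score_samples` returns the opposite of the anomaly score from the original paper, so lower values mean more abnormal.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (assumption for illustration): a dense cloud plus a few far-away points.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(500, 2))
outliers = rng.uniform(6.0, 8.0, size=(10, 2))
X = np.vstack([inliers, outliers])

# contamination = expected proportion of anomalies; it sets the decision threshold.
model = IsolationForest(n_estimators=100, contamination=0.02, random_state=0).fit(X)

# score_samples: negated paper score, so values are negative and LOWER means more abnormal.
raw = model.score_samples(X)

# decision_function: raw score shifted by the contamination-based offset;
# negative values are the samples flagged as anomalies.
shifted = model.decision_function(X)

# predict: -1 for anomalies, +1 for normal points.
labels = model.predict(X)

print("flagged:", int((labels == -1).sum()), "of", len(X))
print("mean raw score, inliers :", raw[:500].mean())
print("mean raw score, outliers:", raw[500:].mean())
```

Raising or lowering `contamination` moves the threshold and therefore changes how many points are flagged, without changing the ordering of the raw scores.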
You can thus see that for this particular algorithm the two metrics above are not very helpful, as there is not really a notion of "clusters" here. As for how to evaluate your results, as with any other unsupervised learning use case, you will need to visualise and "manually" examine the detected anomalies, using your domain knowledge to judge the quality of the predictions.
I hope this is now clearer for you. Do not hesitate to reply if you have further questions.