## Decision Tree Interpretation

Hi,

"Probabilities" are the probabilities of each class as predicted by the tree, whereas "Target classes" is the class distribution of the training data that falls into the given tree node.

Hope this helps!

Kim

Could you detail the explanation a bit more?

To get the probability of a class, I would have to provide an input X containing values for the features (I assume you use the predict_proba(X) method of a DecisionTreeClassifier from sklearn under the hood). At a given tree node, which inputs are used to calculate this class probability? Is it the average of the predicted probabilities over all the data points the node contains, or a single prediction made on some averaged X computed from those data points?

So the probabilities that you see under TARGET CLASSES are derived from the proportion of samples in the node that belong to each class.

The probabilities under PROBABILITIES are what the model would predict if the node were final (i.e., a leaf). All observations falling into that node would receive the same predicted probabilities, so there is no need to take any average.
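To make this concrete, here is a minimal sketch using scikit-learn directly. It assumes, as suggested earlier in the thread, that a DecisionTreeClassifier is what runs under the hood; the iris dataset and the variable names are only illustrative and are not taken from DSS. It computes the class proportions stored at every node and checks that each sample receives the proportions of the leaf it falls into.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative data and model (assumption: not what DSS actually trains).
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# tree_.value has shape (n_nodes, n_outputs, n_classes). Normalizing each
# row yields per-node class proportions, whether the installed
# scikit-learn version stores weighted counts or fractions.
values = clf.tree_.value[:, 0, :]
node_proportions = values / values.sum(axis=1, keepdims=True)

# apply() returns the index of the leaf each sample ends up in.
leaf_ids = clf.apply(X)
proba = clf.predict_proba(X)

# Every sample in a leaf gets that leaf's class proportions as its prediction.
same = np.allclose(proba, node_proportions[leaf_ids])
print(same)
```

Note that in scikit-learn the stored values are weighted, so when class or sample weights are used the proportions that drive the prediction can differ from the raw share of samples in a node.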

I hope this helps!

Best regards,

Jean-Yves

Thank you for the explanations about the interpretation of the tree itself.

But in the case of a random forest model, which combines several trees for the final decision, why does DSS show only 2 trees? Are they just examples of the trees used?

Thank you for your question! DSS will only show a limited number of trees when the number of nodes in the trees is too high. Note that it's not always 2.
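For context on how the individual trees relate to the forest's output, here is a rough sketch in plain scikit-learn. It assumes a RandomForestClassifier similar to the one behind the DSS model; the dataset and parameters are illustrative only. Every fitted tree is available and contributes to the prediction, regardless of how many are displayed.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data and settings (assumption: not the DSS configuration).
X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(
    n_estimators=10, max_depth=3, random_state=0
).fit(X, y)

# Each fitted tree is stored in estimators_ and predicts its own
# class probabilities for every sample.
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])

# The forest's probabilities are the average over all of its trees.
averaged = per_tree.mean(axis=0)
matches_forest = np.allclose(averaged, forest.predict_proba(X))
print(len(forest.estimators_), matches_forest)
```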

Jean-Yves