What are the options for reporting feature importance from a decision-tree-based model?

For any decision-tree-based method, feature importance can be measured in two main ways. The most common approach measures how much each attribute contributes to the construction of the trees during training: features that produce the largest reductions in node impurity, and are therefore chosen for the top split points, are the most important. The specific impurity measure depends on the task (e.g., Gini impurity or entropy for classification, variance reduction for regression), but the same intuition holds in both settings. An overall importance score for each feature is obtained by averaging its per-tree importance across the entire ensemble.
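As a minimal sketch of this idea, assuming scikit-learn's GradientBoostingClassifier and a synthetic dataset (both illustrative choices, not prescribed above), the impurity-based importances can be read directly from a fitted ensemble:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data for illustration: 8 features, only 3 of them informative.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=3, random_state=42
)

model = GradientBoostingClassifier(random_state=42).fit(X, y)

# feature_importances_ holds the impurity-based importance of each
# feature, averaged over every tree in the ensemble (scores sum to 1).
for i, imp in enumerate(model.feature_importances_):
    print(f"feature_{i}: {imp:.3f}")
```

In a sketch like this, the three informative features should receive noticeably higher scores than the five noise features.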

As the sketch above shows, feature importances can be extracted from a fitted GBM model in most software packages; in Python, scikit-learn exposes them through the feature_importances_ attribute. Another way to assess variable importance is a permutation-based approach: after the model is fit, the values of each attribute are randomly shuffled in turn, and the most influential features are those whose shuffled values cause the largest drop in model performance. Because the permutation method only requires model predictions, it is model-agnostic, which makes it a valuable tool for interpreting black-box machine learning models.
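A sketch of the permutation approach, again assuming scikit-learn (its permutation_importance utility) and a synthetic dataset purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=3, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)

# Shuffle each feature n_repeats times on held-out data and record the
# resulting drop in score; larger mean drops indicate more influential
# features.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=42
)
for i in range(X.shape[1]):
    print(f"feature_{i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```

Note that permutation importance is typically computed on held-out data, as here, so that the score drop reflects how the model generalizes rather than what it memorized during training.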
