What is the purpose of feature selection, and what are some common approaches?

Feature selection seeks to reduce the feature space by eliminating candidate predictors that have little predictive power for the target of interest. In most situations, a parsimonious model is preferred, provided it still performs at a satisfactory level of accuracy. A simpler model usually results in clearer interpretation and shorter training times, in addition to having a reduced risk of overfitting. The following are some of the most common feature selection options:

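  • Filter methods: These approaches score each input variable against the target with a statistical measure, such as a correlation coefficient or a p-value, and use that score as the criterion for inclusion or exclusion. The researcher usually sets a threshold and drops any variable that fails to meet it, for example any variable whose absolute correlation with the target is below 0.5. Because each variable is scored on its own, filter methods are fast, but they do not account for interactions or redundancy among predictors.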
  • Wrapper methods: These approaches fit models on many different subsets of the candidate predictors and then choose the subset that produces the best model among those tried. A common wrapper method is Recursive Feature Elimination (RFE), which repeatedly fits a model and removes the least important features until the desired number or proportion of the original features remains. Forward, backward, and stepwise selection in the regression context are also wrapper methods. Because wrapper methods fit multiple models on different combinations of features, they can take a long time to complete when there are many input features.
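  • Automatic feature selection: Some algorithms, like LASSO, perform feature selection as part of model fitting (these are often called embedded methods). LASSO adds an L1 penalty that can shrink the coefficients of features with little or no predictive ability all the way to zero, which effectively removes those features from the model.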
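
The three approaches above can be sketched side by side in Python. This is a minimal illustration and not a recommended pipeline: it assumes NumPy and scikit-learn are installed, and the synthetic data, the 0.5 correlation threshold, the choice of two features for RFE, and the LASSO penalty `alpha=0.2` are all illustrative assumptions rather than values from this post.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data (an assumption for illustration): five candidate predictors,
# of which only features 0 and 1 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Filter method: score each feature independently by its absolute Pearson
# correlation with the target and keep those that meet the threshold.
threshold = 0.5
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
filter_keep = np.flatnonzero(np.abs(corr) >= threshold)

# Wrapper method: RFE refits the model repeatedly, dropping the feature with
# the smallest absolute coefficient each round until two features remain.
rfe = RFE(estimator=LinearRegression(), n_features_to_select=2).fit(X, y)
rfe_keep = np.flatnonzero(rfe.support_)

# Automatic (embedded) selection: the L1 penalty in LASSO can set some
# coefficients exactly to zero; the nonzero ones are the retained features.
# The features here are already on a common scale; real data would usually
# be standardized first so the penalty treats all features alike.
lasso = Lasso(alpha=0.2).fit(X, y)
lasso_keep = np.flatnonzero(lasso.coef_ != 0)

print("filter keeps:", filter_keep.tolist())
print("RFE keeps:   ", rfe_keep.tolist())
print("LASSO keeps: ", lasso_keep.tolist())
```

On this easy synthetic problem all three approaches should retain features 0 and 1. On real data they often disagree, and the threshold, the number of features to keep, and the penalty strength are typically tuned, for example with cross-validation.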
