How does SVM adjust for classes that cannot be linearly separated?

While the maximal margin classifier is appealing in theory, in practice the classes in most classification problems overlap, so no hyperplane can separate them perfectly. Expecting to find a hyperplane with zero misclassifications is therefore unrealistic. Instead of a hard margin classifier, SVM uses a soft margin that allows some misclassifications during training. The extent to which misclassifications are tolerated is controlled by a regularization parameter C, which is typically tuned via cross-validation. This is another instance of the bias/variance tradeoff that recurs throughout machine learning: the soft margin classifier accepts some bias in the hope of reducing variance when classifying future observations. With a soft margin, the support vectors include not only the observations that lie exactly on the margin but also those that violate it, either falling inside the margin or on the wrong side of the hyperplane.
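The sketch below illustrates this idea, assuming scikit-learn is available: it generates a synthetic, non-separable dataset and tunes C by cross-validation. Note that in scikit-learn's SVC, larger C penalizes margin violations more heavily (a "harder" margin), while smaller C yields a softer margin; the specific C grid here is arbitrary and purely illustrative.

```python
# Minimal sketch (assumes scikit-learn): tuning the soft-margin
# parameter C via cross-validation on a non-separable dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Generate overlapping classes so that no hyperplane separates
# them perfectly and a soft margin is required.
X, y = make_classification(
    n_samples=500, n_features=2, n_redundant=0,
    n_clusters_per_class=1, class_sep=0.8, random_state=0
)

# Smaller C tolerates more margin violations (softer margin: more
# bias, less variance); larger C penalizes violations more strongly.
param_grid = {"C": [0.01, 0.1, 1, 10, 100]}
search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5)
search.fit(X, y)

print("Best C:", search.best_params_["C"])
print("Cross-validated accuracy:", round(search.best_score_, 3))

# Support vectors are the points on or violating the margin
# (including misclassified ones); their count tends to shrink as C grows.
print("Number of support vectors:", search.best_estimator_.n_support_.sum())
```

Running a grid like this on held-out folds is what "tuning C during cross-validation" means in practice: each candidate value trades off training misclassifications against margin width, and the value with the best validation performance is kept.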
