What is the kernel trick in SVM?
In the case described above, where a transformation is needed before a hyperplane can separate the classes, the kernel trick lets an SVM form a decision boundary in a higher-dimensional space without actually computing the transformed data. A kernel function takes two observations in the original feature space and returns the dot product they would have in the higher-dimensional space, and this similarity measure is all the SVM needs to fit and apply the decision boundary. Because the transformation is never carried out explicitly, the SVM remains computationally efficient even when the implied feature space has a very large number of dimensions.
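As a concrete illustration, the sketch below shows a degree-2 polynomial kernel returning the same value as an explicit mapping into quadratic features, without ever building the higher-dimensional vectors (the helper names poly_kernel and explicit_map are illustrative, not from any particular library):

```python
import numpy as np

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed directly on the original features."""
    return np.dot(x, z) ** 2

def explicit_map(x):
    """Explicit quadratic feature map phi(x) for 2-D inputs (illustrative helper)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

# Both lines print 121.0: the kernel gives the dot product in the transformed
# space while only ever touching the original 2-D observations.
print(poly_kernel(x, z))
print(np.dot(explicit_map(x), explicit_map(z)))
```

In practice this is what a library such as scikit-learn exposes through the kernel argument of SVC (for example kernel='poly' or kernel='rbf'), so the model works with pairwise kernel values rather than explicitly transformed features.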