What is the basic idea of Support Vector Machine (SVM) and Maximum Margin?

SVM is a supervised learning algorithm typically used for classification (though it can be extended to regression) that seeks a decision boundary maximizing the distance to the nearest points of each class. In one-dimensional space, the boundary is a point; in 2D space, a line; and in higher-dimensional space, a hyperplane. Once the boundary is determined, future observations that fall on one side of it are assigned to the first class, and points on the other side are assigned to the other class.

The distance between the line (or hyperplane) and the observations closest to it is called the margin; when the classes are linearly separable, the hyperplane with the maximum margin defines the optimal decision boundary. The points closest to the hyperplane are called support vectors, as they are the observations that ultimately support, or define, the hyperplane, hence the name “Support Vector Machine.” If a support vector were shifted by a small amount, the hyperplane would adjust accordingly.
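As a minimal sketch of these ideas (assuming scikit-learn is available), the example below fits a linear SVM on two separable clusters and inspects the support vectors that define the boundary:

```python
# Minimal sketch: a linear SVM on a tiny, linearly separable dataset.
# Requires scikit-learn; the data here is illustrative only.
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in 2D.
X = np.array([[1, 1], [2, 1], [1, 2],   # class 0
              [5, 5], [6, 5], [5, 6]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear")
clf.fit(X, y)

# Only the points nearest the boundary act as support vectors;
# the remaining points could move without changing the hyperplane.
print(clf.support_vectors_)
print(clf.predict([[0, 0], [7, 7]]))  # one query point on each side
```

Note that `support_vectors_` typically contains far fewer points than the training set, which is what makes the learned model compact.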

In cases where the data is not linearly separable, SVMs can still be effective through the use of kernel functions. A kernel function implicitly maps the input data into a higher-dimensional space where a linear separation becomes possible, without explicitly computing coordinates in that space (the “kernel trick”). Common kernels include linear, polynomial, radial basis function (RBF, also known as the Gaussian kernel), and sigmoid.
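A quick way to see the kernel trick at work (again assuming scikit-learn) is a dataset of concentric circles, which no straight line can separate, but which an RBF kernel handles easily:

```python
# Sketch: linear vs. RBF kernel on data that is not linearly separable.
# Requires scikit-learn; dataset parameters are illustrative.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no line in the original 2D space separates the classes.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)

print(f"linear kernel accuracy: {linear_acc:.2f}")  # near chance level
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")     # close to perfect
```

The RBF kernel effectively lifts the points into a space where the inner ring and outer ring become linearly separable.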

The regularization parameter in SVM (commonly denoted C) balances margin maximization against classification loss. A lower value of C yields a larger margin and tolerates more training misclassifications (a softer margin), while a higher value penalizes misclassifications more heavily, producing a narrower, harder margin.

SVMs are known for their effectiveness in high-dimensional spaces and when there is a clear margin of separation in the data. They are less effective on very large datasets due to their high computational cost, and they do not perform well when classes overlap heavily or the data is very noisy.

Video Explanation

  • This compilation of Dr. Andrew Ng lecture videos on Support Vector Machines (SVM) consists of everything that you need to know about SVM – it includes the intuition behind the SVM model, the optimization objective, maximum margin intuition, and kernels and their different choices. In this video, you will learn everything about SVM for effective practical application. (Runtime: 1 hr 37 mins)
