Support Vector Machines

Quick Reference

Core Concept

Find the decision boundary with the widest possible margin. Like parking in the center of a space: you maximize the distance to both neighbors.

Philosophy

Perceptron: satisfied with any separating line.

SVM: a perfectionist; it searches for the separating line with the maximum margin.

The Margin

Distance between decision boundary and nearest data points from either class.

Support Vectors: the nearest points, which "support" or define the boundary.

Why margin matters: a wide margin means robustness to noisy data; the boundary has breathing room.
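To make the geometry concrete, here is a minimal sketch (the boundary and data points are invented for illustration): the distance from a point x to the hyperplane θᵀx + θ₀ = 0 is |θᵀx + θ₀| / ||θ||, and the margin is the smallest such distance over the training set.

```python
import math

# Hypothetical 2-D data and a fixed boundary, purely for illustration
theta = [1.0, 1.0]   # boundary normal
theta0 = -3.0        # offset: the boundary is the line x1 + x2 = 3

points = [[1.0, 1.0], [0.0, 1.0], [3.0, 2.0], [4.0, 4.0]]

def distance_to_boundary(x):
    """Distance from x to the hyperplane theta^T x + theta0 = 0."""
    raw = sum(t * xi for t, xi in zip(theta, x)) + theta0
    return abs(raw) / math.sqrt(sum(t * t for t in theta))

distances = [distance_to_boundary(x) for x in points]
margin = min(distances)                            # distance to the closest point
support_vector = points[distances.index(margin)]   # that point "supports" the boundary
print(round(margin, 4))   # 0.7071 — the point [1, 1] defines the margin here
```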

SVM Objective

Find decision boundary that:

1. Maximizes the margin

2. Correctly classifies all training points

Hinge Loss

L(y, f(x)) = max(0, 1 - y · f(x))

Where f(x) = θᵀx + θ₀ and y ∈ {-1, +1}

Three Cases:

Wrong (y · f(x) < 0): loss greater than 1, growing linearly with how wrong the prediction is.

Barely right (0 < y · f(x) < 1): small positive loss, even though the classification is correct.

Confidently right (y · f(x) ≥ 1): zero loss.

Key Insight: the loss penalizes points too close to the boundary even if they are technically correct. This forces a wide margin.
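The three cases fall directly out of the formula. A minimal sketch, with illustrative values of y · f(x) chosen to land in each regime:

```python
def hinge_loss(y, fx):
    """max(0, 1 - y * f(x)): zero only when the point is classified
    correctly with functional margin at least 1."""
    return max(0.0, 1.0 - y * fx)

# Illustrative values for the three cases:
print(hinge_loss(+1, -0.5))  # wrong side:        1.5 (greater than 1)
print(hinge_loss(+1, 0.4))   # barely right:      0.6 (in (0, 1))
print(hinge_loss(+1, 2.0))   # confidently right: 0.0
```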

Optimization Problem

minimize: (1/2)||θ||² + C · Σ max(0, 1 - y⁽ⁱ⁾(θᵀx⁽ⁱ⁾ + θ₀))

Two Competing Terms:

(1/2)||θ||²: regularization — encourages a wide margin, since margin = 1/||θ||.

C · Σ hinge loss: data fitting — penalizes misclassified and margin-violating points.

Hyperparameter C

Large C: heavy penalty on hinge loss. Fits the training data tightly, narrower margin, risk of overfitting.

Small C: prioritizes a wide margin. Tolerates some misclassifications, risk of underfitting.
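The full objective can be minimized with subgradient descent. Below is a toy sketch, not a production solver: the data, step size, and iteration count are all invented, and the hinge term's subgradient is active only for points with y · f(x) < 1.

```python
def train_svm(data, C, steps=2000, lr=0.01):
    """Subgradient descent on (1/2)||theta||^2 + C * sum of hinge losses.
    Toy 2-D version; step size and iteration count are arbitrary choices."""
    theta = [0.0, 0.0]
    theta0 = 0.0
    for _ in range(steps):
        g = [theta[0], theta[1]]   # subgradient of the regularizer
        g0 = 0.0
        for x, y in data:
            fx = theta[0] * x[0] + theta[1] * x[1] + theta0
            if y * fx < 1:  # margin violated: hinge term contributes
                g[0] -= C * y * x[0]
                g[1] -= C * y * x[1]
                g0 -= C * y
        theta = [theta[0] - lr * g[0], theta[1] - lr * g[1]]
        theta0 -= lr * g0
    return theta, theta0

# Small linearly separable toy set: class -1 lower-left, class +1 upper-right
data = [([1.0, 1.0], -1), ([0.0, 1.0], -1), ([3.0, 2.0], +1), ([4.0, 4.0], +1)]
theta, theta0 = train_svm(data, C=1.0)
preds = [1 if theta[0] * x[0] + theta[1] * x[1] + theta0 > 0 else -1
         for x, _ in data]
print(preds)   # all four training points end up on the correct side
```

Rerunning with a much smaller C (say 0.01) weights the regularizer more heavily, shrinking ||θ|| and widening the margin at the cost of tolerating margin violations.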

The Kernel Trick

Map data to higher-dimensional space where it becomes linearly separable, without computing transformation explicitly.

Common Kernels:

Linear: K(x, x') = xᵀx'

Polynomial: K(x, x') = (xᵀx' + c)ᵈ

RBF (Radial Basis Function): K(x, x') = exp(-γ||x - x'||²)

Enables learning complex, nonlinear boundaries with linear optimization elegance.
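The kernels above are just similarity functions you can evaluate directly. A minimal sketch (the input vectors and the hyperparameter values c = 1, d = 2, γ = 0.5 are arbitrary illustrations):

```python
import math

def linear_kernel(x, xp):
    return sum(a * b for a, b in zip(x, xp))

def poly_kernel(x, xp, c=1.0, d=2):
    # Equivalent to a dot product in the space of all degree-<=d monomials,
    # without ever building that feature vector explicitly
    return (linear_kernel(x, xp) + c) ** d

def rbf_kernel(x, xp, gamma=0.5):
    # Similarity decays with squared distance; implicitly an inner
    # product in an infinite-dimensional feature space
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xp))
    return math.exp(-gamma * sq_dist)

x, xp = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, xp))   # 2.0
print(poly_kernel(x, xp))     # (2 + 1)^2 = 9.0
print(rbf_kernel(x, xp))      # exp(-0.5 * 5) ≈ 0.082
```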
