# Understanding SVM(1)

SVM & Active Learning
NN & Deep Learning
Random Forests
Spectral Methods
Matrix Factorization
Community Finding in Social Networks

Linear Classification

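General: find a hyperplane that splits the two classes.
w: the weight vector, which controls the orientation of the hyperplane
b: the bias, i.e. the displacement of the hyperplane from the origin
Decision function: $f\left( x \right) ={ w }^{ T }x+b$; the sign of $f(x)$ gives the predicted class.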
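
As a minimal sketch of the decision function above, the following evaluates $f(x) = w^T x + b$ and classifies by its sign. The weights, bias, and test points are made-up values for illustration, and the function names are hypothetical.

```python
def decision_function(w, b, x):
    """Return f(x) = w^T x + b for a single example x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if decision_function(w, b, x) >= 0 else -1


# Illustrative (assumed) parameters, not learned from data.
w = [1.0, -1.0]
b = 0.5

print(predict(w, b, [2.0, 1.0]))  # f = 2 - 1 + 0.5 = 1.5, so class +1
print(predict(w, b, [0.0, 3.0]))  # f = 0 - 3 + 0.5 = -2.5, so class -1
```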

Support Vector Machines (SVM) for binary classification: find the hyperplane between two classes that maximises the margin between the classes.
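
To make "maximises the margin" concrete, the sketch below computes the geometric margin (the smallest signed distance from any training example to the hyperplane) for two candidate hyperplanes on a tiny made-up dataset. Both hyperplanes separate the data, but the SVM criterion prefers the one with the larger margin. The data and the candidate parameters are assumptions chosen by hand.

```python
import math


def geometric_margin(w, b, X, y):
    """Smallest signed distance y_i * (w^T x_i + b) / ||w|| over the data."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        yi * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, yi in zip(X, y)
    )


# Toy data (assumed): two positive and two negative examples.
X = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [-1.0, 0.0]]
y = [1, 1, -1, -1]

margin_a = geometric_margin([1.0, 1.0], -2.0, X, y)  # diagonal hyperplane
margin_b = geometric_margin([1.0, 0.0], -1.0, X, y)  # vertical hyperplane

# Both are positive (every example is on the correct side),
# but hyperplane A leaves more room between the classes.
print(margin_a, margin_b)
```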

Support Vectors
Support vectors are the subset of training examples that lie closest to the hyperplane and define the margin.
The weight vector w is a linear combination of the support vectors.
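
In the usual dual formulation this linear combination is written $w = \sum_i \alpha_i y_i x_i$, where $\alpha_i$ is the coefficient of support vector $x_i$ and $y_i$ its label. The notes do not give the formula, so treat the notation as an assumption. A small sketch with two hand-picked support vectors:

```python
def weight_from_support_vectors(alphas, labels, support_vectors):
    """Return w = sum_i alpha_i * y_i * x_i."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for alpha, y, x in zip(alphas, labels, support_vectors):
        for j in range(dim):
            w[j] += alpha * y * x[j]
    return w


# Two support vectors, one per class; the alphas are chosen by hand so that
# f(x) = w^T x + b equals +1 and -1 at the support vectors when b = -1.
support_vectors = [[2.0, 2.0], [0.0, 0.0]]
labels = [1, -1]
alphas = [0.25, 0.25]

w = weight_from_support_vectors(alphas, labels, support_vectors)
print(w)
```

Training examples that are not support vectors have a coefficient of zero, so they do not contribute to w.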

Soft Margin Classification
A trade-off: some training examples may be misclassified, or fall inside the margin, in exchange for a wider margin.
Hard Margin Classification
All training examples must be correctly classified, which is only possible when the classes are linearly separable.

Use a cost parameter to control the trade-off between allowing training errors and enforcing a rigid margin.
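
One common way to write this trade-off is the soft-margin objective $\frac{1}{2}\|w\|^2 + C \sum_i \max(0, 1 - y_i f(x_i))$, where the cost parameter is called C here (an assumed name; the notes do not name it). The sketch below evaluates that objective for a fixed hyperplane at a small and a large C. The data and parameters are invented for illustration.

```python
def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses max(0, 1 - y_i * f(x_i))."""
    regulariser = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(
        max(0.0, 1.0 - yi * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, yi in zip(X, y)
    )
    return regulariser + C * hinge


# Toy data (assumed): the third example is a negative on the positive side.
X = [[2.0, 2.0], [0.0, 0.0], [1.5, 1.5]]
y = [1, -1, -1]
w, b = [0.5, 0.5], -1.0

low_cost = soft_margin_objective(w, b, X, y, C=0.1)    # errors are cheap
high_cost = soft_margin_objective(w, b, X, y, C=10.0)  # errors are expensive
print(low_cost, high_cost)
```

With a small C the single misclassified example barely changes the objective, so a wide margin wins; with a large C the same error dominates and pushes the optimiser towards a rigid, hard-margin-like solution.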

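Non-linear? (The original 2D data is not linearly separable.)
Move to 3D with a non-linear feature map, for example $({ x }_{ 1 },{ x }_{ 2 },{ x }_{ 1 }^{ 2 }+{ x }_{ 2 }^{ 2 })$. A purely linear extra coordinate such as ${ x }_{ 1 }+{ x }_{ 2 }$ would not help, because data that is not linearly separable stays that way under a linear map.
Kernel function: computes the inner product between a pair of examples, as specified by some function. This transforms the data into an implicit high-dimensional feature space. The kernel also acts as a similarity measure (samples from the same class tend to have a higher value).
Sequence kernels: text classification (words), protein classification.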
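
The "implicit" part can be illustrated with a small sketch. Assuming a quadratic map $\phi(x) = (x_1, x_2, x_1^2 + x_2^2)$ (an assumed example of a non-linear lift to 3D), the inner product in the 3D space can be computed directly from the 2D inputs, without ever building $\phi(x)$. The function names are hypothetical.

```python
def phi(x):
    """Explicit feature map from 2D to 3D: (x1, x2, x1^2 + x2^2)."""
    x1, x2 = x
    return (x1, x2, x1 * x1 + x2 * x2)


def kernel(x, z):
    """Same inner product as phi(x) . phi(z), computed in the original 2D space."""
    dot = x[0] * z[0] + x[1] * z[1]
    sq_norm_x = x[0] ** 2 + x[1] ** 2
    sq_norm_z = z[0] ** 2 + z[1] ** 2
    return dot + sq_norm_x * sq_norm_z


x = (1.0, 2.0)
z = (3.0, -1.0)

explicit = sum(a * b for a, b in zip(phi(x), phi(z)))
implicit = kernel(x, z)
print(explicit, implicit)  # both routes give the same value
```

For this tiny map the saving is negligible, but the same idea lets kernels such as the RBF kernel work in feature spaces that are too large to construct explicitly.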

Kernel Representation:
A kernel matrix (Gram matrix): n × n, where entry (i, j) is K(x_i, x_j) for training examples x_i and x_j.
For a new input q, calculate K(q, x) for each of the support vectors x.
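
A minimal sketch of both steps, assuming an RBF kernel and the usual kernelised decision function $f(q) = \sum_i \alpha_i y_i K(q, x_i) + b$. The kernel choice, `gamma`, the coefficients, and the data are all assumptions for illustration; in practice the coefficients come out of training.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2); gamma is an assumed setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def gram_matrix(X, kernel):
    """n x n matrix whose (i, j) entry is kernel(X[i], X[j])."""
    return [[kernel(xi, xj) for xj in X] for xi in X]


def kernel_decision(q, support_vectors, labels, alphas, b, kernel):
    """f(q) = sum_i alpha_i * y_i * K(q, x_i) + b over the support vectors."""
    return sum(
        a * y * kernel(q, x)
        for a, y, x in zip(alphas, labels, support_vectors)
    ) + b


X = [[0.0, 0.0], [2.0, 2.0], [0.0, 1.0]]
K = gram_matrix(X, rbf_kernel)

# Hand-picked (not trained) support vectors and coefficients.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]
b = 0.0

score = kernel_decision([1.9, 2.0], support_vectors, labels, alphas, b, rbf_kernel)
print("class:", 1 if score >= 0 else -1)
```

Only the support vectors are needed at prediction time, so the cost of classifying q grows with the number of support vectors rather than with the size of the training set.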

Tips:

1. Linear SVMs can find the optimal hyperplane for linearly separable classes.
2. Non-linear separation can be done with a kernel method.
3. Kernel methods are new ways to calculate similarities, so with SVMs, they provide ways to solve non-linear problems.
4. Active learning can be used in conjunction with SVMs to minimise the number of training examples required.
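
The notes do not say which active-learning strategy is meant. One common choice with SVMs is margin-based uncertainty sampling: ask for the label of the unlabelled example closest to the current hyperplane, since that is where the classifier is least certain. A minimal sketch under that assumption, with a made-up pool and hand-picked parameters:

```python
def most_uncertain_index(w, b, pool):
    """Index of the pool example with the smallest |f(x)| = |w^T x + b|."""
    def abs_score(i):
        return abs(sum(wi * xi for wi, xi in zip(w, pool[i])) + b)
    return min(range(len(pool)), key=abs_score)


# Current (assumed) linear SVM and a pool of unlabelled examples.
w, b = [0.5, 0.5], -1.0
pool = [[3.0, 3.0], [1.2, 1.0], [-1.0, 0.0]]

query = most_uncertain_index(w, b, pool)
print("ask the oracle to label:", pool[query])
```

After the chosen example is labelled, the SVM is retrained and the selection repeats, so labelling effort concentrates on examples near the decision boundary.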