Support Vector Machine
Le Do Hoang Nam – CNTN08
Linear Programming • General form, with x in R^n • Linear objective, linear constraints, …
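The slide's formula image is not reproduced here; the textbook statement of this general form is:

\[
\min_{x \in \mathbb{R}^n} \; c^{\top} x
\quad \text{subject to} \quad
A x \le b, \;\; x \ge 0
\]

Both the objective c^T x and every row of Ax ≤ b are linear in x.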
Linear Programming • An example: The Diet Problem • How to come up with the cheapest meal that meets all nutrition standards?
Linear Programming • Let x1, x2, and x3 be the amounts, in kilos, of carrot, cabbage, and cucumber in the dish. • Mathematically:
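The slide's formula is missing; in the usual formulation, with prices p_j per kilo, nutrient contents N_kj, and minimum requirements r_k (all of these are problem data not given in the source), the diet problem reads:

\[
\min_{x_1, x_2, x_3 \ge 0} \; p_1 x_1 + p_2 x_2 + p_3 x_3
\quad \text{subject to} \quad
\sum_{j=1}^{3} N_{kj}\, x_j \ge r_k \;\; \text{for every nutrient } k
\]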
Linear Programming • In canonical form: • How to solve? • The simplex method. • Newton's method. • Gradient descent.
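As a concrete sketch, the diet LP can be handed to an off-the-shelf solver such as scipy.optimize.linprog; the prices and nutrient values below are made up for illustration only, not taken from the lecture:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data (illustrative only, not from the lecture):
# price per kilo of carrot, cabbage, cucumber
price = np.array([2.0, 1.5, 3.0])
# rows = nutrients, columns = vegetables: content per kilo
nutrition = np.array([
    [35.0,   0.5,  0.5],   # vitamin A
    [60.0, 300.0, 10.0],   # vitamin C
    [30.0,  20.0, 10.0],   # fibre
])
required = np.array([50.0, 400.0, 60.0])  # minimum daily amounts

# linprog minimizes price @ x subject to A_ub @ x <= b_ub, so the
# "at least" nutrition constraints are flipped: -nutrition @ x <= -required.
res = linprog(price, A_ub=-nutrition, b_ub=-required, bounds=[(0, None)] * 3)
print("kilos:", res.x, "total cost:", res.fun)
```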
LP and Classification • Given a set of N samples (mi, li) • mi is the feature vector. • li = -1 or 1 is the label. • If a sample is correctly classified by a hyper-plane wᵀx + c = 0 (with w and c suitably scaled), then: li (wᵀmi + c) ≥ 1, which is a linear constraint in (w, c).
LP and Classification • (w, c) is a good classifier if it satisfies: li (wᵀmi + c) ≥ 1, i = 1..N, which are linear constraints. • LP form:
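Written out (a reconstruction of the slide's omitted formula), the LP has no objective at all, only feasibility constraints:

\[
\text{find } w, c
\quad \text{subject to} \quad
l_i \,(w^{\top} m_i + c) \ge 1, \quad i = 1, \dots, N
\]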
LP and Classification • Without any objective function, we have ALL possible solutions. [Figure: two panels showing different hyper-planes, each separating Class 1 from Class 2.]
LP and Classification • If the data is not linearly separable: • Minimize the number of errors. [Figure: Class 1 and Class 2 points overlapping, so no hyper-plane separates them perfectly.]
LP and Classification • Our objective becomes the count of violated constraints (written out below). • But this cardinality function is non-linear, so the problem is not an LP.
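A reconstruction of the intended objective, counting how many samples violate the margin constraint:

\[
\min_{w, c} \; \sum_{i=1}^{N} \mathbf{1}\big[\, l_i (w^{\top} m_i + c) < 1 \,\big]
\]

The indicator (step) function makes this objective piecewise constant, hence neither linear nor convex.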
LP and Classification • The cardinality (step) function f(x). • Solution: approximate it with the hinge-loss function. [Figure: plot of the step function f(x), with ticks at 1 on both axes.]
LP and Classification • Hinge-loss function: f(x) = max(0, 1 − x) • Or, written with linear constraints: f(x) ≥ 1 − x and f(x) ≥ 0. [Figure: plot of the hinge loss f(x), with ticks at 1 on both axes.]
LP and Classification • The classification problem now becomes the program below, which can be solved as an LP.
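Reconstructed from the hinge-loss substitution, with one slack variable εi per sample:

\[
\min_{w, c, \varepsilon} \; \sum_{i=1}^{N} \varepsilon_i
\quad \text{subject to} \quad
l_i (w^{\top} m_i + c) \ge 1 - \varepsilon_i, \;\; \varepsilon_i \ge 0, \quad i = 1, \dots, N
\]

At the optimum, εi = max(0, 1 − li(wᵀmi + c)), exactly the hinge loss of sample i.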
LP and Classification • Geometric view. [Figure: the three parallel hyper-planes wᵀx + c = 1, wᵀx + c = 0 and wᵀx + c = -1; a Class 2 sample mi and a Class 1 sample mj lie on the wrong side of their margins, with slacks εi and εj measuring the violation.]
LP and Classification • Another problem: some samples are uncertain. [Figure: Class 1 and Class 2 points lying close to the decision boundary.]
LP and Classification • Solution: maximize the margin d. [Figure: a separating hyper-plane with a margin band of width d between the two classes.]
LP and Classification • All samples are outside the margin: • the distance from every sample to the boundary is at least d/2. That means:
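In symbols (reconstructed from the standard distance-to-hyper-plane formula):

\[
\frac{l_i \,(w^{\top} m_i + c)}{\lVert w \rVert} \ge \frac{d}{2}, \quad i = 1, \dots, N
\]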
LP and Classification • Because the hyper-plane is homogeneous (rescaling w and c leaves it unchanged), we can fix the scale of w as shown below. • The objective function then becomes minimizing ‖w‖.
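A reconstruction of the standard normalization step:

\[
\min_i \; l_i \,(w^{\top} m_i + c) = 1
\;\; \Longrightarrow \;\;
\frac{d}{2} = \frac{1}{\lVert w \rVert},
\]

so maximizing the margin d is the same as minimizing ‖w‖.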
LP and Classification • The problem now becomes the maximum-margin program below.
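The resulting hard-margin formulation (the slide's formula, reconstructed):

\[
\min_{w, c} \; \lVert w \rVert
\quad \text{subject to} \quad
l_i \,(w^{\top} m_i + c) \ge 1, \quad i = 1, \dots, N
\]

With the 1-norm ‖w‖₁ this remains an LP (after linearizing the absolute values); with the more common Euclidean norm it becomes a quadratic program.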
Support Vector Machine • Together with the error minimization, we have the SVM (written out below): • λ sets the trade-off between training error and robustness (margin width).
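Combining the margin objective with the slack variables gives the soft-margin SVM (a reconstruction; the choice of norm decides whether it is an LP or a QP):

\[
\min_{w, c, \varepsilon} \; \lVert w \rVert + \lambda \sum_{i=1}^{N} \varepsilon_i
\quad \text{subject to} \quad
l_i \,(w^{\top} m_i + c) \ge 1 - \varepsilon_i, \;\; \varepsilon_i \ge 0, \quad i = 1, \dots, N
\]

As a sketch of the LP route this lecture emphasizes, here is the 1-norm soft-margin SVM posed directly as a linear program and solved with scipy.optimize.linprog; the variable layout and the helper name svm_1norm_lp are my own, not from the slides:

```python
import numpy as np
from scipy.optimize import linprog

def svm_1norm_lp(M, labels, lam=1.0):
    """1-norm soft-margin SVM as an LP (illustrative sketch).

    Variables, stacked into one vector z = [w, c, u, eps]:
      w (d)   - weights,  c (1) - bias,
      u (d)   - bounds |w_j| <= u_j (linearizes the 1-norm),
      eps (N) - hinge slacks.
    Minimize sum(u) + lam * sum(eps)
    s.t.     l_i (w^T m_i + c) >= 1 - eps_i  for every sample i.
    """
    N, d = M.shape
    n = d + 1 + d + N
    cost = np.concatenate([np.zeros(d + 1), np.ones(d), lam * np.ones(N)])

    rows, rhs = [], []
    for j in range(d):                    # w_j <= u_j and -w_j <= u_j
        for sign in (1.0, -1.0):
            r = np.zeros(n)
            r[j], r[d + 1 + j] = sign, -1.0
            rows.append(r); rhs.append(0.0)
    for i in range(N):                    # -l_i (w^T m_i + c) - eps_i <= -1
        r = np.zeros(n)
        r[:d] = -labels[i] * M[i]
        r[d] = -labels[i]
        r[d + 1 + d + i] = -1.0
        rows.append(r); rhs.append(-1.0)

    bounds = [(None, None)] * (d + 1) + [(0, None)] * (d + N)
    res = linprog(cost, A_ub=np.array(rows), b_ub=np.array(rhs), bounds=bounds)
    return res.x[:d], res.x[d]            # (w, c)

# Tiny usage example with made-up points:
M = np.array([[2.0, 2.0], [3.0, 3.0], [-1.0, -1.0], [-2.0, -1.5]])
labels = np.array([1.0, 1.0, -1.0, -1.0])
w, c = svm_1norm_lp(M, labels, lam=10.0)
print("w =", w, "c =", c)
```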