Generalization Error of the PAC Model. Let $S = \{(x^1, y^1), \dots, (x^\ell, y^\ell)\}$ be a set of training examples chosen i.i.d. according to a distribution $\mathcal{D}$. Treat the generalization error $\mathrm{err}_{\mathcal{D}}(h)$ as a random variable depending on the random selection of $S$. Find a bound on the tail of the distribution of this random variable in the form $\varepsilon = \varepsilon(\ell, \delta)$: $\varepsilon$ is a function of $\ell$ and $\delta$, where $\delta$ is the confidence level of the error bound, which is given by the learner.
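Written out (a hedged reconstruction of the standard PAC statement; the symbols $S$, $\mathcal{D}$, $\ell$, $\delta$ follow the slide above):

\[
\Pr_{S \sim \mathcal{D}^{\ell}}\big(\mathrm{err}_{\mathcal{D}}(h) > \varepsilon(\ell, \delta)\big) \;\le\; \delta .
\]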
Probably Approximately Correct. We assert: with probability at least $1 - \delta$, the error made by the hypothesis $h$ will be less than the error bound $\varepsilon(\ell, \delta)$, which does not depend on the unknown distribution $\mathcal{D}$.
Find the Hypothesis with Minimum Expected Risk? Let the training examples $S = \{(x^1, y^1), \dots, (x^\ell, y^\ell)\}$ be chosen i.i.d. according to $\mathcal{D}$ with probability density $p(x, y)$. The expected misclassification error made by $h \in \mathcal{H}$ is $R[h] = \int \tfrac{1}{2}\,|y - h(x)|\; p(x, y)\, dx\, dy$. The ideal hypothesis $h^{*}$ should have the smallest expected risk: $R[h^{*}] \le R[h]\ \forall h \in \mathcal{H}$. Unrealistic, since $p(x, y)$ is unknown!
Empirical Risk Minimization (ERM). Replace the expected risk over $p(x, y)$ by an average over the training examples. The empirical risk is $R_{\mathrm{emp}}[h] = \tfrac{1}{\ell} \sum_{i=1}^{\ell} \tfrac{1}{2}\,|y^i - h(x^i)|$. Find the hypothesis $h^{*}$ with the smallest empirical risk: $R_{\mathrm{emp}}[h^{*}] \le R_{\mathrm{emp}}[h]\ \forall h \in \mathcal{H}$ ($\mathcal{D}$ and $p(x, y)$ are not needed). Only focusing on the empirical risk will cause overfitting.
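As a concrete illustration (not from the slides; the toy data and threshold hypothesis below are invented), the empirical risk under the 0-1 loss is simply the average misclassification rate on the training sample:

```python
import numpy as np

def empirical_risk(h, X, y):
    """Empirical risk R_emp[h] = (1/l) * sum_i (1/2)|y_i - h(x_i)|
    for labels in {-1, +1}; this equals the training error rate."""
    predictions = np.array([h(x) for x in X])
    return np.mean(0.5 * np.abs(y - predictions))

# Toy example: a threshold hypothesis on 1-D inputs (illustrative only).
X = np.array([-2.0, -1.0, 0.5, 1.5, 3.0])
y = np.array([-1, -1, 1, 1, 1])
h = lambda x: 1 if x > 0 else -1
print(empirical_risk(h, X, y))  # 0.0 -- h fits this sample perfectly
```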
VC Confidence (The Bound Between $R_{\mathrm{emp}}$ and $R$). The following inequality holds with probability $1 - \delta$: the expected risk $R[h]$ is bounded by the empirical risk $R_{\mathrm{emp}}[h]$ plus a confidence term that grows with the VC dimension of $\mathcal{H}$ and shrinks with the number of examples $\ell$. See C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery 2(2) (1998), pp. 121-167.
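Concretely, in the notation of Burges' tutorial (with $v$ the VC dimension of $\mathcal{H}$, written $h$ there; the confidence level is adapted to the slide's $\delta$), the bound reads:

\[
R[h] \;\le\; R_{\mathrm{emp}}[h] + \sqrt{\frac{v\big(\log(2\ell/v) + 1\big) - \log(\delta/4)}{\ell}} .
\]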
Capacity (Complexity) of Hypothesis Space $\mathcal{H}$: VC-Dimension. A given training set $S$ is shattered by $\mathcal{H}$ if and only if, for every labeling of $S$, there is some $h \in \mathcal{H}$ consistent with this labeling. Example: three (linearly independent) points are shattered by oriented hyperplanes in $\mathbb{R}^2$.
Shattering Points with Hyperplanes in $\mathbb{R}^n$. Theorem: Consider some set of $m$ points in $\mathbb{R}^n$. Choose any one of the points as the origin. Then the $m$ points can be shattered by oriented hyperplanes if and only if the position vectors of the remaining points are linearly independent. Can you always shatter three points with a line in $\mathbb{R}^2$? (No: three collinear points cannot be shattered.)
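A quick numerical check of the non-collinear case (a sketch; the three points are chosen for illustration): for affinely independent points in $\mathbb{R}^2$, every one of the $2^3$ labelings can be realized by a line, obtained here by solving a 3x3 linear system for $(w_1, w_2, b)$ with $x_i \cdot w + b = y_i$.

```python
import itertools
import numpy as np

# Three non-collinear points in R^2 (chosen for illustration).
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Augment with a constant 1 so the bias b is part of the linear system.
A = np.hstack([points, np.ones((3, 1))])  # rows: [x1, x2, 1]

for labels in itertools.product([-1, 1], repeat=3):
    y = np.array(labels, dtype=float)
    # Solve A @ [w1, w2, b] = y exactly; a solution exists because the
    # augmented matrix is invertible for non-collinear points.
    w1, w2, b = np.linalg.solve(A, y)
    realized = np.sign(points @ np.array([w1, w2]) + b)
    assert np.array_equal(realized, y), labels

print("All 8 labelings realized: the 3 points are shattered by lines in R^2.")
```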
Definition of VC-Dimension (A Capacity Measure of Hypothesis Space $\mathcal{H}$). The Vapnik-Chervonenkis dimension, $VC(\mathcal{H})$, of a hypothesis space $\mathcal{H}$ defined over the input space $X$ is the size of the largest finite subset of $X$ shattered by $\mathcal{H}$, provided such a largest subset exists. If arbitrarily large finite sets of $X$ can be shattered by $\mathcal{H}$, then $VC(\mathcal{H}) \equiv \infty$. For example, if $\mathcal{H}$ is the set of oriented hyperplanes in $\mathbb{R}^n$, then $VC(\mathcal{H}) = n + 1$.
Optimization Problem Formulation. Problem setting: given functions $f$, $g_i$ ($i = 1, \dots, k$) and $h_j$ ($j = 1, \dots, m$) defined on a domain $\Omega \subseteq \mathbb{R}^n$, solve $\min_{x \in \Omega} f(x)$ subject to $g_i(x) \le 0$ and $h_j(x) = 0$, where $f(x)$ is called the objective function and $g_i$, $h_j$ are called the constraints.
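To make the notation concrete, here is a hedged toy instance (the objective and constraint are invented for illustration) solved with scipy.optimize.minimize:

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: min f(x) = (x1 - 1)^2 + (x2 - 2)^2
#   subject to g(x) = x1 + x2 - 2 <= 0  (one inequality constraint).
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

# SciPy's convention is g(x) >= 0, so we pass -g.
constraints = [{"type": "ineq", "fun": lambda x: -(x[0] + x[1] - 2.0)}]

result = minimize(f, x0=np.zeros(2), constraints=constraints)
print(result.x)  # approx. [0.5, 1.5]: the unconstrained minimum (1, 2)
                 # violates the constraint, so the solution sits on g(x) = 0.
```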
Definitions and Notation. Feasible region: $\mathcal{F} = \{x \in \Omega : g(x) \le 0,\ h(x) = 0\}$, where $g(x) = [g_1(x), \dots, g_k(x)]^{\top}$ and $h(x) = [h_1(x), \dots, h_m(x)]^{\top}$. A solution of the optimization problem is a point $x^{*} \in \mathcal{F}$ such that there is no point $x \in \mathcal{F}$ for which $f(x) < f(x^{*})$; such an $x^{*}$ is called a global minimum.
Definitions and Notation. A point $\bar{x} \in \mathcal{F}$ is called a local minimum of the optimization problem if there exists $\varepsilon > 0$ such that $f(x) \ge f(\bar{x})$ for all $x \in \mathcal{F}$ with $\|x - \bar{x}\| < \varepsilon$. At the solution $x^{*}$, an inequality constraint $g_i(x) \le 0$ is said to be active if $g_i(x^{*}) = 0$; otherwise it is called an inactive constraint. Note that $g_i(x) \le 0$ is equivalent to $g_i(x) + \xi_i = 0$ with $\xi_i \ge 0$, where $\xi_i$ is called the slack variable.
Definitions and Notation. Removing an inactive constraint from an optimization problem will NOT affect the optimal solution, which is a very useful feature in SVM. If $\mathcal{F} = \Omega = \mathbb{R}^n$, then the problem is called an unconstrained minimization problem; the SSVM formulation is in this category. Without a convexity assumption it is difficult to find the global minimum. The least-squares problem is also in this category.
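For instance, the least-squares problem $\min_x \|Ax - b\|_2^2$ has no constraints at all, and its convexity is exactly what makes the global minimum computable reliably (the data below is made up):

```python
import numpy as np

# Unconstrained, convex: min_x ||A x - b||_2^2. The normal equations
# A^T A x = A^T b give the global minimum; np.linalg.lstsq solves them stably.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))
b = rng.standard_normal(20)

x_star, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(x_star)  # global minimizer -- no risk of a merely local solution
```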
Gradient and Hessian. Let $f : \mathbb{R}^n \rightarrow \mathbb{R}$ be a differentiable function. The gradient of $f$ at a point $\bar{x}$ is defined as $\nabla f(\bar{x}) = \left[\frac{\partial f}{\partial x_1}(\bar{x}), \dots, \frac{\partial f}{\partial x_n}(\bar{x})\right]^{\top} \in \mathbb{R}^n$. If $f$ is twice differentiable, the Hessian matrix of $f$ at $\bar{x}$ is the $n \times n$ matrix $\nabla^2 f(\bar{x})$ with entries $\big[\nabla^2 f(\bar{x})\big]_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}(\bar{x})$.
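A sketch of checking these definitions numerically with central finite differences (the test function is arbitrary, chosen only for illustration):

```python
import numpy as np

def gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)
    return H

f = lambda x: x[0] ** 2 + 3 * x[0] * x[1]   # example function
x = np.array([1.0, 2.0])
print(gradient(f, x))  # approx. [2*x1 + 3*x2, 3*x1] = [8, 3]
print(hessian(f, x))   # approx. [[2, 3], [3, 0]]
```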
Algebra of the Classification Problem: 2-Category Linearly Separable Case. Given $m$ points in the $n$-dimensional real space $\mathbb{R}^n$, represented by an $m \times n$ matrix $A$. Membership of each point $A_i$ in the class $A^{+}$ or $A^{-}$ is specified by an $m \times m$ diagonal matrix $D$: $D_{ii} = +1$ if $A_i \in A^{+}$ and $D_{ii} = -1$ if $A_i \in A^{-}$. Separate $A^{+}$ and $A^{-}$ by two bounding planes $x^{\top} w = \gamma + 1$ and $x^{\top} w = \gamma - 1$ such that $A_i w \ge \gamma + 1$ for $D_{ii} = +1$ and $A_i w \le \gamma - 1$ for $D_{ii} = -1$. More succinctly: $D(Aw - e\gamma) \ge e$, where $e = [1, 1, \dots, 1]^{\top}$.
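A small numeric illustration of this notation (the points and the plane parameters are invented): build $A$, $D$, and $e$, then verify $D(Aw - e\gamma) \ge e$ for a separating $(w, \gamma)$:

```python
import numpy as np

# Four points in R^2: the first two in class A+, the last two in class A-.
A = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])   # class memberships
D = np.diag(d)                          # m x m diagonal matrix
e = np.ones(len(d))                     # vector of ones

# A candidate separating plane x^T w = gamma (chosen by hand here).
w = np.array([1.0, 1.0])
gamma = 0.0

# Bounding-plane condition D(Aw - e*gamma) >= e: every point is on the
# correct side of its bounding plane with margin at least 1.
print(D @ (A @ w - e * gamma))          # [4. 6. 4. 4.] -- all >= 1
```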
Robust Linear Programming (RLP): A Preliminary Approach to Support Vector Machines. Solve the linear program (LP): $\min_{w, \gamma, y} \frac{1}{m} e^{\top} y$ s.t. $D(Aw - e\gamma) + y \ge e$, $y \ge 0$, where $y$ is a nonnegative slack (error) vector. The term $\frac{1}{m} e^{\top} y$, the averaged 1-norm measure of the error vector, is called the training error. For the linearly separable case, at a solution of (LP), $y = 0$.
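A hedged sketch of solving (LP) with scipy.optimize.linprog (toy data; the decision variables are stacked as $[w; \gamma; y]$):

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2-D data: rows of A are the points, d gives the class of each row.
A = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
m, n = A.shape
D = np.diag(d)
e = np.ones(m)

# Variables z = [w (n), gamma (1), y (m)]; objective (1/m) e^T y.
c = np.concatenate([np.zeros(n + 1), e / m])

# Constraint D(Aw - e*gamma) + y >= e, rewritten for linprog's <= form
# as -DA w + d*gamma - y <= -e (note D e = d).
A_ub = np.hstack([-D @ A, d.reshape(-1, 1), -np.eye(m)])
b_ub = -e

bounds = [(None, None)] * (n + 1) + [(0, None)] * m  # w, gamma free; y >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
w, gamma, y = res.x[:n], res.x[n], res.x[n + 1:]
print(w, gamma, np.round(y, 6))  # separable data, so y should be ~0
```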
Support Vector Machines Formulation. Solve the quadratic program for some $C > 0$: $\min_{w, \gamma, \xi} \; C e^{\top} \xi + \frac{1}{2}\|w\|_2^2$ s.t. $D(Aw - e\gamma) + \xi \ge e$, $\xi \ge 0$, where $D_{ii} = \pm 1$ denotes $A^{+}$ or $A^{-}$ membership. The margin between the bounding planes, $2/\|w\|_2$, is maximized by minimizing the reciprocal of the margin, $\|w\|_2/2$ (equivalently, $\frac{1}{2}\|w\|_2^2$). Different error functions and measures of margin will lead to different SVM formulations.
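A minimal sketch of this quadratic program in cvxpy (assuming the common objective $C e^{\top}\xi + \frac{1}{2}\|w\|_2^2$; the toy data matches the earlier example):

```python
import cvxpy as cp
import numpy as np

A = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
m, n = A.shape
D = np.diag(d)
e = np.ones(m)
C = 1.0  # trade-off between training error and margin

w = cp.Variable(n)
gamma = cp.Variable()
xi = cp.Variable(m)

objective = cp.Minimize(C * cp.sum(xi) + 0.5 * cp.sum_squares(w))
constraints = [D @ (A @ w - gamma * e) + xi >= e, xi >= 0]
cp.Problem(objective, constraints).solve()

print(w.value, gamma.value)  # separating plane; margin = 2 / ||w||
```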
Linear Program and Quadratic Program. An optimization problem in which the objective function and all constraints are linear functions is called a linear programming problem; the RLP formulation above is in this category. If the objective function is convex quadratic while the constraints are all linear, then the problem is called a convex quadratic programming problem; the standard SVM formulation is in this category.
The Most Important Concept in Optimization (Minimization). A point $x^{*}$ is said to be an optimal solution of an unconstrained minimization problem if there exists no descent direction at $x^{*}$. A point $x^{*}$ is said to be an optimal solution of a constrained minimization problem if there exists no feasible descent direction at $x^{*}$: there might exist a descent direction, but moving along it would leave the feasible region.
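For a differentiable $f$, a direction $p$ is a descent direction at $x$ when $\nabla f(x)^{\top} p < 0$; a quick numeric check (the example function and points are made up):

```python
import numpy as np

def is_descent_direction(grad, p):
    """p is a descent direction iff it makes a negative inner product
    with the gradient: f decreases locally along p."""
    return float(np.dot(grad, p)) < 0.0

f_grad = lambda x: 2 * x              # gradient of f(x) = ||x||^2
x = np.array([1.0, 1.0])

print(is_descent_direction(f_grad(x), np.array([-1.0, 0.0])))  # True
print(is_descent_direction(f_grad(x), np.array([1.0, 1.0])))   # False
# At the minimizer x* = 0 the gradient vanishes, so NO direction is
# a descent direction -- matching the optimality condition above.
```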