Presentation transcript: "Optimization objective"

1 Optimization objective
Support Vector Machines: Optimization objective (Machine Learning)

2 Alternative view of logistic regression
Hypothesis: $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.
If $y = 1$, we want $h_\theta(x) \approx 1$, i.e. $\theta^T x \gg 0$. If $y = 0$, we want $h_\theta(x) \approx 0$, i.e. $\theta^T x \ll 0$.

3 Alternative view of logistic regression
Cost of a single example: $-\bigl(y \log h_\theta(x) + (1 - y)\log(1 - h_\theta(x))\bigr)$, with $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.
If $y = 1$ (want $\theta^T x \gg 0$): the cost is $-\log \frac{1}{1 + e^{-\theta^T x}}$; the SVM replaces it with a hinge-like $\text{cost}_1(\theta^T x)$.
If $y = 0$ (want $\theta^T x \ll 0$): the cost is $-\log\left(1 - \frac{1}{1 + e^{-\theta^T x}}\right)$; the SVM replaces it with $\text{cost}_0(\theta^T x)$.

4 Support vector machine
Logistic regression:
$\min_\theta \frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)}\left(-\log h_\theta(x^{(i)})\right) + (1 - y^{(i)})\left(-\log(1 - h_\theta(x^{(i)}))\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$
Support vector machine:
$\min_\theta C\sum_{i=1}^{m}\left[ y^{(i)}\,\text{cost}_1(\theta^T x^{(i)}) + (1 - y^{(i)})\,\text{cost}_0(\theta^T x^{(i)})\right] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2$
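To make this concrete, here is a minimal Octave sketch (not from the original slides) of hinge-style surrogates cost1/cost0 and the SVM objective; the unit slope is an illustrative assumption, only the kinks at z = 1 and z = -1 matter:

cost1 = @(z) max(0, 1 - z);    % cost for y = 1; zero once z >= 1
cost0 = @(z) max(0, 1 + z);    % cost for y = 0; zero once z <= -1
% SVM objective for theta ((n+1) x 1), X (m x (n+1), first column all
% ones), y (m x 1, labels in {0,1}), and trade-off parameter C:
J = @(theta, X, y, C) ...
    C * sum(y .* cost1(X*theta) + (1 - y) .* cost0(X*theta)) ...
    + 0.5 * sum(theta(2:end).^2);    % theta_0 is not regularized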

5 SVM hypothesis
Hypothesis: $h_\theta(x) = 1$ if $\theta^T x \ge 0$, else $h_\theta(x) = 0$.

6 Large Margin Intuition
Support Vector Machines: Large Margin Intuition (Machine Learning)

7 Support Vector Machine
[Plots of $\text{cost}_1(z)$ and $\text{cost}_0(z)$; $\text{cost}_1$ is zero for $z \ge 1$, $\text{cost}_0$ is zero for $z \le -1$.]
If $y = 1$, we want $\theta^T x \ge 1$ (not just $\ge 0$). If $y = 0$, we want $\theta^T x \le -1$ (not just $< 0$).

8 SVM Decision Boundary
With C very large, the optimization reduces to:
$\min_\theta \frac{1}{2}\sum_{j=1}^{n}\theta_j^2$, subject to $\theta^T x^{(i)} \ge 1$ whenever $y^{(i)} = 1$, and $\theta^T x^{(i)} \le -1$ whenever $y^{(i)} = 0$.

9 SVM Decision Boundary: Linearly separable case
[Figure: linearly separable data in the (x1, x2) plane, separated with the largest possible margin]
Large margin classifier

10 Large margin classifier in presence of outliers
[Figure: (x1, x2) data with an outlier and the resulting decision boundaries]

11 The mathematics behind large margin classification (optional)
Support Vector Machines: The mathematics behind large margin classification, optional (Machine Learning)

12 Vector Inner Product
For $u, v \in \mathbb{R}^2$: $u^T v = u_1 v_1 + u_2 v_2$, $\|u\| = \sqrt{u_1^2 + u_2^2}$, and $u^T v = p \cdot \|u\|$, where $p$ is the signed length of the projection of $v$ onto $u$.
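A quick Octave check of these identities (the vectors are arbitrary illustrative values, not from the slides):

u = [4; 2];  v = [1; 3];
u' * v                    % inner product: 4*1 + 2*3 = 10
norm_u = sqrt(u' * u);    % ||u|| = sqrt(4^2 + 2^2)
p = (u' * v) / norm_u;    % signed length of the projection of v onto u
p * norm_u                % recovers u' * v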

13 SVM Decision Boundary
$\min_\theta \frac{1}{2}\sum_{j=1}^{n}\theta_j^2 = \frac{1}{2}\|\theta\|^2$, subject to $\theta^T x^{(i)} \ge 1$ if $y^{(i)} = 1$ and $\theta^T x^{(i)} \le -1$ if $y^{(i)} = 0$. By the inner-product identity above, $\theta^T x^{(i)} = p^{(i)} \cdot \|\theta\|$, where $p^{(i)}$ is the signed projection of $x^{(i)}$ onto $\theta$, so the constraints become $p^{(i)} \cdot \|\theta\| \ge 1$ (or $\le -1$).

14 SVM Decision Boundary
To satisfy $p^{(i)} \cdot \|\theta\| \ge 1$ (or $\le -1$) while keeping $\|\theta\|$ small, the projections $p^{(i)}$ must be large in magnitude. Since $\theta$ is orthogonal to the decision boundary, large $|p^{(i)}|$ means the examples lie far from the boundary: minimizing $\|\theta\|$ therefore yields the large margin classifier.

15 Support Vector Machines: Kernels I (Machine Learning)

16 Non-linear decision boundary
[Figure: (x1, x2) data that is not linearly separable]
Predict $y = 1$ if $\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1 x_2 + \theta_4 x_1^2 + \ldots \ge 0$. Is there a different / better choice of the features $f_1, f_2, f_3, \ldots$?

17 Kernel
Given $x$, compute new features that depend on the proximity of $x$ to landmarks $l^{(1)}, l^{(2)}, l^{(3)}$:
$f_i = \text{similarity}(x, l^{(i)}) = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$
[Figure: landmarks $l^{(1)}, l^{(2)}, l^{(3)}$ in the (x1, x2) plane]

18 Kernels and similarity
$f_1 = \text{similarity}(x, l^{(1)}) = \exp\left(-\frac{\|x - l^{(1)}\|^2}{2\sigma^2}\right)$
If $x \approx l^{(1)}$: $f_1 \approx \exp(0) = 1$.
If $x$ is far from $l^{(1)}$: $f_1 \approx \exp\left(-\frac{\text{large number}}{2\sigma^2}\right) \approx 0$.
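A small Octave sketch of this behavior (the landmark and test points are illustrative values, not from the slides):

sigma2 = 1.0;                                      % sigma^2, the bandwidth
sim = @(x, l) exp(-sum((x - l).^2) / (2*sigma2));
l1 = [3; 5];
sim([3; 5], l1)    % x equal to the landmark -> 1
sim([4; 5], l1)    % x near the landmark     -> exp(-0.5), about 0.61
sim([9; 0], l1)    % x far from the landmark -> essentially 0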

19 Example
$f_1 = \exp\left(-\frac{\|x - l^{(1)}\|^2}{2\sigma^2}\right)$ for a fixed landmark $l^{(1)}$, plotted for several values of $\sigma^2$: the larger $\sigma^2$ is, the more slowly $f_1$ falls off as $x$ moves away from $l^{(1)}$.

20 [Figure: (x1, x2) plane with landmarks; points near the landmarks are predicted y = 1, tracing a non-linear decision boundary]

21 Support Vector Machines: Kernels II (Machine Learning)

22 Choosing the landmarks
Given $x$: compute $f_1 = \text{similarity}(x, l^{(1)})$, $f_2 = \text{similarity}(x, l^{(2)})$, $f_3 = \text{similarity}(x, l^{(3)})$.
Predict $y = 1$ if $\theta_0 + \theta_1 f_1 + \theta_2 f_2 + \theta_3 f_3 \ge 0$.
Where to get $l^{(1)}, l^{(2)}, l^{(3)}, \ldots$?
[Figure: (x1, x2) plane with landmarks]

23 SVM with Kernels
Given $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})$, choose $l^{(1)} = x^{(1)}, l^{(2)} = x^{(2)}, \ldots, l^{(m)} = x^{(m)}$.
Given an example $x$: compute $f_i = \text{similarity}(x, l^{(i)})$ for $i = 1, \ldots, m$, and collect them (with $f_0 = 1$) into $f \in \mathbb{R}^{m+1}$.
For a training example $(x^{(i)}, y^{(i)})$: compute $f^{(i)}$ in the same way from $x^{(i)}$.

24 SVM with Kernels
Hypothesis: given $x$, compute features $f \in \mathbb{R}^{m+1}$. Predict "y = 1" if $\theta^T f \ge 0$.
Training: $\min_\theta C \sum_{i=1}^{m} \left[ y^{(i)}\,\text{cost}_1(\theta^T f^{(i)}) + (1 - y^{(i)})\,\text{cost}_0(\theta^T f^{(i)}) \right] + \frac{1}{2}\sum_{j=1}^{m} \theta_j^2$
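A minimal Octave sketch of the feature construction and prediction steps (theta is assumed to have been trained already, e.g. by an SVM package as the later slides suggest):

% Map one example x (n x 1) to kernel features f ((m+1) x 1), using
% the m training examples X (m x n) as landmarks.
function f = kernel_features(x, X, sigma2)
  m = size(X, 1);
  f = zeros(m + 1, 1);
  f(1) = 1;                 % f_0 = 1 (intercept term)
  for i = 1:m
    f(i + 1) = exp(-sum((x' - X(i, :)).^2) / (2*sigma2));
  end
end
% Predict y = 1 if theta' * f >= 0:
% pred = (theta' * kernel_features(x, X, sigma2)) >= 0;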

25 SVM parameters
C ($= 1/\lambda$). Large C: lower bias, high variance. Small C: higher bias, low variance.
$\sigma^2$. Large $\sigma^2$: the features $f_i$ vary more smoothly; higher bias, lower variance. Small $\sigma^2$: the features $f_i$ vary less smoothly; lower bias, higher variance.

26 Support Vector Machines: Using an SVM (Machine Learning)

27 Using an SVM software package
Use an SVM software package (e.g. liblinear, libsvm, ...) to solve for the parameters $\theta$. You need to specify:
Choice of parameter C.
Choice of kernel (similarity function), e.g.:
No kernel ("linear kernel"): predict "y = 1" if $\theta^T x \ge 0$.
Gaussian kernel: $f_i = \exp\left(-\frac{\|x - l^{(i)}\|^2}{2\sigma^2}\right)$, where $l^{(i)} = x^{(i)}$. Need to choose $\sigma^2$.
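As a hedged example of such a package call, a sketch using libsvm's Octave/MATLAB interface (the flag values here are illustrative assumptions, not a recommendation):

% libsvm: -t 2 selects the Gaussian (RBF) kernel, -c sets C,
% -g sets gamma = 1/(2*sigma^2).  X is m x n, y is m x 1.
model = svmtrain(y, X, '-t 2 -c 1 -g 0.5');
[pred, acc, dec] = svmpredict(ytest, Xtest, model);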

28 Kernel (similarity) functions:
function f = kernel(x1, x2)
  sigma2 = 1.0;  % sigma^2 (assumed here; set before use)
  f = exp(-sum((x1 - x2).^2) / (2*sigma2));  % exp(-||x1 - x2||^2 / (2*sigma^2))
end
Note: perform feature scaling before using the Gaussian kernel.
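A quick usage check of the function above (illustrative values):

kernel([1; 2], [1; 2])    % identical inputs   -> 1
kernel([1; 2], [2; 4])    % ||x1 - x2||^2 = 5  -> exp(-2.5), about 0.08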

29 Other choices of kernel
Note: not every similarity function $\text{similarity}(x, l)$ is a valid kernel. (To guarantee that the SVM package's optimization runs correctly and does not diverge, it needs to satisfy a technical condition called "Mercer's Theorem".)
Many kernels are available:
Polynomial kernel: $k(x, l) = (x^T l + \text{constant})^{\text{degree}}$
More esoteric: string kernel, chi-square kernel, histogram intersection kernel, ...

30 Multi-class classification
Many SVM packages already have built-in multi-class classification functionality. Otherwise, use the one-vs.-all method: train $K$ SVMs, one to distinguish $y = i$ from the rest, for $i = 1, 2, \ldots, K$; get $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(K)}$. Pick the class $i$ with the largest $(\theta^{(i)})^T x$, as in the sketch below.
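A minimal Octave sketch of the one-vs.-all prediction step (training the K individual SVMs is assumed already done; Theta is a hypothetical matrix holding one trained parameter vector per row):

% Theta: K x (n+1), row i = trained theta for "class i vs. rest".
% x: (n+1) x 1 feature vector with a leading 1 for the intercept.
scores = Theta * x;          % scores(i) = (theta^(i))' * x
[~, pred] = max(scores);     % predicted class = largest score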

31 Logistic regression vs. SVMs
$n$ = number of features, $m$ = number of training examples.
If $n$ is large (relative to $m$): use logistic regression, or an SVM without a kernel ("linear kernel").
If $n$ is small and $m$ is intermediate: use an SVM with a Gaussian kernel.
If $n$ is small and $m$ is large: create/add more features, then use logistic regression or an SVM without a kernel.
A neural network is likely to work well for most of these settings, but may be slower to train.

