1
6.S093 Visual Recognition through Machine Learning Competition. Image by kirkh.deviantart.com. Joseph Lim and Aditya Khosla. Acknowledgment: Many slides from David Sontag and the Machine Learning Group of UT Austin.
2
What is Machine Learning???
3
According to Wikipedia, machine learning concerns the construction and study of systems that can learn from data.
4
What is ML? Classification – From data to discrete classes
5
What is ML? Classification – From data to discrete classes Regression – From data to a continuous value
6
What is ML? Classification – From data to discrete classes Regression – From data to a continuous value Ranking – Comparing items
7
What is ML? Classification – From data to discrete classes Regression – From data to a continuous value Ranking – Comparing items Clustering – Discovering structure in data
8
What is ML? Classification – From data to discrete classes Regression – From data to a continuous value Ranking – Comparing items Clustering – Discovering structure in data Others
9
Classification: data to discrete classes. Spam filtering: Spam / Regular / Urgent
10
Classification: data to discrete classes. Object classification: Fries / Hamburger / None
11
Regression: data to a value Stock market
12
Regression: data to a value Weather prediction
13
Clustering: Discovering structure Set of Images
14
Clustering
15
Unsupervised learning Requires data, but NO LABELS Detect patterns – Group emails or search results – Regions of images Useful when you don't know what you're looking for
16
Clustering Basic idea: group together similar instances Example: 2D point patterns
19
Clustering Basic idea: group together similar instances Example: 2D point patterns What could “similar” mean? – One option: small Euclidean distance – Clustering is crucially dependent on the measure of similarity (or distance) between “points”
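As a concrete illustration of the "small Euclidean distance" notion of similarity, here is a minimal NumPy sketch (not from the slides; the point values are made up) that computes pairwise distances between 2D points:

```python
import numpy as np

# Hypothetical set of 2D points (one point per row)
points = np.array([[0.0, 0.0],
                   [0.5, 0.2],
                   [5.0, 5.1]])

# Pairwise Euclidean distances: dist[i, j] = ||points[i] - points[j]||
diffs = points[:, None, :] - points[None, :, :]
dist = np.sqrt((diffs ** 2).sum(axis=-1))
print(dist)  # nearby points get small distances, distant points get large ones
```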
20
K-Means An iterative clustering algorithm – Initialize: pick K random points as cluster centers – Alternate: (1) assign each data point to the closest cluster center; (2) change each cluster center to the average of its assigned points – Stop when no point's assignment changes Animation is from Andrey A. Shabalin's website
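A minimal NumPy sketch of the algorithm just described (pick K random points as centers, then alternate assignment and averaging until no assignment changes). The function and variable names are my own, not from the course code:

```python
import numpy as np

def kmeans(X, K, seed=0):
    """Cluster the rows of X into K groups; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize: pick K distinct random data points as the cluster centers
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    labels = None
    while True:
        # Assignment step: each point goes to its closest center (O(KN) distance computations)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        new_labels = dists.argmin(axis=1)
        # Stop when no point's assignment changes
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: move each center to the average of its assigned points (O(N))
        for k in range(K):
            if np.any(labels == k):          # guard against an empty cluster
                centers[k] = X[labels == k].mean(axis=0)
    return centers, labels

# Example usage on random 2D points
X = np.random.default_rng(1).normal(size=(200, 2))
centers, labels = kmeans(X, K=3)
```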
22
Properties of K-means algorithm Guaranteed to converge in a finite number of iterations Running time per iteration: – Assign data points to closest cluster center: O(KN) – Change the cluster center to the average of its assigned points: O(N)
23
Example: K-Means for Segmentation (K=2 vs. original) The goal of segmentation is to partition an image into regions, each of which has reasonably homogeneous visual appearance.
24
Example: K-Means for Segmentation Original image segmented with K=2, K=3, and K=10
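A rough sketch of how such a segmentation could be produced: cluster the pixel colors with K-means and replace each pixel by its cluster's mean color. This uses scikit-learn's KMeans; the image file name is a placeholder, and this is an illustration rather than the exact pipeline behind the slide's figures:

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

K = 3
img = np.asarray(Image.open("scene.jpg").convert("RGB"), dtype=float)  # hypothetical file name
pixels = img.reshape(-1, 3)                      # one RGB row per pixel

km = KMeans(n_clusters=K, n_init=10, random_state=0).fit(pixels)
segmented = km.cluster_centers_[km.labels_]      # replace each pixel by its cluster's mean color
segmented = segmented.reshape(img.shape).astype(np.uint8)
Image.fromarray(segmented).save(f"scene_k{K}.png")
```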
25
Pitfalls of K-Means The K-means algorithm is a heuristic – It requires initial means, and it does matter which you pick! – K-means can get stuck in poor local optima (the slide's figures show one stuck solution where K=1 should be better and another where K=2 should be better)
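One common way to reduce the risk of getting stuck is to run K-means several times from different random initializations and keep the run with the lowest within-cluster sum of squared distances. A sketch of that idea, reusing the kmeans() function from the earlier snippet (the restart loop is my own illustration; scikit-learn's KMeans does the same thing via its n_init parameter):

```python
import numpy as np

def kmeans_cost(X, centers, labels):
    # Within-cluster sum of squared distances (lower is better)
    return float(((X - centers[labels]) ** 2).sum())

def kmeans_with_restarts(X, K, n_restarts=10):
    best = None
    for seed in range(n_restarts):
        centers, labels = kmeans(X, K, seed=seed)   # kmeans() from the sketch above
        cost = kmeans_cost(X, centers, labels)
        if best is None or cost < best[0]:
            best = (cost, centers, labels)
    return best[1], best[2]
```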
26
Classification
27
Typical Recognition System: image x → features F(x) → classify → y (+1 bus / -1 not bus). Extract features from an image; a classifier then makes a decision based on the extracted features.
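A hedged sketch of the x → F(x) → y pipeline: an assumed, deliberately crude color-histogram feature extractor followed by a linear SVM that outputs +1 / -1. The feature choice, function names, and the train_images / train_labels variables are illustrative assumptions, not part of the original system:

```python
import numpy as np
from sklearn.svm import LinearSVC

def extract_features(image):
    """F(x): a crude 8x8x8 color histogram as the feature vector (illustrative only)."""
    hist, _ = np.histogramdd(image.reshape(-1, 3), bins=(8, 8, 8), range=[(0, 256)] * 3)
    return hist.ravel() / hist.sum()

# train_images and train_labels (+1 = bus, -1 = not bus) are assumed to exist:
# X_train = np.stack([extract_features(im) for im in train_images])
# clf = LinearSVC(C=1.0).fit(X_train, train_labels)
# y = clf.predict(extract_features(test_image)[None, :])   # +1 bus / -1 not bus
```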
29
Classification Supervised learning Requires data, AND LABELS Useful when you know what you’re looking for
30
Linear classifier Which of these linear separators is optimal?
31
Maximum Margin Classification Maximizing the margin is good according to intuition and PAC theory. Implies that only support vectors matter; other training examples are ignorable.
32
Classification Margin The distance from an example x_i to the separator is r = y_i (w^T x_i + b) / ||w||. Examples closest to the hyperplane are support vectors. The margin ρ of the separator is the distance between support vectors.
33
Linear SVM Mathematically
Let the training set {(x_i, y_i)}, i=1..n, with x_i ∈ R^d and y_i ∈ {-1, 1}, be separated by a hyperplane with margin ρ. Then for each training example (x_i, y_i):
w^T x_i + b ≤ -ρ/2 if y_i = -1
w^T x_i + b ≥ ρ/2 if y_i = 1
which is equivalent to y_i (w^T x_i + b) ≥ ρ/2.
For every support vector x_s the above inequality is an equality. After rescaling w and b by ρ/2 in the equality, we obtain that the distance between each x_s and the hyperplane is r = y_s (w^T x_s + b) / ||w|| = 1 / ||w||.
Then the margin can be expressed through the (rescaled) w and b as ρ = 2r = 2 / ||w||.
34
Linear SVMs Mathematically (cont.)
Then we can formulate the quadratic optimization problem:
Find w and b such that ρ = 2 / ||w|| is maximized and for all (x_i, y_i), i=1..n: y_i (w^T x_i + b) ≥ 1
which can be reformulated as:
Find w and b such that Φ(w) = ||w||^2 = w^T w is minimized and for all (x_i, y_i), i=1..n: y_i (w^T x_i + b) ≥ 1
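In practice this quadratic program is usually handed to a solver rather than worked out by hand. A minimal sketch using scikit-learn's SVC with a linear kernel on made-up, linearly separable toy data; a very large C approximates the hard-margin formulation above:

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data: two clusters of 2D points
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([2, 2], 0.5, size=(20, 2)),
               rng.normal([-2, -2], 0.5, size=(20, 2))])
y = np.array([1] * 20 + [-1] * 20)

# A very large C approximates the hard-margin linear SVM
clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]
print("margin = 2/||w|| =", 2.0 / np.linalg.norm(w))
print("number of support vectors:", len(clf.support_vectors_))
```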
35
Is this perfect?
37
Soft Margin Classification What if the training set is not linearly separable? Slack variables ξ_i can be added to allow misclassification of difficult or noisy examples; the resulting margin is called soft.
38
Soft Margin Classification Mathematically
The old formulation:
Find w and b such that Φ(w) = w^T w is minimized and for all (x_i, y_i), i=1..n: y_i (w^T x_i + b) ≥ 1
The modified formulation incorporates slack variables:
Find w and b such that Φ(w) = w^T w + C Σ ξ_i is minimized and for all (x_i, y_i), i=1..n: y_i (w^T x_i + b) ≥ 1 - ξ_i, ξ_i ≥ 0
The parameter C can be viewed as a way to control overfitting: it "trades off" the relative importance of maximizing the margin and fitting the training data.
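A small sketch of the trade-off that C controls: sweep a few values of C on made-up, overlapping toy data and compare the margin width 2/||w|| with the training accuracy (the data and the particular C values are my own choices):

```python
import numpy as np
from sklearn.svm import SVC

# Overlapping (not linearly separable) toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([1.5, 1.5], 1.2, size=(50, 2)),
               rng.normal([-1.5, -1.5], 1.2, size=(50, 2))])
y = np.array([1] * 50 + [-1] * 50)

for C in [0.01, 1, 100]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2.0 / np.linalg.norm(clf.coef_[0])
    # Small C: wide margin, more slack; large C: narrow margin, fewer training errors
    print(f"C={C:>6}: margin={margin:.2f}, train accuracy={clf.score(X, y):.2f}")
```

With small C the optimizer tolerates more slack and keeps a wide margin; with large C it tries harder to fit every training point, shrinking the margin.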
39
Exercises Exercise 1. Implement K-means Exercise 2. Play with the SVM's C parameter