Object Recognition: Object Classes and Individual Recognition


1 Object recognition

2 Object Classes

3 Individual Recognition

4 Is this a dog?

5 Variability of Airplanes Detected

6 Variability of Horses Detected

7 Class / Non-class

8

9

10 Recognition with 3-D primitives: Geons

11 Visual Class: Common Building Blocks

12 Optimal Class Components?
Large features are too rare; small features are found everywhere.
Find features that carry the highest amount of information.

13 Entropy
Entropy of a binary variable x ∈ {0, 1}: H = −p log₂ p − (1 − p) log₂ (1 − p)
p = 0.5, 0.5  →  H = 1.0
p = 0.1, 0.9  →  H = 0.47
p = 0.01, 0.99  →  H = 0.08

14 Mutual Information I(X;Y)
X alone: p(x) = 0.5, 0.5  →  H(X) = 1.0
X given Y:
  Y = 0: p(x) = 0.8, 0.2  →  H = 0.72
  Y = 1: p(x) = 0.1, 0.9  →  H = 0.47
H(X|Y) = 0.5·0.72 + 0.5·0.47 = 0.595
I(X;Y) = H(X) − H(X|Y) = 1 − 0.595 = 0.405
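A quick Python check of the numbers on this slide (a minimal sketch; the binary-entropy helper h() is an illustrative name, not from the lecture):

```python
import numpy as np

def h(p):
    """Binary entropy in bits of a variable with p(x=0) = p."""
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

h_x = h(0.5)                               # H(X) = 1.0
h_x_given_y = 0.5 * h(0.8) + 0.5 * h(0.1)  # H(X|Y) = 0.5*0.72 + 0.5*0.47 ≈ 0.595
print("I(X;Y) =", h_x - h_x_given_y)       # ≈ 0.405
```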

15 Mutual information
I(C;F) = H(C) − H(C|F)
[Diagram: H(C) for the class alone, compared with H(C) when F = 1 and when F = 0.]

16 Mutual Information II

17 Computing MI from Examples
Mutual information can be measured from examples:
100 Faces, 100 Non-faces.
Feature detected: 44 times in faces, 6 times in non-faces.
H(C) = 1, H(C|F) = 0.8475
Mutual information: I(C;F) = 0.1525
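A short numerical verification of this slide's counts (a sketch in Python; the helper entropy() and the variable names are illustrative assumptions):

```python
import numpy as np

def entropy(p):
    """Entropy (bits) of a discrete distribution given as probabilities."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Counts from the slide: the feature fires in 44 of 100 faces and 6 of 100 non-faces.
n_face, n_nonface = 100, 100
f_face, f_nonface = 44, 6

p_f1 = (f_face + f_nonface) / (n_face + n_nonface)   # p(F=1) = 0.25
p_f0 = 1 - p_f1

h_c = entropy([0.5, 0.5])                             # H(C) = 1 bit
h_c_f1 = entropy([f_face / 50, f_nonface / 50])       # H(C | F=1)
h_c_f0 = entropy([(n_face - f_face) / 150, (n_nonface - f_nonface) / 150])
h_c_given_f = p_f1 * h_c_f1 + p_f0 * h_c_f0           # ≈ 0.8475
print("I(C;F) =", h_c - h_c_given_f)                  # ≈ 0.1525
```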

18 Full KL / Classification Error
[Diagram: class C generates feature F with p(F|C); the classifier uses q(C|F); p(C) is the class prior.]

19 Optimal classification features
Theoretically: maximizing the delivered information minimizes the classification error.
In practice: informative object components can be identified in training images.

20 Selecting Fragments

21 Adding a New Fragment (max–min selection)
For a candidate fragment F_i and an already-selected fragment F_k:
ΔMI(F_i, F_k) = MI(F_i, F_k ; class) − MI(F_k ; class)
Select: max_i min_k ΔMI(F_i, F_k)
(Min. over existing fragments, Max. over the entire pool.)
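A plain NumPy sketch of this greedy max–min selection (the function names and the binary detection matrix F are illustrative assumptions, not the authors' code):

```python
import numpy as np

def mi(features, c):
    """Mutual information (bits) between a list of binary feature columns and labels c."""
    xs = np.column_stack(features)
    codes = xs @ (2 ** np.arange(xs.shape[1]))          # encode each feature combination
    total = 0.0
    for v in np.unique(codes):
        for cls in (0, 1):
            p_joint = np.mean((codes == v) & (c == cls))
            if p_joint > 0:
                total += p_joint * np.log2(p_joint / (np.mean(codes == v) * np.mean(c == cls)))
    return total

def max_min_select(F, c, n_select):
    """Greedy max-min fragment selection.
    F: (n_images, n_fragments) binary detections; c: (n_images,) class labels."""
    pool = list(range(F.shape[1]))
    # Start from the single most informative fragment.
    first = max(pool, key=lambda i: mi([F[:, i]], c))
    selected, pool = [first], [i for i in pool if i != first]
    while len(selected) < n_select and pool:
        def gain(i):
            # Worst-case added information over all already-selected fragments.
            return min(mi([F[:, i], F[:, k]], c) - mi([F[:, k]], c) for k in selected)
        best = max(pool, key=gain)
        selected.append(best)
        pool.remove(best)
    return selected
```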

22 Highly Informative Face Fragments

23 Intermediate Complexity

24 Decision
Combine all detected fragments F_k: ∑ w_k F_k > θ

25 Optimal Separation: SVM / Perceptron
∑ w_k F_k = θ is a hyperplane.

26 Combining fragments linearly
Conditional independence: p(F1, F2 | C) = p(F1 | C) p(F2 | C)
Likelihood-ratio test: p(F1, …, Fn | C1) / p(F1, …, Fn | C2) > θ
With conditional independence this factors into a product of per-fragment ratios; taking logs gives
∑_i W(F_i) > θ,  with  W(F_i) = log [ p(F_i | C1) / p(F_i | C2) ]

27 If F_i = 1, take log [ p(F_i=1 | C1) / p(F_i=1 | C2) ]; if F_i = 0, take log [ p(F_i=0 | C1) / p(F_i=0 | C2) ].
Instead: ∑ w_i > θ over the detected fragments only, with w_i = w(F_i=1) − w(F_i=0); the constant term is absorbed into the threshold.
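A minimal sketch of this naive-Bayes linear rule (training data layout, smoothing constant, and function names are assumptions for illustration):

```python
import numpy as np

def train_weights(F, c, eps=1.0):
    """Per-fragment weights for the linear decision rule, with add-one smoothing.
    F: (n_images, n_fragments) binary detections; c: (n_images,) labels, 1 = class, 0 = non-class."""
    p1 = (F[c == 1].sum(0) + eps) / (np.sum(c == 1) + 2 * eps)   # p(F_i = 1 | class)
    p0 = (F[c == 0].sum(0) + eps) / (np.sum(c == 0) + 2 * eps)   # p(F_i = 1 | non-class)
    w_on = np.log2(p1 / p0)                  # weight when F_i = 1
    w_off = np.log2((1 - p1) / (1 - p0))     # weight when F_i = 0
    w = w_on - w_off                         # weight applied to detected fragments only
    bias = w_off.sum()                       # constant absorbed into the threshold
    return w, bias

def classify(f, w, bias, theta=0.0):
    """Decide class membership for one image's binary detection vector f."""
    return w @ f + bias > theta
```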

28 Class II

29 Class / Non-class

30 Fragments with positions
∑ w_k F_k > θ, over all detected fragments within their regions.

31 Horse-class features

32 Examples of Horses Detected

33 Interest points (Harris), SIFT Descriptors
Second-moment matrix, summed over a neighborhood: ∑ [ I_x²  I_x I_y ; I_x I_y  I_y² ]

34 Harris Corner Operator
H = [ ⟨I_x²⟩  ⟨I_x I_y⟩ ; ⟨I_x I_y⟩  ⟨I_y²⟩ ], averages within a neighborhood.
Corner: the two eigenvalues λ1, λ2 are both large.
Indirectly: 'Cornerness' = det(H) − k trace²(H)
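A minimal NumPy/SciPy sketch of the Harris response described above (the window size and k = 0.04 are common default choices, not values given on the slide):

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def harris_response(image, k=0.04, window=5):
    """Harris cornerness R = det(H) - k * trace(H)^2 at every pixel of a 2-D float image."""
    ix = sobel(image, axis=1)   # horizontal gradient I_x
    iy = sobel(image, axis=0)   # vertical gradient I_y
    # Average the gradient products over a local neighborhood.
    ixx = uniform_filter(ix * ix, window)
    iyy = uniform_filter(iy * iy, window)
    ixy = uniform_filter(ix * iy, window)
    det = ixx * iyy - ixy ** 2
    trace = ixx + iyy
    return det - k * trace ** 2
```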

35 Harris Corner Examples

36 SIFT descriptor
David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
Example: 4×4 sub-regions, with a histogram of 8 orientations in each.
V = 128 values: g_1,1, …, g_1,8, …, g_16,1, …, g_16,8
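For reference, OpenCV exposes this descriptor directly; a small usage sketch (assumes OpenCV ≥ 4.4 with SIFT included, and "image.png" is a placeholder path):

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each descriptor row has 128 values: 4x4 sub-regions x 8 orientation bins.
print(descriptors.shape)   # (num_keypoints, 128)
```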

37 SIFT

38 Constellation of Patches
Using interest points. Fergus, Perona, Zisserman 2003. Six-part motorcycle model with a joint Gaussian shape model.

39 Bag of words and Unsupervised Classification

40 Bag of visual words: a large collection of image patches.

41 Each class has its own histogram of visual words.

42 pLSA
Classify documents automatically, find related documents, etc., based on word frequency.
Documents contain different 'topics' such as Economics, Sports, Politics, France…
Each topic has its typical word frequencies; Economics will have a high occurrence of 'interest', 'bonds', 'inflation', etc.
We observe the probabilities p(w_i | d_n) of words in documents.
Each document contains several topics z_k.
A word has a different probability in each topic, p(w_i | z_k).
A given document has a mixture of topics: p(z_k | d_n).
The word-frequency model is: p(w_i | d_n) = Σ_k p(w_i | z_k) p(z_k | d_n)
pLSA was used to discover topics, and arrange documents according to their topics.

43 pLSA
The word-frequency model is: p(w_i | d_n) = Σ_k p(w_i | z_k) p(z_k | d_n)
We observe p(w_i | d_n) and find the best p(w_i | z_k) and p(z_k | d_n) to explain the data.
pLSA was used to discover topics, and then arrange documents according to their topics.
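A compact EM sketch of this fitting step (a minimal illustration only; the original pLSA work also uses tempered EM and stopping criteria, which are omitted here, and all names are assumptions):

```python
import numpy as np

def plsa(counts, n_topics, n_iter=100, seed=0):
    """Fit p(w|z) and p(z|d) to a (n_words, n_docs) word-count matrix by EM."""
    rng = np.random.default_rng(seed)
    n_words, n_docs = counts.shape
    p_w_z = rng.random((n_words, n_topics)); p_w_z /= p_w_z.sum(0)   # p(w | z)
    p_z_d = rng.random((n_topics, n_docs)); p_z_d /= p_z_d.sum(0)    # p(z | d)
    for _ in range(n_iter):
        # E-step: posterior p(z | w, d) for every word-document pair.
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]        # (n_words, n_topics, n_docs)
        joint /= joint.sum(1, keepdims=True) + 1e-12
        # M-step: re-estimate the factors from expected counts.
        expected = counts[:, None, :] * joint
        p_w_z = expected.sum(2); p_w_z /= p_w_z.sum(0, keepdims=True) + 1e-12
        p_z_d = expected.sum(0); p_z_d /= p_z_d.sum(0, keepdims=True) + 1e-12
    return p_w_z, p_z_d
```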

44 Discovering objects and their location in images
Sivic, Russell, Efros, Freeman & Zisserman, ICCV 2005.
Uses simple 'visual words' for classification.
Not the best classifier, but obtains unsupervised classification using pLSA.

45 Visual words – unsupervised classification
Four classes: faces, cars, airplanes, motorbikes, plus non-class (background) images. Training images are mixed.
Allowed 7 topics: one per class, plus 3 topics for the background.
Visual words: local patches (say 10×10) described by SIFT descriptors, quantized against a codewords dictionary.
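A sketch of building such a codewords dictionary and per-image word histograms (the choice of k-means, the vocabulary size of 500, and the function names are illustrative assumptions, not details from the slide):

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(descriptor_list, n_words=500, seed=0):
    """Quantize local descriptors (e.g. SIFT) into a visual-word dictionary via k-means.
    descriptor_list: list of (n_i, 128) arrays, one per training image."""
    all_desc = np.vstack(descriptor_list)
    return KMeans(n_clusters=n_words, random_state=seed, n_init=10).fit(all_desc)

def word_histogram(descriptors, codebook):
    """Bag-of-words histogram p(w | image) for one image's descriptors."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()
```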

46 Learning
Data: the matrix D_ij = p(w_i | I_j).
During learning, discover 'topics' (classes + background): p(w_i | I_j) = Σ_k p(w_i | T_k) p(T_k | I_j)
Optimize over p(w_i | T_k) and p(T_k | I_j).
The topics are expected to correspond to the classes; in practice, mainly one dominant topic was found per class image.

47 Results of learning

48 Classifying a new image
New image I: measure p(w_i | I).
Find the topic mixture of the new image: p(w_i | I) = Σ_k p(w_i | T_k) p(T_k | I)
Keep the learned p(w_i | T_k) fixed and optimize over p(T_k | I).
Classify by the largest (non-background) topic.
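A short sketch of this "fold-in" step, reusing the topic model fitted above (a minimal illustration; the iteration count and names are assumptions):

```python
import numpy as np

def fold_in(word_hist, p_w_z, n_iter=50):
    """Estimate p(z | new image) with the learned p(w|z) held fixed.
    word_hist: (n_words,) empirical p(w | I); p_w_z: (n_words, n_topics)."""
    n_topics = p_w_z.shape[1]
    p_z = np.full(n_topics, 1.0 / n_topics)
    for _ in range(n_iter):
        joint = p_w_z * p_z                                   # (n_words, n_topics)
        post = joint / (joint.sum(1, keepdims=True) + 1e-12)  # p(z | w, I)
        p_z = (word_hist[:, None] * post).sum(0)
        p_z /= p_z.sum()
    return p_z

# Classify by the largest non-background topic, e.g. if the first 4 topics are the classes:
# label = np.argmax(fold_in(hist, p_w_z)[:4])
```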

49 Classifying a new image

50 On general model learning
The goal is to classify C using a set of features F.
F has been selected (the features must have high MI(C;F)); the next goal is to use F to decide on the class C.
Probabilistic approach: use observations to learn the joint distribution p(C,F).
In a new image, F is observed; find the most likely C: max_C p(C,F).

51 General model learning
To learn the joint distribution p(C,F), the model is of the form p_θ(C,F), or p_θ(C,X,F) with hidden variables X.
For example, for words in documents: p(w,D) = Π_i p(w_i,D), with p(w_i | D) = Σ_k p(w_i | T_k) p(T_k | D).
Training examples are used to determine the optimal θ by maximizing p_θ(data): max over (C,X,θ) of p_θ(C,X,F).
When θ is known, classify a new example: max over (C,X) of p_θ(C,X,F).

