Leo Zhu, CSAIL, MIT. Joint work with Chen, Yuille, Freeman and Torralba.

1
Leo Zhu, CSAIL, MIT
Joint work with Chen, Yuille, Freeman and Torralba

2
• How to deal with image complexity
• A general framework for different vision tasks
• Rich representation and tractable computation
Pattern Theory: Grenander 94. Compositionality: Geman 02, 06. Stochastic Grammar: Zhu and Mumford 06.

3
• Representation: Recursive Compositional Models (RCMs)
• Inference: Recursive Optimization
• Learning: Supervised Parameter Estimation; Unsupervised Recursive Dictionary Learning
• RCM-1: Deformable Object
• RCM-2: Articulated Object
• RCM-3: Scene (Entire Image)

4
• Flat MRF
  Nodes: object parts
  Edges: spatial relations
• Limitations: short-range interaction; sparse

6
x: image
y: (position, scale, orientation)
graph = (nodes, edges)
a: index of a node
b: child of a
f: appearances on node a
g: potentials on edges (a, b)
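The slide's formula did not survive in this transcript. The symbols listed above suggest a score that sums an appearance term f over nodes and a potential g over (parent, child) edges. The sketch below assumes exactly that additive form; the function name `score`, the `children` dictionary, and the callable signatures of `f` and `g` are illustrative assumptions, not the authors' definitions. The image x is treated as captured inside `f`.

```python
from typing import Callable, Dict, Hashable, List

Node = Hashable
State = tuple  # assumed layout, e.g. (position, scale, orientation)


def score(
    children: Dict[Node, List[Node]],
    y: Dict[Node, State],
    f: Callable[[Node, State], float],
    g: Callable[[Node, Node, State, State], float],
) -> float:
    """Sum appearance terms over nodes and potentials over (parent, child) edges."""
    total = 0.0
    for a, state_a in y.items():
        total += f(a, state_a)            # appearance term on node a
        for b in children.get(a, []):     # b ranges over the children of a
            total += g(a, b, state_a, y[b])
    return total
```

Under this assumed form, the score of a parent decomposes into its own terms plus the scores of its children's subtrees, which is the property a recursive optimizer relies on.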

7
Recursion
x: image; y: (position, scale, orientation)
Vertical independence; self-similarity

8
• Representation: Recursive Compositional Models (RCMs)
• Inference: Recursive Optimization
• Learning: Supervised Parameter Estimation; Unsupervised Recursive Dictionary Learning

9
• Inference task:
• Recursive Optimization:
• Polynomial-time Complexity:
Recursion
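The equations for this slide are missing from the transcript. As a rough illustration of how a recursive optimization over a tree can run in polynomial time, here is a generic max-sum dynamic program. It assumes a tree-structured graph, a finite candidate state list per node, and an additive score of node terms `f` and edge terms `g`. The names `best_parse`, `children`, and `states` are hypothetical, and this is not claimed to be the authors' exact procedure.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Node = Hashable
State = Hashable


def best_parse(
    root: Node,
    children: Dict[Node, List[Node]],
    states: Dict[Node, List[State]],
    f: Callable[[Node, State], float],
    g: Callable[[Node, Node, State, State], float],
) -> Tuple[float, Dict[Node, State]]:
    """Bottom-up max-sum pass followed by top-down backtracking."""
    table: Dict[Node, Dict[State, float]] = {}      # best subtree score per (node, state)
    choice: Dict[Tuple[Node, State, Node], State] = {}  # best child state per (parent, state, child)

    def bottom_up(a: Node) -> None:
        kids = children.get(a, [])
        for b in kids:
            bottom_up(b)
        table[a] = {}
        for s in states[a]:
            total = f(a, s)
            for b in kids:
                # Best state of child b given that the parent a takes state s.
                t_best = max(states[b], key=lambda t: g(a, b, s, t) + table[b][t])
                choice[(a, s, b)] = t_best
                total += g(a, b, s, t_best) + table[b][t_best]
            table[a][s] = total

    def top_down(a: Node, s: State, y: Dict[Node, State]) -> None:
        y[a] = s
        for b in children.get(a, []):
            top_down(b, choice[(a, s, b)], y)

    bottom_up(root)
    s_root = max(states[root], key=lambda s: table[root][s])
    y: Dict[Node, State] = {}
    top_down(root, s_root, y)
    return table[root][s_root], y
```

Each edge is visited once and costs one evaluation per pair of (parent state, child state), so the total work grows with the number of edges times the square of the number of candidate states rather than exponentially in the number of nodes.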

10
• Supervised learning
  Perceptron algorithm (MLE, max-margin SVM)
  Parameter estimation needs fast inference.
Collins 02. Taskar et al. 04

11
• Goal:
• Input: a set of training images with ground truth. Initialize the parameter vector.
• Training algorithm (Collins 02):
  Loop over training samples: i = 1 to N
    Step 1: Find the best y using inference:
    Step 2: Update the parameters:
  End of loop.
Inference is critical for learning.
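The training loop above can be sketched as a plain structured perceptron in the spirit of Collins 02. The update formulas are missing from the transcript, so the standard additive update is assumed here. The names `structured_perceptron`, `phi` (a feature map) and `infer` (the inference routine) are hypothetical placeholders, and the parameter averaging often used with this algorithm is omitted for brevity.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")


def structured_perceptron(
    samples: Sequence[Tuple[X, Y]],
    phi: Callable[[X, Y], List[float]],
    infer: Callable[[X, List[float]], Y],
    dim: int,
    epochs: int = 5,
) -> List[float]:
    """Assumed update: w <- w + phi(x, y_true) - phi(x, y_hat) on each mistake."""
    w = [0.0] * dim                          # initialize the parameter vector
    for _ in range(epochs):
        for x, y_true in samples:            # loop over training samples
            y_hat = infer(x, w)              # Step 1: best y under the current w
            if y_hat != y_true:              # Step 2: update only on mistakes
                good = phi(x, y_true)
                bad = phi(x, y_hat)
                w = [wi + gi - bi for wi, gi, bi in zip(w, good, bad)]
    return w
```

Because `infer` is called once per sample per pass, the cost of training is dominated by the cost of inference, which is why a fast inference routine matters for learning.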

12
• Representation: Recursive Compositional Models (RCMs)
• Inference: Recursive Optimization (Polynomial-time)
• Learning: Supervised Parameter Estimation
• RCM-1: Deformable Object

13
• Potentials for appearance
* = [Gabor, Edge, …]

14
• Potentials for shape: triplet descriptors
(position, scale, orientation)

18
• Segmentation (accuracy of pixel labeling): the proportion of correct pixel labels (object or non-object).
• Parsing (average position error of matching): the average distance between the positions of the leaf nodes of the ground truth and those estimated in the parse tree.

Methods         | Testing | Segmentation | Parsing | Speed
RCM-1           | 228     | 94.7         | 16      | 23s
Ren (Berkeley)  | 172     | 91           |         |
Winn (LOCUS)    | 200     | 93           |         |
Levin and Weiss | N/A     | 95           |         |
Kumar (OBJ CUT) | 5       | 96           |         |
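The two evaluation measures described on this slide can be computed along the following lines. This is a minimal sketch: it assumes masks are given as nested lists of 0/1 labels, that leaf nodes are already matched one-to-one as (x, y) coordinates, and the function names are illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def pixel_accuracy(pred_mask: List[List[int]], true_mask: List[List[int]]) -> float:
    """Proportion of pixels whose object / non-object label matches the ground truth."""
    total = 0
    correct = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            total += 1
            correct += int(p == t)
    return correct / total


def average_position_error(
    pred_leaves: Sequence[Tuple[float, float]],
    true_leaves: Sequence[Tuple[float, float]],
) -> float:
    """Mean Euclidean distance between matched leaf-node positions."""
    dists = [
        math.hypot(p[0] - t[0], p[1] - t[1])
        for p, t in zip(pred_leaves, true_leaves)
    ]
    return sum(dists) / len(dists)
```

For example, a 2x2 mask with one wrong pixel gives an accuracy of 0.75, and two leaves displaced by 0 and 5 units give an average position error of 2.5.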

19
• Multi-level precision-recall curves quantify the recognition performance of object parts.
• High-level regularity (more parts) helps recognition by removing ambiguity.

20
• Modeling (Representation): Recursive Compositional Models (RCMs)
• Inference (Computing): Recursive Optimization (Polynomial-time)
• Learning: Supervised Parameter Estimation; Unsupervised Recursive Learning
• RCM-1: Deformable Object

21
• Task: given 10 training images with no labeling, no alignment, and highly ambiguous features:
  Induce the structure (nodes and edges).
  Estimate the parameters.
Combinatorial explosion problem
Correspondence is unknown

22
• Multi-level dictionary (layer-wise greedy)
• Bottom-up and top-down recursive procedure
• Three principles: Recursive Composition, Suspicious Coincidence, Competitive Exclusion
Barlow 94.
Recursion
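To make the suspicious-coincidence principle concrete: in Barlow's sense, a combination is "suspicious" when its parts co-occur far more often than independence would predict. The sketch below applies that idea to pairs of part labels. The threshold values, the input format, and the function name `suspicious_pairs` are assumptions for illustration and are not the selection criterion used in the authors' system.

```python
from collections import Counter
from itertools import combinations
from typing import List, Set, Tuple


def suspicious_pairs(
    observations: List[Set[str]],
    ratio: float = 2.0,
    min_count: int = 2,
) -> List[Tuple[str, str]]:
    """Keep pairs whose co-occurrence count far exceeds the independence estimate."""
    n = len(observations)
    single: Counter = Counter()
    pair: Counter = Counter()
    for obs in observations:
        single.update(obs)                              # per-label counts
        pair.update(combinations(sorted(obs), 2))       # per-pair counts
    kept = []
    for (a, b), count in pair.items():
        expected = single[a] * single[b] / n            # count expected if independent
        if count >= min_count and count / expected >= ratio:
            kept.append((a, b))
    return sorted(kept)
```

A layer-wise greedy learner in this style would propose compositions from the current level, keep only those that pass such a test, and then use the survivors as the dictionary for the next level.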

24
Composition, Clustering, Suspicious Coincidence, Competitive Exclusion

25
• Unified representation (RCMs) and learning
• Bridge the gap between generic features and specific object structures

26
Level | Composition | Clusters  | Suspicious Coincidence | Competitive Exclusion | Seconds
0: 4 1
1     | 167,431     | 14,684    | 262                    | 48                    | 117
2     | 2,034,851   | 741,662   | 995                    | 116                   | 254
3     | 2,135,467   | 1,012,777 | 305                    | 53                    | 99
4     | 236,955     | 72,620    | 30                     | 2                     | 9
More Sharing

30
• Fill in missing parts
• Examine every node from top to bottom

32
Methods      | Testing | Segmentation | Parsing | Speed
Unsupervised | 316     | 93.3         |         | 17s
Supervised   | 228     | 94.7         | 16      | 23s

33
• More classes/viewpoints -> more training/detection cost

34
• Not enough data for rare viewpoints/classes

35
• Joint multi-class multi-view learning
• Appearance sharing
• Part sharing

36
• 120 templates: 5 viewpoints & 26 classes

46
• Representation: Recursive Compositional Models (RCMs)
• Inference: Recursive Optimization (Polynomial-time)
• Learning: Supervised Parameter Estimation
• RCM-1: Deformable Object
• RCM-2: Articulated Object

47
y = (switch, position, scale, orientation)
Composition
Switch
Multiple poses

50
• Representation: Recursive Compositional Models (RCMs)
• Inference: Recursive Optimization (Polynomial-time)
• Learning: Supervised Parameter Estimation
• RCM-1: Deformable Object
• RCM-2: Articulated Object
• RCM-3: Scene (Entire Image)

51
• Task: Image Segmentation and Labeling

52
• Flat MRF: object labeling (recognition only).
• Lack of long-range interactions.
• Lack of region-level properties.
• High-order potentials -> heavy computation
Geman and Geman 84. L. Zhu et al. NIPS 08

53
• Flat MRF: object labeling (recognition only).
• Joint segmentation-recognition template
Geman and Geman 84. L. Zhu et al. NIPS 08

54
• (segmentation, object) pair: the chicken-and-egg problem of segmentation and recognition.
• Multi-level low-dimensional abstraction
  Global: gist of scene, object layout
  Local: concurrent shape and appearance
  Coarse to fine

55
y = (segmentation, object)
f: appearance likelihood
g: object layout prior
Terms: homogeneity, layer-wise consistency, object texture, color, object co-occurrence, segmentation prior
Recursion
Horse, Grass

56
• State space: C = 21 classes; D = 30 templates; K = 3 classes per template
• Inference (recursive optimization):
• Supervised learning (perceptron)
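A quick arithmetic check of what these numbers imply for the per-node state space. This assumes, as an upper bound only, that every one of the D templates has K label slots and that each slot can take any of the C classes independently. The slide does not state this count, and the function name is hypothetical.

```python
def state_space_upper_bound(num_classes: int, num_templates: int, labels_per_template: int) -> int:
    """Assumed upper bound: D templates, each with K slots filled from C classes."""
    return num_templates * num_classes ** labels_per_template


# C = 21, D = 30, K = 3 from the slide: 30 * 21**3 = 30 * 9261
print(state_space_upper_bound(21, 30, 3))  # 277830
```

Even under this generous bound the state space per node stays in the hundreds of thousands, which is small enough to enumerate or prune during recursive optimization.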

59
• Implementation details

Dataset | Classes | Size | Training Size | Training Time | Testing Time
MSRC    | 21      | 591  | 45%           | 55h           | 30s

• Comparisons

        | TextonBoost (Shotton et al. 04) | PLSA-MRF (Verbeek and Triggs) | AutoContext (Tu 08) | Classifier only | RCM-3
Average | 57.7                            | 64                            | 68                  | 67.2            | 74.5
Global  | 72.2; 69 (Classifier)           | 73.5                          | 77.7                | 75.9            | 81.4

60
RCM-1, RCM-2, RCM-3
Triplets of Parts; Triplets of Segments
Boundary only; Region + Boundary

61
• Principle: Recursive Composition
  Composition -> complexity decomposition
  Recursion -> universal rules (self-similarity)
  Recursion and Composition -> sparseness
• One formula for different tasks.
• Key: the representation of visual patterns, i.e. y.
• Low dimension, simple potentials
• Scaling up: practical image understanding system

62
• Long Zhu, Yuanhao Chen, Antonio Torralba, William Freeman, Alan Yuille. Part and Appearance Sharing: Recursive Compositional Models for Multi-View Multi-Object Detection. CVPR 2010.
• Long Zhu, Yuanhao Chen, Yuan Lin, Chenxi Lin, Alan Yuille. Recursive Segmentation and Recognition Templates for 2D Parsing. NIPS 2008.
• Long Zhu, Chenxi Lin, Haoda Huang, Yuanhao Chen, Alan Yuille. Unsupervised Structure Learning: Hierarchical Recursive Composition, Suspicious Coincidence and Competitive Exclusion. ECCV 2008.
• Long Zhu, Yuanhao Chen, Yifei Lu, Chenxi Lin, Alan Yuille. Max Margin AND/OR Graph Learning for Parsing the Human Body. CVPR 2008.
• Long Zhu, Yuanhao Chen, Xingyao Ye, Alan Yuille. Structure-Perceptron Learning of a Hierarchical Log-Linear Model. CVPR 2008.
• Yuanhao Chen, Long Zhu, Chenxi Lin, Alan Yuille, Hongjiang Zhang. Rapid Inference on a Novel AND/OR Graph for Object Detection, Segmentation and Parsing. NIPS 2007.
• Long Zhu, Alan L. Yuille. A Hierarchical Compositional System for Rapid Object Detection. NIPS 2005.

64
• Polynomial-time inference:
• Supervised learning
  Perceptron algorithm (MLE, max-margin SVM)
  Parameter estimation needs fast inference.
Recursion
Collins 02. Taskar et al. 04

67
• Task: find a small dictionary D (sparse coding).
• Multi-level dictionary (layer-wise greedy)
• Bottom-up and top-down recursive procedure
Barlow 94.
Recursion
