1 Discovering Collocation Patterns: from Visual Words to Visual Phrases
Junsong Yuan, Ying Wu and Ming Yang
CVPR’07
2 Discovering Visual Collocation
3 An exciting idea: detour
Related Work: J. Sivic et al. CVPR04, B. C. Russell et al. CVPR06, G. Wang et al. CVPR06, T. Quack et al. CIVR06, S. C. Zhu et al. IJCV05, …
4 Confrontation
Spatial characteristics of images
– over-counting co-occurrence frequency
Uncertainty in visual patterns
– Continuous visual features are quantized into discrete words
– Visual synonymy and polysemy
5 Our Approach
6 Selecting visual phrases
Visual collocations may occur by chance
Selecting phrases by a likelihood ratio test:
– H0: occurrence of phrase P is randomly generated
– H1: phrase P is generated by a hidden pattern
Prior:
Likelihood:
Check whether words are co-located by chance or in a statistically meaningful way
7 Discovery of visual phrases
Word groups: ABFP, CDES, ABFT, CDEX, ABDK, ……
Closed FIM: Frequent Word-sets ( |P|>=2 ): AB, ABF, ABE, CD, CDE, DE, CE, AE, BE, BF, AF
Pair-wise Student's t-test, then ranked by L(P)
Phrases shown after ranking: AB, ABF, BF, AF, CD (likelihood ratios shown: 15.7, 14.3, 12.2, 10.9, 9.7)
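The pair-wise t-test step can be illustrated with the classical collocation t-test from text analysis. This is a minimal sketch, assuming counts over word groups; the function name, its parameters, and the exact form of the statistic are illustrative assumptions, not the authors' formulation.

```python
import math

def collocation_t_score(n_a, n_b, n_ab, n_total):
    """t-score for words a and b co-occurring more often than independence predicts.

    n_a, n_b : number of groups containing a, respectively b (hypothetical inputs)
    n_ab     : number of groups containing both
    n_total  : total number of groups
    """
    expected = (n_a / n_total) * (n_b / n_total)  # co-occurrence rate under independence
    observed = n_ab / n_total                     # empirical co-occurrence rate
    variance = observed * (1 - observed)          # Bernoulli variance of the co-occurrence indicator
    if variance == 0:
        return 0.0
    return (observed - expected) / math.sqrt(variance / n_total)

# A pair that co-occurs far more often than chance gets a large score.
t_strong = collocation_t_score(n_a=30, n_b=40, n_ab=20, n_total=1000)
# A pair whose co-occurrence matches independence scores about zero.
t_chance = collocation_t_score(n_a=100, n_b=100, n_ab=10, n_total=1000)
print(round(t_strong, 2), round(t_chance, 2))
```

A large positive score suggests the pair is unlikely to co-occur by chance; a score near zero is consistent with independent occurrence.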
8 Frequent Itemset Mining (FIM)
If an itemset is frequent, then all of its subsets must also be frequent
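The subset property is what lets a level-wise miner prune candidates early. The following is a minimal Apriori-style sketch, not the closed-FIM algorithm used in the talk; `frequent_itemsets` and `min_support` are illustrative names, and the toy groups are the ones visible on the discovery slide.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise mining of itemsets that occur in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s)
        k += 1
        # Join: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: a candidate with any infrequent (k-1)-subset cannot be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in level for sub in combinations(c, k - 1))}
        level = {c for c in candidates if support(c) >= min_support}
    return frequent

groups = ["ABFP", "CDES", "ABFT", "CDEX", "ABDK"]
result = frequent_itemsets(groups, min_support=2)
phrases = sorted("".join(sorted(s)) for s in result if len(s) >= 2)
print(phrases)  # ['AB', 'ABF', 'AF', 'BF', 'CD', 'CDE', 'CE', 'DE']
```

Because the slide's list of groups is truncated, the output here covers only the five groups shown and is not expected to reproduce the slide's full set of frequent word-sets.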
9 Phrase Summarization
Measuring the similarity between visual phrases by KL-divergence (Yan et al., SIGKDD 05)
Clustering visual phrases by Normalized-cut
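A KL-based dissimilarity between two phrases might be computed as below. This sketch assumes each phrase has already been summarized as a discrete probability distribution (for example over visual words); that representation, the symmetrization, and the function names are assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    # eps guards against division by zero when q has an empty bin.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetrized KL, usable as a pairwise dissimilarity for clustering."""
    return kl_divergence(p, q) + kl_divergence(q, p)

phrase_a = [0.5, 0.5]
phrase_b = [0.9, 0.1]
print(symmetric_kl(phrase_a, phrase_a))  # identical phrases give 0.0
print(symmetric_kl(phrase_a, phrase_b))  # differing phrases give a positive value
```

The resulting pairwise values could be converted into affinities and passed to a graph-partitioning method such as normalized cut.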
10 Pattern Summarization Results
Face database: summarizing top-10 phrases into 6 semantic phrase patterns
Car database: summarizing top-10 phrases into 2 semantic phrase patterns
11 Partition of visual word lexicon
Metric learning method: Neighbourhood Components Analysis (NCA), Goldberger et al., NIPS05
– improves the leave-one-out performance of the nearest neighbor classifier
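The quantity NCA maximizes, the expected leave-one-out accuracy of a stochastic nearest-neighbor classifier under a linear transform A, can be evaluated as below. This sketch only computes the objective for a given A and does not perform the gradient-based optimization; the toy points and both transforms are made up for illustration.

```python
import math

def nca_objective(points, labels, A):
    """Expected number of points correctly classified by stochastic 1-NN under transform A."""
    def project(x):
        return [sum(a * xi for a, xi in zip(row, x)) for row in A]

    z = [project(x) for x in points]
    total = 0.0
    for i in range(len(z)):
        weights = []
        for j in range(len(z)):
            if i == j:
                weights.append(0.0)  # leave-one-out: a point never selects itself
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(z[i], z[j]))
                weights.append(math.exp(-d2))
        norm = sum(weights)
        if norm > 0:
            # Probability that point i selects a neighbor carrying its own label.
            total += sum(w for w, l in zip(weights, labels) if l == labels[i]) / norm
    return total

# Dimension 0 separates the classes; dimension 1 is a distractor.
points = [(0, 0), (0, 3), (1, 0), (1, 3)]
labels = [0, 0, 1, 1]
identity = [[1, 0], [0, 1]]
learned_like = [[3, 0], [0, 0]]  # stretches dimension 0, drops dimension 1
score_identity = nca_objective(points, labels, identity)
score_learned = nca_objective(points, labels, learned_like)
print(score_identity < score_learned)  # True
```

A transform that emphasizes the discriminative dimension scores close to the number of points, which is the behavior the metric-learning step aims for.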
12 Evaluation
K-NN spatial group: K=5
Two image category databases: car (123 images) and face (435 images)
Precision of visual phrase lexicon
– the percentage of visual phrases P_i ∈ Ψ that are located in the foreground object
Precision of background word lexicon
– the percentage of background words W_i ∈ Ω⁻ that are located in the background
Percentage of images that are retrieved:
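One way to form K-NN spatial groups is to pair every local feature with its K nearest neighbors in the image plane and record the set of visual words involved. The sketch below assumes Euclidean distance and includes the center feature in its own group; both choices, along with the `(x, y, word)` tuple layout, are assumptions for illustration.

```python
def knn_spatial_groups(features, k=5):
    """features: list of (x, y, word). Returns one set of words per feature."""
    groups = []
    for i, (xi, yi, wi) in enumerate(features):
        # Squared distances to every other feature, nearest first.
        dists = sorted(
            ((xj - xi) ** 2 + (yj - yi) ** 2, j)
            for j, (xj, yj, _) in enumerate(features) if j != i
        )
        neighbours = [features[j][2] for _, j in dists[:k]]
        groups.append(frozenset([wi] + neighbours))
    return groups

# Two well-separated clusters of three features each, with k=2 for brevity.
features = [(0, 0, "A"), (1, 0, "B"), (0, 1, "F"),
            (10, 10, "C"), (11, 10, "D"), (10, 11, "E")]
spatial_groups = knn_spatial_groups(features, k=2)
print(sorted("".join(sorted(g)) for g in spatial_groups))
```

Each group can then be treated as one transaction for frequent itemset mining.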
13 Results: visual phrases from car category
Visual phrase pattern 1: wheels
Visual phrase pattern 2: car bodies
Different colors represent different semantic meanings
14 Results: visual phrases from face category
15 Comparison