Published by Janis Carpenter. Modified over 9 years ago.
1
PageRank for Product Image Search. Kevin Jing (Google Inc.; GVU, College of Computing, Georgia Institute of Technology), Shumeet Baluja (Google Inc.). WWW 2008. Shimin Chen, Big Data Reading Group
2
Motivation Image search is an important part of commercial search engines. Ranking is based on the text of the pages from which the images are linked. –Anchor text –Quality of the anchor page –Etc.
3
Why? Text-based search is well studied. General object detection/recognition in images remains an open problem. Image processing is much more expensive than text processing Discussions (by shimin)
4
Search Result (Eiffel Tower)
5
Search Result (d80)
6
Search Result (McDonalds)
7
Search Result (coca cola)
8
Image Search (Integrated Results) Search quality is more important
9
Contribution Extending PageRank to image search Visual-hyperlinks estimated from local feature patches Most comprehensive experiment to date –Limited and noisy real-world images –Large number of user evaluations –2000 queries
10
Limitations of prior works: Visual Category Recognition Filters (Fergus et al. ECCV 2004) –Probabilistic graphical model with hidden layers »Susceptible to data noise »Large number of parameters to estimate »High dimensionality in feature space »Limited training data –Limited experiment »11 hand-selected, hand-labeled queries (bottles, etc.) –Cannot handle multiple visual concepts –Computationally expensive
11
Our observation Due to the high dimensionality of feature space, learning feature correlations can be difficult with limited and noisy data Estimating image similarities is a slightly easier task. Visual Image Ranking != Object Category model –Modeling the relationship among images, instead of the features –As most users rarely look beyond the first page of results,
12
Outline VisualRank –Robust estimation of image similarities (Visual-Hyperlinks). –Random-walk on visual-hyperlinks to find “visual authority.” Experiments –2000 product queries –150 user evaluation –Click analysis
13
Idea Extract local features of an image Construct a graph with images as nodes, similarity as edge weights Use PageRank to generate the ranking Visual-hyperlinks discussions
14
Visual-hyperlinks Step 1) Generate visual-hyperlinks via robust image similarity estimation –Interest point selection + descriptor representation (SIFT: 128-dimensional vectors) –Find similar patches (L2 distance) –Geometric verification (affine transformation) –Similarity = (# similar patches) / (average # patches)
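The similarity formula on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the local descriptors (for example 128-dimensional SIFT vectors) have already been extracted into NumPy arrays, it skips the geometric verification step entirely, and the function names and the `max_l2` threshold are hypothetical choices made for the example.

```python
import numpy as np

def count_similar_patches(desc_a, desc_b, max_l2=0.5):
    """Count descriptors in desc_a whose nearest neighbour in desc_b is within max_l2.

    desc_a, desc_b: arrays of shape (num_patches, dim), e.g. dim = 128 for SIFT.
    max_l2 is an assumed threshold; the slides do not give one.
    """
    if len(desc_a) == 0 or len(desc_b) == 0:
        return 0
    # Pairwise L2 distances, shape (len(desc_a), len(desc_b)).
    diffs = desc_a[:, None, :] - desc_b[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # A patch in A counts as "similar" if its closest patch in B is near enough.
    return int((dists.min(axis=1) <= max_l2).sum())

def visual_similarity(desc_a, desc_b, max_l2=0.5):
    """Similarity = (# similar patches) / (average # patches per image)."""
    matches = count_similar_patches(desc_a, desc_b, max_l2)
    avg_patches = (len(desc_a) + len(desc_b)) / 2.0
    return matches / avg_patches if avg_patches > 0 else 0.0

# Toy descriptors: image B shares two of image A's four patches.
image_a = np.eye(4, 128) * 10.0
extra = np.zeros((1, 128))
extra[0, 100] = 10.0
image_b = np.vstack([image_a[:2], extra])
similarity = visual_similarity(image_a, image_b)
```

Because matches are counted from the first image's point of view, this sketch is not guaranteed to be symmetric; a real pipeline would also discard matches that fail the affine consistency check.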
15
Step 1) Generate Visual-hyperlinks via robust image similarity estimation Visual-hyperlinks
16
Query Dependent Ranking Too expensive to construct a graph for all images Construct a graph for images returned from a (text-based) search In other words, the purpose is to better rank images returned from a text-based search discussions
17
Visual-hyperlinks Generated from the top 1000 results of “mona-lisa”
18
Visual-hyperlinks Lincoln Memorial: Top 5 images with the highest weighted “neighbors.”
19
Visual-hyperlinks + PageRank Intuition Eigen-centrality Visual “authority” Random Surfer Principle Eigenvector of weighted similarity matrix
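The random-surfer intuition can be made concrete with a power iteration over a column-normalised similarity matrix. The sketch below is only an illustration under stated assumptions: the damping factor of 0.85, the uniform jump vector, the convergence tolerance, and the function name `visual_rank` are choices made for the example and are not taken from the slides.

```python
import numpy as np

def visual_rank(similarity, damping=0.85, tol=1e-10, max_iter=1000):
    """Rank images by eigen-centrality of a weighted similarity graph.

    similarity: square array where similarity[i, j] is the edge weight
    between image i and image j. damping and the uniform jump are assumptions.
    """
    s = np.asarray(similarity, dtype=float)
    n = s.shape[0]
    col_sums = s.sum(axis=0)
    # Column-normalise so each column sums to 1; a column with no links
    # falls back to a uniform distribution over all images.
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    s_norm = np.where(col_sums > 0, s / safe_sums, 1.0 / n)
    uniform = np.full(n, 1.0 / n)
    rank = uniform.copy()
    for _ in range(max_iter):
        new_rank = damping * (s_norm @ rank) + (1.0 - damping) * uniform
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank

# Images 0-2 are mutually similar; image 3 is only weakly linked to image 0.
toy_similarity = np.array([
    [0.0, 0.8, 0.7, 0.1],
    [0.8, 0.0, 0.6, 0.0],
    [0.7, 0.6, 0.0, 0.0],
    [0.1, 0.0, 0.0, 0.0],
])
scores = visual_rank(toy_similarity)
ordering = np.argsort(-scores)  # best-ranked image first
```

In this toy graph the weakly connected image ends up with the lowest score, which is the behaviour the "visual authority" idea relies on: images that many other results resemble accumulate rank.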
20
Visual-hyperlinks + PageRank SPAM! [Figure panels: PageRank; Without PageRank]
21
Outline VisualRank –Robust estimation of image similarities (Visual-Hyperlinks). –Random-walk on visual-hyperlinks to find “visual authority.” Experiments –2000 product queries –150 user evaluation –Click analysis
22
Experiment/Results Selection of queries –2000 most popular product search queries –Product items are a popular set of queries –Well suited for the patch-based features we are studying 153 user evaluation –Combine both sets of results and ask which images are irrelevant to the query User click analysis –Back testing –Lower bound on the improvement Alternative experiment methods considered –Mark our own ground-truth data –Ask users to rank results –Ask users to compare groups of results
23
Experiment/Results wii picasso Microsoft zune ipod
24
Experiment/Results 1) 85% of the irrelevant images are removed. 2) 10% increase in user clicks on the top 20 results.
25
Mistakes Dell Playstation USB keychain
26
Click Study Idea: images clicked after a search are good Given click stats for top 40 images of 130 common product queries Examine: # of clicks of the first 20 images ImageRank: 17.5% more clicks than default ranking More results
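The comparison on this slide amounts to simple arithmetic over logged clicks. The sketch below is one plausible reading, not the authors' evaluation code: it assumes each image's click count was logged under the default ranking and is carried over unchanged to the re-ranked list. All function names and the toy data are hypothetical.

```python
def clicks_in_top_k(ranking, clicks, k=20):
    """Sum the logged clicks of the first k images in a ranking."""
    return sum(clicks.get(image_id, 0) for image_id in ranking[:k])

def relative_click_gain(default_ranking, reranked, clicks, k=20):
    """Relative change in top-k clicks when moving from default to re-ranked order."""
    base = clicks_in_top_k(default_ranking, clicks, k)
    new = clicks_in_top_k(reranked, clicks, k)
    if base == 0:
        raise ValueError("default ranking has no clicks in the top k")
    return (new - base) / base

# Toy example with k=2 instead of 20: the re-ranking promotes a heavily clicked image.
logged_clicks = {"a": 10, "b": 0, "c": 30, "d": 5}
default_order = ["a", "b", "c", "d"]
reranked_order = ["c", "a", "d", "b"]
gain = relative_click_gain(default_order, reranked_order, logged_clicks, k=2)
```

Clicks logged under the default order are biased toward images that were already shown near the top, so a back-test of this kind tends to understate the benefit of a re-ranking.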
27
Conclusion/Future Work Conclusion –Robust visual-hyperlinks + graph algorithms are a pragmatic choice for web images Future work –How to make local feature matching efficient –Incorporate more features into the construction of visual-hyperlinks –Incorporate Google's initial ranking into PageRank