UCB Computer Vision Animals on the Web Tamara L. Berg CSE 595 Words & Pictures.

Slides:



Advertisements
Similar presentations
Shape Matching and Object Recognition using Low Distortion Correspondence Alexander C. Berg, Tamara L. Berg, Jitendra Malik U.C. Berkeley.
Advertisements

Using Large-Scale Web Data to Facilitate Textual Query Based Retrieval of Consumer Photos.
Weakly supervised learning of MRF models for image region labeling Jakob Verbeek LEAR team, INRIA Rhône-Alpes.
Three things everyone should know to improve object retrieval
Tamara Berg Retrieval Language and Vision.
Detecting Categories in News Video Using Image Features Slav Petrov, Arlo Faria, Pascal Michaillat, Alex Berg, Andreas Stolcke, Dan Klein, Jitendra Malik.
Image Processing David Kauchak cs160 Fall 2009 Empirical Evaluation of Dissimilarity Measures for Color and Texture Jan Puzicha, Joachim M. Buhmann, Yossi.
Internet Vision - Lecture 3 Tamara Berg Sept 10. New Lecture Time Mondays 10:00am-12:30pm in 2311 Monday (9/15) we will have a general Computer Vision.
Ghunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik University of California at Berkeley Berkeley, CA
Computer Vision Group University of California Berkeley Shape Matching and Object Recognition using Shape Contexts Jitendra Malik U.C. Berkeley (joint.
Recognition using Regions CVPR Outline Introduction Overview of the Approach Experimental Results Conclusion.
Presented by Marlene Shehadeh Advanced Topics in Computer Vision ( ) Winter
Chapter 11 Beyond Bag of Words. Question Answering n Providing answers instead of ranked lists of documents n Older QA systems generated answers n Current.
Morris LeBlanc.  Why Image Retrieval is Hard?  Problems with Image Retrieval  Support Vector Machines  Active Learning  Image Processing ◦ Texture.
Context-Aware Query Classification Huanhuan Cao 1, Derek Hao Hu 2, Dou Shen 3, Daxin Jiang 4, Jian-Tao Sun 4, Enhong Chen 1 and Qiang Yang 2 1 University.
CS335 Principles of Multimedia Systems Content Based Media Retrieval Hao Jiang Computer Science Department Boston College Dec. 4, 2007.
Region-based Voting Exemplar 1 Query 1 Exemplar 2.
T.Sharon 1 Internet Resources Discovery (IRD) Introduction to MMIR.
Supervised Distance Metric Learning Presented at CMU’s Computer Vision Misc-Read Reading Group May 9, 2007 by Tomasz Malisiewicz.
UCB Computer Vision Animals on the Web Tamara L. Berg SUNY Stony Brook.
Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
Time-Sensitive Web Image Ranking and Retrieval via Dynamic Multi-Task Regression Gunhee Kim Eric P. Xing 1 School of Computer Science, Carnegie Mellon.
Image Processing David Kauchak cs458 Fall 2012 Empirical Evaluation of Dissimilarity Measures for Color and Texture Jan Puzicha, Joachim M. Buhmann, Yossi.
SIEVE—Search Images Effectively through Visual Elimination Ying Liu, Dengsheng Zhang and Guojun Lu Gippsland School of Info Tech,
Information Retrieval in Practice
AdvisorStudent Dr. Jia Li Shaojun Liu Dept. of Computer Science and Engineering, Oakland University 3D Shape Classification Using Conformal Mapping In.
Unsupervised Learning of Categories from Sets of Partially Matching Image Features Kristen Grauman and Trevor Darrel CVPR 2006 Presented By Sovan Biswas.
Computer Vision James Hays, Brown
MediaEval Workshop 2011 Pisa, Italy 1-2 September 2011.
Human abilities Presented By Mahmoud Awadallah 1.
Multimedia Databases (MMDB)
Image and Video Retrieval INST 734 Doug Oard Module 13.
PageRank for Product Image Search Kevin Jing (Googlc IncGVU, College of Computing, Georgia Institute of Technology) Shumeet Baluja (Google Inc.) WWW 2008.
Topical Crawlers for Building Digital Library Collections Presenter: Qiaozhu Mei.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
I could tell you about the love I feel for my first granddaughter. Or, I could show you the photo:
Category Discovery from the Web slide credit Fei-Fei et. al.
SVM-KNN Discriminative Nearest Neighbor Classification for Visual Category Recognition Hao Zhang, Alex Berg, Michael Maire, Jitendra Malik.
CREATING A POWERPOINT PRESENTATION. Planning a presentation Create a presentation Rearrange and delete text and slides Add animations Add transitions.
Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao.
Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic and Andrew Zisserman.
80 million tiny images: a large dataset for non-parametric object and scene recognition CS 4763 Multimedia Systems Spring 2008.
Collective Vision: Using Extremely Large Photograph Collections Mark Lenz CameraNet Seminar University of Wisconsin – Madison February 2, 2010 Acknowledgments:
Unsupervised Learning of Visual Sense Models for Polysemous Words Kate Saenko Trevor Darrell Deepak.
Image Search using Google Image Outreach Activity July 2011 Virginia Tech.
PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun.
Computer Vision Group University of California Berkeley On Visual Recognition Jitendra Malik UC Berkeley.
Efficient Visual Object Tracking with Online Nearest Neighbor Classifier Many slides adapt from Steve Gu.
Topic Models Presented by Iulian Pruteanu Friday, July 28 th, 2006.
VIP: Finding Important People in Images Clint Solomon Mathialagan Andrew C. Gallagher Dhruv Batra CVPR
Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework N 工科所 錢雅馨 2011/01/16 Li-Jia Li, Richard.
Computer Vision Group Department of Computer Science University of Illinois at Urbana-Champaign.
Jianping Fan Department of Computer Science University of North Carolina at Charlotte Charlotte, NC Relevance Feedback for Image Retrieval.
NEIL: Extracting Visual Knowledge from Web Data Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta Carnegie Mellon University CS381V Visual Recognition -
Cross-modal Hashing Through Ranking Subspace Learning
A Personal Tour of Machine Learning and Its Applications
The topic discovery models
Data Driven Attributes for Action Detection
Link Label Text Label… Click Here… Image Image Lorem Ipsum Lorem Ipsum
Personalized Social Image Recommendation
The topic discovery models
Project Implementation for ITCS4122
a chicken and egg problem…
The topic discovery models
Multimedia Information Retrieval
Introduction to Information Retrieval
Presented by Wanxue Dong
Information Retrieval and Web Design
Unsupervised learning of visual sense models for Polysemous words
Presentation transcript:

UCB Computer Vision Animals on the Web Tamara L. Berg CSE 595 Words & Pictures

UCB Computer Vision I want to find lots of good pictures of monkeys…  What can I do?

UCB Computer Vision Google Image Search -- monkey Circa 2006

UCB Computer Vision Google Image Search -- monkey

UCB Computer Vision Google Image Search -- monkey

UCB Computer Vision Google Image Search -- monkey Words alone won’t work

UCB Computer Vision Flickr Search - monkey  Even with humans doing the labeling, the data is extremely noisy -- context, polysemy, photo sets  Words alone still won’t work!

UCB Computer Vision Our Results

UCB Computer Vision General Approach - Vision alone won’t solve the problem. - Text alone won’t solve the problem. -> Combine the two!

UCB Computer Vision Previous Work - Words & Pictures Barnard et al, CVPR 2001 Clustering Art Barnard et al, JMLR 2003 Labeling Regions

UCB Computer Vision Animals on the Web Extremely challenging visual categories. Free text on web pages. Take advantage of language advances. Combine multiple visual and textual cues.

UCB Computer Vision Goal: Classify images depicting semantic categories of animals in a wide range of aspects, configurations and appearances. Images typically portray multiple species that differ in appearance.

UCB Computer Vision Animals on the Web Outline: Harvest pictures of animals from the web using Google Text Search. Select visual exemplars using text based information. Use visual and textual cues to extend to similar images.

UCB Computer Vision Harvested Pictures 14,051 images for 10 animal categories. 12,886 additional images for monkey category using related monkey queries (primate, species, old world, science…)

UCB Computer Vision Text Model Latent Dirichlet Allocation (LDA) on the words in collected web pages to discover 10 latent topics for each category. Each topic defines a distribution over words. Select the 50 most likely words for each topic. 1.) frog frogs water tree toad leopard green southern music king irish eggs folk princess river ball range eyes game species legs golden bullfrog session head spring book deep spotted de am free mouse information round poison yellow upon collection nature paper pond re lived center talk buy arrow common prince Example Frog Topics: 2.) frog information january links common red transparent music king water hop tree pictures pond green people available book call press toad funny pottery toads section eggs bullet photo nature march movies commercial november re clear eyed survey link news boston list frogs bull sites butterfly court legs type dot blue

UCB Computer Vision Animals on the Web Outline: Harvest pictures of animals from the web using Google Text Search. Select visual exemplars using text based information. Use vision and text cues to extend to similar images.

UCB Computer Vision Select Exemplars Rank images according to whether they have these likely words near the image in the associated page (word score) Select up to 30 images per topic as exemplars. 2.) frog information january links common red transparent music king water hop tree pictures pond green people available book call press... 1.) frog frogs water tree toad leopard green southern music king irish eggs folk princess river ball range eyes game species legs golden bullfrog session head...

UCB Computer Vision Senses There are multiple senses of a category within the Google search results. Ask the user to identify which of the 10 topics are relevant to their search. Merge. Optional second step of supervision – ask user to mark erroneously labeled exemplars.

UCB Computer Vision Image Model Match Pictures of a category

UCB Computer Vision Geometric Blur Shape Feature Sparse SignalGeometric Blur (A.) Berg & Malik ‘01 Captures local shape, but allows for some deformation. Robust to differences in intra category object shape. Used in current best object recognition systems Zhang et al, CVPR 2006 Frome et al, NIPS 2006

UCB Computer Vision Image Model (cont.) Color Features: Histogram of what colors appear in the image Texture Features: Histograms of 16 filters * =

UCB Computer Vision Animals on the Web Outline: Harvest pictures of animals from the web using Google Text Search. Select visual exemplars using text based information. Use vision and text cues to extend to similar images.

UCB Computer Vision * * * * * * * * * * * * * * * * * * * + * * * ? * * ? + + Scoring Images Relevant Features Irrelevant Features Relevant Exemplar For each query feature apply a 1-nearest neighbor classifier. Sum votes for relevant class. Normalize. Combine 4 cue scores (word, shape, color, texture) using a linear combination. Query * Irrelevant Exemplar

UCB Computer Vision Classification Comparison Words Words + Picture

UCB Computer Vision Cue Combination: Monkey

UCB Computer Vision Cue Combination: FrogGiraffe

UCB Computer Vision Re-ranking Precision Classification Performance Google

UCB Computer Vision Re-ranking Precision Monkey Category Classification Performance Google Monkey

UCB Computer Vision Ranked Results: