UCB Computer Vision Animals on the Web Tamara L. Berg SUNY Stony Brook.

UCB Computer Vision I want to find lots of good pictures of monkeys…  What can I do?

UCB Computer Vision Google Image Search -- monkey

UCB Computer Vision Google Image Search -- monkey Words alone won’t work

UCB Computer Vision Flickr Search - monkey
- Even with humans doing the labeling, the data is extremely noisy -- context, polysemy, photo sets
- Words alone still won’t work!

UCB Computer Vision Our Results

UCB Computer Vision Consumer Photo Collections
Over the hills and far away: Road, Hills, Germany, Hoffenheim, Outstanding Shots, specland, Baden-Wuerttemberg
Heavenly: Peacock, AlbinoPeacock, WhiteBeauty, Birds, Wildlife, FeathredaleWildlifePark, PictureAustralia, ImpressedBeauty
End of the world - Verdens Ende - The lighthouse 1: Verdens ende, end of the world, norway, lighthouse, ABigFave, vippefyr, wood, coal
Flickr – 3 billion photographs, several million uploaded per day

UCB Computer Vision Museum and Library Collections
- Fine Arts Museum of San Francisco (82,000 images): Woman of Head Howard H G Mrs Gift America North bust States United Sculpture marble bowl stemmed small Irridescent glass
- New York Public Library Digital Collection: The new board walk, Rockaway, Long Island; Part of New England, New York, east New Iarsey and Long Iland.

UCB Computer Vision Web Collections Billions of Web Pages

UCB Computer Vision Video OUTSIDE IN THE RAIN THE SENATOR WEARING HIS UH BASEBALL CAP A BOSTON RED SOX CAP AS HE TALKED TO HIS SUPPORTERS HERE IN THE RAIN THE UH SENATOR THEY'RE DOING HIS BEST TO TRY TO MAKE HIS CASE THAT HE WILL BE THE MAN FOR THE MIDDLE CLASS AND UH TRY TO CONVINCE HIS SUPPORTERS TO EXPRESS THEIR SUPPORT THROUGH A VOTE ON TUESDAY IN THERE WE ARE TWENTY FOUR HOURS FROM THE GREAT MOMENT THAT THE WORLD IN AMERICA IS WAITING FOR IT I NEED TO YOU IN THESE HOURS TO GO OUT AND DO THE HARD WORK NOT ON THOSE DOORS MAKE THOSE PHONE CALLS TO TALK TO FRIENDS TAKE PEOPLE TO THE POLLS HELP US CHANGE THE DIRECTION OF THIS GREAT NATION FOR THE BETTER CAN YOU IMAGINE A UH SENATOR BEGINNING HIS DAY IN FLORIDA TODAY TrecVid 2006

UCB Computer Vision Consumer Products
Marc by Marc Jacobs Adorable peep-toe pumps, great for any occasion. Available in an array of uppers. Metallic fabric trim and bow detail. Metallic leather lined footbed. Lined printed design. Leather sole. 3 3/4" heel. (Zappos.com)
soft and glassy patent calfskin trimmed with natural vachetta cowhide, open top satchel for daytime and weekends, interior double slide pockets and zip pocket, seersucker stripe cotton twill lining, kate spade leather license plate logo, imported 2.8" drop length 14"h x 14.2"w x 6.9"d (Katespade.com)
It's the perfect party dress. With distinctly feminine details such as a wide sash bow around an empire waist and a deep scoopneck, this linen dress will keep you comfortable and feeling elegant all evening long. * Measures 38" from center back, hits at the knee. * Scoopneck, full skirt. * Hidden side zip, fully lined. * 100% Linen. Dry clean. (bananarepublic.com)
E-commerce transactions in 2004, 2005, 2006 of $145 billion, $168 billion, and $198 billion (Forrester Research).

UCB Computer Vision General Approach - Vision alone won’t solve the problem. - Text alone won’t solve the problem. -> Combine the two!

UCB Computer Vision Previous Work - Words & Pictures
- Clustering Art: Barnard et al, CVPR 2001
- Auto-Annotation: Li and Wang, PAMI 2003
- Labeling Regions: Barnard et al, JMLR 2003
- Image Classification: Yanai et al, MIR 2005

UCB Computer Vision Animals on the Web Extremely challenging visual categories. Free text on web pages. Take advantage of language advances. Combine multiple visual and textual cues.

UCB Computer Vision Words alone won’t work Image Search

UCB Computer Vision Goal: Classify images depicting semantic categories of animals in a wide range of aspects, configurations and appearances. Images typically portray multiple species that differ in appearance.

UCB Computer Vision Animals on the Web Outline: Harvest pictures of animals from the web using Google Text Search. Select visual exemplars using text based information. Use visual and textual cues to extend to similar images.

UCB Computer Vision Harvested Pictures 14,051 images for 10 animal categories. 12,886 additional images for the monkey category using related monkey queries (primate, species, old world, science…).

UCB Computer Vision Text Model Latent Dirichlet Allocation (LDA) on the words in collected web pages to discover 10 latent topics for each category. Each topic defines a distribution over words. Select the 50 most likely words for each topic.
Example Frog Topics:
1.) frog frogs water tree toad leopard green southern music king irish eggs folk princess river ball range eyes game species legs golden bullfrog session head spring book deep spotted de am free mouse information round poison yellow upon collection nature paper pond re lived center talk buy arrow common prince
2.) frog information january links common red transparent music king water hop tree pictures pond green people available book call press toad funny pottery toads section eggs bullet photo nature march movies commercial november re clear eyed survey link news boston list frogs bull sites butterfly court legs type dot blue
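
The "select the 50 most likely words for each topic" step can be sketched as below. This is only an illustration, assuming a topic is already available as a word-to-probability mapping (for example from a fitted LDA model); the function name `top_words` and the toy probabilities are hypothetical and do not come from the slides.

```python
from typing import Dict, List


def top_words(topic_word_probs: Dict[str, float], k: int = 50) -> List[str]:
    """Return the k most probable words of one topic, most likely first."""
    # Sort by descending probability; break ties alphabetically so the
    # result is deterministic.
    ranked = sorted(topic_word_probs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


# Toy, made-up distribution for illustration only (not values from the talk).
toy_topic = {"frog": 0.30, "toad": 0.30, "water": 0.20, "pond": 0.15, "music": 0.05}
print(top_words(toy_topic, k=3))
```

With k=50 and one call per topic, this yields word lists of the kind shown in the frog example.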

UCB Computer Vision Select Exemplars Rank images according to whether they have these likely words near the image in the associated page (word score). Select up to 30 images per topic as exemplars.
1.) frog frogs water tree toad leopard green southern music king irish eggs folk princess river ball range eyes game species legs golden bullfrog session head...
2.) frog information january links common red transparent music king water hop tree pictures pond green people available book call press...
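
A minimal sketch of exemplar selection follows. The slides do not give the exact word-score formula, so this assumes a simple stand-in: the number of distinct likely topic words found in the text near an image. The names `word_score` and `select_exemplars`, and the sample page text, are hypothetical.

```python
import re
from typing import Dict, List, Set, Tuple


def word_score(nearby_text: str, topic_words: Set[str]) -> int:
    """Count distinct likely topic words in the text near an image.

    Assumption: the real word score may be defined differently.
    """
    tokens = set(re.findall(r"[a-z]+", nearby_text.lower()))
    return len(tokens & topic_words)


def select_exemplars(
    nearby_text_by_image: Dict[str, str],
    topic_words: Set[str],
    max_exemplars: int = 30,
) -> List[Tuple[str, int]]:
    """Rank images by word score and keep at most max_exemplars of them."""
    scored = [
        (image, word_score(text, topic_words))
        for image, text in nearby_text_by_image.items()
    ]
    # Images with no likely words nearby are not useful exemplars.
    scored = [pair for pair in scored if pair[1] > 0]
    # Highest score first; ties broken by image name for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:max_exemplars]


# Made-up example pages, for illustration only.
frog_words = {"frog", "frogs", "toad", "pond", "water"}
pages = {
    "a.jpg": "A green frog sitting in the pond",
    "b.jpg": "Buy concert tickets online",
    "c.jpg": "Frog",
}
print(select_exemplars(pages, frog_words))
```

Running the selection once per topic gives up to 30 exemplars for each of the 10 topics of a category.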

UCB Computer Vision Senses There are multiple senses of a category within the Google search results. Ask the user to identify which of the 10 topics are relevant to their search. Merge. Optional second step of supervision – ask user to mark erroneously labeled exemplars.

UCB Computer Vision Image Model Match Pictures of a category

UCB Computer Vision Geometric Blur Shape Feature ((A.) Berg & Malik ‘01) [Figure: sparse signal and its geometric blur] Captures local shape, but allows for some deformation. Robust to differences in intra category object shape. Used in current best object recognition systems (Zhang et al, CVPR 2006; Frome et al, NIPS 2006).

UCB Computer Vision Image Model (cont.) Color Features: Histogram of what colors appear in the image. Texture Features: Histograms of 16 filters.
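
A color histogram of the kind described here can be sketched as follows. The slides do not specify the color space or the number of bins, so this assumes 8-bit RGB pixels and a joint histogram with a configurable number of bins per channel; `color_histogram` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B) values in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalized joint RGB histogram with bins_per_channel**3 entries."""
    b = bins_per_channel
    hist = [0.0] * (b ** 3)
    for r, g, bl in pixels:
        # Map each 0..255 channel value to a bin index in 0..b-1.
        ri, gi, bi = r * b // 256, g * b // 256, bl * b // 256
        hist[(ri * b + gi) * b + bi] += 1.0
    total = len(pixels)
    if total == 0:
        return hist
    # Normalize so images of different sizes are comparable.
    return [count / total for count in hist]


# Tiny made-up "image": two reddish pixels, one blue, one black.
toy_pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
print(color_histogram(toy_pixels, bins_per_channel=2))
```

Texture histograms could be built the same way from quantized filter responses instead of pixel colors, though the filter bank itself is not reproduced here.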

UCB Computer Vision Scoring Images [Figure: features of a query image matched against relevant features (from a relevant exemplar) and irrelevant features (from an irrelevant exemplar)] For each query feature apply a 1-nearest neighbor classifier. Sum votes for relevant class. Normalize. Combine 4 cue scores (word, shape, color, texture) using a linear combination.
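
The voting and combination steps can be sketched as below. Several details are assumptions, since the slides do not state them: features are plain numeric vectors compared with Euclidean distance, "normalize" is taken to mean dividing by the number of query features, and the cue weights are equal. The names `nn_vote_score` and `combine_cues` are hypothetical.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def nn_vote_score(
    query_features: Sequence[Vector],
    relevant_features: Sequence[Vector],
    irrelevant_features: Sequence[Vector],
) -> float:
    """Fraction of query features whose nearest exemplar feature is relevant."""
    if not query_features:
        return 0.0
    votes = 0
    for q in query_features:
        # 1-nearest-neighbor decision: compare the closest relevant feature
        # with the closest irrelevant feature.
        d_rel = min((math.dist(q, f) for f in relevant_features), default=math.inf)
        d_irr = min((math.dist(q, f) for f in irrelevant_features), default=math.inf)
        if d_rel < d_irr:
            votes += 1
    # Normalize the vote count by the number of query features (assumption).
    return votes / len(query_features)


def combine_cues(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear combination of per-cue scores."""
    return sum(weights[cue] * scores[cue] for cue in weights)


# Made-up 2-D features, for illustration only.
query = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
relevant = [(0.0, 1.0), (1.0, 0.0)]
irrelevant = [(5.0, 4.0)]
shape_score = nn_vote_score(query, relevant, irrelevant)

cue_scores = {"word": 1.0, "shape": shape_score, "color": 0.5, "texture": 0.5}
equal_weights = {cue: 0.25 for cue in cue_scores}  # hypothetical weights
print(combine_cues(cue_scores, equal_weights))
```

In this sketch the same voting function would be applied separately to shape, color, and texture features, with the word score supplying the fourth cue.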

UCB Computer Vision Classification Comparison Words Words + Picture

UCB Computer Vision Cue Combination: Monkey

UCB Computer Vision Cue Combination: Frog, Giraffe

UCB Computer Vision Re-ranking Precision Classification Performance Google

UCB Computer Vision Re-ranking Precision Monkey Category Classification Performance Google Monkey

UCB Computer Vision Ranked Results:

UCB Computer Vision Summary
- Enormous amounts of data.
- How to deal with it is still an open question.
- We should combine words & pictures.
- Data opens up lots of new research problems!

UCB Computer Vision Acknowledgements
- NSF Fellowship
- Office of Naval Research
- California Digital Library Project
- Yahoo! Research
- David Forsyth