Large Scale Visual Recognition Challenge 2011. Alex Berg (Stony Brook), Jia Deng (Stanford & Princeton), Sanjeev Satheesh (Stanford), Hao Su (Stanford), Fei-Fei Li (Stanford).

Presentation transcript:


Large Scale Recognition
– Millions to billions of images
– Hundreds of thousands of possible labels
– Recognition for indexing and retrieval
– Complement current Pascal VOC competitions
[Figure: example images labeled "Car" for LSVRC 2010 and for LSVRC 2011, the latter marked Categorization and Localization]

Source for categories and training data
ImageNet
– 14,192,122 images, thousands of categories
– Images found via web searches for WordNet noun synsets
– Hand verified using Mechanical Turk
– Bounding boxes for query object labeled
– New data for validation and testing each year
WordNet
– Source of the labels
– Semantic hierarchy
– Contains a large fraction of English nouns
– Also used to collect other datasets, like tiny images (Torralba et al.)
– Note that categorization is not the end/only goal, so idiosyncrasies of WordNet may be less critical

ILSVRC 2011 Data
Training data
– 1,229,413 images in 1000 synsets
– Min = 384, median = 1300, max = 1300 (per synset)
– 315,525 images have bounding box annotations
– Min = 100 / synset
– 345,685 bounding box annotations
Validation data
– 50 images / synset
– 55,388 bounding box annotations
Test data
– 100 images / synset
– 110,627 bounding box annotations
* Tree and some plant categories replaced with other objects between 2010 and 2011

Jia Deng (lead student)

ImageNet is a knowledge ontology
Taxonomy
Partonomy
The “social network” of visual concepts
– Hidden knowledge and structure among visual concepts
– Prior knowledge
– Context


[Figure: diversity, with Caltech101 for comparison]

Classification Challenge
Given an image, predict categories of objects that may be present in the image
1000 “leaf” categories from ImageNet
Two evaluation criteria based on cost averaged over test images
– Flat cost: pay 0 for correct category, 1 otherwise
– Hierarchical cost: pay 0 for correct category, height of least common ancestor in WordNet for any other category (divide by max height for normalization)
Allow a shortlist of up to 5 predictions
– Use the lowest cost prediction for each test image
– Allows for incomplete labeling of all categories in an image
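The two costs can be sketched in a few lines of Python. This is a minimal illustration, not the organizers' evaluation code: the six-node hierarchy, the `PARENT` map and every function name are hypothetical, the hierarchy is treated as a tree (WordNet itself allows multiple parents), and "height" is taken to mean the longest downward path from a node to a leaf.

```python
# Hypothetical toy hierarchy: child -> parent. Real WordNet is far larger.
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}

def ancestors(node):
    """Path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def height(node):
    """Length of the longest downward path from node to a leaf."""
    children = [c for c, p in PARENT.items() if p == node]
    return 0 if not children else 1 + max(height(c) for c in children)

MAX_HEIGHT = max(height(n) for n in PARENT)  # used for normalization

def flat_cost(pred, truth):
    """Pay 0 for the correct category, 1 otherwise."""
    return 0.0 if pred == truth else 1.0

def hierarchical_cost(pred, truth):
    """Pay the normalized height of the least common ancestor."""
    if pred == truth:
        return 0.0
    truth_ancestors = set(ancestors(truth))
    # First ancestor of pred that is also an ancestor of truth.
    lca = next(a for a in ancestors(pred) if a in truth_ancestors)
    return height(lca) / MAX_HEIGHT

def image_cost(shortlist, truth, cost_fn):
    """Lowest cost among at most 5 predictions for one image."""
    return min(cost_fn(p, truth) for p in shortlist[:5])

def mean_cost(shortlists, truths, cost_fn):
    """Cost averaged over test images."""
    costs = [image_cost(s, t, cost_fn) for s, t in zip(shortlists, truths)]
    return sum(costs) / len(costs)
```

In this toy tree, predicting "cat" for a "dog" image costs 1 under the flat criterion but only 0.5 under the hierarchical one, because the two categories meet at "animal" rather than at the root.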

Participation
– 15 submissions
– 96 registrations
Top Entries
– Xerox Research Centre Europe
– Univ. Amsterdam & Univ. Trento
– ISI Lab Univ. Tokyo
– NII Japan

Classification Results: Flat Cost, 5 Predictions per Image
[Figure: histogram of # entries by flat cost; baseline at 0.80]
Probably evidence of some self-selection in submissions.

Best Classification Results 5 Predictions / Image

Classification Winners
1) XRCE (0.26)
2) Univ. Amsterdam & Univ. Trento (0.31)
3) ISI Lab Tokyo University (0.34)

Easiest synsets
Synset                                   Mean flat cost*
web site, website, internet site, site   0.067
jack-o'-lantern                          0.117
odometer, hodometer,                     0.127
manhole cover                            0.127
bullet train, bullet                     0.147
electric locomotive                      0.150
zebra                                    0.163
daisy                                    0.170
pickelhaube                              0.170
freight car                              0.180
nematode, nematode worm, roundworm       0.180
* Numbers indicate the mean flat cost from the top 5 predictions from all submissions

Toughest Synsets
Synset                                   Mean flat cost*
water jug                                0.940
cassette player                          0.940
weasel                                   0.943
sunscreen, sunblock, sun blocker         0.943
plunger, plumber's helper                0.947
syringe                                  0.950
wooden spoon                             0.953
mallet                                   0.957
spatula                                  0.963
paintbrush                               0.967
power drill                              0.973
* Numbers indicate the mean flat cost from the top 5 predictions from all submissions
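A per-synset number of this kind could be computed roughly as follows. This is a sketch under assumptions: the data layout (one dict of image id to shortlist per submission), the function name, and the choice to pool all (submission, image) pairs before averaging are illustrative, since the slides do not spell out the exact aggregation.

```python
from collections import defaultdict

def per_synset_mean_flat_cost(submissions, truths):
    """Mean top-5 flat cost per ground-truth synset, pooled over submissions.

    submissions: list of dicts, image_id -> ranked list of predicted synsets
    truths:      dict, image_id -> ground-truth synset
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for shortlists in submissions:
        for image_id, truth in truths.items():
            # A missing image counts as an error (empty shortlist).
            shortlist = shortlists.get(image_id, [])[:5]
            totals[truth] += 0.0 if truth in shortlist else 1.0
            counts[truth] += 1
    return {synset: totals[synset] / counts[synset] for synset in totals}

# Tiny made-up example with two submissions and three images.
truths = {"a": "zebra", "b": "zebra", "c": "mallet"}
sub1 = {"a": ["zebra"], "b": ["daisy", "zebra"], "c": ["spatula"]}
sub2 = {"a": ["daisy"], "b": ["zebra"], "c": ["mallet"]}
costs = per_synset_mean_flat_cost([sub1, sub2], truths)
```

Sorting the resulting dict by value would give an easiest-to-toughest ranking of synsets.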

Water-jugs are hard!

But wooden spoons?

Easiest Subtrees
Columns: Synset, # of leaves, Average flat cost
– furniture, piece of furniture
– vehicle
– bird
– food
– vertebrate, craniate

Hardest Subtrees
Columns: Synset, # of leaves, Average flat cost
– implement
– tool
– vessel
– reptile
– dog

Most difficult …..?

Most difficult paintbrushes!

Easiest paintbrushes

Localization Challenge

Entries: Two Brave Submissions
Columns: Team, Flat cost, Hierarchical cost
– University of Amsterdam & University of Trento
– ISI lab., the Univ. of Tokyo

Precision
Best                                     Worst
jack-o'-lantern                          paintbrush
web site, website, internet site, site   muzzle
monarch, monarch butterfly,              power drill
rock beauty [tricolored fish]            water jug
golf ball                                mallet
daisy                                    spatula
airliner                                 gravel, crushed rock

Recall
Best                                     Worst
jack-o'-lantern                          paintbrush
web site, website, internet site, site   muzzle
monarch, monarch butterfly,              power drill
rock beauty [tricolored fish]            water jug
golf ball                                mallet
manhole cover                            spatula
airliner                                 gravel, crushed rock

Rough Analysis
Detection performance coupled to classification
– All of {paintbrush, muzzle, power drill, water jug, mallet, spatula, gravel} and many others are difficult classification synsets
The best detection synsets are those with the best classification performance
– E.g., they tend to occupy the entire image

Highly accurate localizations from the winning submission

Other correct localizations from the winning submission

2012 Large Scale Visual Recognition Challenge! Stay tuned…