EECS 442 – Computer vision Segments of this lectures are courtesy of Prof F. Li, R. Fergus and A. Zisserman Databases for object recognition and beyond.

Slides:



Advertisements
Similar presentations
Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint.
Advertisements

Poselets: Body Part Detectors trained Using 3D Human Pose Annotations Lubomir Bourdev & Jitendra Malik ICCV 2009.
Semantic Contours from Inverse Detectors Bharath Hariharan et.al. (ICCV-11)
Learning to Combine Bottom-Up and Top-Down Segmentation Anat Levin and Yair Weiss School of CS&Eng, The Hebrew University of Jerusalem, Israel.
3 Small Comments Alex Berg Stony Brook University I work on recognition: features – action recognition – alignment – detection – attributes – hierarchical.
Wrap Up. We talked about Filters Edges Corners Interest Points Descriptors Image Stitching Stereo SFM.
Large Scale Visual Recognition Challenge 2011 Alex BergStony Brook Jia DengStanford & Princeton Sanjeev SatheeshStanford Hao SuStanford Fei-Fei LiStanford.
Internet Vision - Lecture 3 Tamara Berg Sept 10. New Lecture Time Mondays 10:00am-12:30pm in 2311 Monday (9/15) we will have a general Computer Vision.
LOCUS (Learning Object Classes with Unsupervised Segmentation) A variational approach to learning model- based segmentation. John Winn Microsoft Research.
CS4670 / 5670: Computer Vision Bag-of-words models Noah Snavely Object
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information.
Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs Roozbeh Mottaghi 1, Sanja Fidler 2, Jian Yao 2, Raquel Urtasun 2, Devi Parikh 3 1 UCLA.
Contour Based Approaches for Visual Object Recognition Jamie Shotton University of Cambridge Joint work with Roberto Cipolla, Andrew Blake.
Bag-of-features models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Recognition: A machine learning approach
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Statistical Recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Kristen Grauman.
Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework Li-Jia Li, Richard Socher, Li Fei- Fei 1.
Lecture 28: Bag-of-words models
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce.
Bag-of-features models
Unsupervised discovery of visual object class hierarchies Josef Sivic (INRIA / ENS), Bryan Russell (MIT), Andrew Zisserman (Oxford), Alyosha Efros (CMU)
Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.
Agenda Introduction Bag-of-words model Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.
Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
Visual Object Recognition Rob Fergus Courant Institute, New York University
Extreme, Non-parametric Object Recognition 80 million tiny images (Torralba et al)
Lecture 29: Recent work in recognition CS4670: Computer Vision Noah Snavely.
Programme 2pm Introduction –Andrew Zisserman, Chris Williams 2.10pm Overview of the challenge and results –Mark Everingham (Oxford) 2.40pm Session 1: The.
ImageNet: A Large-Scale Hierarchical Image Database
Internet-scale Imagery for Graphics and Vision James Hays cs195g Computational Photography Brown University, Spring 2010.
The Beauty of Local Invariant Features
Review: Intro to recognition Recognition tasks Machine learning approach: training, testing, generalization Example classifiers Nearest neighbor Linear.
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
Salient Object Detection by Composition
Object Detection Sliding Window Based Approach Context Helps
Computer Vision CS 776 Spring 2014 Recognition Machine Learning Prof. Alex Berg.
Last part: datasets and object collections. CMU/MIT frontal facesvasc.ri.cmu.edu/idb/html/face/frontal_images cbcl.mit.edu/software-datasets/FaceData2.html.
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
Dynamic 3D Scene Analysis from a Moving Vehicle Young Ki Baik (CV Lab.) (Wed)
Yao, B., and Fei-fei, L. IEEE Transactions on PAMI(2012)
80 million tiny images: a large dataset for non-parametric object and scene recognition CS 4763 Multimedia Systems Spring 2008.
Why is computer vision difficult?
MSRI workshop, January 2005 Object Recognition Collected databases of objects on uniform background (no occlusions, no clutter) Mostly focus on viewpoint.
UNBIASED LOOK AT DATASET BIAS Antonio Torralba Massachusetts Institute of Technology Alexei A. Efros Carnegie Mellon University CVPR 2011.
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li and Li Fei-Fei Dept. of Computer Science, Princeton University, USA CVPR ImageNet1.
11/26/2015 Copyright G.D. Hager Class 2 - Schedule 1.Optical Illusions 2.Lecture on Object Recognition 3.Group Work 4.Sports Videos 5.Short Lecture on.
Object detection, deep learning, and R-CNNs
Grouplet: A Structured Image Representation for Recognizing Human and Object Interactions Bangpeng Yao and Li Fei-Fei Computer Science Department, Stanford.
Jigsaws: joint appearance and shape clustering John Winn with Anitha Kannan and Carsten Rother Microsoft Research, Cambridge.
Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15.
Object Recognition by Integrating Multiple Image Segmentations Caroline Pantofaru, Cordelia Schmid, Martial Hebert ECCV 2008 E.
Using the Forest to see the Trees: A computational model relating features, objects and scenes Antonio Torralba CSAIL-MIT Joint work with Aude Oliva, Kevin.
Jo˜ao Carreira, Abhishek Kar, Shubham Tulsiani and Jitendra Malik University of California, Berkeley CVPR2015 Virtual View Networks for Object Reconstruction.
CALTECH 256 Greg Griffin, Alex Holub and Pietro Perona.
Object detection with deformable part-based models
Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT
By Suren Manvelyan, Crocodile (nile crocodile?) By Suren Manvelyan,
Finding Clusters within a Class to Improve Classification Accuracy
Recognizing and Learning Object categories
Rob Fergus Computer Vision
Learning to Combine Bottom-Up and Top-Down Segmentation
“The Truth About Cats And Dogs”
Faster R-CNN By Anthony Martinez.
Liyuan Li, Jerry Kah Eng Hoe, Xinguo Yu, Li Dong, and Xinqi Chu
Semantic Segmentation
Presentation transcript:

EECS 442 – Computer vision Segments of this lectures are courtesy of Prof F. Li, R. Fergus and A. Zisserman Databases for object recognition and beyond

Caltech 101 Pictures of objects belonging to 101 categories. About 40 to 800 images per category. Most categories have about 50 images. The size of each image is roughly 300 x 200 pixels.

Caltech 101 images

Smallest category size is 31 images: Too easy? –left-right aligned –Rotation artifacts –Saturated performance Caltech-101: Drawbacks

Results up to recent methods obtain almost 100%

Caltech-256 Smallest category size now 80 images About 30K images Harder –Not left-right aligned –No artifacts –Performance is halved –More categories New and larger clutter category

Caltech 256 images baseball-bat basketball-hoop dog kayac traffic light

The PASCAL Visual Object Classes (VOC) Dataset and Challenge Mark Everingham Luc Van Gool Chris Williams John Winn Andrew Zisserman

Dataset Content 20 classes: aeroplane, bicycle, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, train, TV Real images downloaded from flickr, not filtered for “quality” Complex scenes, scale, pose, lighting, occlusion,...

Annotation Complete annotation of all objects Annotated in one session with written guidelines Truncated Object extends beyond BB Occluded Object is significantly occluded within BB Pose Facing left Difficult Not scored in evaluation

Examples Aeroplane Bus BicycleBirdBoatBottle CarCatChairCow

History New dataset annotated annually –Annotation of test set is withheld until after challenge ImagesObjectsClassesEntries 20052,2322, Collection of existing and some new data ,3049, Completely new dataset from flickr (+MSRC) 20079,96324, Increased classes to 20. Introduced tasters ,77620,73920 Added “occlusion” flag. Reuse of taster data. Release detailed results to support “meta-analysis”

Other recent datasets ESP [Ahn et al, 2006] LabelMe [ Russell et al, 2005] TinyImage Torralba et al Lotus Hill [ Yao et al, 2007] MSRC [Shotton et al. 2006]

3D object dataset [Savarese & Fei-Fei 07]

Instances Poses 1012 … 1 72 … ……… … … …… 8 azimuth angles 3 zenith 3 distances ~ 7000 images!

~20K categories; 14 million images; ~700im/categ; free to public at 16 Largest dataset for object categories up to date J. Deng, H. Su. K. Li, L. Fei-Fei,

is a knowledge ontology Taxonomy Partonomy The “social network” of visual concepts –Hidden knowledge and structure among visual concepts –Prior knowledge –Context

More Datasets…. UIUC Cars (2004) S. Agarwal, A. Awan, D. Roth 3D Textures (2005) S. Lazebnik, C. Schmid, J. Ponce CuRRET Textures (1999) K. Dana B. Van Ginneken S. Nayar J. Koenderink CAVIAR Tracking (2005) R. Fisher, J. Santos-Victor J. Crowley FERET Faces (1998) P. Phillips, H. Wechsler, J. Huang, P. Raus CMU/VASC Faces (1998) H. Rowley, S. Baluja, T. Kanade MNIST digits ( ) Y LeCun & C. Cortes KTH human action (2004) I. Leptev & B. Caputo Sign Language (2008) P. Buehler, M. Everingham, A. Zisserman Segmentation (2001) D. Martin, C. Fowlkes, D. Tal, J. Malik. Middlebury Stereo (2002) D. Scharstein R. Szeliski COIL Objects (1996) S. Nene, S. Nayar, H. Murase

CMU/MIT frontal facesvasc.ri.cmu.edu/idb/html/face/frontal_images cbcl.mit.edu/software-datasets/FaceData2.html PatchesFrontal faces Graz-02 Databasewww.emt.tugraz.at/~pinz/data/GRAZ_02/Segmentation masksBikes, cars, people UIUC Image Databasel2r.cs.uiuc.edu/~cogcomp/Data/Car/Bounding boxesCars TU Darmstadt Databasewww.vision.ethz.ch/leibe/data/Segmentation masksMotorbikes, cars, cows LabelMe datasetpeople.csail.mit.edu/brussell/research/LabelMe/intro.htmlPolygonal boundary>500 Categories Caltech 101www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.htmlSegmentation masks101 categories Caltech 256 COIL www1.cs.columbia.edu/CAVE/research/softlib/coil-100.html Bounding Box Patches 256 Categories 100 instances NORBwww.cs.nyu.edu/~ylclab/data/norb-v1.0/Bounding box50 toys Databases for object localization Databases for object recognition On-line annotation tools ESP gamewww.espgame.orgGlobal image descriptionsWeb images LabelMepeople.csail.mit.edu/brussell/research/LabelMe/intro.htmlPolygonal boundaryHigh resolution images The next tables summarize some of the available datasets for training and testing object detection and recognition algorithms. These lists are far from exhaustive. Links to datasets Collections PASCALhttp:// boxesvarious