Proceedings of the IEEE, 2010. Antonio Torralba, MIT; Jenny Yuen, MIT; Bryan C. Russell, MIT
Outline
Introduction
Web Annotation and Data Statistics
 -A. Data Set Evolution and Distribution of Objects
 -B. Study of Online Labelers
The Space of LabelMe Images
 -A. Distribution of Scene Types
 -B. The Space of Images
 -C. Recognition by Scene Alignment
Beyond 2-D Images
 -A. From Annotations to 3-D
 -B. Video Annotation
Conclusion
Introduction
From small data sets to large data sets
In 2005, the online annotation tool LabelMe was created
LabelMe provides functionality for drawing polygons to outline the spatial extent of objects in images
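Each LabelMe annotation pairs an object name with a polygon given as a list of image coordinates. A minimal sketch of reading such an annotation (the XML layout below is illustrative, a simplified stand-in for the actual LabelMe schema):

```python
import xml.etree.ElementTree as ET

# Simplified LabelMe-style annotation: each object has a name and a
# polygon stored as a list of (x, y) vertices in image coordinates.
SAMPLE = """
<annotation>
  <object>
    <name>car</name>
    <polygon>
      <pt><x>10</x><y>20</y></pt>
      <pt><x>60</x><y>20</y></pt>
      <pt><x>60</x><y>50</y></pt>
      <pt><x>10</x><y>50</y></pt>
    </polygon>
  </object>
</annotation>
"""

def parse_objects(xml_text):
    """Return a list of (name, [(x, y), ...]) tuples from an annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        pts = [(float(pt.findtext("x")), float(pt.findtext("y")))
               for pt in obj.find("polygon").findall("pt")]
        objects.append((name, pts))
    return objects

objects = parse_objects(SAMPLE)
```

Storing full polygons rather than bounding boxes is what lets later sections analyze overlap between object boundaries.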
Web Annotation and Data Statistics
A. Data Set Evolution and Distribution of Objects
B. Study of Online Labelers
The Features of the LabelMe Database
- Object class recognition
- Learning about objects embedded in a scene
- High-quality labeling
- Many diverse object classes
- Many diverse images
- Many noncopyrighted images
- Open and dynamic
Data Set Evolution and Distribution of Objects (1/2)
(a) Number of annotated objects
(b) Number of images with at least one annotated object
(c) Number of unique object descriptions
Data Set Evolution and Distribution of Objects (2/2)
Study of Online Labelers
Data collected from July 7, 2008 to March 19, 2009
(a) Number of new annotations provided by individual users
(b) Distribution of the time it takes to label an object
The Space of LabelMe Images
A. Distribution of Scene Types
B. The Space of Images
C. Recognition by Scene Alignment
Distribution of Scene Types (1/1)
Starting from ideas in cognitive psychology, we study how many distinct configurations of n objects appear in the database (n = 1, 2, 4, 8)
The distribution of configurations follows a power law
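Counting configurations can be sketched as follows: for each image, enumerate every unordered set of n object classes it contains, then tally how often each set occurs across the database. The per-image label lists below are toy stand-ins, not LabelMe data:

```python
from collections import Counter
from itertools import combinations

# Toy per-image object labels; real data would come from the annotations.
images = [
    ["car", "road", "building", "tree"],
    ["car", "road", "sky", "tree"],
    ["car", "road", "building", "sky"],
    ["person", "sidewalk", "building", "tree"],
]

def configuration_counts(images, n):
    """Frequency of each unordered n-object configuration across images."""
    counts = Counter()
    for labels in images:
        for combo in combinations(sorted(set(labels)), n):
            counts[combo] += 1
    return counts

counts = configuration_counts(images, 2)
ranked = counts.most_common()
# Under a power law, log(frequency) falls roughly linearly with log(rank).
```

Sorting by frequency gives the rank-frequency curve whose log-log slope is checked against a power law.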
The Space of Images (1/3)
Process of Defining Semantic Distance (2/3)
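One plausible sketch of a semantic distance between two annotated images (an assumption for illustration, not the paper's exact definition): compare normalized histograms of their object labels, so images sharing most object classes come out close.

```python
from collections import Counter

def label_histogram(labels):
    """Normalized histogram of object labels in one image."""
    c = Counter(labels)
    total = sum(c.values())
    return {k: v / total for k, v in c.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical label distributions."""
    return sum(min(h1.get(k, 0.0), h2.get(k, 0.0))
               for k in set(h1) | set(h2))

def semantic_distance(labels_a, labels_b):
    """0 for identical label distributions, up to 1 for disjoint ones."""
    return 1.0 - histogram_intersection(label_histogram(labels_a),
                                        label_histogram(labels_b))

d = semantic_distance(["car", "road", "sky"], ["car", "road", "tree"])
# Two of three labels match, so d is 1/3.
```

Unlike pixel-based distances, this measure depends only on the annotations, which is what makes it "semantic".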
The Space of Images (3/3)
A visualization of images that are fully annotated
Recognition by Scene Alignment
Given a new image as input, we compute its GIST descriptor and use descriptor distance to retrieve similar annotated scenes
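The retrieval step reduces to nearest-neighbor search in descriptor space. A minimal sketch, assuming GIST descriptors have already been computed (real GIST vectors are several hundred dimensions; the 3-D vectors and file names here are toys):

```python
import math

# Hypothetical precomputed GIST descriptors, one vector per database image.
database = {
    "street.jpg":  [0.9, 0.1, 0.2],
    "beach.jpg":   [0.1, 0.8, 0.7],
    "kitchen.jpg": [0.4, 0.4, 0.1],
}

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, database, k=2):
    """Return the k database images closest to the query descriptor."""
    return sorted(database, key=lambda name: euclidean(query, database[name]))[:k]

matches = nearest_neighbors([0.85, 0.15, 0.25], database, k=1)
# The query descriptor is closest to street.jpg's.
```

Annotations from the retrieved neighbors can then be transferred or aligned to the input image.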
The Power of a Large-Scale Database
A simple algorithm provides an upper bound: find the nearest neighbor of the input image and use its annotation as the labeling of the input image
How this result improves with database size hints at how many more images we need to label
Beyond 2-D Images
A. From Annotations to 3-D
B. Video Annotation
From Annotations to 3-D (1/7)
Object labels contain implicit 3-D information, observed by analyzing the overlap between object boundaries
Object types:
- Ground objects
- Standing objects
- Attached objects
Relations between objects:
- supported-by
- part-of
From Annotations to 3-D (2/7)
Learning the relationships between objects:
1) part-of: evaluate the frequency of high relative overlap between the two objects' polygons
2) supported-by: the bottom of the supported object's polygon lies inside the supporting object's polygon
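Both cues can be sketched with simple geometry. For brevity this sketch approximates each polygon by its axis-aligned bounding box (an assumption; true polygon intersection would be used in practice), with image coordinates where y grows downward:

```python
def bbox(poly):
    """Axis-aligned bounding box (x0, y0, x1, y1) of a polygon."""
    xs, ys = zip(*poly)
    return min(xs), min(ys), max(xs), max(ys)

def overlap_ratio(inner, outer):
    """Fraction of `inner`'s box covered by `outer`'s box.
    High values, seen frequently for a label pair, suggest part-of."""
    ax0, ay0, ax1, ay1 = bbox(inner)
    bx0, by0, bx1, by1 = bbox(outer)
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inner_area = (ax1 - ax0) * (ay1 - ay0)
    return (iw * ih) / inner_area if inner_area > 0 else 0.0

def is_supported_by(obj, support, tol=5.0):
    """Supported-by cue: the bottom edge of `obj` falls within the
    horizontal span and vertical extent of `support`'s box."""
    ox0, _, ox1, obottom = bbox(obj)
    sx0, stop, sx1, sbottom = bbox(support)
    horizontally_inside = ox0 >= sx0 - tol and ox1 <= sx1 + tol
    return horizontally_inside and stop - tol <= obottom <= sbottom + tol

wheel = [(12, 40), (20, 40), (20, 50), (12, 50)]
car = [(10, 20), (60, 20), (60, 50), (10, 50)]
ratio = overlap_ratio(wheel, car)  # fully inside -> part-of evidence
```

Aggregating these per-image scores over many annotated images yields the learned relation frequencies.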
From Annotations to 3-D (3/7)
From Annotations to 3-D (4/7)
Reconstructing a 3-D model for an input image:
1) determine each object's type
2) determine each polygon edge's type
3) compute the real-world distance between objects

Object type                 Edge type
Ground objects (green)      Contact (white)
Standing objects (red)      Attached (gray)
Attached objects (yellow)   Occlusion (black)
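Step 3 can be illustrated with a simple ground-plane camera model (an assumption for illustration, not the paper's exact formulation): a contact point at image row v below the horizon row v0 lies at depth z = f * h / (v - v0), where f is the focal length in pixels and h the camera height above the ground.

```python
def ground_depth(v, f=800.0, camera_height=1.6, horizon_row=300.0):
    """Depth (meters) of a ground contact point at image row v (y-down).
    f, camera_height, and horizon_row are assumed calibration values."""
    if v <= horizon_row:
        raise ValueError("point at or above the horizon; depth undefined")
    return f * camera_height / (v - horizon_row)

# Contact points lower in the image (larger v) are closer to the camera.
near = ground_depth(620)  # 800 * 1.6 / 320 = 4.0 m
far = ground_depth(340)   # 800 * 1.6 / 40 = 32.0 m
```

With depths at the contact edges fixed, standing objects are "popped up" vertically from the ground plane at those depths.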
From Annotations to 3-D (5/7)
From Annotations to 3-D (6/7)
More labeling improves the quality of the reconstruction
Incorrect labeling, however, degrades it
From Annotations to 3-D (7/7)
Video Annotation (1/1)
Conclusion
LabelMe is a web-based tool that allows labeling of objects and their locations in images
LabelMe has collected a large annotated database of images spanning many different scene and object classes
LabelMe can recover a 3-D description of an image from its annotations
The next goal is extending the database to video, a promising direction for computer vision and computer graphics