Human abilities
Presented by Mahmoud Awadallah
What do we perceive in a glance of a real-world scene? Bryan Russell
Motivation
– Much can be recognized quickly
– Investigate the early computations performed on an image
– Analyze real-world, complicated scenes
Stimuli: outdoor images
Stimuli: indoor images
Experiment specifications
– 5 naïve scorers
– 105 attributes assessed for each description
– 2 scoring fields for each attribute: whether the attribute is described; if yes, whether it is accurate
Computation of score
Example: attribute "building", image 52, PT = 500 ms. Three subjects' descriptions were scored Yes / No / Yes for correctly describing the attribute, giving a raw score of 2/3 ≈ 0.67. For image 52, scores are then normalized by the maximum score across all presentation times (PT).
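A minimal sketch (not the authors' code) of this scoring rule, assuming responses are grouped per presentation time; the helper names and the PT values other than 500 ms are illustrative:

```python
# Hypothetical sketch of the per-attribute scoring described above.
# responses_by_pt[pt] is a list of booleans: did each subject's description
# correctly mention the attribute at that presentation time (PT)?

def raw_score(responses):
    """Fraction of descriptions that correctly mention the attribute."""
    return sum(responses) / len(responses)

def normalized_scores(responses_by_pt):
    """Normalize each PT's raw score by the maximum raw score across all PTs."""
    raw = {pt: raw_score(r) for pt, r in responses_by_pt.items()}
    max_raw = max(raw.values()) or 1.0  # guard against all-zero scores
    return {pt: s / max_raw for pt, s in raw.items()}

# Example from the slide: attribute "building", image 52, PT = 500 ms,
# three responses Yes / No / Yes -> raw score 2/3 ~= 0.67.
# The other PT values here are purely illustrative.
example = {500: [True, False, True], 100: [False, False, True], 250: [True, True, True]}
print(normalized_scores(example))
```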
How the scorers perform (building attribute)
The “content” of a single fixation: animate objects
The “content” of a single fixation: inanimate objects
The “content” of a single fixation: scene
The “content” of a single fixation: social events
Outdoor vs. indoor bias
Summary plots
Sensory vs. object/scene
Correlation of object/scene perception
Scene vs. objects
Conclusions
– Outdoor scene bias
– Less information needed for shape/sensory recognition
– Weak correlation between scene and object perception
80 million tiny images: a large dataset for non-parametric object and scene recognition
A.I. for the postmodern world
– All questions have already been answered… many times, in many ways
– Google is dumb; the “intelligence” is in the data
How about visual data?
The key question in this paper: how big does the image dataset need to be to robustly perform recognition using simple nearest-neighbor schemes?
– Complex classification methods don't extend well to this setting
– Can we use a simple classification method instead?
Past and future of image datasets in computer vision
(Plot: number of pictures vs. time, from Lena, a dataset in one picture, through COREL, toward billions of images around 2020 and the human click limit: all humanity taking one picture per second for 100 years. Slide by Antonio Torralba.)
How big is Flickr? (Image credit: Franck_Michel)
– 100M photos updated daily
– 6B photos as of August 2011!
– ~3B public photos
How Annotated is Flickr? (tag search)
– Party: 23,416,126
– Paris: 11,163,625
– Pittsburgh: 1,152,829
– Chair: 1,893,203
– Violin: 233,661
– Trashcan: 31,200
Noisy Output from Image Search Engines
Thumbnail Collection Project: collected 80M images
Thumbnail Collection Project
– Collect images for ALL objects
– List obtained from WordNet: 75,378 non-abstract nouns in English
Web image dataset
– 79.3 million images collected using image search engines
– List of nouns taken from WordNet
– All images saved at 32x32 resolution
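As a rough illustration of the 32x32 storage format (not the authors' collection pipeline), a downloaded image can be downsampled like this; the function name and file path are hypothetical:

```python
# Illustrative sketch: downsample a downloaded image to the 32x32 resolution
# used by the tiny-images dataset.
from PIL import Image
import numpy as np

def to_tiny(path):
    """Load an image, convert to RGB, and resize to 32x32 pixels."""
    img = Image.open(path).convert("RGB")
    tiny = img.resize((32, 32), Image.BILINEAR)
    return np.asarray(tiny, dtype=np.uint8)  # shape (32, 32, 3), 3072 bytes

# tiny = to_tiny("example.jpg")  # hypothetical file name
```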
How Much is 80M Images?
– One feature-length movie: 105 min at 24 FPS ≈ 151K frames
– 80M images is equivalent to watching ~530 movies
– How do we store this? ~1 KB * 80M = 80 GB; actual storage: 760 GB
Powers of 10
– Number of images on my hard drive: 10^4
– Number of images seen during my first 10 years: ~10^8 (3 images/second * 60 * 60 * 16 * 365 * 10)
– Number of images seen by all humanity: ~10^20 (106,456,367,669 humans * 60 years * 3 images/second * 60 * 60 * 16 * 365)
– Number of photons in the universe: ~10^88
– Number of all 8-bit 32x32 images: 256^(32*32*3) ≈ 10^7398
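A quick back-of-the-envelope check of the arithmetic on the two slides above (the variable names are mine; the 106-billion figure is the commonly cited estimate of all humans who have ever lived):

```python
# Pure arithmetic: verify the storage and powers-of-10 estimates.
from math import log10

frames_per_movie = 105 * 60 * 24                     # 151,200 frames in a 105-minute movie
movies_for_80M   = 80_000_000 / frames_per_movie     # ~530 movies
raw_storage_gb   = 80_000_000 * 1_000 / 1e9          # ~80 GB at ~1 KB per image

first_10_years = 3 * 60 * 60 * 16 * 365 * 10                         # ~6.3e8, i.e. ~10^8
all_humanity   = 106_456_367_669 * 60 * (3 * 60 * 60 * 16 * 365)     # ~10^20
all_32x32      = (32 * 32 * 3 * 8) * log10(2)                        # log10 of 2^24576 ~ 7398

print(frames_per_movie, round(movies_for_80M), raw_storage_gb)
print(f"{first_10_years:.1e}", f"{all_humanity:.1e}", f"10^{all_32x32:.0f}")
```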
Are 32x32 images enough?
Statistics of database of tiny images
Lots Of Images (A. Torralba, R. Fergus, W. T. Freeman, PAMI 2008)
Lots Of Images
First Attempt
– Used sum-of-squared-differences (SSD) to find nearest neighbors of the query image
– Used the first 19 principal components
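A minimal sketch of this first attempt, assuming images are stored as flattened 32x32x3 vectors: SSD nearest-neighbor search approximated in a 19-dimensional PCA subspace (not the authors' implementation; function names are mine):

```python
# SSD nearest-neighbor lookup over tiny images, using the first 19
# principal components as a low-dimensional approximation.
import numpy as np

def pca_basis(images, k=19):
    """images: (N, 3072) float array of flattened 32x32x3 images."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Right singular vectors give the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k].T                      # (3072,), (3072, k)

def nearest_neighbors(query, images, mean, basis, n=5):
    """Return indices of the n images with smallest SSD in the PCA subspace."""
    proj = (images - mean) @ basis             # (N, k) projected dataset
    q = (query - mean) @ basis                 # (k,) projected query
    ssd = ((proj - q) ** 2).sum(axis=1)        # squared distance per image
    return np.argsort(ssd)[:n]
```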
SSD says these are not similar?
Another similarity measure: allow small transformations (shifts, scalings) of the image before comparing pixels
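The idea can be sketched as taking the minimum SSD over small translations of one image; the paper's transformation-aware measures also handle scaling, mirroring, and per-pixel shifts, so this is only a simplified illustration:

```python
# Simplified shift-tolerant distance: minimum SSD over small integer
# translations of one image (wrap-around via np.roll is a simplification).
import numpy as np

def shift_ssd(a, b, max_shift=2):
    """a, b: (32, 32, 3) float arrays. Min SSD over small translations of b."""
    best = np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            best = min(best, float(((a - shifted) ** 2).sum()))
    return best
```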
WordNet Voting Scheme
– Ground truth
– One image, one vote
Classification at Multiple Semantic Levels
– Votes at one semantic level: Animal 6, Person 33, Plant 5, Device 3, Administrative 4, Others 22
– Votes at a coarser level: Living 44, Artifact 9, Land 3, Region 7, Others 10
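A hedged sketch of the voting scheme at multiple semantic levels: each retrieved neighbor casts one vote for its noun and for every ancestor on its hypernym chain, so the winning class can be read off at any level. The chains below are illustrative stand-ins for the real WordNet hierarchy:

```python
# Each neighbor votes for its label and for all ancestors on its hypernym chain.
from collections import Counter

# Hypothetical hypernym chains (leaf -> ... -> root); the real ones come from WordNet.
HYPERNYMS = {
    "person": ["person", "organism", "living thing", "entity"],
    "dog":    ["dog", "animal", "living thing", "entity"],
    "chair":  ["chair", "furniture", "artifact", "entity"],
}

def vote(neighbor_labels):
    """Accumulate one vote per neighbor at every level of its hypernym chain."""
    votes = Counter()
    for label in neighbor_labels:
        for node in HYPERNYMS.get(label, [label]):
            votes[node] += 1
    return votes

print(vote(["person", "person", "dog", "chair"]))
# e.g. "living thing" collects 3 votes, "entity" collects 4
```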
Person Recognition
– 23% of all images in the dataset contain people
– Wide range of poses: not just frontal faces
Person Recognition – Test Set
– 1016 images from Altavista using the “person” query
– High-res and 32x32 versions available
– Disjoint from the 79 million tiny images
Person Recognition
Task: is there a person in the image or not? Panel (c) shows the recall-precision curves for all 1016 images gathered from Altavista, and panel (d) shows the curves for the subset of 173 images where people occupy at least 20% of the image.
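For reference, a recall-precision curve like those in panels (c) and (d) can be computed from per-image scores (e.g. the fraction of neighbor votes for "person") and ground-truth labels; this is a generic sketch, not the authors' evaluation code:

```python
# Compute a recall-precision curve by sweeping a threshold over the scores.
import numpy as np

def recall_precision(scores, labels):
    """scores: (N,) floats, higher = more person-like; labels: (N,) booleans."""
    order = np.argsort(-np.asarray(scores))      # sort by decreasing score
    labels = np.asarray(labels, dtype=bool)[order]
    tp = np.cumsum(labels)                       # true positives at each cutoff
    fp = np.cumsum(~labels)                      # false positives at each cutoff
    recall = tp / max(labels.sum(), 1)
    precision = tp / (tp + fp)
    return recall, precision
```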
Scene classification (yellow = 7,900-image training set; red = 790,000 images; blue = 79,000,000 images)
What if we have labels…