Presentation is loading. Please wait.

Presentation is loading. Please wait.

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li and Li Fei-Fei Dept. of Computer Science, Princeton University, USA CVPR 2009 2015-11-11ImageNet1.

Similar presentations


Presentation on theme: "Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li and Li Fei-Fei Dept. of Computer Science, Princeton University, USA CVPR 2009 2015-11-11ImageNet1."— Presentation transcript:

1 Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li and Li Fei-Fei Dept. of Computer Science, Princeton University, USA CVPR 2009 2015-11-11ImageNet1 Jiewen Lei jiewenle@usc.edu

2  This paper is mainly an introduction to ImageNet. The paper is organized as follows: 1. shows properties of ImageNet 2. Compare ImageNet with current related datasets 3. Constructing ImageNet - describes without concrete steps 4. ImageNet Applications  mainly focus on the constructing ImageNet.  It mostly relatives to Crawling and PageRank. 2015-11-11ImageNet2

3  A dataset - Datasets and Computer Vision  Based on WordNet - Each node is depicted by images  A knowledge ontology - Taxonomy - Partonomy 2015-11-11ImageNet3

4  2-step process 2015-11-11ImageNet4 Step 1 : Collect candidate images Via the Internet Step 2 : Clean up the candidate Images by humans

5  For each synset, the queries are the set of WordNet synonyms  Accuracy of Internet Image search results: 10 % - For 500-1000 clean images, needs 10K images  Query expansion - Synonyms: German police dog, German shepherd dog - Appending words form ancestors: sheepdog, dog  Multiple Languages - Italian, Dutch, Spanish, Chinese e.g. 德国牧羊犬, pastore tedesco  More engines: Yahoo!, flickr, Google  Parallel downloading 2015-11-11ImageNet5

6  Rely on humans to verify each candidate image collected for a given synset  Amazon Mechanical Turk (AMT) - used for labeling vision data - 300 images: 0.02 dollar - 14,197,122 images: 946 dollars - 10 repetition: 9460 dollars - Jul 2008 -Apr 2010:11 million images  Present the users with a set of candidate images and the definition of the target synset  let users select the best match ones 2015-11-11ImageNet6

7 2015-11-11ImageNet7 Workers do annotation on AMT -Multiple annotations for each images Annotation Results - An average of > 97% accuracy

8  Users Enhancement - Provide wiki and google links for definitions - Make sure workers read the definition - Definition quiz - Allow more feedback. E.g. “unimagable synset” expert opinion 2015-11-11ImageNet8

9  Human users make mistakes  Not all users follow the instructions  Users do not always agree with each other - Subtle or confusing synsets, e.g. Burmese cat  Quality Control System 2015-11-11ImageNet9

10  randomly sample an initial subset of image to users - Have multiple users independently label same image  obtain a confidence score table, indicating the probability of an image being a good image given the user votes - Different categories requires different levels of consensus  Proceed until a pre-determined confidence score threshold reached 2015-11-11ImageNet10

11  Scale: 12 subtrees,3,2 million images,5247 categories  Hierarchy: densely populated semantic hierarchy, based on WordNet 2015-11-11ImageNet11

12  Accuracy: clean dataset at all level 2015-11-11ImageNet12  Diversity: variable appearances, positions, view points, poses, background clutter, occlusions.

13  Non-parametric Object Recognition 1. NN-voting + noisy ImageNet 2. NN-voting + clean ImageNet 3. Naive Bayesian Nearest Neighbor (NBNN) 4. NBNN-100  Tree Based Image Classification  Automatic Object Localization 2015-11-11ImageNet13

14  Pros 1. Crowdsourcing 2. Benchmarking 3. Open: Download Original Images, URLs, Features, Object Attributes, API  Cons 1. Improve algorithm: PageRank 2. AMT: hierarchical users based on their ability 3. Only one tag per image 2015-11-11ImageNet14


Download ppt "Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li and Li Fei-Fei Dept. of Computer Science, Princeton University, USA CVPR 2009 2015-11-11ImageNet1."

Similar presentations


Ads by Google