Image Retrieval Discussion

Image Retrieval Discussion
Andrew Chi and Brian Cristante
COMP 790-133: January 27, 2015

Concept map: Image Retrieval
- AI / Vision problem
  - Sensory Gap: “What features should we use?” (text, attributes, SIFT, visual keywords; query-dependent?)
  - Semantic Gap: “How should we index the images and retrieve them?” (semantic hierarchies, centrality measures, PageRank)
- Systems design / software engineering problem
  - Design issues: complex architecture, scalability, efficiency, integrity of results
- Domain: broad vs. narrow
- Intention Gap: type of search and user (specific image, set of images, exploratory search)

What is “Image Retrieval?”
Several scenarios:
- We have a specific image stored somewhere and want to find it again
- We dig around in a collection for some image that suits our needs
- We are just browsing and want guidance, helpful hints, and logical organization
The query can be text, another image, or both.

First Attempts
Web searches: use text associated with the image, in context, on its webpage:
- Captions
- Surrounding text
- Other metadata
IBM QBIC (1995): Query By Image Content
- Uses low-level features to tag images with “visual keywords”
- Search for “red,” “square-shaped,” “metallic”

The Landscape of the Problem
Around the year 2000, researchers began to break down the problem:
- Sensory Gap: the difference between a real-world object and how it looks in an image
- Semantic Gap: the difference between low-level features and the actual content of the image
- Intention Gap: what does the user even want?

The Landscape of the Problem
This leads to our central questions:
- How should we represent our images?
- How should we index and organize our images?
- How should we interpret a natural-language query from the user?
- What algorithms can we use to actually retrieve an image in response to a query?

Returning to the concept map: the Sensory Gap (“What features should we use?”).

Sensory Gap
Addressing the sensory gap means choosing appropriate features that give us the information we need from an image.
Some information is naturally lost in creating an image. This can’t be helped. (Or can it?)

What’s a Good Feature?
We have billions of images that don’t necessarily share visual characteristics. How should we represent them to highlight their similarities and differences?
There’s no clear-cut answer to this …

What’s a Good Feature?
Examples: SIFT (gradient-based); bag of (visual) words.

What’s a Good Feature?
The VisualRank paper:
- Search by web text to narrow the number of images under consideration
- Want to find the most “important” image in terms of its similarity to the other images
- Local features can capture more subtle differences
- Chooses SIFT features, which are rather robust (to scale, rotation, and illumination, but not color)

What’s a Good Feature?
The VisualRank paper asks: could we make the choice of features adapt to users’ queries? We’ll save this for discussion.

Returning to the concept map: the Semantic Gap (“How should we index the images and retrieve them?”).

Semantic Gap
- To cross the semantic gap for retrieval, we have to make links between the features we’ve extracted and what a user would be searching for
- That’s why, in our concept map, we say the semantic gap makes us think about how to index and retrieve images (however they are represented)
- Think of building a data structure and devising an algorithm to traverse that data structure

Attributes
Elements of semantic significance (Farhadi et al., 2009):
- Descriptive (“furry”)
- Subcomponents (“has nose”)
- Discriminative (something a dog has but a cat does not)

Attributes
Attributes lie inside the semantic gap, between low-level features and the full semantic interpretation of the image:
Image (raw pixels, e.g. (255, 0, 31))
→ Features: [0, -0.5, 1.3, 1.6, 0.1, -0.2, …, 0.3]
→ Attributes: red, has 4 wheels, has engine
→ Category: “Car”
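
A toy sketch of the middle layer: mapping a low-level feature vector to binary attribute indicators with per-attribute linear classifiers. The attribute names, weights, and threshold here are invented for illustration (real systems learn them from labeled data).

```python
# Hypothetical attribute classifiers: one linear scorer per attribute,
# thresholded to produce a binary indicator. All numbers are made up.

ATTRIBUTES = ["red", "has_4_wheels", "has_engine"]

WEIGHTS = {
    "red":          [0.9, -0.1, 0.2],
    "has_4_wheels": [0.1,  0.8, 0.3],
    "has_engine":   [0.2,  0.1, 0.9],
}

def predict_attributes(features, threshold=0.5):
    """Map a low-level feature vector to binary attribute indicators."""
    indicators = {}
    for name in ATTRIBUTES:
        score = sum(w * f for w, f in zip(WEIGHTS[name], features))
        indicators[name] = 1 if score > threshold else 0
    return indicators

attrs = predict_attributes([1.0, 0.9, 0.8])
```

A category classifier (“Car”) would then operate on the attribute vector rather than on raw features, which is what lets attributes sit between the two levels.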

Slide credit: Behjat Siddiquie

Semantic Hierarchies
- Organize images in a tree of increasingly specific categories (IS-A relationships)
- Need a large number of images for this to be non-trivial
- Useful for a variety of vision tasks, including retrieval: exploratory search, finding representatives of some category, building datasets
- Find images that contain semantically similar objects -- but not necessarily visually similar!
- ImageNet (www.image-net.org): big crossover with NLP (WordNet)

Retrieval with Semantic Hierarchies
- Semantic hierarchies and attributes can be used together for efficient retrieval
- Compute similarity (“image distance”) by comparing attributes
- Use the hierarchy to weight the co-occurrence of attributes; that is, the hierarchy accounts for prior knowledge
For you math nerds:
Similarity(A, B) = Σ_{i,j} S_ij · δ_i(A) · δ_j(B)
where A and B are images; i and j index attributes; δ_i(A) is the indicator function (1 if image A has attribute i, else 0); and S_ij is the co-occurrence score.
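
To make the attribute-based similarity concrete, here is a minimal Python sketch. The co-occurrence matrix S is hypothetical; in the papers it is derived from the hierarchy rather than written by hand.

```python
# Hypothetical co-occurrence scores S_ij between three attributes.
S = [
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

def similarity(attrs_a, attrs_b, S=S):
    """Similarity(A, B) = sum over i, j of S_ij * delta_i(A) * delta_j(B),
    where attrs_a / attrs_b are binary attribute indicator vectors."""
    return sum(
        S[i][j]
        for i in range(len(attrs_a)) if attrs_a[i]
        for j in range(len(attrs_b)) if attrs_b[j]
    )

a = [1, 1, 0]   # image A has attributes 0 and 1
b = [0, 1, 1]   # image B has attributes 1 and 2
sim = similarity(a, b)
```

Note that because S has non-zero off-diagonal entries, two images can be similar even when their attribute sets only partially overlap; that is where the hierarchy’s prior knowledge enters.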

Retrieval with Semantic Hierarchies
- Use hashing to retrieve images in sub-linear time with respect to the size of the collection (Deng, A. Berg, Fei-Fei, 2011)
- Highly parallelizable

Returning to the concept map: the systems design side (design issues, domain, scalability, efficiency).

Narrow domain: medical image search
Example: Open-i (http://openi.nlm.nih.gov/), which supports simultaneous phrase- and image-based search.
Image retrieval pipeline:
1. Extract low-level features (color, texture, shape)
2. Transform features into visual keywords and annotations
3. Compute similarity between the query image and database images
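
As an illustration of step 3 only (not the Open-i implementation), here is one common low-level similarity: histogram intersection between a query image’s normalized color histogram and those of database images. The histograms and image names are made up.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms of equal length."""
    return sum(min(a, b) for a, b in zip(h1, h2))

query = [0.5, 0.3, 0.2]                # hypothetical normalized color histogram
database = {
    "img_a": [0.5, 0.3, 0.2],          # same color distribution as the query
    "img_b": [0.1, 0.1, 0.8],          # very different distribution
}

# Rank database images by similarity to the query, best first.
ranked = sorted(database,
                key=lambda k: histogram_intersection(query, database[k]),
                reverse=True)
```

In a real system this score would be one feature among several (texture, shape), combined with the text-based phrase search.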

Types of Centrality
- Degree centrality: x_i = Σ_j A_ij
- Eigenvector centrality: x_i = (1/λ) Σ_j A_ij x_j (λ the leading eigenvalue)
- Katz centrality: x_i = α Σ_j A_ij x_j + β
- PageRank: x_i = α Σ_j (A_ij x_j / k_j^out) + β
Example: in a directed graph A → B with no other edges, both A and B have eigenvector centrality 0, but non-zero Katz centrality.
From Networks: An Introduction, by M.E.J. Newman, 2010.
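
The PageRank recurrence can be computed by straightforward power iteration. A toy sketch on a three-node cycle, with damping α = 0.85 and β = (1 − α)/N (a common normalization, assumed here):

```python
def pagerank(A, alpha=0.85, iters=100):
    """Power iteration for x_i = alpha * sum_j A_ji x_j / k_j^out + (1-alpha)/N.
    A[j][i] = 1 means node j links to node i."""
    n = len(A)
    out_deg = [sum(A[j]) or 1 for j in range(n)]   # guard against dangling nodes
    x = [1.0 / n] * n
    for _ in range(iters):
        x = [
            alpha * sum(A[j][i] * x[j] / out_deg[j] for j in range(n))
            + (1 - alpha) / n
            for i in range(n)
        ]
    return x

# A cycle 0 -> 1 -> 2 -> 0: by symmetry, all nodes should get equal rank.
A = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
ranks = pagerank(A)
```

For VisualRank, the same iteration runs on an image similarity graph instead of a hyperlink graph.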

Image Rank at Web Scale
- 1.8 billion photos shared per day (source: KPCB, Internet Trends 2014)
- How long would it take just to compute the similarity matrix?
- Assume N = 1.8 × 10⁹, 100 cycles per similarity, 1,000 CPUs at 3 GHz:
  (N²/2) × 100 / (3 × 10⁹) / 1000 / 86400 / 365 ≈ 1.7 years
- O(N²) is far too slow.
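
Reproducing the back-of-envelope arithmetic, with the slide’s assumptions spelled out as variables:

```python
# Assumptions from the slide: 1.8B images, 100 cycles per pairwise similarity,
# 1,000 CPUs each running at 3 GHz, and only the upper triangle computed.
N = 1.8e9
cycles_per_similarity = 100
cpu_hz = 3e9
n_cpus = 1000

pairs = N * N / 2                                    # ~1.6e18 comparisons
seconds = pairs * cycles_per_similarity / cpu_hz / n_cpus
years = seconds / 86400 / 365                        # ~1.7 years
```

Even under these generous assumptions (a similarity in 100 cycles is optimistic for SIFT matching), the quadratic term dominates, which motivates LSH below.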

Locality-Sensitive Hashing (LSH)
Key idea: avoid computing the entire distance matrix.
- Most pairs of images will be extremely dissimilar
- Find a way to compare only the images that have a good chance of being similar
Hashing is normally used to spread data uniformly; LSH does the opposite. It is also used for dimensionality reduction.

LSH on Sets (MinHash)
How similar are two sets (of features, n-grams, etc.)?
- Jaccard similarity: J(S, T) = |S ∩ T| / |S ∪ T|
MinHash: use a normal hash function to hash every element of both sets. Then assign each set to the bucket denoted by the minimum (numerical) hash of any of its elements.
Question: what is the probability that two sets S and T are assigned to the same bucket?
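
A tiny sketch of MinHash, using Python’s built-in `hash` with a per-function salt as a stand-in for a family of random hash functions. The fraction of salts on which two sets share the same minimum hash estimates their Jaccard similarity, which hints at the answer to the question above.

```python
def jaccard(s, t):
    """Exact Jaccard similarity of two sets."""
    return len(s & t) / len(s | t)

def minhash(s, salt):
    """One MinHash signature slot: the minimum salted hash over the set."""
    return min(hash((salt, x)) for x in s)

def estimate_jaccard(s, t, n_hashes=500):
    """Fraction of hash functions on which the two sets land in the same bucket."""
    matches = sum(minhash(s, k) == minhash(t, k) for k in range(n_hashes))
    return matches / n_hashes

S1 = {"cat", "dog", "fish"}
S2 = {"dog", "fish", "bird"}
est = estimate_jaccard(S1, S2)   # should be close to J(S1, S2) = 0.5
```

More hash functions tighten the estimate, at the cost of a longer signature per set.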

LSH on n-Dimensional Feature Vectors

LSH for VisualRank: Algorithm
1. Extract local (SIFT) features from images A–D
2. Hash the features using many LSH functions of the form h_{a,b}(V) = ⌊(a · V + b) / W⌋
3. Features match if they hash to the same bucket in >3 tables
4. Images match if they share >3 matching features
5. Estimate the similarity of matching images using #matches / #avgtotal
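
A minimal sketch of the hash family in step 2 (standard random-projection LSH): a is a random Gaussian direction, b a random offset, and W the bucket width. The feature vectors below are invented toy data, not SIFT descriptors.

```python
import random

def make_hash(dim, W, rng):
    """Build one LSH function h_{a,b}(V) = floor((a . V + b) / W)."""
    a = [rng.gauss(0, 1) for _ in range(dim)]   # random projection direction
    b = rng.uniform(0, W)                       # random offset within one bucket
    def h(v):
        return int((sum(ai * vi for ai, vi in zip(a, v)) + b) // W)
    return h

rng = random.Random(42)
hashes = [make_hash(dim=4, W=4.0, rng=rng) for _ in range(10)]

close_a = [1.0, 2.0, 3.0, 4.0]
close_b = [1.1, 2.0, 3.0, 4.1]       # near-duplicate of close_a
far_c   = [-5.0, 9.0, -3.0, 0.5]     # unrelated vector

# Nearby vectors collide in most tables; distant ones rarely do.
matches_ab = sum(h(close_a) == h(close_b) for h in hashes)
matches_ac = sum(h(close_a) == h(far_c) for h in hashes)
```

The “>3 tables” rule in step 3 then filters feature pairs by collision count rather than computing any distances directly.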

LSH for VisualRank: Performance
Time to compute a single similarity matrix (single CPU): 1,000 images in 15 minutes.
Large-scale estimate (if your name is Google):
- 1,000 CPUs
- Top 100,000 queries
- Use the top 1,000 images for each query
- Less than 30 hours total
Specifics were not published, but MapReduce is the likely platform.

MapReduce (sort of)

MapReduce (more accurate)
Word-count example over the complete works of Shakespeare:
1. Split: divide the input into chunks (“All the world's a stage, and all…”, “And all my soul, and all my…”, “And this the hand that slew…”)
2. Map: each mapper emits (word, count) pairs for its chunk, e.g. (all, 2), (the, 1), (world, 1), (and, 1), (stage, 1)
3. Shuffle: pairs are grouped by key, so every count for “all” goes to one reducer, every count for “the” to another, and so on
4. Reduce: each reducer sums its counts, e.g. (all, 2) + (all, 2) → (all, 4); (and, 2) + (and, 1) + (and, 1) → (and, 4)
5. Collect: the final histogram of word counts
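
The split → map → shuffle → reduce flow can be simulated in a few lines of plain Python (no Hadoop; a hypothetical two-chunk input):

```python
from collections import defaultdict

def map_phase(chunk):
    """Mapper: emit (word, 1) for every word in one input chunk."""
    return [(word, 1) for word in chunk.lower().split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: sum the grouped counts for each word."""
    return {key: sum(values) for key, values in groups.items()}

chunks = ["all the world", "and all my soul and all my"]
pairs = [p for chunk in chunks for p in map_phase(chunk)]
counts = reduce_phase(shuffle(pairs))
```

In a real cluster the mappers and reducers run on different machines and the shuffle moves data over the network; the logic is otherwise the same.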

Questions
Suggest a method for implementing the VisualRank LSH algorithm at a large scale:
- MapReduce: what are the mappers and reducers?
- UNC Kure/Killdevil: what would each 12-core node do?
Say you are not Google: how would you approach this problem without knowing the 100,000 most likely queries beforehand?

Questions
- Why might you wish to use graph centrality as a ranking mechanism for image retrieval? Why might you prefer to use a semantic hierarchy instead?
- (Open-ended) If you were a large search engine, how might you learn and deploy query-dependent feature representations of images? Could you also leverage the information in a semantic hierarchy?

References
- (Survey) Datta, Ritendra, Dhiraj Joshi, Jia Li, and James Z. Wang. “Image Retrieval: Ideas, Influences, and Trends of the New Age.” ACM Computing Surveys 40, no. 2 (May 2008): 5:1–5:60. doi:10.1145/1348246.1348248.
- Deng, Jia, Wei Dong, R. Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. “ImageNet: A Large-Scale Hierarchical Image Database.” In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), 248–55. doi:10.1109/CVPR.2009.5206848.
- Ghosh, P., S. Antani, L.R. Long, and G.R. Thoma. “Review of Medical Image Retrieval Systems and Future Directions.” In 24th International Symposium on Computer-Based Medical Systems (CBMS 2011), 1–6. doi:10.1109/CBMS.2011.5999142.
- Kurtz, Camille, Adrien Depeursinge, Sandy Napel, Christopher F. Beaulieu, and Daniel L. Rubin. “On Combining Image-Based and Ontological Semantic Dissimilarities for Medical Image Retrieval Applications.” Medical Image Analysis 18, no. 7 (October 2014): 1082–1100. doi:10.1016/j.media.2014.06.009.
- Siddiquie, B., R.S. Feris, and L.S. Davis. “Image Ranking and Retrieval Based on Multi-Attribute Queries.” In IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011), 801–8. doi:10.1109/CVPR.2011.5995329.
- Zhang, Hanwang, Zheng-Jun Zha, Yang Yang, Shuicheng Yan, Yue Gao, and Tat-Seng Chua. “Attribute-Augmented Semantic Hierarchy: Towards a Unified Framework for Content-Based Image Retrieval.” ACM Transactions on Multimedia Computing, Communications, and Applications 11, no. 1s (October 2014): 21:1–21:21. doi:10.1145/2637291.