Unsupervised Learning of Visual Sense Models for Polysemous Words


Unsupervised Learning of Visual Sense Models for Polysemous Words (a paper by Kate Saenko and Trevor Darrell), presented by David Arredondo

Polysemy: A Problem
Many English words have multiple meanings. In particular, many are "visually" polysemous: their sense differences can be seen in images. The paper analyzes five polysemous words: "bass", "face", "mouse", "speaker", and "watch".

Method and Model: Intuition
The paper reranks retrieved images as a preprocessing step for classification. The basis of the reranking is the probability of each image's "sense", derived from the text surrounding the image link. Unfortunately, the text surrounding an image link is often sparse and low-quality, so it is supplemented with text-only web pages from a regular web search. Latent topics are then learned from this bag-of-words representation using Latent Dirichlet Allocation (LDA). LDA discovers hidden topics, i.e., distributions over discrete data, using a Bayesian formulation.

Method and Model: LDA for Topics
The paper uses the standard LDA graphical model. Each document is a mixture of K topics z. Each document is treated as a bag of N words, assumed to be generated from the product of two multinomial distributions with Dirichlet priors. Here w denotes a word, and phi and theta are the distributions over words (per topic) and over topics (per document), respectively. Note that only the document's topic distribution theta is used to rank the images by sense.
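
The topic-fitting step described above can be sketched with scikit-learn's LDA implementation standing in for whatever the authors used; the four-document corpus and K=2 are invented toy values (the paper uses web text and K=8):

```python
# Fit LDA over bag-of-words text contexts; theta rows give P(z|d),
# phi rows give P(w|z). Toy corpus for two senses of "bass".
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "bass guitar amp pickup string",
    "bass fishing lake lure rod",
    "sea bass recipe fillet grill lake",
    "bass drum beat rhythm string",
]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)           # document-word count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)                 # P(z|d), one row per document
phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)  # P(w|z)
```

Each row of `theta` is the per-document topic mixture that the method later uses for sense ranking.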

Method and Model: Dictionary Sense
To rank images by sense, the authors compute the probability of a sense given a topic by treating each dictionary entry in WordNet, plus synonyms ("pitch" for "bass"), any hyponyms, and first-level hypernyms ("sound property" for "bass"), as a bag of words. A sense s is scored under a topic z by summing the probabilities of the words in its bag: P(s|z) proportional to the sum of P(w|z) over words w in the bag. With d as the text associated with a web image, the probability of a sense is then P(s|d) = sum over z of P(s|z) P(z|d), where P(z|d) is the document's topic distribution inferred by LDA. They also normalize for the length of the text context.
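
A minimal numeric sketch of this sense-ranking step, with an invented vocabulary, made-up topic distributions, and toy WordNet-style sense bags (none of these numbers come from the paper):

```python
import numpy as np

vocab = ["guitar", "string", "pitch", "fish", "lake", "lure"]
# phi[z, w] = P(w|z) for K=2 topics (made-up values for illustration)
phi = np.array([
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],   # a "music"-like topic
    [0.0, 0.0, 0.0, 0.4, 0.3, 0.3],   # a "fishing"-like topic
])

# WordNet-style bags of words for two senses of "bass"
sense_bags = {
    "bass/music":   ["guitar", "string", "pitch"],
    "bass/fishing": ["fish", "lake", "lure"],
}
senses = list(sense_bags)

# Score each sense under each topic by summing P(w|z) over its bag,
# then normalize across senses to get P(s|z).
scores = np.array([[phi[z, [vocab.index(w) for w in sense_bags[s]]].sum()
                    for s in senses] for z in range(len(phi))])
p_s_given_z = scores / scores.sum(axis=1, keepdims=True)

theta_d = np.array([0.9, 0.1])        # P(z|d): topic mixture of one image's text
p_s_given_d = theta_d @ p_s_given_z   # P(s|d) = sum_z P(z|d) P(s|z)
```

An image whose text context leans toward the "music" topic ends up ranked highest under the musical sense of "bass".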

Visual Sense Model
The full model uses the images ranked by the probability of a given sense. In particular, it takes the N highest-ranked images as positive training data for the classifier of choice, an RBF-kernel SVM. A baseline model was also run: it uses the images returned by searching for a sense, the sense plus its synonyms, and the sense plus its first-level hypernyms ("mouse", "computer mouse", "mouse electronic device"). These images are then fed to the same SVM.
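
The classifier stage can be sketched as below; the random features are stand-ins for the paper's bag-of-visual-words image representations, and scikit-learn's SVC substitutes for whatever SVM implementation the authors used:

```python
# Train an RBF-kernel SVM: positives are the N top-ranked images for a
# sense, negatives come from other classes. Features are synthetic.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_pos = rng.normal(loc=1.0, size=(50, 20))    # N highest-ranked images
X_neg = rng.normal(loc=-1.0, size=(50, 20))   # negative-class images
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 50 + [0] * 50)

clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
acc = clf.score(X, y)                         # training accuracy
```

On such well-separated synthetic data the SVM fits the training set almost perfectly; real bag-of-visual-words features are far noisier.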

Data
Keyword data was collected via Yahoo Image Search with the keywords "bass", "face", "speaker", "mouse", and "watch". The images were human-labeled as 0 (unrelated), 1 (partial), or 2 (good), but only for testing.

Features
Text data is pruned of all HTML tags, words that appear only once, stop words, and the actual query word; a Porter stemmer is then applied. For image features, all images are first resized to 300x300 pixels and converted to grayscale. Both edge features (Canny edge detector) and scale-invariant salient points (Harris-Laplace) were used, with a 128-dimensional SIFT descriptor describing the area around each interest point.
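
The text-cleaning pipeline above can be sketched with the standard library; the stop-word list is a placeholder, stemming is omitted (the paper applies a Porter stemmer), and singleton removal here is per-document rather than corpus-wide:

```python
# Strip HTML tags, drop stop words and the query word, and remove
# words that occur only once, as described in the Features slide.
import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}

def clean(html, query):
    text = re.sub(r"<[^>]+>", " ", html).lower()   # remove HTML tags
    words = re.findall(r"[a-z]+", text)
    words = [w for w in words if w not in STOP_WORDS and w != query]
    counts = Counter(words)
    return [w for w in words if counts[w] > 1]     # drop singleton words

tokens = clean("<p>The bass guitar and the guitar amp of a bass</p>", "bass")
```

Here `tokens` keeps only "guitar" (it appears twice); "amp" is dropped as a singleton and "bass" as the query word.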

Experiments: Running LDA
The number of topics K is set to 8, which roughly matches the average number of senses per keyword.

Classification
1-SENSE uses only the ground truth of the other objects as the negative class. MIX-SENSE additionally includes the non-applicable senses of the given keyword in the negative class.

Critique and Discussion
As stated in the paper, not all senses have a clear visual interpretation; this might explain the varying degree of improvement over the baseline across keywords. The authors use results from only one search engine, and they did not use results from Google, the dominant engine in the industry. Applying and comparing their method across multiple search engines would have more thoroughly grounded the relevance of the work. Future work could explore other unsupervised learning algorithms, such as Non-negative Matrix Factorization (NMF), or apply the same idea with modern search-engine results and RNNs as classifiers.