Word sense disambiguation with pictures
Kobus Barnard, Matthew Johnson
Presented by Milan Iliev

Overview
● Why disambiguate?
● Where to disambiguate?
● How to disambiguate?

Why disambiguate?
Disambiguation is one of the biggest problems in natural language processing, and natural language processing itself is very important.

Why disambiguate with pictures?
● Contextual disambiguation often does little better than choosing the most common sense.
● Contextual disambiguation is sometimes difficult, even impossible: 'He eats river by the bank'

Where could picture disambiguation be used?
● In image-containing websites
● In vision-capable robots [catch the ball]
● In text supplemented with media, such as encyclopedias

Modes of disambiguation
● Image-only: standalone image disambiguation with no available contextual information.
● Image-enhanced textual disambiguation: available document text and/or additional text domain, plus image data.
● Fall back on text-only disambiguation when no image data is available.

The Core: Image-based Word Prediction Algorithm
A new method for predicting words from images, based on a statistical model of the joint probability distribution of words and image region features. The model is trained on large sets of images with associated text. Caveat: the large vocabulary and the fuzziness of region features make for a high error rate.

The Core, continued
However, when disambiguating we can limit the possible words to the senses of the word being disambiguated. With a small number of word choices, the error rate is low. Also: where do all these images with associated words come from? Why, the Corel image database, of course.
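A minimal sketch of this restriction step in Python, assuming we already have per-word scores from some image-to-word predictor; the score dictionary and the sense-tagged labels below are hypothetical illustrations, not from the paper:

```python
# Sketch: restrict image-based word prediction to the senses of one ambiguous word.
# The score dictionary and sense labels below are hypothetical illustrations.

def disambiguate_with_image(image_word_scores, sense_labels):
    """Keep only the scores for the candidate senses and renormalize."""
    scores = {s: image_word_scores.get(s, 0.0) for s in sense_labels}
    total = sum(scores.values())
    if total == 0.0:
        # No evidence from the image: fall back to a uniform distribution.
        return {s: 1.0 / len(sense_labels) for s in sense_labels}
    return {s: v / total for s, v in scores.items()}

# Example: scores an image model might assign to sense-tagged words.
scores = {"bank_1_river": 0.04, "bank_2_money": 0.002, "tree": 0.3, "water": 0.5}
print(disambiguate_with_image(scores, ["bank_1_river", "bank_2_money"]))
# -> the river sense dominates because the image evidence favors it
```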

Textual context disambiguation
Assumption: the word to be disambiguated is semantically linked to other words in its context. Approach: statistical analysis of co-occurrences. For example, 'flop' meaning 'fail' often occurs near the words 'attempt', 'disaster', 'genius', etc., while 'flop' meaning 'floating point operations per second' occurs near 'gigahertz', 'PowerPC', 'transistor', etc.
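A minimal sketch of this kind of co-occurrence scoring; the counts below are invented purely for illustration and are not the paper's data:

```python
# Sketch: score each sense of 'flop' by how strongly its known co-occurring
# words overlap with the context. The co-occurrence counts are invented.

COOCCURRENCE = {
    "flop_failure": {"attempt": 12, "disaster": 9, "genius": 4},
    "flop_flops":   {"gigahertz": 11, "powerpc": 7, "transistor": 6},
}

def score_senses(context_words, cooccurrence=COOCCURRENCE):
    """Sum co-occurrence counts of context words for each sense, then normalize."""
    raw = {
        sense: sum(counts.get(w.lower(), 0) for w in context_words)
        for sense, counts in cooccurrence.items()
    }
    total = sum(raw.values()) or 1.0
    return {sense: value / total for sense, value in raw.items()}

print(score_senses(["The", "new", "PowerPC", "chip", "hits", "a", "gigahertz"]))
# -> the 'floating point operations per second' sense wins
```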

Who's got all that data? WordNet.
WordNet is a machine-readable dictionary covering a large portion of the English language (about 152,000 words), organized into synsets that are most commonly linked by the 'hypernym' relationship ('A is a B'). In addition, 'sense numbers' indicate which sense of a word is most frequent, which is obviously very useful for disambiguation.
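For example, WordNet can be queried through NLTK's wordnet interface (the choice of NLTK here is ours, not the paper's):

```python
# Querying WordNet with NLTK: synsets, sense order, and hypernyms.
# Requires: pip install nltk; then nltk.download('wordnet')
from nltk.corpus import wordnet as wn

# Synsets for 'bank' come back ordered, with the most frequent sense first.
for i, synset in enumerate(wn.synsets("bank", pos=wn.NOUN)[:3], start=1):
    print(i, synset.name(), "-", synset.definition())

# Hypernym chain ('A is a B') for the financial-institution sense.
bank = wn.synset("bank.n.02")
print([h.name() for h in bank.hypernyms()])
```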

Linking Images to Words

Images to Words: Criteria
● Size (percentage of image pixels covered by the region)
● Position (region center of mass, relative to the image)
● Color (average and std. dev. over each of R, G, and B)
● Texture (average and variance of filter responses; Gaussian filters are involved)
● Shape (area/perimeter ratio, center of mass/moment of inertia, convexity ratio)
● Color context (adjacent colors, 90 degrees apart)
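A simplified sketch of a few of these features for one segmented region, using NumPy; the exact feature definitions here are stand-ins, not the paper's implementation:

```python
# Sketch: simplified versions of a few region features (size, position, color).
# 'image' is an H x W x 3 RGB array; 'mask' is a boolean H x W array marking the region.
import numpy as np

def region_features(image, mask):
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    pixels = image[mask]                       # N x 3 array of region RGB values

    size = mask.sum() / (h * w)                # fraction of the image covered by the region
    position = (ys.mean() / h, xs.mean() / w)  # center of mass, relative to image size
    color_mean = pixels.mean(axis=0)           # average R, G, B
    color_std = pixels.std(axis=0)             # std. dev. of R, G, B

    return {"size": size, "position": position,
            "color_mean": color_mean, "color_std": color_std}

# Toy example: a 4x4 image with a 2x2 bright-red region in the top-left corner.
img = np.zeros((4, 4, 3))
img[:2, :2] = [1.0, 0.0, 0.0]
msk = np.zeros((4, 4), dtype=bool)
msk[:2, :2] = True
print(region_features(img, msk))
```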

The Formula
The hidden variables are called 'concepts'; they generate both words and blobs (region feature vectors), and several concepts can be present in one image. Assuming some independence, the joint probability is a sum over concepts:

P(word, blob) = Σ_concepts P(concept) · P(word | concept) · P(blob | concept)

where P(word | concept) is a frequency table and P(blob | concept) is a Gaussian distribution over region features.
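A toy version of this sum over concepts, with each concept holding a prior, a word frequency table, and a Gaussian over a 2-D region feature vector; all numbers are invented for illustration:

```python
# Sketch: P(word, blob) as a sum over hidden concepts of
#   P(concept) * P(word | concept) * P(blob | concept),
# with P(word | concept) a frequency table and P(blob | concept) a Gaussian.
import numpy as np
from scipy.stats import multivariate_normal

# Invented toy model with two concepts and a 2-D blob feature vector.
concepts = [
    {"prior": 0.6,
     "word_freq": {"water": 0.5, "bank_1_river": 0.3, "tree": 0.2},
     "mean": np.array([0.2, 0.8]), "cov": np.eye(2) * 0.05},
    {"prior": 0.4,
     "word_freq": {"building": 0.6, "bank_2_money": 0.4},
     "mean": np.array([0.7, 0.3]), "cov": np.eye(2) * 0.05},
]

def p_word_blob(word, blob, concepts=concepts):
    """Joint probability of a word and a region feature vector under the mixture."""
    return sum(
        c["prior"]
        * c["word_freq"].get(word, 0.0)
        * multivariate_normal.pdf(blob, mean=c["mean"], cov=c["cov"])
        for c in concepts
    )

blob = np.array([0.25, 0.75])   # a region that looks 'watery' under the toy model
print(p_word_blob("bank_1_river", blob), p_word_blob("bank_2_money", blob))
```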

Finally, The Point: Image Disambiguation
Assumptions: humans have a disambiguated vocabulary. For every word w (like 'bank') in our 'normal' vocabulary W, there are a number of senses s1, s2, etc. in our 'disambiguated' vocabulary S (bank_1, bank_2, etc.). From the image model we get a posterior probability P( s | w, B ), where s is a sense, w is the word, and B is the image context. On demand, this is combined with the textual context posterior P( s | w, W ):

P( s | w, B, W ) = c · P( s | w, B ) · P( s | w, W )
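A minimal sketch of this combination step, multiplying the image-based and text-based sense posteriors and recovering c by normalizing over the senses; the probabilities below are invented:

```python
# Sketch: P(s | w, B, W) = c * P(s | w, B) * P(s | w, W),
# where c is just the normalizing constant over all senses of the word.

def combine_posteriors(p_image, p_text):
    """Multiply image-based and text-based sense posteriors and renormalize."""
    senses = p_image.keys() & p_text.keys()
    product = {s: p_image[s] * p_text[s] for s in senses}
    total = sum(product.values()) or 1.0
    return {s: v / total for s, v in product.items()}

# Invented example for 'bank':
p_image = {"bank_1_river": 0.8, "bank_2_money": 0.2}   # image shows water and trees
p_text  = {"bank_1_river": 0.4, "bank_2_money": 0.6}   # text slightly favors the money sense
print(combine_posteriors(p_image, p_text))
# -> the image evidence tips the combined posterior toward the river sense
```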

And then, they trained: Building ImCor
The Corel image DB was not very ambiguous, so the researchers built an image-to-passage linked database, much like illustrated news articles or websites:
● Modified the SMUaw / SenseLearner textual algorithm for more softness/fluidity
● Added image data
● Asked humans to rate the appropriateness of images to passages of text
● Marked similar passages with the same keywords

Results
The performance tests indicated that, for a small, friendly domain, pure image-based disambiguation exceeded the performance of two text-based algorithms, and combined image-and-text disambiguation provided further improvement.