Frequency Estimates for Statistical Word Similarity Measures Presenter: Cosmin Adrian Bejan Egidio Terra and C.L.A. Clarke School of Computer Science University.

Presentation transcript:

1 Frequency Estimates for Statistical Word Similarity Measures
Egidio Terra and C.L.A. Clarke, School of Computer Science, University of Waterloo
Presenter: Cosmin Adrian Bejan

2 Introduction
• A comparative study of two methods for estimating the word co-occurrence frequencies required by word similarity measures that are used to solve human-oriented language tests.
• Examples of such tests:
  • determine the best synonym in a set of alternatives $A = \{A_1, A_2, A_3, A_4\}$ for a specific target word $TW$ in a context $C = \{w'_1, w'_2, \ldots, w'_n\} \setminus TW$;
  • determine the best synonym when no context is available.
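
The context-free synonym test can be sketched as picking the alternative with the highest similarity to the target word. The function name `best_synonym`, the `similarity` callback, and the toy scores below are all hypothetical and invented for illustration; the slides do not prescribe an implementation.

```python
from typing import Callable, Sequence


def best_synonym(
    target: str,
    alternatives: Sequence[str],
    similarity: Callable[[str, str], float],
) -> str:
    """Return the alternative that scores highest against the target word."""
    if not alternatives:
        raise ValueError("at least one alternative is required")
    return max(alternatives, key=lambda alt: similarity(target, alt))


# Toy similarity scores, made up purely for this example.
toy_scores = {
    ("big", "large"): 2.1,
    ("big", "tall"): 0.9,
    ("big", "old"): 0.2,
    ("big", "red"): -0.3,
}


def toy_similarity(a: str, b: str) -> float:
    # Unknown pairs get the lowest possible score.
    return toy_scores.get((a, b), float("-inf"))


choice = best_synonym("big", ["old", "large", "red", "tall"], toy_similarity)
```

Any of the word-pair measures discussed in the presentation could be passed in as `similarity`; the selection step itself stays the same.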

3 Measuring Word Similarity
• The notion of co-occurrence of two words can be depicted by a contingency table:
  • each dimension represents a discrete random variable $W_i$ with range $A = \{w_i, \neg w_i\}$;
  • each cell represents the joint frequency, where $N_{max}$ is the maximum number of co-occurrences.
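
A minimal sketch of how such a 2×2 contingency table could be filled from three counts. It assumes the table total equals $N_{max}$, which is one reading of the slide; the function name and argument names are hypothetical.

```python
def contingency_table(f1: int, f2: int, f12: int, n_max: int) -> list[list[int]]:
    """Build a 2x2 table with rows (w1, not w1) and columns (w2, not w2).

    f1, f2 -- counts for w1 and w2 individually
    f12    -- count of w1 and w2 occurring together
    n_max  -- assumed total number of observations (table total)
    """
    if not 0 <= f12 <= min(f1, f2) or f1 + f2 - f12 > n_max:
        raise ValueError("inconsistent counts")
    return [
        [f12, f1 - f12],                    # w1 with w2, w1 without w2
        [f2 - f12, n_max - f1 - f2 + f12],  # w2 without w1, neither word
    ]


table = contingency_table(f1=30, f2=20, f12=10, n_max=100)
```

The four cells always sum to `n_max`, so the marginal and joint probabilities can be read off by dividing by that total.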

4 Similarity between two words
• Pointwise Mutual Information
• χ² test
• Likelihood ratio
• Average Mutual Information
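
The formulas on this slide did not survive in the transcript. As a hedged illustration, the sketch below computes the four measures from a 2×2 contingency table using their standard textbook definitions (PMI in bits, Pearson's χ², the log-likelihood ratio in its G² form, and mutual information averaged over all four cells). These may differ in detail from the definitions used in the presentation, and the function name is hypothetical.

```python
import math


def association_scores(table: list[list[float]]) -> dict[str, float]:
    """Textbook association scores for a 2x2 table [[n11, n12], [n21, n22]].

    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = llr = ami = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            expected = row_tot[i] * col_tot[j] / n  # count under independence
            chi2 += (observed - expected) ** 2 / expected
            if observed > 0:  # empty cells contribute nothing to the log terms
                llr += 2 * observed * math.log(observed / expected)
                ami += (observed / n) * math.log2(observed / expected)

    expected_11 = row_tot[0] * col_tot[0] / n
    pmi = math.log2(table[0][0] / expected_11) if table[0][0] > 0 else float("-inf")
    return {"pmi": pmi, "chi2": chi2, "llr": llr, "ami": ami}


scores = association_scores([[10, 20], [10, 60]])
independent = association_scores([[6, 24], [14, 56]])
```

With these definitions the G² statistic equals $2N \ln 2$ times the average mutual information in bits, so the two rank word pairs identically for a fixed table total.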

5 Context-supported similarity
• Cosine of Pointwise Mutual Information
• L1 norm
• Contextual Average Mutual Information
• Contextual Jensen-Shannon Divergence
• Pointwise Mutual Information of Multiple Words

6 Window-oriented approach
• $f_{w_i}$: frequency of $w_i$
• $f_{w_1,w_2}$: co-occurrence frequency of $w_1$ and $w_2$
• $N$: size of the corpus in words
• $P(w_i) = f_{w_i} / N$
• $f_{w_1,w_2}$ is estimated by the number of windows in which the two words co-occur.
• $N_{wt}$: number of windows of size $t$
• $P(w_1, w_2) = f_{w_1,w_2} / N_{wt}$
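
A minimal sketch of the window-oriented estimates on a tokenized corpus. It assumes windows slide one token at a time, so a corpus of $N$ tokens yields $N - t + 1$ windows of size $t$; the slide does not say how windows are placed, and the function name is hypothetical.

```python
def window_estimates(tokens: list[str], w1: str, w2: str, t: int):
    """Return (P(w1), P(w2), P(w1, w2)) using sliding windows of size t."""
    n = len(tokens)
    n_windows = n - t + 1  # assumed: one window starting at each position
    if t < 1 or n_windows < 1:
        raise ValueError("window size must be between 1 and the corpus length")

    # Count the windows that contain both words.
    f12 = 0
    for start in range(n_windows):
        window = tokens[start:start + t]
        if w1 in window and w2 in window:
            f12 += 1

    p1 = tokens.count(w1) / n   # P(w_i) = f_{w_i} / N
    p2 = tokens.count(w2) / n
    p12 = f12 / n_windows       # P(w_1, w_2) = f_{w_1,w_2} / N_wt
    return p1, p2, p12


p1, p2, p12 = window_estimates("a b c a d b".split(), "a", "b", t=2)
```

In this toy corpus only the first of the five two-token windows contains both words, so the joint estimate is 1/5.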

7 Document-oriented approach
• $df_{w_i}$: document frequency of a word $w_i$, i.e. the number of documents in which the word appears
• $D$: the number of documents
• $P(w_i) = df_{w_i} / D$
• $df_{w_1,w_2}$: co-occurrence frequency of two words, i.e. the number of documents in which both words occur
• $P(w_1, w_2) = df_{w_1,w_2} / D$
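
The document-oriented estimates can be sketched in the same way. The sketch assumes each document is already a list of tokens; the function name is hypothetical.

```python
def document_estimates(documents: list[list[str]], w1: str, w2: str):
    """Return (P(w1), P(w2), P(w1, w2)) from document frequencies."""
    d = len(documents)
    if d == 0:
        raise ValueError("at least one document is required")

    # A word counts once per document, however often it repeats there.
    vocab_per_doc = [set(doc) for doc in documents]
    df1 = sum(1 for vocab in vocab_per_doc if w1 in vocab)
    df2 = sum(1 for vocab in vocab_per_doc if w2 in vocab)
    df12 = sum(1 for vocab in vocab_per_doc if w1 in vocab and w2 in vocab)

    return df1 / d, df2 / d, df12 / d


docs = [["a", "b"], ["a"], ["b", "c"], ["a", "b", "b"]]
dp1, dp2, dp12 = document_estimates(docs, "a", "b")
```

Unlike the window-oriented counts, these estimates ignore how far apart the two words are inside a document.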

8 Results for TOEFL test set

9 Results for TS1 and context