Towards the automatic identification of adjectival scales: clustering adjectives according to meaning Authors: Vasileios Hatzivassiloglou and Kathleen R. McKeown.


Towards the automatic identification of adjectival scales: clustering adjectives according to meaning Authors: Vasileios Hatzivassiloglou and Kathleen R. McKeown Presenter: Marian Olteanu

Introduction
- Group adjectives according to their meaning
  - Semantic relatedness: holds between adjectives that describe the same property
- Goal
  - Adjectival scales
- Method
  - Statistical
  - Augmented with linguistic information derived from the corpus

Adjectival scales
- Linguistic scale: a set of words of the same grammatical category that can be ordered by their semantic strength or degree of informativeness
- Example: lukewarm, warm, hot
- Adjectives: the elements of a scale can be partitioned into two groups, each of which is totally ordered
  - Negative and positive degrees

Adjectival scales
- Tests for acceptance
  - Horn: "x even y"
  - Data sparseness: such patterns are infrequent in real corpora
  - Scales vary across domains

Methodology
- Four stages
  - Extract linguistic data from the parsed corpus as word pairs
    - The data are processed by a morphological component that groups together similar pairs
  - Independent similarity modules, each producing a number between 0 and 1

Methodology
- Four stages (cont.)
  - A module that combines all the similarity measures into one dissimilarity measure
  - A module that clusters adjectives into groups based on the dissimilarity measure
- Linguistic data
  - Data indicating that adjectives are related: adjective-noun pairs
  - Data indicating that adjectives are unrelated: adjective-adjective pairs

Methodology
- Adjective-noun pairs
  - Distribution of nouns and their adjective modifiers
  - Expectation: similar adjectives tend to modify the same set of nouns
- Adjective-adjective pairs
  - Adjectives that describe the same property do not appear in the same minimal NP
    - Antithetical: hot cold, red black
    - Non-antithetical: hot warm
    - Adjectives that modify each other: light blue shirt

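Computing similarity between adjectives
- Adjective-noun pairs
  - Robust non-parametric method: Kendall's τ coefficient for two random variables with paired observations
    - (X_i, Y_i) and (X_j, Y_j): two pairs of observations for adjectives X and Y on the nouns i and j
    - Concordant if X_i > X_j and Y_i > Y_j, or X_i < X_j and Y_i < Y_j
    - Discordant if X_i > X_j and Y_i < Y_j, or X_i < X_j and Y_i > Y_j
    - τ = p_c - p_d, where p_c and p_d are the probabilities that two pairs of observations are concordant and discordant
    - Unbiased estimator: (number of concordant pairs - number of discordant pairs) / (n(n - 1)/2), for n observations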
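The concordant/discordant definitions above can be turned into a small estimator. The sketch below is illustrative only: the function name `kendall_tau` and the noun counts are made up, and tied pairs are simply counted as neither concordant nor discordant, which is an assumption the slide does not address.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Estimate Kendall's tau from paired observations (x[i], y[i])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of the product says whether both variables move the same way.
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 is a tie: counted as neither (assumption).
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical counts of how often each adjective modifies five nouns.
hot = [12, 7, 0, 3, 1]
warm = [9, 5, 1, 2, 0]
cold = [0, 1, 8, 2, 6]

tau_hot_warm = kendall_tau(hot, warm)
tau_hot_cold = kendall_tau(hot, cold)
```

With these invented counts, `hot` and `warm` rank the nouns almost identically, so their τ is close to 1, while `hot` and `cold` rank them in nearly opposite order.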

Methodology
- Adjective-adjective pairs
  - Reject pairs that occur in the same NP
  - High accuracy, low coverage
- Combining similarity estimates
  - If the pair was rejected by the adjective-adjective module: dissimilarity = k (usually 10)
  - Else: dissimilarity = 1 - (similarity from the adjective-noun module)
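The combination rule on this slide is a two-way case split. A minimal sketch follows, assuming similarities are already scaled to the range 0 to 1 and stored per unordered pair; the names `dissimilarity`, `similarity` and `rejected_pairs` and the sample values are hypothetical.

```python
def dissimilarity(adj_a, adj_b, similarity, rejected_pairs, k=10.0):
    """Combine the two modules into one dissimilarity value.

    similarity: dict mapping frozenset({a, b}) to a score in [0, 1]
                (adjective-noun module, assumed precomputed).
    rejected_pairs: set of frozenset({a, b}) seen together in one NP
                    (adjective-adjective module).
    """
    pair = frozenset((adj_a, adj_b))
    if pair in rejected_pairs:
        # A rejected pair gets the fixed penalty k, whatever its similarity.
        return k
    return 1.0 - similarity[pair]


# Hypothetical module outputs.
similarity = {
    frozenset(("hot", "warm")): 0.8,
    frozenset(("light", "blue")): 0.6,
}
rejected_pairs = {frozenset(("light", "blue"))}

d_hot_warm = dissimilarity("hot", "warm", similarity, rejected_pairs)
d_light_blue = dissimilarity("light", "blue", similarity, rejected_pairs)
```

Because k is far larger than any value of 1 - similarity, a rejected pair is effectively barred from sharing a cluster even when its noun distributions look similar.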

Clustering the adjectives
- Goal: optimal partition
- Algorithm
  - Non-hierarchical
  - Number of clusters: input parameter
  - Exchange method
    - K-means is not applicable
  - Minimizes the objective function Φ

Clustering the adjectives
- Algorithm (cont.)
  - Start from a random partition
  - Compute the improvement obtained by moving an adjective to a different cluster
  - Hill-climbing
    - Can get stuck in local minima
    - Remedy: call the algorithm multiple times with different starting positions
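The exchange method with random restarts can be sketched as follows. The slides do not define Φ, so the objective used here (for each cluster, the sum of pairwise dissimilarities divided by the cluster size) is an assumption, as are the function names and the toy dissimilarity values.

```python
import random


def phi(clusters, d):
    """Assumed objective: per-cluster pairwise dissimilarity sum over cluster size."""
    total = 0.0
    for members in clusters:
        if len(members) > 1:
            pair_sum = sum(
                d(a, b) for i, a in enumerate(members) for b in members[i + 1:]
            )
            total += pair_sum / len(members)
    return total


def to_clusters(assign, n_clusters):
    clusters = [[] for _ in range(n_clusters)]
    for item, index in assign.items():
        clusters[index].append(item)
    return clusters


def exchange_cluster(items, d, n_clusters, restarts=20, seed=0):
    """Hill-climb by moving one item at a time; keep the best of several restarts."""
    rng = random.Random(seed)
    best_score, best_assign = float("inf"), None
    for _ in range(restarts):
        assign = {item: rng.randrange(n_clusters) for item in items}
        score = phi(to_clusters(assign, n_clusters), d)
        improved = True
        while improved:
            improved = False
            for item in items:
                home = assign[item]
                for target in range(n_clusters):
                    if target == home:
                        continue
                    assign[item] = target  # try the move
                    trial = phi(to_clusters(assign, n_clusters), d)
                    if trial < score - 1e-12:
                        score, home, improved = trial, target, True
                assign[item] = home  # keep the best cluster found for this item
        if score < best_score:
            best_score, best_assign = score, dict(assign)
    return to_clusters(best_assign, n_clusters), best_score


# Toy data: two hypothetical properties with low within-group dissimilarity.
GROUP = {"hot": "temp", "warm": "temp", "cold": "temp",
         "big": "size", "large": "size", "small": "size"}


def toy_d(a, b):
    return 0.1 if GROUP[a] == GROUP[b] else 0.9


clusters, score = exchange_cluster(list(GROUP), toy_d, n_clusters=2)
```

Recomputing Φ from scratch for every trial move keeps the sketch short; a real implementation would update the objective incrementally for the two affected clusters.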

Results

- Clusters #5 and #8: adjectives that indicate size
- Clustering discourages large clusters
  - Cluster #6: 5 words
- Methods to increase the number of pairs
  - Larger corpus
  - More syntactic patterns

Evaluation
- 9 human judges manually created partitions (6 to 11 clusters)
- "Cross-validation" for human judges: F-measure of 49% to 59%

Evaluation
- Lower bound
  - Monte Carlo analysis
  - F-measure: 1 in 20,000 trials
  - Fallout: 4.9%
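One way to picture the Monte Carlo lower bound is to score many random partitions against a reference partition. The slides do not say how F-measure and fallout are computed, so the pair-based scoring below (a pair counts as positive when both adjectives share a cluster) is an assumption; all names and the toy reference partition are hypothetical.

```python
import random
from itertools import combinations


def same_cluster_pairs(partition):
    return {frozenset(p) for cluster in partition for p in combinations(cluster, 2)}


def pairwise_scores(system, reference):
    """Precision, recall, F-measure and fallout over pairs of items (assumed scheme)."""
    items = sorted(x for cluster in reference for x in cluster)
    all_pairs = {frozenset(p) for p in combinations(items, 2)}
    sys_pairs = same_cluster_pairs(system)
    ref_pairs = same_cluster_pairs(reference)
    true_pos = len(sys_pairs & ref_pairs)
    false_pos = len(sys_pairs - ref_pairs)
    negatives = len(all_pairs - ref_pairs)
    precision = true_pos / len(sys_pairs) if sys_pairs else 0.0
    recall = true_pos / len(ref_pairs) if ref_pairs else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fallout = false_pos / negatives if negatives else 0.0
    return {"precision": precision, "recall": recall, "f": f, "fallout": fallout}


def random_partition(items, n_clusters, rng):
    clusters = [[] for _ in range(n_clusters)]
    for item in items:
        clusters[rng.randrange(n_clusters)].append(item)
    return clusters


def monte_carlo_f(reference, n_clusters, trials=1000, seed=0):
    """F-measures of randomly generated partitions, as a chance-level baseline."""
    rng = random.Random(seed)
    items = [x for cluster in reference for x in cluster]
    return [
        pairwise_scores(random_partition(items, n_clusters, rng), reference)["f"]
        for _ in range(trials)
    ]


reference = [["hot", "warm", "cold"], ["big", "large", "small"]]
system = [["hot", "warm"], ["cold", "big"], ["large", "small"]]
scores = pairwise_scores(system, reference)
baseline = monte_carlo_f(reference, n_clusters=2, trials=200)
```

Comparing a system's F-measure with the distribution in `baseline` shows how often chance alone would match it, which is the kind of statement a Monte Carlo analysis supports.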