Semantic Evaluation of Machine Translation
Billy Wong, City University of Hong Kong
21st May 2010



Introduction
- Surface text similarity is not a reliable indicator in automatic MT evaluation
  - Insensitive to variation of translation
- Deeper linguistic analysis is preferred
- WordNet is widely used for matching synonyms
  - E.g. METEOR (Banerjee & Lavie 2005), TERp (Snover et al. 2009), ATEC (Wong & Kit 2010)…
- Is the similarity of words between MT outputs and references fully described?

Motivation
- WordNet
  - Granularity of sense distinctions is highly fine-grained
  - Word pairs not in the same sense:
    - [mom vs mother], [safeguard vs security], [expansion vs extension], [journey vs tour], [impact vs influence]… etc.
    - These word pairs have similar meanings
    - Problematic if they are ignored in evaluation
- What is needed is a word similarity measure
- Proposal:
  - Utilization of word similarity measures in automatic MT evaluation

Word Similarity Measures
- Knowledge-based (WordNet)
  - Wup (Wu & Palmer 1994)
  - Res (Resnik 1995)
  - Jcn (Jiang & Conrath 1997)
  - Hso (Hirst & St-Onge 1998)
  - Lch (Leacock & Chodorow 1998)
  - Lin (Lin 1998)
  - Lesk (Banerjee & Pedersen 2002)
- Corpus-based
  - LSA (Landauer et al. 1998)
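
As an illustration of how a knowledge-based measure works, the following is a minimal sketch of the Wu & Palmer (Wup) score, which relates two concepts through the depth of their lowest common subsumer (LCS): 2 * depth(LCS) / (depth(a) + depth(b)). The taxonomy below is a hypothetical toy hierarchy, not WordNet data, and the convention that the root has depth 1 is an assumption of this sketch.

```python
# Toy is-a taxonomy (child -> parent). Hypothetical data for illustration only;
# it is NOT taken from WordNet.
PARENT = {
    "entity": None,
    "person": "entity",
    "relative": "person",
    "parent": "relative",
    "mother": "parent",
    "father": "parent",
    "leader": "person",
    "chairman": "leader",
}


def path_to_root(node):
    """Return the chain [node, parent, ..., root]."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth in the taxonomy, counting the root as depth 1 (assumed convention)."""
    return len(path_to_root(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    for node in path_to_root(b):  # walks upward from b, so the first hit is the deepest
        if node in ancestors_a:
            return node
    raise ValueError("no common ancestor")


def wup_similarity(a, b):
    """Wu & Palmer style score: 2 * depth(LCS) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    print(wup_similarity("mother", "father"))    # siblings under "parent"
    print(wup_similarity("mother", "chairman"))  # only share "person"
```

In this toy hierarchy, the sibling pair scores higher than the pair that only meets near the root, which is the behaviour a similarity threshold would rely on.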

Experiment
- Three questions:
  - To what extent are two words considered similar?
  - Which word similarity measure(s) is/are more appropriate to use?
  - How much performance gain can an MT evaluation metric obtain by incorporating word similarity measures?

Setting
- Data
  - MetricsMATR08 development data
  - 1992 MT outputs
  - 8 MT systems
  - 4 references
- Evaluation metric
  - Unigram matching
    - Exact match / synonym / semantically similar
    - Same weight
  - Three variants
    - Precision (p), recall (r) and F-measure (f), where c: MT output, t: reference translation
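
The formulas for p, r and f are not preserved in this transcript, so the following is only a sketch of one plausible reading of the setting above: unigrams in the MT output c are matched one-to-one against the reference t, exact matches and sufficiently similar words count with the same weight, and f is taken to be the balanced harmonic mean of p and r. The function names, the toy similarity table and the threshold value are all hypothetical; synonym matches can be modelled by a similarity function that returns 1.0 for synonyms.

```python
# Hypothetical similarity scores for illustration; a real system would call
# a WordNet-based or corpus-based measure here.
TOY_SIM = {
    frozenset(("mom", "mother")): 0.9,
    frozenset(("journey", "tour")): 0.8,
}


def toy_similarity(a, b):
    """Symmetric lookup in the toy table; unknown pairs score 0.0."""
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def count_matches(candidate, reference, similarity, threshold):
    """One-to-one unigram matching: exact matches first, then similar words."""
    unused_ref = list(reference)
    leftover = []
    matches = 0
    for word in candidate:  # pass 1: exact matches
        if word in unused_ref:
            unused_ref.remove(word)
            matches += 1
        else:
            leftover.append(word)
    for word in leftover:  # pass 2: semantically similar matches
        for ref_word in unused_ref:
            if similarity(word, ref_word) >= threshold:
                unused_ref.remove(ref_word)
                matches += 1  # same weight as an exact match
                break
    return matches


def precision_recall_f(candidate, reference, similarity, threshold):
    """Return (p, r, f); f is the balanced harmonic mean (an assumption)."""
    m = count_matches(candidate, reference, similarity, threshold)
    p = m / len(candidate) if candidate else 0.0
    r = m / len(reference) if reference else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


if __name__ == "__main__":
    c = "my mom took a tour".split()
    t = "my mother took a long journey".split()
    print(precision_recall_f(c, t, toy_similarity, threshold=0.5))
    print(precision_recall_f(c, t, toy_similarity, threshold=1.0))  # exact only
```

Running the exact pass before the similarity pass keeps a similar word from consuming a reference token that a later candidate word would have matched exactly.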

Result (1)
- Correlation thresholds of each measure
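
The slide does not say how the thresholds were chosen or which correlation coefficient was used. One way such a threshold could be selected, sketched here under those caveats, is to sweep candidate thresholds, score every segment with the metric at each threshold, and keep the threshold whose scores correlate best with human judgments. Pearson correlation, the helper names and all numbers below are assumptions made for illustration, not results from the experiment.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_threshold(score_at, human_scores, thresholds):
    """score_at(threshold) must return one metric score per segment.

    Returns the threshold with the highest correlation, plus all correlations.
    """
    correlations = {t: pearson(score_at(t), human_scores) for t in thresholds}
    best = max(correlations, key=correlations.get)
    return best, correlations


if __name__ == "__main__":
    # Made-up human judgments and metric scores for four segments.
    human = [1.0, 2.0, 3.0, 4.0]
    toy_scores = {
        0.3: [0.9, 0.9, 0.9, 1.0],  # too permissive: almost everything matches
        0.6: [0.2, 0.4, 0.6, 0.8],
        0.9: [0.5, 0.4, 0.6, 0.5],  # too strict: similar words rarely match
    }
    best, correlations = best_threshold(lambda t: toy_scores[t], human, sorted(toy_scores))
    print(best, correlations)
```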

Result (2)
- Correlation of the metric

Conclusion
- The importance of semantically similar words in automatic MT evaluation
- Two word similarity measures, Wup and LSA, perform relatively better
- Remaining problems
  - Semantic similarity vs. semantic relatedness
    - E.g. [committee vs chairman] (LSA)
  - Most WordNet similarity measures run on verbs and nouns only

Thank you