Predicting MT Fluency from IE Precision and Recall
Tony Hartley, Brighton, UK
Martin Rajman, EPFL, CH

Hartley & Rajman, ISLE Workshop, ISSCO, April

ISLE Keywords
- Quality of output text as a whole
  - Fluency (itself a predictor of fidelity)
  - Utility of output for IE
- Automation
  - F-score, Precision, Recall

Data Sources
- Fluency
  - DARPA 94 scores for French => English
  - 100 source texts * (1 HT + 5 MT)
- IE
  - Output from the SRI Highlight engine

Data Sampling
- Select 3 'systems'
  - Human expert -- mean fluency:
  - S1 -- highest mean fluency:
  - S2 -- lowest mean fluency:
- Cover 'best to worst' for each
  - Aim -- select 20 texts
  - Actual -- texts

IE Task

IE Output

IE Scoring
- Manual, but using simple, objective rules
- 'Gold Standard' given by expert translation
  - number of cells filled
- 'Strict' match = identity
- 'Relaxed' match = near miss
  - quarter :: third school quarter
  - hoped :: wished
  - Airbus Industrie :: Airbus industry
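The two matching regimes can be sketched as follows; the token-overlap test for a 'near miss' is an illustrative assumption, since the actual relaxed matching was applied manually:

```python
def strict_match(candidate: str, reference: str) -> bool:
    """'Strict' match: the extracted cell is identical to the reference
    cell (ignoring case and surrounding whitespace)."""
    return candidate.strip().lower() == reference.strip().lower()


def relaxed_match(candidate: str, reference: str) -> bool:
    """'Relaxed' match: identity, or a near miss. As one possible
    criterion, count it a near miss when every token of the shorter
    value appears in the longer one (catches
    'quarter' :: 'third school quarter')."""
    if strict_match(candidate, reference):
        return True
    a = set(candidate.lower().split())
    b = set(reference.lower().split())
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    return bool(shorter) and shorter <= longer
```

Note that this simple token rule would not catch the slide's other examples ('hoped' :: 'wished', 'Airbus Industrie' :: 'Airbus industry'), which is one reason the original scoring was done by hand.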

Precision and Recall
- Precision P = correct cells / actual cells
- Recall R = correct cells / reference cells
- F-score = 2 * P * R / (P + R)
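A minimal sketch of these formulas from raw cell counts, where correct = cells filled correctly, actual = cells the IE system filled at all, and reference = cells filled in the Gold Standard:

```python
def precision_recall_f(correct: int, actual: int, reference: int):
    """Compute Precision, Recall and F-score from IE cell counts.

    correct   -- cells filled correctly by the system
    actual    -- cells the system filled at all
    reference -- cells filled in the Gold Standard
    """
    p = correct / actual if actual else 0.0
    r = correct / reference if reference else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f

# e.g. 6 correct out of 8 filled, against 12 reference cells:
# P = 0.75, R = 0.5, F = 0.6
```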

Correlations (not): F-score

Correlations (not): Precision

Correlations (not): Recall
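The correlations behind these three slides (the original scatter plots are not reproduced here) are of the kind sketched below: a Pearson coefficient between per-text fluency ratings and per-text IE scores. The data values are purely illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative only: fluency ratings vs F-scores for five texts
fluency = [2.1, 2.8, 1.5, 3.0, 2.4]
f_scores = [0.40, 0.55, 0.30, 0.60, 0.45]
r = pearson(fluency, f_scores)  # close to 1.0 for these made-up values
```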

Conclusions
- A significant correlation is observed for S2 (for both precision and recall)
- The same is not true for S1
- For S1, the correlation is higher for the F-score than for the P and R scores

Observations on translation data
- MT yields structure preservation by default
- Does the 'free-ness' of the expert translation tarnish the Gold Standard?
  - Finite verb in ST => non-finite verb in TT (IE under-retrieves)
  - Nominalisation in ST => finite verb in TT (IE over-retrieves)

Observations on IE data
- Poor performance of Highlight
  - particularly on expert translations?
- Long distances between Entities and Relator
  - post-modification of NP
  - parenthetical material

Next Steps
- Correlate with Adequacy rather than Fluency?
- Use the whole data set (was S2 a bad choice?)
- Focus on extracted doubles and triples rather than single cells?
- Use another IE engine?
- Use the ST to establish the Gold Standard?