The Use of Corpora for Automatic Evaluation of Grammar Inference Systems Andrew Roberts & Eric Atwell Corpus Linguistics ’03 – 29th March Computer Vision and Language Research Group, University of Leeds

Outline Overview of Grammar Inference Current evaluation methods: Manual vs Automatic Proposed advances Implementation issues Discussion Conclusion

Grammar Inference Task of automatically acquiring a type of grammar from a given text. Most interesting systems are those which are unsupervised, i.e., learn from raw, unannotated text. Examples include: ABL (van Zaanen, 2001) CLL (Watkinson, 2001) EMILE (Adriaans, 1992) GraSp (Henrichsen, 2002)

GI Applications Psychological/cognitive modelling ‘Innateness’ debate Data mining Looking in other datasets, such as bioinformatics Finding unique linguistic features that have not yet been discovered by experts

Grammar Inference (Bootstrapping Structure) Applies structure to a raw text, e.g., whilst analysing a given text sample, it will provide simple annotation using brackets: ((the cat) (sat (on (the mat))))
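
A bracketing like the one above is easy to manipulate programmatically. A minimal sketch (the function name is hypothetical, not from any of the GI systems listed) that reads such a bracketed string into a nested structure:

```python
def parse_brackets(s):
    """Parse a bracketed string like '((the cat) (sat (on (the mat))))'
    into nested Python lists of word tokens."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack of partially built constituents
    for tok in tokens:
        if tok == "(":
            stack.append([])          # open a new constituent
        elif tok == ")":
            done = stack.pop()        # close it and attach to its parent
            stack[-1].append(done)
        else:
            stack[-1].append(tok)     # a plain word
    return stack[0][0]

tree = parse_brackets("((the cat) (sat (on (the mat))))")
# tree == [['the', 'cat'], ['sat', ['on', ['the', 'mat']]]]
```

Representing GI output this way makes automatic comparison against other analyses straightforward.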

Grammar Inference (Rule Induction) Production of grammar rules, e.g., from a given text sample, will induce rules like: S → NP, NP → N VP, etc… To use the induced rules, simply plug them into a parser and apply them to the text, thus giving structure.
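
The "plug them into a parser" step can be sketched with a toy CKY recogniser. This is a sketch only: it assumes the induced rules have already been binarised, and the tiny grammar below is illustrative, not taken from any of the systems above.

```python
from collections import defaultdict

def cky_recognise(words, lexicon, rules):
    """Toy CKY recogniser. lexicon: word -> set of POS tags;
    rules: (B, C) -> set of parents A, one entry per binary rule A -> B C."""
    n = len(words)
    chart = defaultdict(set)
    # Base case: fill single-word cells from the lexicon.
    for i, w in enumerate(words):
        chart[(i, i + 1)] = set(lexicon.get(w, ()))
    # Combine adjacent constituents bottom-up.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b in chart[(i, k)]:
                    for c in chart[(k, j)]:
                        chart[(i, j)] |= rules.get((b, c), set())
    return "S" in chart[(0, n)]

# A toy induced grammar: S -> NP V, NP -> Det N
lexicon = {"the": {"Det"}, "cat": {"N"}, "sat": {"V"}}
rules = {("Det", "N"): {"NP"}, ("NP", "V"): {"S"}}
```

With this grammar, `cky_recognise(["the", "cat", "sat"], lexicon, rules)` succeeds, which is exactly the sense in which induced rules "give structure" to new text.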

Evaluating GI performance Ensuring they are doing the job correctly Monitoring improvements of a given system Comparing systems

Evaluation “Looks good to me” approach Expert linguist(s) analyse GI output for important linguistic features Advantages: Simple Resource efficient Reliable

Evaluation “Looks good to me” approach Disadvantages: Time consuming Relies on subjectivity of expert Possible bias Difficult to compare systems evaluated by experts Can only say which features do and do not exist – no quantitative measures

Automatic Evaluation Objectivity and consistency allow for easy comparison between competing GI systems Quicker – does not require an expert to perform time consuming analysis

Automatic Evaluation A ‘Gold Standard’ corpus Compare the results of the GI system with an existing treebank: 1. Make a copy of a treebank and remove all annotation. 2. Apply the GI system to the raw treebank text. 3. Compare the GI output with the original treebank. Advantages: Objective Automatic – the evaluator does not need to be an expert Evaluation results are comparable
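
Step 3, the comparison itself, is commonly done with unlabelled bracket precision and recall (PARSEVAL-style scoring). A minimal sketch over nested-list trees; the function names are illustrative:

```python
def spans(tree, start=0):
    """Return (set of constituent (start, end) spans, end word index)
    for a tree given as nested lists of words."""
    found, i = set(), start
    for child in tree:
        if isinstance(child, list):
            sub, i = spans(child, i)   # recurse into a sub-constituent
            found |= sub
        else:
            i += 1                     # a single word advances the index
    found.add((start, i))              # the constituent covering this node
    return found, i

def bracket_f1(gold_tree, system_tree):
    """Unlabelled bracket F1 between a gold parse and a system parse."""
    gold, _ = spans(gold_tree)
    system, _ = spans(system_tree)
    hits = len(gold & system)
    p, r = hits / len(system), hits / len(gold)
    return 2 * p * r / (p + r) if p + r else 0.0
```

A system that reproduces the treebank bracketing exactly scores 1.0; every bracket it places differently lowers precision, and every gold bracket it misses lowers recall.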

Automatic Evaluation A ‘Gold Standard’ corpus Disadvantages: Treebanks often restricted to particular domain Translation interface between GI output and treebank annotation Is different the same as being wrong?

Automatic Evaluation A Multi-annotated ‘Gold Standard’ corpus A single treebank that contains more than one set of annotations Each parse will be different, but valid Compare GI output with each parse Reduces the chance of a system being discriminated against just because it is different from the ‘Gold Standard’
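
Scoring against several valid annotations can be done by keeping the best match, so a system is not penalised merely for choosing a different but valid analysis. A sketch over precomputed constituent-span sets (all names hypothetical):

```python
def best_match_f1(system_spans, reference_span_sets):
    """F1 of the system's constituent spans against the closest of
    several reference annotations of the same sentence."""
    best = 0.0
    for ref in reference_span_sets:
        hits = len(system_spans & ref)
        p, r = hits / len(system_spans), hits / len(ref)
        f = 2 * p * r / (p + r) if p + r else 0.0
        best = max(best, f)            # keep the most charitable reference
    return best
```

Taking the maximum rather than the average reflects the idea that agreeing with any one valid analysis should count as success.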

Automatic Evaluation A Multi-annotated ‘Gold Standard’ corpus Problems: Different parsers will use their own annotation scheme. A ‘gold standard’ annotation scheme would be ideal, but is it feasible? What is a match?

Automatic Evaluation A Multi-annotated, multi-corpora ‘Gold Standard’ corpus A bit of a mouthful! Adds an extra dimension to the last advance by incorporating many corpora

Automatic Evaluation A Multi-annotated, multi-corpora ‘Gold Standard’ corpus Advantages: Importantly, this adds multiple genres Makes for a better, more general-purpose corpus Disadvantages: Adds yet more complexity Requires a lot of time & effort to create

Discussion Are automatic and manual approaches equally reliable? Researchers are likely to go for the simplest evaluation method Create a research project aimed solely at thorough automatic evaluation Build a black-box evaluation toolkit that can easily be added to the end of a GI pipeline

Conclusion Despite the critical slant, “Looks good to me” is not inferior In fact, the goal is to emulate it, but using corpus-based methods instead The current ‘Gold Standard’ approach is too basic, as it will favour systems that produce results similar to those of the standard itself Expand to create a richer corpus

Thank You! Thanks for attending this presentation Any Questions?