PROJECT UPDATE PRESENTATION
CSI 5386, FALL 2012
Title: Exploring Natural Language Parsing Techniques and Evaluations
PRESENTED BY: PEAR A BHUIYAN

OBJECTIVE OF THE PROJECT
To learn some parsing techniques and evaluation methods by studying research papers:
- Collect research papers and select three substantial papers that describe parsing techniques or evaluation methods.
- Study those papers to learn the parsing techniques and evaluation methods they present.
- Summarize them and write a composite report discussing their strengths and weaknesses.

INTRODUCTION
Parsing is the task of recognizing an input string and assigning a structure to it. It is useful in applications such as grammar checking and information extraction, and as an intermediate stage of semantic analysis. We have learned some parsing algorithms in this course. The goal of this project is to learn other parsing techniques by finding and studying research papers on parsing, and to write a review report on them. To that end, I have collected 9 research papers on parsing and chosen 3 substantial papers to study for the composite review report. At this point it is hard to summarize them, as more study is needed to grasp their contents, so for now we look at each paper individually.
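As a toy illustration of what "assigning a structure" means (my own example, not taken from the papers): the flat input string is mapped to a nested tree, which can be printed in Penn-Treebank-style bracket notation.

```python
sentence = "the dog saw a cat"

# One possible constituency structure, written as nested tuples of the
# form (label, children...); a preterminal's single child is the word.
parse = ("S",
         ("NP", ("DT", "the"), ("NN", "dog")),
         ("VP", ("VBD", "saw"),
                ("NP", ("DT", "a"), ("NN", "cat"))))

def to_bracketed(node):
    """Render the tree in Penn-Treebank-style bracket notation."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "(%s %s)" % (label, " ".join(to_bracketed(c) for c in children))

print(to_bracketed(parse))
# (S (NP (DT the) (NN dog)) (VP (VBD saw) (NP (DT a) (NN cat))))
```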

PAPER-1
Title: Head-Driven Statistical Models for Natural Language Parsing
Author: Michael Collins, MIT Computer Science and Artificial Intelligence Laboratory
Published: © 2003 Association for Computational Linguistics (49 pages)
Three parsing models are introduced and evaluated on the Penn Wall Street Journal Treebank. The paper discusses their strengths and weaknesses, the effects of various features on parsing accuracy, and the relationship of the models to other work on statistical parsing.

PAPER-1: Model 1
Model 1 extends probabilistic context-free grammars (PCFGs) to lexicalized grammars. A context-free grammar is a 4-tuple (N, Σ, A, R), where N is a set of non-terminals, Σ is an alphabet, A is a start symbol in N, and R is a finite set of rules. In a probabilistic context-free grammar (PCFG), each rule has a probability. A PCFG can be lexicalized by associating a word w and a part-of-speech tag t with each non-terminal X in the tree, giving X(w, t).
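As a concrete illustration of these definitions (a minimal sketch of my own, not code from the paper; the grammar and its probabilities are invented), a PCFG can be stored as a map from rules to probabilities, the probability of a derivation is the product of its rule probabilities, and lexicalization decorates each non-terminal with its head word and tag:

```python
from math import prod  # Python 3.8+

# Toy PCFG: the probabilities of the rules expanding each non-terminal
# (left-hand side) must sum to 1.
pcfg = {
    ("S",  ("NP", "VP")):  1.0,
    ("NP", ("DT", "NN")):  0.7,
    ("NP", ("NN",)):       0.3,
    ("VP", ("VBD", "NP")): 1.0,
}

def derivation_prob(rules):
    """P(tree) under a PCFG = product of the probabilities of its rules."""
    return prod(pcfg[r] for r in rules)

print(derivation_prob([("S", ("NP", "VP")),
                       ("NP", ("DT", "NN")),
                       ("VP", ("VBD", "NP")),
                       ("NP", ("NN",))]))   # 1.0 * 0.7 * 1.0 * 0.3 = 0.21

# Lexicalization: each non-terminal X becomes X(w, t). The lexicalized
# version of VP -> VBD NP, with head "saw"/VBD and object head "cat"/NN:
lexicalized_rule = (("VP", ("saw", "VBD")),       # left-hand side
                    (("VBD", ("saw", "VBD")),     # head child
                     ("NP",  ("cat", "NN"))))     # non-head child
```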

PAPER-1: Models 2 & 3
Model 2 extends the parser to distinguish between complements and adjuncts (e.g., temporal modifiers), which may help parsing accuracy. Model 3 gives a probabilistic treatment of wh-movement.

PAPER-2
Title: Discriminative Reranking for Natural Language Parsing
Authors: Michael Collins and Terry Koo, MIT Computer Science and Artificial Intelligence Laboratory
Published in: © 2005 Association for Computational Linguistics (46 pages)

PAPER-2
This article presents approaches to reranking the output of an existing probabilistic parser. The base parser defines an initial ranking of the candidate parse trees, and a second model uses additional features to improve that initial ranking. A method based on the boosting approach is introduced for reranking the parse trees.
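A hedged sketch of the two-stage idea (the candidate trees, scores, and feature names below are invented for illustration; this is not the authors' implementation): the base parser supplies n-best candidates with log-probabilities, and the second model rescores them with a weighted sum of extra features.

```python
# Stage 1: the base parser returns n-best candidate parses, each with a
# log-probability and a sparse vector of extra features.
candidates = [
    {"tree": "t1", "base_logprob": -10.2, "features": {"f_lex": 1, "f_rule": 2}},
    {"tree": "t2", "base_logprob": -10.5, "features": {"f_lex": 3, "f_rule": 0}},
]

# Stage 2: learned weights over the extra features.
weights = {"f_lex": 0.4, "f_rule": -0.1}

def rerank_score(cand, w0=1.0):
    """Combined score: w0 * base log-prob + weighted sum of extra features."""
    feat_score = sum(weights.get(f, 0.0) * v
                     for f, v in cand["features"].items())
    return w0 * cand["base_logprob"] + feat_score

best = max(candidates, key=rerank_score)
print(best["tree"])   # t2: the extra features overturn the base ranking
```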

PAPER-2
The algorithm can be viewed as a feature selection method. The boosting method is applied to parsing the Wall Street Journal (WSJ) treebank. The method combines the log-likelihood under a baseline model with evidence from an additional 500,000 features over parse trees. The new model achieved an f-score of 89.75%, whereas the baseline model achieved 88.2%.
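One way to picture boosting as feature selection (a toy sketch under my own simplifying assumptions, not the paper's algorithm): each round greedily updates the single feature weight that most reduces a loss over the reranking data, so only features that actually help ever receive non-zero weight.

```python
def boost_round(data, weights, feature_pool, loss, deltas=(-0.1, 0.1)):
    """Greedily apply the one (feature, delta) update that most reduces loss.

    data:    reranking training data (e.g., n-best lists with gold trees)
    weights: current feature weights, mutated in place
    loss:    callable(data, weights) -> float
    """
    best = None  # (feature, delta, loss)
    for f in feature_pool:
        for d in deltas:                       # coarse line search
            weights[f] = weights.get(f, 0.0) + d
            l = loss(data, weights)
            if best is None or l < best[2]:
                best = (f, d, l)
            weights[f] -= d                    # undo the trial update
    f, d, l = best
    weights[f] = weights.get(f, 0.0) + d       # commit the winning update
    return f, l
```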

PAPER-3
Title: Wide-Coverage Deep Statistical Parsing Using Automatic Dependency Structure Annotation
Authors: Aoife Cahill, Michael Burke, Ruth O'Donovan, Josef van Genabith, and Andy Way, Dublin City University; Stefan Riezler, Palo Alto Research Center
Published in: © 2008 Association for Computational Linguistics (44 pages)

PAPER-3
Tree-based parser evaluation has some drawbacks:
- It does not provide the information needed by NLP applications, such as deep dependency relations and predicate-argument structure.
- There can be a number of alternative tree representations for the same input.
Such problems motivated research on dependency-based parser evaluation.

PAPER-3
A number of researchers have conducted experiments using simple, automatic methods to convert the output trees of shallow parsers into dependencies. In this article, such experiments are revisited using more sophisticated automatic LFG f-structure annotation methodologies. Various PCFG and history-based parsers are compared to find the baseline parsing system that fits best into this automatic dependency structure annotation technique. The experiments show that the combined system of syntactic parser and dependency structure annotation outperforms hand-crafted, wide-coverage deep grammars.
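The general idea of reading dependencies off a constituency tree can be sketched as follows (a simplified head-percolation toy of my own; the paper's LFG f-structure annotation is considerably more sophisticated than this):

```python
# Toy head-percolation table: which child supplies the head of each phrase.
HEAD_CHILD = {"S": "VP", "VP": "VBD", "NP": "NN"}

def head_word(node):
    """Return the lexical head of a (label, children...) tree."""
    if isinstance(node[1], str):            # preterminal like ("NN", "dog")
        return node[1]
    label, *children = node
    for child in children:
        if child[0] == HEAD_CHILD.get(label):
            return head_word(child)
    return head_word(children[0])           # fallback: leftmost child

def dependencies(node, deps=None):
    """Each non-head child's head depends on the phrase's head."""
    if deps is None:
        deps = []
    if isinstance(node[1], str):
        return deps
    h = head_word(node)
    for child in node[1:]:
        ch = head_word(child)
        if ch != h:
            deps.append((ch, h))             # (dependent, head)
        dependencies(child, deps)
    return deps

tree = ("S", ("NP", ("DT", "the"), ("NN", "dog")),
             ("VP", ("VBD", "saw"), ("NP", ("DT", "a"), ("NN", "cat"))))
print(dependencies(tree))
# [('dog', 'saw'), ('the', 'dog'), ('cat', 'saw'), ('a', 'cat')]
```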

PAPER-3
Four machine-learning-based shallow parsers and two hand-crafted wide-coverage deep probabilistic parsers were evaluated. The best system achieved an f-score of 82.73% against the PARC 700 Dependency Bank, whereas the f-score for the hand-crafted LFG grammar and XLE parsing system was 80.55% (a 2.18% improvement). Against the CBS 500 Dependency Bank, the system achieved an f-score of 80.23%, whereas the f-score for the hand-crafted RASP grammar and parsing system was 76.57% (a 3.66% improvement).
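For reference, dependency f-scores like those quoted above are computed from precision and recall over dependency triples, along these lines (the gold and test triples below are invented for illustration):

```python
gold = {("subj", "saw", "dog"), ("obj", "saw", "cat"), ("det", "dog", "the")}
test = {("subj", "saw", "dog"), ("obj", "saw", "cat"), ("det", "cat", "the")}

correct   = len(gold & test)              # 2 triples match
precision = correct / len(test)           # 2/3
recall    = correct / len(gold)           # 2/3
f_score   = 2 * precision * recall / (precision + recall)
print(f"{f_score:.2%}")                   # 66.67%
```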

YET TO DO
- Study the papers thoroughly to understand the different parsing techniques and evaluation methods.
- Write a report reviewing the papers.