Portability, Parallelism and Efficiency in Parsing Dan Bikel University of Pennsylvania March 11th, 2002.

Presentation transcript:


Slide 1 Parsing: Where are we now?
Pounding away at Penn Treebank, §23
–Collins (1999): LR 88.0, LP 88.3
–Charniak (2000): LR 89.6, LP 89.5
–Collins (2000): LR 89.6, LP 89.9
Henderson & Brill (1999) on §22: LR 90.1, LP 92.4
Room to grow: new domains, better performance
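The LR and LP figures on this slide are labeled recall and labeled precision over constituent brackets. As a rough illustration of how such scores are computed for one sentence, here is a minimal sketch. It assumes constituents are given as (label, start, end) tuples; the function name `labeled_pr` and the toy brackets are hypothetical, and the conventions of the standard scoring tools (for example, punctuation handling) are not modeled.

```python
from collections import Counter

def labeled_pr(gold, predicted):
    """Return (labeled recall, labeled precision) for one sentence.

    gold, predicted: iterables of (label, start, end) constituent tuples.
    A multiset intersection is used so a repeated bracket only matches
    as many times as it occurs on both sides.
    """
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    matched = sum((gold_counts & pred_counts).values())
    recall = matched / sum(gold_counts.values()) if gold_counts else 0.0
    precision = matched / sum(pred_counts.values()) if pred_counts else 0.0
    return recall, precision

# Toy example (hypothetical brackets): the parser mislabels one constituent.
gold = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)]
pred = [("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)]
lr, lp = labeled_pr(gold, pred)
```

Corpus-level figures such as those quoted above are normally obtained by summing matched, gold, and predicted bracket counts over all sentences before dividing, rather than by averaging per-sentence scores.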

Slide 2 The Right Architecture for Parallel Parsing
[Diagram. Labels shown: CKY Client 1, CKY Client 2, …, CKY Client N; Language package; Switchboard; Object server; DecoderServer 1, …, DecoderServer N; ModelCollection]

Slide 3 Architecture for Parallel Parsing II
Highly parallel, multi-threaded
–New cluster about to come on-line; poised to take advantage
Fully fault-tolerant
Significant flexibility: layers of abstraction
Optimized for speed
Highly portable for new domains, including new languages
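The slides do not show how the parallel, fault-tolerant design is implemented. The sketch below is one way to picture the idea, not the engine's actual code: a coordinator hands out sentences, several clients parse concurrently, and a sentence whose decoder fails is put back on the queue. The class names `Switchboard` and `FlakyDecoder` and all method names are hypothetical (only the word "Switchboard" is borrowed from the diagram labels), and threads stand in for what would be separate processes or machines.

```python
import queue
import threading

class Switchboard:
    """Hypothetical coordinator: hands out sentences and collects parses."""

    def __init__(self, sentences):
        self._todo = queue.Queue()
        for job in enumerate(sentences):
            self._todo.put(job)
        self._results = {}
        self._lock = threading.Lock()

    def next_sentence(self):
        """Return an (index, sentence) job, or None if the queue is empty."""
        try:
            return self._todo.get_nowait()
        except queue.Empty:
            return None

    def report(self, index, parse):
        with self._lock:
            self._results[index] = parse

    def requeue(self, index, sentence):
        """Put a job back so that it is retried after a decoder failure."""
        self._todo.put((index, sentence))

    def results(self):
        return [self._results[i] for i in sorted(self._results)]

class FlakyDecoder:
    """Stand-in for a decoder server that fails once per listed sentence."""

    def __init__(self, fail_once):
        self._pending_failures = set(fail_once)
        self._lock = threading.Lock()

    def decode(self, sentence):
        with self._lock:
            if sentence in self._pending_failures:
                self._pending_failures.remove(sentence)
                raise ConnectionError("decoder server went away")
        return "(S " + sentence + ")"  # placeholder "parse"

def client(switchboard, decode):
    """Parse jobs until none remain, requeueing any job whose decode fails."""
    while True:
        job = switchboard.next_sentence()
        if job is None:
            return
        index, sentence = job
        try:
            switchboard.report(index, decode(sentence))
        except ConnectionError:
            switchboard.requeue(index, sentence)

def parse_all(sentences, decode, num_clients=3):
    switchboard = Switchboard(sentences)
    threads = [threading.Thread(target=client, args=(switchboard, decode))
               for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return switchboard.results()

decoder = FlakyDecoder(fail_once={"c d"})
parses = parse_all(["a b", "c d", "e f", "g h"], decoder.decode)
```

Because the client that hits a failure requeues the job and keeps looping, a sentence is never lost even if the other clients have already finished. A real system would also need to handle clients that die and failures that persist.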

Slide 4 Layer of Abstraction: Probability Structure
[Diagram. Labels shown: P(t_h, w_h); H(t_h, w_h); M_i(t_i, w_i); M_{i-1}(t_{i-1}, w_{i-1}); …; Collins; BBN]

Slide 5 Plug-’n’-play Probability Models
New engine capable of implementing a wide variety of models, including Collins, BBN
Have meticulously replicated Collins’ model and performance
–Cleaned up probabilistic “oddities”
–Code is thoroughly documented
–Will release to public
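To make the "plug-’n’-play" idea concrete, here is a minimal sketch of how a probability structure could be separated from the estimator that uses it. Everything in it is an assumption made for illustration: the interface, the class names, the event fields, and the two example structures, which are not faithful renderings of the Collins or BBN models and only differ in what they condition on.

```python
from abc import ABC, abstractmethod
from collections import Counter

class ProbabilityStructure(ABC):
    """Hypothetical interface: which part of an event is predicted
    (the future) and which part is conditioned on (the history)."""

    @abstractmethod
    def future(self, event):
        ...

    @abstractmethod
    def history(self, event):
        ...

class HeadConditioned(ProbabilityStructure):
    """Illustrative only: predict a modifier given parent and head labels."""

    def future(self, event):
        return event["modifier"]

    def history(self, event):
        return (event["parent"], event["head"])

class PreviousModifierConditioned(ProbabilityStructure):
    """Illustrative only: predict a modifier given the previous modifier."""

    def future(self, event):
        return event["modifier"]

    def history(self, event):
        return (event["parent"], event["previous_modifier"])

class Model:
    """Relative-frequency estimator driven entirely by a structure."""

    def __init__(self, structure):
        self.structure = structure
        self.joint = Counter()
        self.context = Counter()

    def observe(self, event):
        h = self.structure.history(event)
        self.joint[(h, self.structure.future(event))] += 1
        self.context[h] += 1

    def prob(self, event):
        h = self.structure.history(event)
        if self.context[h] == 0:
            return 0.0
        return self.joint[(h, self.structure.future(event))] / self.context[h]

# Toy training events (hypothetical).
events = [
    {"parent": "VP", "head": "VBD", "previous_modifier": "START", "modifier": "NP"},
    {"parent": "VP", "head": "VBD", "previous_modifier": "NP", "modifier": "PP"},
    {"parent": "VP", "head": "VBD", "previous_modifier": "START", "modifier": "NP"},
]
head_model = Model(HeadConditioned())
prev_model = Model(PreviousModifierConditioned())
for e in events:
    head_model.observe(e)
    prev_model.observe(e)
```

The estimator never inspects an event directly, so swapping the structure changes the model without touching the counting or decoding code. That is the kind of layering the slide describes, though the real engine would also need smoothing and back-off.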

Slide 6 Fast Portability to New Data Sets
Parsers operate over augmented tree space, T^+
Generative models define joint probability P(S, T, T^+)
Chiang & Bikel (2002, in submission) provide
–New, portable syntax for augmenting tree nodes
–Method for reestimating parser models in the augmented space such that P(S, T) is maximized
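One way to read the relationship between the joint probability and the quantity being maximized is sketched below. It assumes that stripping the augmentations from an augmented tree yields exactly one plain tree. The stripping map \pi, the parameter vector \theta, and the training-set index i are notation introduced here, not taken from the slide.

```latex
% Marginalize over all augmented trees consistent with the observed tree T
P(S, T) \;=\; \sum_{T^+ \,:\, \pi(T^+) = T} P(S, T, T^+)

% Reestimation then seeks parameters that maximize the likelihood of the
% observed (sentence, tree) pairs in the treebank
\hat{\theta} \;=\; \arg\max_{\theta} \sum_{i} \log P_{\theta}(S_i, T_i)
```

Under this reading the augmentations are hidden data: the treebank supplies S and T, while T^+ is summed out.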

Slide 7 Rapid Portability to New Languages with High Accuracy
Bikel & Chiang (2000) described porting two parsing models developed for English to Chinese
–BBN: LR 69.0, LP 74.8 (≤ 40 words)
–Chiang: LR 76.8, LP 77.8 (≤ 40 words)
New engine designed from ground up for multi-lingual processing: language package
–Original design goal for new parsing engine: develop new language packages in 1–2 weeks
Developed Chinese language package for new engine in one and a half days
Compared to other known Chinese parsers on the CTB, recall is equivalent and precision is significantly superior
–LR 77.0, LP 81.6 (≤ 40 words)
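The slides do not say what a language package contains. The sketch below assumes, for illustration only, that it bundles the language- and treebank-specific pieces a parser needs, here head-finding rules and a punctuation test, behind one small interface. The class, its fields, and the rule tables are all hypothetical toy values, not the head tables of any real parser.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass(frozen=True)
class LanguagePackage:
    """Hypothetical bundle of language-specific resources."""
    name: str
    # parent label -> (scan direction, priority list of child labels)
    head_rules: Dict[str, Tuple[str, List[str]]]
    is_punctuation: Callable[[str], bool]

    def head_index(self, label: str, children: List[str]) -> int:
        """Pick the head child: the first priority label found when scanning
        in the rule's direction, else the first child in scan order."""
        direction, priorities = self.head_rules.get(label, ("left", []))
        if direction == "left":
            order = list(range(len(children)))
        else:
            order = list(range(len(children) - 1, -1, -1))
        for wanted in priorities:
            for i in order:
                if children[i] == wanted:
                    return i
        return order[0]

# Toy rule tables (assumptions for illustration only).
ENGLISH = LanguagePackage(
    name="english",
    head_rules={"VP": ("left", ["VBD", "VB", "VP"]),
                "NP": ("right", ["NN", "NNS", "NP"])},
    is_punctuation=lambda tag: tag in {",", "."},
)
CHINESE = LanguagePackage(
    name="chinese",
    head_rules={"VP": ("left", ["VV", "VP"]),
                "NP": ("right", ["NN", "NR", "NP"])},
    is_punctuation=lambda tag: tag == "PU",
)
```

With a split like this, language-independent code calls only `head_index` and `is_punctuation`, and supporting a new language means supplying new tables rather than changing the parser.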

Slide 8 What’s in store…
Incorporating richer lexical information into parsing/language processing, specifically…
Incorporating word sense information into a parsing model, building on both
–previous work extending BBN parsing model to include word sense
–recent work with David Chiang, viewing word sense as yet another component of “hidden” data in a Treebank

Slide 9 FIN