End-to-End Discourse Parser Evaluation

Presentation transcript:

End-to-End Discourse Parser Evaluation Sucheta Ghosh, Sara Tonelli, Giuseppe Riccardi, Richard Johansson Department of Information Engineering and Computer Science University of Trento, Italy

Content: Introduction (Discourse Parser: what + why + how; Discourse Parser & Penn Discourse TreeBank (PDTB); Our contribution), Architecture, Features, Results, Conclusion. End2End Disc Pars Eval

Introduction. What: we refer to a coherent, structured group of sentences or expressions as a discourse. Why: discourse structure represents the meaning of the document. How (process flow): data (discourse) segmentation → discourse parsing → discourse structure. Discourse structure includes relations (a connective and its arguments) lexically anchored in the document text. Common data sources: Rhetorical Structure Theory (RST) treebank and Penn Discourse TreeBank (PDTB); we used the PDTB.

Examples from PDTB (1). Arg1: I never gamble too far. Explicit connective: In particular. Arg2: I quit after one try, whether I win or lose. [EXPANSION] Each annotated relation includes a connective, two arguments, and a sense label for the connective. The connective can occur between the two arguments, at the beginning of a sentence, or inside an argument. The top-level senses of the three-layered sense hierarchy are TEMPORAL, CONTINGENCY, COMPARISON, and EXPANSION.
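The relation structure described on this slide can be written down as a small record type. This is only an illustrative sketch: the class name `Relation` and its field names are assumptions made here, not part of the PDTB distribution or of the authors' system.

```python
from dataclasses import dataclass

# Top-level senses listed on the slide.
TOP_LEVEL_SENSES = {"TEMPORAL", "CONTINGENCY", "COMPARISON", "EXPANSION"}


@dataclass(frozen=True)
class Relation:
    """Hypothetical container for one explicit PDTB-style relation."""
    connective: str
    arg1: str
    arg2: str
    sense: str

    def __post_init__(self):
        # Reject sense labels outside the four top-level classes.
        if self.sense not in TOP_LEVEL_SENSES:
            raise ValueError(f"unknown top-level sense: {self.sense}")


# The example from the slide.
example = Relation(
    connective="In particular",
    arg1="I never gamble too far.",
    arg2="I quit after one try, whether I win or lose.",
    sense="EXPANSION",
)
```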

Examples from PDTB (2) (Arg1 italicized, connectives underlined, Arg2 boldfaced). When Mr. Green won a $240,000 verdict in a land condemnation case against the State in June 1983, he says, Judge O’Kicki unexpectedly awarded him an additional $100,000. [TEMPORAL] As an indicator of the tight grain supply situation in the U.S., market analysts said that late Tuesday the Chinese government, which often buys U.S. grains in quantity, turned instead to Britain to buy 500,000 metric tons of wheat. [COMPARISON] Since McDonald’s menu prices rose this year, the actual deadline may have been more. [CONTINGENCY]

PDTB Corpus Statistics. Arg2 is always in the same sentence as the connective. 60.9% of the annotated Arg1 spans are in the same sentence as the connective; 39.1% are in a previous sentence (30.1% adjacent, 9.0% non-adjacent). We used these statistics to establish the baseline.

Our Contribution. We developed an end-to-end discourse parser that, starting from a text paragraph, retrieves discourse structure consisting of an explicit connective and its two argument spans. Evaluation: we established the system with gold-standard data (PTB + PDTB), evaluated it against a baseline, and implemented the same method in an automated system. Improvement of the automated system in terms of applicability: an overlapping discourse segmentation technique (+2/-2 window) is applied to the complete text. We followed a chunking strategy for classification. The discourse model is a cascaded CRF.
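One possible reading of the overlapping +2/-2 window is sketched below. It assumes the window is measured in sentences around each sentence of the text; the slide does not state the unit, so this is an interpretation, and the function name `sentence_windows` is hypothetical.

```python
def sentence_windows(sentences, radius=2):
    """Return one overlapping window per sentence.

    Each window holds up to `radius` sentences before and after the
    centre sentence, clipped at the text boundaries (assumed behaviour).
    """
    windows = []
    for i in range(len(sentences)):
        lo = max(0, i - radius)
        hi = min(len(sentences), i + radius + 1)
        windows.append((i, sentences[lo:hi]))
    return windows


# Toy text of six placeholder sentences.
text = ["s0", "s1", "s2", "s3", "s4", "s5"]
windows = sentence_windows(text)
```

Consecutive windows share most of their sentences, which is what makes the segmentation overlapping under this reading.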

End-to-End Architecture [pipeline diagram; recoverable labels: Doc; Parser; Parse_Tree; Chunklink (by Sabine Buchholz, CoNLL'00 task); AddDiscourse (Pitler & Nenkova '09); Conn. Sense Det.; RootExtract + Morpha (Johansson; Minnen et al); Morph & All Feat; Pruner; Arg2; Arg1]

Features. Features used for Arg1 and Arg2 segmentation and labeling: F1. Token (T); F2. Sense of Connective (CONN); F3. IOB chain (IOB); F4. PoS tag; F5. Lemma (L); F6. Inflection (INFL); F7. Main verb of main clause (MV); F8. Boolean feature for MV (BMV). Additional feature used only for Arg1: F9. Arg2 Labels. For more details: Ghosh et al IJCNLP 2011.
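Because F9 feeds Arg2 labels into the Arg1 classifier, the two steps form a cascade: Arg2 is labeled first, and its output becomes an extra column for Arg1. The sketch below shows only that data flow. The function name, the row layout, and the example feature values and tag names are all hypothetical.

```python
def add_arg2_labels(token_rows, arg2_labels):
    """Append the Arg2 label (F9) to each token's F1-F8 feature row."""
    if len(token_rows) != len(arg2_labels):
        raise ValueError("need exactly one Arg2 label per token")
    return [row + [label] for row, label in zip(token_rows, arg2_labels)]


# Two tokens with made-up values for F1..F8 (T, CONN, IOB, PoS, L, INFL, MV, BMV).
rows = [
    ["I", "EXPANSION", "B-S/C-NP", "PRP", "i", "-", "quit", "0"],
    ["quit", "EXPANSION", "I-S/B-VP", "VBP", "quit", "-", "quit", "1"],
]
# Labels that a first-stage Arg2 tagger might have produced (assumed tag names).
arg1_rows = add_arg2_labels(rows, ["B-Arg2", "I-Arg2"])
```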

Evaluation & Baseline. Metrics: precision, recall, and F1 measure. Scoring schemes: Exact Match (a classified span is correct only if it coincides exactly with the gold-standard span). Baseline (based on the statistics given in the annotation manual): Arg2 is obtained by labeling all tokens of the text span between the connective and the beginning of the next sentence. Arg1 is obtained by labeling all tokens in the text span from the end of the previous sentence to the connective position; if the connective occurs at the beginning of a sentence, the previous sentence is labeled.
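The baseline rule and exact-match scoring described on this slide can be sketched as follows. The sketch assumes sentences are already tokenized and that the connective's token offsets are known; the function names and the span representation are assumptions, not the authors' code.

```python
def baseline_spans(sentences, sent_idx, conn_start, conn_end):
    """Baseline Arg1/Arg2 token spans for one connective.

    `sentences` is a list of token lists; the connective occupies tokens
    [conn_start, conn_end) of sentences[sent_idx].
    """
    sent = sentences[sent_idx]
    # Arg2: everything from the connective to the end of its sentence.
    arg2 = sent[conn_end:]
    if conn_start == 0:
        # Sentence-initial connective: take the whole previous sentence.
        arg1 = sentences[sent_idx - 1] if sent_idx > 0 else []
    else:
        # Otherwise: everything from the sentence start up to the connective.
        arg1 = sent[:conn_start]
    return arg1, arg2


def exact_match_prf(predicted, gold):
    """Precision, recall, F1 where a span counts only if it matches exactly."""
    correct = len(predicted & gold)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


sents = [
    ["I", "never", "gamble", "too", "far", "."],
    ["In", "particular", ",", "I", "quit", "after", "one", "try", "."],
]
arg1, arg2 = baseline_spans(sents, sent_idx=1, conn_start=0, conn_end=2)

# Spans as (start, end) token offsets; one of two predictions is exact.
scores = exact_match_prf({(0, 5), (7, 9)}, {(0, 5), (6, 9)})
```

Note that the baseline keeps the comma after the connective inside Arg2, which illustrates why such a rule scores poorly under exact match.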

Exact Arg2 Results: Comparison Viewgraph
System              P     R     F1
Baseline            0.53  0.46  0.49
Gold-Standard       0.84  0.74  0.79
Automatic           0.80  0.74  0.77
AutoConn+GoldSPT    0.82  0.70  0.76
GoldConn+AutoSPT    0.76  0.61  0.68
Lightweight(Auto)   0.72  0.56  0.63
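As a quick consistency check, the F1 values for exact Arg2 matching can be recomputed from the reported precision and recall as their harmonic mean. The numbers below are copied from the slide; the dictionary and function names are illustrative. Since the inputs are already rounded to two decimals, such a recomputation will not always reproduce a reported F1 to the last digit, although it does for these six rows.

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


# (P, R, reported F1) for exact Arg2 matching, as given on the slide.
ARG2_EXACT = {
    "Baseline": (0.53, 0.46, 0.49),
    "Gold-Standard": (0.84, 0.74, 0.79),
    "Automatic": (0.80, 0.74, 0.77),
    "AutoConn+GoldSPT": (0.82, 0.70, 0.76),
    "GoldConn+AutoSPT": (0.76, 0.61, 0.68),
    "Lightweight(Auto)": (0.72, 0.56, 0.63),
}

recomputed = {name: round(f1(p, r), 2) for name, (p, r, _) in ARG2_EXACT.items()}
```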

Exact Arg1 Results: Comparison Viewgraph
System              P     R     F1
Baseline            0.19  0.19  0.19
Gold-Standard       0.68  0.39  0.49
Automatic           0.63  0.28  0.39
AutoConn+GoldSPT    0.67  0.31  0.43
GoldConn+AutoSPT    0.62  0.31  0.41
Lightweight(Auto)   0.60  0.27  0.37

Features. The IOB (Inside-Outside-Begin) chain corresponds to the syntactic categories of all constituents on the path between the root node and the current leaf node of the tree. For example, the IOB chain feature for "flashed" is I-S/E-VP/E-SBAR/E-S/C-VP, where B-, I-, E- and C- indicate whether the given token is respectively at the beginning, inside, or at the end of the constituent, or is a single-token chunk.

Conclusion. The automatic end-to-end system achieves results close to those of the gold-standard system. We are moving towards a "lightweight" version of the pipeline: shallow and less dependent on SPTs. We wish to explore more features. We improved our result by 5 points for Arg1 classification using a previous-sentence feature (Ghosh et al IJCNLP 2011).

Thank you. Sucheta Ghosh, Sara Tonelli, Giuseppe Riccardi, Richard Johansson, Department of Information Engineering and Computer Science, University of Trento, Italy. {ghosh, riccardi}@disi.unitn.it

Previous Work. The task was limited to retrieving the argument heads (Wellner et al 2007, Elwell et al 2008). Dinesh et al. (2005) extracted complete arguments with boundaries, but only for a restricted class of connectives. The identification of Arg1 has been only partially addressed in previous work (Prasad 2010). Automatic surface-sense classification (at class level) has already reached the upper bound of inter-annotator agreement (Pitler and Nenkova, 2009).

Data & Tools. Corpus used: Penn Discourse TreeBank (PDTB). For the gold-standard system, the Penn TreeBank (PTB) corpus is used. Third-party software/scripts used: Stanford Syntactic Tree Parser (Klein & Manning 2003); AddDiscourse, explicit connective classification (Pitler and Nenkova 2009); ChunkLink.pl to extract IOB chains (by Sabine Buchholz, CoNLL Shared Task 2000); RootExtractor, Syntactic Parse Tree (SPT) processor (by Richard Johansson); Morpha (Minnen et al 2001); Conditional Random Field: CRF++ by Taku Kudo.

Overall Architecture. A syntactic tree parser is used for the automatic systems. A connective detection and classification tool is used for the automatic systems. PDTB and PTB are not used during the end-to-end automatic testing phase.

End2End Testing Phase

Conditional Random Field. We use the CRF++ tool (http://crfpp.sourceforge.net/) for sequence labeling classification (Lafferty et al., 2001), with second-order Markov dependency between tags. Besides the individual specification of each feature in the feature description template, the features are also represented in various combinations. We chose this tool because the output of CRF++ is compatible with the CoNLL 2000 chunking shared task, and we view our task as a discourse chunking task. In addition, linear-chain CRFs for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Sha and Pereira (2003) also claim that, as a single model, CRFs outperform other models for shallow parsing.
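CRF++ reads CoNLL-style data: one token per line, whitespace-separated columns with the label in the last column, and an empty line between sequences. The sketch below serializes token rows into that layout. The helper name `to_crfpp`, the reduced three-feature rows, and the `B-Arg2`/`I-Arg2`/`O` tag names are assumptions for illustration; the slides do not show the actual column layout or tag set.

```python
def to_crfpp(sequences):
    """Serialize sequences of token rows into CoNLL-style text.

    Each row is a list of strings: feature columns followed by the label.
    All rows must have the same number of columns.
    """
    lines = []
    width = None
    for seq in sequences:
        if not seq:
            continue  # skip empty sequences rather than emit stray blank lines
        for row in seq:
            if width is None:
                width = len(row)
            if len(row) != width:
                raise ValueError("all rows need the same number of columns")
            if any(any(ch.isspace() for ch in col) or not col for col in row):
                raise ValueError("columns must be non-empty and contain no whitespace")
            lines.append("\t".join(row))
        lines.append("")  # blank line terminates the sequence
    return "\n".join(lines) + "\n" if lines else ""


# Token, PoS tag, connective sense, label (hypothetical values).
seq = [
    ["because", "IN", "CONTINGENCY", "O"],
    ["prices", "NNS", "CONTINGENCY", "B-Arg2"],
    ["rose", "VBD", "CONTINGENCY", "I-Arg2"],
]
training_text = to_crfpp([seq])
```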

Hill Climbing Algorithm

function HILL-CLIMBING(problem) returns a state that is a local maximum
    current ← MAKE-NODE(problem.INITIAL-STATE)
    loop do
        neighbor ← a highest-valued successor of current
        if neighbor.VALUE ≤ current.VALUE then return current.STATE
        current ← neighbor

[Artificial Intelligence: Stuart J. Russell] The hill climbing search algorithm is the most basic local search technique. At each step the current node is replaced by the best neighbor, here the neighbor with the highest VALUE; if a heuristic cost estimate h is used, we would instead find the neighbor with the lowest h. Hill climbing is a greedy, fast local search. We optimized the selected feature set with a feature ablation technique, leaving one feature out each time.
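Applied to feature selection, hill climbing and leave-one-out ablation might look like the sketch below. It assumes that a neighbor of a feature set is that set plus one unused feature, and that `score` returns a development-set measure such as F1. Both are assumptions: the slides do not specify the neighborhood or the scoring function. The toy integer scores are invented so that the example is deterministic.

```python
def hill_climb_features(all_features, score):
    """Greedy forward selection: add the best feature until nothing improves."""
    current = frozenset()
    current_score = score(current)
    while True:
        neighbors = [current | {f} for f in all_features if f not in current]
        if not neighbors:
            return current, current_score
        best = max(neighbors, key=score)
        best_score = score(best)
        if best_score <= current_score:
            return current, current_score  # local maximum reached
        current, current_score = best, best_score


def ablation(features, score):
    """Score drop caused by leaving each feature out in turn."""
    full = score(frozenset(features))
    return {f: full - score(frozenset(features) - {f}) for f in features}


# Invented per-feature contributions (in points); NOISE hurts the score.
GAINS = {"T": 30, "IOB": 20, "POS": 10, "NOISE": -5}


def toy_score(feature_set):
    return sum(GAINS[f] for f in feature_set)


selected, best_value = hill_climb_features(sorted(GAINS), toy_score)
drops = ablation(selected, toy_score)
```

With these toy scores the search keeps `T`, `IOB` and `POS` and stops when the only remaining neighbor (adding `NOISE`) lowers the score; the ablation then reports how much each kept feature contributes.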

Features. The IOB (Inside-Outside-Begin) chain corresponds to the syntactic categories of all the constituents on the path between the root node and the current leaf node of the tree. The corresponding feature would be I-S/E-VP/E-SBAR/E-S/C-VP, where B-, I-, E- and C- indicate whether the given token is respectively at the beginning, inside, or at the end of the constituent, or is a single-token chunk. In this case, "flashed" is at the end of every constituent in the chain, except for the last VP, which dominates one single leaf.
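The chain can be approximated from a constituency tree as sketched below. This follows only the description on the slide and is not ChunkLink.pl, whose actual output may differ in details. The tree encoding (`(label, children)` tuples with `(PoS, word)` preterminals) and the example sentence are assumptions chosen so that "flashed" reproduces the chain quoted above.

```python
def size(node):
    """Number of leaves (words) under a node."""
    _, children = node
    return 1 if isinstance(children, str) else sum(size(c) for c in children)


def iob_chains(tree):
    """Return (word, chain) pairs, one per leaf, in sentence order."""
    result = []

    def walk(node, start, ancestors):
        label, children = node
        if isinstance(children, str):  # preterminal: (PoS, word)
            tags = []
            for lab, first, last in ancestors:
                if first == last:
                    prefix = "C"   # constituent covers a single token
                elif start == first:
                    prefix = "B"   # token begins the constituent
                elif start == last:
                    prefix = "E"   # token ends the constituent
                else:
                    prefix = "I"   # token is strictly inside
                tags.append(f"{prefix}-{lab}")
            result.append((children, "/".join(tags)))
            return
        end = start + size(node) - 1
        pos = start
        for child in children:
            walk(child, pos, ancestors + [(label, start, end)])
            pos += size(child)

    walk(tree, 0, [])
    return result


# Hypothetical parse of "He ran when lightning flashed ."
tree = ("S", [
    ("NP", [("PRP", "He")]),
    ("VP", [
        ("VBD", "ran"),
        ("SBAR", [
            ("IN", "when"),
            ("S", [
                ("NP", [("NN", "lightning")]),
                ("VP", [("VBD", "flashed")]),
            ]),
        ]),
    ]),
    (".", "."),
])
chains = dict(iob_chains(tree))
```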

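Result: Gold-labeled & Automatic

Gold-labeled system output (baseline result in parentheses; shown in blue on the slide)
                P            R            F1
Arg2 Exact      0.84 (0.53)  0.74 (0.46)  0.79 (0.49)
Arg2 Partial    0.93 (0.80)  0.82 (0.85)  0.88 (0.82)
Arg2 Overlap    0.97 (0.98)  0.88 (0.85)  0.92 (0.91)
Arg1 Exact      0.68 (0.19)  0.39 (0.19)  0.49 (0.19)
Arg1 Partial    0.81 (0.50)  0.51 (0.68)  0.62 (0.58)
Arg1 Overlap    0.91 (0.70)  0.52 (0.68)  0.66 (0.69)

Automatic system output
                          P     R     F1
Arg2 Exact                0.80  0.74  0.77
Arg2 Partial              0.91  0.85  0.88
Arg2 Overlap              0.97, 0.92 (one value not preserved in the transcript)
Arg1 Exact (semi auto)    0.64  0.31  0.42
Arg1 Partial (semi auto)  0.76  0.39  0.52
Arg1 Overlap (semi auto)  0.84  0.40  0.54
Arg1 Exact (full auto)    0.63  0.28  0.39
Arg1 Partial (full auto)  0.36, 0.48 (one value not preserved in the transcript)
Arg1 Overlap (full auto)  0.83  0.37  0.51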

Combo Result

Auto Conn + Gold SPT
                P     R     F1
Arg2 Exact      0.82  0.70  0.76
Arg2 Partial    0.93  0.79  0.85
Arg2 Overlap    0.96  0.83  0.89
Arg1 Exact      0.67  0.31  0.43
Arg1 Partial    0.81  0.44  0.57
Arg1 Overlap    0.94, 0.60 (one value not preserved in the transcript)

Gold Conn + Auto SPT
                P     R     F1
Arg2 Exact      0.76  0.61  0.68
Arg2 Partial    0.91  0.73  0.81
Arg2 Overlap    0.96  0.77  0.85
Arg1 Exact      0.62  0.31  0.41
Arg1 Partial    0.42, 0.54 (one value not preserved in the transcript)
Arg1 Overlap    0.87  0.43  0.58

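Result: replc. IOB chain
                P     R     F1
Arg2 Exact      0.80  0.74  0.77
Arg2 Partial    0.91  0.85  0.88
Arg2 Overlap    0.97, 0.92 (one value not preserved in the transcript)
Arg1 Exact      0.65  0.29  0.40
Arg1 Partial and Overlap: only 0.43, 0.56, 0.60 preserved in the transcript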