Course 10: Veins Theory. Discourse structure and coherence. Dan Cristea. Selection of slides.


On cohesion

Types of references: evocative references
Evocative resolution processes:
- an anaphor may be resolved to a referent that is not linearly the closest, but only hierarchically the closest
- based on associations (pattern matching on morpho-semantic features)
- fast
- give fluency to the text

Types of references: post-evocative references
Post-evocative resolution processes:
- are inferential processes developed in memory
- are computationally and cognitively slow (they impose a heavier inference load)
- require more powerful referencing means (like proper nouns)
- are less frequent

Domain of evocative accessibility (DEA)
dea(u) = pref(u, vein(u))
Reminder: the vein expression of a terminal node (discourse unit) is the sequence of units required to understand just that unit in the context of the whole discourse (simplified).
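The definition dea(u) = pref(u, vein(u)) can be sketched in a few lines. This is only an illustration under stated assumptions: the vein is given as a plain list of unit numbers with any markers (such as parentheses) already removed, and pref is read as "the prefix up to and including u". The function name and the sample veins are hypothetical.

```python
def dea(unit, vein):
    """Prefix of the vein expression up to and including `unit`.

    Assumption: `vein` is a list of unit numbers in textual order,
    with markers already stripped, and it contains `unit` itself.
    """
    if unit not in vein:
        raise ValueError(f"unit {unit} does not occur on its own vein {vein}")
    return vein[: vein.index(unit) + 1]


# Illustrative inputs only (not taken from a corpus).
print(dea(2, [1, 2, 3, 5]))  # units before 2 on its vein, plus 2 itself
print(dea(4, [1, 2, 4]))     # unit 3 is not on the vein, so it is not accessible
```

Under this reading, a unit that follows u on the vein never enters dea(u), which matches the idea that evocative references look backwards only.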

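Heads and veins
[Figure: discourse tree whose nodes are labeled with head expressions (H) and vein expressions (V); the vein expressions shown include V=3 5, V=3, V=1 2 3, V=(1 2) 3 and V=3 4]

From vein expressions
[Figure: the same tree with vein expressions V=1 2 3, V=(1 2) 3, V=3 4 and V=3 5]

... to Domains of Evocative Accessibility
[Figure: the DEAs derived from the vein expressions V=3 5, V=1 2 3 and V=3 4]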

The reason why "she" can refer to Mary but not to John's mother
1. John told Mary that he loves her.
2. He has never been married
3. and lived until his 40s with his mother.
4. She, on the contrary, was married twice.
[Figure: discourse tree over units 1-4 with the relations antithesis and elaboration; vein expression shown: V=1 2 4]

The reason why the antecedent of "it" is recovered with difficulty
1. With one year before finishing his mandate as president of the company,
2. Mr. W. Ross has begun to bring about its bankruptcy.
3. There were rumors that he has obtained it by fraud.
[Figure: discourse tree over units 1-3 with the relations circumstance and background; vein expression shown: V=2 3]

… while here the reference is immediate
1. Mr. W. Ross has begun to bring about the bankruptcy of his company.
2. with one year before finishing his mandate as president.
3. There were rumors that he has obtained it by fraud.
[Figure: discourse tree over units 1-3 with the relations background and circumstance; vein expression shown: V=1 2 3]

Experiment 1: evocative vs post-evocative references
Source      No. of units   Total no. of refs   On the veins   Outside the veins
English     …              …                   …%             8.30%
French      …              …                   …%             0.90%
Romanian    …              …                   …%             4.50%
Total       …              …                   …%             4.40%

The 4.4% exceptions
(rows ordered by decreasing evoking power)
Type of RE      VT
pragmatic       56.30%
proper nouns    22.70%
common nouns    16.00%
pronouns        5.00%

Experiment 2: potential to establish correct co-reference links
Compare Linear-k and Discourse-VT-k models:
- For each k, each referring expression re, and each model M (Linear or VT):
\[
p(M\text{-}k, re, DEA_k) =
\begin{cases}
1, & \text{if } re \text{ can be resolved to antecedents in } DEA_k \\
0, & \text{otherwise}
\end{cases}
\]
\[
p(M\text{-}k, Corpus) = \sum_{re \in Corpus} p(M\text{-}k, re, DEA_k)
\]
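A minimal sketch of the potential measure follows. The slide does not spell out how DEA_k is built, so the two window helpers below are an assumed reading: Linear-k takes the k units immediately preceding the unit of the referring expression, and VT-k takes the last k preceding units on its vein. All function names and the sample data are hypothetical.

```python
def linear_window(unit, k):
    """Assumed Linear-k domain: the k units immediately preceding `unit`."""
    return list(range(max(1, unit - k), unit))


def vein_window(unit, vein, k):
    """Assumed VT-k domain: the last k units preceding `unit` on its vein."""
    preceding = [v for v in vein if v < unit]
    return preceding[-k:]


def potential(antecedent_units, dea_k):
    """1 if some antecedent of the referring expression lies in DEA_k, else 0."""
    return 1 if any(a in dea_k for a in antecedent_units) else 0


def corpus_potential(cases):
    """Sum of potentials over (antecedent_units, dea_k) pairs."""
    return sum(potential(antecedents, dea_k) for antecedents, dea_k in cases)


# Toy case: a referring expression in unit 4 whose only antecedents are in unit 1,
# with the vein of unit 4 taken to be 1 2 4.
antecedents = {1}
print(potential(antecedents, linear_window(4, 2)))           # window is [2, 3]
print(potential(antecedents, vein_window(4, [1, 2, 4], 2)))  # window is [1, 2]
```

With the same k, the vein window reaches unit 1 while the linear window does not, which is the kind of difference the experiment is designed to measure.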

Potentials

Experiment 3: the effort required to find antecedents
Compare Linear-k and Discourse-VT-k models:
- For each k, each referring expression re, and each model M (Linear or VT):
\[
e(M\text{-}k, re, DEA_k) =
\begin{cases}
d < k, & \text{the distance between } re \text{ and the closest antecedent in } DEA_k \\
k, & \text{if no such antecedent exists}
\end{cases}
\]
\[
e(M\text{-}k, Corpus) = \sum_{re \in Corpus} e(M\text{-}k, re, DEA_k)
\]
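The effort measure can be sketched in the same style. The slide does not define how the distance d is counted, so this sketch assumes that DEA_k is passed as a list of units ordered from nearest to farthest and that d is the position of the first unit containing an antecedent (0 for the nearest unit), which keeps d below k. The function names and the sample data are hypothetical.

```python
def effort(antecedent_units, dea_k_nearest_first, k):
    """Distance d < k to the closest antecedent in DEA_k, or k if there is none.

    Assumption: `dea_k_nearest_first` lists the units of DEA_k from the
    nearest to the farthest, and d is the index of the first hit.
    """
    for d, unit in enumerate(dea_k_nearest_first[:k]):
        if unit in antecedent_units:
            return d
    return k


def corpus_effort(cases, k):
    """Sum of efforts over (antecedent_units, dea_k_nearest_first) pairs."""
    return sum(effort(antecedents, dea_k, k) for antecedents, dea_k in cases)


# Toy case: referring expression in unit 4, antecedent only in unit 1.
print(effort({1}, [3, 2, 1], 3))  # linear order: units 3 and 2 are searched first
print(effort({1}, [2, 1], 3))     # along an assumed vein 1 2 4: unit 3 is skipped
print(effort({1}, [3, 2], 2))     # no antecedent within the window: effort is k
```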

Effort: an example
[Figure: coreference chains linking the referring expressions of the text below (Michael D. Casey, Genetic Therapy Inc., Johnson & Johnson, M. James Barrett and their later mentions)]
1. Michael D. Casey, a top Johnson&Johnson manager, moved to Genetic Therapy Inc., a small biotechnology concern here,
2. to become its president and chief operating officer.
3. Mr. Casey, 46 years old, was president of J&J's McNeil Pharmaceutical subsidiary,
4. which was merged with another J&J unit, Ortho Pharmaceutical Corp., this year in a cost-cutting move.
5. Mr. Casey succeeds M. James Barrett, 50, as president of Genetic Therapy.
6. Mr. Barrett remains chief executive officer
7. and becomes chairman.
8. Mr. Casey said
9. he made the move to the smaller company.

Efforts

The account of VT on coherence Veins give a natural way to generalize Centering from local to global

Centering Rule 2: transitions
                   Cb(u) = Cb(u-1)    Cb(u) ≠ Cb(u-1)
Cb(u) = Cp(u)      CONTINUING         SMOOTH SHIFT
Cb(u) ≠ Cp(u)      RETAINING          ABRUPT SHIFT
Preference: CON > RET > SSH > ASH
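The four transitions of Rule 2 amount to two equality tests, which the following sketch makes explicit. The function name is hypothetical, and the extra "NO CB" outcome for a unit without a backward-looking center is an assumption borrowed from the smoothness scores used later in the deck.

```python
from typing import Optional


def classify_transition(cb: Optional[str], cb_prev: Optional[str], cp: str) -> str:
    """Classify the transition into unit u.

    cb      -- backward-looking center Cb(u), or None if u has none
    cb_prev -- Cb of the preceding unit, Cb(u-1)
    cp      -- preferred center Cp(u)
    """
    if cb is None:
        return "NO CB"  # assumed extra case, not part of the 2x2 table
    if cb == cb_prev:
        return "CONTINUING" if cb == cp else "RETAINING"
    return "SMOOTH SHIFT" if cb == cp else "ABRUPT SHIFT"


print(classify_transition("J", "J", "J"))
print(classify_transition("b", "b", "J"))
print(classify_transition("p", "b", "p"))
print(classify_transition("b", "J", "B"))
```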

Vein expressions give "lines of argumentation"
[Figure: discourse tree over units 1-5 annotated with vein expressions; the legible labels read V=1 3 5]
1. John sold his bicycle
2. although Bill would have wanted it
3. He obtained a good price for it,
4. which Bill could not have afforded
5. Therefore he decided to use the money to go on a trip.
Reading only units 1, 3 and 5:
1. John sold his bicycle
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.

Lines of argumentation: unit 2 (although Bill would have wanted it)
1. John sold his bicycle
2. although Bill would have wanted it
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.

Lines of argumentation: unit 3 (He obtained a good price for it,)
1. John sold his bicycle
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.

Lines of argumentation: unit 4 (which Bill could not have afforded)
1. John sold his bicycle
3. He obtained a good price for it,
4. which Bill could not have afforded
5. Therefore he decided to use the money to go on a trip.

Lines of argumentation: unit 5 (Therefore he decided to use the money to go on a trip.)
1. John sold his bicycle
3. He obtained a good price for it,
5. Therefore he decided to use the money to go on a trip.

Computation of longest argumentation lines (al)
[Table: columns u, V(u), V(u), dea(u), al]

Evaluating the coherence of a discourse
A smoothness score:
- CONTINUING = 4
- RETAINING = 3
- SMOOTH SHIFT = 2
- ABRUPT SHIFT = 1
- NO Cb = 0
A global smoothness score: the sum of the scores of all units

The second conjecture (on coherence)
The global smoothness score of a discourse computed following VT is at least as high as the score computed following CT.
Segments, as considered by Centering, typically develop along veins, whereas in a linear reading the transitions across segment boundaries are usually abrupt. The claim is therefore that long-distance transitions, computed along veins, are systematically smoother than the accidental transitions at segment boundaries.

Transitions and scores on a linear adjacency metric
(J = [John], b = [John's bicycle], B = [Bill], p = [price], m = [the money], t = [a trip])
Unit     1       2       3          4       5
Cf       J, b    B, b    J, p, b    p, B    J, m, t
Cb       J       b       b          p       -
Trans            ASH     RET        SSH     No Cb
Score            1       3          2       0
Global score: 6/4 = 1.5

Transitions and scores on a hierarchical adjacency metric
Units    1       2
Cf       J, b    B, b
Cb       J       b
Trans            ASH
Score            1
Units    1       3          4
Cf       J, b    J, p, b    p, B
Cb       J       J          p
Trans            CON        SSH
Score            4          2
Units    1       3          5
Cf       J, b    J, p, b    J, m, t
Cb       J       J          J
Trans                       CON
Score                       4
Global score: 11/4 = 2.75
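The two scores for the bicycle example (1.5 on the linear metric, 2.75 on the hierarchical one) can be recomputed with a short sketch. It rests on several assumptions that the slides do not state outright: each Cf list is ordered by rank so that its first element is Cp, Cb(u) is the highest-ranked element of the predecessor's Cf that also occurs in Cf(u), Cb of unit 1 is taken to be J as listed in the tables, and the hierarchical predecessor of a unit is the closest preceding unit on its vein. All names are hypothetical.

```python
# Forward-looking centers per unit, in the order given on the slides.
CF = {1: ["J", "b"], 2: ["B", "b"], 3: ["J", "p", "b"], 4: ["p", "B"], 5: ["J", "m", "t"]}
SCORES = {"CON": 4, "RET": 3, "SSH": 2, "ASH": 1, "NO_CB": 0}
FIRST_CB = "J"  # Cb of unit 1 as listed in the tables (taken as given)


def backward_center(unit, prev):
    """Highest-ranked element of Cf(prev) that also occurs in Cf(unit), else None."""
    for entity in CF[prev]:
        if entity in CF[unit]:
            return entity
    return None


def transition(cb, cb_prev, cp):
    """Centering Rule 2, plus a NO_CB case for units without a Cb."""
    if cb is None:
        return "NO_CB"
    if cb == cb_prev:
        return "CON" if cb == cp else "RET"
    return "SSH" if cb == cp else "ASH"


def average_score(predecessor):
    """Average smoothness over all units that have a predecessor."""
    cb = {1: FIRST_CB}
    total = 0
    for unit in sorted(predecessor):
        prev = predecessor[unit]
        cb[unit] = backward_center(unit, prev)
        total += SCORES[transition(cb[unit], cb[prev], CF[unit][0])]
    return total / len(predecessor)


linear = {2: 1, 3: 2, 4: 3, 5: 4}  # each unit follows the previous one in the text
veins = {2: 1, 3: 1, 4: 3, 5: 3}   # each unit follows its predecessor on the vein

print(average_score(linear))  # 1.5
print(average_score(veins))   # 2.75
```

Only the predecessor map changes between the two runs; the difference in score comes entirely from which unit is treated as adjacent.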

Verifying the second conjecture
Source      No. of transitions   CT score   Average CT score per transition   VT score   Average VT score per transition
English     …                    …          …                                 …          …
French      …                    …          …                                 …          …
Romanian    …                    …          …                                 …          …
Total       …                    …          …                                 …          …

VT references
- Cristea, D.; Ide, N.; Romary, L. (1998): Veins Theory. An Approach to Global Cohesion and Coherence. In Proceedings of Coling/ACL '98, Montreal.
- Cristea, D.; Ide, N.; Marcu, D.; Tablan, M.-V. (2000): Discourse Structure and Co-Reference: An Empirical Study. In Proceedings of The 18th International Conference on Computational Linguistics, COLING'2000, Luxembourg.
- Ide, N.; Cristea, D. (2000): A Hierarchical Account of Referential Accessibility. In Proceedings of The 38th Annual Meeting of the Association for Computational Linguistics, ACL'2000, Hong Kong.
- Sereţan, V.; Cristea, D. (2002): The use of referential constraints in structuring discourse. In Proceedings of The Third International Conference on Language Resources and Evaluation, LREC-2002, Las Palmas.
- Cristea, D. (2005): Motivations and Implications of Veins Theory. In B. Sharp (Ed.), Natural Language Understanding and Cognitive Science, Proceedings of the 2nd International Workshop on Natural Language Understanding and Cognitive Science, NLUCS 2005, in conjunction with ICEIS 2005, Miami, U.S.A., May 2005, INSTICC Press.