Anaphoric Third Person Pronouns and Prosodic Features as Markers of Cohesion in English Spoken Discourse: A Corpus Study Cyril Auran Laboratoire Parole.

Presentation transcript:

Anaphoric Third Person Pronouns and Prosodic Features as Markers of Cohesion in English Spoken Discourse: A Corpus Study Cyril Auran Laboratoire Parole et Langage CNRS UMR Université de Provence 6th NWCL International Conference Prosody and Pragmatics – Preston, November 14th-16th 2003

Oh no, not another study on anaphora …
Anaphora: a much-studied phenomenon
- numerous fields of research: syntax, semantics, pragmatics and language philosophy, psycholinguistics, prosody
- several related issues: referent attribution, referent accessibility, discourse function

Well, yes, yet another one, but …
This study focuses on:
- discourse anaphora
- anaphora and its role in the organisation of discourse
- the interaction between anaphora and prosodic markers of discourse organisation

Well, yes, yet another one, but …
Central issue: interaction between discourse cohesion markers in British English
More precisely: how do anaphoric pronouns influence resetting phenomena in the marking of discourse cohesion?

Summary
1. Views of discourse
- discourse as product and process
- a unified approach to discourse
2. Cohesion, connectivity and coherence
- Different approaches to the unity of discourse
- Anaphoric pronouns and resetting phenomena as markers of cohesion
3. Corpus study
- The Aix-MARSEC Corpus
- Data extraction and analysis
- Results and discussion
Conclusions and perspectives

Part I: Two views of discourse

Two views of discourse
Linguistic studies on discourse tend to fall into two categories (Brown & Yule, 1983; Di Cristo et al., 2003):
text-as-product view or grammatical approach
- discourse as a structured text
- main characteristic: cohesion of a set of sentences or utterances
discourse-as-process or cognitive-pragmatic approach
- focus on the elaboration and the processing of situated discourse
- main characteristic: coherence of the cognitive representations triggered by discourse

Two views of discourse
Di Cristo et al.: a broad and unified approach to discourse
Discourse analysis = study of the relations between forms and functions within an interpretative framework
Segmentation strategies:
- Grammatical units
- Conceptual units
- Discourse units
- Contextualisation activities
Clause (Miller & Weinert, 1998): both a formal and a pragmatic entity (evolution of discourse memory, cf. Berrendonner & Reichler-Béguelin, 1989)
Topics

Part II: Cohesion, connectivity and coherence

Cohesion, connectivity and coherence
Different approaches but the same central issue: discourse unity
Halliday & Hasan (1976):
- a text is characterised by its texture, based on cohesion
- cohesion presented as a semantic concept relying on the interpretation of elements of the text
- but focus on the (formal) linguistic expressions (ties)
Charolles (1988) (inspired by De Beaugrande & Dressler, 1981): several parameters used to account for discourse unity
- cohesion: redefined as "the marking of relations between utterances or utterance constituents" (p. 53, our translation)
- connectivity: logical-semantic relations (marked by connectives) between propositions and speech acts
- coherence: interpretability of discourse: "Coherence is not a characteristic of texts [...]. The need for coherence, on the contrary, is a sort of a-priori mode of discourse reception"

Cohesion, connectivity and coherence
In this study we focus on the marking of cohesion through the use of:
- Anaphoric third person pronouns and possessive adjectives (he/she/they, him/her/them, his/her/their)
- Pitch resetting phenomena (high onset pitch values at the beginning of tone groups)

Cohesion, connectivity and coherence
Anaphoric pronouns and cohesion
Some of the most typical discourse cohesion marks:
- endophoric personal referents (Halliday & Hasan, 1976)
- members of anaphoric chains (cf. Chastain, 1975)
- expressions pointing to highly accessible referents (cf. for instance Ariel's or Gundel's work and Grosz & Sidner's Centering Theory)
Anaphoric pronouns permit the thematic preservation (Daneš, 1974) necessary for discourse to be cohesive

Cohesion, connectivity and coherence
Resettings and cohesion:
Topic-shifts in spoken discourse are prosodically marked as the boundaries of structural units of spoken discourse, which take the form of speech paragraphs and have been called paratones (Brown & Yule, 1983).
No strict hierarchy view (cf. Hirst, 1998) but some kind of hierarchic structure (cf. the minor vs. major tone group opposition in the (MAR)SEC corpus).
Phonetic features:
- major unit beginning: extra high (F0) onset values, i.e. pitch reset or resetting (Brown & Yule, 1983; Wichmann, 2000; Couper-Kuhlen, 2001)
- major unit end: very low pitch, loss of amplitude, lengthy pauses (Brown & Yule, 1983) and creaky voice (Wichmann, 2000)

Cohesion, connectivity and coherence
Effects of cohesion markers:
- More anaphoric marks → more cohesion
- Lower resettings → more cohesion

Part III: Corpus study

Corpus study
The Aix-MARSEC Corpus
An evolution from the SEC and MARSEC corpora: SEC (Spoken English Corpus), MARSEC (Machine Readable SEC), Aix-MARSEC
- 55,000 words, 339 min. and 18 sec.
- BBC 1980s recordings
- 11 speaking styles
- 53 speakers (17 female and 36 male)
- Orthographic transcription
- Prosodic annotation: 14 tonetic stress marks (TSM)
- Alignment of words and tone groups with the signal
- Conversion of all the TSM to ASCII characters
- Automatic grapheme-to-phoneme conversion
- Automatic phoneme level alignment
- Automatic intonation annotation using the Momel-Intsint methodology
- 8 annotation levels aligned: phonemes, syllable constituents, syllables, words, feet and rhythmic units, tone groups

Corpus study
Data extraction and analysis (1)
Extraction of onset F0 values for all the tone groups which contained either a third person anaphoric pronoun or a connective.
The whole of the Aix-MARSEC corpus was used, except for the E type of recordings (Daily Service), the quality of which could not guarantee accurate F0 detection.
Data extraction: Perl scripts on Aix-MARSEC Praat TextGrids
Data analysis: R software
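The selection step on this slide can be sketched in Python (the study itself used Perl scripts on Praat TextGrids). This is only an illustration under stated assumptions: tone groups are assumed to be already loaded into plain dictionaries, the keys `recording_type`, `words` and `onset_f0` are hypothetical, and the sample values are made up. Only the pronoun list comes from the presentation. The sketch ignores connectives and does not check whether a pronoun is actually anaphoric.

```python
# Third person pronouns and possessive adjectives listed in the presentation.
PRONOUNS = {"he", "she", "they", "him", "her", "them", "his", "their"}


def has_pronoun(words):
    """Return True if any word of the tone group is one of the listed forms."""
    return any(word.lower() in PRONOUNS for word in words)


def select_onsets(tone_groups, excluded_types=("E",)):
    """Keep tone groups from usable recordings and flag pronoun presence.

    Each tone group is a dict with the hypothetical keys
    'recording_type', 'words' and 'onset_f0' (in Hz).
    """
    rows = []
    for tone_group in tone_groups:
        if tone_group["recording_type"] in excluded_types:
            continue  # type E (Daily Service) is left out, as in the study
        rows.append({
            "onset_f0": tone_group["onset_f0"],
            "anaphoric": has_pronoun(tone_group["words"]),
        })
    return rows


# Made-up sample data, for illustration only.
sample = [
    {"recording_type": "A", "words": ["she", "left", "early"], "onset_f0": 210.0},
    {"recording_type": "A", "words": ["the", "train", "was", "late"], "onset_f0": 180.0},
    {"recording_type": "E", "words": ["they", "sang"], "onset_f0": 150.0},
]
rows = select_onsets(sample)
print(rows)
```

Each returned row pairs an onset F0 value with a presence/absence flag, which is the shape needed for the anaphoric marker factor of the experimental design.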

Corpus study
Data extraction and analysis (2)
Experimental design:
- one dependent variable: onset F0 value
- 2 independent variables: type of tone group (major vs. minor); anaphoric marker (presence vs. absence)
F0 values automatically measured on the modelled curve (Momel methodology: Di Cristo & Hirst, 1986; Hirst et al., 2000) for the first stressed syllable within a tone group (cf. Wichmann, 2000)
Total: 12,272 values
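The results later in the presentation are reported as ANOVA F values per factor. As a rough illustration of what such a statistic measures, the sketch below computes a textbook one-way F ratio for a single two-level factor. It is not the author's R analysis, the function name `one_way_f` is hypothetical, and the log F0 values are made up.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    k = len(groups)        # number of factor levels
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up log onset F0 values for major and minor tone groups.
major = [5.4, 5.5, 5.6]
minor = [5.1, 5.2, 5.3]
f_value = one_way_f([major, minor])
print(f_value)
```

A large F value means that the difference between the group means is large compared with the spread inside each group.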

Corpus study
Data extraction and analysis (1)
Even after logarithmic transform, the distribution of onset F0 values significantly diverged from a normal distribution.
All ANOVA results were checked using two-sample Kolmogorov-Smirnov tests (KST) during transitive and intransitive binary comparisons.
[Table: kurtosis and skewness of the raw distribution, the log transformed distribution and the normal distribution; values not preserved in the transcript]
Shapiro-Wilk normality test: W= / p < 2.2e-16
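The quantities named on this slide can be illustrated with a small pure-Python sketch: moment-based skewness and kurtosis before and after a log transform, and the two-sample Kolmogorov-Smirnov statistic D. The function names and the F0 values are assumptions made for illustration. The sketch computes only D, not the p-value of the test, and it does not reproduce the Shapiro-Wilk test.

```python
import math
from bisect import bisect_right


def central_moment(values, order):
    """Population central moment of the given order."""
    m = sum(values) / len(values)
    return sum((x - m) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness (0 for a symmetric distribution)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-based kurtosis (3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def ks_statistic(a, b):
    """Largest gap between the empirical distribution functions of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


# Made-up, right-skewed onset F0 values in Hz.
raw = [100.0, 110.0, 120.0, 150.0, 300.0]
log_values = [math.log(v) for v in raw]
print(skewness(raw), skewness(log_values))
print(kurtosis(raw), kurtosis(log_values))
print(ks_statistic([1, 2, 3], [4, 5, 6]))
```

In this toy sample the log transform reduces the skewness without removing it, which mirrors the situation described on the slide, where a non-parametric check was used alongside the ANOVA.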

Corpus study
Results: Tone Group factor
Significant effect
ANOVA: F=513.7, p<2e
ST difference
Hierarchically higher units have higher onset values
Lower onset values correspond to minor (i.e. more cohesive) units
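The results are expressed as differences in semitones (ST). A minimal sketch of the standard conversion from two F0 values in Hz to an interval in semitones is given below. The function name and the example frequencies are illustrative and are not results from the study.

```python
import math


def semitone_difference(f0_a, f0_b):
    """Interval in semitones from f0_b up to f0_a (both in Hz)."""
    return 12 * math.log2(f0_a / f0_b)


# Illustrative values only: a doubling of F0 is one octave, i.e. 12 semitones.
print(semitone_difference(200.0, 100.0))
print(semitone_difference(210.0, 180.0))
```

Because the scale is logarithmic, the same semitone difference corresponds to a larger difference in Hz for a high-pitched voice than for a low-pitched one.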

Corpus study
Results: Anaphora factor
Significant effect
ANOVA: F=54.94, p=1.32e
ST difference
Anaphoric markers of cohesion do influence resetting phenomena
"Anaphoric" units have higher onset values

Corpus study
Discussion
A paradoxical effect?
- Anaphora → more cohesion
- Higher resettings → less cohesion
Constant resulting degree of cohesion

Corpus study
Discussion
A closer look at resetting phenomena
Resetting phenomena:
- Discourse constraints: more cohesion → lower values
- Planning and production constraints: declination → higher values

Corpus study
Discussion
Interaction with anaphora
Anaphora: anaphoric markers
Resetting phenomena:
- Discourse constraints: more cohesion → lower values
- Planning and production constraints: declination → higher values

Conclusions and perspectives

Conclusion and Perspectives
Conclusion …
Markers of cohesion seem to interact in complex ways
More particularly, anaphoric markers of cohesion influence resetting phenomena
This is an argument in favour of a unified approach to discourse, taking into account both:
- the cognitive and pragmatic processes involved in it
- and their actual realisations in its linguistic product

Conclusion and Perspectives
… and perspectives
Delicate results:
- Statistical correlations / causality relations
- Numerous other factors
Perspectives:
- Distinction between sentential and discourse markers
- Speaker-normalised data
- Other conceptions of resetting phenomena (as a differential value rather than an absolute one)
- Analyses taking into account both anaphoric markers and connectives (cf. Auran & Hirst, submitted)
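Two of the perspectives listed here, speaker-normalised data and resetting as a differential value, can be illustrated with a short sketch. Both functions are assumptions made for illustration, since the presentation does not specify a method. The first expresses each onset in semitones relative to the speaker's own geometric mean F0. The second reads "differential" as the interval between a tone group's onset and the final F0 of the preceding unit, which is only one possible interpretation. All values are made up.

```python
import math
from collections import defaultdict


def semitones_re_speaker_mean(records):
    """Express each onset F0 in semitones relative to its speaker's
    geometric mean F0. `records` is a list of (speaker, f0_hz) pairs."""
    logs = defaultdict(list)
    for speaker, f0 in records:
        logs[speaker].append(math.log2(f0))
    # Mean of log2(F0) per speaker, i.e. the log of the geometric mean.
    reference = {spk: sum(v) / len(v) for spk, v in logs.items()}
    return [(spk, 12 * (math.log2(f0) - reference[spk]))
            for spk, f0 in records]


def differential_reset(previous_final_f0, onset_f0):
    """Hypothetical differential reset: interval in semitones between the
    final F0 of the preceding unit and the onset of the current one."""
    return 12 * math.log2(onset_f0 / previous_final_f0)


# Made-up onsets for two speakers with different pitch ranges.
records = [("A", 100.0), ("A", 200.0), ("B", 150.0), ("B", 300.0)]
normalised = semitones_re_speaker_mean(records)
print(normalised)
print(differential_reset(100.0, 200.0))
```

After normalisation the two speakers' onsets fall on the same scale even though their raw F0 values differ, so onset values can be compared across male and female voices.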

Thank you for your attention! ;o)
Presentation available from
Details on the Aix-MARSEC project available from

Corpus study
ASCII prosodic annotation symbols (Roach, 1994):
_ low level
~ high level
< step-down
> step-up
/ (high) rise-fall /high \high fall fall-rise /high rise,low rise low fall,\(low rise-fall – not used) \,low fall-rise
* stressed but unaccented
| minor intonation unit boundary
|| major intonation unit boundary