Hungarian Academy of Sciences
Comparison of the nonverbal ways of marking the discourse relation of concession and the functions of lexical search and approximation in Hungarian dialogues – Relations and correspondences between discourse-pragmatic functions (concession versus lexical search) and nonverbal features of the Hungarian DM mondjuk ('let's say'; 'although')
Ágnes Abuczki, Hungarian Academy of Sciences, February 2015

Goals, outline
An empirical study on a multifunctional (polysemous) Hungarian lexical item (mondjuk, meaning "let's say" or "although, it must be added that") that operates at the discourse level and marks relationships between discourse units as well as the attitude of the speaker → a discourse marker (henceforth: DM).
Two salient functions of the item are described in terms of multimodal features in order to enhance its meaning disambiguation.
Questions: Is there a significant relation/correspondence between the discourse-pragmatic function of a DM and
- the manual gesticulation,
- the facial expression,
- the gaze direction of the speaker,
- the duration of the DM,
- the pause preceding the DM?

Material
Multimodal HuComTech corpus (Hungarian only): 22 .eaf files (involving audio, video, multimodal pragmatic and automatic prosodic annotation) = 22 informal conversations with 22 interviewees.
Number of DM tokens segmented: mondjuk (~'let's say'): 208

Most salient functions of the selected DM
The 2 most salient functions of mondjuk ('let's say'):
- LXS_APPR = lexical search + approximation (46 tokens);
- CON = concession (41 tokens).

Most salient functions of mondjuk ('let's say')
- Marker of lexical search + approximation (abbreviated as LXS; can be glossed as about, like):
„gyorsan megy a motorom mondjuk 120–140-nel” ('my bike is really fast, it can do DMmondjuk 120–140 km/h') (HuComTech, 017_I)
- Marker of concession (abbreviated as CON; can be glossed as although, but):
„szeretek a belvárosban élni mondjuk elég nagy a szmog” ('I like living in the city centre DMmondjuk the air is polluted') (HuComTech, 019_I)

Methods
Segmentation of the selected word; tagging of its functions.
Low-level prosodic and temporal features (durations and preceding pauses) were extracted from the segmented sound files (.wav) using Praat and Prosogram scripts.
The nonverbal-visual features (gaze direction, facial expression, hand gestures) of the speaker's behaviour were extracted from the manually performed video annotations of the recordings and can be queried automatically using the ELAN software.
The queries on the relation of each function and each nonverbal feature were run separately and were ultimately joined in contingency tables for statistical analysis.
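The manual annotations live in ELAN's .eaf XML format, so segmented DM tokens can also be extracted programmatically. A minimal sketch using only the Python standard library — the tier name and the toy file contents below are illustrative, not the actual HuComTech annotation scheme:

```python
import xml.etree.ElementTree as ET

def dm_tokens(eaf_xml, tier_id):
    """Extract (value, start_ms, end_ms) triples for one tier of an
    ELAN .eaf document."""
    root = ET.fromstring(eaf_xml)
    # TIME_ORDER maps symbolic time-slot ids to millisecond offsets
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
             for ts in root.iter("TIME_SLOT")}
    tokens = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            tokens.append((ann.findtext("ANNOTATION_VALUE", default=""),
                           slots[ann.get("TIME_SLOT_REF1")],
                           slots[ann.get("TIME_SLOT_REF2")]))
    return tokens

# a toy .eaf fragment with a single segmented DM token
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1350"/>
  </TIME_ORDER>
  <TIER TIER_ID="DM">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>mondjuk</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

tokens = dm_tokens(SAMPLE, "DM")
```

In practice one would read each of the 22 files with `ET.parse()` and collect the tokens per speaker before running the overlap queries.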

Methods: Segmentation and annotation in ELAN (screenshot)

The relation of function and manual gesticulation
Prior to the queries, I expected to find correspondences between discourse functions and hand movements. I counted as manual gesticulation:
- any handshape type annotated other than the default handshape of the actual speaker (most common default type: half-open flat hands);
- any handshape change while uttering a DM.
I queried the relation of hand gesticulation and each of the salient functions of the DM one by one in separate queries (with the 'Find overlapping labels' command), and then joined them in contingency tables for statistical analysis in SPSS 19.0.
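ELAN's 'Find overlapping labels' query is essentially an interval-intersection test. A rough Python stand-in for the same check — the interval values below are invented for illustration:

```python
def overlapping(dm_intervals, gesture_intervals):
    """Return the DM tokens whose time span overlaps at least one
    gesture annotation (a rough stand-in for ELAN's
    'Find overlapping labels' query). Intervals are (label, start, end)."""
    return [(lab, s, e)
            for lab, s, e in dm_intervals
            # two intervals overlap iff each starts before the other ends
            if any(s < ge and gs < e for _, gs, ge in gesture_intervals)]

# invented annotation intervals, in milliseconds
dms = [("mondjuk", 1000, 1350), ("mondjuk", 5000, 5200)]
gestures = [("beat", 900, 1100), ("hold", 6000, 6500)]
hits = overlapping(dms, gestures)
```

Counting hits and misses per function yields exactly the cell counts needed for the contingency tables.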

The relation of pragmatic functions and hand gestures (mondjuk – 'let's say')
Initial observation and hypothesis: lexical search co-occurs with gesticulation. Contrary to this expectation, gesticulation turned out to accompany concession rather than lexical search, and the association is significant (χ²(1) = 12.442, p < 0.01).
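For a 2×2 contingency table (function × gesticulation present/absent), Pearson's chi-square has a simple closed form. A sketch with illustrative counts — the slide reports only the statistic, not the raw cell counts:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson's chi-square statistic for a 2x2 contingency table
         [[a, b],
          [c, d]]
    e.g. rows = DM function, columns = gesticulation yes/no."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

chi2_balanced = chi_square_2x2(10, 10, 10, 10)  # perfectly balanced table
chi2_skewed = chi_square_2x2(20, 10, 10, 20)    # visible association
```

A balanced table gives a statistic of 0 (no association); the skewed example exceeds the 1-df critical value of 3.84, so it would be significant at p < 0.05.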

The relation of pragmatic functions and gaze direction (mondjuk – 'let's say')
Typical marker of concession: averted gaze during the DM; not significant (p > 0.05).
Typical marker of lexical search and approximation: upwards gaze; not significant (p > 0.05).

The relation of pragmatic functions and facial expressions (mondjuk – 'let's say')
A 'recalling' affect display during mondjuk_LXS_APPR.

Comparison of the durations of a DM expressing different functions
Why was this analyzed? I expected that the different functions are realized with different durations.
Method: queries were run by a Praat script in order to measure the duration of the individual DM tokens performing the two most salient functions, and to save them in a spreadsheet file.
Representation of results: box-and-whisker plots, showing the median and variation of the durations.

Duration and pragmatic function
My hypothesis about the duration of the various functions of this DM: mondjuk ('let's say') expressing lexical search and approximation is expected to be realized with a longer duration than mondjuk expressing concession.

Distribution of the durations of DMs with different functions
Independent-samples t-test on mondjuk ('say'): significant.
Independent-samples t-test on ugye ('is that so?'): not significant.
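An independent-samples t-test on the two duration sets can be sketched as follows. This uses Welch's unequal-variance variant (a common default; the slides do not say which variant was run), and the duration values are invented, not the corpus measurements:

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Two-sample t statistic with unequal variances (Welch's test),
    e.g. for comparing DM durations across the two functions.
    statistics.variance is the sample (n-1) variance, as Welch requires."""
    return (mean(x) - mean(y)) / sqrt(variance(x) / len(x)
                                      + variance(y) / len(y))

# hypothetical durations in seconds, NOT the corpus measurements
lxs_appr = [0.31, 0.28, 0.35, 0.30]   # lexical search / approximation
con = [0.22, 0.20, 0.25, 0.21]        # concession
t_stat = welch_t(lxs_appr, con)
```

A positive t statistic points in the hypothesized direction (lexical-search tokens longer than concessive ones); significance would then be read off the t distribution with the Welch-Satterthwaite degrees of freedom.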

Silence annotation in Praat
Silence annotation was performed after the segmentation of the DMs in order to test the hypothesis that DMs are predominantly separated by pauses (as they are often described in the literature). The phonetic parameters set for automatic silence annotation were as follows:
- minimum pitch: 100 Hz (subtract mean)
- time step: automatic (0.01 s)
- silence threshold: −45 dB
- minimum silent interval duration: 0.15 s
- minimum sounding interval duration: 0.05 s
As a result, the annotation segmented the recordings into sounding and silent segments.
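Praat's automatic silence detection ('To TextGrid (silences)') thresholds the intensity contour relative to its peak and then discards intervals shorter than the minimum durations. The two-pass logic behind the parameters above can be sketched in pure Python — the frame intensities here are synthetic, and Praat's actual intensity computation (windowing, pitch floor) differs:

```python
def segment_silences(intensity_db, time_step=0.01, threshold_db=-45.0,
                     min_silent=0.15, min_sounding=0.05):
    """Two-pass silence segmentation: threshold each frame relative to
    the peak intensity, then absorb intervals that are too short."""
    peak = max(intensity_db)
    labels = ["silent" if db - peak < threshold_db else "sounding"
              for db in intensity_db]
    # pass 1: collapse runs of identical labels into intervals
    intervals, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            intervals.append([labels[start], start * time_step, i * time_step])
            start = i
    # pass 2: intervals shorter than their minimum duration are merged
    # into the preceding interval (a simplification of Praat's behaviour)
    merged = []
    for lab, s, e in intervals:
        min_dur = min_silent if lab == "silent" else min_sounding
        if e - s < min_dur and merged:
            merged[-1][2] = e
        else:
            merged.append([lab, s, e])
    return [tuple(iv) for iv in merged]

# 0.2 s of speech, 0.2 s of silence, 0.2 s of speech (10 ms frames)
frames = [60.0] * 20 + [5.0] * 20 + [60.0] * 20
segments = segment_silences(frames)
```

The 0.2 s silent stretch survives because it exceeds the 0.15 s minimum; a silent dip of, say, 0.1 s would be absorbed into the surrounding sounding interval.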

Silence annotation in Praat
The difference between the two categories was not found significant by Pearson's chi-square test (p > 0.05).

Multiple layer searches in ELAN

Conclusions: prototypical sets of features of the canonical uses of mondjuk ('say') performing its two different functions

Feature             | Lexical search, approximation | Concession
--------------------|-------------------------------|-------------------
HAND GESTURES       | no                            | yes
GAZE DIRECTION      | upwards                       | other than upwards
FACIAL EXPRESSION   | recall                        | other than recall
DURATION            | > 250 ms                      | < 250 ms
PRECEDING PAUSE     | < 150 ms                      | > 150 ms

References
L. Hunyadi, I. Szekrényes, A. Borbély, H. Kiss, "Annotation of spoken syntax in relation to prosody and multimodal pragmatics," in Proceedings of the 3rd Cognitive Infocommunications Conference. Kosice: IEEE Conference Publications, 2012, pp. 537–541.
B. Fraser, "Topic orientation markers," Journal of Pragmatics, vol. 41, 2009, pp. 892–898.
W. Chafe, "Consciousness and language," in Cognition and Pragmatics (Handbook of Pragmatics Highlights), D. Sandra, J. Östman, J. Verschueren, Eds. Amsterdam/Philadelphia: John Benjamins, 2009, pp. 135–145.
P. Boersma and D. Weenink, Praat: doing phonetics by computer, version 5.0.02. University of Amsterdam: Institute of Phonetic Sciences, 2007. http://www.praat.org
http://www.physics.csbsju.edu/stats/exact_NROW_NCOLUMN_form.html
This research was supported by the European Union and the State of Hungary, co-financed by the European Social Fund in the framework of TÁMOP 4.2.4. A/2-11-1-2012-0001 'National Excellence Program'.

Thank you for your attention.