Quantitative assessment of second language learners’ fluency in read and spontaneous speech
Catia Cucchiarini, Radboud University Nijmegen

Context
- Research on automatic assessment of oral proficiency in Dutch as a second language

Fluency
- Important construct in the evaluation of second language proficiency
- Also relevant for pathological speech

Fluency
- Frequently applied notion, but not clearly defined
- Various interpretations:
  - Overall language proficiency
  - Oral command of a language
  - Temporal aspect of oral proficiency

Two experiments
- Exp1: read speech
- Exp2: spontaneous speech
- Human fluency judgements are related to objective temporal measures obtained with a continuous speech recognizer (CSR)

Aim of these experiments
- To explore the relationship between objective properties of speech and perceived fluency in read and spontaneous speech, with a view to determining whether such quantitative measures can be used to develop objective fluency tests.

Method: speakers (Exp1)
- 60 non-native speakers
- 3 proficiency levels: beginner (PL1), intermediate (PL2), advanced (PL3)
- different mother tongues
- different gender

Method: speakers (Exp2)
- 60 non-native speakers
- 2 proficiency levels: beginner level (BL), intermediate level (IL)
- different mother tongues
- different gender

Method: speech material (Exp1)
- 2 sets of 5 phonetically rich sentences, read over the telephone

Method: speech material (Exp2)
- Existing test of Dutch as a second language (DSL) → Profieltoets
- 8 items from BL version: short tasks, 15 s to answer; candidates can answer immediately
- 8 items from IL version: long tasks, 30 s to answer; candidates have to reflect to provide motivations

Method: raters
- Exp1: 3 phoneticians (PH), 3 speech therapists (ST1), 3 speech therapists (ST2)
- Exp2: 5 DSL teachers for BL (RBL), 5 DSL teachers for IL (RIL)

Method: automatic scoring
- Speech orthographically transcribed
- CSR: 38 monophones + lexicon
- Viterbi alignment of speech signals and orthographic transcriptions
- Segmentation at phone level

Method: some definitions
- silent pause: a stretch of silence of no less than 200 ms
- dur1 = duration of speech without pauses (s)
- dur2 = duration of speech with pauses (s)

Method: objective measures, primary variables
- art = # phones / dur1
- ros = # phones / dur2
- ptr = 100% * dur1 / dur2
- mlr = mean # phones between 2 pauses
- mlp = mean length of silent pauses
- dsp = total duration of silent pauses / (dur2 / 60)
- #sp = # silent pauses / (dur2 / 60)
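The primary variables can be sketched in Python as follows. This is only an illustrative sketch, not the implementation used in the study. It assumes the phone-level segmentation is available as a list of (label, start, end) segments in seconds, and that silence carries the hypothetical label "sil". It also makes three assumptions the slides do not spell out: leading and trailing silence is ignored, silences shorter than 200 ms count as speech time in dur1, and the stretches before the first and after the last pause count as runs for mlr.

```python
from dataclasses import dataclass

MIN_PAUSE_S = 0.2   # silent pause: silence of no less than 200 ms (from the slides)
SILENCE = "sil"     # hypothetical silence label in the phone segmentation


@dataclass
class Segment:
    label: str    # phone symbol, or SILENCE
    start: float  # seconds
    end: float    # seconds


def is_silent_pause(seg):
    """True if the segment is a silence of at least MIN_PAUSE_S seconds."""
    return seg.label == SILENCE and (seg.end - seg.start) >= MIN_PAUSE_S


def primary_measures(segments):
    """Compute art, ros, ptr, mlr, mlp, dsp and #sp from a phone segmentation."""
    phone_idx = [i for i, s in enumerate(segments) if s.label != SILENCE]
    if not phone_idx:
        raise ValueError("segmentation contains no phones")
    # Assumption: drop silence before the first and after the last phone.
    core = segments[phone_idx[0]:phone_idx[-1] + 1]

    dur2 = core[-1].end - core[0].start               # duration with pauses
    pauses = [s.end - s.start for s in core if is_silent_pause(s)]
    dur1 = dur2 - sum(pauses)                         # duration without pauses
    n_phones = sum(1 for s in core if s.label != SILENCE)

    # Runs: numbers of phones in the stretches delimited by silent pauses.
    runs, current = [], 0
    for s in core:
        if is_silent_pause(s):
            runs.append(current)
            current = 0
        elif s.label != SILENCE:
            current += 1
    runs.append(current)

    per_minute = 60.0 / dur2                          # same as dividing by dur2 / 60
    return {
        "art": n_phones / dur1,
        "ros": n_phones / dur2,
        "ptr": 100.0 * dur1 / dur2,
        "mlr": sum(runs) / len(runs),
        "mlp": sum(pauses) / len(pauses) if pauses else 0.0,
        "dsp": sum(pauses) * per_minute,
        "sp": len(pauses) * per_minute,
    }


# Made-up toy utterance: two phones, a 0.5 s pause, then four phones.
example = [
    Segment("a", 0.0, 0.25), Segment("b", 0.25, 0.5),
    Segment(SILENCE, 0.5, 1.0),
    Segment("c", 1.0, 1.25), Segment("d", 1.25, 1.5),
    Segment("e", 1.5, 1.75), Segment("f", 1.75, 2.0),
]
measures = primary_measures(example)
```

For this toy input, dur2 is 2.0 s and dur1 is 1.5 s, so the six phones give an art of 4 phones/s and a ros of 3 phones/s.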

Method: objective measures, secondary variables
- #fp = # filled pauses / (dur2 / 60)
- #disf = # disfluencies / (dur2 / 60)
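The two secondary variables are per-minute rates, which can be sketched as below. This is a minimal sketch assuming the filled pauses and disfluencies have already been annotated as a list of event labels. The label strings ("fp", "repetition", "repair", "restart") and the function names are hypothetical, chosen here for illustration only.

```python
FILLED_PAUSE = "fp"                                      # hypothetical label
DISFLUENCY_LABELS = {"repetition", "repair", "restart"}  # hypothetical labels


def rate_per_minute(count, dur2_seconds):
    """Normalise a raw count to a per-minute rate: count / (dur2 / 60)."""
    if dur2_seconds <= 0:
        raise ValueError("dur2 must be positive")
    return count / (dur2_seconds / 60.0)


def secondary_measures(events, dur2_seconds):
    """Compute #fp and #disf from annotated event labels and dur2 (seconds)."""
    n_fp = sum(1 for e in events if e == FILLED_PAUSE)
    n_disf = sum(1 for e in events if e in DISFLUENCY_LABELS)
    return {
        "fp": rate_per_minute(n_fp, dur2_seconds),
        "disf": rate_per_minute(n_disf, dur2_seconds),
    }


# Made-up annotation of a 30 s answer: 2 filled pauses and 3 disfluencies.
events = ["fp", "repetition", "fp", "repair", "restart"]
rates = secondary_measures(events, 30.0)
```

Dividing by dur2 / 60 makes answers of different lengths comparable, which matters when task durations differ, as with the 15 s and 30 s items in Exp2.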

Method: fluency ratings
- Sentences scored on fluency on a ten-point scale
- Raters received no special training

Method: rating procedure
- Exp1: each group of raters judged speakers of different proficiency levels
- Exp2: each group of raters judged speakers of the same proficiency level

Results: reliability

Results: raw fluency ratings

Results: objective measures

Results: disfluencies
- Repetitions: exact repetitions of words
- Repairs: corrections
- Restarts: repetitions of initial parts of words

Results: disfluencies

Results: correlations

Discussion
- Reliable fluency scoring is possible
- Fluency scores are related to the task performed
- Role of objective variables in read speech (rs) and spontaneous speech (ss):
  - similarities: weak relation between secondary variables and fluency
  - differences: varying roles of primary variables

Discussion: read speech
- strong relation: art, ros, ptr, #sp, dsp, mlr
- weaker relation: mlp
- → for perceived fluency, pause frequency is more important than pause length
- → two factors are important for fluency in read speech: articulation rate and pause frequency

Discussion: spontaneous speech
- strong relation: ros, ptr, #sp, dsp, mlr
- weaker relation: art, mlp
- → possibly the higher frequency of pauses masks the importance of art
- → fluency in spontaneous speech is particularly related to variables that contain information on pause frequency

Conclusions
- Reliable fluency scoring by human raters is possible
- Objective fluency scoring is possible
- Fluency scores vary with speech type
- Fluency scores vary with task performed

Conclusions
- Read speech: fluency scores are strongly related to art and pause frequency
- Spontaneous speech: fluency scores are strongly related to pause frequency and distribution
- Expert fluency ratings can be predicted more accurately from objective measures in read speech than in spontaneous speech

Conclusions
- Temporal measures of fluency may be used to develop objective fluency tests
- The selection of variables should depend on the material investigated and the task performed