TBALL ASR Work Summary USC group: Joseph Tepperman, Matt Black, Abe Kazemzadeh, Matteo Gerosa, Sungbok Lee, Shri Narayanan.

ASR contributions of this year
Disfluency detection
– Will be presented at InterSpeech 2007
Letter name and letter sound verification
Bayesian Network approach for pronunciation scoring
– Will be presented at InterSpeech 2007

Disfluency Evaluation
Figure 1: which types of disfluencies are rated more or less fluent.
Table 2: which matters more, fluency or accuracy.

Disfluency Detection
Figure 2: grammar.
Figure 3: classification algorithm.
Table 3: results for different disfluency types (9% false alarm).
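The slides do not say how the 9% false alarm figure was computed. As a minimal sketch, assuming the usual definition (false alarms divided by all truly fluent items), detection and false alarm rates could be derived from per-item binary labels as follows. The function name and the toy labels are hypothetical, not taken from the TBALL system.

```python
def detection_stats(reference, hypothesis):
    """Compare reference and detector labels (True = disfluent).

    Returns (detection_rate, false_alarm_rate). Assumes the common
    definitions: hits / all disfluent items, and false alarms /
    all fluent items. The slides do not confirm these definitions.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    hits = misses = false_alarms = correct_rejections = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref and hyp:
            hits += 1
        elif ref and not hyp:
            misses += 1
        elif not ref and hyp:
            false_alarms += 1
        else:
            correct_rejections += 1
    disfluent = hits + misses
    fluent = false_alarms + correct_rejections
    detection_rate = hits / disfluent if disfluent else 0.0
    false_alarm_rate = false_alarms / fluent if fluent else 0.0
    return detection_rate, false_alarm_rate


# Toy labels for illustration only (not TBALL data).
ref = [True, True, False, False, False, False]
hyp = [True, False, True, False, False, False]
print(detection_stats(ref, hyp))
```

Reporting both rates together matters because a detector can trivially lower its false alarms by flagging fewer items overall.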

Letter name and letter sound verification
Letter name verification
– Using word (letter) level models
– 82% recognition accuracy, ~90% verification.
– Need new performance measurement with new/noisy data
Letter sound verification
– Baseline of using regular monophone models still outperforms experimental techniques.
– New approach: segment the audio as a preprocessing step to deal with repetitions.

Bayesian Network approach for pronunciation scoring

Classifier performance over different feature sets

Bayes Nets Results

Question and Answering Evaluation
An interesting task because a child may answer in different ways (specifically, generally, or by making up an answer out of the blue).
The task can also be thought of as the bridge between mechanical reading and comprehension.
Difficulty getting people to do the evaluation.
– We had to resort to a gimmick….

The TBALL Hall of Fame: thanks to those who completed the evaluation!
Joe Tepperman
Jan Powell
Matt Black
Abe Kazemzadeh
Patti Price
Mia Callahan
Christy Boscardin

Evaluation Results
Out of the possible pairings between evaluators, 8000 agreed and 3136 didn't.
– Approx. 72% agreement.
Still much analysis to do.
Do the evaluation... it's fun.
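The agreement figure follows from the two counts: 8000 / (8000 + 3136) ≈ 0.718. As a rough sketch of how such pairwise counts might be tallied, the snippet below compares every pair of evaluators who rated the same item. The data layout and function names are assumptions for illustration, not the actual evaluation code.

```python
from itertools import combinations


def pairwise_counts(ratings):
    """Count agreeing and disagreeing evaluator pairs.

    `ratings` maps an item id to a dict of {evaluator: label}.
    This layout is hypothetical; the slides do not describe one.
    """
    agree = disagree = 0
    for labels in ratings.values():
        for a, b in combinations(labels.values(), 2):
            if a == b:
                agree += 1
            else:
                disagree += 1
    return agree, disagree


def agreement_rate(agree, disagree):
    """Fraction of evaluator pairs that gave the same label."""
    total = agree + disagree
    return agree / total if total else 0.0


# Counts reported on the slide.
print(round(agreement_rate(8000, 3136), 3))  # 0.718
```

Raw pairwise agreement does not correct for chance, so a chance-corrected statistic may be worth adding in the further analysis the slide mentions.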

SQLite
This is a handy database tool that I found out about recently.
– I used it for the web evaluation for question answering.
– It's used in a number of well-known programs (e.g., preferences in Firefox, Google Desktop widgets).
I thought it might be ideal for the database on the laptops.
– No installation, just an executable (or imported libraries).
– Command-line or GUI usage.
– Bindings for common languages using the DBI interface.
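To illustrate the no-installation point, here is a minimal sketch using Python's standard-library `sqlite3` module, which follows Python's DB-API rather than the Perl DBI mentioned on the slide. The table name, column names, and sample rows are hypothetical. The slides do not describe the actual schema used for the web evaluation.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical ratings table."""
    conn = sqlite3.connect(":memory:")  # pass a file path to persist data
    conn.execute(
        """
        CREATE TABLE ratings (
            evaluator TEXT    NOT NULL,
            item_id   INTEGER NOT NULL,
            score     INTEGER NOT NULL,
            PRIMARY KEY (evaluator, item_id)
        )
        """
    )
    rows = [
        ("eval1", 1, 1),
        ("eval2", 1, 1),
        ("eval1", 2, 0),
        ("eval2", 2, 1),
    ]
    conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def count_pairs(conn):
    """Return (agreeing_pairs, total_pairs) across evaluators per item."""
    query = """
        SELECT COALESCE(SUM(a.score = b.score), 0), COUNT(*)
        FROM ratings AS a
        JOIN ratings AS b
          ON a.item_id = b.item_id AND a.evaluator < b.evaluator
    """
    agree, total = conn.execute(query).fetchone()
    return agree, total


conn = build_demo_db()
print(count_pairs(conn))
conn.close()
```

The self-join with `a.evaluator < b.evaluator` visits each evaluator pair once per item, so the same database that stores responses can also produce the pairwise agreement counts.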

Discussion & Further studies
Next steps:
– Analyze the outcomes of the case study.
– Improve the robustness of the existing algorithms for verification and disfluency detection.
– Further improve the current automatic scoring method.
– Make use of all the data we have collected for analysis and improvements, including "comprehension" of spontaneous speech (i.e., open-ended question evaluation).