You’re Not From ‘Round Here, Are You? Naïve Bayes Detection of Non-native Utterance Text
Laura Mayfield Tomokiyo, Rosie Jones
Carnegie Mellon University

Overview
- Motivation
- Speech data
- Accent detection as document classification
- Classification performance
- Discriminative tokens
- Conclusions

Non-native speech recognition
The warship U.S.S. Jarrett has pulled into port in San Diego, CA after training voyage
Native recognizer (word accuracy = 26.7): Tomorrow CPU a sister at has spilled into port and sandy and afford after a training wage
Non-native recognizer (word accuracy = 73.3): The worst eighty U.S.S. chart has pulled into port in San Diego California after training warrior
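The slide does not say how word accuracy was scored. A minimal sketch, assuming the common definition (reference length minus substitutions, deletions and insertions, divided by reference length) and plain whitespace tokenization; both are assumptions, so this will not necessarily reproduce the 26.7 and 73.3 figures above. The function names are hypothetical.

```python
def word_edit_distance(ref, hyp):
    """Minimum number of word substitutions, deletions and insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,              # deletion of a reference word
                cur[j - 1] + 1,           # insertion of a hypothesis word
                prev[j - 1] + (r != h),   # match (cost 0) or substitution (cost 1)
            ))
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy in percent; assumes a non-empty reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    return 100.0 * (len(ref) - word_edit_distance(ref, hyp)) / len(ref)


# Toy sentences (not from the slides): one substitution in six words.
print(round(word_accuracy("the cat sat on the mat", "the cat sat on a mat"), 1))
```

Because insertions are penalized, this measure can go negative for a very long hypothesis.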

Motivation
- Practical
  - can we detect non-native users with enough accuracy to switch acoustic models?
- Exploratory
  - how well does an algorithm based only on text features work?
  - what tokens are discriminative for non-native speakers?

Speech examples
- Read speech: Over the next two months, public officials, Native American leaders, businesses and environmental groups will come up with plans for meeting the law’s requirements.
- Spontaneous speech: I like to have anything very special in Boston, very native in Boston. (Local specialties)

Speech data
[Table: rows by native language; columns for read speech and for spontaneous speech, each giving speaker count, utterance count, word count (types). Only the type counts remain.]
Japanese: (3195) (826)
English: (2073) (418)
Mandarin: (391)

Transcripts and hypotheses
Source text: “A safety net for salmon: environmentalists, the government, and ordinary folks team up to save the Northwest’s wondrous wild salmon”
Transcript: A safety net for the salmons Environment= environmentalists…
Hypothesis: A safety net forced simon Um environmental activists…
Classification based on transcripts:
- Usually gives a good idea of gold standard
- Finds true differences in linguistic usage
Classification based on hypotheses:
- Implicitly models acoustics
- Benefits from amplified difference between native and non-native samples

Related work
- Acoustic feature based accent discrimination (e.g. Fung and Liu 1999)
- Competing HMM based accent discrimination (e.g. Teixeira et al. 1996)
- Classification of documents according to style (Argamon-Engelson et al. 1998), author (Mosteller and Wallace 1964)

Accent detection as document classification
[Diagram: native speaker utterances and non-native speaker utterances are fed to the classifier]

Accent detection as document classification
[Diagram: test speaker utterances are fed to the classifier, which outputs the classification decision: native or non-native?]
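The two diagrams above treat each speaker's utterances as a document to be labeled. A minimal sketch of that idea, using a hand-written multinomial naïve Bayes with add-one smoothing. This is a stand-in for illustration, not the Rainbow toolkit used in the experiments; the class name, the smoothing constant and the toy utterances are all assumptions.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over token lists, with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.token_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.vocab = {t for c in self.classes for t in self.token_counts[c]}
        return self

    def log_scores(self, tokens):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            counts = self.token_counts[c]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)  # class prior
            for t in tokens:
                if t in self.vocab:  # tokens unseen in training are skipped
                    score += math.log((counts[t] + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
docs = [
    "the habitats and the salmon".split(),
    "hundreds of habitats now".split(),
    "the the habitat and".split(),
    "in in the habitat".split(),
]
labels = ["native", "native", "non-native", "non-native"]
model = NaiveBayes().fit(docs, labels)
print(model.predict("the the habitat".split()))
```

Working in log space avoids numerical underflow when a test document contains many tokens.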

Experimental methodology
- Rainbow naïve Bayes classifier
- Both word and part-of-speech tokens were examined
- Classification based on token unigrams and bigrams
- No feature selection initially
- Stopwords were not excluded from feature set
- Data randomly split into 30% testing, 70% training data for evaluation; evaluation repeated 20 times and classification results averaged
- Utterances from the same speaker never appeared in both training and test sets
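The evaluation protocol above can be sketched as follows. This is an illustration under stated assumptions, not the authors' scripts: the 30/70 split is applied to speakers (the slide does not say whether the percentages refer to speakers or utterances), the split is not stratified by class, bigrams are joined with `;`, and `evaluate` is a hypothetical callback that trains on one part and returns accuracy on the other.

```python
import random


def ngram_tokens(words):
    """Unigrams followed by bigrams (joined with ';') for one utterance."""
    bigrams = [f"{a};{b}" for a, b in zip(words, words[1:])]
    return list(words) + bigrams


def speaker_disjoint_split(utterances, test_fraction=0.3, rng=random):
    """Split (speaker_id, label, words) tuples so no speaker is in both parts."""
    speakers = sorted({spk for spk, _, _ in utterances})
    rng.shuffle(speakers)
    n_test = max(1, round(test_fraction * len(speakers)))
    test_speakers = set(speakers[:n_test])
    train = [u for u in utterances if u[0] not in test_speakers]
    test = [u for u in utterances if u[0] in test_speakers]
    return train, test


def repeated_evaluation(utterances, evaluate, trials=20, seed=0):
    """Average evaluate(train, test) over several random speaker splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(trials):
        train, test = speaker_disjoint_split(utterances, 0.3, rng)
        accuracies.append(evaluate(train, test))
    return sum(accuracies) / len(accuracies)
```

Splitting by speaker rather than by utterance keeps the classifier from scoring well simply by recognizing an individual's habits.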

Classification of spontaneous speech (transcripts only)
[Chart: results for Native/Japanese, Native/Chinese, Japanese/Chinese, Native/Non-native, Native/Japanese/Chinese]

Classification of read speech
- A: train: same texts, test: same texts
- B: train: disjoint texts, test: disjoint texts
- C: train: disjoint texts, test: same texts
- D: train: same texts, test: disjoint texts
- baseline


Feature selection
[Table: columns Method, Number of features, Accuracy. Methods listed: None, IG, SMART, IG, SMART-524, IG, IG, M&W, IG, SMART]
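The table abbreviates one selection method as IG. Assuming this stands for information gain (the slides do not expand it), a minimal sketch of scoring a token by how much its presence reduces uncertainty about the class label. The function names and toy documents are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def information_gain(docs, labels, token):
    """Class entropy minus class entropy conditioned on token presence."""
    base = entropy(Counter(labels).values())
    present = Counter(l for d, l in zip(docs, labels) if token in d)
    absent = Counter(l for d, l in zip(docs, labels) if token not in d)
    conditional = 0.0
    for part in (present, absent):
        size = sum(part.values())
        if size:
            conditional += size / len(labels) * entropy(part.values())
    return base - conditional


def top_tokens(docs, labels, k):
    """The k tokens with highest information gain (ties broken alphabetically)."""
    vocab = {t for d in docs for t in d}
    return sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))[:k]


# Toy documents, invented for illustration only.
toy_docs = [["the", "habitat"], ["the", "in"], ["habitats", "and"], ["habitats", "now"]]
toy_labels = ["non-native", "non-native", "native", "native"]
print(top_tokens(toy_docs, toy_labels, 2))
```

Keeping only the top-scoring tokens gives a "Number of features" knob like the one in the table.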

Discriminative sequences
Speech type | Token type | Native | Non-native
Read | Word | NMFS | the + the the that
Read | POS | noun(pl) | noun(sing) noun(pl) verb(past)
Spontaneous | Word | Wonderland | the
Spontaneous | POS | TO + verb(base) | noun(sing)
Spontaneous | POS | Nounam | noun(sing)
transcriptions, hypotheses
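The slides do not say how discriminative tokens were ranked. One common approach, shown here purely as an assumption, is to compare smoothed per-class token probabilities with a log-odds score: positive values lean native, negative values lean non-native. The function name and toy utterances are hypothetical.

```python
import math
from collections import Counter


def log_odds_ranking(native_docs, nonnative_docs, alpha=1.0):
    """Rank tokens from most native-leaning to most non-native-leaning."""
    nat = Counter(t for d in native_docs for t in d)
    non = Counter(t for d in nonnative_docs for t in d)
    vocab = set(nat) | set(non)
    # Additive smoothing keeps tokens seen in only one class finite.
    nat_total = sum(nat.values()) + alpha * len(vocab)
    non_total = sum(non.values()) + alpha * len(vocab)
    scores = {
        t: math.log((nat[t] + alpha) / nat_total)
        - math.log((non[t] + alpha) / non_total)
        for t in vocab
    }
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy utterances, invented for illustration only.
native = [["NMFS", "habitats", "and"], ["the", "NMFS"]]
nonnative = [["the", "the", "habitat", "and"], ["in", "in", "the"]]
ranking = log_odds_ranking(native, nonnative)
print(ranking[0][0], ranking[-1][0])
```

The same function works on bigram or part-of-speech tokens if those are passed in place of words.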

Conclusions
- Transcriptions of spontaneous speech can be classified with high accuracy for both 2-way and 3-way distinctions
- Read speech samples, which are simple transformations of native-produced text, can be classified with high accuracy
- Recognizer output is classified more accurately than transcripts

Future directions
- Incorporating the classification decision in acoustic model selection
- Minimizing the number of samples from the test speaker needed for classification
- Applying classification to parsing grammar selection, language model construction, writer identification

Discriminative POS sequences
Native | Non-native
Noun(pl) | Noun(sing)
Determiner | Preposition
Noun(pl);preposition | Preposition;preposition
Adjective;noun(pl) | Noun(sing);noun(sing)
Gerund;particle | Particle;preposition
Noun(s);verb(3s) | Cardinal#;cardinal#
Noun(pl);modal | Verb(past)

Discriminative word sequences
Native | Non-native
NMFS | the;the
the;NMFS | in;in
nineteen;hundreds | the
hundreds;now | in
hundreds | that
habitats;and | habitat;and

Phone-based classification
Discriminative tokens, Condition B
Native | Non-native
Phone identity: /D/ | /D/ /I/
Phone class: CCC V