Design, compilation and processing of CUCall: a set of Cantonese spoken language corpora collected over telephone networks by W.K. Lo, P.C. Ching, Tan.

Presentation transcript:

Design, compilation and processing of CUCall: a set of Cantonese spoken language corpora collected over telephone networks
W.K. Lo, P.C. Ching, Tan Lee and Helen Meng
The Chinese University of Hong Kong
ROCLING XIV, 16th August 2001

Acknowledgment
The CUCall data collection is supported by the Innovation and Technology Fund (AF/96/99).
We are also grateful to the industrial sponsors:
  – Group Sense Limited
  – SmarTone Mobile Communication Limited

Outline
Corpus Design and Organization
  – phonetically oriented
  – application oriented
Data Collection and Processing
Data Analysis
Conclusions

Part I: Corpus Design and Organization

Overview
An extension to the CUCorpora microphone speech database
A collection of telephone speech data over fixed-line and mobile networks
Supports phonetically oriented and domain-specific applications
  – rich phonetic coverage with speaking style variations
  – words, phrases and digit strings for specific use

CUCall Organization

Phonetically Oriented
5719 sentences
  – selected from the pools of the CUSENT training and testing sets
  – targeted at phonetic coverage in a biphone context
90 short paragraphs
  – enrich the phonetic coverage in addition to the sentence materials
  – capture the variations brought about by the length of the reading materials
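The slide does not say how sentences were picked from the CUSENT pools. One common way to target biphone coverage is greedy selection, sketched below under that assumption. The function names, the toy pool and the sentence ids are all hypothetical and are not taken from the CUCall tooling.

```python
def biphones(phones):
    """Return the set of adjacent phone pairs in one sentence."""
    return set(zip(phones, phones[1:]))


def greedy_select(pool, budget):
    """Pick up to `budget` sentences, each adding the most unseen biphones."""
    covered, chosen = set(), []
    remaining = dict(pool)
    while remaining and len(chosen) < budget:
        # Sentence that contributes the largest number of new biphones.
        best = max(remaining, key=lambda sid: len(biphones(remaining[sid]) - covered))
        gain = biphones(remaining[best]) - covered
        if not gain:
            break  # no sentence left in the pool adds coverage
        covered |= gain
        chosen.append(best)
        del remaining[best]
    return chosen, covered


# Toy pool: hypothetical sentence ids mapped to phone sequences.
pool = {
    "s1": ["n", "ei", "h", "ou"],
    "s2": ["n", "ei", "n", "e"],
    "s3": ["h", "ou", "m", "aa", "l", "aa"],
}
chosen, covered = greedy_select(pool, budget=2)
```

Greedy selection is only an approximation to the best possible cover, but it is simple and usually adequate when the pool is much larger than the number of sentences needed.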

Phonetically Oriented
6 spontaneous conversations
  – capture speakers' spontaneous responses
  – content is unlimited and unconstrained
  – contain all kinds of non-speech events, e.g. corrections, hesitations, skipped words, …
  – questions must be simple and open-ended

Phonetically Oriented
Criteria for question design
  – simple enough for a spontaneous response; avoid calculation, memory recall, etc.
  – answers are expected to differ from speaker to speaker
  – responses may be either long or short
  – avoid answers that touch on speakers' privacy

Application Oriented
1440 words and phrases
  – simple words covering various domains
      – names of places
      – listed companies
      – foreign currencies
      – navigation commands
Digit strings
  – strings of digits of various lengths
      – all ten single digits
      – randomly generated strings of length 7, 8 and 16
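The digit-string part of a prompt sheet can be sketched as follows. This is a minimal illustration, not the authors' generator: the function name and the seed are hypothetical, and the default of five strings per length is an assumption borrowed from the per-speaker figures in the statistics slide.

```python
import random


def digit_prompts(rng, lengths=(7, 8, 16), per_length=5):
    """Return the ten single digits plus random digit strings of each length."""
    prompts = [str(d) for d in range(10)]
    for n in lengths:
        for _ in range(per_length):
            # Each position is drawn independently and uniformly from 0-9.
            prompts.append("".join(rng.choice("0123456789") for _ in range(n)))
    return prompts


# A seeded generator makes a given prompt sheet reproducible.
sheet = digit_prompts(random.Random(2001))
```

Passing a seeded `random.Random` instance, rather than using the module-level functions, lets each prompt sheet be regenerated later for validation.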

Part II: Data Collection and Processing

Collection Process
Preparation of reading materials
  – reading materials are prepared as prompt sheets
  – separate sheets for male & female speakers and for fixed & mobile lines
Distribution of prompt sheets
  – distributed hierarchically through agents
Speakers call
  – speakers call automatic recording servers
  – they are identified by unique serial numbers
Questionnaire return
  – information such as age and telephone network type is collected

Data Collection System Set-up
Calling End: from any location, using any telephone, by people from all walks of life
Telephone Companies: mobile/fixed-line network
Recording End: telephone outlet, telephony hardware, recording system, data storage system
Recording Servers: fixed-line connection to local telephone companies
Post-processing of data for various targeted domains of applications
Note: CT board is Dialogic® D/41-ESC

Post-processing of Data
Call validation
  – received prompt sheets are verified against the recorded speech data
  – user information is entered into databases
Phonemic transcription
  – all accepted speech data are phonemically transcribed in full at the initial-final level
Partitioning of collected data
  – collected data are partitioned properly
  – speech data and transcriptions are organized on a per-speaker basis

Data Processing After Collection
Validation: identify successful recording sessions
Transcription: accurate verbatim transcription of the speech data
Data Storage: collected telephone speech data
Organization: organize data for easy access
Distribution: printing CD-ROMs for distribution
Example transcriptions: /nei5-hou2-maa1/ /ngo5-hou2-hou2/ /nei5-ne1/ /dou1-ng4-co3-laa1/
Example file layout: \speaker01\data\001.wav \002.wav and \speaker01\annotate\001.xsc \002.xsc
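The example transcriptions can be broken down to the initial-final level with a few lines of code. The sketch below assumes the romanization follows the Jyutping (LSHK) convention of tone digits 1 to 6 and the usual 19 initials. The function names are hypothetical and the .xsc annotation format itself is not modelled here.

```python
import re

# Jyutping initials, longest first so "ng", "gw", "kw" win over "n", "g", "k".
INITIALS = ("ng", "gw", "kw", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "w", "z", "c", "s", "j")


def split_syllable(syl):
    """Split a tone-marked syllable such as 'nei5' into (initial, final, tone)."""
    m = re.fullmatch(r"([a-z]+)([1-6])", syl)
    if m is None:
        raise ValueError(f"not a tone-marked syllable: {syl!r}")
    base, tone = m.group(1), int(m.group(2))
    if base in ("m", "ng"):  # syllabic nasals have no initial
        return "", base, tone
    for ini in INITIALS:
        if base.startswith(ini) and len(base) > len(ini):
            return ini, base[len(ini):], tone
    return "", base, tone  # vowel-initial syllable


def parse_transcription(line):
    """Parse '/nei5-hou2-maa1/' into a list of (initial, final, tone) tuples."""
    return [split_syllable(s) for s in line.strip("/").split("-")]


parsed = parse_transcription("/dou1-ng4-co3-laa1/")
```

Handling the syllabic nasals before the prefix match matters: without that check, a bare "ng" would be read as an initial with an empty final.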

Part III: Data Analysis

Statistics of Reading Materials
Part        # per speaker       # tonal syl.   # base syl.   syl. count
Phonetically oriented corpora
  sent.     50 (out of 5719)                                 to 31
  para.     3 (out of 90)                                    to 120
Application-specific corpora
  1-digit   10
  7-digit   5
  8-digit   5
  16-digit  5
  words     48 (out of 1440)                                 to 8
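The difference between the tonal-syllable and base-syllable columns can be illustrated with a short count. This is a sketch only: the function name is hypothetical, the first four transcriptions are the examples from the data-processing slide, and the last one is a made-up line added so that two base syllables occur with more than one tone.

```python
def syllable_inventory(transcriptions):
    """Return the sets of distinct tonal syllables and distinct base syllables."""
    tonal, base = set(), set()
    for line in transcriptions:
        for syl in line.strip("/").split("-"):
            tonal.add(syl)                   # e.g. "hou2"
            base.add(syl.rstrip("123456"))   # tone digit removed, e.g. "hou"
    return tonal, base


lines = [
    "/nei5-hou2-maa1/",
    "/ngo5-hou2-hou2/",
    "/nei5-ne1/",
    "/dou1-ng4-co3-laa1/",
    "/hou3-maa5/",  # hypothetical line, not from the corpus
]
tonal, base = syllable_inventory(lines)
```

Because a base syllable can carry several tones, the tonal inventory is always at least as large as the base inventory.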

Frequency-of-frequency (FOF)
[Figure: FOF plots for the sentence and paragraph materials]
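A frequency-of-frequency table records, for each occurrence count r, how many distinct units occur exactly r times. The slide does not state which unit was counted, so the sketch below assumes tonal syllables and reuses the example transcriptions from the data-processing slide. The function name is hypothetical.

```python
from collections import Counter


def frequency_of_frequency(tokens):
    """Map each occurrence count r to the number of distinct items seen r times."""
    return Counter(Counter(tokens).values())


# Syllables from the four example transcriptions, in order.
syllables = "nei5 hou2 maa1 ngo5 hou2 hou2 nei5 ne1 dou1 ng4 co3 laa1".split()
fof = frequency_of_frequency(syllables)
```

A quick sanity check on any FOF table is that the sum of r times the number of items with count r equals the total number of tokens.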

Part IV: Conclusions

Current Status
The collection process is divided into several stages
Expected completion date: March 2002
To date, over 200 hours of data (from 1000 speakers) have been collected
  – 120 hours of phonetically oriented data
  – 80 hours of application-specific data
Over half of the collected data have been phonemically transcribed

Conclusions
The design and collection process for the Cantonese telephone speech corpora are presented
The corpora are designed to cover both phonetically oriented and application-specific data
They also include long reading materials and open questions for spontaneous data
Details of post-processing and data analysis are given

Thank You