King Saud University, Riyadh, Saudi Arabia


West Point, SAAVB, and BBN/AUB Arabic Speech Corpora: A Comparative Survey
Yousef A. Alotaibi, Ali H. Meftah
King Saud University, Riyadh, Saudi Arabia
This work is supported by NPST project No. 10-INF1325-02
1/18/2019 4th International Conference on Arabic Language Processing, May 2–3, 2012, Rabat, Morocco

OUTLINE
- INTRODUCTION: MSA Arabic, Arabic Dialects
- SPEECH CORPORA BACKGROUND (TIMIT, West Point, SAAVB, and BBN/AUB)
- EVALUATION: Type, Speakers, Data Source, Labelling, Training and Testing
- CONCLUSION

INTRODUCTION
Modern Standard Arabic (MSA)
- 34 phonemes (6 vowels + 28 consonants)
- Valid syllables: CV, CVV, CVC, CVCC, CVVC, and CVVCC
- Limited amount of research
- Low quality of Arabic language resources
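The six syllable templates listed above can be written down as a simple lookup. This is a minimal sketch only: the function name is hypothetical, and reading C as a consonant, V as a short vowel, and VV as a long vowel is an assumption about the slide's notation.

```python
# Syllable templates as listed on the slide.
# Assumption: C = consonant, V = short vowel, VV = long vowel.
VALID_TEMPLATES = {"CV", "CVV", "CVC", "CVCC", "CVVC", "CVVCC"}


def is_valid_msa_syllable(template: str) -> bool:
    """Return True if a C/V template is one of the six listed templates."""
    return template.upper() in VALID_TEMPLATES


if __name__ == "__main__":
    for candidate in ["CV", "CVVC", "VC", "CCV"]:
        print(candidate, is_valid_msa_syllable(candidate))
```

Every listed template begins with a single consonant followed by a vowel, so patterns such as VC or CCV fall outside the set.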

INTRODUCTION
Arabic Dialects
- The Arab world can be divided in many different ways. The following is only one of many groupings that cover the main Arabic dialects:
  - Gulf Arabic
  - Levantine Arabic
  - Egyptian Arabic
  - Maghreb Arabic
  - Yemenite

SPEECH CORPORA BACKGROUND
TIMIT
- American English speakers of different genders and dialects
- A read (canonical) speech corpus
- Contains a total of 6,300 sentences: 10 sentences (about 30 sec of speech) spoken by each of 630 speakers (438 males, about 70%, and 192 females, about 30%)
- 8 major dialect regions of the United States (US)
- A speaker's dialect region is the geographical area within the U.S. mainland where the speaker lived during childhood

SPEECH CORPORA BACKGROUND
West Point
- Represents Modern Standard Arabic (MSA)
- Produced by the Linguistic Data Consortium (LDC)
- A read corpus
- Contains 110 speakers (66 male, 44 female)
- Consists of 4 main Arabic scripts containing 258 sentences, with a total of 1,512 tokens and 991 types. The total number of distinct Arabic words is 1,131.
- Consists of 8,516 speech files, totaling 1.7 gigabytes or 11.42 hours of speech data

SPEECH CORPORA BACKGROUND
SAAVB Corpus
- Saudi Arabian dialect
- A noisy telephone speech corpus
- Collected by KACST from 2002 to 2003
- A canonical and spontaneous speech corpus
- Acquired from 1,033 speakers (51% males and 49% females)
- Total recorded duration is 96.37 hours, distributed among 60,947 audio files (1,033 speakers x 59 audio files)
- Size is 2.59 GB; it contains 1,033 directories with 183,518 files

SPEECH CORPORA BACKGROUND
BBN/AUB Corpus
- Levantine dialect
- Developed with funding from the Defense Advanced Research Projects Agency (DARPA)
- A set of spontaneous speech sentences
- Recorded in Boston (20%) and at the American University of Beirut (AUB) (80%)
- Consists of 164 speakers (101 males and 63 females)
- Total duration of recorded speech is 45 hours, distributed among 75,900 audio files; the total audio size is 6.5 GB
- The total text size is 3.1 MB, the vocabulary is 15K words, and the total word count is 336K
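The durations and file counts reported for the three Arabic corpora allow a rough comparison of average recording length. The sketch below is illustrative only: it assumes that every counted audio file contributes to the stated total duration, which the slides do not confirm, and the dictionary and function names are hypothetical.

```python
# Figures as reported on the background slides.
# Assumption: each counted audio file is part of the stated total duration.
CORPORA = {
    "West Point": {"hours": 11.42, "files": 8516},
    "SAAVB": {"hours": 96.37, "files": 60947},
    "BBN/AUB": {"hours": 45.0, "files": 75900},
}


def mean_seconds_per_file(hours: float, files: int) -> float:
    """Average recording length in seconds, given total hours and file count."""
    return hours * 3600.0 / files


if __name__ == "__main__":
    # Consistency check on the SAAVB file count: 1,033 speakers x 59 files.
    print("SAAVB files:", 1033 * 59)
    for name, stats in CORPORA.items():
        avg = mean_seconds_per_file(stats["hours"], stats["files"])
        print(f"{name}: {avg:.2f} s per file")
```

Under that assumption, the spontaneous BBN/AUB recordings come out markedly shorter on average than the West Point and SAAVB recordings.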

EVALUATION
Corpus Type

Corpus       Type
TIMIT        Canonical
West Point   Canonical
SAAVB        Canonical + Spontaneous
BBN/AUB      Spontaneous

EVALUATION
Speakers
Number of speakers

Corpus       No. of speakers
TIMIT        630
West Point   110
SAAVB        1,033
BBN/AUB      164

EVALUATION
Speakers
Gender
- Clearly reported in TIMIT, SAAVB, and West Point, and implicitly reported in BBN/AUB

Corpus       Male            Female
TIMIT        438 (70%)       192 (30%)
West Point   66 (60%)        44 (40%)
SAAVB        523 (50.63%)    510 (49.37%)
BBN/AUB      101 (61.59%)    63 (38.41%)
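The gender shares can be recomputed from the speaker counts as a quick consistency check. This is a small sketch with hypothetical names; only the counts come from the slides.

```python
# (male, female) speaker counts as reported for each corpus.
COUNTS = {
    "TIMIT": (438, 192),
    "West Point": (66, 44),
    "SAAVB": (523, 510),
    "BBN/AUB": (101, 63),
}


def gender_shares(male: int, female: int) -> tuple[float, float]:
    """Return (male %, female %) rounded to two decimal places."""
    total = male + female
    return round(100 * male / total, 2), round(100 * female / total, 2)


if __name__ == "__main__":
    for corpus, (male, female) in COUNTS.items():
        m_pct, f_pct = gender_shares(male, female)
        print(f"{corpus}: total={male + female}, male={m_pct}%, female={f_pct}%")
```

The totals match the speaker counts reported for each corpus (630, 110, 1,033, and 164), and the TIMIT figures of 70% and 30% are roundings of 69.52% and 30.48%.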

EVALUATION
Speakers
Speakers' Ages
- The catalogs of the TIMIT, West Point, and BBN/AUB corpora give no information about the speakers' ages
- In the SAAVB corpus, the ages are distributed and documented for each speaker in a well-defined manner

EVALUATION
Speakers
Speakers' Nationalities
- The BBN/AUB corpus does not report this important information; it only states that 20% of the corpus was recorded in Boston and the remaining 80% at AUB

Corpus       Native    Nonnative
TIMIT        100%      -
West Point   68.18%    31.81%
SAAVB
BBN/AUB      ?

EVALUATION
Speakers
Speaker distribution within the investigated region of the targeted dialects
- West Point and BBN/AUB do not show how they chose their speakers from the Arab countries (22 countries for West Point and 4 countries for BBN/AUB)

Corpus       Distribution
TIMIT        8 regions
West Point   ?
SAAVB        All Saudi cities
BBN/AUB      ?

EVALUATION
Data Sources

Corpus       Sampling rate   Recorded by
TIMIT        16 kHz          Soundproof chamber
West Point   22.05 kHz       Shure SM10A microphone and a RANE Model MS1 pre-amplifier
SAAVB        8 kHz           Telephone system
BBN/AUB                      Close-talking, noise-cancelling headset microphone (the Andrea Electronics NC-65)

EVALUATION
Lexicon and Labeling
- West Point is timeless

Corpus: TIMIT, West Point, SAAVB, BBN/AUB
Labeling: Yes, No

EVALUATION
Training and Testing Subsets
TIMIT has been subdivided into suggested training and testing subsets using the following criteria:
- Roughly 20% to 30% of the corpus should be used for testing purposes, leaving the remaining 70% to 80% for training
- No speaker should appear in both subsets
- All the dialect regions should be represented in both subsets
- Overlap of text material in the two subsets should be minimized; if possible, no texts should be identical
- All the phonemes should be covered in the test material
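The first three criteria above (test share, speaker-disjoint subsets, every dialect region in both subsets) can be sketched as a stratified split over speakers. This is not the procedure used to build the official TIMIT subsets. It is a minimal illustration with hypothetical names, and it does not address text overlap or phoneme coverage.

```python
import random
from collections import defaultdict


def split_speakers(speakers, test_fraction=0.25, seed=0):
    """Split speaker IDs into disjoint train/test sets, region by region.

    speakers: dict mapping speaker ID -> dialect region label.
    Each region must have at least two speakers so that it can appear
    in both subsets.
    """
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for speaker_id, region in speakers.items():
        by_region[region].append(speaker_id)

    train, test = set(), set()
    for region, ids in by_region.items():
        if len(ids) < 2:
            raise ValueError(f"region {region!r} needs at least two speakers")
        ids = sorted(ids)  # fixed order so the seed gives a reproducible split
        rng.shuffle(ids)
        # Keep at least one speaker on each side of the split.
        n_test = min(max(1, round(len(ids) * test_fraction)), len(ids) - 1)
        test.update(ids[:n_test])
        train.update(ids[n_test:])
    return train, test


if __name__ == "__main__":
    # Hypothetical toy data: 96 speakers spread evenly over 8 regions.
    toy = {f"spk{i:03d}": f"dr{i % 8 + 1}" for i in range(96)}
    train_ids, test_ids = split_speakers(toy)
    print(len(train_ids), len(test_ids), train_ids.isdisjoint(test_ids))
```

Splitting at the speaker level rather than the utterance level is what guarantees that no speaker appears in both subsets.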

EVALUATION
Training and Testing Subsets
- SAAVB leaves it open to researchers and application developers to select the training and testing sets according to their needs
- West Point and BBN/AUB do not address training and testing subsets

CONCLUSION
- The Arabic language lacks reliable speech corpora
- Robust Arabic speech corpora must consider:
  - The different Arab countries
  - The different Arabic dialects
  - The different speakers' ages and genders, with good distributions
- Training and testing subsets are very important, in addition to phoneme labelling

Thank you for your attention

Any questions?