Presentation is loading. Please wait.

Presentation is loading. Please wait.

World Languages Mandarin English Challenges in Mandarin Speech Recognition  Highly developed language model is required due to highly contextual nature.

Similar presentations


Presentation on theme: "World Languages Mandarin English Challenges in Mandarin Speech Recognition  Highly developed language model is required due to highly contextual nature."— Presentation transcript:

1 World Languages Mandarin English Challenges in Mandarin Speech Recognition  Highly developed language model is required due to highly contextual nature of Mandarin  Tone modeling  Coarticulation  Large # of homophones  Chinese text is unsegmented  No standard lexicon  Chinese sentence/word structure is very flexible  Ex: Beijing Da Xue -> Bei Da Challenges in Mandarin Speech Recognition  Highly developed language model is required due to highly contextual nature of Mandarin  Tone modeling  Coarticulation  Large # of homophones  Chinese text is unsegmented  No standard lexicon  Chinese sentence/word structure is very flexible  Ex: Beijing Da Xue -> Bei Da Abstract Speech recognition has recently gained a great deal of popularity in video gaming, robotics, speech prosthetics, and mobile technology. With China becoming an increasingly important player in the world economy, a new demand for Mandarin speech recognition has emerged. This poster will discuss Applications of speech recognition Obstacles in English vs. Mandarin speech recognition Methods used to successfully implement Mandarin speech recognition systems. Benchmarks in the history of Mandarin speech recognition Abstract Speech recognition has recently gained a great deal of popularity in video gaming, robotics, speech prosthetics, and mobile technology. With China becoming an increasingly important player in the world economy, a new demand for Mandarin speech recognition has emerged. This poster will discuss Applications of speech recognition Obstacles in English vs. Mandarin speech recognition Methods used to successfully implement Mandarin speech recognition systems. Benchmarks in the history of Mandarin speech recognition John Steinberg & Dr. Joseph Picone Department of Electrical and Computer Engineering, Temple University, Philadelphia, Pennsylvania An Introduction to Mandarin Speech Recognition College of Engineering Temple University Speech Recognition Systems Speech Recognition Systems Modeling Methods  Prosodic Features  Models rhythm, focus, and intonation of speech  Reduces search space (i.e. reduces number of homophones and constrains sentence flexibility)  Language Models  Reduces solution space by using probable combinations of phones, words, phrases, etc. Modeling Methods  Prosodic Features  Models rhythm, focus, and intonation of speech  Reduces search space (i.e. reduces number of homophones and constrains sentence flexibility)  Language Models  Reduces solution space by using probable combinations of phones, words, phrases, etc. A Timeline of Mandarin Speech Recognition Mandarin Chinese Conclusions Mandarin speech recognition is a developing and increasingly important field Mandarin speech recognition involves many challenges including large numbers of homophones, word & sentence flexibility, a lack of a segmented lexicon, and coarticulation effects. Possible improvements to CER may be found in prosodic features and language models Reference [1] Lee, C-H. “Advances in Chinese Spoken Language Processing”, World Scientific Publishing Co., Singapore, 2007 pp 128 [2]Lin-Shan Lee; Keh-Jiann Chen; et. al., "Golden Mandarin(II)-an intelligent Mandarin dictation machine for Chinese character input with adaptation/learning functions," ISSIPNN '94, 1994 pp.155-159 vol.1, 13-16 Apr 1994 [3] Ren-Yuan Lyu; Lee-Feng Chien; et. al., "Golden Mandarin (III)-a user-adaptive prosodic-segment-based Mandarin dictation machine for Chinese language with very large vocabulary," ICASSP-95, vol.1, pp.57-60 vol.1, 9-12 May 1995 [4]“The History of Automatic Speech Recognition Evaluations at NIST,” 2009 http://www.itl.nist.gov/iad/mig/publications/ASRhistory/index.html http://www.itl.nist.gov/iad/mig/publications/ASRhistory/index.html [5] Lee, L.S.; Tseng, C.Y.; Gu, H.Y.; et. al., "Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary," Speech and Audio Processing, IEEE Transactions on, vol.1, no.2, pp.158-179, Apr 1993 [6] ] K. Kūriákī, A Grammar of Modern Indo-European, Asociación Cultural Dnghu, 2007 Conclusions Mandarin speech recognition is a developing and increasingly important field Mandarin speech recognition involves many challenges including large numbers of homophones, word & sentence flexibility, a lack of a segmented lexicon, and coarticulation effects. Possible improvements to CER may be found in prosodic features and language models Reference [1] Lee, C-H. “Advances in Chinese Spoken Language Processing”, World Scientific Publishing Co., Singapore, 2007 pp 128 [2]Lin-Shan Lee; Keh-Jiann Chen; et. al., "Golden Mandarin(II)-an intelligent Mandarin dictation machine for Chinese character input with adaptation/learning functions," ISSIPNN '94, 1994 pp.155-159 vol.1, 13-16 Apr 1994 [3] Ren-Yuan Lyu; Lee-Feng Chien; et. al., "Golden Mandarin (III)-a user-adaptive prosodic-segment-based Mandarin dictation machine for Chinese language with very large vocabulary," ICASSP-95, vol.1, pp.57-60 vol.1, 9-12 May 1995 [4]“The History of Automatic Speech Recognition Evaluations at NIST,” 2009 http://www.itl.nist.gov/iad/mig/publications/ASRhistory/index.html http://www.itl.nist.gov/iad/mig/publications/ASRhistory/index.html [5] Lee, L.S.; Tseng, C.Y.; Gu, H.Y.; et. al., "Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary," Speech and Audio Processing, IEEE Transactions on, vol.1, no.2, pp.158-179, Apr 1993 [6] ] K. Kūriákī, A Grammar of Modern Indo-European, Asociación Cultural Dnghu, 2007 A Timeline of Benchmarks English speakers in the World: ~ 350 million [2] Estimated # of current English learners in China: 200-350 million [3] Estimated # of native Mandarin speakers: 1+ Billion Tonal Language  4 distinct tones, 1 neutral tone Tonal Language  4 distinct tones, 1 neutral tone Character Based  8000+ characters compose 80k-200k common words  Act as morphemes  Monosyllabic  Have single associated tone Character Based  8000+ characters compose 80k-200k common words  Act as morphemes  Monosyllabic  Have single associated tone Coarticulation  Context can cause changes in tone Bu4 + Dui4 = Bu2 Dui4 Ni3 + Hao3 = Ni2 Hao3Coarticulation  Context can cause changes in tone Bu4 + Dui4 = Bu2 Dui4 Ni3 + Hao3 = Ni2 Hao3 Highly Contextual  Few syllables compared to English  English: ~10,000  Mandarin: ~1300  High # of homo- phones 7 instances of “ma” Highly Contextual  Few syllables compared to English  English: ~10,000  Mandarin: ~1300  High # of homo- phones 7 instances of “ma” [1] Speech Recognition Applications Speech Recognition Applications MobileTechnology Auto/GPS NationalIntelligence Other Applications Translators Prostheses Language Educ. Multimedia Search Prosodic Features Used for segmentation Measures pause duration, segment / syllable duration, and F 0 reset. Prosodic Features Used for segmentation Measures pause duration, segment / syllable duration, and F 0 reset. Language Models N-Grams Probabilities of the occurrence of N sequences of words Neural-Networks High dimensionality Random Forests Incorporates morphology into language model Language Models N-Grams Probabilities of the occurrence of N sequences of words Neural-Networks High dimensionality Random Forests Incorporates morphology into language model [1] [4] Language families across the world [6] Language families by speaker population [6]


Download ppt "World Languages Mandarin English Challenges in Mandarin Speech Recognition  Highly developed language model is required due to highly contextual nature."

Similar presentations


Ads by Google