Building Non Native Pronunciation Lexicon for English using a Rule Based Approach
Rohit Kumar *, Amit Kataria, Sanjeev Sofat
Department of Computer Science and Engineering, Punjab Engineering College, Chandigarh
* Language Technologies Research Center, IIIT Hyderabad
Outline of the Presentation
- Introduction: Need, Problems, Suggestion
- Overview of the Approach
- Grapheme to Phoneme Alignment
- Applying the Rules
- Approaches for Building the Rules
- Conclusion
Building Non Native Pronunciation Lexicon for English using a Rule Based Approach 2/28/2019
Introduction: Need for Pronunciation Lexicons
- TTS: text processing modules for non-phonetic languages like English use them for pronunciation lookups
- Building letter-to-sound rules by learning approaches
- Building language models for recognition systems
Introduction: Appearance of Pronunciation Variations
- Pronunciation variations arise when a foreign language is spoken by non-native speakers
- Speakers try to speak the foreign language under the pronunciation constraints of their native language
Introduction: Problem & Suggestion
- Building a pronunciation lexicon is a time-consuming manual process
- Since lexicons are needed, how do we build them rapidly and semi-automatically with minimal effort?
- Pronunciation lexicons with the native pronunciation of foreign languages are easily available (particularly for English)
- A rule-based approach that uses these native lexicons to build non-native lexicons is proposed
- An example-based approach for rule building is also suggested
Overview of the Approach: 3 Step Process
- Step 1: Word Lookup
  lieutenant > l e f t e n @ n t
- Step 2: Grapheme-Phoneme Alignment (- is a null grapheme, % is a null phoneme)
  graphemes: l i e u - t e n a n t
  phonemes:  l % e % f t e n @ n t
- Step 3: Rule Application
  The rules rewrite the aligned phonemes; in this example the output introduces the phoneme tz
Overview of the Approach: Issues
- How is the grapheme-phoneme alignment done? (Step 2)
- How are the rules triggered, and how do they work? (Step 3)
- How do we come up with the rules?
Grapheme-Phoneme Alignment
- The problem: a phoneme (grapheme) may match zero, one or many graphemes (phonemes)
- The algorithm matches each phoneme with a grapheme from its list of possible graphemes
- A consonant phoneme matches a consonant grapheme, and a vowel phoneme matches a possible vowel grapheme (a, e, i, o, u, y)
- Example (obscurely):
  graphemes: $   o   b   s   c   -   u   re  l   y
  phonemes:      @   b   s   k   y   u@  %   l   ii
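The matching idea on this slide can be sketched as a small backtracking search. This is only an illustration under stated assumptions, not the authors' algorithm: the `ALLOWED` and `SILENT` tables are hypothetical and cover just the "obscurely" example, and the names `align`, `NULL_PHONEME` and `NULL_GRAPHEME` are invented here.

```python
NULL_PHONEME = "%"   # a grapheme that produces no sound
NULL_GRAPHEME = "-"  # a phoneme that has no letter of its own

# Hypothetical table: the graphemes each phoneme is allowed to align with.
ALLOWED = {
    "@": ["o", "a", "e"],
    "b": ["b"],
    "s": ["s"],
    "k": ["c", "k"],
    "y": ["y", NULL_GRAPHEME],
    "u@": ["u"],
    "l": ["l"],
    "ii": ["y", "e", "i"],
}
# Hypothetical list of graphemes that may be silent (aligned with "%").
SILENT = ["re", "r", "e"]


def align(word, phones, i=0, j=0):
    """Return a list of (grapheme, phoneme) pairs, or None if no alignment exists."""
    if i == len(word) and j == len(phones):
        return []
    if j < len(phones):
        # Try to pair the next phoneme with one of its possible graphemes.
        for g in ALLOWED.get(phones[j], []):
            if g == NULL_GRAPHEME:
                rest = align(word, phones, i, j + 1)
            elif word.startswith(g, i):
                rest = align(word, phones, i + len(g), j + 1)
            else:
                continue
            if rest is not None:
                return [(g, phones[j])] + rest
    # Otherwise try to treat the next grapheme as silent.
    for g in SILENT:
        if word.startswith(g, i):
            rest = align(word, phones, i + len(g), j)
            if rest is not None:
                return [(g, NULL_PHONEME)] + rest
    return None


pairs = align("obscurely", ["@", "b", "s", "k", "y", "u@", "l", "ii"])
print(pairs)
```

With these tables the search pairs "c" with k, the null grapheme with y, and "re" with the null phoneme, which is the alignment shown on the slide.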
Format of Rules
- 5 fields:
  Left Context (LC)
  Source Phoneme (PH)
  Right Context (RC)
  Corresponding Grapheme (GR)
  Target Phoneme (OPH)
- Wild cards (*) are allowed
- Phonemes are represented by a bit pattern (and hence a number) describing their phonetic properties
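One possible way to hold the five fields and the bit-pattern encoding is sketched below. The feature names and bit positions in `Feature` are assumptions made for illustration (the slides do not list the actual phonetic properties), and `Rule`, `PHONEME_BITS` and `field_matches` are hypothetical names.

```python
from dataclasses import dataclass
from enum import IntFlag


class Feature(IntFlag):
    # Hypothetical phonetic properties, one bit each.
    VOWEL = 1
    CONSONANT = 2
    VOICED = 4
    NASAL = 8
    PLOSIVE = 16


# Each phoneme is a combination of feature bits, and hence a number.
PHONEME_BITS = {
    "m": Feature.CONSONANT | Feature.VOICED | Feature.NASAL,
    "b": Feature.CONSONANT | Feature.VOICED | Feature.PLOSIVE,
    "aa": Feature.VOWEL | Feature.VOICED,
}

WILDCARD = "*"


@dataclass(frozen=True)
class Rule:
    lc: str   # left context
    ph: str   # source phoneme
    rc: str   # right context
    gr: str   # corresponding grapheme
    oph: str  # target phoneme


def field_matches(pattern, value):
    """A rule field matches if it is the wildcard or equals the value."""
    return pattern == WILDCARD or pattern == value


rule = Rule(lc="*", ph="@", rc="*", gr="o", oph="au")
print(int(PHONEME_BITS["m"]), field_matches(rule.lc, "b"))
```

Storing phonemes as feature bits would also allow a context field to test a property (for example, "any nasal") with a bitwise AND, though the slides do not say whether the rules are used that way.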
Applying the Rules
- Rule: LC = * ; PH = @ ; RC = * ; GR = “o” ; OPH = “au” ;
- Rule: LC = * ; PH = % ; RC = * ; GR = “re” | “r” ; OPH = “r” ;
- Example (obscurely):
  graphemes:        $   o   b   s   c   -   u   re  l   y
  native phonemes:      @   b   s   k   y   u@  %   l   ii
  output phonemes:      au  b   s   k   y   u@  r   l   ii
- Issue: What should be the shape of the window shown?
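The two rules on this slide can be applied to the aligned word with a loop like the one below. It is a minimal sketch, assuming a window of exactly one phoneme on each side (the slide leaves the window shape open), a "$" marker padded at both word boundaries, and a first-match-wins policy; `apply_rules` and the dictionary layout are hypothetical.

```python
WILDCARD = "*"

# The two rules from the slide; alternatives such as "re" | "r" become sets.
RULES = [
    {"lc": {"*"}, "ph": "@", "rc": {"*"}, "gr": {"o"}, "oph": "au"},
    {"lc": {"*"}, "ph": "%", "rc": {"*"}, "gr": {"re", "r"}, "oph": "r"},
]


def matches(options, value):
    """True if the field holds the wildcard or lists the value."""
    return WILDCARD in options or value in options


def apply_rules(aligned, rules):
    """Rewrite each aligned phoneme with the first rule that fires."""
    # Assumption: contexts are the immediate native phonemes, "$" at the edges.
    phones = ["$"] + [p for _, p in aligned] + ["$"]
    output = []
    for k, (grapheme, phoneme) in enumerate(aligned):
        left, right = phones[k], phones[k + 2]
        new_phoneme = phoneme
        for rule in rules:
            if (rule["ph"] == phoneme
                    and matches(rule["gr"], grapheme)
                    and matches(rule["lc"], left)
                    and matches(rule["rc"], right)):
                new_phoneme = rule["oph"]
                break
        output.append(new_phoneme)
    return output


aligned = [("o", "@"), ("b", "b"), ("s", "s"), ("c", "k"), ("-", "y"),
           ("u", "u@"), ("re", "%"), ("l", "l"), ("y", "ii")]
print(apply_rules(aligned, RULES))
```

Here contexts are read from the native phonemes rather than from phonemes already rewritten, which is one of several possible choices.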
Building the Rules
- Manually (observe and write)
- Example-based approach
  - Map each source phoneme to the nearest target phoneme
  - Find English words whose non-native pronunciation differs from the one obtained by this mapping
  - Automatically form rules to model the observed difference
  - Merge the newly formed rules with the existing rules
Building the Rules: Example Based Approach
- Example (almonds):
  graphemes: $   a   l   m   o   n   d   s
  phonemes:      aa  %   m   @   n   d   z
- Rule formed: LC = aa ; PH = % ; RC = m ; GR = “l” ; OPH = “l” ;
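Forming a rule from an example like this one could look roughly as follows. The sketch assumes the observed non-native pronunciation is already aligned slot by slot with the native one, that the nearest-phoneme mapping is the identity unless given, and that only the silent "l" is pronounced differently in the observed form; `derive_rules` is a hypothetical name.

```python
def derive_rules(aligned, observed, nearest=None):
    """Form one rule for every slot where the observed phoneme differs
    from the nearest-phoneme baseline."""
    nearest = nearest or {}  # assumption: identity mapping by default
    phones = ["$"] + [p for _, p in aligned] + ["$"]
    rules = []
    for k, ((grapheme, phoneme), heard) in enumerate(zip(aligned, observed)):
        baseline = nearest.get(phoneme, phoneme)
        if heard != baseline:
            rules.append({
                "lc": phones[k],       # native phoneme to the left
                "ph": phoneme,
                "rc": phones[k + 2],   # native phoneme to the right
                "gr": grapheme,
                "oph": heard,
            })
    return rules


almonds = [("a", "aa"), ("l", "%"), ("m", "m"), ("o", "@"),
           ("n", "n"), ("d", "d"), ("s", "z")]
# Assumed non-native pronunciation: the "l" is sounded.
observed = ["aa", "l", "m", "@", "n", "d", "z"]
print(derive_rules(almonds, observed))
```

For this input the function produces a single rule with LC = aa, PH = %, RC = m, GR = "l" and OPH = "l", the same rule as on the slide.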
Building the Rules: Example Based Approach (contd.)
Merging of Rules
- New rule:      LC = aa ; PH = % ; RC = m ; GR = “l” ; OPH = “l” ;
- Existing rule: LC = e ; PH = % ; RC = b ; GR = “ll” ; OPH = “l” ;
- Merged rule:   LC = aa|e ; PH = % ; RC = m|b ; GR = “l”|“ll” ; OPH = “l” ;
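The merge shown here amounts to taking the union of the alternatives in each context field when the source and target phonemes agree. A minimal sketch, assuming rules are stored as dictionaries of sets (a hypothetical layout) and that rules with different PH or OPH are simply not merged:

```python
def merge_rules(a, b):
    """Merge two rules with the same source and target phoneme by
    uniting their context and grapheme alternatives."""
    if a["ph"] != b["ph"] or a["oph"] != b["oph"]:
        return None  # assumption: such rules are kept separate
    return {
        "lc": a["lc"] | b["lc"],
        "ph": a["ph"],
        "rc": a["rc"] | b["rc"],
        "gr": a["gr"] | b["gr"],
        "oph": a["oph"],
    }


new_rule = {"lc": {"aa"}, "ph": "%", "rc": {"m"}, "gr": {"l"}, "oph": "l"}
existing = {"lc": {"e"}, "ph": "%", "rc": {"b"}, "gr": {"ll"}, "oph": "l"}
merged = merge_rules(new_rule, existing)
print(merged)
```

Note that a union of this kind generalizes: the merged rule also fires for combinations seen in neither example, such as LC = aa with RC = b.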
Conclusion
- Reusing the information available in an existing lexicon, together with rules, can help in the rapid creation of non-native lexicons
- An algorithm for grapheme-to-phoneme alignment is presented
- The issue of window shape for rule application needs further experimentation
- An example-based approach can be used for rule building