W3C WORKSHOP II Internationalizing Speech Synthesis Markup Language W3C Office in Heraklion, Crete, Greece, May 2006
The Phonemic model from India for Bi-modal applications Prof. R. K. Joshi Visiting Design Specialist Centre for Development of Advanced Computing Mumbai (formerly NCST), India Ex HOD,Professor IDC,IIT Mumbai. Ex Chief Art Director,FCB-Ulka Advertising - Session 4: IPA/Phonetic Alphabets W3C WORKSHOP II, Heraklion, Crete, Greece, 30th May 2006
Multilingual communication (textual/verbal) <> Oriya. Punjabi. Sanskrit. Santhali. Sindhi. Tamil. Telugu. Urdu. Assamese. Bengali. Bodo. Dogri. Gujarati. Hindi. Kannada. Kashmiri. Konkani. Maithili. Malayalam. Manipuri. Marathi. Nepali.
Deshanagari – a common script for all Indian languages Multilingual Happenings <> <> Best wishes for the new year
Sound Poems <>
Collaborative research on notation system for Indian music with Dr. Ashok Ranade <>
Concrete text <>
Indian oral tradition <>
Rigveda Shakha Branches Sanhita Text Pratishakhya Phonetics Shiksha Grammar LakshangranthaBrahmanAranyankaUpanishadShroutasutra Kaushitaki Ashvalayan Shankhayan Shakal Bashkal Atharvaveda Samaveda Yajurveda Shukla -Madhyandina Kanva Krishna -Taitariya Maitrayani Shaunakiya Paipaladi Jaiminiya Ranayaniya Kauthum Rugveda Pratishakhya Yajurveda Pratishakhya Taitariya Pratishakhya Shaunakiya Chaturadhyayik Atharva Pratishakhya Samaveda Pratishakhya Pushpasutra Paniniya Apishali Audavraji Manduki Shodashashloki Yadnyavalka Naradiya etc. <> <>
Varna definitions Varnamala/Matruka Brahmamatruka = 65 Varnas ( ) Akshamatruka = 51 Varnas Rudramatruka = 43 Varnas Kulakramasiddha = 42 Varnas Varnas (Phonemes) Swara Varna (Vowel phonemes) Vyanjana Varna (Consonant phonemes) <>
Taitariya Pratishakhya Sadharanavidhi Samhitadhikarana Uccaranakalpa Formation of articulate sounds and mode of their production. Varnakrama Shuddha, Swara, Matra, Anga Varnasarabhuta Varnasara Bhutavarna Kramah 1. Dhvani 2. Sthanam 3. Karnam 4. Prayatnah 5. Kalatah 6. Swarah 7. Devata 8. Jati Saraswati Sakaara Akaara Reph Akaara Sakaaradvitta Vakaara Akaara Takaara Iikaara <> <> VedaLakshanam
Speech related issues <> <>
Spoken Utterances and written marks <>
<> <>
<> Brahmi script
<> <> Consonant sound + Vowel sound renderings in different Indian scripts Consonant sound + consonant sound + vowel sound renderings in Devanagari script
Consonant sound + Vowel sound + accent marks renderings in Vedic Sanskrit Consonant sound + Vowel sound + Samavedic intonation marks renderings in Vedic Sanskrit <> <>
The concept of InPho Phonemic Model from India for Bi-modal applications <>
InPho To identify the total range of InPho: vowel phonemes, consonant phonemes, modifiers, tonal marks and other signs with phonetic and orthographic definitions. To evolve standardized codes for InPho. To document bi-modal models: written and spoken. Bases: 1. Vedic Sanskrit 2. Manak Hindi 3. International Phonetic Alphabet is an attempt to identify and realize strong Indian oral tradition with the phonemic root base of Indian languages and to explore the correlation between verbal and written expressions. (Phonemic model from India for Bi-modal applications - Bharatiya Varna Samamnaya) <> <>
Proposal for phonemic codes <>
Testing of Vedic Sanskrit, A part from "Naradiya Siksa“ Bhandarkar Oriental Research Institute, Poona , INDIA How codes work A Glyph FullKa Glyph Ku AE Glyph KOUdatta <> <> Syllables of consonant sound K Samavedic syllabic text output from phonemic input V VV VC CV CCC….V Vm Vm C Cm V Cm Cm Cm...Vm m - modifier V – Vowel C - Consonant
The Oral Tradition (K model) The Writing Tradition (K model) Phonetic classification of linguistic sounds based on ‘Varna’ concept. Phonetic nuances in terms of pitch, time, stress in order to achieve intonations. Oral modes: recitation, repetition, memorization and oral reproductions through correct pronunciations of syllabic structures and attentive listening (shravana – shruti). The identification of orthographic marks for graphic structures of syllabic utterances. A simple addition representing Consonant sound + Vowel sound. A simple model of V, VV, C, CV, CC….V to create syllabic writing structure. <>
InPho and IPA <>
Range of InPho Vyanjana Varna - InPho Swara Varna - InPho <> <>
Range of IPA <>
IPA issues 1.Based on extended (Manak Hindi) and phonemic structure of Sanskrit, InPho with the combinations of 52 consonant phonemes and 22 vowel phonemes (CV), provides a basic bi-modal common K model for writing as well as speech in Indian languages. The model has a possibility of extension with the provision of tonal marks. 2.Vedic tonal marks and other intonation signs can have multipurpose usage for present linguistic and dialectal scene of India InPho issues 1.Based on Latin script, IPA has 516 code points in Unicode. Majority of code description is of graphic nature. No equivalences are marked for Asian/Indian languages variations of lower case ‘a’ and 16 variations of capital ‘A’ are provided without phonetic description. 3.IPA has its own coding scheme and is being used at higher scholarly level for transliteration purpose. <>
The proposal In realization of the further potential of exploratory applications it is proposed that: Varna should be treated as a basic smallest unit of input, encoding, output and transmission environment ('K' K model). At present the basic smallest unit for encoding is treated as KA (consonant varna K + vowel varna A). The K model should be used to draw parallelence between International Phonetic Alphabet (IPA) and Indian Phonemic Alphabet (InPho). The authentic phonetic breakup of words in Indian language dictionaries can be considered using ‘K’ model. To adopt ‘K' model for Indian Languages for bi-modal applications to realize element as part of SSML <>
Position Statement – Prof R. K. Joshi Writing was a means of recording speech. Writing as an art was practiced by calligraphers under calligraphy. Text printing was a means of spreading writing. Text planning as an art was practiced by Typographers under typography. Computing is a means of processing digital data of multi-modal linguistic/graphic expressions (writing, speech and graphics). Computing is an art practiced by compugraphers under compugraphy. Compugraphy should aim at a new idiom of writing - writing with graphic elemental changes. a new idiom of printing - dynamically changing printed text in time. a new idiom of processing - multi-modal conversions through natural language processing for cross cultural exchange. <> From pen to print (Analog era) Processing of spoken utterances and/or written marks was carried out internally by human brain/mind/intelligence. From print to processing (Digital era) Processing of spoken utterances and/or written marks is carried out externally by standardized processing mechanism with artificial intelligence.
<> Simple text Formatted text Concrete text Monotone speech Synthetic speech Stylistic speech
<>
Concrete text <>
Conclusion The multitude of Indian language and dialects are written using different scripts. These scripts have been allotted distinct code pages in the Unicode scheme. Bi-modal rendering and processing of Indian languages is complex and mandates distinctly different techniques than Latin script. InPho scheme caters standards to both writing and speech and is based on Indian oral and written tradition. InPho scheme is well suited for other processing tasks such as transliteration, sorting, searching, speech synthesis, speech recognition etc. InPho scheme with its K model to be propagated as a ‘new media’ standard for Indian language Information Interchange. <>
References Kelkar A.R Transliteration of South Asian languages, a brief review and a proposal for a standard. Centre of advanced study in linguistics, Deccan College, Pune. Handbook of the International Phonetic Association Cambridge University Press. Naravane V.D Bharatiya Vyavahar Kosh. Triveni Sangam. Peter Ladefoged, 2001, Vowels and Consonants. Blackwell publishers, UK. Prabodh Primer, Department of official languages, Ministry of home affairs, Govt. of India. R. K. Joshi, Prague 2003, A unified phonemic code based scheme for effective processing of Indian languages. 23rd Internationalization and Unicode. R. K. Joshi. October 2002, Vedic Code, a draft, Vishwabharat No. 7. R. Shama Sastri, K. Rangacharya, 1985 reprint, The Taittiriya Pratishakhya, Motilal Banarasidas, Delhi. Shrinath Shanbaug, Durgesh Rau, R. K. Joshi 2002, An intelligent multi-layered input scheme for phonetic scripts. ACM International conference proceeding series, Hawthorne, New York. Unicode The Unicode Consortium The Unicode standard is the universal character-encoding scheme for written characters and text. It defines a consistent way of encoding multilingual text through software. Vinod Kumar, 2005, IndiX information leaflet, C-DAC Mumbai. W.S. Allen, 1961, Phonetics in Ancient India, London, Oxford University Press. Gajendragadakar S. N., 1972, Bhasha Va Bhashashastra, Venus Prakashan, Pune, November Acknowledgements Executive Director, Director Admin, C-DAC Mumbai. IndiX technology team. IndiX design team: Jui Mhatre, Supriya Kharkar. Vaidik Samshodhan Mandal, Pune. Bharati Sanskrut Niketanam, Mumbai/Lonavala. <> <>
designer calligrapher type designer revivalist poet academician typographer compugrapher