MTP I Stage Project Presentation Guided by- Presented by- Prof. Pushpak Bhattacharyya Abhijeet Padhye Department of Computer Science and Engineering Indian.


1 MTP I Stage Project Presentation
Guided by: Prof. Pushpak Bhattacharyya
Presented by: Abhijeet Padhye
Department of Computer Science and Engineering, Indian Institute of Technology, Bombay

2
1. Motivation
2. Introduction
3. Introduction to Transliteration
4. Syllables and their structure types
5. Sonority Theory
6. Relation between Sonority and Syllables
7. What is Schwa?
8. A Sonority theory based Syllabification module
9. Results obtained
10. References

3
• Language is an integral part of society
• Each language has its own structure and rules
• Some basic concepts are common to all languages
• These shared concepts help in processes like transliteration, ultimately leading to better CLIR
• We are trying to exploit them for the process of syllabification

4 “To study some Phonological similarities between English, Hindi and Marathi and exploit them in order to achieve the goal of transliteration with high accuracy so as to be able to tackle problems like OOV words during Cross-Lingual Information Retrieval.”

5
• Concepts being emphasized
  • Transliteration
  • Theory of Syllables
  • Sonority Theory
  • Their relation
  • Theory of Schwa & Schwa deletion
• Mainly based on the properties of sound
• Sound is the driving force behind word pronunciation in any language

6
• Transliteration is the process of phonetically “translating” named entities, such as proper nouns, from a source language to a target language.[1]
• The process of transliteration should be as accurate as possible.
• It faces the problem of multiple variants of the same word.

7

8 “Syllable is a unit of spoken language consisting of a single uninterrupted sound formed generally by a Vowel and preceded or followed by one or more consonants.”
• Vowels are the heart of a syllable (the most sonorous element)
• Consonants act as sounds attached to vowels

9
• A syllable consists of 3 major parts:
  • Onset (C)
  • Nucleus (V)
  • Coda (C)
• Vowels sit in the Nucleus of a syllable
• Consonants may get attached as Onset or Coda
• Basic structure: CV

10
• The Nucleus is always present
• Onset and Coda may be absent
• Possible structures:
  • V
  • CV
  • VC
  • CVC
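The four structures above can be illustrated with a small Java sketch. It is not part of the presented module: the class name `SyllableShape` and the helper `shape` are hypothetical, and the letters a, e, i, o, u are assumed to stand in for vowel sounds, which is only a rough approximation for English spelling.

```java
public class SyllableShape {
    // Assumption: the five vowel letters approximate the vowel sounds.
    static boolean isVowel(char c) {
        return "aeiou".indexOf(Character.toLowerCase(c)) >= 0;
    }

    // Collapses each run of vowels to one V and each run of consonants to one C,
    // so a single syllable maps to one of V, CV, VC or CVC.
    static String shape(String syllable) {
        StringBuilder sb = new StringBuilder();
        for (char c : syllable.toCharArray()) {
            char tag = isVowel(c) ? 'V' : 'C';
            if (sb.length() == 0 || sb.charAt(sb.length() - 1) != tag) {
                sb.append(tag);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"a", "ma", "am", "man"}) {
            System.out.println(s + " -> " + shape(s)); // V, CV, VC, CVC in turn
        }
    }
}
```

Because consonant runs are collapsed, a syllable such as "plo" is also reported as CV: the whole cluster "pl" counts as one onset.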

11
• Prominence Theory
  • E.g. entertaining /entəteɪnɪŋ/
  • The peaks of prominence are the vowels /e ə eɪ ɪ/ (the diphthong /eɪ/ forms a single peak)
  • Number of syllables: 4
• Chest Pulse Theory
  • Based on muscular activities
• Sonority Theory
  • Based on the relative sonority of segments within words

12 “The Sonority of a sound is its loudness relative to other sounds with the same length, stress and pitch.”
• Every language has a set of sounds associated with it
• Some sounds are more sonorous than others
• Words in a language can be divided into syllables
• Sonority theory delimits syllables on the basis of these sounds

13
• Defined on the basis of the amount of sound associated with each segment
• The sonority hierarchy, from most to least sonorous, is as follows:
  • Vowels (a, e, i, o, u)
  • Liquids and glides (y, r, l, v)
  • Nasals (n, m)
  • Fricatives (s, z, f, ... sh, th etc.)
  • Affricates (ch, j)
  • Stops (b, d, g, p, t, k)
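The hierarchy above can be stored as a lookup table. The following is a minimal Java sketch, not the presenter's code: the class name `SonorityScale`, the numeric ranks 1 to 6, and the fallback rank 0 are assumptions made for illustration. Only single letters listed on the slide are covered, so digraphs such as sh, th and ch are left out.

```java
import java.util.HashMap;
import java.util.Map;

public class SonorityScale {
    // Higher rank = more sonorous. The numbers are arbitrary; only their order matters.
    static final Map<Character, Integer> RANK = new HashMap<>();

    static void put(String letters, int rank) {
        for (char c : letters.toCharArray()) {
            RANK.put(c, rank);
        }
    }

    static {
        put("aeiou", 6);   // vowels
        put("yrlv", 5);    // liquids and glides, grouped as on the slide
        put("nm", 4);      // nasals
        put("szf", 3);     // fricatives (single letters only)
        put("j", 2);       // affricates (single letters only)
        put("bdgptk", 1);  // stops
    }

    // Returns 0 for any letter the table does not cover (an assumption of this sketch).
    static int rank(char c) {
        return RANK.getOrDefault(Character.toLowerCase(c), 0);
    }

    public static void main(String[] args) {
        for (char c : "plan".toCharArray()) {
            System.out.println(c + " -> " + rank(c));
        }
    }
}
```

A hash map gives constant-time lookup per letter, which matches the motivation the presentation gives for using one.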

14
• Obstruents can be further classified into:
  • Fricatives
  • Affricates
  • Stops

15 “A Syllable is a cluster of sonority, defined by a sonority peak acting as a structural magnet to the surrounding lower sonority elements.”
• Represented as waves of sonority, or the Sonority Profile of that syllable
[Figure: sonority profile labelled Onset, Nucleus, Coda]

16 “The Sonority Profile of a syllable must rise until its Peak (Nucleus), and then fall.”
[Figure: sonority profile labelled Onset, Peak (Nucleus), Coda]
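The rise-then-fall condition can be checked mechanically for a single candidate syllable. The sketch below is illustrative only: the class `SonorityProfile`, the method `risesThenFalls`, and the numeric ranks are assumptions, the letter classes follow the hierarchy given earlier, and letters outside that list (for example h) get rank 0.

```java
public class SonorityProfile {
    // Letter classes from least to most sonorous; index + 1 is the rank.
    static final String[] CLASSES = {"bdgptk", "j", "szf", "nm", "yrlv", "aeiou"};
    static final int VOWEL = 6;

    static int rank(char c) {
        char lower = Character.toLowerCase(c);
        for (int i = 0; i < CLASSES.length; i++) {
            if (CLASSES[i].indexOf(lower) >= 0) {
                return i + 1;
            }
        }
        return 0; // letter not covered by this sketch
    }

    // True if sonority strictly rises to a vowel peak, optionally stays on
    // adjacent vowels, and then strictly falls to the end of the syllable.
    static boolean risesThenFalls(String syllable) {
        int n = syllable.length();
        if (n == 0) {
            return false;
        }
        int i = 0;
        while (i + 1 < n && rank(syllable.charAt(i + 1)) > rank(syllable.charAt(i))) {
            i++; // onset: rising
        }
        if (rank(syllable.charAt(i)) != VOWEL) {
            return false; // the peak must be a vowel
        }
        while (i + 1 < n && rank(syllable.charAt(i + 1)) == VOWEL) {
            i++; // adjacent vowels share the peak
        }
        while (i + 1 < n && rank(syllable.charAt(i + 1)) < rank(syllable.charAt(i))) {
            i++; // coda: falling
        }
        return i == n - 1;
    }

    public static void main(String[] args) {
        System.out.println(risesThenFalls("plan")); // stop, liquid, vowel, nasal
        System.out.println(risesThenFalls("lpan")); // liquid before stop breaks the rise
    }
}
```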

17
• ABHIJEET
[Figure: Sonority Profile 1 and Sonority Profile 2 of the word, plotted over its letters with the vowels A, I, E, E highest, H and J below them, and B and T lowest]

18 “The Intervocalic consonants are maximally assigned to the Onsets of syllables in conformity with Universal and Language-Specific Conditions.”
• Determines the underlying syllable division
• Example: DIPLOMA
[Figure: alternative syllable divisions of DIPLOMA]
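A minimal Java sketch of this principle follows. It is not the presenter's implementation: the class `MaximalOnset` and the small `LEGAL_ONSETS` set are hypothetical stand-ins for the language-specific conditions, and vowel letters are assumed to mark the nuclei. For each consonant cluster between two vowels, the longest suffix that is a legal onset goes to the following syllable and the rest stays behind as coda.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MaximalOnset {
    // Hypothetical, deliberately tiny inventory of permitted onsets.
    static final Set<String> LEGAL_ONSETS = new HashSet<>(Arrays.asList(
            "b", "d", "g", "p", "t", "k", "m", "n", "l", "r", "s",
            "pl", "pr", "tr", "st", "str"));

    static boolean isVowel(char c) {
        return "aeiou".indexOf(c) >= 0;
    }

    static List<String> syllabify(String word) {
        String w = word.toLowerCase();
        int n = w.length();
        List<String> out = new ArrayList<>();
        int start = 0; // start of the syllable being built
        int i = 0;
        while (i < n && !isVowel(w.charAt(i))) i++;      // word-initial onset
        while (i < n) {
            while (i < n && isVowel(w.charAt(i))) i++;   // nucleus
            int clusterStart = i;
            while (i < n && !isVowel(w.charAt(i))) i++;  // consonant cluster
            if (i == n) break;                           // final cluster is a coda
            int split = i;                               // default: whole cluster is coda
            for (int s = clusterStart; s < i; s++) {
                // The first hit is the longest suffix that is a legal onset.
                if (LEGAL_ONSETS.contains(w.substring(s, i))) {
                    split = s;
                    break;
                }
            }
            out.add(w.substring(start, split));
            start = split;
        }
        if (start < n) out.add(w.substring(start));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(String.join(".", syllabify("diploma"))); // prints di.plo.ma
    }
}
```

With "pl" in the onset inventory the division is di.plo.ma. Removing "pl" from the set would instead give dip.lo.ma, which shows how the language-specific conditions steer the result.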

19
• First alphabet of IAL: {a}
• An unstressed and toneless neutral vowel
• Sanskrit is phonetically perfect: it has no neutral vowels
• Hindi, Bengali etc. allow the schwa to be neutral
• Some schwas are deleted and some are not
• Schwa deletion is an important issue for grapheme-to-phoneme conversion

20
1) Saphalya and Amantrana
2) Priya and Tritiya
3) Kavya and Ashva
4) Badhai
5) Samuha and Chehara
6) Badara and Kalama
7) Kalama and Banda

21
• Developed completely in Java
• Platform independent
• Performs syllabification of words
• Built on the concepts of Sonority theory, mainly the sonority sequencing principle
• Uses Java's HashMap class to save execution time

22
• Consists of four major functions, one of which is still under development:
  • SonorityHierarchy()
  • syllabify(String word)
  • accuracy()
  • Delete_schwa() [Under Development]
• Stores the sonority hierarchy in a hashmap and looks it up from there
• Tries to find the syllable boundaries according to their sonority profile
• Tries to delete schwas present in the input
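One way such a syllabifier could work is sketched below. The presentation does not show the module's code, so everything here is an assumption: the class `SonoritySyllabifier`, the numeric ranks, and the rule of placing each boundary just before the last sonority minimum between two vowels. Letters missing from the table get rank 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SonoritySyllabifier {
    static final int VOWEL = 6;
    private static final Map<Character, Integer> RANK = new HashMap<>();

    static {
        // Letter classes from least to most sonorous; rank = index + 1.
        String[] classes = {"bdgptk", "j", "szf", "nm", "yrlv", "aeiou"};
        for (int r = 0; r < classes.length; r++) {
            for (char c : classes[r].toCharArray()) {
                RANK.put(c, r + 1);
            }
        }
    }

    static int rank(char c) {
        return RANK.getOrDefault(c, 0);
    }

    // Splits a word before the last sonority minimum of every consonant
    // cluster that lies between two vowels.
    static List<String> syllabify(String word) {
        String w = word.toLowerCase();
        List<String> syllables = new ArrayList<>();
        if (w.isEmpty()) {
            return syllables;
        }
        int start = 0;      // start of the current syllable
        int lastVowel = -1; // index of the most recent vowel
        for (int i = 0; i < w.length(); i++) {
            if (rank(w.charAt(i)) != VOWEL) {
                continue;
            }
            if (lastVowel >= 0 && i - lastVowel > 1) {
                int min = lastVowel + 1;
                for (int k = lastVowel + 1; k < i; k++) {
                    if (rank(w.charAt(k)) <= rank(w.charAt(min))) {
                        min = k; // ties go to the later consonant
                    }
                }
                syllables.add(w.substring(start, min));
                start = min;
            }
            lastVowel = i;
        }
        syllables.add(w.substring(start));
        return syllables;
    }

    public static void main(String[] args) {
        System.out.println(String.join(".", syllabify("entertain"))); // prints en.ter.tain
    }
}
```

This works on spelling rather than phonemes, so adjacent vowel letters are always kept in one syllable; a real module would need rules for vowel sequences and digraphs.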

23
• Syllabification and PRR generation modules implemented
• Number of manually syllabified words: 27614
• Number of words fed as input: 27614
• Number of words correctly syllabified: 26253
• Accuracy obtained: 95.86% for English and about 70% for Hindi
• Accuracy of Schwa deletion in English: 77%
• Schwa deletion for Hindi is under development

24
• Problems faced
  • First rule-based implementation failed
  • Some specific consonant and vowel clusters still result in erroneous syllabification
• Future work
  • Schwa deletion for Hindi and Marathi
  • Implementation of Maximal Onset First principle
  • Packaging the above implementation in a stable transliteration module to be used further in CLIR

25
1) Giegerich, H. J. 1992. English Phonology: An Introduction.
2) Kahn, Daniel. 1976. Syllable-based Generalizations in English Phonology.
3) Lass, Roger. 1984. Phonology: An Introduction to Basic Concepts. Cambridge University Press.

