Tel More Telugu Morphological Generator


1 TelMore: Telugu Morphological Generator
Madhavi Ganapathiraju and Lori Levin
Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA
I am going to present a tool that can generate morphological forms of Telugu words.
ICUDL 2006: Second International Conference on Universal Digital Library, Alexandria, Egypt, November 17-19, 2006

2 UDL
Diagram labels: UDL; machine translation, information retrieval, interface design, digital storage, summarization, OCR
A number of language processing tools have emerged from the research base created by the Universal Digital Library. The work that I am presenting fits well into the machine translation work presented by Prof Balki yesterday.
19th Nov, 2006 ICUDL2006: TelMore - Telugu Morphological Generator

3 Machine translation
Rani gave the book to my mother
1. Phrase match in EBMT
Gave to <noun> → <noun> ki ichchaad’u
OR
1. Output from English lexical analysis
gave → verb, past, root give
the book → noun phrase, singular, neutral
mother → noun, singular, feminine
my → possessive, root I
2. English-Telugu dictionary for root forms of nouns and verbs
give → ichchut’a
book → pustakamu
mother → talli, amma
I → neinu
3. TelMore: morphological generator for Telugu
ichchut’a → ichchaad’u (past masc), ichchinadi (past fem), ... istun’di (future fem), istaad’u (future masc)
pustakamu → pustakamu, pustakamutoo (with pustakamu), pustakamu loo (in pustakamu)…
amma → ammaki (to amma), amma cheita (by amma)
I → naa (possessive)
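
The dictionary-then-generator path on this slide can be sketched roughly as below. This is only an illustration, not TelMore's actual code: the two tables hold nothing but the example entries shown on the slide, the function names (`translate_root`, `inflect_english`) and the feature labels ("to", "with", "past masc") are hypothetical, and a real generator would compute the forms from rules rather than look them up.

```python
# Step 2 on the slide: English -> Telugu root dictionary.
# Entries are copied from the slide; nothing else is assumed.
ROOT_DICTIONARY = {
    "give": ["ichchut'a"],
    "book": ["pustakamu"],
    "mother": ["talli", "amma"],
    "I": ["neinu"],
}

# Step 3 on the slide: a lookup table standing in for the generator.
# Hypothetical structure; keys are (Telugu root, feature label) pairs and
# the values are the example forms listed on the slide.
GENERATED_FORMS = {
    ("ichchut'a", "past masc"): "ichchaad'u",
    ("ichchut'a", "past fem"): "ichchinadi",
    ("pustakamu", "with"): "pustakamutoo",
    ("pustakamu", "in"): "pustakamu loo",
    ("amma", "to"): "ammaki",
    ("amma", "by"): "amma cheita",
}


def translate_root(english_root):
    """Return the Telugu root candidates for an English root word."""
    return ROOT_DICTIONARY.get(english_root, [])


def inflect_english(english_root, feature):
    """Return every inflected Telugu candidate the toy tables can produce."""
    candidates = []
    for telugu_root in translate_root(english_root):
        form = GENERATED_FORMS.get((telugu_root, feature))
        if form is not None:
            candidates.append(form)
    return candidates


if __name__ == "__main__":
    print(inflect_english("mother", "to"))
    print(inflect_english("give", "past masc"))
```

Keeping the dictionary and the generator as separate steps mirrors the slide: the dictionary only needs root forms, and all inflection is left to the generator.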

4 TelMore
Generates morphological forms for nouns and verbs when the root word is given

5 About Telugu
2nd largest spoken language in India (?)
70 million native speakers
World ranking 13-17, alongside Korean, Vietnamese, Marathi and Tamil
Recorded origin in the 7th century AD
Literary language in the 11th century AD

6 Parts of Speech: Noun
Number: singular, plural
Gender: male, female, neutral
Morphological forms (vibhaktulu): nominative, genitive, dative, accusative, vocative, instrumental and locative
14 forms for each noun
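
The count of 14 noun forms follows from crossing the two numbers with the seven vibhaktulu listed on this slide. A minimal sketch of that enumeration, assuming gender is a property of the noun rather than a dimension of the paradigm (the slide does not say so explicitly) and using a hypothetical function name:

```python
from itertools import product

NUMBERS = ("singular", "plural")
CASES = (
    "nominative", "genitive", "dative", "accusative",
    "vocative", "instrumental", "locative",
)


def noun_paradigm_slots():
    """Return every (number, case) slot a noun paradigm has to fill."""
    return list(product(NUMBERS, CASES))


if __name__ == "__main__":
    slots = noun_paradigm_slots()
    print(len(slots))  # 2 numbers x 7 cases
    for number, case in slots:
        print(number, case)
```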

7 Plural formation
General rule is to add “lu” as a suffix
A series of rules is then applied to yield the final form: lu, llu, l’l’u or n’d’lu
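
The two-stage shape described on this slide (append "lu", then run adjustment rules in order) might look like the sketch below. The function name and the rule representation are assumptions made for illustration, and no actual Telugu adjustment rules are included, since the slide does not list them; with an empty rule list the function returns only the general-rule candidate, which is not claimed to be the correct final plural.

```python
def pluralize(root, adjustment_rules=()):
    """Apply the general rule, then each adjustment rule in order.

    adjustment_rules is a sequence of (applies, rewrite) pairs of callables:
    applies(candidate) -> bool, rewrite(candidate) -> str.
    The rules themselves must be supplied by the caller; none are built in.
    """
    candidate = root + "lu"  # general rule from the slide
    for applies, rewrite in adjustment_rules:
        if applies(candidate):
            candidate = rewrite(candidate)
    return candidate


if __name__ == "__main__":
    # General rule only; the adjustment rules are deliberately left empty.
    print(pluralize("pustakamu"))
```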

8 Parts of Speech: Verb
Number: singular, plural
Gender: male, female, neutral
Person: 1st person, 2nd person, 3rd person
Morphological forms: present, past, future, aorist affirmative, aorist negative, imperative and prohibitive
Present participle, past participle: affirmative and negative
Number of forms: 2 x 3 x 3 x 7 + 4 = 130 forms for each verb
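
The arithmetic on this slide (2 x 3 x 3 x 7 + 4 = 130) can be checked by enumerating the slots directly. A minimal sketch, reading the 1st/2nd/3rd feature as grammatical person and the "+ 4" as the present and past participles in affirmative and negative; the function name is hypothetical:

```python
from itertools import product

NUMBERS = ("singular", "plural")
GENDERS = ("male", "female", "neutral")
PERSONS = ("1st", "2nd", "3rd")
FINITE_FORMS = (
    "present", "past", "future",
    "aorist affirmative", "aorist negative",
    "imperative", "prohibitive",
)
PARTICIPLES = ("present participle", "past participle")
POLARITIES = ("affirmative", "negative")


def verb_paradigm_slots():
    """Return all finite slots followed by the participle slots."""
    finite = list(product(NUMBERS, GENDERS, PERSONS, FINITE_FORMS))
    participles = list(product(PARTICIPLES, POLARITIES))
    return finite + participles


if __name__ == "__main__":
    print(len(verb_paradigm_slots()))  # 2*3*3*7 + 4
```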

9 Features in TelMore (v.1)
Morphological form generation: nouns, verbs
System: library module for integration elsewhere; flat file input and output (plain text or HTML); user-interactive through command line
Web interface for data addition with user validation
Web interface

10 Current Data Size
Words have been created by native speakers upon request

17 Linguistic Knowledge
The linguistic rules are taken from a book by C.P. Brown
Rules are demonstrated through examples
No formal description

18 Noun: First Declension Morphs

19 Noun: Second Declension

20 Noun: Third Declension

21 Noun: Third Declension: Irregular 2

22 Noun: Third Declension: Irregular 3

23 Noun: Third Declension: Irregular 4

24 Noun: Third Declension: Irregular 5

25 Verb: First Conjugation

26 Verb: Second Conjugation

27 Verb: Third Conjugation

28 Alternate dialects and spellings
Telugu is spoken in many dialects
Andhra Pradesh has long borders with 4 states, each of which speaks a different language, and one long coastal region
The dialect in each of these regions is different
The learned and the others speak different dialects
Urdu influence in Hyderabad due to Muslim rule
Pure/poetic, formal/informal
Telugu is written the way it is spoken, hence the different dialects result in different spellings of the words

29 Future work for this tool
Causative, middle and passive voices to be added
Morphology of adjectives, etc.
Integration of Om → native font integration for flat file processing
Integration with English lexicon to be of real use in multilingual applications

30 Acknowledgements
Prof. Lori Levin: Linguistics Advisor
Prof. Raj Reddy and Prof. N. Balakrishnan: UDL Advisors
R. Harsha Naveena Yanamala
Web-interface creation, Data Creation …
V. Mythili Shyam G. Padmasree V. Abhinay B.V. Prashanth G. Ramana Lakshmi G. Padmavathy V. Nava Mallika

31 http://linzer.blm.cs.cmu.edu/morph/ www.cs.cmu.edu/~madhavi

