Published by Catherine Sutton. Modified over 9 years ago.
1
Machine Translation: Marathi to UNL
Presented by Ashwini, Salil
Center for Indian Language Technology Solutions, CSE, IIT Powai
2
Characteristics of Marathi
a. Syntactic structure
– Subject-object-verb, e.g. राम भात खातो. (Ram eats rice.)
– Similarity with Hindi
b. Morphology
– प्रत्यय (suffixes)
– Differences with Hindi
3
Main tasks
1. Marathi-UW dictionary building
2. Rulebase building for converting Marathi language phenomena to UNL expressions
3. Testing using corpus sentences
4. Verification with Hindi and Marathi deconverters
4
Analysis consists of: Morphology, Syntax, Semantics, Pragmatics
5
Marathi analysis done so far
We focus on Marathi morphology:
– Noun morphology
– Pronoun morphology
– Verb morphology
– Relation label morphology
– Adjective morphology
6
Types of adjectives in Marathi
1. Pronominal adjectives
1.1 Pronoun adjectives: the nine pronouns used as adjectives
1.2 Adjectives derived from the nine pronouns
2. Qualitative adjectives
2.1 Adjectives ending with the vowel आ
2.2 Adjectives ending with vowels other than आ
2.3 Postposition adjectives
7
Types of adjectives [contd.]
3. Numerical adjectives
3.1 Cardinal
3.1.1 (whole number)
3.1.2 (fractional number)
3.1.3 (entirety, totality, completeness)
3.2 Ordinal
3.3 Occurrencial 6 types
3.4 Distinctive
8
Does [pAvaNedonashe] mean 175 or 199.75?
- No word is assigned to 199.75, 299.75, etc.
- The problem arises with paun, pauvane and savva.
- It is analysed as (pAvaNedon) times 100 (she), i.e. 1.75 × 100 = 175. Both she and shambhar mean 100. pAUNashe means 75. pAvaNeshambhar means 99.75.
- The powers of ten for which Marathi has a distinct word need to be stored separately.
- The pronunciation is not pAvaNedona-[pause]-she but pAvaNe-[pause]-donashe.
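The two candidate readings of pAvaNedonashe can be checked with simple arithmetic. The following is only a sketch: the lookup tables and function names are hypothetical and not part of the rulebase, and the romanized spellings follow the slide.

```python
from fractions import Fraction

# Hypothetical lookup tables for illustration only.
QUARTER_LESS = Fraction(-1, 4)   # pAvaNe: "a quarter less than"
THREE_QUARTERS = Fraction(3, 4)  # pAUNa: "three quarters of"
UNITS = {"don": 2}               # don = 2
HUNDRED = 100                    # she / shambhar

def multiplier_reading(unit: str) -> Fraction:
    """Apply pAvaNe to the unit first, then scale by 100: (2 - 1/4) * 100."""
    return (UNITS[unit] + QUARTER_LESS) * HUNDRED

def whole_number_reading(unit: str) -> Fraction:
    """Apply pAvaNe to the finished number (donashe = 200): 200 - 1/4."""
    return UNITS[unit] * HUNDRED + QUARTER_LESS

print(float(multiplier_reading("don")))    # 175.0
print(float(whole_number_reading("don")))  # 199.75
print(float(THREE_QUARTERS * HUNDRED))     # 75.0  (pAUNashe)
print(float(HUNDRED + QUARTER_LESS))       # 99.75 (pAvaNeshambhar)
```

Exact fractions are used instead of floats so that quarter values compare without rounding error.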
9
Tables of numbers: continuous and random access
Some forms of numbers are used for verbalizing the tables of numbers: सात / साता / साते / साती / सत्ते.
Marathi: A, B times, (is C), occurring in the table for A.
English: B A's (are C).
Usage of forms:
1. only for the expression 'A'
2. only for 'B times'
3. only while recalling the number directly without going through the table
Some forms occur especially for squares. The repetition is emphasized.
10
Words used to familiarise a child with numbers
Some words are used mostly to familiarise a child with numbers: एकी एक, दुर्की दोन, तिर्की तीन, etc. The similarity of each word to the number helps a child remember the number. The words used as familiarisers are: एकी, दुर्की, तिर्की, चौकी, पाची, साही, साती, आठी, नवे, दाही.
11
Playing cards and the game of cricket
1. Playing cards: ekka, durri / durra, tirri / tirra, chavvi / chouka, panji / panja, chhakki / chhakka, satti / satta, atthi / attha, navvi / nashsha, dashshi / dashsha.
2. Shots scoring multiple runs in the game of cricket: चौकार (four), षटकार (six).
12
The current status of dictionary
Number of entries: 375
Dictionary:
– Nouns
– Noun morphology suffixes
– Verbs
– Verb morphology suffixes
13
The current status of rulebase
Number of rules: 1050
Verb morphology (simple and conjunct verbs)
– Tense (past, present, future)
– Aspect of tense (progress, complete, custom)
– Voice (passive voice)
– अर्थ (mood: imperative, should, negative)
– Ability, intention etc., for conjunct verbs only
14
The current status of rulebase [contd.]
Noun morphology
– Number
– With case marker (सामान्यरूप)
Case when the penultimate vowel is either ऊ or ई, e.g. मूल - मुले (plural)
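The penultimate-vowel case can be illustrated on Unicode Devanagari. This is a minimal sketch, not the rulebase itself: it assumes the long vowel appears as a vowel sign immediately before the final consonant, that it shortens when a suffix is attached, and that the plural suffix is the one seen in the slide's single example. The function names are hypothetical.

```python
# Long vowel sign -> short vowel sign (assumed pairs).
SHORTEN = {
    "\u0942": "\u0941",  # ू (long u) -> ु (short u)
    "\u0940": "\u093f",  # ी (long i) -> ि (short i)
}
PLURAL_SUFFIX = "\u0947"  # े, taken from the slide's example; an assumption otherwise

def shorten_penultimate_vowel(stem: str) -> str:
    """Shorten a long u/i vowel sign that sits just before the final consonant."""
    if len(stem) >= 2 and stem[-2] in SHORTEN:
        return stem[:-2] + SHORTEN[stem[-2]] + stem[-1]
    return stem

def pluralize(noun: str, suffix: str = PLURAL_SUFFIX) -> str:
    """Attach the suffix to the stem after shortening its penultimate vowel."""
    return shorten_penultimate_vowel(noun) + suffix

print(pluralize("मूल"))  # मुले
```

Nouns that do not match the pattern pass through with only the suffix attached; a real rulebase would need gender and declension class to pick the suffix.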
15
The current status of rulebase [contd.]
Relation labels used so far: agt, obj, gol, aoj, and, or
e.g. मुलांनी आंबे खाल्ले नव्हते. (The children had not eaten the mangoes.)
obj(eat(icl>do).@entry.@pred.@past.@not.@complete, mango(icl>fruit):08.@pl)
agt(eat(icl>do).@entry.@pred.@past.@not.@complete, child(icl>person):00.@pl)
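The relation lines in the example have a regular shape: a label, then two universal words, each carrying a restriction, an optional node id and a list of attributes. The sketch below rebuilds the two lines from that shape. The class and function names are hypothetical and do not belong to the EnConverter; only the output format is taken from the slide.

```python
from dataclasses import dataclass, field

@dataclass
class UniversalWord:
    headword: str
    restriction: str = ""                            # e.g. "icl>do"
    node_id: str = ""                                # e.g. "08"
    attributes: list = field(default_factory=list)   # e.g. ["past", "not"]

    def __str__(self) -> str:
        text = self.headword
        if self.restriction:
            text += f"({self.restriction})"
        if self.node_id:
            text += f":{self.node_id}"
        return text + "".join(f".@{a}" for a in self.attributes)

def relation(label: str, head: UniversalWord, dependent: UniversalWord) -> str:
    """Format one binary relation as label(head, dependent)."""
    return f"{label}({head}, {dependent})"

eat = UniversalWord("eat", "icl>do",
                    attributes=["entry", "pred", "past", "not", "complete"])
mango = UniversalWord("mango", "icl>fruit", "08", ["pl"])
child = UniversalWord("child", "icl>person", "00", ["pl"])

print(relation("obj", eat, mango))
print(relation("agt", eat, child))
```

Keeping the attributes as a list makes it easy to see which morphological analysis contributed which marker, for instance tense giving @past and the negative auxiliary giving @not.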
16
Plans
– Adjective morphology
– Pronoun morphology
– Relation label handling for corpus sentences (simple sentences only)
17
THANK YOU