Linguist’s Assistant: An Analysis of the Required Modifications when Converting a Computational Tagalog Grammar into an Ayta Mag-Indi Grammar

Presentation transcript:

Linguist’s Assistant: An Analysis of the Required Modifications when Converting a Computational Tagalog Grammar into an Ayta Mag-Indi Grammar
The 13th National NLP Research Symposium, April 21st, 2017
Dr. Tod Allman, Graduate Institute of Applied Linguistics
todallman@yahoo.com

Presentation Overview
1) Introduction to Linguist’s Assistant (LA)
2) Results of the Ayta Mag-Indi Project
3) Surface Structure Similarities between Tagalog and Ayta Mag-Indi
4) Deep Structure Similarities between Tagalog and Ayta Mag-Indi
5) Tests to Determine the Quality of LA’s Texts

Linguist’s Assistant: A Multilingual Natural Language Generator
Generates initial draft translations that are easily understandable, grammatically perfect, semantically equivalent to the source documents, and at approximately a fifth-grade reading level.
Employs linguistic techniques rather than stochastic techniques.
The texts quadruple the productivity of experienced mother-tongue translators.

Model of Linguist’s Assistant
(Speaker note: show video after this slide.)

Five Components of Every NLG System
1) Semantic Representations
2) Ontology
3) Transfer Grammar
4) Synthesizing Grammar
5) Lexicon

LA’s Semantic Representational System
Semantically simple concepts in structurally simple sentences.

LA’s Feature System - Nouns
Number: Singular, Dual, Trial, Quadrial, Plural, Paucal
Participant Tracking: First Mention, Routine, Generic, Interrogative, Frame Inferable, …
Polarity: Affirmative, Negative
Proximity: Near Speaker and Listener, Near Speaker, Near Listener, Remote within Sight, Remote out of Sight
Person: First, Second, Third, First Inclusive, First Exclusive

LA’s Ontology
Concepts are Precisely Defined
break-A: someone breaks something (John broke the window.)
break-B: to break a bone (John broke his leg.)
break-C: to break or disobey a law (John broke the law.)
break-D: something breaks, intransitive (The window broke.)
break-E: to break a promise (John broke his promise.)

LA’s Transfer Grammar

LA’s Transfer Grammar
1) Insert Complex Concepts: ‘to sign,’ ‘foreigner,’ ‘blind,’ etc.
2) Collocation Correction: ‘ganda’ -> ‘buti’ (good book, person, food)
3) Theta Grid Adjustment Rules:
   X respects Y -> X lifts up Y’s name
   X loves Y -> X sits happily with Y
   X obeys Y -> X hears Y’s talk
4) Structural Adjustment Rules
Note on collocation correction: ‘ganda’ is the word usually used to translate the English word ‘good,’ but for a ‘good book,’ ‘good food,’ or a ‘good man,’ we use ‘buti.’
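
The collocation correction step can be pictured as a default translation plus noun-specific overrides. The Python sketch below is only an illustration: the dictionary layout and the function name `choose_stem` are hypothetical and say nothing about LA's own rule formalism. The stems and nouns are the ones given on the slide.

```python
# Default stem for an English modifier (slide: 'good' is usually 'ganda').
DEFAULT_STEM = {"good": "ganda"}

# Overrides keyed by (modifier, head noun); nouns taken from the slide and its note.
COLLOCATION_OVERRIDES = {
    ("good", "book"): "buti",
    ("good", "food"): "buti",
    ("good", "man"): "buti",
    ("good", "person"): "buti",
}


def choose_stem(modifier: str, head_noun: str) -> str:
    """Return the override for this collocation, else the default stem."""
    return COLLOCATION_OVERRIDES.get((modifier, head_noun), DEFAULT_STEM[modifier])


print(choose_stem("good", "book"))   # override applies
print(choose_stem("good", "house"))  # 'house' is a made-up test noun; falls back to the default
```

Keeping the overrides in a table separate from the default mirrors the idea that a collocation rule only fires for specific head nouns.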

LA’s Synthesizing Grammar

LA’s Synthesizing Grammar
1) Spellout Rules: insert case markers (ang/si, ng/ni, sa/kay)
2) Phrase Structure Rules: put constituents in their proper order
3) Pronoun Rules: identify where pronouns can be used
4) Morphophonemic Rules: change the relativizer ‘na’ to ‘-ng’
   “The man that saw John …” -> “lalaking nakakita kay Juan …”
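
As a rough illustration of the morphophonemic rule in item 4, the Python sketch below attaches the relativizer to the preceding word. The slide gives only the single example ‘lalaki’ + ‘na’ -> ‘lalaking’; the conditioning used here (‘-ng’ after a vowel, ‘-g’ after a final ‘n’, free-standing ‘na’ otherwise) is an assumption based on the usual description of the Tagalog linker, and the function name is hypothetical.

```python
VOWELS = "aeiou"


def link_relative_clause(head: str, clause: str) -> str:
    """Join a head noun and a relative clause with the Tagalog relativizer."""
    last = head[-1].lower()
    if last in VOWELS:
        # Vowel-final host: 'na' surfaces as the suffix '-ng' (the slide's example).
        return f"{head}ng {clause}"
    if last == "n":
        # Assumption: an n-final host takes '-g' rather than a separate 'na'.
        return f"{head}g {clause}"
    # Other consonant-final hosts keep the free-standing 'na'.
    return f"{head} na {clause}"


print(link_relative_clause("lalaki", "nakakita kay Juan"))
```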

Ayta Mag-Indi
Approximately 5,000 speakers
Spoken near Pampanga
Language status: stable and developing
Lexical similarity with Filipino: 38%
Lexical similarity with Kapampangan: 51%
Source: Ethnologue.com

Number of Meetings
Columns: Tagalog, Ayta Mag-Indi
Story #1 (2 pages) 38 6 50 7
Note: For Ayta Mag-Indi I didn’t have to change even one rule in the transfer grammar. The only rules I changed were in the synthesizing grammar, particularly the spellout rules and the morphophonemic rules. I’ll discuss these later.

Transfer Grammar Rules (Edited / New)
Complex Concept Insertion Rules
Feature Adjustment Rules
Styles of Direct Speech
Target Tense/Aspect/Mood Rules
Relative Clause Strategies
Collocation Correction Rules
Genitival Noun-Noun Relationships
Theta Grid Adjustment Rules
Structural Adjustment Rules

Example of a Transfer Rule

Synthesizing Grammar Rules (Edited / New)
Feature Copying Rules
Spellout Rules: 5
Clitic Rules
Movement Rules
Phrase Structure Rules: 1
Pronoun Identification Rules
Pronoun Spellout Rules
Morphophonemic Rules: 34
Find/Replace Rules

Example of a Synthesizing Rule

Tagalog / Ayta Mag-Indi Case Markers
             Common       Proper
Ergative     ng / un      ni / -n
Absolutive   ang / ya     si / si
Oblique      sa / sa      kay / kan
The Tagalog stem is ‘suntok’ and the Ayta stem is ‘dugê’.
English: John hit Bill.
Tagalog: Sinuntok ni Juan si Bill.
Ayta Mag-Indi: Dinugun Juan si Bill.
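
The case-marker table lends itself to a simple lookup keyed by language, case, and noun class. The Python sketch below is a hypothetical illustration, not LA's spellout rule: the names `CASE_MARKERS`, `case_marker`, and `tagalog_clause` are invented here, and only the Tagalog example sentence is assembled, because the Ayta Mag-Indi ergative ‘-n’ attaches to the preceding word and the slide does not spell out that process.

```python
# Markers copied from the table; keys are (case, noun class).
CASE_MARKERS = {
    "tagalog": {
        ("ergative", "common"): "ng", ("ergative", "proper"): "ni",
        ("absolutive", "common"): "ang", ("absolutive", "proper"): "si",
        ("oblique", "common"): "sa", ("oblique", "proper"): "kay",
    },
    "ayta_mag_indi": {
        ("ergative", "common"): "un", ("ergative", "proper"): "-n",
        ("absolutive", "common"): "ya", ("absolutive", "proper"): "si",
        ("oblique", "common"): "sa", ("oblique", "proper"): "kan",
    },
}


def case_marker(language: str, case: str, noun_class: str) -> str:
    """Look up the marker for a language, case, and noun class."""
    return CASE_MARKERS[language][(case, noun_class)]


def tagalog_clause(verb: str, agent: str, patient: str) -> str:
    """Assemble verb + ergative agent + absolutive patient (both proper names)."""
    erg = case_marker("tagalog", "ergative", "proper")
    absolutive = case_marker("tagalog", "absolutive", "proper")
    return f"{verb} {erg} {agent} {absolutive} {patient}."


print(tagalog_clause("Sinuntok", "Juan", "Bill"))
```

Because the two languages share the same table shape, switching languages in such a design means swapping the inner dictionary rather than rewriting the lookup.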

LA Rule that inserts Tagalog Case Markers

LA Rule that inserts Ayta Mag-Indi Case Markers

Tagalog / Ayta Mag-Indi Personal Pronouns
            Absolutive       Ergative        Oblique
1st Sg.     ako / aku        ko / ku         akin / kangku
1st Incl.   tayo / kitamu    natin / tamu    atin / kantamu
1st Excl.   kami / kay       namin / yan     amin / kanyan
2nd Sg.     ka / ka          mo / mu         iyo / kamu
2nd Pl.     kayo / kaw       ninyo / yu      inyo / kamuyu
3rd Sg.     siya / ya        niya / na       kanya / kana
3rd Pl.     sila / sila      nila / la       kanila / kalla

Tagalog / Ayta Mag-Indi Possessive Pronouns
1st Sg.     aking / ku
1st Incl.   ating / tamu
1st Excl.   aming / yan
2nd Sg.     iyong / mu
2nd Pl.     inyong / yu
3rd Sg.     kanyang / na
3rd Pl.     kanilang / la

Tagalog Pronoun Rule

Tagalog Possessive Pronouns
“I saw John’s book.” “Nakita ko ang aklat ni Juan.”
“I saw his book.” “Nakita ko ang kanyang aklat.”
If we put the possessive pronoun after the noun, it becomes “Nakita ko ang aklat niya.”
“I saw my book.” “Nakita ko ang aking aklat.” or “Nakita ko ang aklat ko.”
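
The two placements of a Tagalog possessive pronoun can be sketched as a choice between two pronoun sets. In the Python below, the prenominal forms come from the possessive pronoun table and the postnominal forms from the ergative column of the personal pronoun table. The slide confirms the postnominal pattern only for ‘niya’ and ‘ko’; extending it to the other persons is an assumption based on standard Tagalog, and the function name is hypothetical.

```python
# Prenominal possessives (possessive pronoun table).
PRENOMINAL = {
    "1sg": "aking", "1incl": "ating", "1excl": "aming",
    "2sg": "iyong", "2pl": "inyong", "3sg": "kanyang", "3pl": "kanilang",
}

# Postnominal possessives (assumed to be the ergative pronoun set).
POSTNOMINAL = {
    "1sg": "ko", "1incl": "natin", "1excl": "namin",
    "2sg": "mo", "2pl": "ninyo", "3sg": "niya", "3pl": "nila",
}


def possessed_noun(noun: str, person: str, prenominal: bool = True) -> str:
    """Return the noun with its possessive pronoun before or after it."""
    if prenominal:
        return f"{PRENOMINAL[person]} {noun}"
    return f"{noun} {POSTNOMINAL[person]}"


print("Nakita ko ang " + possessed_noun("aklat", "3sg") + ".")
print("Nakita ko ang " + possessed_noun("aklat", "3sg", prenominal=False) + ".")
```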

Tagalog Possessive Pronouns

Ayta Mag-Indi Possessive Pronouns
“I saw John’s book.” “Nakit kuy libron Juan.”
“I saw his book.” “Nakit kuy libron na.”

Ayta Mag-Indi Possessive Pronouns

Ayta Mag-Indi Morphophonemic Rule

Tagalog and Ayta Mag-Indi Particles (Tagalog / Ayta Mag-Indi)
Relativizer: na / ya
Complementizer
Possessive Marker: ni (mata ni Juan) / -n (mata-n Juan)
Adjectivizer: ma- (ma-linis)
Adverbializer: (ma-buti) / (ma-ngêd)
Verb Phrase Ligature: -ng (Pumarito ka-ng mabilis …) / (Maku ka-n tambêng …)
Glosses: ‘mata ni Juan’ John’s eyes; ‘malinis’ clean; ‘mabuti’ thoroughly; ‘pumarito kang mabilis’ come quickly …

Similarities at Surface Structure
English: Title: Melissa’s Eyes are Sore
Tagalog: Pamagat: Makirot ang mga mata ni Melissa.
Ayta Mag-Indi: Pamagat: Makirot ya mani matan Melissa.

Similarities at Surface Structure
English: But Melissa was not happy because her eyes were very sore.
Tagalog: Ngunit hindi masaya si Melissa dahil napakakirot nang kanyang mga mata.
Ayta Mag-Indi: Nuwa asê masaya si Melissa gawan napakakirot un mani mata na.
Note: Tagalog can also put a possessive pronoun after the noun, but it takes a different form. This sentence would become “… nang mga mata niya.”

Similarities at Deep Structure
Deletion of Verb Phrase Ligature
English: Melissa shouted, “Alex, come into my house.”
Tagalog: Sumigaw si Melissa, “Alex, pumarito ka sa loob ng aking bahay.”
Ayta Mag-Indi: Nan-angaw si Melissa, “Alex, maku ka sa lalên bali ku.”
Note: Both languages delete their verbal connectors ‘-ng/-n’ if the next word is a preposition that begins with ‘sa’.
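
The shared deletion rule can be sketched as one function that takes the language's ligature as a parameter. This Python is a hypothetical illustration, not LA's rule: the function name is invented, and the test for "a preposition that begins with ‘sa’" is reduced to a plain prefix check that does not verify part of speech.

```python
def attach_ligature(host: str, next_word: str, ligature: str) -> str:
    """Attach the verb phrase ligature to host unless the next word begins with 'sa'."""
    if next_word.lower().startswith("sa"):
        # Simplification: any word starting with 'sa' blocks the ligature.
        return f"{host} {next_word}"
    return f"{host}{ligature} {next_word}"


# Tagalog uses '-ng', Ayta Mag-Indi uses '-n' (examples from the slides).
print(attach_ligature("ka", "mabilis", "ng"))
print(attach_ligature("ka", "sa", "ng"))
print(attach_ligature("ka", "tambêng", "n"))
print(attach_ligature("ka", "sa", "n"))
```

Passing the ligature in as an argument reflects the point of the slide: the condition is identical in both languages, and only the surface form of the connector differs.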

Similarities at Deep Structure
Pronominal Length
English: He gave a book to you.
Tagalog: Binigyan ka niya ng libro.
Gloss: gave you he Ergative book
Ayta Mag-Indi: Binyan na ka-n libru.
Gloss: gave he you-Ergative book
Note: In both languages, single-syllable pronouns precede multi-syllable pronouns. In this Ayta Mag-Indi example, both pronouns are single-syllable, so the subject pronoun precedes the indirect object pronoun.
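
The ordering principle in the note can be sketched as a stable sort on "is this pronoun longer than one syllable". The Python below is a hypothetical illustration, not LA's rule; it assumes that counting vowel letters is an adequate syllable count for these pronoun forms, and the function names are invented here.

```python
VOWELS = set("aeiouê")


def syllable_count(word: str) -> int:
    """Rough syllable count: one syllable per vowel letter."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def order_pronouns(subject: str, other: str) -> list[str]:
    """Place single-syllable pronouns first; on a tie the subject stays first."""
    # sorted() is stable, so equal keys keep their input order (subject, other).
    return sorted([subject, other], key=lambda p: syllable_count(p) > 1)


print(order_pronouns("niya", "ka"))  # Tagalog: subject 'niya', indirect object 'ka'
print(order_pronouns("na", "ka"))    # Ayta Mag-Indi: subject 'na', indirect object 'ka'
```

The same function covers both examples on the slide: Tagalog ‘niya’ has two syllables and yields to ‘ka’, while Ayta Mag-Indi ‘na’ and ‘ka’ tie, so the subject keeps its place.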

Malayo-Polynesian Language Family Tree
Note: The next language I might work in is Rinconada, which is very closely related to Bikol, so it should be much more similar to Tagalog than Ayta Mag-Indi is.

Things LA Cannot Do
gising “to wake up”: gumising - nagising
takas “to escape”: tumakas - nakatakas
Note: ‘gumising’ means to wake up because you’ve had enough sleep, but ‘nagising’ means to wake up because of some disturbance such as a loud Jeepney, a dream, an earthquake, etc. ‘tumakas’ means a premeditated escape, but ‘nakatakas’ means a spur-of-the-moment escape. Our semantic representations don’t include this kind of information, so the software can’t choose between these forms, and we pick whichever one we think is the most common.

Experiments for Testing the Content and Quality of the Texts
Backtranslation Experiments
Comprehension Questions
Productivity Experiments
Quality Experiments

Quality Experiments for Jula
Manual Equal 12 11 17

Quality Experiments for Korean
LA: 88
Manual: 71
Equal: 33

Quality Experiments for Tagalog
LA: 53
Manual: 60
Equal: 56
24 control questionnaires – 2 outliers
