Estonian Word Sketches: the Case of Multi-Word Lexical Verbs. Maria Khokhlova (St. Petersburg State University), Jelena Kallas (Institute of the Estonian Language, Tallinn University)

Presentation transcript:

Estonian Word Sketches: the Case of Multi-Word Lexical Verbs
Maria Khokhlova (St. Petersburg State University)
Jelena Kallas (Institute of the Estonian Language, Tallinn University)

Estonian Reference Corpus
• input: ca. 250 mln
• 10 mln sample tagged for sentences, clauses, and morphology (POS tag and inflections), e.g. majas /S/maja/s/sg_in (‘in the house’)
• morphological ambiguity
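
The annotation example above can be split into its parts with a few lines of Python. This is only a sketch based on the single example on the slide: the field names (surface form, POS tag, lemma, ending, features) and the helper `parse_tagged_token` are assumptions made for illustration, not the corpus's documented format.

```python
from typing import NamedTuple


class TaggedToken(NamedTuple):
    word: str      # surface form, e.g. "majas"
    tag: str       # POS tag, e.g. "S"
    lemma: str     # assumed: lemma, e.g. "maja"
    ending: str    # assumed: inflectional ending, e.g. "s"
    features: str  # assumed: morphological features, e.g. "sg_in"


def parse_tagged_token(annotated: str) -> TaggedToken:
    """Split 'word /TAG/lemma/ending/features' into named fields."""
    word, _, rest = annotated.partition("/")
    fields = rest.split("/")
    if len(fields) != 4:
        raise ValueError(f"expected 4 slash-separated fields, got {len(fields)}")
    return TaggedToken(word.strip(), *fields)


token = parse_tagged_token("majas /S/maja/s/sg_in")
print(token)
```

A morphologically ambiguous word form would presumably carry several such readings, which is why disambiguation matters for the relations defined later.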

Grammatical relations 1
• Theoretical background – valency theory
• Syntactic and collocational properties of nouns, adjectives, adverbs and verbs

Grammatical relations 2
• for noun → case, adposition and infinitive government; noun+noun, adjective+noun, noun (subject)+verb and noun (object)+verb collocations
• for adjective → case, adposition, infinitive government and adverb+adjective collocations
• for verb → object, case, adposition, infinitive government and verb+adverb collocations
• for adverb → case government and adverb+adverb collocations

Word Sketch Grammar 1
Constraint Grammar Formalism
“The task of the constraints is basically to discard as many alternatives as possible, the optimum being a fully disambiguated sentence with one syntactic reading only.”
Karlsson, Fred. Constraint Grammar as a framework for parsing running text. Helsinki,

Word Sketch Grammar 2
Formal Grammar of Estonian. Tartu, 2002
• 1,240 morphological disambiguation rules
• 47 clause boundary detection rules
• 180 morphosyntactic mapping rules
• 1,118 syntactic constraints

Grammatical relations 3 (38 grammatical relations)
• relations that correspond to POS tags and morphological inflections (subject, object, adverbials, modifiers)
• oblique objects of noun prepositional phrases (*TRINARY =noun_pp_%s) (Eng. fear of/for something)
• oblique objects and adverbials of particle verbs (*TRINARY =pp_%s_ühendverb) (Eng. to put something on)
• oblique objects of prepositional verbs (*TRINARY =verb_pp_%s) (Eng. to struggle for something)

Grammatical relations 4
• constructions with the conjunctions ja/või ‘and/or’, kui/nagu ‘as’
• predicative (complements of the copula-like verb olema ‘be’)
• various combinations of finite verbs with non-finite verbs


Multi-Word Lexical Verbs 1
In English:
• phrasal verbs (verb+adverbial particle), e.g. carry out, find out, pick up
• prepositional verbs (verb+preposition), e.g. listen to, talk about
• phrasal-prepositional verbs (verb+adverbial particle+preposition), e.g. get away with
Longman student grammar of spoken and written English. Longman,

Multi-Word Lexical Verbs 2
In Estonian:
• particle verb (verb+affixal adverb), e.g. alla kukkuma ‘to fall down’, läbi lugema ‘to read through’, välja naerma ‘to ridicule’
• expression verb (verb+noun phrase), e.g. aru saama ‘to understand’
Estonian Language. Edited by Mati Erelt. Tallinn,

Treatment of Phrasal Verbs 3 (lists of affixal adverbs)
*TRINARY
=pp_%s_ühendverb
[tag!="V"] 1:[tag="V"&word!="ei"&features!="maks"&features!="mas"&features!="mast"&features!="mata"&features!="tud"&lemma!="ole.*"] 2:[tag="S"] 3:[tag="D"&(word="alla"|word="alt"|word="edasi"|word="eemale"|word="esile"|word="ette"|word="juurde"|word="järele"|word="kaasa"|word="kinni"|word="kokku"|word="kõrvale"|word="külge"|word="lahku"|word="lahti"|word="ligi"|word="läbi"|word="maha"|word="mööda"|word="otsa"|word="peale"|word="pealt"|word="ringi"|word="sisse"|word="taga"|word="tagant"|word="tagasi"|word="täis"|word="vahele"|word="vastu"|word="välja"|word="ära"|word="üle"|word="üles"|word="üleval"|word="ümber")]
*DUAL
=verb_väljendverb/adv_väljendverb
2:[tag="V"] 1:[tag="X"] [tag!="V"]
[tag!="V"] 1:[tag="X"] 2:[tag="V"]
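
To make the *TRINARY rule above easier to follow, here is a rough Python sketch of what it appears to match: a non-verb token, then a verb (excluding ei, olema forms and certain non-finite forms), a noun, and one of the listed affixal adverbs. The `Tok` type, the function names, the example sentence, its lemmas, the pronoun tag "P" and the feature value "s" are all hypothetical; the equality checks on `word` and `features` approximate the whole-value matching of the query language. It is not how Sketch Engine itself evaluates the grammar.

```python
import re
from typing import NamedTuple


class Tok(NamedTuple):
    word: str
    tag: str
    lemma: str
    features: str


# Affixal adverbs listed in the rule on the slide.
PARTICLES = {
    "alla", "alt", "edasi", "eemale", "esile", "ette", "juurde", "järele",
    "kaasa", "kinni", "kokku", "kõrvale", "külge", "lahku", "lahti", "ligi",
    "läbi", "maha", "mööda", "otsa", "peale", "pealt", "ringi", "sisse",
    "taga", "tagant", "tagasi", "täis", "vahele", "vastu", "välja", "ära",
    "üle", "üles", "üleval", "ümber",
}
# Verb forms the rule excludes via features!="...".
EXCLUDED_FEATURES = {"maks", "mas", "mast", "mata", "tud"}


def is_head_verb(t: Tok) -> bool:
    """Position 1 of the rule: a verb that is not 'ei', not an excluded form, not ole.*"""
    return (
        t.tag == "V"
        and t.word != "ei"
        and t.features not in EXCLUDED_FEATURES
        and re.fullmatch(r"ole.*", t.lemma) is None
    )


def particle_verb_triples(tokens):
    """Return (verb lemma, relation name, noun lemma) for each 4-token match."""
    hits = []
    for i in range(len(tokens) - 3):
        before, verb, noun, adv = tokens[i:i + 4]
        if (
            before.tag != "V"
            and is_head_verb(verb)
            and noun.tag == "S"
            and adv.tag == "D"
            and adv.word in PARTICLES
        ):
            # %s in the relation name is filled with the third labelled token.
            hits.append((verb.lemma, "pp_%s_ühendverb" % adv.word, noun.lemma))
    return hits


# Hypothetical sentence: 'Ta luges raamatu läbi' ('He/she read the book through').
sentence = [
    Tok("Ta", "P", "tema", ""),
    Tok("luges", "V", "lugema", "s"),
    Tok("raamatu", "S", "raamat", "sg_g"),
    Tok("läbi", "D", "läbi", ""),
]
hits = particle_verb_triples(sentence)
print(hits)
```

The sketch only covers adjacent tokens, as the rule does, so a particle separated from its noun by other words would not be found.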


Treatment of Phrasal Verbs 4 (lists of pre- and postpositions)
*TRINARY
=verb_pp_%s
1:[tag="V"] 2:[tag="S"] 3:[tag="K"&word!="allapoole"&word!="altpoolt"&word!="eespool"&word!="enne"&word!="hoolimata"&word!="ilma"&word!="keset"&word!="kesk"&word!="koos"&word!="kuni"&word!="piki"&word!="põiki"&word!="päri"&word!="risti"&word!="sealpool"&word!="sealtpoolt"&word!="seespool"&word!="siiapoole"&word!="siinpool"&word!="siitpoolt"&word!="sinnapoole"&word!="sissepoole"&word!="teispool"&word!="teispoole"&word!="tänu"&word!="väljapoole"&word!="väljaspool"&word!="väljaspoolt"&word!="ülalpool"&word!="ülaltpoolt"&word!="ülespoole"&word!="ülevalpool"&word!="ülevaltpoolt"&word!="läbi"&word!="mööda"&word!="tükkis"&word!="ühes"&word!="üle"] [tag!="V"&tag!="X"&word!="alla"&word!="alt"&word!="edasi"&word!="eemale"&word!="esile"&word!="ette"&word!="juurde"&word!="järele"&word!="kaasa"&word!="kinni"&word!="kokku"&word!="kõrvale"&word!="külge"&word!="lahku"&word!="lahti"&word!="ligi"&word!="läbi"&word!="maha"&word!="mööda"&word!="otsa"&word!="peale"&word!="pealt"&word!="ringi"&word!="sisse"&word!="taga"&word!="tagant"&word!="tagasi"&word!="täis"&word!="vahele"&word!="vastu"&word!="välja"&word!="ära"&word!="üle"&word!="üles"&word!="üleval"&word!="ümber"]
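
The negative conditions in this rule can be read as two filters: the adposition must not be on the exclusion list, and the token that follows must not be a verb, an X-tagged word, or an affixal adverb. The Python sketch below illustrates that reading with shortened subsets of the slide's lists; the `Tok` type, the function name, the example sentences and the tags "P" and "Z" are assumptions made for illustration.

```python
from typing import NamedTuple


class Tok(NamedTuple):
    word: str
    tag: str
    lemma: str


# Shortened subsets of the lists on the slide, for illustration only.
EXCLUDED_ADPOSITIONS = {"enne", "ilma", "koos", "kuni", "tänu", "läbi", "mööda", "üle"}
AFFIXAL_ADVERBS = {"alla", "kinni", "läbi", "välja", "ära", "üle"}


def prepositional_verb_triples(tokens):
    """Return (verb lemma, relation name, noun lemma) for verb + noun + adposition."""
    hits = []
    for i in range(len(tokens) - 3):
        verb, noun, adp, following = tokens[i:i + 4]
        adposition_ok = adp.tag == "K" and adp.word not in EXCLUDED_ADPOSITIONS
        # Guard token: blocks readings where a particle or another verb follows.
        guard_ok = (
            following.tag not in {"V", "X"}
            and following.word not in AFFIXAL_ADVERBS
        )
        if verb.tag == "V" and noun.tag == "S" and adposition_ok and guard_ok:
            hits.append((verb.lemma, "verb_pp_%s" % adp.word, noun.lemma))
    return hits


# Hypothetical sentence: 'Nad võitlevad vabaduse eest .' ('They struggle for freedom.')
matching = [
    Tok("Nad", "P", "nemad"),
    Tok("võitlevad", "V", "võitlema"),
    Tok("vabaduse", "S", "vabadus"),
    Tok("eest", "K", "eest"),
    Tok(".", "Z", "."),
]
# Same tokens, but followed by an affixal adverb, so the guard rejects the match.
blocked = matching[:4] + [Tok("ära", "D", "ära")]

print(prepositional_verb_triples(matching))
print(prepositional_verb_triples(blocked))
```

Because the rule requires a fourth token, a verb + noun + adposition sequence at the very end of the input is not matched in this sketch.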

Frequency lists
• adverbial particles of a particular verb
• nominal components of expression verbs, on condition that the nominal component has the POS tag X
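
A frequency list of particles per verb can be sketched with Python's `collections.Counter`. The (verb, particle) pairs below are invented toy data, not corpus results, and `particle_frequencies` is a hypothetical helper; in practice the pairs would come from matches of the particle-verb relation.

```python
from collections import Counter, defaultdict


def particle_frequencies(pairs):
    """Count how often each particle occurs with each verb lemma."""
    by_verb = defaultdict(Counter)
    for verb, particle in pairs:
        by_verb[verb][particle] += 1
    return by_verb


# Invented toy data for illustration only.
toy_pairs = [
    ("lugema", "läbi"),
    ("lugema", "ette"),
    ("lugema", "läbi"),
    ("kukkuma", "alla"),
]
freq = particle_frequencies(toy_pairs)
print(freq["lugema"].most_common())
```

The same counting would apply to the nominal components of expression verbs, restricted to tokens tagged X.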


Further Developments
Corpus:
• morphological disambiguation
• elimination of mistakes
• balance between genres
SkE (Sketch Engine):
• search within clauses
• word sketches for multi-word verbs
• Tickbox Lexicography template
• GDEX

Thank You!