Detection of Links between Words in the Task of Syntactic-Semantic Analysis of Russian Texts. Dmitry V. Merkuryev Saint-Petersburg State University, Russia.


Detection of Links between Words in the Task of Syntactic-Semantic Analysis of Russian Texts. Dmitry V. Merkuryev Saint-Petersburg State University, Russia Mathematics and Mechanics Faculty Department of Computer Science Petrozavodsk, May 21st, 2008

Content
1. Introduction. The task of Syntactic-Semantic Analysis of Russian Texts.
2. Syntactic and semantic analyzers.
3. Main principles of V.A. Tuzov’s theory.
4. Sentence analysis.
5. The detection of links between words.
6. Examples.
7. Conclusions.

1. Introduction. The task of Syntactic-Semantic Analysis of Russian Texts.
Natural Language Processing (NLP) is one of the most topical tasks of modern computer science. Professor V.A. Tuzov's functional model [1], [2] offers an adequate way to formalize natural language. The syntactic-semantic analyzer is the only working system based on this theory. It produces a syntactic structure of a Russian sentence that matches its semantic structure. The analyzer can resolve word sense ambiguity for most sentences of journalistic and even literary Russian texts. The detection of links between words is one of the most significant operations of the syntactic-semantic analyzer: it selects the right semantic alternative of a word in the context of the sentence.

2. Syntactic and semantic analyzers.
Some of the most relevant current NLP parsers:
- DictaScope (syntactic parser for Russian) [3]. The program automatically builds a word subordination tree and determines the grammatical values of the words in a sentence.
- AOT (automatic text processing, for Russian) [4]. The program builds a semantic graph and performs an initial semantic analysis of a text.
- Link Grammar Parser (syntactic parser for English) [5]. The system assigns to a sentence a syntactic structure that consists of a set of labeled links connecting pairs of words.
All of these parsers are limited by the word sense disambiguation problem. In this respect, Professor Tuzov’s syntactic-semantic analyzer is a unique system.

3. Main principles of V.A. Tuzov’s theory.
Thesis 1. A language is an algebraic system {f1, f2, ..., fn, M}, where each fi is a basic function and M is the data structure (the basic concepts) of the given language.
Thesis 2. Every word of the language is the name of a function. This function lets us evaluate the semantics of the given word. Each sentence is a superposition of these functions.
Thesis 3. Grammar is linked with the semantics of the language and is represented by the semantic dictionary.
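Thesis 2 can be illustrated with a minimal Python sketch in which each word is a function that returns a term and a sentence is the composition of those functions. The `Term` class, the word functions and the English labels (`HE`, `CITY`, `TRIP`, `INSIDE`) are hypothetical simplifications introduced here; they are not the analyzer's dictionary entries.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """A function name applied to zero or more argument terms."""
    name: str
    args: Tuple["Term", ...] = ()

    def __str__(self) -> str:
        if not self.args:
            return self.name
        return f"{self.name}({', '.join(str(a) for a in self.args)})"


# Hypothetical word functions: every word maps its arguments to a term.
def he() -> Term:
    return Term("HE")


def city() -> Term:
    return Term("CITY")


def into(place: Term) -> Term:
    # Preposition: direction towards the inside of a place (simplified).
    return Term("Direkt", (Term("INSIDE", (place,)),))


def goes(subject: Term, destination: Term) -> Term:
    # Verb: the subject performs a trip to the destination (simplified).
    return Term("Oper01", (subject, Term("TRIP", (destination,))))


# The sentence is a superposition (nested application) of word functions.
sentence = goes(he(), into(city()))
print(sentence)
```

The nesting order of the calls mirrors the syntactic structure, which is the sense in which the syntactic structure of a sentence matches its semantic one.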

A function that corresponds to a word has semantic arguments and semantic-grammar types. Both consist of semantic classes and prepositional-case forms.
Examples:
$16~!Вин ($16~!“Accusative”)
$15~!Где ($15~!“Where”)
Notation:
- $ marks a semantic class
- !Вин (“Accusative”), !наВин (“on” + Accusative), !Дат (“Dative”), etc. are prepositional-case forms
- !Куда (“Where to”), !Где (“Where”), !Кому (“To whom”), etc. are generalized grammar types
Semantic-grammar types define the links in which the word attaches to other words as their argument. Semantic arguments define the links in which the word takes other words as its arguments (matched by their semantic-grammar types).
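The notation above can be split mechanically into its class part and its form part. The sketch below is an assumption based only on the strings shown in this presentation: it treats `~` as the separator between classes and forms, and `\` as a separator between alternatives. `SemGramType` and `parse_type` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SemGramType:
    classes: Tuple[str, ...]  # e.g. ("$16",) or ("НЕЧТО$1",)
    forms: Tuple[str, ...]    # e.g. ("Вин",) or ("Тв", "наПред")


def parse_type(spec: str) -> SemGramType:
    """Split 'CLASS$code\\CLASS$code~!Form\\!Form' into classes and forms."""
    left, sep, right = spec.partition("~")
    if not sep and left.startswith("!"):
        # No "~" and a leading "!": the spec lists forms only, e.g. "!заВин".
        left, right = "", left
    classes = tuple(c.strip() for c in left.split("\\") if c.strip())
    forms = tuple(f.strip().lstrip("!") for f in right.split("\\") if f.strip())
    return SemGramType(classes, forms)


print(parse_type("$16~!Вин"))
print(parse_type(r"ТРАНСПОРТ$121324~!Тв\!наПред"))
```

Keeping classes and forms as separate tuples makes it easy to test each condition independently when links are matched.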

Example (results from the analyzer): Он едет в город (“He is going into the city”).
[Figure: syntax tree of the sentence]
Semantic values of each word and links between them:
Он (“He”) ** ОН {Мест._Муж $17()
semantics: ОН () \\
links:

едет (“is going”) ** \!ОНА$17\!ОНО$17,Z2:ПРИЧИНА$1/37/05\ПРИКАЗ$ ~!Почему,Z3:НЕЧТО$1~!поДат,Z4:НЕЧТО$1~!Откуда,Z5: НЕЧТО$1~!Куда,Z6: ТРАНСПОРТ$121324~!Тв\!наПред)
semantics: ЕХАТЬ Oper01(Z1,ПОЕЗДКА$15402(ПОЧЕМУ:Z2,ПОДАТ:Z3,ОТКУДА:Z4,КУДА:Z5,ТВ:НАПРЕД:Z6)) \\
links: => Z5: =>
в (“into”) ** В {Предлог. ПОСЕЛЕНИЕ$123~!Вин)
semantics: В Y1>Direkt(Y1:,ВНУТРИ$12/313/05(ВИН:Z1)) \\
links: Z1: => Z5:

город (“the city”) ** ГОРОД {Сущв._Муж_Неодуш $12314(Z1 :СТРАНА$1231~!Род)
semantics: ГОРОД (РОД:Z1) \\
links: Z1:

Classifier of basic concepts.
A basic concept is a word whose meaning can’t be expressed through simpler concepts. There are more than basic concepts (nouns and adjectives) in the semantic dictionary. Other more than words (derived words) are expressed using superposition of basic concepts and basic functions. Basic concepts are organized in a hierarchical tree (the classifier).
Main rules:
- All words of a class inherit the same semantic properties from the parent class.
- The words of a class also have their own specific characteristics.
- The name of the root class is НЕЧТО (“SOMETHING”).
- There are more than 1500 classes.
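The inheritance rule can be sketched as a prefix test on class codes. This rests on an assumption read off the codes that appear in this presentation ($1, $12, $122, $12211): a class code seems to extend the code of its parent class. It is not a documented property of the classifier, and the helper names `is_a` and `ancestors` are hypothetical.

```python
# A few class codes and representative words taken from the presentation.
CLASSES = {
    "$1": "НЕЧТО",      # root: "SOMETHING"
    "$12": "МАТЕРИЯ",   # physical object
    "$122": "ПРИРОДА",  # nature
    "$12211": "ДЕРЕВО", # trees
}


def is_a(code: str, ancestor: str) -> bool:
    """Assumed rule: a class belongs to `ancestor` if its code extends it."""
    return code.startswith(ancestor)


def ancestors(code: str) -> list:
    """Known classes from which `code` would inherit semantic properties."""
    return [c for c in CLASSES if c != code and is_a(code, c)]


print(ancestors("$12211"))
print(is_a("$12314", "$123"))  # ГОРОД ($12314) against ПОСЕЛЕНИЕ ($123)
```

Under this assumption, checking whether a word satisfies the class requirement of an argument reduces to a single string comparison.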

Examples:
$1 Noun: НЕЧТО (“SOMETHING”), СУЩЕСТВИТЕЛЬНОЕ (“NOUN”), …
$110 Noun AO (Abstract Object) Idea: ПОНЯТИЕ (“CONCEPT”), …
$1100/01 Noun AO Idea => Abstract-Concrete: АБСТРАКТНЫЙ (“ABSTRACT”), КОНКРЕТНЫЙ (“CONCRETE”), …
$12 Noun PO (Physical Object): МАТЕРИЯ (“SUBSTANCE”), ПРОСТРАНСТВО (“SPACE”), ТЕЛО (“BODY”), …
$122 Noun PO Nature: ПРИРОДА (“NATURE”), …
$122/1 Noun PO Nature Weather: ПОГОДА (“WEATHER”), …
$12211 Noun PO Nature Plants Trees: ДЕРЕВО (“TREE”), ДУБ (“OAK”), СОСНА (“PINE”), …

Basic functions.
Basic functions describe relationships between their arguments. The formal meaning of each derived word can be expressed as a superposition of basic concepts and basic functions.

Examples:
And(x,y): x and y
Caus(x,y): x causes y
Cont(x): x continues
Content(x,y): x contains y
Control(x,y): x controls y
Func(x): x occurs
Hab(x,y): x has y
Incep(x): x starts
Lab(x,y): x exposes y
Loc(x,y): x is situated in y
Magn(x): x is above the norm
Mult(x): a multiset of x
Ne(x): negation of x
Oper(x,y): x performs y
Rel(x,y): x has a relation to y
etc.
ЛЕСНОЙ A1>Rel(A1:НЕЧТО$1,ЛЕС$122412) (“forest”, adjective: “something that has a relation to a forest”)
КОНСТРУИРОВАТЬ Caus(Z1,IncepFunc(КОНСТРУКЦИЯ$1/422(ВИН:Z2))) (“construct”, verb: “Z1 causes the appearance of a construction”)
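Formulas of this kind are nested function applications, so they can be read into a tree with a small recursive parser. The sketch below is a hypothetical helper, not part of the analyzer. It assumes only that names are delimited by parentheses and commas, and it does not handle prefixes such as `A1>`.

```python
from typing import List, Tuple

Tree = Tuple[str, list]  # (name, list of argument trees)


def parse_formula(text: str) -> Tree:
    """Parse e.g. 'Caus(Z1,IncepFunc(X(ВИН:Z2)))' into nested (name, args)."""
    pos = 0

    def parse_node() -> Tree:
        nonlocal pos
        start = pos
        # A name runs until the next structural character.
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        name = text[start:pos].strip()
        args: list = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                args.append(parse_node())
                if pos >= len(text) or text[pos] not in ",)":
                    raise ValueError("unbalanced or malformed formula")
                ch = text[pos]
                pos += 1  # consume "," or ")"
                if ch == ")":
                    break
        return (name, args)

    tree = parse_node()
    if text[pos:].strip():
        raise ValueError(f"unexpected trailing text: {text[pos:]!r}")
    return tree


def function_names(tree: Tree) -> List[str]:
    """Names of all nodes that take arguments, outermost first."""
    name, args = tree
    names = [name] if args else []
    for arg in args:
        names.extend(function_names(arg))
    return names


tree = parse_formula("Caus(Z1,IncepFunc(КОНСТРУКЦИЯ$1/422(ВИН:Z2)))")
print(function_names(tree))
```

Labels such as `ВИН:Z2` are kept as opaque leaf names here; splitting them into case label and variable would be a natural next step.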

Semantic dictionary.
It consists of more than Russian words. The dictionary can be divided into 2 main parts: syntactic and semantic.
Examples:
ПОЛУЧИТЬ (“get”, verb)
Syntactic: ПОЛУЧИТЬ N%~ПОЛУЧЕНИЕ$15310/0/04({Z1: НЕЧТО$1~!Им,Z2 : НЕЧТО$1~!Откуда\!Изо\!Ото\!сРод,Z3: !заВин,Z4: ПИЩА$101/0\НЕЧТО$1~!Вин})
Semantic: ПОЛУЧИТЬ N%~ПОЛУЧЕНИЕ $15310/0/04 (PerfCaus(Oper01(Uzor(Z1,ОТКУДА:Z2),Z3),Hab(Z1,РОД:Z4))) \\
НАГРАДА (“reward”, noun)
Syntactic: НАГРАДА $1241/131/03({Z1: !Дат\!Род,Z2: !Тв,Z3: !заВин,Z4: !наВин})
Semantic: НАГРАДА $1241/131/03(ДАТ:РОД:Z1,ТВ:Z2,ЗАВИН:Z3,НАВИН:Z4) \\

4. Sentence analysis.
The processing of natural language texts includes morphological, word-by-word and syntactic-semantic analysis. The syntactic-semantic analyzer solves 2 main problems:
- selecting the right semantic alternative of each word
- binding the selected alternatives into an integrated construction.
The system is implemented as a set of recursive functions. Each function handles a specific part of speech: verb, noun, preposition, adjective, etc.

5. The detection of links between words.
The detection of links is the main operation of the analyzer. It binds words or already assembled constructions. There are 2 main types of interaction between 2 constructions:
- the semantic arguments of the incorporating construction interact with the semantic-grammar types of the construction being attached (control link, e.g., verb and noun)
- the semantic-grammar types of one construction interact with the semantic-grammar types of another (agreement link, e.g., adjective and noun).

Examples of links:
- by case: pronoun and noun: его успех (“his success”)
- by semantic class, case, gender, number: adjective and noun: красивый лес (“beautiful forest”)
Other examples are given in item 6 of the presentation.
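An agreement link of the adjective-noun kind can be sketched as a comparison of grammatical features. This is a simplified illustration under stated assumptions: the feature names and values are hypothetical, and the semantic-class condition mentioned above is left out.

```python
from typing import Dict, Iterable


def agrees(a: Dict[str, str], b: Dict[str, str],
           features: Iterable[str] = ("case", "gender", "number")) -> bool:
    """True if both words carry every listed feature with the same value."""
    return all(a.get(f) is not None and a.get(f) == b.get(f) for f in features)


# красивый лес: both words are nominative, masculine, singular.
krasivyj = {"case": "nom", "gender": "masc", "number": "sg"}
les = {"case": "nom", "gender": "masc", "number": "sg"}
# леса (genitive singular of лес) no longer agrees with красивый.
lesa = {"case": "gen", "gender": "masc", "number": "sg"}

print(agrees(krasivyj, les))
print(agrees(krasivyj, lesa))
```

Passing a shorter feature tuple, for example `("case",)`, gives the weaker case-only check.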

After the first steps of text processing, the dictionary entries of two neighboring words have the following structure:
::= < { morphological information, semantic-grammar types } ( syntactic-semantic information, semantic arguments ) > >
The link detection procedure checks all arguments of all semantic alternatives of word1 against all arguments of all semantic alternatives of word2. This procedure can be optimized considerably by using more complex data structures (the optimization is the subject of current investigations).
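The exhaustive check described above can be sketched as nested loops over the alternatives of both words. This is a minimal illustration, not the analyzer's implementation: `Slot`, `Alternative`, `fits` and `detect_links` are hypothetical names, class matching by code prefix is an assumption, and the entries are modelled loosely on the в / город example, with a second, nominative reading of город added for contrast.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Slot:
    """A semantic argument: required class code and admissible forms."""
    name: str
    class_code: str
    forms: Tuple[str, ...]


@dataclass(frozen=True)
class Alternative:
    """One semantic alternative of a word with its semantic-grammar type."""
    label: str
    class_code: str
    form: str
    slots: Tuple[Slot, ...] = ()


def fits(slot: Slot, alt: Alternative) -> bool:
    # Assumption: a class matches if its code extends the required code.
    return alt.class_code.startswith(slot.class_code) and alt.form in slot.forms


def detect_links(word1: List[Alternative],
                 word2: List[Alternative]) -> List[Tuple[str, str, str]]:
    """Try every alternative pair in both directions; return (head, slot, dependent)."""
    links = []
    for alt1 in word1:
        for alt2 in word2:
            for head, dep in ((alt1, alt2), (alt2, alt1)):
                for slot in head.slots:
                    if fits(slot, dep):
                        links.append((head.label, slot.name, dep.label))
    return links


v = [Alternative("В", "", "", (Slot("Z1", "$123", ("Вин",)),))]
gorod = [
    Alternative("ГОРОД[Им]", "$12314", "Им", (Slot("Z1", "$1231", ("Род",)),)),
    Alternative("ГОРОД[Вин]", "$12314", "Вин", (Slot("Z1", "$1231", ("Род",)),)),
]
print(detect_links(v, gorod))
```

Only the accusative reading of город fills the slot of в, so the link also selects the alternative. The cost grows with the product of alternatives and arguments, which is why indexing the slots by form or class is a plausible optimization.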

6. An example of an analyzed sentence.
Люди любят отдыхать на природе (“People like to rest in nature”).
[Figure: syntax tree of the sentence]
Semantic values of each word and links between them:
Люди (“People”) ** ЧЕЛОВЕК {Сущв._Муж_Одуш $1241(Z1: ВРЕМЯ$16\ЧЕЛОВЕК$1241\ПЛАНЕТА$12271~!Род)
semantics: ЧЕЛОВЕК (РОД:Z1) \\
links: Z2:

любят (“like”) ** ЛЮБИТЬ {Глагол. N%~ЛЮБОВЬ$1241/40113/05(Z1: !Инфин,Z2: ЖИВОЙ$124~!ОНИ$17,Z3: !заВин)
semantics: ЛЮБИТЬ Caus(ИНФИН:Z1,Oper02(ИМ:Z2,ПРИЯТНОСТЬ$1241/40012/03(ЗАВИН:Z3))) \\
links: => Z2: =>
отдыхать (“to rest”) ** ОТДЫХАТЬ {Глагол. N%~ОТДЫХ$15308(Imperf Z1 : #,Z2: !Ото,Z3: НЕЧТО$1~!Где)
semantics: ОТДЫХАТЬ Oper01(Z1,ОТДЫХ$15308(ОТО:Z2,ГДЕ:Z3)) \\
links: Z3: =>

на (“in”) ** НА {Предлог. Z1:ПРИРОДА$122 \ГРАНИЦА$12/15/16\РАССТОЯНИЕ$12/32\ПЛОЩАДЬ$12316~!Пред)
semantics: НА Y1>Loc(Y1:,ПРЕД:Z1) \\
links: Z1: => Z3:
природе (“nature”) ** ПРИРОДА {Сущв._Жен_Неодуш $122(Z1 : !Род)
semantics: ПРИРОДА (РОД:Z1) \\
links: Z1:

7. Conclusions.
The syntactic-semantic analyzer based on V.A. Tuzov’s theory is a unique system. The detection of links between words makes it possible to select the right semantic alternative of a word in a sentence. The accuracy of text processing exceeds 95%.

Bibliography, internet resources:
[1] Tuzov V.A. Mathematical Model of Language. Saint-Petersburg State University Publishing House, 1984, p. 176 (in Russian).
[2] Tuzov V.A. Computer Semantics of Russian Language. Saint-Petersburg State University Publishing House, 2004, p. 400 (in Russian).
[3]
[4]
[5]