Velina Slavova (Bulgaria) Vladimir Polyakov (Russia) THE METRICS OF COMPLEXITY BASED ON SYSTEM OF CASE RELATIONS IN TYPOLOGICAL STRUCTURE OF THE LANGUAGE.


Velina Slavova (Bulgaria) Vladimir Polyakov (Russia) THE METRICS OF COMPLEXITY BASED ON SYSTEM OF CASE RELATIONS IN TYPOLOGICAL STRUCTURE OF THE LANGUAGE (ON THE DATA OF DB «LANGUAGES OF THE WORLD») (*) * The research was supported by Russian Scientific Foundation of Humanities (grant № в)

Screenshots. Win Version

Source of Data for DB JM
- Encyclopedic series “Jazyki Mira” (Languages of the World): 18 volumes, published by the Institute of Linguistics of the Russian Academy of Sciences since 1993.
- Large Encyclopedic Dictionary “Linguistics” (edited by Yarceva V.N.): contains definitions of all terms used in the model of the DB.

List of some Encyclopedic Publications “Jazyki Mira” (Languages of the World)
- Languages of the world: Uralic. (1993).
- Languages of the world: Paleoasiatic languages. Moscow: Publ. “Indrik”. (1996).
- Languages of the world: Turkic. Moscow: Publ. “Indrik”. (1997).
- Languages of the world: Mongolic languages. Manchu-Tungus languages. Japanese. Korean. (Ed.: Kibrik A.A., Rogova N.B., Romanova O.I.). Moscow: Publ. “Indrik”. (1997).
- Languages of the world: Iranian languages. I. South-Western Iranian languages. Moscow: Publ. “Indrik”. (1997).
- Languages of the world: Iranian languages. II. North-Western Iranian languages. Moscow: Publ. “Indrik”. (1999). 302 p.
- Languages of the world: Dardic and Nuristani languages. Moscow: Publ. “Indrik”. (1998).
- Languages of the world: Iranian languages. III. East Iranian languages. Moscow: Publ. “Indrik”. (1999).
- Languages of the world: Germanic languages. Celtic languages. Moscow: Publ. “Academia”. (1999).
- Languages of the world: Caucasian languages. RAS. Institute of Linguistics. Moscow: Publ. “Academia”. (2001).
- Languages of the world: Romance languages. Moscow: Publ. “Academia”. (2001).
- Languages of the world: Indo-Aryan languages of Ancient and Middle Period. Moscow: Publ. “Academia”. (2004).
- Languages of the world: Slavonic languages. RAS. Institute of Linguistics. (Ed.: A.M. Moldovan, S.S. Skorvid, A.A. Kibrik). Moscow: Publ. “Academia”. (2005).
- Languages of the world: Baltic languages. RAS. Institute of Linguistics. (Ed.: V.N. Toporov, M.V. Zavyalov, A.A. Kibrik). Moscow: Publ. “Academia”. (2006). 224 p.

Dictionary and source books (images: the dictionary; two of the 18 source books)

Characteristics of the Data Base “Languages of the World”
Content. The Data Base “Languages of the World” has the following quantitative characteristics:
- it contains more than 3800 features;
- it covers 315 Eurasian languages;
- it describes the following spheres of language: phonetics, morphology, syntax;
- data are represented in binary form.
The following language families and unities are represented in the Data Base “Languages of the World”: Austroasiatic, Austronesian, Altaic, Afroasiatic, Indo-European, Caucasian, Paleoasiatic, Sino-Tibetan, Uralic, Hurro-Urartian. The DB also describes the language isolates Ainu, Nivkh, Burushaski, Sumerian and Elamite.
A unique feature of the Data Base “Languages of the World” is its large collection of descriptions of extinct languages, comprising 54 essays. There are no analogues of such a detailed and systematic description of extinct languages.
The main principles underlying the model of language description are binarity, hierarchy and paradigmaticity.

Task Formulation
1. Different grammatical constructions are assumed to require different brain resources for processing.
2. A further assumption is that the total amount of brain resources needed to process utterances of approximately equal meaning must be constant.
3. Semantic cases (Fillmore’s cases) can serve as an example of a complex construction on which to verify these statements.
4. The DB “Jazyki Mira” contains semantic cases that form a rather wide paradigm.

Example
Consider an example of the accusative case:
“Суд обвинил Вас-ю в краже.”
“The court accused Basil of robbery.”
In Russian, case is marked by the form of the noun (Вас-ю) and by a preposition (в); in English, it is marked only by a preposition (of).

Method of Data Processing
Velina Slavova used the data of the DB “Jazyki Mira” to obtain a more convenient representation of the case paradigm. After a rather sophisticated reduction of the data, we obtained first results that show examples of correlation between different case systems.

Case description in DB. Scope of the research.
In DB JM, 405 grammar features are devoted to the case system (in the corresponding numbered part of the model). In this research only actant case meanings were investigated (140 grammar features). They were divided into six fragments:
--subject/object
--contrastive case formation of subject
--contrastive case formation of object
--method of expressing subject-object meanings
--other actant cases
--case of nominal predicate
At the first step only four fragments were investigated.

Examples of case description
Part of the “subject/object” paradigms in the DB:
--subject/object
---absolutive
---absolutive/relative
---dative
---narrative
---nominative/accusative
---nominative/accusative/genitive
---nominative/accusative-genitive
---nominative/accusative/indefinite accusative
---nominative/accusative/genitive/partitive
---nominative/accusative/privative/sociative
---nominative/accusative/locative
---nominative/accusative/partitive
---nominative/dative-accusative
---nominative/narrative
---nominative/partitive
---nominative/genitive
---nominative/genitive/partitive
---nominative/general indirect
---nominative/ergative
---nominative/ergative/genitive
Fragment of the description of the English language:
*LANGUAGE DENOMINATION.English
…
CASE MEANINGS
.actant case meanings
..subjective/objective
...general case/accusative
..contrastive case formation
...of object
....nouns and pronouns
..method of expressing subject-object meanings
...case affixes
...word order
...auxiliary words
....in preposition
.case of attributive relation
..prepositional construction
.case of possessive relation
..prepositional construction
..possessive affix at possessor's name
.case of locative relations
..method of expression
...prepositions
…
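The dotted notation in the English fragment can be read as a depth marker: each leading dot puts a feature one level deeper in the hierarchy. This is a minimal sketch under that assumption. The function name `parse_dotted` and the one-feature-per-line layout of the sample are hypothetical and are not the actual export format of the DB.

```python
def parse_dotted(lines):
    """Turn dot-prefixed feature lines into full hierarchical paths.

    Assumption: the number of leading dots equals the depth of the feature.
    """
    stack = []   # labels on the path from the root to the current feature
    paths = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        label = line.lstrip(".")
        depth = len(line) - len(label)
        del stack[depth:]          # go back up to the parent level
        stack.append(label)
        paths.append(tuple(stack))
    return paths


# Sample copied from the English description, one feature per line.
sample = """CASE MEANINGS
.actant case meanings
..subjective/objective
...general case/accusative
..contrastive case formation
...of object
....nouns and pronouns"""

paths = parse_dotted(sample.splitlines())
for path in paths:
    print(" > ".join(path))
```

Each path records which features are marked for the language, which fits the binary representation used in the DB.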

Metrics of complexity
A separate complexity metric was developed for each of the six parts.
Part of case description (complex characteristic) | Type of feature coding | Metric
--subject/object | Paradigm (only one choice) | Maximal number of cases marked in the language
--contrastive case formation of subject | Multi-choice | Number of features present in the language
--contrastive case formation of object | Multi-choice | Number of features present in the language
--method of expressing subject-object meanings | Multi-choice | Number of features present in the language
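The two kinds of metric can be sketched as follows. This is an illustration only. It assumes that the number of cases in a paradigm can be counted by splitting its label on "/", so that a hyphenated label such as "accusative-genitive" counts as one case, and that multi-choice parts are stored as feature-to-boolean maps. The function names and the sample record are hypothetical. The record is loosely based on the English fragment shown earlier.

```python
def paradigm_size(paradigm_label):
    """Single-choice part: number of cases named in the chosen paradigm."""
    return len([part for part in paradigm_label.split("/") if part.strip()])


def multi_choice_count(features):
    """Multi-choice part: number of features marked as present."""
    return sum(1 for present in features.values() if present)


# Hypothetical record for one language (not an actual DB export).
language = {
    "subject/object": "general case/accusative",
    "contrastive case formation of subject": {"nouns": False, "pronouns": False},
    "contrastive case formation of object": {"nouns and pronouns": True},
    "method of expressing subject-object meanings": {
        "case affixes": True,
        "word order": True,
        "auxiliary words": True,
    },
}

metrics = {
    part: paradigm_size(value) if isinstance(value, str) else multi_choice_count(value)
    for part, value in language.items()
}
print(metrics)
```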

Correlation Analysis
Three of the complex characteristics correlate well with one another (highlighted in yellow).
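A correlation between two complex characteristics can be computed as in the sketch below. The slides do not say which coefficient was used. Pearson's r is assumed here, and the scores are made-up toy values, not data from the DB.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / (norm_x * norm_y)


# Toy scores for five imaginary languages (illustrative only).
cases_in_paradigm = [2, 2, 6, 8, 3]
means_of_expression = [3, 4, 1, 1, 3]

r = pearson(cases_in_paradigm, means_of_expression)
print(round(r, 2))
```

A negative value on such data would mean that larger case paradigms go together with fewer means of expression.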

Factor Analysis
The analysis yields two groups of factors (#1: yellow, #2: blue).

Tree Analysis
The distances between languages under this “SO syntactic rules complexity” measure seem to keep languages from some genealogical groups close together. Nevertheless, the Indo-European languages are very dispersed, and old languages seem to stay apart.
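A tree of this kind can be built from distances between the languages' metric vectors. The slides do not state the distance or the clustering method. The sketch below assumes Euclidean distance and single-linkage agglomerative clustering, and the vectors for the imaginary languages "A", "B" and "C" are toy values.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two metric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def single_linkage(vectors):
    """Merge the closest clusters step by step; return the merge history."""
    clusters = [frozenset([name]) for name in vectors]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Single linkage: cluster distance is the closest pair of members.
            d = min(euclidean(vectors[a], vectors[b]) for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), d))
    return merges


# Toy vectors: (cases in paradigm, contrastive formation, means of expression).
toy = {"A": (2, 1, 3), "B": (2, 1, 2), "C": (6, 0, 1)}

merges = single_linkage(toy)
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 2))
```

Languages that merge early are close under the complexity measure. A language that joins only at the last step stays apart in the tree.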

ANALYSIS OF RESULTS
1. The hypothesis that the complexity of the grammatical structure of a language is preserved at a certain level was confirmed. The study showed that languages with a complex case paradigm have simpler grammatical means of expressing cases and fewer differences in the description of cases for subject/object. Languages with a simple case paradigm have more complex means of expressing case relations and more differences in the description of cases for subject/object. This dichotomy explains 76% of the variation in the content of the DB “Jazyki Mira”.
2. In general, this description of the case system (as two groups of factors) correlates well with the genealogical tree. The exception is the Indo-European language family, which may be due to the wide geographical spread of the Indo-European languages and, consequently, intensive borrowing during areal contacts. This hypothesis requires additional checking.

AS A CONCLUSION
This report is intended to show that the DB “Jazyki Mira” is an interesting resource for studying the complexity of different parts of the grammar of a language. So far we have gained only initial experience. The methods and approaches are still being established and developed. Work in this direction will continue.

Thank you for your attention