DATA BASE “LANGUAGES OF THE WORLD” DB JM SOFTWARE SURVEY: 2010 Vladimir Polyakov (Institute of Linguistics of RAS)


Software Products related to DB JM

Versions of the database (each includes comparison of two languages as a function):
- DOS version
- Windows version
- Web version

Quantitative and other research products:
- Similarity: software for similarity measure calculations
- LangFam: software for calculating language family portraits, revealing genetic markers, filtering rare features, investigating typological shift, etc.
- Special software for modeling of evolution
- Special software for the clustering task
- Special software for phylogeny with different metrics of feature space
- BiCoTree: software for easy tree building on the DB
- Some other research programs, developed for different aims during partial investigations in areal, historical and typological linguistics (Gusareva, Loginova, Fashutdinov, Omlin, Polyakov, Solovyev)

Reference and educational products (under construction):
- Living Diagrams: reference software that can integrate source data with quantitative diagrams
- EduDBLANG: educational version of the DB with the full spectrum of reference possibilities
- The Web version of “Living Diagrams” is prepared.

External tools applicable to JM data:
- R: statistical software tools
- Phylogeny tools: …

Kernel versions of DB JM

DOS version
- Programming language or environment / database engine and data format: Clipper / dBase-compatible, DBF
- Programmers, year of issue: Skokan †, 1997 (*)
- English interface and content: yes, but the content is not synchronized with the Russian version
- Main functions: correction of the model, adding new languages, browse, export, import, save, search, comparison
- Compatibility: with the Win version via essay export/import files

Win version
- Programming language or environment / database engine and data format: Pascal Delphi / Borland Database Engine, DBF
- Programmers, year of issue: Logunov, Polyakov, 2002 (*)
- English interface and content: yes, but the content is not synchronized with the Russian version
- Main functions: correction of the model, adding new languages, browse, navigation, export, import, save, simple and complex search, comparison, alphabetic and thematic indices
- Compatibility: with the DOS version via essay export/import files; with the Web version via direct conversion of database files

Web version
- Programming language or environment / database engine and data format: C# and .NET / MS SQL Server
- Programmers, year of issue: Goncharov (1st variant), 2005; Khanukaev (2nd variant), 2006 (**)
- Note: there is also a Linux version (at KSU)
- English interface and content: the content is done (Yaroslavtceva, Makarova); the interface is done (Khanukaev); the work is being finished
- Main functions: browse, tree navigation, comparison
- Compatibility: loads data from the Win version via direct conversion of database files

(*) Task formalization was done by Novikov †
(**) Task formalization was done by Polyakov

Source of Data for DB JM
- Encyclopedic edition “Jaziki Mira” (Languages of the World): 14 volumes, published by the Institute of Linguistics of the Russian Academy of Sciences from 1993 onward.
- Large Encyclopedic Dictionary. Linguistics (edited by V.N. Yarceva): includes an interpretation of all terms of the DB model.
The main work on describing languages in the DB format was carried out by Yelena Yaroslavceva, DSc.

List of Encyclopedic Publications “Jaziki Mira” (Languages of the World)
- Languages of the world: Uralic (1993).
- Languages of the world: Paleoasiatic languages. Moscow: Publ. “Indrik”. (1996).
- Languages of the world: Turkic. Moscow: Publ. “Indrik”. (1997).
- Languages of the world: Mongolic languages. Manchu-Tungus languages. Japanese. Korean. (Ed.: Kibrik A.A., Rogova N.B., Romanova O.I.). Moscow: Publ. “Indrik”. (1997).
- Languages of the world: Iranian languages. I. South-Western Iranian languages. Moscow: Publ. “Indrik”. (1997).
- Languages of the world: Iranian languages. II. North-Western Iranian languages. Moscow: Publ. “Indrik”. (1999), 302 p.
- Languages of the world: Dardic and Nuristani languages. Moscow: Publ. “Indrik”. (1998).
- Languages of the world: Iranian languages. III. East Iranian languages. Moscow: Publ. “Indrik”. (1999).
- Languages of the world: Germanic languages. Celtic languages. Moscow: Publ. “Academia”. (1999).
- Languages of the world: Caucasian languages. RAS. Institute of Linguistics. Moscow: Publ. “Academia”. (2001).
- Languages of the world: Romance languages. Moscow: Publ. “Academia”. (2001).
- Languages of the world: Indo-Aryan languages of Ancient and Middle Period. Moscow: Publ. “Academia”. (2004).
- Languages of the world: Slavonic languages. RAS. Institute of Linguistics. /Ed. A.M. Moldovan, S.S. Skorvid, A.A. Kibrik/ Moscow: Publ. “Academia”. (2005).
- Languages of the world: Baltic languages. RAS. Institute of Linguistics. /Ed. V.N. Toporov, M.V. Zavyalov, A.A. Kibrik/ Moscow: Publ. “Academia”. (2006), 224 p.
A new volume on Semitic languages has also been issued.

Characteristics of Data Base “Languages of the World”: Content

The Data Base “Languages of the World” has the following quantitative characteristics:
- contains more than 3800 features
- covers 313 Eurasian languages
- describes the following spheres of language: phonetics, morphology, syntax
- representation of data: binary

The following language families and unities are represented in the Data Base “Languages of the World”: Austroasiatic, Austronesian, Altaic, Afroasiatic, Indo-European, Caucasian, Paleoasiatic, Sino-Tibetan, Uralic, Hurro-Urartian. The DB also contains descriptions of language isolates: Ainu, Nivkh, Burushaski, Sumerian, Elamite.

A unique peculiarity of the Data Base “Languages of the World” is its large collection of descriptions of extinct languages, comprising 55 essays. There are no analogues of such a detailed and systematic description of extinct languages.

The main principles behind the model of language description are binarity, hierarchicity and paradigmaticity.
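The binary representation described above lends itself to simple set-based comparison of two languages. The following is a minimal sketch in Python, assuming each language description is stored as the set of features marked as present. The feature names, the toy languages LangA to LangC, and the choice of the Jaccard coefficient and the Hamming distance are illustrative assumptions; they are not taken from DB JM and are not claimed to be the measures used by the Similarity software.

```python
# Toy illustration of binary feature descriptions (NOT real DB JM data).
# All feature names and languages below are hypothetical.
FEATURES = ["vowel_harmony", "grammatical_gender", "case_marking",
            "definite_article", "tones"]

# Each language is the set of features marked "present" (value 1).
languages = {
    "LangA": {"vowel_harmony", "case_marking"},
    "LangB": {"vowel_harmony", "case_marking", "tones"},
    "LangC": {"grammatical_gender", "definite_article"},
}


def to_vector(present, features=FEATURES):
    """Expand a set of present features into a 0/1 vector."""
    return [1 if f in present else 0 for f in features]


def jaccard(a, b):
    """Shared present features divided by all features present in either."""
    union = a | b
    if not union:
        return 1.0  # two empty descriptions are treated as identical
    return len(a & b) / len(union)


def hamming(a, b, features=FEATURES):
    """Number of features on which the two binary descriptions differ."""
    va, vb = to_vector(a, features), to_vector(b, features)
    return sum(x != y for x, y in zip(va, vb))


if __name__ == "__main__":
    for x, y in [("LangA", "LangB"), ("LangA", "LangC")]:
        print(x, y, jaccard(languages[x], languages[y]),
              hamming(languages[x], languages[y]))
```

The Jaccard coefficient ignores features absent from both languages, which matters when most of several thousand binary features are 0 for any given language; the Hamming distance counts shared absences as agreement. Which behaviour is appropriate depends on the research question.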

Quantitative And Other Research Products

Similarity
- Programming language or environment: VBA, Excel
- Programmers, year of issue: Polyakov, 2006
- Main functions: similarity measure calculation and evaluation

LangFam
- Programming language or environment: VBA, Excel
- Programmers, year of issue: Polyakov, 2006
- Main functions: calculating language family portraits, revealing genetic markers, filtering rare features, investigating typological shift, etc.

Special software for modeling of evolution
- Programming language or environment: Pascal Delphi
- Programmers, year of issue: Yuzhikov, 2006 (*)
- Main functions: modeling of the appearance, borrowing and extinction of features; uses different model parameters and gives different quantitative values

Special software for the clustering task
- Programming language or environment: Pascal Delphi
- Programmers, year of issue: Dvoenosova (1st variant), 2006; Zheleznovsky (2nd variant), 2008 (*)
- Main functions: clustering of languages and features by different techniques of classic cluster analysis

Special software for phylogeny with different metrics of feature space
- Programming language or environment: Visual C
- Programmers, year of issue: Faskhutdinov, 2008 (*)
- Main functions: uses two heuristic ideas, the L- and S-metrics, to calculate the distance between languages

BiCoTree: software for easy tree building on the DB
- Programming language or environment: Pascal Delphi
- Programmers, year of issue: Sarvarov, 2010 (*)

Some other research programs, developed for different aims during partial investigations in areal, historical and typological linguistics (Gusareva, Loginova, Fashutdinov, Omlin, Polyakov, Solovyev)
- Programming language or environment: C, Pascal Delphi
- Main functions: solve different tasks:
  - calculate a core of relevant features for different language families;
  - calculate a homeland for different language families using grammar features;
  - calculate a stability index using different metrics;
  - etc.

(*) Task formalization was done by Valery Solovyev
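To make the clustering entry above more concrete, here is a minimal Python sketch of one classic cluster-analysis technique, single-linkage agglomerative clustering over Hamming distances between binary feature vectors. The function names and the toy languages LangA to LangD are hypothetical, and the slides do not say which techniques or metrics the Delphi programs actually implement, so this is only an illustration of the general approach.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions where two equal-length binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def single_linkage(vectors, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    vectors: dict mapping a language name to its 0/1 feature vector.
    Cluster distance is the minimum pairwise Hamming distance (single linkage).
    Returns a list of clusters, each a sorted list of language names.
    """
    clusters = [[name] for name in sorted(vectors)]
    while len(clusters) > max(n_clusters, 1):
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i, j in combinations(range(len(clusters)), 2):
            d = min(hamming(vectors[a], vectors[b])
                    for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i always, so index i stays valid
    return clusters


# Hypothetical toy data: four languages described by five binary features.
toy = {
    "LangA": [1, 0, 1, 0, 0],
    "LangB": [1, 0, 1, 0, 1],
    "LangC": [0, 1, 0, 1, 0],
    "LangD": [0, 1, 0, 1, 1],
}

if __name__ == "__main__":
    print(single_linkage(toy, 2))
```

With thousands of features and hundreds of languages, a library implementation would be the practical choice; the pure-Python loop is kept here only so the merging step is visible.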

Reference and Educational Products (under construction)

Living Diagrams
- Programming language or environment / database engine and data format: C# and .NET, MS SQL Server, Excel
- Programmers: Khanukaev (*)
- Main functions: reference software that can integrate source data with quantitative diagrams. It allows the user to draw quantitative pictures or tables and to query the source data directly from a picture. Its purpose is to improve linguists' confidence in quantitative results.

EduDBLANG
- Programming language or environment / database engine and data format: C# and .NET, MS SQL Server, Excel
- Programmers: Belyaev (*)
- Main functions: educational version of the DB with the full spectrum of reference possibilities. Includes genetic and geographic indices, annotations and examples for features, and full texts of papers, following the best WALS traditions. New concept of user interface.

(*) Task formalization is done by Polyakov

Specific problems related to the software development

- Problem: compatibility. Solution: special data converters are needed. Partly solved in the kernel versions of the DB, not solved in related products.
- Problem: the life cycle of the product is longer than the “lifetime” of the OS, the programming environment and even the programmers. Solution: solved by retaining key members of the team and organizing permanent knowledge transfer.
- Problem: English interface. Solution: easily solved.
- Problem: English content. Solution: a very hard problem because of the enormous volume of source data and the difficulty of translating terms correctly.
- Problem: code tables. Solution: may be solved by careful testing and choice of data format.
- Problem: adding and supporting content. Solution: solved by a highly qualified team of content developers.

Dictionary and source books
- Dictionary
- Two of 14 source books

Screenshots. Win Version

Screenshots. Living diagrams. #1

Screenshots. Living diagrams. #2

Screenshots. Living diagrams. #3

Screenshots. Living diagrams. #4

Screenshots. Living diagrams. #5

Web-version

THANK YOU! Contacts: Vladimir Polyakov