Published by Audra Gordon. Modified over 9 years ago.
1
Ontology-based information retrieval of scientific information
Natalia V. Loukachevitch
Laboratory of Information Resources Analysis, Research Computing Center of Moscow State University (MGU NIVC)
2
Thematic Search of Scientific Information
- Knowledge-based (ontology-based) search
- Use of synonyms
- Automatic query expansion
- Automatic analysis of query results
- Help in interactive search
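The use of synonyms for automatic query expansion can be illustrated with a minimal sketch. The toy thesaurus, the concept identifiers, and the function name `expand_query` are all assumptions made for illustration; they are not taken from the system described in the slides.

```python
# Hypothetical toy thesaurus: concept id -> set of synonymous terms.
# The entries are illustrative, not taken from the actual thesaurus.
TOY_THESAURUS = {
    "TAX": {"tax", "taxation", "levy"},
    "LAW": {"law", "legislation", "statute"},
}


def expand_query(query, thesaurus=TOY_THESAURUS):
    """Return the query terms followed by synonyms of every concept they name."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonyms in thesaurus.values():
            if term in synonyms:
                # Add the remaining synonyms in a stable (sorted) order.
                for synonym in sorted(synonyms):
                    if synonym not in expanded:
                        expanded.append(synonym)
    return expanded


print(expand_query("tax reform"))  # ['tax', 'reform', 'levy', 'taxation']
```

Terms that name no concept in the thesaurus (here, "reform") pass through unchanged, so the expanded query never loses anything the user typed.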
3
Bilingual Sociopolitical Thesaurus
The thesaurus development is based on three methodologies:
- methods of construction of information-retrieval thesauri (information-retrieval context, analysis of terminology, terminology-based concepts, a small set of relation types)
- development of wordnets for various languages (word-based concepts, detailed sets of synonyms, description of ambiguous text expressions)
- ontology and formal ontology research (strictness of relations description, necessity of many-step inference)
(33,000 concepts, 80,000 Russian terms, 85,000 English terms)
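A rough sketch of how a bilingual concept with a small set of relation types might be represented, and how many-step inference over those relations could work. The `Concept` class, the single `broader` relation, and the sample entries are hypothetical; the slides do not name the relation types or show the thesaurus's internal structure.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical concept record with terms in two languages."""
    name: str
    terms_ru: list = field(default_factory=list)
    terms_en: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # names of more general concepts


def ancestors(name, concepts):
    """Follow 'broader' links transitively, i.e. many-step inference."""
    seen = []
    stack = list(concepts[name].broader)
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        stack.extend(concepts[current].broader)
    return seen


# Illustrative entries only.
concepts = {
    "VAT": Concept("VAT", ["НДС"], ["VAT", "value-added tax"], ["TAX"]),
    "TAX": Concept("TAX", ["налог"], ["tax"], ["FINANCES"]),
    "FINANCES": Concept("FINANCES", ["финансы"], ["finances"], []),
}

print(ancestors("VAT", concepts))  # ['TAX', 'FINANCES']
```

With strictly described relations, a document about VAT can be inferred to concern finances even though no single link connects the two concepts directly.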
4
Socio-Political Domain vs. General Lexicon and Specific Lexicons
[Diagram labels: General Lexicon; Specific Lexicon; Intermediate Zone; Information Security; Aviation Ontology; Cultural Heritage; Ontology on Natural Sciences and Technology]
30,000 concepts; 70,000 terms
5
Thematic Structure
- tax; taxation system; taxpayer; finances; economy; tax legislation; VAT legislation; law; draft law; Taxation Code; deputy minister; Ministry of Finance; finances; reform; tax reform
- population
- budget, estimate; finances; economy; document
- government; state power; Minister of Finance
- State Duma; state power; state
6
Thematic representation of a text
[Diagram: Thematic Node i + Thematic Node j = thematic node in the text]
7
University Information System RUSSIA (http://www.cir.ru, http://uisrussia.msu.ru)
- Database of Full-text Documents (1.5 million): Legal Acts, Newspaper Articles, Scientific Reports
- Database “Statistics of Russian Federation” (Socio-economic Statistics, Demographic Statistics, Agrarian Statistics, Urban Statistics)
- Database “Budget system of Russian Federation” (www.budgetrf.ru)
8
Visualisation of Data in Dynamic Tables and Maps
9
Unified Technology Platform (Constructor)
[Diagram labels: Converters; Processing; Interfaces; Services]
10
Cross-Language Information Retrieval
11
Applications of technology
- Concept-based information retrieval (monolingual, bilingual)
- Information-retrieval systems combining word-based and concept-based search
- Concept-based automatic text categorization
- Automatic question answering
- Automatic text summarization
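One way to picture a system that combines word-based and concept-based search is a weighted sum of two overlap scores. This is only a sketch under stated assumptions: the term-to-concept table, the scoring formula, and the weight `alpha` are hypothetical and are not described in the slides.

```python
# Hypothetical mapping from terms to concept ids (illustrative only).
TERM_TO_CONCEPT = {"tax": "TAX", "levy": "TAX", "law": "LAW", "statute": "LAW"}


def concepts_of(words):
    """Map a set of words to the set of concepts they name."""
    return {TERM_TO_CONCEPT[w] for w in words if w in TERM_TO_CONCEPT}


def combined_score(query, document, alpha=0.5):
    """Weighted sum of word overlap and concept overlap, both in [0, 1]."""
    q_words = set(query.lower().split())
    d_words = set(document.lower().split())
    word_score = len(q_words & d_words) / len(q_words) if q_words else 0.0

    q_concepts = concepts_of(q_words)
    d_concepts = concepts_of(d_words)
    concept_score = (
        len(q_concepts & d_concepts) / len(q_concepts) if q_concepts else 0.0
    )
    return alpha * word_score + (1 - alpha) * concept_score


# No shared words, but both query concepts appear in the document.
print(combined_score("tax law", "new levy statute"))  # 0.5
```

The example document shares no words with the query, so a purely word-based search would miss it; the concept-based component still gives it a non-zero score.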
12
Main Projects
- State Duma of RF (1999 - …)
- Central Election Commission of RF (1997 - …)
- Legal Company “Garant” (2002 - …)
- Ministry of Education (2005-2006)
- Accounting Chamber of RF (2003 - …)
- Central Bank of RF (2006 - …)
Grants:
- MacArthur Foundation (1994, 1995, 2004 - …)
- Ford Foundation (2002, 2003)
- Russian Foundation for Basic Research (9)
- Russian Foundation for the Humanities (5)
- Eurasia Foundation (2002, 2003)
13
Participation in International Forums
- Participation in the Text REtrieval Conference (TREC), organized by NIST and DARPA (TREC-6, TREC-8)
- Participation in the Summarization Conference SUMMAC, organized by NIST and DARPA (1st place)
- Cross-Language Evaluation Forum CLEF (DELOS program)
  - participation in the Steering Committee
  - provision of Russian collections for evaluation purposes
  - domain-specific information retrieval
- Organizers of the Russian Information Retrieval Evaluation Seminar ROMIP (www.romip.ru/en/)