

Organizing and Implementing on the Thesauri Mapping Project
Dr. Chang Chun, Associate Professor
Agriculture Information Institute, Chinese Academy of Agricultural Sciences (AII/CAAS), Beijing, China
The Seventh Agricultural Ontology Service (AOS) Workshop, AFITA 2006, November 9-11, Bangalore, India

Outline
- Introduction
- Organizing
- AGROVOC and CAT
- Conclusions
- Objectives
- Methods
- Mapping rules
- Discussions

Brief Introduction on the Mapping Project
Resource: CAT (CAAS)
Target: AGROVOC (FAO)
Mapping rules: ExactMatch, InexactMatch, BroadMatch, NarrowMatch; AND, OR, NOT; No mapping

Objective 1: Enrich AOS Terminology Domain Knowledge
- Keyword searching has problems in retrieving information;
- Thesauri are still in use in information management;
- Research on conversion from thesaurus to ontology;
- Mapping can add new domain knowledge.

Objective 2: Develop Cross-Language Search System
Diagram: Chinese users and English users each query a search end. Mapping information (e, b, n, …) links CAT (Chinese data) with AGROVOC (AGRIS data).
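The idea of searching across languages through mapping information can be sketched as a query expansion step. This is only an illustration, not the project's search system: the `MAPPINGS` table, the function names, and the OR-joined query string are assumptions, and the relation codes (e = exact, b = broad, n = narrow) are read from the slide. The term pairs come from the examples later in this talk.

```python
# Hypothetical mapping table: CAT term -> list of (relation code, AGROVOC term).
# Relation codes as read from the slide: e = exact, b = broad, n = narrow.
MAPPINGS = {
    "大麦": [("e", "Barley"), ("e", "Hordeum vulgare")],
    "禾谷类作物": [("e", "Cereal crops")],
    "普及教育": [("b", "Education")],
}


def expand_query(cat_term, allowed=("e",)):
    """Return the AGROVOC terms that a CAT query term maps to."""
    return [term for code, term in MAPPINGS.get(cat_term, []) if code in allowed]


def build_query(cat_term, allowed=("e",)):
    """Join the mapped AGROVOC terms into one OR query string (assumed syntax)."""
    return " OR ".join(f'"{term}"' for term in expand_query(cat_term, allowed))


if __name__ == "__main__":
    print(build_query("大麦"))
    print(build_query("普及教育", allowed=("e", "b")))
```

Restricting `allowed` to exact matches keeps results precise; adding broad matches widens recall at the cost of less relevant hits.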

The Time and Tools of Mapping Project
- Duration of the mapping project: from September 2005 to September 2006;
- Mapping rules: a revised version of the SKOS Mapping Vocabulary Specification;
- Mapping direction: from CAT (resource) to AGROVOC (target);
- Mapping tools: Protégé, Excel sheets, CAT and AGROVOC CD-ROM.

Working Flow
- From … to …: make plans for the mapping methods; prepare and test the mapping data;
- From … to …: training, and mapping with Excel sheets;
- From … to …: convert the Excel sheet information to OWL mapping data that Protégé can read.

The specialists
- We organized about 16 agricultural domain specialists in CAAS. Many of them are PhD students, and they were chosen according to their domain.
- The main domains are biological science, agricultural environmental science, agricultural meteorology, fertilizer science, horticulture, forestry practice, plant protection, agronomy, agricultural products processing, storage and comprehensive utilization, veterinary medicine, biological control, industrial technology and equipment, fishery science, and so on.
- Some of them have knowledge of thesauri.

AGROVOC and CAT
AGROVOC:
- English terms: … descriptors, … non-descriptors
- Chinese terms: … descriptors, 8432 non-descriptors
- 1240 top terms organized in 130 categories (AGRIS/CARIS)
- Includes biological taxonomy and geographical names
CAT:
- Chinese terms: … descriptors, … non-descriptors
- … descriptors have at least one translation
- 2332 top terms organized in 40 categories (e.g. crops, etc.)
- Includes biological taxonomy and geographical names

To Finish the Mapping Work in Two Steps
- First, Excel sheets: we split CAT into 36 documents by domain, then try to find all the mapping information and enter it in an Excel sheet for each document. All these sheets are kept as the original data.
- Second, conversion to OWL documents: after finishing all the Excel sheets, we convert the mapping information and enter it into OWL documents, which can be read in Protégé after importing CAT and AGROVOC.

Excel sheets
Column A: C-term code
Column B: C-term
Column C: Relation
Column D: A-term code
Column E: A-term
Column F: combine relation
Column G: C-revise suggestion
Column H: C-comment
Column I: A-revise suggestion
Column J: A-comment
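The ten-column sheet layout can be modelled as a simple record type. The sketch below assumes the sheets have been exported to CSV with no header row; the field names, the `read_rows` helper, and the sample line are hypothetical and only mirror the column order on the slide.

```python
import csv
import io
from dataclasses import dataclass

# Field order follows the ten sheet columns A-J listed on the slide.
FIELDS = [
    "c_code", "c_term", "relation", "a_code", "a_term",
    "combine", "c_revise", "c_comment", "a_revise", "a_comment",
]


@dataclass
class MappingRow:
    c_code: str      # A: CAT term code
    c_term: str      # B: CAT term
    relation: str    # C: mapping relation
    a_code: str      # D: AGROVOC term code
    a_term: str      # E: AGROVOC term
    combine: str     # F: combine relation (AND / OR / NOT), if any
    c_revise: str    # G: revision suggestion for the CAT term
    c_comment: str   # H: comment on the CAT term
    a_revise: str    # I: revision suggestion for the AGROVOC term
    a_comment: str   # J: comment on the AGROVOC term


def read_rows(csv_text):
    """Parse CSV text (assumed export of one sheet) into MappingRow records."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        # Pad short records so trailing empty columns may be omitted.
        padded = (record + [""] * len(FIELDS))[: len(FIELDS)]
        rows.append(MappingRow(*[cell.strip() for cell in padded]))
    return rows


if __name__ == "__main__":
    sample = "7536,大麦,e,823,Barley,OR\n7536,大麦,e,3662,Hordeum vulgare\n"
    for row in read_rows(sample):
        print(row.c_term, row.relation, row.a_term, row.combine)
```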

Mapping Standards and Methods
- Exact Match, Inexact Match;
- Broad Match, Narrow Match;
- AND; OR; NOT.

Mapping relationships
- Exact match. SKOS: exactMatch. OWL: equivalentClass.
- Broader/Narrower match. SKOS: broadMatch, narrowMatch. OWL: subClassOf.
- OR, AND, NOT operators. SKOS: OR, AND, NOT. OWL: unionOf, intersectionOf, complementOf.
- Partial equivalences. SKOS: minorMatch, majorMatch.
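The correspondence between SKOS mapping relations and OWL constructs can be written as a small lookup. This is a sketch under stated assumptions: the `to_axiom` function and the triple representation are hypothetical, and the direction chosen for narrowMatch (the AGROVOC concept becomes the subclass) is an interpretation of the slide, not something it states.

```python
def to_axiom(relation, cat_id, agrovoc_id):
    """Turn one CAT -> AGROVOC mapping into a (subject, predicate, object) triple."""
    if relation == "exactMatch":
        return (cat_id, "owl:equivalentClass", agrovoc_id)
    if relation == "broadMatch":
        # The AGROVOC concept is broader, so the CAT concept is the subclass.
        return (cat_id, "rdfs:subClassOf", agrovoc_id)
    if relation == "narrowMatch":
        # The AGROVOC concept is narrower, so it becomes the subclass (assumed).
        return (agrovoc_id, "rdfs:subClassOf", cat_id)
    raise ValueError(f"no OWL construct assumed for relation: {relation}")


if __name__ == "__main__":
    print(to_axiom("exactMatch", "cat:禾谷类作物", "agrovoc:25512"))
    print(to_axiom("broadMatch", "cat:普及教育", "agrovoc:2488"))
    print(to_axiom("narrowMatch", "cat:8341", "agrovoc:695"))
```

Partial equivalences (minorMatch, majorMatch) are left out because the slide lists no OWL counterpart for them.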

Exact Match
Mapping from CAT to AGROVOC by Exact Match.
Example: ‘禾谷类作物’ Exact Match ‘25512-Cereal crops’

equivalentClass: one of the main mapping relations (13105)

Inexact Match
Mapping from CAT to AGROVOC by Inexact Match.
Example: ‘经济大国’ (major economic powers) Inexact Match ‘Developed countries’

Inexact Match: we seldom use this mapping relation.
Example: 55581_玉米芯_Maizecob, relation ie
<rdfs:comment rdf:datatype="…">inexact mapping with …

Broad Match
Mapping from CAT to AGROVOC by Broad Match.
Example: ‘普及教育’ (universal education) Broad Match ‘2488-Education’

subClassOf: BroadMatch, another main mapping relation (11408)

Narrow Match
Mapping from CAT to AGROVOC by Narrow Match.
Example: ‘8341_岛屿_Islands’ Narrow Match ‘695_Atolls_环礁’

subClassOf: Narrow Match (173)

AND; OR; NOT
AND: ‘自动标引’ Exact Match ‘11729-Indexing of information’ AND ‘Automation’
OR: ‘7536_大麦_Barley’ Exact Match ‘823_Barley_大麦’ OR ‘3662_Hordeum vulgare_大麦植物’
NOT: ‘非传染性病害’ Exact Match ‘5962-Plant diseases’ NOT ‘34024-Infectious diseases’

AND
‘59683_自动标引_Automaticindexing’ Exact Match ‘11729_Indexingofinformation_信息编目’ AND ‘15855_Automation_自动化’

AND: intersectionOf

OR
‘7536_大麦_Barley’ Exact Match ‘823_Barley_大麦’ OR ‘3662_Hordeum vulgare_大麦植物’

OR: unionOf

NOT
‘12114_非传染性病害_Non-infectiousdiseases’ Exact Match ‘5962_Plantdiseases_植物病害’ AND NOT ‘34024_Infectiousdiseases_侵染性病害’

NOT: complementOf
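The three combined mappings above can be expressed as OWL class expressions. The sketch below renders them in OWL 2 functional-style syntax, which is a notation chosen here for compactness and not the format used in the project; the function names and the bare class names are hypothetical.

```python
# OWL 2 functional-style constructors for the AND and OR operators.
OPERATORS = {"AND": "ObjectIntersectionOf", "OR": "ObjectUnionOf"}


def class_expression(operator, operands):
    """Build the class expression for an AND, OR, or NOT combination."""
    if operator == "NOT":
        # "A NOT B" is read as: A and (not B), as in the slide's example.
        included, *excluded = operands
        negated = [f"ObjectComplementOf({name})" for name in excluded]
        return f"ObjectIntersectionOf({' '.join([included] + negated)})"
    return f"{OPERATORS[operator]}({' '.join(operands)})"


def exact_match(cat_term, operator, operands):
    """State that a CAT term is equivalent to a combination of AGROVOC terms."""
    return f"EquivalentClasses({cat_term} {class_expression(operator, operands)})"


if __name__ == "__main__":
    print(exact_match("Automaticindexing", "AND",
                      ["Indexingofinformation", "Automation"]))
    print(exact_match("Barley_CAT", "OR", ["Barley", "Hordeum_vulgare"]))
    print(exact_match("Non-infectiousdiseases", "NOT",
                      ["Plantdiseases", "Infectiousdiseases"]))
```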

No mapping: 13867_干扰_Interference

NoMapping: recorded as a comment
<rdfs:comment rdf:datatype="…">AGROVOC hasn't this concept</rdfs:comment>

How to get OWL documents
- Convert the Excel sheet information to Protégé (by machine conversion and human input) to get the OWL mapping data;
- Use the ‘import ontology’ tool to import one domain of CAT and the whole of AGROVOC, then enter the mapping relations. After saving the work, we get one OWL document per domain.

Combine the OWL documents
- Delete the top and the end of every OWL document, then paste the remaining parts together to get the whole middle part of the mapping project;
- Create a new OWL document, import the whole of CAT and AGROVOC, and save the document;
- Insert the whole middle part of the mapping project into this new document to get one complete mapping OWL document, which works with the whole of CAT and AGROVOC.
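The combining steps above can be sketched with plain string handling. This is only an illustration, assuming each domain document is RDF/XML wrapped in a single `<rdf:RDF …>` root element and that its "top" and "end" are just that root's opening and closing tags; a real document would also carry an ontology header and import statements that need removing. The function names are hypothetical.

```python
ROOT_CLOSE = "</rdf:RDF>"


def extract_middle(owl_text):
    """Drop everything up to the end of the <rdf:RDF ...> tag and the closing tag."""
    start = owl_text.index("<rdf:RDF")
    open_end = owl_text.index(">", start) + 1
    close = owl_text.rindex(ROOT_CLOSE)
    return owl_text[open_end:close].strip()


def merge_documents(shell_text, domain_texts):
    """Insert the middles of all domain documents before the shell's closing tag."""
    close = shell_text.rindex(ROOT_CLOSE)
    middle = "\n".join(extract_middle(text) for text in domain_texts)
    return shell_text[:close] + middle + "\n" + shell_text[close:]


if __name__ == "__main__":
    shell = '<rdf:RDF xmlns:owl="x">\n<owl:Ontology rdf:about=""/>\n</rdf:RDF>\n'
    crops = '<rdf:RDF xmlns:owl="x">\n<owl:Class rdf:ID="Barley"/>\n</rdf:RDF>\n'
    print(merge_documents(shell, [crops]))
```

Parsing the documents with an RDF library would be more robust than cutting text, but the string version matches the copy-and-paste procedure described on the slide.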

1 Candidate and the True mapping
Table: Automatic identification of candidate exact matches. Columns: Num., Taxon., Geogr., Total, Action. Rows and actions:
- Match English and Chinese: Exact match
- Match English but different Chinese: Match not ensured
- Match Chinese but different English: Tentative exact match
- Total
Table: The statistics of true mapping matches relation. Columns: Classification, Exact match, b, n, e-b-n total, Other relation, Classification total.
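The automatic identification of candidate exact matches can be sketched as a comparison of the English and Chinese labels of a CAT term and an AGROVOC term. The pairing of each case with its action is read from the flattened table above, and the normalisation (trimming, case-insensitive English comparison) and function name are assumptions.

```python
def classify_candidate(cat_en, cat_zh, agrovoc_en, agrovoc_zh):
    """Return the suggested action for a CAT/AGROVOC term pair, or None."""
    # Assumed normalisation: ignore case and surrounding spaces in English labels.
    same_en = cat_en.strip().lower() == agrovoc_en.strip().lower()
    same_zh = cat_zh.strip() == agrovoc_zh.strip()
    if same_en and same_zh:
        return "Exact match"
    if same_en:
        return "Match not ensured"
    if same_zh:
        return "Tentative exact match"
    return None  # not a candidate; left to the specialists


if __name__ == "__main__":
    print(classify_candidate("Barley", "大麦", "barley", "大麦"))
    print(classify_candidate("Barley", "大麦", "Barley", "大麦植物"))
    print(classify_candidate("Islands", "岛屿", "Isles", "岛屿"))
```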

2 The Series Mapping Knowledge Data Files
The contribution includes the following documents:
(a) cat_agrovco_mapping.owl
(b) ag_ owl
(c) cat_all_u.owl
(d) agrovoc-zh-revise.xls
(e) agrovoc-usefor-comment.xls
In Protégé, users can create a new ontology from the data of (a). Protégé will ask to import (b) and (c), and then (a) can be opened. Opening is rather slow: our computer (CPU 3.4, RAM 1 G) needs about 4 minutes. File (d) records the information about AGROVOC terms that needs to be revised; (e) holds the comments on AGROVOC terms.

Discussions
- No mapping;
- InexactMatch;
- Begin from the top term;
- The mapping document needs to work with CAT and AGROVOC;
- There are many broadMatch relations;
- The comment and the suggestion.

The Heredity of Mapping Relation
About 60% of CAT concepts obtain their mapping relation with AGROVOC by heredity (inheritance). They normally follow an ExactMatch or BroadMatch (24 513).
Diagram: CAT concept C1 linked to AGROVOC concept A by ExactMatch and BroadMatch.
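One way to read heredity is that a CAT concept with no mapping of its own takes a BroadMatch to whatever AGROVOC concept its nearest mapped ancestor points to. The sketch below implements that reading; the rule itself, the restriction to ExactMatch and BroadMatch ancestors, and all names are assumptions rather than the project's documented procedure.

```python
# Only these relations are assumed safe to pass down the CAT hierarchy.
INHERITABLE = {"exactMatch", "broadMatch"}


def inherit_mappings(children, direct):
    """Propagate mappings from mapped CAT terms to their unmapped narrower terms.

    children: CAT term -> list of its narrower CAT terms
    direct:   CAT term -> (relation, AGROVOC term) assigned by a specialist
    """
    result = dict(direct)
    stack = [term for term, (rel, _) in direct.items() if rel in INHERITABLE]
    while stack:
        parent = stack.pop()
        target = result[parent][1]
        for child in children.get(parent, []):
            if child not in result:  # direct mappings always win
                result[child] = ("broadMatch", target)
                stack.append(child)
    return result


if __name__ == "__main__":
    children = {"Cereal crops": ["Barley", "Maize"], "Maize": ["Maize cob"]}
    direct = {"Cereal crops": ("exactMatch", "Cereal crops")}
    for term, mapping in inherit_mappings(children, direct).items():
        print(term, mapping)
```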

Different Thesauri with Different Classification
A few concepts sit in different domain trees in the two thesauri, which means that each thesaurus has its own classification.
Diagram: CAT concept C1 linked to AGROVOC concept A by ExactMatch.

The Resource and Target
- ExactMatch: the same concepts;
- BroadMatch: Chinese users reach a broader concept and may retrieve some irrelevant information; English users reach a more specific concept and may not find all the information;
- NarrowMatch: the opposite;
- CAT has more than 60,000 terms and AGROVOC only about 30,000, so it is better to take CAT as the resource.
Diagram: CAT concept C1 linked to AGROVOC concepts (A, A4) by ExactMatch, BroadMatch, and NarrowMatch.

Discussions 2
- Different knowledge taxonomies;
- Differences between nouns and verbs;
- Different social ideas;
- Different cultures;
- Different translations.

Chinese Academy of Agricultural Sciences (CAAS) and Food and Agriculture Organization (FAO)
Thank you