Reengineering AGROVOC to Ontologies Step towards better semantic structure NKOS Workshop 31 May 2003 Rice University Houston, Texas, USA Frehiwot Fisseha Anita Liang Johannes Keizer
About the Presenter Name: Frehiwot Fisseha Job Title: Information Management Specialist Works at: AGRIS/CARIS and Documentation Unit, FAO of the United Nations, Rome, Italy Major area of work: Development, maintenance and use of metadata and semantic standards in agricultural information management.
What this talk is all about 1. Similarities and differences between thesauri and ontologies 2. Need to reengineer AGROVOC to an ontology 3. Envisaged benefits 4. Problems with semantic representation in AGROVOC 5. Some ideas to transform AGROVOC to an ontology 6. Conclusion
Thesauri and Ontology Similarities Both provide a representation of a shared understanding of a domain in order to facilitate efficient communication. Both are concept based systems representing highly complex knowledge. Both are concerned with the terminology used to represent the concepts in a particular domain. Both utilize hierarchies to group terms into categories and subcategories. Both can be applied to cataloguing and organizing information resources.
Thesauri and Ontology Differences Users: –Thesauri are intended for human users. –Ontologies can be used by humans for knowledge sharing and by software agents for knowledge processing. Semantics: –Thesauri may contain prose definitions to help the human user understand the meaning of a term, but they do not provide a formal specification of concepts. –Ontologies specify conceptual knowledge explicitly using a formal language with clear semantics, which allows an unambiguous interpretation of terms for use by machines. Computational support: –Knowledge representation: thesauri offer limited or no means; ontologies are explicit and formalised.
Why do we need to reengineer AGROVOC to an ontology? Because AGROVOC has the following limitations: –It contains semantic ambiguities due to limited semantic coverage. The terms 'broader', 'narrower', 'used for' and 'related' are not defined by precise semantics. –It lacks an explicit and formal representation of meaning that can be utilised by machines. Meanings in AGROVOC are not represented using metadata technologies. –It is stored and maintained in a proprietary relational database system, which prohibits reuse and sharing. –It has a general domain scope and lacks domain-specific concepts. The emphasis is on general agricultural concepts and not on specialized disciplines like forestry, fishery, nutrition, etc.
What do we achieve by reengineering AGROVOC to an ontology?
Formal semantics due to standardized meaning
Internal consistency due to integrity constraints
Inferencing capability
Easy to re-use and share
–Ontologies are represented in standard languages such as RDF, DAML/OIL, OWL, etc.
–Possibility of using a unique concept identifier to standardize the meaning of the concept globally.
–Version management can be incorporated to reflect updates.
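To make "inferencing capability" and "unique concept identifier" a little more concrete, here is a minimal sketch in plain Python. It is not how AGROVOC or any ontology language is actually implemented. The identifiers (`ex:c_cow_milk`, etc.) are hypothetical, and the parent concept "animal product" is an assumption added only so that there is something to infer.

```python
from collections import defaultdict

IS_A = "is_a"

# (subject, predicate, object) statements keyed by hypothetical concept identifiers.
# These are NOT real AGROVOC codes.
TRIPLES = [
    ("ex:c_cow_milk", IS_A, "ex:c_milk"),
    ("ex:c_milk", IS_A, "ex:c_animal_product"),  # assumed parent, for illustration only
]


def ancestors(concept, triples):
    """Return every concept reachable from `concept` by following is_a links."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == IS_A:
            parents[subject].add(obj)

    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Cow milk is never stated to be an animal product; that fact is inferred.
print(sorted(ancestors("ex:c_cow_milk", TRIPLES)))
```

Because the relation has a defined meaning (is_a is transitive), a program can derive statements that were never entered by hand, which an undefined "broader term" link does not safely allow.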
.....just a glimpse on some problems of representing semantics in AGROVOC....
Broader Term (BT) and Narrower Term (NT) relations in AGROVOC
BT and NT are typical hierarchical relations in a thesaurus. However, their semantics are not explicitly defined. It is common for BT/NT relations within a thesaurus to include at least the following:
Is-A (e.g. Milk/Cow Milk; Development Agency/IDRC)
Ingredient of (e.g. Milk/Milk Fat)
–Milk fat is an ingredient of milk
Property of (e.g. Maize/Sweet corn)
–Sweetness is a property of corn
Some examples from AGROVOC
MAIZE NT dent maize NT flint maize NT popcorn NT soft maize NT sweet corn NT waxy maize
MILK NT Milk Fat NT Colostrum NT Cow Milk
Development Agencies NT development banks NT voluntary agencies NT IDRC
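The distinction drawn on this slide can be sketched as a small data structure in which every broader/narrower pair carries an explicit relation kind. This is only an illustration: the class and function names are hypothetical, and the kinds assigned below are the ones the slide itself gives for these four pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedRelation:
    broader: str
    narrower: str
    kind: str  # one of "is_a", "ingredient_of", "property_of"


# Pairs and kinds taken from the slide's own examples.
RELATIONS = [
    TypedRelation("milk", "cow milk", "is_a"),
    TypedRelation("development agencies", "IDRC", "is_a"),
    TypedRelation("milk", "milk fat", "ingredient_of"),
    TypedRelation("maize", "sweet corn", "property_of"),
]


def narrower_by_kind(term, kind, relations=RELATIONS):
    """List the narrower terms of `term` that are linked by the given kind."""
    return [r.narrower for r in relations if r.broader == term and r.kind == kind]


# A plain NT lookup would return both; typed lookups keep them apart.
print(narrower_by_kind("milk", "is_a"))           # kinds of milk
print(narrower_by_kind("milk", "ingredient_of"))  # ingredients of milk
```

With the kind recorded, a query for "kinds of milk" no longer returns milk fat, which a flat NT list cannot guarantee.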
Used For (UF), USE in AGROVOC
UF and USE represent the equivalence relationship in a thesaurus. However, the semantics are again blurred, since the equivalence relationship as used could include the following:
–genuine synonymy, or identical meanings;
–near-synonymy, or similar meanings;
In some thesauri, equivalence can include antonymic or opposite meanings (cf. Eurovoc).
Some examples from AGROVOC
DEVELOPMENT AGENCIES UF aid institutions
NOVEL FOODS UF Novelty Foods
SEX UF sex differences UF gender UF sex chromosomes
similar but not necessarily equivalent concept; completely different concepts
Related Term (RT) in AGROVOC
RT represents the associative relation. The RT usually involves the most ambiguous semantics. RT can include the following:
–causality
–agency or instrument
–hierarchy: where polyhierarchy has not been allowed, the missing hierarchical relationships are replaced by associative relationships
–sequence in time or space
–constituency
–characteristic feature
–object of an action, process or discipline
–location
–similarity (in cases where two near-synonyms have been included as descriptors)
–antonym
Some examples from AGROVOC
DEGRADATION RT chemical reactions RT discoloration RT hydrolysis RT shrinkage (causality)
IDRC RT canada (location)
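One way to picture the refinement of RT is a lookup that replaces the generic link with a named relation wherever an editor has supplied one. The sketch below is hypothetical throughout: the relation names `causes` and `located_in` are invented labels for the slide's "causality" and "location" categories, and pairing those categories with these particular examples (and the direction of the causal link) is our reading of the slide, not something AGROVOC states.

```python
# RT pairs as they appear in the slide's examples.
RT_PAIRS = [
    ("degradation", "discoloration"),
    ("degradation", "shrinkage"),
    ("IDRC", "canada"),
]

# Hand-curated refinements; hypothetical relation names, assumed direction.
REFINEMENTS = {
    ("degradation", "discoloration"): "causes",
    ("IDRC", "canada"): "located_in",
}


def refine(pairs, refinements):
    """Turn (term, related term) pairs into triples with an explicit relation.

    Pairs without a curated refinement keep the generic "related_to" link,
    so nothing is lost while the re-analysis is still in progress.
    """
    return [(s, refinements.get((s, o), "related_to"), o) for s, o in pairs]


for triple in refine(RT_PAIRS, REFINEMENTS):
    print(triple)
```

Keeping a fallback relation means the conversion can proceed incrementally instead of requiring every RT link to be re-analyzed up front.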
....just a glimpse on how ontology modeling can solve these problems....
Some ideas for reengineering AGROVOC
Most of the problems could be solved by:
1. Re-analyzing the existing relations to introduce explicit semantics: for instance,
–BT/NT relationship could be resolved to ‘Is-A’ relation
–RT relationship could be refined to more specific relationships (such as “produces”, “used by”, “made for”).
2. Specifying composite concepts in terms of basic concepts that can be unambiguously represented: for instance,
–Perishable product could be represented as “product” with attribute “perishable”
–Fencing sword could be represented as “sword” used for “fencing”
–Mother could be represented as “parent with an attribute female”
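The second idea, decomposing composite concepts into basic ones, can be sketched as follows. The `Composite` class, its field names and the `used_for` relation name are assumptions made for illustration; only the three decompositions themselves come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Composite:
    """A composite concept expressed as a base concept plus qualifiers."""
    base: str
    attributes: frozenset = frozenset()
    relations: tuple = ()  # pairs of (relation name, target concept)


# The three decompositions given on the slide.
PERISHABLE_PRODUCT = Composite("product", attributes=frozenset({"perishable"}))
FENCING_SWORD = Composite("sword", relations=(("used_for", "fencing"),))
MOTHER = Composite("parent", attributes=frozenset({"female"}))


def describe(concept):
    """Render a composite concept as a readable phrase."""
    parts = [concept.base]
    parts += [f"with attribute {a}" for a in sorted(concept.attributes)]
    parts += [f"{rel.replace('_', ' ')} {target}" for rel, target in concept.relations]
    return " ".join(parts)


print(describe(PERISHABLE_PRODUCT))
print(describe(FENCING_SWORD))
print(describe(MOTHER))
```

Because each composite is built from shared basic concepts, a program can find, for example, every concept whose base is "product" without relying on string matching against the term labels.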
Conclusion Ontologies are natural successors of thesauri, particularly for information retrieval and knowledge management. Ontologies provide better semantic representation and a machine-understandable representation of knowledge. They are meant for both human and machine use. AGROVOC does not have the required level of semantic specificity. Transforming AGROVOC into an ontology brings increased semantic precision, particularly for information retrieval purposes.
Thank you for your attention! For more information on AGROVOC and the current ontology development initiative in FAO, please visit the following sites. We look forward to hearing from you. Send us your comments and suggestions.