Download presentation
Presentation is loading. Please wait.
Published byDiana Jefferson Modified over 9 years ago
1
Ontology-Driven Information Retrieval Nicola Guarino Laboratory for Applied Ontology Institute for Cognitive Sciences and Technology (ISTC-CNR) Trento-Roma, Italy ICDL 2004, New Dehli, India
2
Summary The role of ontologies in (generalized) IR Semantic Matching Increasing precision Increasing recall Problems with WordNet The role of Foundational Ontologies
3
The key problem Semantic matching
4
Simple queries: need more knowledge about what the user wants Search for “Washington” (the person) –Google: 26,000,000 hits –45th entry is the first relevant –Noise: places Search for “George Washington” –Google: 2,200,00 hits –3rd entry is relevant –Noise: institutions, other people, places
5
Solution: ontology+semantic markup Ontology –Person George Washington George Washington Carver –Place Washington, D.C. –Artifact George Washington Bridge –Organization George Washington University Semantic disambiguation/markup of questions and corpora –What Washington are you talking about?
6
The role of taxonomy and lexical knowledge Search for “Artificial Intelligence Research” –Misses subfields of the general field –Misses references to “AI” and “Machine Intelligence” (synonyms) –Noise: non-research pages, other fields…
7
Solution: Extra knowledge –Ontology: Sub-fields (of AI) Knowledge Representation Machine Vision etc. Neural networks –Lexicon: Synonyms (for AI) Artificial Intelligence Machine Intelligence Techniques –Query Expansion Add disjuncted sub-fields to search Add disjuncted synonyms to search –Semantic Markup of question and corpora Add “general terms” (categories) Add “synonyms”
8
Ontology-driven search engines Idealized view –Ontology-driven search engines act as virtual librarians Determine what you “really mean” Discover relevant sources Find what you “really want” Requires common knowledge on all ends –Semantic linkage between questioning agent, answering agent and knowledge sources Hence the “Semantic Web”?
9
Two main roles of ontologies in IT Semantic Interoperability –Generalized database integration –Virtual Enterprises, Concurrent Engineering –e-commerce –Web services Information Retrieval –Documents –Facts (query answering) –Products –Services [Not mentioning IS analysis and design…]
10
The role of linguistic ontologies coupled with structured representation formalisms Why not just simple thesauri based on fixed keyword hierarchies? –Data’s intrinsic dynamics needs to keep track of new terms –Need of understanding a rigid set of terms –Heterogeneous descriptions need a broad-coverage vocabulary Linguistic ontologies with structured representation formalisms –Decouple user vocabulary from data vocabulary –Increase recall (synonyms, generalizations) –Increase precision (disambiguation, ontology navigation) –Further increase precision (by capturing the structure of queries and data)
11
Using WordNet as an ontology Unclear semantic interpretation of hyperonimy Instantiation vs. subsumption Object-level vs. meta-level Hyperonymy used to account for polysemy (law both a document and a rule) Unclear taxonomic structure Glosses not consistent with taxonomic structure Heterogeneous leves of generality Formal constraints violations (especially concerning roles) Polysemous use of antonymy (child/parent vs. daughter/son) Poor ontology of adjectives and qualities Shallow taxonomy of verbs
12
When subtle distinctions are important “Trying to engage with too many partners too fast is one of the main reasons that so many online market makers have foundered. The transactions they had viewed as simple and routine actually involved many subtle distinctions in terminology and meaning” Harvard Business Review, October 2001
13
Ontologies and intended meaning Models M D (L) Language L Commitment: K = Conceptualization Ontology Intended models I K (L) Interpretation
14
Levels of Ontological Precision Ontological precision Axiomatized theory Glossary Thesaurus Taxonomy DB/OO scheme tennis football game field game court game athletic game outdoor game Catalog game athletic game court game tennis outdoor game field game football game NT athletic game NT court game RT court NT tennis RT double fault game(x) activity(x) athletic game(x) game(x) court game(x) athletic game(x) y. played_in(x,y) court(y) tennis(x) court game(x) double fault(x) fault(x) y. part_of(x,y) tennis(y)
15
Ontology Quality: Precision and Coverage Low precision, max coverage Less good Good High precision, max coverage BAD Max precision, low coverage WORSE Low precision and coverage
16
I A (L) M D (L) I B (L) Why precision is important False agreement!
17
DOLCE a Descriptive Ontology for Linguistic and Cognitive Engineering Strong cognitive bias: descriptive (as opposite to prescriptive) attitude Emphasis on cognitive invariants Categories as conceptual containers: no “deep” metaphysical implications wrt “true” reality Clear branching points to allow easy comparison with different ontological options Rich axiomatization
18
DOLCE’s basic taxonomy Endurant Physical Amount of matter Physical object Feature Non-Physical Mental object Social object … Perdurant Static State Process Dynamic Achievement Accomplishment Quality Physical Spatial location … Temporal Temporal location … Abstract Quality region Time region Space region Color region …
19
Foundational ontologies and ontological analysis Domain ontologies –Physical objects –Information and information processing –Social interaction –Ontology of legal and financial entities Ontology, language, cognition Ontology-driven information systems –Ontology-driven conceptual modeling –Ontology-driven information access –Ontology-driven information integration Research priorities at LOA-CNR www.loa-cnr.it
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.