Ontologies (What they are; Why you should care; What you should know)
Deborah L. McGuinness
Associate Director and Senior Research Scientist
Knowledge Systems Laboratory, Stanford University
Stanford, CA 94305
650-723-9770
dlm@ksl.stanford.edu
What is an Ontology?
[Figure: a spectrum of ontology types, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restrictions; Disjointness, inverse, part-of…; General logical constraints]
Ontologies and importance to E-Commerce
Simple ontologies (taxonomies) provide:
- Controlled shared vocabulary (search engines, authors, users, databases, programs/agents all speak the same language)
- Site organization and navigation support
- Expectation setting (left side of many web pages)
- “Umbrella” upper-level structures (for extension)
- Browsing support (tagged structures such as Yahoo!)
- Search support (query expansion approaches such as FindUR, e-Cyc)
- Sense disambiguation
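To illustrate the search-support point, here is a minimal sketch of query expansion over a taxonomy. The taxonomy contents and the name `expand_query` are hypothetical, chosen only for illustration; this is not how FindUR or e-Cyc are implemented.

```python
from collections import deque

# Hypothetical toy taxonomy: each term maps to its narrower (child) terms.
TAXONOMY = {
    "apparel": ["shirts", "pants"],
    "shirts": ["t-shirts", "dress shirts"],
    "pants": ["jeans"],
}


def expand_query(term, taxonomy):
    """Return the term followed by all of its narrower terms, breadth-first."""
    expanded = []
    queue = deque([term])
    seen = {term}
    while queue:
        current = queue.popleft()
        expanded.append(current)
        for child in taxonomy.get(current, []):
            if child not in seen:  # guard against terms reachable twice
                seen.add(child)
                queue.append(child)
    return expanded


if __name__ == "__main__":
    # A search for "apparel" is widened to every narrower term.
    print(expand_query("apparel", TAXONOMY))
```

Under these assumptions, a query for "apparel" would also match pages indexed under "shirts", "pants", "t-shirts", "dress shirts", and "jeans", while a query for a leaf term such as "jeans" is left unchanged.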
Ontologies and importance to E-Commerce II
- Consistency checking
- Completion
- Interoperability support
- Support for validation and verification testing (e.g. http://ksl.stanford.edu/projects/DAML/chimaera-jtp-cardinality-test1.daml)
- Configuration support
- Structured, “surgical” comparative customized search
- Generalization/specialization
- …
- Foundation for expansion and leverage
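As a rough illustration of consistency checking, the sketch below flags individuals that are asserted, directly or through superclasses, to belong to two classes declared disjoint. The class names, the data layout, and the function names are assumptions made for this example, and the is-a hierarchy is assumed to be acyclic with a single parent per class.

```python
# Hypothetical toy ontology (names chosen for illustration only).
SUBCLASS_OF = {"red_wine": "wine", "white_wine": "wine", "merlot": "red_wine"}
DISJOINT = [("red_wine", "white_wine")]
INSTANCE_OF = {
    "bottle_1": ["merlot"],
    "bottle_2": ["merlot", "white_wine"],
}


def ancestors(cls):
    """Return cls together with all of its superclasses (acyclic hierarchy assumed)."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def find_disjointness_violations():
    """List (individual, class_a, class_b) where the individual is in two disjoint classes."""
    violations = []
    for individual, classes in sorted(INSTANCE_OF.items()):
        all_classes = set()
        for cls in classes:
            all_classes |= ancestors(cls)
        for a, b in DISJOINT:
            if a in all_classes and b in all_classes:
                violations.append((individual, a, b))
    return violations


if __name__ == "__main__":
    print(find_disjointness_violations())
```

Here `bottle_2` is reported because "merlot" inherits from "red_wine", which is declared disjoint from "white_wine"; a plain taxonomy without disjointness statements could not detect this.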
A Few Observations about Ontologies
- Simple ontologies can be built by non-experts
  - Verity’s Topic Editor, Collaborative Topic Builder, GFP, Chimaera, Protégé, OIL-ED, etc.
- Ontologies can be semi-automatically generated
  - from crawls of sites such as Yahoo!, Amazon, Excite, etc.
  - Semi-structured sites can provide starting points
- Ontologies are exploding (business pull instead of technology push)
  - Most e-commerce sites are using them: MySimon, Amazon, Yahoo! Shopping, VerticalNet, etc.
  - Controlled vocabularies (for the web) abound: SIC codes, UMLS, UN/SPSC, Open Directory (DMOZ), RosettaNet, SUO
  - Business interest is expanding: ontology directors, business ontologies are becoming more complicated (roles, value restrictions, …), VC firms are interested
  - DTDs are making more ontology information available
  - Markup languages are growing: XML, RDF, DAML, RuleML, xxML
- “Real” ontologies are becoming more central to applications
Implications and Needs
- Ontology language syntax and semantics (DAML+OIL)
- Environments for creation and maintenance of ontologies
- Training (conceptual modeling, reasoning implications, …)
Issues
- Collaboration among distributed teams
- Interconnectivity with many systems/standards
- Analysis and diagnosis
- Scale
- Versioning
- Security
- Ease of use
- Diverse training levels/user support
- Presentation style
- Lifecycle
- Extensibility
Chimaera: An Ontology Environment Tool
An interactive web-based tool aimed at supporting:
- Ontology analysis (correctness, completeness, style, …)
- Merging of ontological terms from varied sources
- Maintaining ontologies over time
- Validation of input
Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language
Used in commercial and academic environments
Available as a hosted service from www-ksl-svc.stanford.edu
Information: www.ksl.stanford.edu/software/chimaera
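To give a feel for what merging terms from varied sources involves, the following sketch proposes merge candidates by comparing normalized term names across two ontologies. This is an illustrative heuristic only, not Chimaera's actual algorithm; the function names and sample terms are hypothetical.

```python
import re


def normalize(term):
    """Lowercase a term and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", term.lower())


def merge_candidates(terms_a, terms_b):
    """Pair terms from two sources whose normalized names coincide."""
    index = {}
    for term in terms_b:
        index.setdefault(normalize(term), []).append(term)
    pairs = []
    for term in terms_a:
        for match in index.get(normalize(term), []):
            pairs.append((term, match))
    return pairs


if __name__ == "__main__":
    source_a = ["Red-Wine", "Winery", "Grape"]
    source_b = ["red_wine", "WINERY", "Region"]
    # Candidates are suggestions for a human editor, not automatic merges.
    print(merge_candidates(source_a, source_b))
```

A name match is only a hint: two sources may use the same word for different concepts, which is why a tool in this space would present such pairs to the user for review rather than merging them silently.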
Discussion/Conclusion
- Ontologies are exploding; they are the core of many applications
- Business “pull” is driving ontology tools and languages
- New generation applications need more expressive ontologies and more back-end reasoning
- New generation users (the general public) need more support than previous users of KR&R systems
- Distributed ontologies need more support: merging, analysis, incompleteness, versioning, etc.
- Scale and distribution of the web force a mind shift
- Everyone is in the game: US Government (DARPA, NSF, NIST, …), EU, W3C, consortiums, business, …
- This is THE time for ontology work!!!
Some Pointers
- Ontologies Come of Age paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
- Ontologies and Online Commerce paper: http://www.ksl.stanford.edu/people/dlm/papers/ontologies-and-online-commerce-abstract.html
- DAML+OIL: http://www.daml.org/
Extras
E-Commerce Search (starting point: Forrester Research, modified by McGuinness)
Ask Queries
- multiple search interfaces (surgical shoppers, advice seekers, window shoppers)
- set user expectations (interactive query refinement)
- anticipate anomalies
Get Answers
- basic information (multiple sorts, filtering, structuring)
- modify results (user-defined parameters for refining, user profile info, narrow query, broaden query, disambiguate query)
- suggest alternatives (suggest other comparable products, even from competitors’ sites)
Make Decisions
- manipulate results (enable side-by-side comparison)
- dive deeper (provide additional info, multimedia, other views)
- take action (buy)
The Need For KB Analysis
- Large-scale knowledge repositories will necessarily contain KBs produced by multiple authors in multiple settings
- KBs for applications will typically be built by assembling and extending multiple modular KBs from repositories that may not be consistent
- KBs developed by multiple authors will frequently
  - Express overlapping knowledge in different, possibly contradictory ways
  - Use differing assumptions and styles
- For such KBs to be used as building blocks, they must be reviewed for appropriateness and “correctness”
  - That is, they must be analyzed
Our KB Analysis Task
- Review KBs that:
  - Were developed using differing standards
  - May be syntactically but not semantically validated
  - May use differing modeling representations
- Produce KB logs (in interactive environments)
- Identify provable problems
- Suggest possible problems in style and/or modeling
- Are extensible by being user programmable
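The distinction between provable problems and possible style problems, and the idea of user-programmable checks, can be sketched as a small registry of diagnostic functions that write to a log. Everything here is an assumption for illustration: the KB is reduced to a single-parent `subclass_of` mapping, and the check names and severities are hypothetical rather than taken from any existing tool.

```python
def find_cycles(subclass_of):
    """Provable problem: a class that is its own ancestor."""
    problems = []
    for start in sorted(subclass_of):
        seen, current = set(), start
        while current in subclass_of and current not in seen:
            seen.add(current)
            current = subclass_of[current]
            if current == start:
                problems.append(f"cycle through {start}")
                break
    return problems


def find_single_child_classes(subclass_of):
    """Possible style problem: a class with exactly one subclass."""
    children = {}
    for child, parent in subclass_of.items():
        children.setdefault(parent, []).append(child)
    return [
        f"{parent} has only one subclass"
        for parent, kids in sorted(children.items())
        if len(kids) == 1
    ]


# Users extend the analysis by appending (severity, check) pairs.
DIAGNOSTICS = [("error", find_cycles), ("warning", find_single_child_classes)]


def analyze(subclass_of, diagnostics=DIAGNOSTICS):
    """Run every registered check and return a log of (severity, message)."""
    log = []
    for severity, check in diagnostics:
        for message in check(subclass_of):
            log.append((severity, message))
    return log


if __name__ == "__main__":
    kb = {"b": "a", "a": "b", "c": "a"}  # hypothetical KB with an is-a cycle
    for entry in analyze(kb):
        print(entry)
```

Keeping the checks in a plain list means a new diagnostic can be added without touching `analyze`, which is one simple way to read "extensible by being user programmable".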