Mark A. Musen Stanford University

Slides:



Advertisements
Similar presentations
Experiences from the NCBO OBO-to-OWL Mapping Effort Dilvan A. Moreira, University of São Paulo Mark A. Musen, Stanford University.
Advertisements

DELOS Highlights COSTANTINO THANOS ITALIAN NATIONAL RESEARCH COUNCIL.
Design, prototyping and construction
Intelligence Step 5 - Capacity Analysis Capacity Analysis Without capacity, the most innovative and brilliant interventions will not be implemented, wont.
The 20th International Conference on Software Engineering and Knowledge Engineering (SEKE2008) Department of Electrical and Computer Engineering
Teacher Excellence and Support System
Mine Action Information Center
The National Center for Biomedical Ontology Online Knowledge Resources for the Industrial Age Mark A. Musen Stanford University
Fit to Learn Using the Employability Skills Framework to improve your performance at College The Employability Skills Framework has been developed by business.
IS214 Recap. IS214 Understanding Users and Their Work –User and task analysis –Ethnographic methods –Site visits: observation, interviews –Contextual.
Case Tools Trisha Cummings. Our Definition of CASE  CASE is the use of computer-based support in the software development process.  A CASE tool is a.
Chapter 11 Artificial Intelligence and Expert Systems.
Fungal Semantic Web Stephen Scott, Scott Henninger, Leen-Kiat Soh (CSE) Etsuko Moriyama, Ken Nickerson, Audrey Atkin (Biological Sciences) Steve Harris.
PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya F. Noy and Mark A. Musen.
Data Sources & Using VIVO Data Visualizing Scholarship VIVO provides network analysis and visualization tools to maximize the benefits afforded by the.
Software Process and Product Metrics
Approaches to ---Testing Software Some of us “hope” that our software works as opposed to “ensuring” that our software works? Why? Just foolish Lazy Believe.
Evaluating Ontology-Mapping Tools: Requirements and Experience Natalya F. Noy Mark A. Musen Stanford Medical Informatics Stanford University.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
CASE Tools And Their Effect On Software Quality Peter Geddis – pxg07u.
MDC Open Information Model West Virginia University CS486 Presentation Feb 18, 2000 Lijian Liu (OIM:
Carlos Lamsfus. ISWDS 2005 Galway, November 7th 2005 CENTRO DE TECNOLOGÍAS DE INTERACCIÓN VISUAL Y COMUNICACIONES VISUAL INTERACTION AND COMMUNICATIONS.
Web 2.0 for Government Knowledge Management Everyone benefits by sharing knowledge March 24, 2010 Emerging Technologies Work Group Rich Zaziski, CEO FYI.
Overview of the Database Development Process
Systems Analysis – Analyzing Requirements.  Analyzing requirement stage identifies user information needs and new systems requirements  IS dev team.
Describing Methodologies PART II Rapid Application Development*
Principles of User Centred Design Howell Istance.
Conceptual Framework for the College of Education Created by: Dr. Joe P. Brasher.
 Knowledge Acquisition  Machine Learning. The transfer and transformation of potential problem solving expertise from some knowledge source to a program.
11 C H A P T E R Artificial Intelligence and Expert Systems.
Ensemble Computing in the National Science Digital Library (NSDL)
Knowledge Representation and Indexing Using the Unified Medical Language System Kenneth Baclawski* Joseph “Jay” Cigna* Mieczyslaw M. Kokar* Peter Major.
Ontology Summit2007 Survey Response Analysis -- Issues Ken Baclawski Northeastern University.
=_A-ZVCjfWf8 Nets for students 2007.
THE DANIELSON FRAMEWORK. LEARNING TARGET I will be be able to identify to others the value of the classroom teacher, the Domains of the Danielson framework.
The Agricultural Ontology Service (AOS) A Tool for Facilitating Access to Knowledge AGRIS/CARIS and Documentation Group Library and Documentation Systems.
1 Introduction to Software Engineering Lecture 1.
Shruthi(s) II M.Sc(CS) msccomputerscience.com. Introduction Digital Libraries have become the source of information sharing across the globe for education,
Knowledge-based flexible workflow to support decision follow-ups Carla Valle Fraunhofer FIT - Germany.
Blogs, Wikis and Podcasting  By Zach, Andrew and Sam.
10/24/09CK The Open Ontology Repository Initiative: Requirements and Research Challenges Ken Baclawski Todd Schneider.
WEB 2.0 PATTERNS Carolina Marin. Content  Introduction  The Participation-Collaboration Pattern  The Collaborative Tagging Pattern.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
MNP1163 (Software Construction).  SDLC and Construction Models  Construction Planning  Construction Measurement.
What is Artificial Intelligence?
DANIELA KOLAROVA INSTITUTE OF INFORMATION TECHNOLOGIES, BAS Multimedia Semantics and the Semantic Web.
On-To-Knowledge review Juan-Les-Pins/France, October 06, 2000 Hans Akkermans, VUA Hans-Peter Schnurr, AIFB Rudi Studer, AIFB York Sure, AIFB KMKMMethodology.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
Chapter 7 K NOWLEDGE R EPRESENTATION, O NTOLOGICAL E NGINEERING, AND T OPIC M APS L EO O BRST AND H OWARD L IU.
Be.wi-ol.de User-friendly ontology design Nikolai Dahlem Universität Oldenburg.
Semantic Data Extraction for B2B Integration Syntactic-to-Semantic Middleware Bruno Silva 1, Jorge Cardoso 2 1 2
Ontologies for the Semantic Web Prepared By: Tseliso Molukanele Rapelang Rabana Supervisor: Associate Professor Sonia Burman 20 July 2005.
Supporting Collaborative Ontology Development in Protégé International Semantic Web Conference 2008 Tania Tudorache, Natalya F. Noy, Mark A. Musen Stanford.
The Agricultural Ontology Server (AOS) A Tool for Facilitating Access to Knowledge AGRIS/CARIS and Documentation Group Food and Agriculture Organization.
Big Data: Every Word Managing Data Data Mining TerminologyData Collection CrowdsourcingSecurity & Validation Universal Translation Monolingual Dictionaries.
The National Center for Biomedical Ontology
Modern Systems Analysis and Design Third Edition
CASE Tools and Joint and Rapid Application Development
Development of the Amphibian Anatomical Ontology
A Guide to SDG Interactions: from Science to Implementation
Chapter 1 The Systems Development Environment
Design, prototyping and construction
Modern Systems Analysis and Design Third Edition
An Introduction to Data Warehousing
Modern Systems Analysis and Design Third Edition
Ontology-Based Approaches to Data Integration
The National Center for Biomedical Ontology
Design, prototyping and construction
Presentation transcript:

Mark A. Musen Stanford University Musen@Stanford.EDU The End of a Cottage Industry: The coming industrial revolution for biomedical ontologies Mark A. Musen Stanford University Musen@Stanford.EDU

Major technologies have radically changed our culture Agriculture The printing press The Industrial Revolution The World Wide Web

Major technologies have radically changed our culture Agriculture The printing press The Industrial Revolution The World Wide Web Computer-based representation of and access to knowledge?

The locus of knowledge publication determines knowledge “ownership” When textual information could be reproduced only by hand, knowledge effectively was owned by institutions such as the Church When textual information could be printed, knowledge was owned by those with printing presses and a means of distribution When textual information could be posted to the Web, knowledge began to become democratized

Knowledge workers seem trapped in a pre-industrial age Most ontologies are of relatively small scale Most ontologies are built and refined by small groups working arduously in isolation Success rests heavily on the particular talents of individual artisans, rather than on standard operating procedures There are few technologies on the horizon to make this process “faster, better, cheaper”

A Portion of the OBO Library

Throughout this cottage industry Lots of ontology development, principally by content experts with little training in conceptual modeling Use of development tools and ontology-definition languages that may be Extremely limited in their expressiveness Useless for detecting potential errors and guiding correction Nonadherent to recognized standards Proprietary and expensive

Our community needs Technologies Processes To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

Some people think that we are already there …

Our community needs Technologies Processes To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

Ontologies need to support multiple end-user goals Summarization and annotation of data Integration of data from multiple sources Support for natural-language processing Mediation among different software components Formal specification of biomedical knowledge

The paradox of ontology development Ontologies became popularized in biomedicine in part because tools such as DAG-Edit made development extremely manageable Developers of editing tools and languages have rushed to make their approaches accommodate more expressivity and to offer more power—and to comply with industry standards The result is the “Microsoft Word” problem

The NCI Thesaurus in OWL

We need steam engines for ontology development DAGs are too simple for developers to define specific concepts in machine-processable terms OWL is much too complex for most developers to use correctly There are no scalable tools that address the early, conceptual modeling stage How can we maximize expressivity while helping developers to manage complexity?

Our community needs Processes Technologies To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

We need to relate ontologies to one another We keep reinventing the wheel (e.g., how many different vertebrate anatomy ontologies do we need?) We don’t even know what’s out there! We need to be able to make comparisons between ontologies automatically We need to keep track of ontology history and to compare versions

We need to compute both similarities and differences Merging ontologies Mapping ontologies Differences Versioning

Different tasks lead to different tools B A B A B C=Merge(A, B) Map(A, B) Articulation ontology iPROMPT, Chimaera Anchor-PROMPT, GLUE FCA-Merge ONION

Industrialization requires Common platforms for locating, comparing, and integrating ontologies Environments for ontology engineering that are as comprehensive and robust as our environments for software engineering Technologies that can work with ontologies distributed anywhere in cyberspace

Ontology development is already a global activity!

Our community needs Technologies Processes To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

Ontology engineering requires management of complexity How can we keep track of hundreds, or even thousands, of relationships? How can we understand the implications of changes to a large ontology? How can we know where ontologies are underspecified? And where they are over constrained?

AT&T’s GraphViz system

It’s a bad sign that there are so many alternatives How do we know which visualization system is the “right” one for our situation? Why is there no visualization system that is uniformly loved and appreciated? Why can’t we apply the same energy to the problem of ontology visualization that we apply to that of visualizing huge data sets?

Our community needs Technologies Processes To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

Ontologies are not like journal articles It is difficult to judge methodological soundness simply by inspection We may wish to use an ontology even though some portions Are not well designed Make distinctions that are different from those that we might want

Ontologies are not like journal articles II The utility of ontologies Depends on the task May be highly subjective The expertise and biases of reviewers may vary widely with respect to different portions of an ontology Users should want the opinions of more than 2–3 hand-selected reviewers Peer review needs to scale to the entire user community

Solution Snapshot

In an “open” rating system: Anyone can annotate an ontology to say anything that one would like Users can “rate the raters” to express preferences for those reviewers whom they trust A “web of trust” may allow users to create transitive trust relationships to filter unwanted reviews

Possible Review Criteria What is the level of user support? What documentation is available? What is the granularity of the ontology content in specific areas? How well does the ontology cover a particular domain? In what applications has the ontology been used successfully? Where has it failed?

Ontologies need standard meta-data For provenance information For indexing For alignment with other ontologies For peer review

We need platforms for acquiring and displaying meta-data

Bringing ontologies to the industrial age will require: Environments that support community-based peer review Standard meta-data for storing reviews and annotations Environments for both ontology engineering and ontology access that can take advantage of these meta-data

Our community needs Technologies Processes To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

We have growing experience with large-scale ontology engineering CYC Open Directory Project Gene Ontology Consortium NCI Thesaurus

NCI Process for Ontology Editing and Maintenance : NCI Process for Ontology Editing and Maintenance

Evaluation of development processes remains a problem What are appropriate outcome metrics for judging success? How do we distinguish the contribution of the process from that of particular tools? How do we distinguish the contribution of the process from that of individual participants?

Our community needs Technologies Processes To help build and extend ontologies To locate ontologies and to relate them to one another To visualize relationships and to aid understanding To facilitate evaluation and annotation of ontologies Processes To aid in ontology management and evolution To enable end users to incorporate ontologies in their professional activities

A Portion of the OBO Library

Toward industrial-strength ontology repositories

The Industrial Revolution: The Good News Standardized, interchangeable parts Technologies for creating new technologies Tremendous increase in output Unparalleled incentives for innovation

The Industrial Revolution: The Bad News Decreased importance of skill and judgment of individual artisans Increased abilities of managers to define and control activities of laborers Loss of skills and judgment to deal with failures in processes that have been automated More mundane work

Moving from cottage industry to the industrial age There must be widely available tools that are open-source easy to use adhere to standards There must be a large community of workers who use the tools can provide feedback to one another and to the tool builders

Moving from cottage industry to the industrial age II Government and professional societies must set expectations regarding the need for appropriate standards Government and professional societies must invest in educational programs targeted for Ontology developers Ontology consumers Demonstration projects must document the strengths and weaknesses of tools, processes, and languages

A thousand flowers are blooming! Ontologies are being developed by interested groups from every sector of academia, industry, and government Many of these ontologies have been proven to be extraordinarily useful to wide communities We finally have tools and representation languages that can enable us to create durable and maintainable ontologies with rich semantic content

The foundation is in place Scientific culture now recognizes the importance of ontologies We are beginning to articulate best practices for ontology construction We have a burgeoning cottage industry at work

We need to move beyond individual, one-off ontologies and one-off tools to: Integrated ontology libraries in cyberspace Meta-data standards for ontology annotation Comprehensive methods for ontology indexing and retrieval Easy-to-use portals for ontology access, annotation, and peer review End-user platforms for putting ontologies to use for Data annotation Decision support Natural-language processing Information retrieval And applications that we have not yet thought of!