Environmental Health Science - Cross Domain Ontology Research (EHS-CORE) Project
Collaborative Expedition Workshop #38, February 22, 2005, National Science Foundation
Jane Greenberg, Associate Professor, School of Information and Library Science, University of North Carolina at Chapel Hill (SILS/UNC-CH)
Abe Crystal, Research Assistant and Doctoral Student, SILS/UNC-CH
W. Davenport Robertson, Library Director, National Institute of Environmental Health Sciences

Obesity and the Built Environment: An Interdisciplinary Challenge
- Obesity in America has become an “epidemic” (Health and Human Services Secretary Tommy Thompson)
- Accounts for more than 300,000 premature deaths each year and direct health care costs in excess of $61 billion
- Burden significantly greater in the lower socioeconomic strata and in minority and vulnerable populations
- Promising solution: integrate physical activity into daily life by improving the built environment, the physical surroundings in which one lives and works
- Interdisciplinary nature of obesity and the built environment

Problem: “Information Silos”
- Researchers are unaware of useful data and literature sources in related disciplines, beyond their immediate scope, because they are confronted with information silos
  - Scenario 1: we know it’s there, but “it’s roll the dice whether or not we find it”
  - Scenario 2: we don’t know it’s there (student PubMed search misses many relevant databases)
- Researchers aware of resources in other domains must locate all relevant and independent data sources, interact with each data source in isolation, and manually combine results

Problem impact
Researchers face:
- A labor-intensive and inefficient interdisciplinary research experience (hard to find and integrate data and literature from outside their own domain)
- Difficulty in locating “undiscovered public knowledge” (Swanson, 1986): research from disparate disciplines that, when combined, can solve an open problem
- Duplicative research resulting from a lack of knowledge about research in related, pertinent disciplines

Solution: information integration
Research goals of proposed project:
- Integrate existing domain-specific ontologies to provide uniform intellectual access to interdisciplinary data and literature on obesity and the built environment.
- Use Semantic Web metadata and technologies to provide powerful querying and inferencing capabilities on the integrated ontology.
- Develop an ontology server capable of dynamically incorporating changes (i.e., “just-in-time” integration) in domain-specific ontologies (e.g., new or revised vocabularies) into the integrated ontology.

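
As a rough illustration of what integrating domain-specific vocabularies could look like, the Python sketch below maps terms from two domains onto shared concept identifiers and expands a query across both. The vocabularies, terms, and concept IDs are invented for illustration and are not taken from the project; a real system would work with OWL ontologies rather than dictionaries.

```python
from collections import defaultdict

# Hypothetical domain vocabularies: local term -> shared concept ID.
# All terms and concept IDs here are invented for illustration.
PUBLIC_HEALTH_TERMS = {
    "neighborhood environment": "concept:built_environment",
    "obesity": "concept:obesity",
}
URBAN_PLANNING_TERMS = {
    "urban form": "concept:built_environment",
    "land use mix": "concept:land_use",
}


def integrate(*vocabularies):
    """Merge domain vocabularies into one index: concept ID -> set of local terms."""
    integrated = defaultdict(set)
    for vocab in vocabularies:
        for term, concept in vocab.items():
            integrated[concept].add(term)
    return dict(integrated)


def expand_query(term, vocabularies, integrated):
    """Return every term, in any domain, that shares a concept with `term`."""
    for vocab in vocabularies:
        if term in vocab:
            return sorted(integrated[vocab[term]])
    return [term]  # unknown terms pass through unchanged


vocabularies = (PUBLIC_HEALTH_TERMS, URBAN_PLANNING_TERMS)
integrated = integrate(*vocabularies)
expanded = expand_query("urban form", vocabularies, integrated)
print(expanded)
```

A planner searching for "urban form" would, under these assumed mappings, also retrieve material indexed under the public health term "neighborhood environment".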
Proposed Research Team
- Domain science (nutrition and public health): UNC School of Public Health, Active Living by Design
- Ontology engineering and systems development (computer science): MINDSWAP/UMD
- Ontology and Web semantics development and evaluation (information science): Metadata Research Center/SILS/UNC-CH

Information Integration: Ontological Solutions
Functional criteria
- Integrate ontologies from different domains/disciplines, using standard languages such as OWL
- Provide access to disparate and distributed data and literature
- Update vocabulary dynamically (on the fly, or at frequent intervals) based on changes in host ontologies

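
The dynamic-update criterion can be sketched as a comparison between two snapshots of a host vocabulary, with the differences applied to the integrated copy. This is a minimal Python sketch under assumed data structures (plain dictionaries of term to concept ID); the function names and sample terms are hypothetical, and the project's actual update mechanism is not described in these slides.

```python
def diff_vocabulary(old, new):
    """Compare two snapshots of a host vocabulary (term -> concept ID)."""
    added = {t: c for t, c in new.items() if t not in old}
    removed = sorted(t for t in old if t not in new)
    remapped = {t: c for t, c in new.items() if t in old and old[t] != c}
    return added, removed, remapped


def apply_changes(integrated, added, removed, remapped):
    """Return an updated copy of the integrated vocabulary."""
    updated = dict(integrated)
    for term in removed:
        updated.pop(term, None)
    updated.update(added)
    updated.update(remapped)
    return updated


# Hypothetical snapshots of one host vocabulary, before and after a revision.
old_snapshot = {
    "obesity": "concept:obesity",
    "sprawl": "concept:urban_sprawl",
}
new_snapshot = {
    "obesity": "concept:obesity",
    "urban sprawl": "concept:urban_sprawl",
    "walkability": "concept:walkability",
}

added, removed, remapped = diff_vocabulary(old_snapshot, new_snapshot)
updated = apply_changes(old_snapshot, added, removed, remapped)
print(removed, sorted(added))
```

Running the comparison on a schedule would correspond to updating "at frequent intervals"; triggering it whenever a host ontology publishes a revision would correspond to updating "on the fly".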
Information Integration: Ontological Solutions (2)
Technical criteria
- The components must be openly accessible, preferably open source, and listed in a standard registry.
- They must use open enabling technologies and standards, such as:
  - Uniform Resource Identifiers (URIs)
  - Resource Description Framework (RDF), RDFS, and OWL (Web Ontology Language)

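
To make these standards concrete, the following Python sketch represents a few RDF statements as subject-predicate-object triples identified by URIs and serializes them as N-Triples. The `owl:equivalentClass` and `rdfs:label` properties are standard; the `example.org` namespaces and class names are hypothetical, and the small serializer is an illustration rather than a full RDF library.

```python
OWL = "http://www.w3.org/2002/07/owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
HEALTH = "http://example.org/health#"      # hypothetical namespace
PLANNING = "http://example.org/planning#"  # hypothetical namespace


class Literal(str):
    """Marks a plain string value, as opposed to a URI."""


triples = [
    (HEALTH + "NeighborhoodEnvironment", RDFS + "label", Literal("neighborhood environment")),
    (PLANNING + "UrbanForm", RDFS + "label", Literal("urban form")),
    # Assert that the two domain classes denote the same concept.
    (HEALTH + "NeighborhoodEnvironment", OWL + "equivalentClass", PLANNING + "UrbanForm"),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


ntriples = to_ntriples(triples)
print(ntriples)
```

Because every class and property is named by a URI, statements from independently maintained ontologies can be combined in one graph without name clashes.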
Implementation
- Domain research: multi-method approach (interviews, log analysis…)
- Ontology mapping: standardization, pruning, mapping, testing, reviewing, etc.
- Ontology server: define functional requirements, system architecture, prototyping, evaluation
- Document cataloging: document sampling, cataloging (Dublin Core), metadata evaluation
- Unified interface: define functional requirements, prototyping, connect to ontology server, usability testing

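
For the document cataloging step, a Dublin Core record might look something like the output of the Python sketch below, which uses the standard library's `xml.etree.ElementTree` and the Dublin Core element namespace. The `record` wrapper element and the `subject` value are assumptions made for illustration; the title, creator, and date come from the Swanson entry in the reference list.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML record with one dc:* element per (name, value) pair."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            element.text = value
    return record


fields = {
    "title": ["Undiscovered Public Knowledge"],
    "creator": ["Swanson, D. R."],
    "date": ["1986"],
    "subject": ["literature-based discovery"],  # hypothetical subject term
}

record = dublin_core_record(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

In an integrated system the `subject` values would presumably be drawn from the mapped ontology, so that documents cataloged in different domains become retrievable through shared concepts.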
Three Key Impacts
- Addresses a major social problem, epidemic obesity
- Validates an approach to dynamic ontological integration, which may be applicable to many domains
- Facilitates cross-domain research, leading to increased scientific productivity and discovery

Project Status
- Beginning preliminary fieldwork
- Pending proposals: NSF (system design and ontological integration), IMLS (user access to resource collection at ALbD)
- Environmental Health Science Thesaurus Forum (buy-in by many)

Selected References
Greenberg, J. (2004a). Metadata Extraction and Harvesting: A Comparison of Two Automatic Metadata Generation Applications. Journal of Internet Cataloging, 6(4):
Gruber, T. R. (1993). A Translation Approach to Portable Ontology Specification. Knowledge Acquisition, 5:
Gruber, T. R. (1994). Toward Principles for the Design of Ontologies Used for Knowledge Sharing. IJHCS, 43(5/6):
Guarino, N. (1998). Formal Ontology and Information Systems. In: N. Guarino, editor, Proceedings of the 1st International Conference on Formal Ontologies in Information Systems, FOIS '98, Trento, Italy, June, 1998, IOS Press, pp
Kalyanpur, A., Sirin, E., Parsia, B., and Hendler, J. (2004). Hypermedia inspired Ontology Engineering Environment: Swoop. Submitted to ISWC 2004 as a poster. [Online]. Available
Lauser, B., Wildemann, T., Poulos, A., Fisseha, F., Keizer, J., and Katz, S. A Comprehensive Framework for Building Multilingual Domain Ontologies: Creating a Prototype Biosecurity Ontology. In Proceedings of the International Conference on Dublin Core and Metadata for e-Communities, 2002, Florence, Italy. October Firenze: Firenze University Press, pp , [Online]
Robertson, W. D., and Greenberg, J. (2004). Architecting a Cross-Disciplinary Thesaurus for the Semantic Web. DC-2004: Metadata across Languages and Cultures. Proceedings of the International Conference on Dublin Core and Metadata Applications, October 11-14, 2004, Shanghai, China. Shanghai: Shanghai Scientific & Technological Literature Publishing House, pp
Sowa, J. F. (2002). Ontology, Metadata, and Semiotics, International Conference on Conceptual Structures, ICCS '2000, August 14-18, Darmstadt, Germany.
Swanson, D. R. (1986). Undiscovered Public Knowledge. Library Quarterly, 56: