1 Harmonizing Taxonomies: Draft for Discussion at the OASIS eGov Technical Committee Meeting. Brand Niemann, US Environmental Protection Agency. January 6, 2004
2 Why Do We Need Taxonomies? The U.S. Federal Government has a Chief Information Officer (CIO) Council: –The former Chair was Mark Forman, the eGov Chief, and the Vice Chair is Karen Evans, who is now the new eGov Chief! Karen Evans supports “Semantic XML Web Services” in her December 17, 2002, CIO Council Vision Statement: –“We see the Council’s mission as … developing taxonomy and XML data definitions that apply across government … so the information we create can be shared and easily accessed regardless of its origins.”
3 Why Do We Need Taxonomies? The eGov Act of 2002: –SEC ACCESSIBILITY, USABILITY, AND PRESERVATION OF GOVERNMENT INFORMATION. (a) PURPOSE.—The purpose of this section is to improve the methods by which Government information, including information on the Internet, is organized, preserved, and made accessible to the public. (b) DEFINITIONS.—In this section, the term— (1) ‘‘Committee’’ means the Interagency Committee on Government Information established under subsection (c); and (2) ‘‘directory’’ means a taxonomy of subjects linked to websites that— (A) organizes Government information on the Internet according to subject matter; and (B) may be created with the participation of human editors.
4 Why Do We Need Taxonomies? The eGov Act of 2002 (continued): –SEC ACCESSIBILITY, USABILITY, AND PRESERVATION OF GOVERNMENT INFORMATION. (d) CATEGORIZING OF INFORMATION.— (1) COMMITTEE FUNCTIONS.—Not later than 2 years after the date of enactment of this Act, the Committee shall submit recommendations to the Director on— (A) the adoption of standards, which are open to the maximum extent feasible, to enable the organization and categorization of Government information— (i) in a way that is searchable electronically, including by searchable identifiers; and (ii) in ways that are interoperable across agencies; (B) the definition of categories of Government information which should be classified under the standards; and (C) determining priorities and developing schedules for the initial implementation of the standards by agencies.
5 Why Do We Need Taxonomies? The U.S. Federal Enterprise Architecture (FEA) Data and Information Reference Model (DRM): –Volume 1 – Bob Haycock, OMB Chief Architect, will soon release it with guidance to the agencies. The E-Government Act of 2002, Section 207 (Interagency Committee on Government Information), will use the top two layers of the DRM structure for categorization of government information (see next slide). The E-Government Act of 2002, Section 212, calls for a series of no more than 5 pilot projects that integrate data elements to encourage integrated collection and management of data and interoperability of Federal information systems. –Data Management Strategy – In process, with a draft to be released soon. There are several critiques of the ISO model intended to improve the DRM Model, including MetaMatrix's suggested use of the Meta Object Facility (MOF) from the Object Management Group (OMG) (see slide 7). –Volumes 2-4 – Released by July: DRM business context, DRM information exchange, and DRM data elements.
6 The Current DRM Model A model for discovery of information: –Context and classification. –To determine available packages and elements. A model for exchange of information: –Information packages, built from common data elements. –Sharing mechanism. A model for representation of information: –Data elements defined in a standard way. [Diagram: three layers. BUSINESS CONTEXT: Subject Area, Super Type. BUSINESS DATA FLOW: Information Exchange Package. DATA ELEMENT: Data Object, Data Property, Data Representation (ISO 11179).]
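The three layers on slide 6 can be pictured as nested records, with discovery working from the business context down to the packages. The sketch below is illustrative only: the class names, field names, the `discover` helper, and the sample values ("Environment", "FacilityReport", and so on) are assumptions made for this example and are not taken from the DRM volumes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataElement:
    """Representation layer: one data element (object, property, representation)."""
    data_object: str
    data_property: str
    representation: str


@dataclass
class ExchangePackage:
    """Exchange layer: a package built from common data elements."""
    name: str
    elements: List[DataElement] = field(default_factory=list)


@dataclass
class BusinessContext:
    """Discovery layer: the two-level context (subject area, super type)."""
    subject_area: str
    super_type: str
    packages: List[ExchangePackage] = field(default_factory=list)


def discover(contexts: List[BusinessContext], subject_area: str) -> List[str]:
    """Return the names of packages filed under the given subject area."""
    return [
        package.name
        for context in contexts
        if context.subject_area == subject_area
        for package in context.packages
    ]


# Hypothetical sample data, invented for this sketch.
facility_name = DataElement("Facility", "Name", "text")
report = ExchangePackage("FacilityReport", [facility_name])
contexts = [
    BusinessContext("Environment", "Location", [report]),
    BusinessContext("Health", "Person", []),
]

print(discover(contexts, "Environment"))
```

The point of the sketch is only that context and classification sit above the packages, and packages sit above the data elements; a real registry would carry far more metadata per layer.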
7 Expanding the DRM Model MetaMatrix vision: –Generic classification to tag metadata with context (vs. 2-level context). –Packages built from complex datatypes and deployable for exchange or data access (vs. exchange-only packaging of ISO data elements). –Formal datatype model (vs. more conceptual ISO model). –Formal reference information to add semantic value to data definitions (vs. nothing). [Diagram: DRM Model compared with MetaMatrix Model. DRM Model: BUSINESS CONTEXT (Subject Area, Super Type); BUSINESS DATA FLOW (Info Exch Package); DATA ELEMENT (Data Object, Data Property, Data Representation; ISO). MetaMatrix Model labels: CLASSIFICATION, Context, PACKAGE, Virtual Database, Category, REFERENCE, Glossary, Thesaurus, Bibliography, Exchange Package, TYPE, Complex Datatype, Abstract Datatype, Simple Datatype, Schema/Association, INSTANCE, Transform, Virtual, Physical.]
8 What is a Taxonomy? Goals for enterprise taxonomies, from Tim Berners-Lee, ISWC 2003: –Regardless of end goals, look to a future where taxonomies interoperate (domains connect). –Expect new stakeholders to take an interest … but have their own viewpoints. –Technology recommendation: RDF(S).
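To make the RDF(S) recommendation on slide 8 concrete, a broader/narrower link between two concepts can be written as an `rdfs:subClassOf` statement. The sketch below builds a small Turtle document with plain string formatting and no RDF library. The `ex:` namespace, the `to_turtle` helper, and the sample concepts are assumptions for illustration; only the `rdfs:` namespace URI comes from the W3C vocabulary.

```python
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
EX_NS = "http://example.org/taxonomy#"  # hypothetical namespace for this sketch


def to_turtle(pairs):
    """Serialize (narrower, broader) concept pairs as rdfs:subClassOf statements."""
    lines = [
        f"@prefix rdfs: <{RDFS_NS}> .",
        f"@prefix ex: <{EX_NS}> .",
        "",
    ]
    for narrower, broader in pairs:
        lines.append(f"ex:{narrower} rdfs:subClassOf ex:{broader} .")
    return "\n".join(lines)


# Hypothetical sample concepts.
PAIRS = [("Horse", "Animal"), ("Sheep", "Animal")]

print(to_turtle(PAIRS))
```

Because every concept is named by a URI, two agencies publishing statements like these can link their taxonomies without first agreeing on a shared file format, which is one reading of "domains connect".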
9 What is a Taxonomy? A taxonomy is a model of knowledge organized as a hierarchical arrangement (tree structure) of concepts: –Parent nodes denote more general ideas than their children. [Diagram: two alternative trees. [A] animal -> horse (mare, stallion), sheep (ewe, ram); OR [B] animal -> horse (dales pony, arabian horse), sheep (swaledale, cheviot).]
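The tree structure on slide 9 can be sketched as a child-to-parent map, where walking upward from any concept reaches ever more general ideas. The example uses the animal/horse/sheep labels from tree [A]; the `PARENT` table and the `ancestors` and `is_a` helpers are names invented for this sketch.

```python
# Child -> parent links for tree [A]; the root ("animal") has no entry.
PARENT = {
    "horse": "animal",
    "sheep": "animal",
    "mare": "horse",
    "stallion": "horse",
    "ewe": "sheep",
    "ram": "sheep",
}


def ancestors(concept):
    """Return the chain of more general concepts, nearest parent first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def is_a(concept, general):
    """True if `general` lies somewhere above `concept` in the tree."""
    return general in ancestors(concept)


print(ancestors("mare"))      # ['horse', 'animal']
print(is_a("ewe", "animal"))  # True
print(is_a("ewe", "horse"))   # False
```

Tree [B] would use the same structure with breeds in place of sex-based terms, which shows that one parent concept can be subdivided along different criteria.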
10 What is a Taxonomy? A taxonomy can be: –A classification hierarchy, e.g.: Natural Taxonomy: Unique Beginner (plant) -> Life-Form (bush) -> Generic (rose) -> Specific (hybrid tea) -> Varietal (Peace) –A part hierarchy (Meronomy) –A category hierarchy Taxonomies can intersect – intersection means there are different relationships at work. [Diagram: "building" and "holy place" hierarchies over cinema, office block, synagogue, mosque, pub, church, shrine.] Reference: D.A. Cruse, “Lexical Semantics”, Cambridge University Press, 1986
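Intersecting taxonomies can be sketched with sets: each top concept owns a set of narrower concepts, and the overlap is whatever both sets contain. The flattened diagram on slide 10 lists the labels but not the links, so the membership below (synagogue, mosque, and church under both "building" and "holy place", shrine under "holy place" only) is an assumption, as are the `TAXONOMIES` table and the `shared` helper.

```python
# Assumed membership; the source diagram does not preserve the links.
TAXONOMIES = {
    "building": {"cinema", "office block", "synagogue", "mosque", "pub", "church"},
    "holy place": {"synagogue", "mosque", "church", "shrine"},
}


def shared(first, second):
    """Return the concepts that appear under both top-level concepts, sorted."""
    return sorted(TAXONOMIES[first] & TAXONOMIES[second])


def only_in(first, second):
    """Return the concepts under `first` that are absent from `second`, sorted."""
    return sorted(TAXONOMIES[first] - TAXONOMIES[second])


print(shared("building", "holy place"))   # ['church', 'mosque', 'synagogue']
print(only_in("holy place", "building"))  # ['shrine']
```

A shared concept such as "church" is classified by physical form in one hierarchy and by function in the other, which is the sense in which different relationships are at work.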
11 topSAIL/tdf™ – Taxonomy Development Framework: A five-step method for taxonomy development 1. Focus: What is the taxonomy for? What business challenges will it overcome? What results will it achieve? How to measure stakeholder benefit? 2. Analysis: What is the context for the taxonomy? What are the types & sources of knowledge? How does knowledge map to processes? 3. Design: What types of taxonomy concepts are needed? What to do first? What system capabilities are needed? What will be the impact? Is the taxonomy design correct, complete and consistent? 4. Construct: Have we enough content mapped? How to connect taxonomies to content? How to integrate with IT systems? 5. Deploy: How do we ensure there will be feedback for assessment? Have we accomplished set objectives? What should be done next?
12 What Are Some Next Steps? Identify Existing eGovernment Taxonomies: –E.g., U.S. Federal Enterprise Architecture Business Reference Model (BRM), etc. Encourage Building of eGovernment Taxonomies Through Best Practice Examples and Training Materials. Pilot the Harmonization of Several eGovernment Taxonomies.