Vocabulary Markup Language (Voc-ML) Project Joseph A. Busch Content Intelligence Evangelist Interwoven
Agenda Background Revision summary Issues and Next steps
Soergel’s SemWeb Proposal System of integrated access to data on concepts and terminology. Bring together variety of sources that exist largely in separate worlds, including dictionaries, thesauri, classification schemes, etc. Federated system with multiple collaborators. Common interface to all concept & terminology knowledge bases on the Internet.
The Real Semantic Web Namespace for uniquely identifying a semantic scheme & each concept within each scheme. Broad template or conceptual schema for holding all types of semantic information & specifying relationships among them. Definitions of services for interacting with the System.
Vocabulary Markup Language (Voc-ML) XML schema for the Semantic Web. Broad template for structured representation of semantic schemes. Dublin Core metadata. Tags and syntax for uniquely identifying each concept. Typed relationships (hierarchical, associative, etc.). Typed notes. Host agency: Networked Knowledge Organization Systems (nkos.slis.kent.edu)
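As a rough illustration of the ingredients listed on this slide (a unique concept ID, typed relationships, typed notes), the sketch below builds one concept record with Python's standard xml.etree.ElementTree. The element and attribute names (Concept, Label, Relationship, Note, id, type, ref) and the sample IDs are assumptions made for illustration; they are not taken from the Voc-ML schema itself.

```python
import xml.etree.ElementTree as ET


def build_concept(uid, label, relationships, notes):
    """Build one concept element with typed relationships and typed notes.

    All tag and attribute names here are hypothetical placeholders,
    not the actual Voc-ML vocabulary.
    """
    concept = ET.Element("Concept", {"id": uid})
    ET.SubElement(concept, "Label").text = label
    # Each relationship carries its type (hierarchical, associative, ...)
    # and a reference to the unique ID of the target concept.
    for rel_type, target_id in relationships:
        ET.SubElement(concept, "Relationship", {"type": rel_type, "ref": target_id})
    # Each note carries its type (definition, scope, ...).
    for note_type, text in notes:
        ET.SubElement(concept, "Note", {"type": note_type}).text = text
    return concept


# Made-up sample data: "ex:" IDs are illustrative only.
concept = build_concept(
    "ex:c2",
    "Field crops",
    relationships=[("hierarchical", "ex:c1"), ("associative", "ex:c9")],
    notes=[("definition", "Establishments producing field crops.")],
)
concept_xml = ET.tostring(concept, encoding="unicode")
print(concept_xml)
```

Keeping the relationship and note types in attributes rather than in element names means a new type can be introduced without changing the set of tags.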
DFSIC-1998 Standard Industrial Classification (1987) Interwoven U.S. Department of Commerce … Field Crops, except Cash Grains, not elsewhere classified Establishments primarily engaged in the production of field crops, except cash grains, not elsewhere classified. This industry also includes establishments deriving 50 percent or more of their total value of sales of agricultural products from field crops, except cash grains (Industry Group 013), but less than 50 percent from products of any single industry … Dublin Core Unique ID Typed Relationships
Voc-ML Version 1.7 Revisions Added editDate and source attributes to all “customer” updatable elements. Removed most remnants of Datafusion Concept Catalog, including CCID, XWalkHeader, etc.
Voc-ML Version 1.8 Revisions Defined and commented Path ID syntax and usage. PID allows encoding of complex polyhierarchies with context-dependent children, e.g., MeSH.* *This is still in flux. Currently all path information can be encoded in parent/child tags, but current Interwoven software will not read path tags.
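To make "context-dependent children" concrete, the sketch below models a polyhierarchy in which the same concept has different children depending on the path by which it was reached. The PID syntax used here (concept IDs joined with "/") and the names make_pid and PathIndex are assumptions for illustration; the slide does not give the actual Path ID syntax.

```python
from collections import defaultdict


def make_pid(*concept_ids):
    """Join concept IDs into a path ID. The '/' separator is an assumed syntax."""
    return "/".join(concept_ids)


class PathIndex:
    """Maps a full path ID to the children that are valid in that context."""

    def __init__(self):
        self._children = defaultdict(list)

    def add_child(self, pid, child_id):
        self._children[pid].append(child_id)

    def children(self, pid):
        # Return a copy so callers cannot mutate the index by accident.
        return list(self._children.get(pid, []))


index = PathIndex()
# Concept "C3" sits under two parents, "A" and "B" (a polyhierarchy).
# Its children differ depending on which parent it was reached through.
index.add_child(make_pid("A", "C3"), "D1")
index.add_child(make_pid("B", "C3"), "D2")

print(index.children("A/C3"))
print(index.children("B/C3"))
```

Plain parent/child links keyed only on the concept ID would merge the two child lists; keying on the whole path keeps them apart.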
Voc-ML Version 1.9 Revisions Removed more excess elements from Datafusion Concept Catalog, including CCLoadFile, ForbiddenIDs. Removed all instances of CTYPE attribute. (New ID reference semantics). Changed optional Note & Misc elements to be allowed only once, not many times.* *This is still in flux. Currently reflects what Interwoven software does, not what the standard should do.
Voc-ML Version 1.95 Revisions Added editHistory and Edit elements in SVHeader. Added editDate and Source attributes to Path element. Edited and updated comments on Path, Parent, & Child to reflect new semantics of the ID reference attributes in these elements. Added comments on commonHeaderBlock parameter entity to explain Dublin Core and cite DC reference. Cleaned up many other comments for readability.
Voc-ML Version 1.99 Revisions Added xmlns:dc attribute to root SrcVocab element. Changed PID attribute in path to CDATA. Changed UREF attribute in child to CDATA. Made editHistory in the SVHeader optional.
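The xmlns:dc declaration on the root element can be pictured with the short sketch below, which serializes a SrcVocab root containing an SVHeader with one Dublin Core element. The element names SrcVocab and SVHeader come from the slides; placing a dc:title directly inside SVHeader and the title text are assumptions for illustration. The namespace URI is the standard Dublin Core Metadata Element Set 1.1 namespace.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize the Dublin Core namespace with the "dc" prefix.
ET.register_namespace("dc", DC_NS)

root = ET.Element("SrcVocab")
header = ET.SubElement(root, "SVHeader")
# Assumed layout: a Dublin Core title placed directly in the header.
title = ET.SubElement(header, "{%s}title" % DC_NS)
title.text = "Example vocabulary"

# ElementTree emits the namespace declaration on the serialized root element.
src_vocab_xml = ET.tostring(root, encoding="unicode")
print(src_vocab_xml)
```

Declaring the prefix once on the root lets every Dublin Core element further down the document use the short dc: form.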
Voc-ML Issues UID syntax (Light, 5/25/01) Topic maps vs Voc-ML Functions to be served by standards for machine-readable thesauri (Soergel, 11/21/01) Data input, transfer Query and view URIs Normalization vs specialization (ASIS meeting, 11/13/01) Type all relationships (remove Parent, Child, RelatedTerm elements) Type all notes (remove Definition element, add type for Note)
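The normalization proposal (replace Parent, Child, RelatedTerm with one typed relationship element, and Definition with a typed Note) could look roughly like the sketch below. The element names Parent, Child, RelatedTerm, Definition, and Note come from the slides; the replacement element name Relationship, the type values (broader, narrower, related, definition), and the sample record are assumptions for illustration, not part of any published Voc-ML version.

```python
import copy
import xml.etree.ElementTree as ET

# Assumed mapping from legacy element names to relationship type values.
LEGACY_TO_TYPE = {"Parent": "broader", "Child": "narrower", "RelatedTerm": "related"}


def normalize_concept(concept):
    """Return a new element with relationships and definitions expressed as typed elements."""
    out = ET.Element(concept.tag, dict(concept.attrib))
    for el in concept:
        if el.tag in LEGACY_TO_TYPE:
            attrs = dict(el.attrib)
            attrs["type"] = LEGACY_TO_TYPE[el.tag]
            new = ET.SubElement(out, "Relationship", attrs)
            new.text = el.text
        elif el.tag == "Definition":
            attrs = dict(el.attrib)
            attrs["type"] = "definition"
            new = ET.SubElement(out, "Note", attrs)
            new.text = el.text
        else:
            # Copy untouched elements so the input tree is left unmodified.
            out.append(copy.deepcopy(el))
    return out


# Made-up legacy record for illustration.
legacy = ET.fromstring(
    '<Concept id="c2"><Label>Sample</Label>'
    '<Parent ref="c1"/><Child ref="c3"/><RelatedTerm ref="c9"/>'
    '<Definition>Sample definition.</Definition></Concept>'
)
typed = normalize_concept(legacy)
print(ET.tostring(typed, encoding="unicode"))
```

The trade-off raised on the slide shows up here: the normalized form has fewer element names to learn, but consumers must agree on the set of allowed type values.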
Proposed Next Steps Post Voc-ML 2.0 Provide examples of marked-up resources Prepare W3C RFC
Joseph A. Busch Director, Solutions Architecture Interwoven