Haystack: Bringing Good Metadata to Life
Dennis Quan
MIT CSAIL/IBM Watson Research
© 2004 IBM Corporation
May 22, 2004

Outline
Exposing the benefits of RDF data integration
Demonstration
Prototyping in the Haystack environment
  – Hooking in different RDF sources
  – Designing visualizations
  – Adenine scripting language
Example: Open Directory browser

Show me the metadata
Common questions regarding Semantic Web applications:
  – “Is this stuff practical?”
  – “Are you just overloading me with more information?”
  – “What can I do with this data today?”
  – “What is RDF giving me over databases and XML?”
Asked by developers, not just users and observers
Approach: easy prototyping environment for visualizing connections within and among metadata sources
  – The “museum” approach versus the “brochure” approach

The Haystack Semantic Web browser
Allows users to create, explore, and organize RDF information spaces
  – Web browser-style navigation of Semantic Web resources
  – Metadata can be fetched from a variety of sources
  – User-selectable presentation templates (“views”)
  – Flexible bookmark management system (“collections”)
  – Access to Semantic Web Services
Research project originating from MIT CSAIL
Open Source Java project built on top of Eclipse, IBM’s Open Source rich client platform

Fictitious example: a rock star’s manager
A day in the life of the manager of the famous-physicist-turned-rock-star, Johnny Doe
Some of the backend services and data have been mocked up, but the presentation services are real
  – The point of the demonstration is to show what can be seen through Haystack, which is acting as a front end
Key concepts to watch for:
  – Views
  – Lenses
  – Collections
  – Semantic Web Services

View Ontology Web Language (VOWL)
RDF Schema, DAML+OIL, and/or OWL used to describe ontologies to Semantic Web agents
Similarly, VOWL is used to describe presentation knowledge about ontologies to user agents
  – Views: different ways of looking at resources
  – Lenses: sets of properties that make sense being shown together
  – Operations: mini Semantic Web Services with type information that specify what kinds of resources can be used with them
  – VOWL definitions, like OWL definitions, are encoded in RDF

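The lens and operation ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VOWL itself: the dictionary layout, the `ex:` names, and the functions `apply_lens` and `applicable_operations` are all hypothetical, and triples are modeled as plain tuples rather than a real RDF store.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample data: (subject, predicate, object) tuples.
TRIPLES = {
    ("ex:doe", RDF_TYPE, "ex:Musician"),
    ("ex:doe", "ex:email", "doe@example.org"),
    ("ex:doe", "ex:phone", "555-0100"),
    ("ex:doe", "ex:shoeSize", "11"),
}

# A lens modeled as a named, ordered set of properties (assumed layout).
CONTACT_LENS = {"name": "Contact details", "properties": ["ex:email", "ex:phone"]}

# Operations modeled with a single accepted resource type (assumed layout).
OPERATIONS = [
    {"name": "Book concert", "accepts": "ex:Musician"},
    {"name": "Publish paper", "accepts": "ex:Physicist"},
]


def apply_lens(resource, lens, triples):
    """Return (property, value) pairs for the lens's properties, in lens order."""
    rows = []
    for prop in lens["properties"]:
        for s, p, o in sorted(triples):
            if s == resource and p == prop:
                rows.append((p, o))
    return rows


def applicable_operations(resource, operations, triples):
    """Return names of operations whose accepted type matches one of the resource's types."""
    types = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    return [op["name"] for op in operations if op["accepts"] in types]


if __name__ == "__main__":
    print(apply_lens("ex:doe", CONTACT_LENS, TRIPLES))
    print(applicable_operations("ex:doe", OPERATIONS, TRIPLES))
```

The lens hides `ex:shoeSize` because it is not in the lens's property set, and only the operation typed for `ex:Musician` is offered for the resource.
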
Process diagram
[Diagram: Metadata + Presentation recommendations + Ontological specifications + Applicable service descriptions, combined into a point-and-click, hyperlinked UI]

Incrementality in the user interface
The more Haystack knows about an ontology, the better job it can do presenting objects to the user
  – With no knowledge, Haystack shows a property listing
  – With rdfs:label and dc:title attributes, Haystack shows human-readable names
  – With rdfs:domain, rdfs:range, daml:UniqueProperty, daml:ObjectProperty, and daml:DatatypeProperty, specialized forms can be produced
  – With lenses, Haystack shows filtered property listings in All Information and Explore Relationships views
  – With custom views defined, Haystack can show a completely custom presentation
These specifications do not all have to come from the same place; different pieces of presentation knowledge can be fused together

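The first two levels of this incremental behavior, plus the fusing of knowledge from separate sources, can be illustrated with a minimal Python sketch. It is not Haystack's implementation: the tuple-based triple store, the `example.org` URIs, the function names, and the choice to prefer dc:title over rdfs:label are all assumptions made for the example.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

# Hypothetical instance data with no presentation knowledge at all.
INSTANCE_DATA = {
    ("http://example.org/doe", "http://example.org/plays", "http://example.org/guitar"),
}

# Hypothetical naming knowledge that arrives from a different source.
NAMING_DATA = {
    ("http://example.org/plays", RDFS_LABEL, "plays"),
    ("http://example.org/guitar", DC_TITLE, "Electric guitar"),
}


def fuse(*sources):
    """Merge triples from several sources into one set (a simple set union)."""
    merged = set()
    for source in sources:
        merged.update(source)
    return merged


def display_name(resource, triples):
    """Return a human-readable name if one is known, else fall back to the raw value."""
    # Assumed preference order: dc:title first, then rdfs:label.
    for predicate in (DC_TITLE, RDFS_LABEL):
        names = sorted(o for s, p, o in triples if s == resource and p == predicate)
        if names:
            return names[0]
    return resource


def property_listing(resource, triples):
    """List (property, value) pairs for a resource, using readable names where known."""
    return sorted(
        (display_name(p, triples), display_name(o, triples))
        for s, p, o in triples
        if s == resource
    )


if __name__ == "__main__":
    doe = "http://example.org/doe"
    print(property_listing(doe, INSTANCE_DATA))
    print(property_listing(doe, fuse(INSTANCE_DATA, NAMING_DATA)))
```

With only the instance data, the listing shows raw URIs; once the naming triples are fused in, the same call returns readable names without any change to the instance data.
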
Getting metadata into the system
Metadata can come from:
  – File system
  – Web servers
  – LSID servers
  – Jena stores
  – Joseki servers
  – Annotea servers
  – Web Services
In a number of formats:
  – RDF/XML
  – Notation3
  – Adenine
  – RSS and other XML formats (via XSLT)

Life Science Identifiers (LSID)
The hyperlinking metaphor, like URLs on billboards, depends on there being a metadata retrieval mechanism
Life Sciences community coming together around LSID
  – urn:lsid:[server name]:[db-specific identifier]
  – Retrieval protocol based on SOAP and RDF
  – Undergoing standardization by OMG and I3C
  – Open Source client/server libraries provided by IBM
  – Many public data sources accessible via LSID today: the beginnings of a Biological Semantic Web
Not specific to Life Sciences
Support built into Haystack

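As a small illustration of the identifier shape shown above, the following Python sketch splits an LSID URN into the two parts named on the slide. It assumes only the simplified form `urn:lsid:[server name]:[db-specific identifier]` and treats everything after the server name as opaque; the `Lsid` type, the `parse_lsid` function, and the sample identifier are hypothetical, and the sketch does not perform any SOAP retrieval.

```python
from typing import NamedTuple


class Lsid(NamedTuple):
    """The two parts of an LSID as described on the slide (hypothetical type)."""
    server: str
    identifier: str


def parse_lsid(urn: str) -> Lsid:
    """Split 'urn:lsid:[server name]:[db-specific identifier]' into its parts."""
    # Split at most three times so colons inside the identifier are preserved.
    parts = urn.split(":", 3)
    if len(parts) != 4 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID URN: {urn!r}")
    server, identifier = parts[2], parts[3]
    if not server or not identifier:
        raise ValueError(f"LSID is missing a server name or identifier: {urn!r}")
    return Lsid(server, identifier)


if __name__ == "__main__":
    # Made-up identifier used purely for illustration.
    print(parse_lsid("urn:lsid:example.org:proteins:P12345"))
```

A client would use the server part to locate the retrieval service and pass the rest of the identifier through unchanged.
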
Adenine
Adenine is Haystack’s RDF scripting language
  – Syntactically, a cross between Notation3 and Python
  – Both a data definition language (RDF) and an imperative scripting language
  – Native support for RDF manipulation
  – Access to Java classes and methods
Haystack system built like a Lisp machine
  – Everything is accessible from the “Adenine console”
Leveraging the Eclipse platform
  – Powerful Adenine text editor with outline and syntax highlighting

Example: an Open Directory browser

Current status of prototype
Open Source, Java/Eclipse-based implementation
Runs on Windows, Linux, and Mac OS X
Easy to hook in new data sources
Stable, but still some usability issues
Provides stable platform for extensions (Eclipse plug-ins)
[Figure: user spectrum labeled “Guru”, “Power user”, and “Grandma”, with markers “Goal” and “We are here”]

Packaging
VOWL specifications can be:
  – Made available for download from a Web site
  – Packaged with instance metadata coming from the server
  – Put into an Eclipse plug-in
Distributing your own custom Haystack is easy
  – Documentation describes the process for creating a stripped-down, specialized version of your own Semantic Web browser
  – Can integrate custom RDF metadata, ontologies, VOWL specifications, and even Java and Eclipse components

Real life Haystack application: myGrid provenance
Courtesy of Professor Carole Goble, University of Manchester

Key ideas
Demonstrating the value of RDF is easiest when users can experience the benefits for themselves
Haystack is an extensible Semantic Web browser:
  – Connects to a variety of RDF sources
  – Exposes an intuitive, Web browser-like interface
  – Incrementally improves experience as more ontological and presentation knowledge is provided
  – Built on Eclipse, providing a solid basis for extensions
  – Scriptable using Adenine
Haystack addresses important HCI concerns, e.g., personalization and organization, that must be supported in information applications but are often taken for granted

Thank you for your attention
Dennis Quan,
Haystack project home page (new download coming May 24) –
Documentation! –
IBM LSID home page –
Eclipse home page –
myGrid home page –