Published by Leslie Crawford. Modified over 11 years ago.
1
DCMI Workshop on Metadata and Search
Vendor Panel Presentation
Bradley P. Allen
ballen@siderean.com
http://www.siderean.com
2
Copyright © 2003 Siderean Software LLC. All rights reserved.
Overview
- Our perspective is that of a Semantic Web application vendor
- Our belief is that faceted search will be the first killer application of the Semantic Web
- Our goal is to show how this is possible and what the benefits are
- But first, some general statements…
3
Tools that leverage Dublin Core
- Do supportable tools exist that take advantage of Dublin Core and other metadata standards to enhance search results?
  - Yes, our work is a case in point
  - Also relevant:
    - Weblog CMS
    - RSS aggregators
    - Other RDF applications
4
What's missing?
- What do people need to be able to do to actually use metadata effectively on their intranets?
  - Start using what's out there
    - Data in relational tables
    - CMS-generated metadata
  - A lot of metadata is lying around unexploited
5
Are Dublin Core guidelines sufficient?
- What additional specifications are needed?
  - None: DC is an excellent minimal vocabulary that has achieved broad acceptance
- What we need are best practices, e.g.:
  - Encouraging resource values over literal values for DC attributes as good style
    - dc:subject using controlled vocabularies
    - dc:creator using authority records
    - dc:date using temporal hierarchies
  - Implementing DCMI validation services
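As a rough illustration of the "resource values over literal values" practice, the Python sketch below stores DC statements as plain (subject, predicate, object) tuples. The example.org URIs, the `broader` predicate, and the `Resource` helper are all hypothetical and chosen only for illustration. The point is that a resource-valued dc:subject can carry statements of its own (a label, a broader term), while a literal value cannot.

```python
from typing import NamedTuple, Union

DC = "http://purl.org/dc/elements/1.1/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

class Resource(NamedTuple):
    """A URI reference (hypothetical helper for this sketch)."""
    uri: str

Node = Union[Resource, str]  # a plain str stands for a literal value

DOC = Resource("http://example.org/docs/42")           # made-up document URI
TOPIC = Resource("http://example.org/vocab/metadata")  # made-up vocabulary term
BROADER = "http://example.org/vocab#broader"           # made-up predicate

# Literal style: the subject is only a string; nothing more can be said about it.
literal_style = [(DOC, DC + "subject", "Metadata")]

# Resource style: the subject is a vocabulary term with statements of its own.
resource_style = [
    (DOC, DC + "subject", TOPIC),
    (TOPIC, RDFS_LABEL, "Metadata"),
    (TOPIC, BROADER, Resource("http://example.org/vocab/information-science")),
]

def subject_values(triples, doc):
    """Return the dc:subject values attached to `doc`."""
    return [o for s, p, o in triples if s == doc and p == DC + "subject"]

def statements_about(triples, node):
    """Return every triple whose subject is `node`."""
    return [t for t in triples if t[0] == node]
```

With the resource style, a navigation tool can follow the term to its label and broader concept; with the literal style it can only match the string.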
6
Is XML the primary coding language?
- Is it being used for Dublin Core and other metadata applications?
  - Yes, for all the right reasons
    - Open standards
    - Leverage of existing tools
- What other encoding methods are being used?
  - RDF/N3 for some RDF-based applications
7
Our application: Seamark
- A navigation engine built on three key ideas
  - Metadata represented in Resource Description Framework (RDF) is aggregated from existing enterprise content and data
  - Faceted metadata retrieval turns the RDF into a navigation web service
  - Web services make navigation applications easy to install and integrate with existing Web applications
8
Faceted search and RDF: why?
- Enabling more effective retrieval is a major goal for the Semantic Web
- RDF is a superb foundation for faceted search
  - RDF as an open standard for metadata exchange
  - RDF Schema as a framework for defining facets
- The Semantic Web will enable faceted search to become pervasive
  - Widespread sharing and reuse of ontologies, vocabularies and DC instance data becomes possible
  - The blogosphere as an existence proof
  - "View Source" for the Semantic Web
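To make the link between RDF and faceted search concrete, here is a minimal Python sketch, not Seamark's implementation. It assumes metadata held as (resource, property, value) triples with made-up identifiers, selects the resources that satisfy a set of facet constraints, and then counts the remaining values per property, which is the information a faceted interface shows next to each refinement link.

```python
from collections import Counter, defaultdict

# Toy metadata: (resource, property, value) triples with made-up identifiers.
TRIPLES = [
    ("doc1", "dc:creator", "Allen"),
    ("doc1", "dc:subject", "RDF"),
    ("doc2", "dc:creator", "Allen"),
    ("doc2", "dc:subject", "Search"),
    ("doc3", "dc:creator", "Powers"),
    ("doc3", "dc:subject", "RDF"),
]

def select(triples, constraints):
    """Return the resources that carry every (property, value) pair in `constraints`."""
    pairs_by_resource = defaultdict(set)
    for s, p, o in triples:
        pairs_by_resource[s].add((p, o))
    wanted = set(constraints)
    return {s for s, pairs in pairs_by_resource.items() if wanted <= pairs}

def facet_counts(triples, matches):
    """Count, per property, how many statements on matching resources carry each value."""
    counts = defaultdict(Counter)
    for s, p, o in triples:
        if s in matches:
            counts[p][o] += 1
    return counts

matches = select(TRIPLES, [("dc:subject", "RDF")])
counts = facet_counts(TRIPLES, matches)
```

Here each property acts as a facet; in an RDF Schema setting the facet definitions would come from the schema itself instead of being implied by the data.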
9
Seamark, Dublin Core, and CVs
- Enables Dublin Core
  - Using RDF encodings of DC
- Handles controlled vocabularies
  - Using emerging RDF-based standards like TIF(S)
- Supports building and maintaining controlled vocabularies
  - Concepts and terms represented as resources and encoded in RDF in the same way as other content
  - Therefore the same tools apply
10
Seamark's search interface
- Use of flat or hierarchical controlled vocabularies
- Transparency and customizability of results ranking
- Parametric search with customizable pull-down menus
11
Lookups into large CVs in Seamark
- Use of standard vocabularies represented in RDF (e.g. LC's Thesaurus for Graphic Materials)
- Faceted search over controlled vocabulary terms
- Syndication of CVs, instance data and ontologies for sharing
12
Query processing in Seamark
- Based on XML for Retrieval By Reformulation (XRBR)
- A query language that
  - Provides support for query reformulation and refinement while minimizing roundtrips
  - Supports a stateless protocol for faceted metadata retrieval with SOAP as a transport mechanism
  - Handles very large result sets gracefully
- Think of XRBR as an application profile in the digital library sense
  - Specifies a view over heterogeneous metadata schemas with hints as to its interpretation and display
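The idea of stateless refinement can be sketched in Python as follows. This is not XRBR syntax or Seamark's API; the `Query` class and its method names are hypothetical. It only illustrates the general pattern in which every request carries its full set of constraints and its paging window, so the server needs no session state and a refinement is just a new self-contained query.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Constraint = Tuple[str, str]  # (property, value)

@dataclass(frozen=True)
class Query:
    """A self-contained query: everything the server needs travels in the request."""
    constraints: FrozenSet[Constraint] = frozenset()
    offset: int = 0
    limit: int = 10

    def refine(self, prop: str, value: str) -> "Query":
        """Narrow the query by one constraint and restart paging."""
        return Query(self.constraints | {(prop, value)}, 0, self.limit)

    def relax(self, prop: str, value: str) -> "Query":
        """Drop one constraint and restart paging."""
        return Query(self.constraints - {(prop, value)}, 0, self.limit)

    def next_page(self) -> "Query":
        """Ask for the next window of results instead of the whole result set."""
        return Query(self.constraints, self.offset + self.limit, self.limit)
```

Because each `Query` is immutable and complete, any earlier step in a refinement path can be replayed or bookmarked without server-side memory of the session.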
13
Query processing in Seamark
- Disambiguation
  - Suggestions provide this implicitly
- Query expansion and concept mapping
  - RDF models plus XRBR structure queries provide a general mechanism for this
- Entity extraction
  - XSLT extensions at import augment raw metadata with additional extracted attributes
- Natural language processing
  - Direct manipulation now; QA to come
14
Searching across collections
- Metadata aggregation using RDF provides a general platform for federated search
- We can directly leverage emerging SW approaches to:
  - Thesaurus mapping
    - tif:concept-equivalence
  - Schema mapping
    - rdfs:subPropertyOf
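A small Python sketch of how rdfs:subPropertyOf supports schema mapping across collections: statements made with a local property are also treated as statements of the Dublin Core property it specializes. The `ex:` property names are hypothetical, and the sketch assumes each property has at most one direct superproperty, which real RDF Schema does not require.

```python
# Hypothetical local properties declared as subproperties of Dublin Core ones.
SUBPROPERTY_OF = {
    "ex:author": "dc:creator",
    "ex:illustrator": "dc:contributor",
    "ex:leadAuthor": "ex:author",
}

def superproperties(prop, mapping):
    """Return `prop` followed by every property it is transitively a subproperty of."""
    chain = [prop]
    # Stop at the top of the chain, or if a (malformed) cycle is detected.
    while chain[-1] in mapping and mapping[chain[-1]] not in chain:
        chain.append(mapping[chain[-1]])
    return chain

def entail(triples, mapping):
    """Return the triples plus those implied by the subproperty declarations."""
    inferred = set(triples)
    for s, p, o in triples:
        for sup in superproperties(p, mapping)[1:]:
            inferred.add((s, sup, o))
    return inferred

# Two collections using different schemas for the same kind of statement.
collection_a = {("doc1", "ex:leadAuthor", "Allen")}
collection_b = {("doc2", "dc:creator", "Powers")}
merged = entail(collection_a | collection_b, SUBPROPERTY_OF)
```

After entailment, a single query on dc:creator reaches both collections even though only one of them used that property directly.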
15
Setup and maintenance
- Installation and configuration for Windows, Linux and Mac OS X
- Administration
  - Simple web-based administration interface for aggregating feeds and specifying initial queries
- Training
  - 135-page tutorial
  - Extensive on-line API documentation
- Courses
  - One-day on-site introduction
16
Setup and maintenance
- Shelley Powers, Practical RDF, O'Reilly & Associates, 2003:
  "... the application is easily installed and configured, and comes with considerable documentation"
  "What I was most impressed with about the product, though, was how quickly and easily it integrated my RDF/XML data … into a sophisticated query engine with little or no effort."
17
Seamark's administration interface
- Users can specify URLs serving RDF to load into a given model …
- … then load them manually or on a scheduled basis
- Alternatively, queries can be executed against an SQL database
- XSLT stylesheets transform XML documents and SQL result sets into RDF
- Aggregated models can be dumped to RDF
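The SQL-to-RDF step can be pictured with the Python sketch below. Seamark does this with XSLT stylesheets; the sketch swaps in plain Python and an in-memory SQLite table purely for illustration. The table, the example.org base URI, and the assumption that column names match Dublin Core element names are all hypothetical. Each row becomes one N-Triples statement per non-null column.

```python
import sqlite3

DC = "http://purl.org/dc/elements/1.1/"
BASE = "http://example.org/items/"  # hypothetical base URI for row identifiers

def escape(text):
    """Escape a string for use inside an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))

def rows_to_ntriples(cursor):
    """Map each row to one triple per non-null column, keyed by the `id` column."""
    columns = [d[0] for d in cursor.description]
    lines = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{record.pop('id')}>"
        for name, value in record.items():
            if value is not None:
                lines.append(f'{subject} <{DC}{name}> "{escape(str(value))}" .')
    return lines

# A toy table standing in for existing enterprise data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, creator TEXT)")
conn.execute("INSERT INTO items VALUES (1, 'Practical RDF', 'Shelley Powers')")
conn.execute("INSERT INTO items VALUES (2, 'Untitled \"draft\"', NULL)")
ntriples = rows_to_ntriples(
    conn.execute("SELECT id, title, creator FROM items ORDER BY id")
)
```

The resulting lines could then be loaded into an RDF model alongside metadata aggregated from other feeds.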
18
Sites using Seamark