
1 Making Metadata Work for the NSDL

2 Starting from Sept. 2001 with...
- A prototype with not much behind it that was re-usable (http://siteforscience.org)
- Lots of good ideas based on that prototype
- An Oracle license
- A very small group of people with many different visions of what we were doing
- The management structure of a research project (i.e., none)

3 ... jump to Dec. 2, 2002, when you will see (http://nsdl.org):
- A Metadata Repository with roughly 250,000 metadata records (items and collections)
- A uPortal-based user interface, containing:
  - a search service
  - a simple topic browse of collections
  - featured collection exhibits
  - views of future enhancements
- A developing plan for the future

4 Getting from there to here
- Designing the Metadata Repository
- Working with unfinished standards
  - Dublin Core in transition
  - XML schema for qualified DC in early stages
  - OAI 2.0 not yet cooked
- Concerns from partners and funders around quality issues
- Envisioning Simple Metadata-Based Services (SiMBaS)

5 The Metadata Repository
- Designed to scale
- Based on an automated harvest/expose model with OAI at each end
- A notion of “normalized metadata” with qualified Dublin Core as its base
- Transformations on the way in, native and transformed re-exposed
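
The harvest half of the harvest/expose model can be sketched as follows. This is a minimal illustration, assuming an OAI-PMH 2.0 data provider; the base URL, the sample response, and the function names are hypothetical and are not taken from the NSDL implementation. It builds a `ListRecords` request and pulls record identifiers out of a response, without touching the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build a ListRecords request URL for a data provider.

    In OAI-PMH 2.0 a resumptionToken is an exclusive argument, so when one
    is given it is sent alone with the verb.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date  # selective (incremental) re-harvest
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Return the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    return [h.findtext(OAI_NS + "identifier")
            for h in root.iter(OAI_NS + "header")]


# Hypothetical, abbreviated response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier>
      <datestamp>2002-11-01</datestamp></header></record>
    <record><header><identifier>oai:example.org:2</identifier>
      <datestamp>2002-11-15</datestamp></header></record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai", from_date="2002-11-01"))
    print(record_identifiers(SAMPLE_RESPONSE))
```

The same request shape serves the "expose" end: a repository that re-exposes its normalized records answers `ListRecords` for whatever metadata prefixes it supports.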

6 Standards at the bleeding edge
- Metadata strategy based on crosswalking from 8 formats to one (NSDL-DC)
- The reality: a Baskin-Robbins model of “standard metadata”
- Standards badly documented, little organized support, very little training available at any price
- Projects not obligated to offer metadata, even if they had it (in whatever form)

7 OAI in transition: the story in 2002
- Version 1.1 was not yet widely used
- Version 2 not yet available; NSDL became beta-tester (!)
- Final version of OAI 2.0 delayed by NSDL needs (definition of change)
- Now working with collection partners to bring up servers, ensure validation
- Lower end option for OAI on the way

8 The DC schema wars
- DC-Architecture group working primarily on RDF schema
- Gang of Five began work outside DC, presented version for comment to DC-Architecture Oct. 2002
- Process of approval not yet complete, NSDL using “final” version

9 A few schema issues...
- Three namespaces
- Restricted to “simple literal” values
- Refinements expressed as elements
- Encoding schemes expressed as new complexTypes (schemes not limited to a single element)
- NSDL Schema types:
  - NSDL-DC
  - NSDL-Search
  - NSDL-All
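
To make these points concrete, here is a small sketch of how such a record looks to a parser. It assumes the namespaces are those of the DCMI XML schemas (`dc`, `dcterms`, and `dcmitype`, of which the sample uses the first two); the `record` wrapper element, the sample values, and the `describe` helper are hypothetical. A refinement (`dcterms:created`) appears as an element in its own right, and an encoding scheme is attached through `xsi:type`, so the same scheme type can be applied to more than one element.

```python
import xml.etree.ElementTree as ET

# Attribute name under which ElementTree stores xsi:type.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical record: element values are simple literals only.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title>Rocks and Minerals</dc:title>
  <dcterms:created xsi:type="dcterms:W3CDTF">2002-12-02</dcterms:created>
  <dc:type xsi:type="dcterms:DCMIType">Collection</dc:type>
</record>"""


def describe(xml_text):
    """Return (namespace, local name, encoding scheme, value) per element."""
    rows = []
    for el in ET.fromstring(xml_text):
        # ElementTree tags look like "{namespace}localname".
        namespace, _, local = el.tag[1:].partition("}")
        rows.append((namespace, local, el.get(XSI_TYPE), el.text))
    return rows


if __name__ == "__main__":
    for row in describe(SAMPLE):
        print(row)
```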

10 The Process
- Data harvesting
- Data evaluation
- Transform specification
- DB_insert file creation
- Database ingest
- OAI re-exposure

11 Data evaluation
- XML validity
- DC conformance (whether simple or qualified)
- Emphasis on Date, Type, Format, Identifier
- Potential problem areas:
  - Special characters
  - “funky text”
- Tools: XML Spy, Spotfire
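
A rough sketch of what an automated version of these checks might look like, using only the Python standard library. It is not the process described on the slide, which used XML Spy and Spotfire. Note the assumptions: the check is for well-formedness rather than schema validity (the standard library does not validate against a schema), the `record` wrapper and the `evaluate_record` name are hypothetical, and "special characters" is narrowed here to the Unicode replacement character and non-printing characters.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
# The elements the evaluation step emphasizes.
CHECKED = ("date", "type", "format", "identifier")


def is_suspicious(ch):
    """Flag U+FFFD (a sign of a failed decode) and non-printing characters."""
    return ch == "\ufffd" or (not ch.isprintable() and not ch.isspace())


def evaluate_record(xml_text):
    """Return a list of human-readable problems; empty means none found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    for name in CHECKED:
        values = [(el.text or "").strip() for el in root.iter(DC + name)]
        if not any(values):
            problems.append(f"missing dc:{name}")

    for el in root.iter():
        if any(is_suspicious(ch) for ch in (el.text or "")):
            local = el.tag.split("}")[-1]
            problems.append(f"suspicious character in {local}")
    return problems


if __name__ == "__main__":
    record = (
        '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
        "<dc:title>Caf\ufffd science</dc:title>"
        "<dc:date>2002-12-02</dc:date>"
        "<dc:type>Text</dc:type>"
        "<dc:identifier>http://example.org/1</dc:identifier>"
        "</record>"
    )
    print(evaluate_record(record))
```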

12 Specifying transform
- Simple transforms (DC simple -> DC qualified)
- Scheme identification for standard values (date, type, format, language)
- Quality transforms
  - improving functionality of search limits by ensuring appropriate values for type and format
  - improving user experience by deleting funky text and special characters that affect display
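
The scheme-identification and clean-up steps could be approximated as below. This is an illustrative sketch, not the NSDL transform: the function names are hypothetical, the patterns check only the shape of a value (a three-letter string is accepted as ISO 639-2 without consulting the code list), and the scheme names assume the qualified Dublin Core encoding schemes W3CDTF, DCMIType, IMT, and ISO639-2.

```python
import re

# Shape checks only; each is an approximation of the real scheme.
W3CDTF = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)
IMT = re.compile(
    r"^(application|audio|image|message|model|multipart|text|video)/[\w.+-]+$"
)
ISO639_2 = re.compile(r"^[a-z]{3}$")
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def identify_scheme(element, value):
    """Return an encoding scheme name if the value fits one, else None."""
    value = value.strip()
    if element == "date" and W3CDTF.match(value):
        return "dcterms:W3CDTF"
    if element == "type" and value in DCMI_TYPES:
        return "dcterms:DCMIType"
    if element == "format" and IMT.match(value):
        return "dcterms:IMT"
    if element == "language" and ISO639_2.match(value):
        return "dcterms:ISO639-2"
    return None  # leave the value as an unqualified literal


def clean_text(value):
    """Drop replacement and non-printing characters, collapse whitespace."""
    kept = "".join(
        ch for ch in value
        if ch != "\ufffd" and (ch.isprintable() or ch.isspace())
    )
    return " ".join(kept.split())


if __name__ == "__main__":
    print(identify_scheme("date", "2002-12-02"))
    print(identify_scheme("date", "Dec. 2, 2002"))
    print(clean_text("Rocks\u200b  and\ufffd minerals\n"))
```

Values that match no scheme are left untouched rather than guessed at, which keeps the transform from asserting a scheme the provider never intended.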

13 DB_Insert file
- Header
- First harvest?
- Category (item, collection, annotation...)
- Harvest date
- Source
- Link to “native” metadata
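
The slide lists the fields but not the file format, so the following is only a guess at how such a header might be modeled. The class name, the element names, and the choice of XML as the serialization are all hypothetical; only the five fields themselves come from the slide.

```python
from dataclasses import dataclass
from datetime import date
import xml.etree.ElementTree as ET


@dataclass
class InsertHeader:
    """Hypothetical model of the header fields named on the slide."""
    first_harvest: bool          # "First harvest?"
    category: str                # e.g. "item", "collection", "annotation"
    harvest_date: date
    source: str                  # where the record was harvested from
    native_metadata_link: str    # link to the untransformed metadata

    def to_element(self):
        """Serialize to an XML element (element names are assumptions)."""
        header = ET.Element("header")
        ET.SubElement(header, "firstHarvest").text = (
            "true" if self.first_harvest else "false"
        )
        ET.SubElement(header, "category").text = self.category
        ET.SubElement(header, "harvestDate").text = self.harvest_date.isoformat()
        ET.SubElement(header, "source").text = self.source
        ET.SubElement(header, "nativeMetadataLink").text = self.native_metadata_link
        return header


if __name__ == "__main__":
    header = InsertHeader(
        first_harvest=True,
        category="item",
        harvest_date=date(2002, 12, 2),
        source="http://example.org/oai",
        native_metadata_link="http://example.org/native/1",
    )
    print(ET.tostring(header.to_element(), encoding="unicode"))
```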

14 OAI exposure
- OAI “About”
- OAI “Provenance”
- Metadata origin and rights assertions
- Alterations to originally harvested data
- Re-harvest information
- Collection (& brand) association
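
As an illustration of the provenance piece, the sketch below builds a provenance statement in the shape described by the OAI-PMH implementation guidelines: a `provenance` element holding an `originDescription` with `harvestDate` and `altered` attributes. The function name and all sample values are hypothetical, and the exact content NSDL exposed is not shown in the slides.

```python
import xml.etree.ElementTree as ET

PROV_NS = "http://www.openarchives.org/OAI/2.0/provenance"
P = "{" + PROV_NS + "}"


def provenance(base_url, identifier, datestamp, metadata_namespace,
               harvest_date, altered):
    """Build a provenance element describing where a record came from."""
    root = ET.Element(P + "provenance")
    origin = ET.SubElement(
        root,
        P + "originDescription",
        {
            "harvestDate": harvest_date,
            # "true" when the re-exposed record differs from what was harvested
            "altered": "true" if altered else "false",
        },
    )
    ET.SubElement(origin, P + "baseURL").text = base_url
    ET.SubElement(origin, P + "identifier").text = identifier
    ET.SubElement(origin, P + "datestamp").text = datestamp
    ET.SubElement(origin, P + "metadataNamespace").text = metadata_namespace
    return root


if __name__ == "__main__":
    element = provenance(
        base_url="http://example.org/oai",
        identifier="oai:example.org:1",
        datestamp="2002-11-01",
        metadata_namespace="http://www.openarchives.org/OAI/2.0/oai_dc/",
        harvest_date="2002-12-02T00:00:00Z",
        altered=True,
    )
    print(ET.tostring(element, encoding="unicode"))
```

In an OAI-PMH response this element would sit inside the record's `about` container, alongside any rights statement.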

15 Still to do...
- Currently running on “manual”
- Ingest process not yet completed or documented
- data validation routines
- additional metadata types (annotation) and services linked to metadata

16 Automation opportunities
- Collection registration
- assignment of unique identities
- “responsible entities”
- harvest/re-harvest, transform/re-transform, and associated record keeping
- integrating/linking new information with metadata record (service model)

17 Other challenges
- Educating data providers and aggregators about GOOD METADATA
- Better techniques for evaluation and transformation
- Coping with users? (Uh, oh)

18 For more information
- The NSDL Metadata Primer (http://metamanagement.comm.nsdlib.org/outline.html)
- NSDL XML schema (http://ns.nsdl.org/schemas/nsdl_dc/nsdl_dc_v1.00.xsd)

