Using Multiple Metadata Formats in DSpace
ARD Prasad
Indian Statistical Institute
Bangalore, India
MARC & Metadata
MARC
– Covers all types of documents (more than 1000 elements)
– Basically used for OPACs
– Plethora of MARCs
– Requires librarians
– Uses ISO 2709, XML
Metadata
– Separate schemes for each type of document
– Web documents, digital libraries
– Plethora of metadata formats
– Meant for non-librarians
– Uses XML
Metadata Formats
More than two dozen formats are available, covering every conceivable digital object:
– ETD
– E-Learning
– E-Governance
– Geo-Spatial Data
– Architectural Drawings
– Museum items
List of Some Metadata Formats
DC, METS, MODS, VRA Core, SCORM, LOM, GEM, EAD, TEI, CIMI, PB Core, IMRC, CDWA, CSDGM, MIDAS, VERS, DDI, PREMIS, CIDOC, ETDMS, AGLS, GILS, ONIX
DSpace
– Default workflow supports Qualified Dublin Core
– DSpace OAI supports unqualified Dublin Core
– DSpace v allows you to extend to non-DC formats
Metadata Issues in DSpace
– Adding New Elements
– Input Forms
– Indexing
– Display of search results
– Import/Export
– OAI-PMH and crosswalks
Adding New Elements
– Through the Dublin Core Registry in DSpace administration
– By adding directly to the ‘dctyperegistry’ table in PostgreSQL
Input Workflow
– Use the new input forms facility by modifying $DSPACE_HOME/config/input-forms.xml
Indexing
– Add the elements to be indexed to the dspace.cfg file, so that Lucene generates indexes on the desired elements
Search Result Display
– The full-record display need not be modified
– The default display can be changed by modifying the ItemTag.java file
Import & Export
– Within the DSpace community this does not really matter, even though DSpace produces only a DC-like format, in a file called dublin_core.xml
– Across other DL software, however, we should evolve an interoperability mechanism
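To make the DC-like export concrete, here is a minimal sketch that reads a dublin_core.xml-style record into (element, qualifier, value) tuples. The sample record and the function name read_dublin_core are invented for illustration; the dcvalue/element/qualifier layout is assumed from DSpace's simple archive format and may differ between versions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record, assumed to follow the dublin_core.xml layout.
SAMPLE = """<dublin_core>
  <dcvalue element="title" qualifier="none">Sample item</dcvalue>
  <dcvalue element="contributor" qualifier="author">Doe, Jane</dcvalue>
  <dcvalue element="date" qualifier="issued">2005</dcvalue>
</dublin_core>"""


def read_dublin_core(xml_text):
    """Return a list of (element, qualifier, value) tuples.

    A qualifier of "none" (or a missing qualifier) is returned as None.
    """
    root = ET.fromstring(xml_text)
    records = []
    for dcvalue in root.findall("dcvalue"):
        element = dcvalue.get("element")
        qualifier = dcvalue.get("qualifier", "none")
        if qualifier == "none":
            qualifier = None
        value = (dcvalue.text or "").strip()
        records.append((element, qualifier, value))
    return records


if __name__ == "__main__":
    for record in read_dublin_core(SAMPLE):
        print(record)
```

Any other DL software that wants to import such records would need a comparable reader plus a mapping into its own schema, which is where the interoperability question arises.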
OAI-PMH
– The new format should appear in the oaicat.properties file
– Java programs similar to OAIDCCrosswalk.java should be written, one for each metadata format
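What such a crosswalk does can be sketched outside DSpace. The Python below is not the OAIDCCrosswalk.java code and makes no claim about its exact behaviour; it only illustrates the idea of flattening qualified values into unqualified oai_dc by dropping the qualifier. The function name to_oai_dc and the input tuples are hypothetical; the two namespace URIs are the standard oai_dc and Dublin Core element namespaces.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use readable prefixes instead of auto-generated ns0/ns1.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def to_oai_dc(values):
    """Serialize (element, qualifier, value) tuples as unqualified oai_dc.

    The qualifier is deliberately discarded, which is where the
    qualified-to-unqualified crosswalk loses information.
    """
    root = ET.Element("{%s}dc" % OAI_DC_NS)
    for element, _qualifier, text in values:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, element))
        child.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_oai_dc([
        ("title", None, "Sample item"),
        ("contributor", "author", "Doe, Jane"),
    ]))
```

A crosswalk for another format would follow the same pattern with a different target namespace and a format-specific mapping table in place of the simple "drop the qualifier" rule.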
Issues of Crosswalk
– A crosswalk will always result in some data loss
– One should use ‘selective harvesting by collection’, using the appropriate
  – ‘metadataPrefix’ argument, and
  – ‘set’ argument for limiting the collection
– One may consider DC as the lowest common denominator
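Selective harvesting by collection comes down to one OAI-PMH request. The sketch below builds a ListRecords URL carrying the metadataPrefix and set arguments; the base URL and the set name are placeholders, and the helper name list_records_url is invented for this example.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    metadataPrefix selects the metadata format; the optional set
    argument restricts the harvest to one set (for example, a collection).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder repository URL and set name, for illustration only.
    print(list_records_url(
        "http://repository.example.org/oai/request",
        "oai_dc",
        "hdl_123456789_2",
    ))
```

Harvesting each collection with the metadataPrefix that matches the format its records were created in avoids passing everything through a single lossy crosswalk.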
Possible relations between two metadata formats
– A crosswalk can be achieved in the case of
  – One to one (ideal, but not real)
  – Many to one
– A crosswalk will be lossy in the case of
  – One to many
  – One to none
  – None to one
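The five relations can be checked mechanically once a field mapping is written down. The following sketch labels each field of a mapping with its relation and flags the lossy ones, following the split given on this slide. All field names and the function name classify_relations are hypothetical.

```python
from collections import Counter

# Relations the slide lists as lossy.
LOSSY = {"one to many", "one to none", "none to one"}


def classify_relations(mapping, source_fields, target_fields):
    """Label every source field, and every unmapped target field,
    with its relation type.

    mapping: dict of source field -> list of target fields.
    """
    # How many source fields point at each target field.
    fan_in = Counter(t for s in source_fields for t in mapping.get(s, []))
    relations = {}
    for s in source_fields:
        targets = mapping.get(s, [])
        if not targets:
            relations[s] = "one to none"
        elif len(targets) > 1:
            relations[s] = "one to many"
        elif fan_in[targets[0]] > 1:
            relations[s] = "many to one"
        else:
            relations[s] = "one to one"
    # Target fields that no source field feeds.
    for t in target_fields:
        if fan_in[t] == 0:
            relations["(none) -> " + t] = "none to one"
    return relations


if __name__ == "__main__":
    source = ["title", "author", "illustrator", "name", "shelf_mark"]
    target = ["title", "creator", "given_name", "family_name", "rights"]
    mapping = {
        "title": ["title"],
        "author": ["creator"],
        "illustrator": ["creator"],
        "name": ["given_name", "family_name"],
        "shelf_mark": [],
    }
    for field, relation in classify_relations(mapping, source, target).items():
        flag = "lossy" if relation in LOSSY else "ok"
        print(field, "|", relation, "|", flag)
```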
Suggestions for DSpace Development
– input-forms.xml can be made modular, so that input forms can be defined in separate files for each format
– dctyperegistry should have an element ‘metadata format’, so that OAICat exposes the metadata of records that were created using a specific format
– Perhaps the OAI-PMH protocol itself requires modification (imagine harvesting repositories with varied items and metadata formats)
Thank You
Please visit:
– LDL: Librarians’ Digital Library
– SDL: Search Digital Libraries (Harvester)
– Our Discussion Forum (DLRG)