Published by Georgina Morgan. Modified over 9 years ago.
1
Digitization with Millennium & CONTENTdm
Stuart Hunt
IUG17, Anaheim, May 2009
2
Overview
Background
Digitisation
Metadata
Workflows
Now
3
University of Warwick
Royal Charter 1965
Russell Group
16,000 FTE students
5000 staff
4
University Library
Approx 1.1 million volumes
170 staff (110 FTE)
Millennium 2003
Approx 100,000 issues/renewals per yr
Approx 28,000 new books per yr
RLUK member
OCLC member
5
Content
Marandet Collection
4000+ French plays
1720 to 1900
Acquired 1970s
Guide published 1979
Bibliographic records in Millennium, RLUK, COPAC, & WorldCat
No IPR issues
7
Projects
Revolutionary Drama (1789-1800)
  – 339 plays
Empire Period Drama (1801-1815)
  – 123 plays
JISC Digitisation Programme: Enriching Digital Resources
‘Exposing Marandet’
  – 1500 plays/75,000 pages
8
Objectives
Cross-searching
Full-text searching
Integration with existing & future systems
  – Millennium
  – Web
  – Vertical search solution
9
Options
Existing solutions
  – Millennium
  – In-house web publishing tool
Separate product
  – Digital collection management software
  – CONTENTdm
Solution would drive approach taken
10
Digital production
Image files
  – TIFF & JPEG derivative
  – Full colour & greyscale
  – Outsourced
Text files/full-text transcripts
  – OCR quality initially not acceptable
  – Re-keying
  – Outsourced
11
Media Management
Tried & tested solution
Quick & easy
Link digital content
D2D process simplified
Existing bibs
New bibs
Use existing authentication if required
13
Media Management
No full-text searching
No cross-collection searching (unless in separate scope)
Tied to MARC metadata
Metadata enrichment difficult
Image file format
Not a total solution
14
CONTENTdm
Full-text & cross-collection searching
Not tied to MARC metadata
Metadata enrichment simple
Local Windows server
Initial licence <50K images
Upgraded to unlimited licence 2008
15
Local metadata context
Separate bibs
  – Print vs electronic
  – Describes what is
  – Supports better (future) FRBRisation
  – Ease of maintenance
  – Location & format based scoping
793 for local added entry/uniform title
  – Collection name
16
Metadata option 1
Create metadata within CONTENTdm
Play-by-play
Metadata already present in Millennium
17
Metadata option 1
Assumes that metadata is already available
Not scalable
Poor use of resources
Does not allow data to work harder or smarter
18
Metadata option 2
Create metadata outside of Millennium
Metadata not already present in Millennium
Play-by-play
Harvest from CONTENTdm into Millennium via XML Harvester
19
XML Harvester
Single configuration file
Needs to be edited for each separate resource
Uses XSLT not load table(s)
Major changes (e.g. harvest different schema) may need to be done by III
20
Configuration file triggers
@XML_TYPE=DC (or MARCXML)
@OAI_FORMAT=oai_dc
@DBNAME=[Repository name]
@URL=[url for OAI-PMH]
@USEOAI=true (or false)
@OAISET=[Name of set]
@RECID_MARCTAG=001
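The @URL, @OAI_FORMAT and @OAISET triggers appear to correspond to the base URL, metadataPrefix and set arguments of a standard OAI-PMH ListRecords request. The sketch below only shows what such a request URL looks like. It is not taken from the XML Harvester, whose internal behaviour is not described on the slide. The function name, host and set name are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", oai_set=None):
    """Return an OAI-PMH ListRecords request URL for the given repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if oai_set:
        # "set" is optional in OAI-PMH; omit it to harvest the whole repository.
        params["set"] = oai_set
    return f"{base_url}?{urlencode(params)}"


# Placeholder host and set name, for illustration only.
url = build_list_records_url("https://example.org/oai/oai.php", oai_set="exampleset")
print(url)
```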
21
XML Harvester
22
Harvested metadata
Loaded through Data Exchange
Significant re-editing
Tags & indicators
Diacritics
Creating attached items or holdings records
23
Harvested metadata
24
Metadata option 3
Batchload into CONTENTdm via delimited file from Create Lists
Cross-walk MARC21 to DC
Directory structure
25
MARC to Simple DC crosswalk
MARC        DC
Record#     dc:identifier
008/07-10   dc:language
100         dc:creator
245         dc:title
260|ab      dc:publisher
260|c       dc:date
300         dc:format
5XX         dc:description
6XX         dc:subject
700         dc:contributor
700|t       dc:relation
793         dc:source
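The variable-field rows of the crosswalk can be sketched in Python as below. This is an illustration rather than the tool actually used. It assumes a record is a list of (tag, subfields) pairs, with subfields as (code, value) tuples, and it assumes that a 700 field carrying |t goes to dc:relation while any other 700 goes to dc:contributor. Record# and 008 are left out because they do not come from variable fields. The sample record is invented.

```python
from collections import defaultdict


def join(subfields, codes=None):
    """Join subfield values, optionally restricted to the given subfield codes."""
    return " ".join(v for c, v in subfields if codes is None or c in codes)


def crosswalk_field(tag, subfields):
    """Map one MARC field to a list of (DC element, value) pairs."""
    if tag == "100":
        return [("dc:creator", join(subfields))]
    if tag == "245":
        return [("dc:title", join(subfields))]
    if tag == "260":
        return [("dc:publisher", join(subfields, "ab")),
                ("dc:date", join(subfields, "c"))]
    if tag == "300":
        return [("dc:format", join(subfields))]
    if tag == "700":
        # Assumption: name/title entries (with |t) become dc:relation.
        has_title = any(code == "t" for code, _ in subfields)
        element = "dc:relation" if has_title else "dc:contributor"
        return [(element, join(subfields))]
    if tag == "793":
        return [("dc:source", join(subfields))]
    if tag.startswith("5"):
        return [("dc:description", join(subfields))]
    if tag.startswith("6"):
        return [("dc:subject", join(subfields))]
    return []  # tags outside the crosswalk are dropped


def crosswalk(fields):
    """Apply the crosswalk to every field and group values by DC element."""
    dc = defaultdict(list)
    for tag, subfields in fields:
        for element, value in crosswalk_field(tag, subfields):
            if value:
                dc[element].append(value)
    return dict(dc)


# Invented sample record, for illustration only.
sample = [
    ("100", [("a", "Author, Example")]),
    ("245", [("a", "Example play :"), ("b", "a comedy")]),
    ("260", [("a", "Paris :"), ("b", "Example,"), ("c", "1790")]),
    ("650", [("a", "French drama")]),
    ("907", [("a", "local data")]),
]
result = crosswalk(sample)
```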
26
MARC – DC Crosswalk
27
Additional DC elements
dc:rights
dc:type
Transcript mapped to dc:description
28
Metadata workflow
Create separate bibs for e-versions
Export print records via Data Exchange
MarcEdit to remove extraneous tags (907, etc)
Insert 006, 007, 008/23, GMD, 533
Re-import into Millennium as new bibs
[856 CONTENTdm reference url added]
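The tag clean-up and 856 steps were done in MarcEdit. The Python sketch below only illustrates the logic on a simplified record, a list of (tag, subfields) pairs, which is an assumption of this example. The function name and URL are hypothetical. The slide names only 907 among the extraneous tags, so the set holds just that one.

```python
LOCAL_TAGS = {"907"}  # the slide names 907 "etc"; extend to match local practice


def make_e_version(fields, reference_url):
    """Return a copy of `fields` without local tags, plus an 856 |u link."""
    kept = [(tag, subs) for tag, subs in fields if tag not in LOCAL_TAGS]
    kept.append(("856", [("u", reference_url)]))
    return kept


# Invented print record and placeholder URL, for illustration only.
print_bib = [
    ("245", [("a", "Example play")]),
    ("907", [("a", "local data")]),
]
e_bib = make_e_version(print_bib, "https://example.org/u?/collection,1")
```

The original list is left unchanged, which matches the slide's approach of keeping separate bibs for print and electronic versions.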
29
Metadata workflow
Review file of newly loaded bibs exported from Create Lists
Cross-walked from MARC to DC
Additional DC elements added
Item level metadata added
Loaded to CDM as delimited files with directory structure
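Writing cross-walked records to a delimited file might look like the sketch below. It assumes a tab delimiter, a header row of DC element names, and "; " as the separator for repeated values. None of these details are stated on the slide, and the column list and sample record are invented.

```python
import csv
import io


def write_delimited(records, columns, handle):
    """Write one header row, then one tab-delimited row per record."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for record in records:
        # Repeated values are joined with "; " (an assumption of this sketch).
        writer.writerow(["; ".join(record.get(col, [])) for col in columns])


columns = ["dc:title", "dc:creator", "dc:subject"]
records = [{"dc:title": ["Example play"], "dc:subject": ["Subject A", "Subject B"]}]

buffer = io.StringIO()
write_delimited(records, columns, buffer)
output = buffer.getvalue()
```

To produce a file instead of a string, pass a handle opened with `open(path, "w", newline="", encoding="utf-8")`.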
30
Metadata in CONTENTdm
Compound objects
Document level
Page level
  – Less rich than document level
Hospitable to multiple schemas
Deliberate attempt to stay close to DC
Administrative metadata
  – Later feature
31
Document level
AACR in DC wrapper
All descriptive metadata from bib (except LDR, 006, 007, 008, GMD)
Authority control (names, subjects, uniform titles)
Rights (dc:rights)
Identifier (.b number)
Mapped to DC for OAI harvesting
32
Page level
Basic descriptive metadata (creator, title, publisher, date)
Rights (dc:rights)
Identifier (.b number)
Transcript (dc:description)
No OAI harvesting at page level
  – Local decision
33
Access & availability
Availability across local → global continuum
Metadata contribution
Collection level descriptions
OAI
Collapse D2D
34
Metadata in WorldCat
Local CDM server – not able to use Connexion Digital Import
Bug between WorldCat and CDM for compound objects
FRBRized display in worldcat.org potentially impedes discovery
35
Now
‘Exposing Marandet’ completes 9/2009
Established service
4 collections
  – Ancien Régime Drama
  – Revolutionary Drama
  – Empire Period Drama
  – Restoration Drama
Integration with course delivery
Metadata enrichment to/from CÉSAR
36
Links
http://go.warwick.ac.uk/fac/arts/french/marandet/
http://www.jisc.ac.uk/whatwedo/programmes/digitisation/enrichingdigi/marandet.aspx
http://webcat.warwick.ac.uk
http://contentdm.warwick.ac.uk
37
stuart.hunt@warwick.ac.uk