Digital Libraries In a Nutshell
Roy Tennant
The California Digital Library

Outline
- The Vision
- Definitions
- Perspectives
  - Research
  - Production
    - Services
    - Collections
- How to Keep Current

The Vision
Anyone, anywhere, will be able to easily locate and use any image, text, database, or other type of digital resource, often in sophisticated ways or in association with other related objects.
The only requirements:
- access to the Internet
- authorization or payment, if required

Definitions: Part I
- “electronic”: information stored and accessed by electronic devices
- “digital”: information stored and accessed by computers (an electronic device)
- “virtual”: in essence rather than in actual fact

Definitions: Part II
From the Association of Research Libraries (http://sunsite.berkeley.edu/ARL/definition.html):
- Not a single entity
- Requires technology to link the resources of many
- Linkages are transparent to the user
- Collections are not limited to document surrogates, but include items that are exclusively digital

Perspectives: Research
- Goal: to further knowledge
- Participants: computer science/library/information science faculty, a few line librarians
- Example: U.S. Digital Library Initiatives (also called the National Science Foundation DL projects)

Sample Research Issues
- Advanced search techniques (e.g., query by image content)
- Federation of large, disparate, and distant collections
- Complex digital object behaviors (GIS overlays, moving image navigation, etc.)

Perspectives: Production
- Goal: to create digital library collections and services
- Participants: libraries (mainly larger research libraries, but not exclusively)
- Examples:
  - Library of Congress American Memory (memory.loc.gov/)
  - eLib Programme (www.ukoln.ac.uk/services/elib/)
  - Digital Library Federation (www.clir.org/diglib/)

Production Issues
- Services
- Collections
  - Selecting
  - Acquiring
  - Organizing
  - Providing Access
  - Preserving
These are all the classic activities of libraries through the centuries, but in the digital realm they present both additional opportunities and additional challenges. I will be focusing on the unique aspects of these activities as they relate to building digital libraries, but in most cases all the usual considerations still apply. For example, when selecting material one must still consider the cost of the material, its relative importance for your audience, etc.

Services
- The challenge: providing services when and where they are needed
- Examples:
  - Guides to Internet resources
    - Librarians’ Index - lii.org/
    - KidsClick! - kidsclick.org/
  - Network-based reference
    - Reference 24x7 - 247ref.org/

Selecting Digital Material
The process:
- How do you discover what is available?
- How can you evaluate the quality of resources?
- How can cost effectiveness be determined? (books remain, databases frequently don’t)
Considerations:
- purchase or license agreement
- funding source
- infrastructure required
- staff time to mount and maintain

Selecting Material to Digitize
- Focus on unique materials that are likely to have broad interest
- Build on strengths (seek critical mass)
- Consider infrastructure required
- Consider technical limitations

Acquiring: Digital Collections
The digital acquisition continuum, ordered from less to more responsibility:
- linking
- mirroring
- hosting
- archiving
New procedures and workflows are required: tape loading, scanning, format translation, etc.

Acquiring: Non-Digital Collections
Digitization methods:
- scanner (flatbed, slide, handheld, etc.)
- digital camera:
  - low-resolution - $US300-3,000+
  - high-resolution - $US25,000-35,000+
- Kodak PhotoCD
Additional step for text conversion:
- Optical Character Recognition, or
- re-keying

Acquiring: Image File Formats
- Archival version: high-resolution TIFF
- Online versions:
  - Preview: low-resolution GIF
  - Full: medium-resolution JPEG
  - High: med./high-resolution JPEG or TIFF
- Up-and-coming: MrSID, Flashpix, PNG

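The archival/online split above implies deriving smaller versions from one high-resolution master. The following is a minimal sketch of the sizing arithmetic only, with no image library involved. The tier names follow the slide, but the pixel limits in `DERIVATIVES` and the function names are hypothetical choices made for illustration.

```python
# Hypothetical pixel limits for the online tiers; the slide names the
# tiers and formats but does not give sizes.
DERIVATIVES = [
    ("preview", "GIF", 150),
    ("full", "JPEG", 800),
]


def fit_within(width, height, max_side):
    """Scale (width, height) so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never enlarge the master
    return (
        max(1, round(width * max_side / longest)),
        max(1, round(height * max_side / longest)),
    )


def plan_derivatives(width, height):
    """List the online versions to generate from an archival master."""
    return [
        (name, fmt, fit_within(width, height, max_side))
        for name, fmt, max_side in DERIVATIVES
    ]


plan = plan_derivatives(6000, 4000)
```

Keeping the master untouched and computing every online version from it means the tiers can be regenerated later if the size limits change.
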
Acquiring: Text File Formats
- Original: MS Word, Adobe PageMaker, etc.
- Adobe Acrobat
- Plain text
- HTML
- SGML or XML

Organizing: Naming and Addressing
- Object naming: objects should be named in a fashion that promotes longevity (e.g., stay away from any kind of implied meaning)
- Object addressing:
  - URLs (www.w3.org)
  - DOI/Handles (www.cnri.reston.va.us)
  - PURLs (purl.org)
  - ARKs (www.ckm.ucsf.edu/people/jak/home/)

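One way to read the naming advice above is to mint identifiers that carry no meaning at all, so nothing in the name goes stale when a collection is renamed or moved. The sketch below is an illustration under that assumption, not a scheme prescribed by any of the addressing systems listed. The alphabet, the default length, and the function names are all hypothetical.

```python
import secrets

# Digits plus consonants: leaving out vowels avoids accidentally spelling
# words, and leaving out "l" avoids confusion with the digit 1.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def mint_opaque_id(length=10):
    """Return a random identifier with no implied meaning."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_unique_id(existing, length=10):
    """Mint an identifier not already in `existing`, and record it there."""
    while True:
        candidate = mint_opaque_id(length)
        if candidate not in existing:
            existing.add(candidate)
            return candidate


issued = set()
first = mint_unique_id(issued)
second = mint_unique_id(issued)
```

The meaning (title, collection, date) then lives in the metadata record, where it can be corrected without changing the object's name.
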
Organizing: Metadata
Structured description of an object or collection of objects
Three basic types:
- descriptive (e.g., title, creator, subject): used for discovery
- administrative (e.g., resolution, bit depth): used for managing the collection
- structural (e.g., table of contents page, page 34, etc.): used for navigation

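The three metadata types above can be sketched as a small data structure. This is an illustrative layout only: the class name, field names, and sample values are hypothetical and do not follow any particular standard.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectMetadata:
    # Descriptive: used for discovery.
    descriptive: dict = field(default_factory=dict)
    # Administrative: used for managing the collection.
    administrative: dict = field(default_factory=dict)
    # Structural: ordered component labels, used for navigation.
    structural: list = field(default_factory=list)


def matches(record, term):
    """Discovery search: look only at the descriptive values."""
    needle = term.lower()
    return any(needle in str(value).lower() for value in record.descriptive.values())


# Made-up sample record for illustration.
record = ObjectMetadata(
    descriptive={"title": "Sample pamphlet", "creator": "Unknown", "subject": "Local history"},
    administrative={"resolution": "600 dpi", "bit_depth": "24"},
    structural=["table of contents", "page 1", "page 2"],
)
```

Here `matches(record, "history")` finds the record through its subject, while `matches(record, "600")` does not, because resolution is administrative rather than descriptive.
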
Organizing: Metadata
Appropriate standards or draft standards:
- Collection level:
  - Encoded Archival Description (EAD) - lcweb.loc.gov/ead/
- Item level:
  - MARC
  - Dublin Core - dublincore.org
  - METS - www.loc.gov/standards/mets/

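As a concrete illustration of item-level description, the sketch below serializes a few Dublin Core elements to XML with Python's standard library. The namespace URI is the Dublin Core element set namespace. The `record` wrapper element, the function name, and the sample values are assumptions made for this example, not part of any standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual "dc" prefix


def dublin_core_record(fields):
    """Build an XML string from (element_name, value) pairs.

    Pairs are used instead of a dict because Dublin Core elements may
    repeat, e.g. several subjects for one item.
    """
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = dublin_core_record([
    ("title", "Sample pamphlet"),
    ("creator", "Unknown"),
    ("subject", "Local history"),
    ("subject", "Pamphlets"),
])
```
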
Providing Access
- How can we make our resources easily available to a diversity of users with a multiplicity of purposes?
- How can we integrate access to both print and digital resources?
- How can we interoperate with other digital collections?

Preserving
Accepted preservation methods:
- acid-free paper
- microfilm
- photographic reproduction
The digital preservation strategy:
- Storing
- Refreshing
- Migrating
The single most important aspect: institutional commitment

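Refreshing means copying stored files onto new media. The slide does not say how to confirm that a refreshed copy is intact; comparing checksums before and after the copy is one common approach, sketched here as an assumption. The function names are hypothetical.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source, destination):
    """Copy a file to new storage and verify the copy bit for bit."""
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"checksum mismatch after copying {source}")
    return actual  # worth recording as administrative metadata
```

A call such as `refresh("master.tif", "/new-media/master.tif")` (paths are placeholders) returns the digest on success and raises if the copy differs from the original.
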
Interoperability
- The capability of two or more different digital collections to be used as one in a transparent fashion
- One example: Open Archives Initiative: http://www.openarchives.org/
- Requires standards (at minimum) or a common platform

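To make the Open Archives Initiative example concrete, the sketch below builds a metadata harvesting request in the style of the OAI protocol (OAI-PMH), where a harvester sends a `verb` and arguments as query parameters and `oai_dc` is the Dublin Core format every repository is expected to support. No network call is made. The base URL and the function name are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build a ListRecords request URL for a repository's base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, used only for illustration.
url = list_records_url("https://repository.example.org/oai", from_date="2001-01-01")
```

Because every participating collection answers the same request format, one harvester can gather records from many collections and present them as one, which is the standards route to interoperability named on the slide.
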
How to Keep Current
Electronic Discussions:
- DIGLIB: www.ifla.org/II/lists/diglib.html
- Web4Lib: sunsite.berkeley.edu/Web4Lib/
- XML4Lib: sunsite.berkeley.edu/XML4Lib/
Publications:
- “Digital Libraries” column in LJ: libraryjournal.reviewsnews.com
- D-Lib Magazine: www.dlib.org
- Current Cites: sunsite.berkeley.edu/CurrentCites/
- RLG DigiNews: www.rlg.org/preserv/diginews/