1
ISP 433/533 Week 8 IR in libraries
2
Goal Universal Access to Information Vannevar Bush, 1945 article “As We May Think” Memex “A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.”
3
History 1970s - commercial retrieval systems –Search remote databases to provide reference services 1980s – online public access catalog (OPAC), full text files –Provide online access to end users 1990s – digital library programs, WWW 2000s - ?
4
Bibliographic Databases Chemical Abstracts (CA), Engineering Index, MEDLINE, PsycINFO, etc. Manually selected, indexed, abstracted and entered into system Record format depends on field –Controlled vocabulary
5
Database Vendors DIALOG, LEXIS-NEXIS, OCLC, Wilson etc. Provide a common search interface Search on multiple bibliographic databases –Cross-databases search Mostly Boolean retrieval Cater to professional search intermediaries, e.g. reference librarians
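The Boolean retrieval mentioned above can be illustrated with a minimal sketch. It assumes a tiny in-memory set of records and whitespace tokenization; the record texts and function names are invented for illustration and do not reflect how any vendor's system is implemented.

```python
from collections import defaultdict


def build_index(records):
    """Map each lowercased term to the set of record ids that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for term in text.lower().split():
            index[term].add(rec_id)
    return index


def boolean_and(index, *terms):
    """Records containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def boolean_or(index, *terms):
    """Records containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


def boolean_not(index, all_ids, term):
    """Records that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Hypothetical bibliographic records (id -> title words).
records = {
    1: "digital library architecture",
    2: "boolean retrieval in library catalogs",
    3: "digital object repository",
}
index = build_index(records)

# "library AND NOT digital"
result = boolean_and(index, "library") & boolean_not(index, records, "digital")
```

A query either matches a record or it does not; there is no ranking, which is one reason such systems catered to trained search intermediaries.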
6
OPACs Provide patrons access to library holdings Search by author, title, call number, subject heading, keywords Machine-Readable Cataloging (MARC) Boolean search Web interface to legacy systems OPACs at Albany
7
Digital Libraries DL is a collection of information that is both digitized and organized - Lesk
8
How do DLs differ? Vs. traditional bibliographic databases and OPACs –Extension and superset –Provide both metadata and data –New technology Vs. WWW –Organized –Tightly controlled, with a targeted customer set
9
Vs. Traditional Library Physical objects –You have it, I can’t have it –Travel to access –Expensive to maintain –Anything else? TL doesn’t collect “Grey Literature” –technical reports, government reports, unedited proceedings etc.
10
Converting to Digital Format Scanning –basically “photographing” a page Optical Character Recognition (OCR) –generally done when scanning: additional software deduces the text content from the photographed page (“guesses the words”) Keying –retyping it all back in... All too time-consuming and $$$! Best to avoid conversion altogether if possible
11
Better Way Publishing with a DL in mind Publishing in electronic form What format?
Period | Archival | Original | Intermediate | Presentation
Past | TIFF | - | - | GIF, JPEG
Present/Future | XML, RTF | Word, TeX/LaTeX | RTF | PS, PDF, HTML
12
DL Architecture A Framework for Distributed Digital Object Services –Kahn/Wilensky Framework (KWF) digital objects (DOs) –a unit of exchange for the DL with a particular data structure and characteristics repository –the place where DOs live handles –a unique, persistent name for a DO
13
Kahn/Wilensky Framework
14
Digital Objects Typed data: –E.g., type: computer-science-tech-report, bit-sequence… –with metadata: author, institution, series, etc. Composite DOs: –a DO with data of type digital-object –composite DOs can be used to collect similar works together, e.g. a composite DO that contains a DO for each work of Shakespeare...
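The typed and composite digital objects described above might be modeled roughly as follows. This is a sketch under assumptions: the class name, field names, and the `example.lib/...` handles are hypothetical and are not taken from the Kahn/Wilensky paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class DigitalObject:
    handle: str                                   # unique, persistent name
    data_type: str                                # e.g. "bit-sequence" or "digital-object"
    data: Union[bytes, List["DigitalObject"]]     # raw bits, or child DOs if composite
    metadata: Dict[str, str] = field(default_factory=dict)

    def is_composite(self) -> bool:
        """A composite DO is one whose data is itself of type digital-object."""
        return self.data_type == "digital-object"

    def leaves(self):
        """Yield every non-composite DO reachable from this one."""
        if self.is_composite():
            for child in self.data:
                yield from child.leaves()
        else:
            yield self


# Hypothetical handles and contents, for illustration only.
hamlet = DigitalObject("example.lib/hamlet", "bit-sequence", b"...",
                       {"author": "Shakespeare"})
macbeth = DigitalObject("example.lib/macbeth", "bit-sequence", b"...",
                        {"author": "Shakespeare"})
collected = DigitalObject("example.lib/shakespeare-works", "digital-object",
                          [hamlet, macbeth], {"series": "Collected works"})
```

Because a composite DO is just another DO, collections can nest to any depth without a separate collection type.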
15
Handles Handles can be thought of as a Uniform Resource Name (URN) implementation http://www.handle.net/ contains info about the Handle System –persistence –location independence Handles are of the general form: GlobalAuthority.LocalAuthority/LocallyUniqueString or, for example: NASA.LaRC/tm112871 Possible project – evaluate various URN implementations (e.g. Handle, PURL, DOI)
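To make the general form above concrete, here is a small sketch that splits a handle into its three parts. It follows only the form shown on the slide (GlobalAuthority.LocalAuthority/LocallyUniqueString); it does not claim to cover the full Handle System syntax, and the names `Handle` and `parse_handle` are made up for this example.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    global_authority: str
    local_authority: str
    local_name: str


def parse_handle(text: str) -> Handle:
    """Split 'Global.Local/UniqueString' into its three components."""
    # Everything before the first "/" is the naming authority.
    authority, slash, local_name = text.partition("/")
    if not slash or not local_name:
        raise ValueError(f"missing '/LocallyUniqueString' in {text!r}")
    # The first "." separates the global authority from the local one.
    global_auth, dot, local_auth = authority.partition(".")
    if not global_auth or not dot or not local_auth:
        raise ValueError(f"expected 'Global.Local' authority in {text!r}")
    return Handle(global_auth, local_auth, local_name)


parsed = parse_handle("NASA.LaRC/tm112871")
```

Nothing in the name says where the object is stored, which is what makes location independence possible: a resolver maps the name to a current location.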
16
Repository Access Protocol (RAP) “Protocol” may be misleading; it's really just the skeleton for a protocol RAP is designed to be simple –repositories themselves should be simple KWF defines 3 basic operation classes: –ACCESS_DO –DEPOSIT_DO –ACCESS_REF: returns a reference to the repository server; this is the catch-all operation for all meta-services... –More operations were defined in implementations
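The three operation classes can be pictured as methods on a toy in-memory repository. This is only a sketch: the method signatures, the stored record layout, and the shape of the value returned by `access_ref` are assumptions for illustration, since the framework defines a skeleton rather than a concrete API.

```python
class Repository:
    """Toy repository exposing the three KWF operation classes."""

    def __init__(self, name):
        self.name = name
        self._store = {}  # handle -> {"data": ..., "metadata": ...}

    def deposit_do(self, handle, data, metadata=None):
        """DEPOSIT_DO: store a digital object under its handle."""
        if handle in self._store:
            raise ValueError(f"handle already in use: {handle}")
        self._store[handle] = {"data": data, "metadata": dict(metadata or {})}
        return handle

    def access_do(self, handle):
        """ACCESS_DO: retrieve a stored digital object by handle."""
        return self._store[handle]

    def access_ref(self):
        """ACCESS_REF: describe the repository itself (meta-services)."""
        return {
            "repository": self.name,
            "operations": ["ACCESS_DO", "DEPOSIT_DO", "ACCESS_REF"],
        }


# Hypothetical repository name; the handle is the slide's own example.
repo = Repository("example-repo")
repo.deposit_do("NASA.LaRC/tm112871", b"report bytes", {"series": "TM"})
obj = repo.access_do("NASA.LaRC/tm112871")
```

Keeping the interface this small matches the slide's point that repositories themselves should be simple; richer services sit on top.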
17
DL Points The underlying architecture should be separate from the content stored in the library Names and identifiers are the basic building blocks of the digital library Digital library objects are more than collections of bits Users want intellectual works, not digital objects
18
5S Model Streams Structures Spaces Scenarios Societies
19
Many, many research projects Multilingual Multimedia Structured documents Distributed collections/federated search User interface Institution: creation, access, and use
20
Commercial DL Journal Storage Project –http://www.jstor.org/ –started as a University of Michigan project funded by the Andrew W. Mellon Foundation, now a commercial organization Roughly 100 journals –mostly humanities, social science, math, economics Only WWW access –keeps a list of “allowed” IP names / addresses Provides only images for the pages OCR is done, but the results are used only for searching, not for display to the user
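Access control by a list of “allowed” addresses, as described above, can be sketched with Python's standard `ipaddress` module. The networks below are reserved documentation ranges chosen for illustration; they are assumptions, not JSTOR's actual configuration, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical subscriber networks (RFC 5737 documentation ranges).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # e.g. a campus network
    ipaddress.ip_network("198.51.100.17/32"),  # e.g. a single proxy host
]


def is_allowed(client_ip: str) -> bool:
    """Return True if the client address falls inside any allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

The design ties access to where the request comes from, not to who the user is, so anyone on a subscribing institution's network gets in without a personal login.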