Published by Dale Warren. Modified over 9 years ago.
1
UC Libraries and the Implications of Mass Digitization Robin L. Chandler User’s Council May 11, 2007
2
Seek to achieve in this talk: Status report on UC Libraries’ mass digitization projects Impact of mass digitization on our collections and our users
3
UC Libraries’ Mass Digitization Projects Overview of two projects –Microsoft/Internet Archive –Google Books Look at Operations April 2007 status report on scanning
4
Understanding Participant Roles UC Libraries –supply & curate books –preserve digital files created –supply onsite scanning facilities when appropriate Third-parties (Google, Microsoft, Yahoo) –provide funding for book scanning –manage digitization vendor
5
Microsoft / Internet Archive Production scanning began April 2006 Internet Archive: Digitization Agent Projected scope 100 K books (public domain) per year –Scanning books from all campus libraries Scanning Centers (20 scanning machines) –Location: UC at NRLF and SRLF
6
Google Production scanning began October 2006 –Scanning books from NRLF currently Projected Scope –2.5 million books during 6-year period –Public domain / in-copyright Scanning Center –Books transported to offsite scanning facility –Over 3K books per day
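As a rough check on how the projected scope relates to the stated throughput, the arithmetic can be sketched as below. The slide gives only "2.5 million books", "6 year period" and "over 3K" per day; treating 3K as exactly 3,000 and assuming 250 scanning days per year are illustrative assumptions, not figures from the talk.

```python
# Figures from the slide; "over 3K per day" is treated as exactly 3,000 (a lower bound).
TOTAL_BOOKS = 2_500_000
PROJECT_YEARS = 6
STATED_DAILY_RATE = 3_000

# Average number of books that must be scanned each year to meet the projected scope.
books_per_year = TOTAL_BOOKS / PROJECT_YEARS

# Number of scanning days needed if the stated daily rate were sustained.
days_at_stated_rate = TOTAL_BOOKS / STATED_DAILY_RATE


def required_daily_rate(scanning_days_per_year):
    """Daily rate needed to hit the scope, for an ASSUMED number of scanning days per year."""
    return books_per_year / scanning_days_per_year


if __name__ == "__main__":
    print(f"Books per year needed: {books_per_year:,.0f}")
    print(f"Days needed at {STATED_DAILY_RATE:,}/day: {days_at_stated_rate:,.0f}")
    # 250 scanning days per year is a hypothetical working calendar.
    print(f"Daily rate needed at 250 days/year: {required_daily_rate(250):,.0f}")
```

Under these assumptions the stated rate is comfortably above what the six-year scope requires, which leaves room for rejected books and downtime.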
7
Workflow Steps (1) #1 Project management #2 Select, retrieve, inspect, mass charge / physical charge, physical transfer #3 Sharing bibliographic records (over 3K daily) #4 Digitization: creating content files & metadata –JPEG 2000, PDF, OCR –Metadata created during scanning including image coordinates
8
Workflow Steps (2) #5 Mass discharge / manual charge; books returned to shelves #6 Quality control on digital files prior to ingest #7 Ingest of metadata and content files for preservation storage #8 Enhance union and local catalog records with link to hosted content
9
Motives: UC Libraries exploring models Collection Management: Digital reformatting can help support our efforts to build shared print collections Curating through Collaboration: Digitization of local materials creates access (for our patrons) to third-party materials not currently available Funding Reallocation: Funds invested in licensing online collections of out of copyright materials could be reallocated to digital reformatting our unique content
10
Mass Digitization Collection Advisory Group (MDCAG) Approved by University Librarians First meeting March 2007 Charge: –Develop process for selection of book collections for scanning from across UC Libraries Collection Development Committee (CDC) will approve collection selections
11
April 2007 Status Report Google –249,485 books transferred –235,633 books scanned –11,320 books rejected –55,264 books live Microsoft/Internet Archive –84,315 books transferred –58,543 books scanned / books live –25,772 books rejected
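The counts on this slide can be turned into rates for easier comparison between the two projects. The sketch below uses only the reported figures; reading "58,543 books scanned / books live" as meaning the scanned and live counts are equal is an interpretation, and the field names are chosen here for illustration.

```python
# Counts copied from the April 2007 status report slide.
# For Microsoft/Internet Archive, "scanned / live" is read as one shared figure (an assumption).
STATUS = {
    "Google": {
        "transferred": 249_485, "scanned": 235_633, "rejected": 11_320, "live": 55_264,
    },
    "Microsoft/Internet Archive": {
        "transferred": 84_315, "scanned": 58_543, "rejected": 25_772, "live": 58_543,
    },
}


def summarize(counts):
    """Derive simple rates from one project's counts."""
    transferred = counts["transferred"]
    return {
        # Share of transferred books that were rejected.
        "rejection_rate": counts["rejected"] / transferred,
        # Share of transferred books that were scanned.
        "scan_rate": counts["scanned"] / transferred,
        # Transferred books that are neither scanned nor rejected (presumably in process).
        "unaccounted": transferred - counts["scanned"] - counts["rejected"],
        # Share of scanned books that are already live.
        "live_share_of_scanned": counts["live"] / counts["scanned"],
    }


SUMMARY = {name: summarize(counts) for name, counts in STATUS.items()}

if __name__ == "__main__":
    for name, s in SUMMARY.items():
        print(f"{name}: rejected {s['rejection_rate']:.1%}, "
              f"scanned {s['scan_rate']:.1%}, "
              f"unaccounted {s['unaccounted']:,}, "
              f"live {s['live_share_of_scanned']:.1%} of scanned")
```

On these figures the rejection rate is about 4.5% for Google and about 30.6% for Microsoft/Internet Archive. The slide does not explain the difference, so the numbers should be read as a snapshot rather than a like-for-like comparison.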
12
Success due to our Systemwide collaboration! UCB & UCLA Libraries / Northern and Southern Regional Library Facility teams UC Library systemwide groups: ULs, SOPAG CDC, PAG, HOPS, Bibliographer Groups Mass Digitization Collection Advisory Group (MDCAG) CDL Programs: Bibliographic Services, Collections, Data Acquisitions, Digital Preservation Repository
13
Microsoft: Sample Book
14
Internet Archive: Sample Book
15
Google: Sample Book
16
Impacts of Mass Dig Will we re-define our collections? How should we make collections available to our users?
17
Mass Dig: Collections & Users: All Libraries can be bigger than before –Leveraging the collections of other libraries to bring content to our users Leveraging our collections à la the Long Tail –Libraries can learn from Netflix Digitize local content – we all have special stuff! –Unique holdings support specialized disciplines Prepare: demand for the physical item may increase –Digital access may increase relevance of analog Book discovery increasingly happens outside the library –Information discovery (Google, MSN, Yahoo!) –Bibliographic discovery (Amazon)
18
Our Users Today Faculty, Graduates and Undergrads Working in a range of disciplines Seeking efficiencies Define their tool space Resource needs are diverse –Can vary day by day They judge a resource’s worth
19
Dawn of the Embedded Library (1) Web services embed library content into the browsing experience of users –Enable discovery, locate, request, and delivery –Library content must be exposed to aggregators Examples: Library Thing, NCSU’s Catalog WS, LibX Firefox, Google Book Search –integrating web services for users and customizing software –Leveraging Catalog, Open URLs, COinS, APIs, etc.
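To make the COinS and OpenURL items concrete, here is a minimal sketch of how a catalog page might embed a COinS span that tools such as LibX can detect. The key/value names follow the OpenURL (Z39.88-2004) book format; the function name and the sample record are hypothetical, and a production catalog would likely include more fields.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, year):
    """Return a COinS <span> whose title attribute carries an OpenURL ContextObject."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),  # metadata format: book
        ("rft.genre", "book"),
        ("rft.btitle", title),
        ("rft.au", author),
        ("rft.date", str(year)),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result
    # so that "&" becomes "&amp;" inside the attribute value.
    query = urlencode(fields)
    return f'<span class="Z3988" title="{escape(query)}"></span>'


if __name__ == "__main__":
    # Hypothetical record, for illustration only.
    print(coins_span("A Sample Title", "Doe, Jane", 1905))
```

The span is empty and invisible to readers; a browser extension or aggregator reads the `title` attribute and resolves it against the user's own library.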
20
Dawn of the Embedded Library (2) Providing user services –Find in a library, POD, download to mobile devices, ILL, order from Amazon, etc. Expose our content to aggregators and consume the data of others –OAI-PMH, SRU, Google Sitemap, Open Search, RSS feeds, mobile device searching
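As one illustration of consuming the data of others, an OAI-PMH harvest is driven by plain HTTP requests with a `verb` parameter. The sketch below only builds request URLs; the base URL is a hypothetical placeholder and the helper name is invented for this example. The verbs and the `metadataPrefix` and `from` arguments come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL.

    Because "from" is a Python keyword, pass it as from_; a trailing
    underscore is stripped from every parameter name.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    # Put the verb first, then the remaining arguments in a stable (sorted) order.
    query = [("verb", verb)] + sorted((k.rstrip("_"), v) for k, v in params.items())
    return f"{base_url}?{urlencode(query)}"


if __name__ == "__main__":
    # Hypothetical repository endpoint, for illustration only.
    base = "https://repository.example.edu/oai"
    print(oai_request_url(base, "Identify"))
    print(oai_request_url(base, "ListRecords", metadataPrefix="oai_dc", from_="2007-04-01"))
```

A harvester would fetch these URLs, parse the XML responses, and follow any `resumptionToken` to page through large result sets; that part is omitted here.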
21
Library Thing: Catalog Your Books Online – social bookmarking
22
NCSU’s CatalogWS
23
LibX: Providing direct access to your library’s resources
27
Mass Dig & New Library Services What systems are required to extract meaning from massive text collections? –Machine translation, data mining, etc. What new modes of reading, representation and understanding are needed to interact with texts? –Linguistic, visual, and statistical processing What collaborations between librarians, computer scientists and scholars are needed to do this exploration? –Standards, search queries, visualization, social networks
28
Epilogue: Mr. Peabody’s WABAC (wayback) Machine The 1992 Conference on “Technology, Scholarship and the Humanities: The Implications of Electronic Information” asked certain questions: Will scholarship be better if it takes advantage of technology? How will technology affect –The book? –The lecture? –The library? –The classroom?
29
1992: Historical Context Cold War formally ended & US lifted trade sanctions against China Bill Clinton was elected U.S. President Four police officers were acquitted in Rodney King Trial Johnny Carson left the Tonight Show Earth Summit held in Rio de Janeiro CD sales surpassed cassette tapes OPACs and Gopher were in the library and a text-based web browser was first made available to the public…..
30
Technology, Scholarship, & Humanities Conference: Viewpoints (1) Richard A. Lanham, Professor of English, UCLA “ As traditionally taught, each class exists in a temporal, conceptual and social vacuum…but if an electronic library were employed…students could read papers submitted in earlier classes, read scholarly articles on the same topics, read before-and-after examples of revised work, do searches of Shakespearean texts for imagery or rhetorical figuration, and make excerpts of videotaped performances to illustrate their papers – all without going to the campus library. Most importantly, a course like this would have a history and could be accessed by people in other courses; it would constitute a continuing society, its students becoming citizens of a commonwealth”
31
Technology, Scholarship, & Humanities Conference: Viewpoints (2) William Y. Arms, VP Computing Services Carnegie Mellon “The scientific community has long-funded its capital-intensive projects with support from government and industry. In contrast, only 2 percent of humanities research funding comes from the U.S. government. As a result, the humanities can undertake few large, interdisciplinary projects unless the government and other funding agencies perceive the outcome to benefit the entire academic community…..”
32
Thank you Please feel free to contact me at robin.chandler@ucop.edu