Published by Kailee Hutter. Modified over 9 years ago.
1
Overwhelmed by Large-scale Digitization Projects
Xiaocan (Lucy) Wang, Digital Repository Librarian
Eric Holt, University Archivist
Cunningham Memorial Library, Indiana State University
2
Agenda
Project background
Implementation: Equipment, Software choices, Process, Ingestion, Workflow
Outcome
Lessons learned
Conclusion
3
Project Background Indiana State University
4
Project Background ETD (electronic theses and dissertations)
ETD Digital Initiative 2010 and onward Access
5
Project background (cont.)
RTD (retrospective theses and dissertations)
Number: 3,802
Where: Archives + Library basement
Condition: most in usable condition, but…
Access
6
Project Background (cont.)
Purposes
Centralize: ETD & RTD
Improve access, search and retrieval
Support teaching, learning and research
Improve preservation
7
Project Background (cont.)
Considerations: Format, Copyright, Privacy
8
Equipment: BookDrive DIY
9
Disclosure
I am not currently, and have not previously been, an employee of the corporations whose products I discuss.
I am not compensated for my comments or opinions.
An older software version is being used.
10
Capture New Book window
11
Capture in action
12
Batch entry
13
IrfanView
14
GIMP
Open source equivalent to Photoshop
Batch processing requires an additional plugin
Supervisor unfamiliarity
15
Photoshop
Can record an action to perform batch processing
Graphical interface while setting up the recorded action
18
Changing DPI
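The arithmetic behind a DPI change can be sketched as below. This is only an illustration: the letter-size page and the 300 and 150 DPI values are assumptions, not settings reported for the project, and the function names are hypothetical.

```python
from __future__ import annotations


def pixels_for(inches: float, dpi: int) -> int:
    """Pixels needed to cover a physical length at a given resolution."""
    return round(inches * dpi)


def resample_size(width_px: int, height_px: int,
                  old_dpi: int, new_dpi: int) -> tuple[int, int]:
    """Pixel dimensions after resampling so the printed size stays the same."""
    scale = new_dpi / old_dpi
    return round(width_px * scale), round(height_px * scale)


# Assumed example: an 8.5 x 11 inch page captured at 300 DPI.
width, height = pixels_for(8.5, 300), pixels_for(11, 300)
print(width, height)                            # 2550 3300
print(resample_size(width, height, 300, 150))   # (1275, 1650)
```

Halving the DPI halves each dimension, so the pixel count (and roughly the uncompressed size) drops to a quarter.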
24
Color, Grayscale, B/W
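A rough sketch of what the three modes mean per pixel, and why the choice matters for storage. The luma weights are the common BT.601 values, and the threshold of 128 and the page dimensions are assumptions for illustration; the image editors named in these slides may use different conversions.

```python
# Bits stored per pixel in each mode (uncompressed).
BITS_PER_PIXEL = {"color": 24, "grayscale": 8, "bw": 1}


def to_grayscale(rgb):
    """Convert an (R, G, B) pixel, 0-255 each, to one 0-255 gray value."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def to_bitonal(gray, threshold=128):
    """Map a gray value to 1 (white) or 0 (black) with a fixed threshold."""
    return 1 if gray >= threshold else 0


def raw_size_bytes(width, height, mode):
    """Uncompressed size of a page image in the given mode."""
    return width * height * BITS_PER_PIXEL[mode] // 8


# Assumed page of 2550 x 3300 pixels.
for mode in BITS_PER_PIXEL:
    print(mode, raw_size_bytes(2550, 3300, mode))
```

Before compression, a grayscale page is a third the size of a color one, and a B/W page is an eighth the size of grayscale.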
25
PDF Compression
All items being converted are compressed
Some formats compress better than others
Compression artifacts can also become visible
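The point that some content compresses better than others can be illustrated with a small sketch. It uses zlib from the Python standard library as a stand-in; the compression schemes actually applied during PDF conversion are not stated in the slides. Because zlib is lossless, this shows only size differences, not visible artifacts, and the synthetic "pages" are assumptions.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data, 9))


random.seed(0)
n = 100_000
flat_page = bytes([255]) * n                                  # uniform white area
noisy_page = bytes(random.randrange(256) for _ in range(n))   # stand-in for grainy texture

print(f"flat:  {compression_ratio(flat_page):.1f}x")
print(f"noisy: {compression_ratio(noisy_page):.2f}x")
```

Clean, uniform pages shrink dramatically, while noisy content barely compresses at all.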
27
Original image of page is visible
Searchable text layer is hidden
29
First Review
All pages present?
All text legible?
No shadows covering text?
Page in focus?
Essential color elements retained?
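One way to record the first-review questions per volume is sketched below. The class and field names are hypothetical; the slides do not say how review results were tracked.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class FirstReview:
    """One boolean per first-review question for a single volume."""
    all_pages_present: bool
    all_text_legible: bool
    no_shadows_on_text: bool
    pages_in_focus: bool
    essential_color_retained: bool

    def failed_checks(self) -> list[str]:
        """Names of the questions answered 'no'."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def passed(self) -> bool:
        """True only when every question was answered 'yes'."""
        return not self.failed_checks()


review = FirstReview(True, True, False, True, True)
print(review.passed())         # False
print(review.failed_checks())  # ['no_shadows_on_text']
```

A volume that fails any check would go back for recapture or reprocessing before PDF conversion.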
30
PDF/A
Copy saved to Archives server
Only accessible to staff
33
Final Review and cleanup
Review metadata
Correct if necessary
Approve and publish
Remove original camera images, processed images, and extra copies of PDF
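The cleanup step could be scripted along these lines. This is a sketch under assumptions: the folder names, the file name, and the rule that nothing is deleted before approval are all hypothetical, since the slides do not describe the project's directory layout.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical working-folder names for one volume.
WORKING_DIRS = ("camera_raw", "processed", "pdf_drafts")


def clean_up(volume_dir: Path, approved: bool) -> list:
    """Delete a volume's working folders, but only once it is approved."""
    removed = []
    if not approved:
        return removed
    for name in WORKING_DIRS:
        target = volume_dir / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    volume = Path(tmp) / "thesis_0001"
    for name in WORKING_DIRS:
        (volume / name).mkdir(parents=True)
    (volume / "thesis_0001_pdfa.pdf").write_bytes(b"")  # placeholder for the kept PDF/A
    print(clean_up(volume, approved=False))   # nothing removed before approval
    print(clean_up(volume, approved=True))    # working folders removed
    print(sorted(p.name for p in volume.iterdir()))
```

Gating deletion on approval keeps the camera originals available if the review sends a volume back.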
34
Workflow Imaging original theses or dissertations
35
Workflow (cont.) Processing image files
36
Workflow (cont.) Converting to PDF/A
37
Workflow (cont.) Publishing on ISU IR
38
Outcomes
Volumes finished: 848
Average volume size: 96 pages
Average student time: 1.3 hours
Average supervisor time: 5-10 minutes
Average file size: 5.5 MB
Total disk space: 4.6 GB
Approximate cost: $15-18
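The reported averages can be cross-checked with simple arithmetic. The sketch below uses only the figures from this slide; the variable names are made up, and the totals for pages and student hours are derived values, not numbers reported by the presenters.

```python
volumes = 848
avg_pages = 96
avg_file_mb = 5.5
avg_student_hours = 1.3

total_pages = volumes * avg_pages                    # 81,408 pages
total_mb = volumes * avg_file_mb                     # 4,664 MB
total_student_hours = volumes * avg_student_hours

print(f"Pages imaged:  {total_pages:,}")
print(f"Disk space:    {total_mb / 1024:.2f} GiB ({total_mb / 1000:.2f} GB decimal)")
print(f"Student hours: {total_student_hours:,.0f}")
```

The computed disk space of roughly 4.6 GB is in line with the total reported on the slide.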
39
Worth it?
Centralize
Improve access via: digital repository, search engines, digital repository registries, WorldCat
40
Worth it? (cont.)
Support teaching, learning and research
Improve preservation strategies: multiple digital copies, backup, bitstream preservation, distributed preservation network via MetaArchive Cooperative
41
Lessons learned
Control quality: monochrome and grayscale
Supervise students
Add MARC 856 field
Secure continued funds
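The MARC 856 field (Electronic Location and Access) is what links a catalog record to the online copy. A minimal sketch of formatting one for display follows. The URL and note text are placeholders, the function name is hypothetical, and the indicator choice (4 for HTTP access, 1 for a version of the resource) is an assumption about how a digitized print thesis would be coded, not something stated in the slides.

```python
def marc_856(url: str, note: str = "") -> str:
    """Format a MARC 856 field in a common plain-text display style.

    Indicators '41': access method HTTP, link points to a version of the resource.
    $u holds the URI; $z holds an optional public note.
    """
    field = f"856 41 $u {url}"
    if note:
        field += f" $z {note}"
    return field


# Placeholder URL, not a real repository address.
print(marc_856("https://example.org/etd/1234", "Full text available online"))
```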
42
Conclusion
Complex
Various issues: technical standards, quality control, format selection, in-house vs. outsourcing, metadata, delivery, preservation, rights management, workflow development
Funding
44
Contact info
Xiaocan (Lucy) Wang
Eric Holt