HBCU-CUL Digital Imaging Workshop, November 2005 From Analog to Digital Peter Hirtle HBCU-CUL Digital Imaging Workshop II July 31, 2007 (c) Cornell University Library 2005
Session Goal: Workshop participants will understand the issues involved in converting analog materials into digital formats for preservation and access.
Session Objectives: Give an overview of the application of the scanning process in the digitization chain; give an in-depth presentation on the factors affecting image quality; discuss the issues involved in presentation; participate in “reality checks” on lessons learned.
Learner Outcomes: Be able to discuss issues involved in converting analog materials to digital images; compare and contrast three types of image scanning; understand factors affecting image presentation.
Analog to Digital: We’re scanning 2D images (textual and graphical) and 3D images (photographic). Other 3D images are tape-based, such as video and audio.
What are Digital Images? “Electronic photographs” created through scanning or digital photography; sampled and mapped as a grid of dots or picture elements (pixels); each pixel assigned a tonal value (black, white, grays, colors), represented in binary code; code stored or reduced (compressed); read and interpreted to create an analog version.
Digital Image: Also known as a raster or bitmap image.
Three types of scanning: BITONAL, GRAYSCALE, COLOR.
Bitonal Scanning: Information is presented as either black or white; gray shades simulated by clustering black dots; suitable for printed materials.
Grayscale Scanning: Shades of gray represented, normally 8 bits per pixel; suitable for manuscripts, photographs, halftones, and stained documents.
Bitonal vs. Grayscale Capture of Stained Manuscript (figure panels: bitonal scan; grayscale scan)
Color Scanning: Full color range, normally 24 bits per pixel; suitable for documents with significant color information, such as photographs and works of art; helpful in determining a document’s age, physical condition, or previous use.
Digital Image Quality Affected By: document attributes; resolution and threshold; bit depth and dynamic range; image enhancement; compression/format; equipment and performance over time; operator judgment and care.
Important Document Attributes: physical type, size, and presentation; physical condition; document type; the Big Three: detail, tone, color.
Halftone: 1-bit image at 400 dpi (figure panels: scanned from original on H-P ScanJet 3c; scanned from 2N film on Sunrise SRI 50)
Matching Informational Content to Scanning Approach: Quality vs. File Size and Cost (figure panels: 600 dpi 1-bit, 2.9 Mb file; 400 dpi 8-bit, 10.3 Mb file; 400 dpi 24-bit, 30.9 Mb file)
Resolution: Determined by the number of pixels used to represent the image; expressed in dots per inch (dpi), measured along one dimension. (figure: zoom in)
Resolution: 100 dpi = 100^2 (100 x 100) or 10,000 dots per square inch; 200 dpi = 200^2 or 40,000 dots; 400 dpi = 400^2 or 160,000 dots. Increasing resolution increases the detail captured and geometrically increases the file size (with the square of the dpi).
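The resolution arithmetic can be checked with a few lines of Python. This is a minimal sketch; the function name `dots_per_square_inch` is an illustrative choice and does not come from the workshop materials.

```python
def dots_per_square_inch(dpi: int) -> int:
    """Number of sample points in one square inch at a given linear resolution."""
    return dpi * dpi


for dpi in (100, 200, 400):
    print(dpi, "dpi ->", dots_per_square_inch(dpi), "dots per square inch")

# Doubling the dpi quadruples the number of dots per square inch,
# and therefore the uncompressed file size for the same document.
ratio = dots_per_square_inch(400) / dots_per_square_inch(200)
print(ratio)
```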
Effects of Resolution (figure panels: 600 dpi; 300 dpi; 200 dpi)
Threshold Setting in Bitonal Scanning: Defines the point on a scale from 0 (black) to 255 (white) at which gray values will be interpreted as either black or white. (figure: scale from 0 to 255 running from black through dark gray and light gray to white)
Effects of Threshold (figure panels: threshold = 60; threshold = 100)
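A small Python sketch may help show what the threshold setting does to individual pixels. It assumes that gray values below the threshold become black and values at or above it become white; scanner software may place the boundary differently. The function name `to_bitonal` and the sample gray values are hypothetical.

```python
def to_bitonal(gray_values, threshold):
    """Map 8-bit gray values (0 = black, 255 = white) to 1-bit values.

    Assumption: values below the threshold become black (0) and values at
    or above it become white (1).
    """
    return [0 if g < threshold else 1 for g in gray_values]


row = [20, 70, 90, 130, 240]  # hypothetical gray values from one scan line

low = to_bitonal(row, 60)    # only the darkest pixel stays black
high = to_bitonal(row, 100)  # the mid-dark grays also turn black
print(low)
print(high)
```

Under this assumption, raising the threshold turns more gray pixels black, so faint strokes and stains both become more visible.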
Defining Detail in Text: Fixed metric: smallest lower case letter. Variables: quality, resolution, feature size. Bitonal QI formula for text: dpi = 3QI/(.039h), where h is the height of the smallest character in mm. QI values: 8 (excellent), 5 (good), 3.6 (marginal), 3 (barely legible). Grayscale/Color QI formula for text: dpi = 2QI/(.039h).
Example: Text page with smallest character measuring 1 mm, which must be fully captured (QI = 8, h = 1). Bitonal scanning: dpi = 3QI/(.039h) = 3(8)/[.039(1)] = 615 dpi. Grayscale: dpi = 2QI/(.039h) = 2(8)/[.039(1)] = 410 dpi.
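The two QI formulas can be wrapped in one helper for quick what-if calculations. This is a sketch only; the function name `required_dpi` and its `bitonal` flag are illustrative and are not part of the workshop materials or of the calculator linked from the workshop.

```python
MM_TO_INCH = 0.039  # conversion factor used in the formulas (1 mm is about 0.039 inch)


def required_dpi(qi: float, h_mm: float, bitonal: bool) -> float:
    """dpi = 3*QI/(0.039*h) for bitonal, 2*QI/(0.039*h) for grayscale or color."""
    factor = 3 if bitonal else 2
    return factor * qi / (MM_TO_INCH * h_mm)


# Worked example: smallest character 1 mm high, excellent quality (QI = 8)
print(round(required_dpi(8, 1, bitonal=True)))   # 615
print(round(required_dpi(8, 1, bitonal=False)))  # 410
```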
http://images.library.uiuc.edu/projects/calculator/
Bit Depth: Determined by the number of binary digits (bits) used to represent each pixel. (figure panels: 8-bit; 24-bit; 1-bit)
Bitonal scanning has a bit-depth of 1: each pixel is represented by one bit, with a tonal range of 2: 0 = black, 1 = white.
Binary Calculations: 2^1 = 2; 2^2 = 4; 2^3 = 8; 2^4 = 16; 2^8 = 256; 2^10 = 1024; 2^12 = 4096; 2^24 = 16.7 million.
Bit Depth: Grayscale is typically 8 bits or more, representing 256 (2^8) levels; color is 24 bits or more, representing 16.7 million (2^24) levels. Example: 8-bit grayscale pixel: 00000000 = black, 11111111 = white.
Bit Depth: Increasing bit depth increases the levels of gray or color that can be represented and arithmetically increases file size; it also affects resolution requirements.
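The relationship between bit depth and the number of representable tones is a power of two, which a short Python sketch can confirm. The function name `tonal_levels` is an illustrative choice.

```python
def tonal_levels(bit_depth: int) -> int:
    """Number of distinct tonal values a pixel with this many bits can hold."""
    return 2 ** bit_depth


for bits in (1, 3, 8, 24):
    print(bits, "bit(s) ->", tonal_levels(bits), "levels")

# An 8-bit grayscale pixel: all bits 0 is black, all bits 1 is white.
black = int("00000000", 2)
white = int("11111111", 2)
print(black, white)
```

Each added bit doubles the number of levels, while the uncompressed file size grows only in proportion to the bit depth.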
Effects of Grayscale on Image Quality (figure panels: 3-bit gray; 8-bit gray)
Tonal Reproduction: Dynamic range: the range of tonal difference between lightest light and darkest dark; the higher the range, the greater the number of potential shades; tone distribution is as important as dynamic range.
Comparing Key Types
Mapping Tones Correctly: Use of Histograms
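A histogram is simply a tally of how many pixels fall at each tonal level. The Python sketch below shows one way such a tally could be computed; the function name `tone_histogram`, the four-bin layout, and the sample pixel values are all hypothetical and are not taken from the workshop or from any particular imaging package.

```python
from collections import Counter


def tone_histogram(pixels, bins=4, max_value=255):
    """Count pixels per equal-width tonal bin, from darkest to lightest."""
    width = (max_value + 1) / bins
    counts = Counter(int(p // width) for p in pixels)
    return [counts.get(i, 0) for i in range(bins)]


# Hypothetical 8-bit image with tones bunched in the shadows and highlights
pixels = [5, 10, 30, 60, 200, 250]
print(tone_histogram(pixels))
```

Empty middle bins, as in this sample, suggest that midtones were not captured or were mapped toward the extremes.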
Color Appearance: Difficult to evaluate; hue, saturation, brightness; translating between analog and digital, between color spaces, between reflected and transmitted light.
Representing Color Appearance (figure panels: balanced color; color shift towards red)
Color Cast
Color Imaging and Tone Distribution (figure panels: limited tones evident in highlights and shadows; balanced tonal distribution)
Image Processing and Enhancement: Image editing to modify or improve an image: filters (brightness, contrast, sharpness, blur); file size reduction (scaling); tone and color correction.
Effects of Filters (figure panels: no filters used; maximum enhancement)
Compression/format: Image quality may be affected by the compression technique used to reduce file size. Image quality is also affected by format support for: bit depth; compression techniques; color management.
Compression/File Format Comparison (figure panels: GIF (lossless), file size 60 KB; JPEG (lossy), file size 49 KB; images courtesy of Edison Papers)
Equipment used and its performance over time: scanners offer a wide range of capabilities to capture detail, tone, dynamic range, and color; scanners with the same stated functionality can produce different results; calibration, age of equipment, and environment are also factors.
Variations in Image Quality due to Scanner Performance (figure panels: 300 dpi, scanner A; 300 dpi, scanner B)
Image Capture: Creating digital files rich enough to be useful over time in the most cost-effective manner.
Why Create High Quality Images? Preservation: original may only withstand one scan. Cost: one scan may be all that is affordable. Access: one scan can be used to derive multiple images.
Calculating File Size: file size = (height x width x bit depth x dpi^2) / 8
File Size Naming Conventions: Represented in bytes; 1 byte = 8 bits; 1Kb ~ 1,000 bytes; 1Mb ~ 1,000Kb; 1Gb ~ 1,000Mb.
Example: grayscale image file (10” x 7”, 300 dpi, 8-bit): file size = (height x width x bit depth x dpi^2) / 8 = (10 x 7 x 8 x 300^2) / 8 = 6,300,000 bytes = 6.3 megabytes.
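The file size formula translates directly into a small Python function. This is a sketch under the formula's own assumptions (dimensions in inches, uncompressed data, no file header overhead); the function name `file_size_bytes` is illustrative.

```python
def file_size_bytes(height_in: float, width_in: float, bit_depth: int, dpi: int) -> float:
    """Uncompressed size: height x width x bit depth x dpi^2, divided by 8 bits per byte."""
    return height_in * width_in * bit_depth * dpi ** 2 / 8


# Grayscale example: 10 x 7 inch original, 8-bit, 300 dpi
size = file_size_bytes(10, 7, 8, 300)
print(size)              # 6300000.0 bytes
print(size / 1_000_000)  # 6.3, using roughly 1,000,000 bytes per megabyte
```

The same function can be reused for any combination of page size, bit depth, and resolution when planning storage for a project.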
Example: Bitonal Image (10” x 8”, 200 dpi, 1 bit): file size = (height x width x bit depth x dpi^2) / 8 = (10 x 8 x 1 x 200^2) / 8 = 3,200,000 bits / 8 = 400,000 bytes = 0.4 megabytes.
Reality Check 1: Determine the file size of a letter size page captured bitonally at 200 dpi.
Answer: (8.5 x 11 x 200^2)/8 = 468Kb or .47Mb
Reality Check 2: Determine the file size of an 8” x 10” photograph scanned in color at 200 dpi.
Answer: (8 x 10 x 24 x 200^2)/8 = 9.6 million bytes or 9.6Mb
Estimating File Size for Compressed Images: compressed file size = file size / level of compression. Example: color photo compressed 20:1: 9.6Mb / 20 = .48Mb
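The compression estimate is a single division, sketched here in Python. The function name `compressed_size` is illustrative, and the ratio is treated as a given input; the ratio actually achieved depends on the image content and the compression settings.

```python
def compressed_size(uncompressed_bytes: float, ratio: float) -> float:
    """Estimate compressed size for a compression level of ratio:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return uncompressed_bytes / ratio


# Color photo of 9.6 million bytes compressed 20:1
print(compressed_size(9_600_000, 20))  # 480000.0 bytes
```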
Aligning Document Attributes with Digital Requirements: Identify key document attributes (detail, tone, color); characterize them, if possible through objective measurements; determine quality requirements and tolerance levels; translate between analog and digital and between scanning requirements and scanning performance.
Aligning Document Attributes with Digital Requirements: Calibrate scanner, calibrate the rest of the system; control lighting and environment; objective and subjective evaluation: targets and software; evaluating images against the originals.
Aligning Document Attributes with Digital Requirements: Minimize post-processing in the master image; save in TIFF, use RGB for color, and avoid lossy compression; maintain scanning metadata.
One Size Does Not Fit All! Different document types will require different scanning processes; the more complex the document, the higher the conversion/access requirements, the larger the file, and the greater the expense; scan the original whenever possible; follow project recommendations.