Digitisation Mick Eadie Visual Arts Data Service.

Presentation transcript:


Source – Digitisation – Resource
Identify sources to digitise
Choose digitisation method
Digital objects
Choose data model
The ‘input channels’ of digitisation (keyboard, scanner, etc.) are narrow and can only capture a partial representation of the original source

Digitisation Pathways
[Diagram. Stages: Original Source; Copy of Source (photocopy, photograph, recording); Item to Digitise; Digital Object (2D image, 3D model, sound, moving image); Digital Resource. Capture methods: scan, digital camera, 3D scan, digital audio/movie recording, OCR, line tracing.]

Elements of a Digital Resource
Users: knowledge, experience, culture
Environment: hardware, software (OS), (network)
Digital objects: binary data, data models, relationships
The environment of a digital resource often receives the most attention, but it is the users and digital objects that are most important
Hardware and software selection should be based on the needs of the users and the types of digital objects to be used
Fit for purpose: digital objects must be created with their intended use as the paramount consideration

Digital Objects
Text
– Data stored as a stream of characters (numbers, letters, etc.)
Image
– Data primarily understood as a spatial pattern or shape
– Bitmap and vector images / raster (bitmap) and vector spatial data
Time
– Data primarily understood as a sequence through time
– Audio and/or video (multimedia)

Text
Essentially, numeric codes used by the computer to represent specific characters
– Fonts must be designed to provide a visual image for each code
– Software must be designed to interpret the codes
ASCII is the best-known text encoding scheme
– 7 bits per character = 128 unique characters, primarily the Latin alphabet; 8-bit extensions use 1 byte per character = 256 unique characters
– Other characters are handled by having multiple code pages
– Each code page uses the same codes to represent different characters
Unicode is the replacement for ASCII
– Originally designed around 2 bytes per character = 65,536 codes; the standard now defines over a million code points, stored using encodings such as UTF-8 and UTF-16
– Can represent characters from different alphabets simultaneously as each character has a unique code
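The idea that text is a stream of numeric codes can be checked directly. The sketch below is illustrative only: the sample characters, the choice of UTF-8 and UTF-16 as encodings, and the two code pages (Latin-1 and ISO 8859-7) are assumptions picked for the example, not part of the original slides.

```python
# Each character is stored as a numeric code; software maps codes back to glyphs.
sample = "Aé€"  # hypothetical sample text

for ch in sample:
    code_point = ord(ch)              # the code assigned to this character
    utf8 = ch.encode("utf-8")         # variable length: 1 to 4 bytes per character
    utf16 = ch.encode("utf-16-be")    # 2 bytes for each of these characters
    print(f"{ch!r}: U+{code_point:04X}  UTF-8={utf8.hex()}  UTF-16={utf16.hex()}")

# With 8-bit code pages, the same byte value stands for different characters.
byte = bytes([0xE9])
latin1_char = byte.decode("latin-1")     # Western European code page
greek_char = byte.decode("iso-8859-7")   # Greek code page
print(latin1_char, greek_char)
```

Because a code page must be known before a byte can be interpreted, a file that does not record its encoding can be misread; a scheme with one unique code per character avoids that ambiguity.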

Text Transcription
Advantages:
– Low overhead to start transcription: person, keyboard, document
– Handwritten documents can be transcribed
– A transcriber can follow complex, disorganised documents
Issues:
– Slow and expensive
– Human error
Good practice:
– Double entry (two transcribers both enter the same document and the transcriptions are checked for differences)
– Keep copies of originals with transcriptions (preferably as digital images, as this makes post-transcription checking simple and quick)
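The double-entry check can be automated by comparing the two transcriptions line by line. This is a minimal sketch using Python's standard `difflib` module; the function name `transcription_differences` and the sample text are hypothetical.

```python
import difflib

def transcription_differences(first, second):
    """Return (line_number, lines_in_first, lines_in_second) for each disagreement."""
    a_lines = first.splitlines()
    b_lines = second.splitlines()
    matcher = difflib.SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Line numbers are reported 1-based, relative to the first transcription.
            differences.append((i1 + 1, a_lines[i1:i2], b_lines[j1:j2]))
    return differences

# Hypothetical transcriptions of the same two-line document.
entry_a = "John Smith, aged 34\nborn in Leith"
entry_b = "John Smith, aged 84\nborn in Leith"
for line_number, from_a, from_b in transcription_differences(entry_a, entry_b):
    print(f"line {line_number}: {from_a} vs {from_b}")
```

Each reported line still needs a person to check it against the original or its digital image, since the comparison only shows that the transcribers disagree, not which one is right.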

Optical Character Recognition
Advantages:
– Automatic, suitable for digitising large numbers of documents
– Highly accurate for clean, clear typewritten documents
Issues:
– Current technology is very poor on handwriting
– Complex document layouts can become scrambled
Good practice:
– Proofread and spell-check OCR output for errors
– Provide an image of the page with the text so users can check the text themselves

Bitmap (Raster) Images
The image is made up of many pixels
Each pixel stores information about its colour
The standard archival file format is uncompressed TIFF

Resolution
Resolution is often expressed as dots per inch (dpi), or more accurately pixels per inch (ppi)
– The ‘frequency’ at which samples are taken by the capture device from the original source
Common misconceptions about ppi
– It is not an indicator of image size or quality unless we know the physical size (inches, cm) of the original
A better guide to digital image size is pixel dimensions, e.g. x 3000 pixels, which allow us to work out the size of the image we will output to monitor or printer
– Number of pixels / output resolution = output size
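The relationship between pixel dimensions, resolution and physical size is simple arithmetic. The sketch below assumes sizes in inches; the function names and the 1800 x 1200 pixel image are hypothetical values chosen for the example.

```python
def output_size_inches(pixel_width, pixel_height, output_ppi):
    """Output size = number of pixels / output resolution."""
    return pixel_width / output_ppi, pixel_height / output_ppi

def pixels_captured(source_width_in, source_height_in, capture_ppi):
    """Pixel dimensions produced by sampling a source of known physical size."""
    return round(source_width_in * capture_ppi), round(source_height_in * capture_ppi)

# The same hypothetical 1800 x 1200 pixel image at two output resolutions.
print(output_size_inches(1800, 1200, 300))  # print output
print(output_size_inches(1800, 1200, 72))   # screen output
```

The pixel dimensions stay the same in both calls; only the physical size changes, which is why a ppi figure alone says nothing about image size.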

Scanners and Digital Cameras
Advantages:
– Accurate(?) visual representation of the source
Issues:
– Text and logical structure of a document are not captured (they can be through OCR or line tracing)
Good practice:
– Capture master images at appropriate resolution and bit depth
– Check the optical resolution of the scanner (avoid interpolated resolution)
– Check the colour resolution (bit depth)
– Check scanning time
– Record details of scanner settings and any image editing done afterwards
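Resolution and bit depth together determine how much storage an uncompressed master image needs. The following is a rough sketch: the function name is hypothetical, and the calculation ignores file headers and any row padding a real file format may add.

```python
def uncompressed_size_bytes(pixel_width, pixel_height, bit_depth):
    """Approximate pixel-data size: width x height x bits per pixel, rounded up to bytes."""
    total_bits = pixel_width * pixel_height * bit_depth
    return (total_bits + 7) // 8

# A hypothetical 1800 x 1200 pixel scan at three bit depths.
for depth, label in [(1, "bitonal"), (8, "greyscale"), (24, "colour")]:
    size = uncompressed_size_bytes(1800, 1200, depth)
    print(f"{label:9s} {depth:2d}-bit: {size:,} bytes")
```

Doubling the capture resolution quadruples the pixel count, so it is worth estimating storage before settling on master image settings.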

Vectors
A point represents an exact location in two- or three-dimensional space (x,y or x,y,z)
Two points define a line
A series of connected lines defines an area

Vector Data
Advantages:
– Can be zoomed (cf. bitmap images)
– Allows spatial analysis (spatial statistics, network analysis)
Issues:
– Precision versus accuracy (detail versus truthfulness)
– Scale versus resolution
Good practice:
– Ensure polygon topology (the polygons each line belongs to) is stored
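Because vector data stores coordinates instead of pixels, it can be rescaled exactly and measured directly. The sketch below uses plain Python tuples for points and the shoelace formula for area; the function names and the rectangle are hypothetical.

```python
def scale(points, factor):
    """Zoom a shape by multiplying every coordinate; no detail is lost."""
    return [(x * factor, y * factor) for x, y in points]

def polygon_area(points):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    # Pair each vertex with the next one, wrapping round to close the polygon.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# A hypothetical 4 x 3 rectangle defined by its corner points.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(polygon_area(rectangle))
print(polygon_area(scale(rectangle, 2)))
```

The computed area is only as trustworthy as the coordinates themselves: many decimal places give precision, but not necessarily accuracy.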

Digital Audio
Human hearing
– Frequency (pitch): 20 Hz to 20,000 Hz (20 kHz)
– Intensity (loudness): 0 to 120 dB
Full sound reproduction requires digitisation at more than 40,000 samples a second (44,100 is a common standard)
– Nyquist rate: to capture a signal without losing frequency information, the sampling rate should be at least twice the maximum audio frequency
One second of good quality uncompressed digital sound is equivalent to ¼ of the complete plays of Shakespeare
– MP3 offers good quality compressed (lossy) files
MIDI: not a digital recording of actual sounds, but a set of performance instructions (notes, timing, instrument) played back through a synthesiser or a sample ‘library’ of how musical instruments sound
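The sampling figures above follow from two small calculations. This sketch assumes an upper hearing limit of about 20,000 Hz and, for the data rate, CD-style audio (16 bits per sample, two channels); those parameters and the function names are assumptions for illustration.

```python
def minimum_sampling_rate(max_frequency_hz):
    """Nyquist rate: at least twice the highest frequency to be captured."""
    return 2 * max_frequency_hz

def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed (PCM) audio."""
    return sample_rate_hz * bits_per_sample * channels // 8

print(minimum_sampling_rate(20_000))        # samples per second needed
print(pcm_bytes_per_second(44_100, 16, 2))  # bytes per second of stereo audio
```

At that rate a minute of uncompressed stereo sound takes roughly 10 MB, which is why lossy formats such as MP3 are common for delivery copies.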

Digital Moving Images
One second of uncompressed good quality digital video (without sound) is equivalent to about ¾ of the complete plays of Shakespeare
MPEG: the Moving Picture Experts Group standards are the most popular compression standards
– The three main standards are MPEG-1, MPEG-2 and MPEG-4
Compression basically works by selecting key frames and only recording changes between the frames (but it gets a lot more complicated!)
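The key-frame idea can be shown with a toy example. This is not how MPEG is actually implemented (real codecs add motion compensation, transforms and lossy quantisation); it is a minimal sketch in which each frame is a hypothetical list of pixel values and only the changed positions are stored after the first frame.

```python
def encode(frames):
    """Store the first frame in full, then only the pixels that change."""
    key_frame = list(frames[0])
    deltas = []
    previous = key_frame
    for frame in frames[1:]:
        changes = {i: new for i, (old, new) in enumerate(zip(previous, frame)) if old != new}
        deltas.append(changes)
        previous = frame
    return key_frame, deltas

def decode(key_frame, deltas):
    """Rebuild every frame by applying each set of changes in turn."""
    frames = [list(key_frame)]
    for changes in deltas:
        frame = list(frames[-1])
        for index, value in changes.items():
            frame[index] = value
        frames.append(frame)
    return frames

# Three hypothetical 4-pixel frames in which one pixel changes each time.
video = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 9, 0]]
key, changes_per_frame = encode(video)
print(key, changes_per_frame)
```

When little changes between frames, the stored differences are far smaller than the frames themselves, which is where the saving comes from.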

Data Models
A data model is a set of rules that defines a particular way of organising a collection of digital objects
– List: one item follows another
– Tree: each item can have several children
– Sets: items belong to one or more groups
– Geography/geometry: items are located using a co-ordinate system
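The four data models map onto familiar programming structures. The sketch below uses built-in Python types; all item names, group names and coordinates are hypothetical.

```python
# List: one item follows another.
pages = ["page 1", "page 2", "page 3"]

# Tree: each item can have several children (nested dictionaries here).
archive = {"Collection": {"Series A": {"Item 1": {}, "Item 2": {}}, "Series B": {}}}

# Sets: an item can belong to one or more groups.
groups = {"paintings": {"item1", "item2"}, "portraits": {"item2", "item3"}}

# Geography/geometry: items are located by co-ordinates.
find_spots = {"item1": (12.5, 48.0), "item2": (13.1, 47.2)}

def depth(tree):
    """Number of levels in a nested-dictionary tree."""
    return 1 + max((depth(child) for child in tree.values()), default=0) if tree else 0

in_both_groups = groups["paintings"] & groups["portraits"]
print(depth(archive), in_both_groups)
```

Each structure makes some questions easy and others awkward: finding items shared by two groups is one operation on sets but would need a search through a tree.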

Selecting a Data Model
To be useful, digital objects must be:
– Arranged according to the rules of an appropriate data model
– Stored in a file format that can represent the data model
– Accessed with software that understands the file format and the data model, and can present the data in an appropriate way
When selecting a data model:
– Consider the ‘natural’ organisation of your source
– Consider what method of organisation will be familiar to your users
– Consider the method of organisation that best fits your purposes
Then seek specialist advice if you need it!

Selecting Software
Selecting the right data model is more important than selecting a particular piece of software
Pick software that works with your preferred data model (can perform the right tasks)
– Don’t use a webpage editor as a database
– Don’t use a word processor as a spreadsheet
Avoid little-used software with proprietary features
Look for software with lots of export and import options
Look for software that supports important standards
– Trees → markup → XML (SGML)
– Sets → relational databases → SQL
– Coordinates → CAD or GIS → less clear, use file formats like DXF, ESRI shape files
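The pairing of data models with standards can be tried out with Python's standard library: `xml.etree.ElementTree` for a tree stored as XML and `sqlite3` for sets stored in a relational table and queried with SQL. The element names, table layout and sample items below are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tree -> markup -> XML: parent and child items become nested elements.
collection = ET.Element("collection")
series = ET.SubElement(collection, "series", name="Sketchbooks")
ET.SubElement(series, "item").text = "Sketchbook 1"
ET.SubElement(series, "item").text = "Sketchbook 2"
xml_text = ET.tostring(collection, encoding="unicode")
print(xml_text)

# Sets -> relational database -> SQL: one row per (item, group) membership.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE membership (item TEXT, grp TEXT)")
connection.executemany(
    "INSERT INTO membership VALUES (?, ?)",
    [("item1", "paintings"), ("item2", "paintings"),
     ("item2", "portraits"), ("item3", "portraits")],
)
portraits = connection.execute(
    "SELECT item FROM membership WHERE grp = ? ORDER BY item", ("portraits",)
).fetchall()
print(portraits)
connection.close()
```

Both outputs are plain, standards-based representations, so the data can be moved to other software that reads XML or SQL without depending on one product's proprietary features.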

Digitisation: a Balancing Act
Successful digitisation involves several trade-offs:
– Amount and detail versus time and cost of digitisation
– Complexity of the digital resource versus ease of use
– Flexibility of the digital resource versus suitability for a specific use
– Digitisation with current technology versus future possibilities
Your project should be guided by a firm understanding of the source and the intended purpose of the digital resource
– Do not exceed available support (financial, technical, labour)
– Minimise the loss of information from the original during the digitisation process
– Keep information that tracks the origin and history of the digital resource with the digital resource

Where to get more advice
– AHDS Guides to Good Practice series
– Technical Advisory Service for Images (TASI)
– Text Encoding Workshops
– BUFVC Workshops