COMP 1321 Digital Infrastructure Richard Henson University of Worcester October 2017
Week 5: Data representation Data Transfer on/off motherboard Learning Objectives: Explain what stored data could represent Explain how storage media store data and how it can be retrieved Explain mechanisms for data transfer on/off motherboard Explain how data on a storage medium can be found & displayed by named software tools
CPU, Memory and Boot-up BOOT process requires CPU instructions… have to be stored somewhere (in files) Booting up is a matter of loading all these files & their programs Where to store? need to be accessed & processed as quickly as possible!
Memory… or Storage? Lot of confusion… memory interfaces directly with CPU also called primary storage held on motherboard, controlled by system clock fast (dynamic RAM), very fast (static RAM) quite fast (ROM)
Primary & Secondary Storage Secondary storage: any other form of data storage, not directly interfacing with the CPU via a bus accessible to CPU via i/o calls e.g. INT 21 (Intel 8086…) uses ports to connect electrically with CPU external e.g. USB, Ethernet internal e.g. SATA Slower than primary storage
Virtual Memory Use of fast(ish) secondary storage device locations as if they were primary storage locations… hard disk (especially SSD) storage addressed directly by the CPU Requires programmed mapping between extra primary storage locations & secondary storage locations adv: unlikely that the CPU will run out of “memory” disadv: performance “falls off a cliff” when virtual memory held on hard disk interfaces with CPU
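The mapping between primary and secondary locations can be sketched in a few lines. This is a toy model only: the class name `ToyVirtualMemory`, the 256-byte page size and the "evict the least recently used page" rule are all assumptions made for illustration, not details from the slides.

```python
from collections import OrderedDict


class ToyVirtualMemory:
    """Toy model: a few RAM frames backed by a larger 'disk' store."""

    PAGE_SIZE = 256  # assumed page size, chosen only to keep addresses readable

    def __init__(self, ram_frames: int):
        self.ram_frames = ram_frames
        self.ram = OrderedDict()  # page number -> bytearray, least recently used first
        self.disk = {}            # pages that have been swapped out
        self.page_faults = 0      # each fault stands for a slow disk access

    def _load(self, page: int) -> bytearray:
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return self.ram[page]
        self.page_faults += 1
        if len(self.ram) >= self.ram_frames:
            victim, data = self.ram.popitem(last=False)  # evict least recently used
            self.disk[victim] = data
        data = self.disk.pop(page, None)
        if data is None:
            data = bytearray(self.PAGE_SIZE)  # page never used before
        self.ram[page] = data
        return data

    def write(self, address: int, value: int) -> None:
        page, offset = divmod(address, self.PAGE_SIZE)
        self._load(page)[offset] = value

    def read(self, address: int) -> int:
        page, offset = divmod(address, self.PAGE_SIZE)
        return self._load(page)[offset]


vm = ToyVirtualMemory(ram_frames=2)
vm.write(0x0000, 0x4E)   # page 0
vm.write(0x0100, 0x4F)   # page 1
vm.write(0x0200, 0x57)   # page 2: RAM is full, so page 0 goes to disk
first = vm.read(0x0000)  # page 0 has to come back from disk
print(hex(first), vm.page_faults)
```

The program sees more "memory" than there are RAM frames, but every page fault costs a trip to the slower device, which is where the performance penalty comes from.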
“Firmware” Software held on EPROM (erasable, programmable) chip can’t easily be tampered with IDEAL for low-level operating system programs, ensures rapid boot-up also embedded applications Needs updating (like all software)… some flexibility to overwrite
Questions Is virtual memory primary or secondary storage? What about firmware?
What could Stored Data Represent… With a 1-byte word: i.e. 1 byte per memory location Could be many things!!! Can be difficult to decide what the data really does represent… e.g: data has been recovered from a location; presented as 4E (hex) what is it… ?
What could “4E” represent? 0 1 0 0 1 1 1 0 Could be part of a program instruction in assembly language or source code (as ASCII code) Program data as a number as an ASCII character…
More possibilities for “4E” Over to you… In groups… Five minutes…
Putting meaning onto raw data… (1) Single item of data… at a single location… (e.g “peeked” as 4E) could be anything! Only find out context by studying other bytes around it… if next byte is… 4F (hex) and byte after that is… 57 (hex) the ASCII codes together spell “NOW” so the bytes are probably all ASCII codes
Putting meaning onto raw data… (2) What if the next bytes were 6B and 7D? ASCII codes would deliver… Nk} not a proper word data probably not ASCII codes What else? could be integers between 0 and 255: 78, 107, 125 (decimal) maybe stored variables, or constants… Wrong to assume… keep an open mind!
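The two examples (4E 4F 57 and 4E 6B 7D) can be tried out directly. A minimal sketch; the function name `interpret` is made up for illustration, and it shows only a few of the possible readings of the same bytes.

```python
def interpret(raw: bytes) -> dict:
    """Return several plausible readings of the same raw bytes."""
    return {
        "hex": raw.hex(" ").upper(),
        "ascii": raw.decode("ascii", errors="replace"),
        "unsigned_ints": list(raw),                  # each byte as 0..255
        "binary": [format(b, "08b") for b in raw],   # each byte as 8 bits
    }


word = interpret(bytes.fromhex("4E4F57"))
other = interpret(bytes.fromhex("4E6B7D"))

print(word["ascii"], word["unsigned_ints"])
print(other["ascii"], other["unsigned_ints"])
```

Nothing in the bytes themselves says which reading is the right one; only the surrounding context does.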
Use of “Control Bits” The byte could also be broken up into two nibbles of data 0100… could be an integer of value 4 1110… could be an integer of value 14 It could also be made up of 8 “Boolean” values, which could control outputs to machinery i.e. 0 = off; 1 = on
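Splitting 4E into nibbles and into eight on/off values takes only shifts and masks. A short sketch; the variable names are illustrative.

```python
value = 0x4E  # 0100 1110

high_nibble = value >> 4    # top four bits: 0100
low_nibble = value & 0x0F   # bottom four bits: 1110

# Eight Boolean "control bits", most significant bit first: 0 1 0 0 1 1 1 0
flags = [bool((value >> bit) & 1) for bit in range(7, -1, -1)]

print(high_nibble, low_nibble)
print(["on" if f else "off" for f in flags])
```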
Looking at Memory locations… (Peeking) Intel 8086 tool… debug available since early days of DOS Debug needs -d parameter to peek… shows 128 bytes at a time (& corresponding ASCII characters) default memory location is the start of “free” memory locations may still contain data from previous usage Specified memory locations can be peeked e.g. -d 0200 for next 128 bytes starting from &0200
“Peeking” and “Poking”… & … represents address (as opposed to data) Debug –d can be used to present a whole range of memory e.g. –d 0200 0300 would display every byte between addresses &0200 (hex) and &0300 (hex) Debug –e can overwrite contents of a specified location (or sequence of locations) with new data called “poking” potentially can crash the computer… (!)
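Peeking and poking can be imitated safely on a Python bytearray standing in for memory. This is only a sketch of the idea: `peek` and `poke` are made-up names, the row layout is an approximation of a hex dump rather than debug's exact output, and treating the end address as inclusive is an assumption.

```python
def peek(memory: bytearray, start: int, end: int) -> str:
    """Dump memory[start..end] (end inclusive), 16 bytes per row, hex plus ASCII."""
    rows = []
    for base in range(start, end + 1, 16):
        chunk = memory[base:min(base + 16, end + 1)]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Non-printable bytes are shown as "." in the ASCII column
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{base:04X}  {hex_part:<47}  {text}")
    return "\n".join(rows)


def poke(memory: bytearray, address: int, data: bytes) -> None:
    """Overwrite memory starting at address with data."""
    memory[address:address + len(data)] = data


memory = bytearray(0x400)      # 1 KB of zeroed "memory"
poke(memory, 0x0200, b"NOW")   # write 4E 4F 57 at &0200
dump = peek(memory, 0x0200, 0x020F)
print(dump)
```

Because the bytearray is private to the program, a bad poke here only corrupts the toy memory, not the computer.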
Protection against memory overwrite Operating system protects memory addresses used by “active” processes Use of debug -e bypasses protection!!! only protection for computer’s primary memory is to disable the debug program but could in theory still be executed remotely, if administrative access to local computer has been granted… (!)
How does data get between devices? Data usually needs to go in both directions… DEVICE A DEVICE B
Three Data Communication Alternatives Simplex one direction only Example: Broadcast data from a radio or TV mast
Data Transfer Half Duplex one direction only at a time Example: data sent along a single copper wire, first one way, then the other
Data Transfer Full Duplex both directions simultaneously Example: broadband telephone communications
i/o connections with the motherboard Normally connect digital i/o devices to the motherboard via: Direct connections through “ports” Click in expansion or “daughter” cards with their own ports
i/o Buses used with older expansion Cards ISA = Industry Standard Architecture early (1981-1984) communications standard speed: up to 16 MB/s 8 or 16-bit parallel connections PCI = Peripheral Component Interconnect later (1990-1993) communications standard speed: up to 133 MB/s 32-bit parallel connection ‘Plug and play’: no set-up software needed (depending on the operating system used…)
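The quoted peak speeds follow from bus width multiplied by clock rate. A rough check, assuming nominal clocks of 8 MHz for 16-bit ISA and 33.33 MHz for 32-bit PCI; these clock figures are not on the slides, and real throughput is lower than the theoretical peak.

```python
def peak_bandwidth_mb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak in MB/s: bytes per transfer x million transfers per second."""
    return bus_width_bits / 8 * clock_mhz


isa = peak_bandwidth_mb_s(16, 8.0)     # assumed 8 MHz ISA clock
pci = peak_bandwidth_mb_s(32, 33.33)   # assumed 33.33 MHz PCI clock

print(f"ISA: {isa:.0f} MB/s, PCI: {pci:.0f} MB/s")
```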
Legacy Motherboard: PCI & ISA slots from http://www.ibase-i.com PCI slot ISA slot
Peripheral Connectors on the Motherboard On-board IDE slot (now legacy) up to TWO hard disk or DVD-ROM 40-pin “ribbon” cable On-board SCSI slot (server board) connects a much larger number of devices
Other Hard Disk connections On-board SATA slot thinner ribbon cable 3.5” SATA hard disk 2.5” SATA hard disk External SATA hard disk Connected to motherboard via USB
STAR motherboard architecture Copied from “star” arrangement for networking computers one hub (MCH) connects fast components hub at centre; components at ends of ‘spokes’ other hub (IOCH) connects slower components and peripherals hubs communicate directly with each other
Motherboard Hubs MCH = Memory Controller Hub connects very fast devices together in a ‘star’ configuration I(O)CH = Input/Output Controller Hub connects together slower devices, also in a star configuration
MCH and I(O)CH from http://www.3dnews.ru/motherboard/intel-ht-chipset/
Motherboard with MCH and ICH from http://www. tomshardware
Why arrange motherboard components like this? Longer wires… more time to send messages (bad) degradation of message at high speed (bad) Therefore… important for fast components to be close together slower components can be further apart
Motherboard with MCH and ICH from http://www. tomshardware AGP slot Socket for processor MCH ICH Slots for RAM cards
Another PC Motherboard… from http://www. techiwarehouse
Finding data on Secondary Storage (1) “file” ~ conventional name for a package of bytes of data Primary storage : controlled directly by CPU instructions Secondary storage: controlled by hard disk controller programs & file system manager
Primary/Secondary Storage of data as files Secondary storage devices organise data for quick access logically structured into “partitions” If Windows = operating system, each partition allocated a letter (e.g. C:, D:, etc.)
File Organisation on Disk Disk could be stretched out to form a long line of sectors size of sectors depends on formatting type 512 locations (i.e. bytes) 2048 locations files laid down in sectors
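Because files are laid down in whole sectors, the sector size decides how many sectors a file occupies and how much space is left unused in the last one. A small worked example; the 1300-byte file is an arbitrary illustration.

```python
def sectors_needed(file_size: int, sector_size: int) -> int:
    """Whole sectors required to hold file_size bytes (rounds up)."""
    return (file_size + sector_size - 1) // sector_size


def wasted_bytes(file_size: int, sector_size: int) -> int:
    """Unused bytes at the end of the file's last sector."""
    return sectors_needed(file_size, sector_size) * sector_size - file_size


for size in (512, 2048):
    print(size, sectors_needed(1300, size), wasted_bytes(1300, size))
```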
Finding data on Secondary Storage Essential for each partition to create a table or catalogue for starting address of files that are written to it otherwise the file becomes very difficult to retrieve… Method depends on filing system chosen when partition formatted… As well as formatting each filing system structures the media to receive data in its own unique way…
“Boot Sector” Important for hard disk boot up Process of loading operating system from secondary storage starts from the boot sector… provides configuration information for effective communication with CPU if damaged, boot up halted! should have a backup… needs to be copied to boot sector to overwrite corrupted data
Partitions Created by special program Areas of hard disk managed by a file system different partitions can use different file systems single boot partition containing boot sector can point to different operating systems Selectable via screen menu
Booting up: loading an Operating System… Needs to be loaded into RAM some operating systems load everything from ROM others use a combination… some loaded first from ROM rest from hard disk or other source Hard disk needs a bootable partition to load rest of operating system into RAM
Data Storage on Disk Partition Sectors numbered Files stored in specified sector address ranges
Disk Catalogues Organise files into directories/folders Top folder (C:) = root Rest of folders link hierarchically Catalogue logically allocates each file to a folder for ease of retrieval
Fragmentation of Data on Secondary Storage General problem with hierarchical data storage… deleted data items leave holes in the structure New data items saved try to fill the gaps large files can be broken into fragments fragments linked by address pointers slows down retrieval
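How a deletion leads to a fragmented file can be shown with a toy disk of 12 sectors. A sketch under simple assumptions: the "fill the first free sectors" allocation rule and the function names are illustrative, not how any particular file system behaves.

```python
def save(disk: list, name: str, n_sectors: int) -> list:
    """Fill the first free sectors (None) with the file; return the sector numbers used."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < n_sectors:
        raise ValueError("disk full")
    used = free[:n_sectors]
    for i in used:
        disk[i] = name
    return used


def delete(disk: list, name: str) -> None:
    """Free every sector owned by the file, leaving a hole."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def fragments(sectors: list) -> int:
    """Count runs of consecutive sector numbers."""
    return sum(1 for k, s in enumerate(sectors) if k == 0 or s != sectors[k - 1] + 1)


disk = [None] * 12
save(disk, "A", 3)                # sectors 0-2
save(disk, "B", 2)                # sectors 3-4
save(disk, "C", 3)                # sectors 5-7
delete(disk, "B")                 # leaves a two-sector hole
d_sectors = save(disk, "D", 4)    # fills the hole, then continues after C

print(d_sectors, fragments(d_sectors))
```

File D ends up in two pieces, so reading it means following a pointer from the first fragment to the second.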
Removing Fragmentation If disk only partly fragmented… fragmented files copied into memory remaining files moved around to close up holes Previously fragmented files copied back to disk as complete files If disk >75% fragmented most effective solution is to copy all files to another partition can copy back later once original partition has all data deleted
Is it true that deleted files aren’t really deleted? Absolutely! Two things happen when a file is deleted: the first data item in the file’s catalogue entry (first character of filename) is changed to a “deleted” marker, shown here as “?” (on FAT file systems the byte is E5 hex) the catalogue ceases to recognise & display the filename starting address shows “?” character file system is programmed to ignore ? at such a location Rest of the data is untouched… easily demonstrated through use of a Hex editor program: can show file contents “before” and “after”
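The "mark the catalogue entry, leave the data alone" behaviour can be modelled with a toy catalogue. Everything here is a simplified assumption: the entry layout, the function names and the use of a literal "?" as the marker are for illustration only.

```python
DELETED_MARK = "?"     # marker as shown on the slide; real file systems use their own byte

disk = bytearray(64)   # toy partition
catalogue = []         # one dict per file: name, start address, length


def write_file(name: str, start: int, data: bytes) -> None:
    disk[start:start + len(data)] = data
    catalogue.append({"name": name, "start": start, "length": len(data)})


def delete_file(name: str) -> None:
    """Only the first character of the catalogue name changes; the data stays put."""
    for entry in catalogue:
        if entry["name"] == name:
            entry["name"] = DELETED_MARK + name[1:]


def list_files() -> list:
    """The file system ignores entries that start with the deleted marker."""
    return [e["name"] for e in catalogue if not e["name"].startswith(DELETED_MARK)]


def undelete(marked_name: str, first_char: str) -> None:
    """Restore a real first character so the entry is displayed again."""
    for entry in catalogue:
        if entry["name"] == marked_name:
            entry["name"] = first_char + marked_name[1:]


write_file("NOTES.TXT", 16, b"NOW")
delete_file("NOTES.TXT")
visible_after_delete = list_files()
data_after_delete = bytes(disk[16:19])   # still there
undelete("?OTES.TXT", "N")
visible_after_undelete = list_files()

print(visible_after_delete, data_after_delete, visible_after_undelete)
```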
“Normal” Loading of a File from Secondary Media File catalogue essential for data retrieval application reads file catalogue displays folders and files user chooses file, application uses disk addresses to load into memory What if file catalogue corrupted? backup copy on disk… what if both become corrupted?
Direct access by address on Secondary Media If both file catalogues are damaged… file (and its data) cannot be located “Hex editors” available to do the equivalent of debug -d (peek) and -e (poke) enables full search of all addresses for particular ASCII string(s) essential for recovery of data… Also used for restoring recently deleted files “?” character restored to a real character can then be picked up & shown on catalogue display
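Searching every address for an ASCII string, without any help from a catalogue, is a plain byte search. A minimal sketch on an in-memory stand-in for a disk image; `find_all` is a made-up helper name and the image contents are invented for the example.

```python
def find_all(image: bytes, needle: bytes) -> list:
    """Return every address at which needle occurs in the raw image."""
    hits = []
    pos = image.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = image.find(needle, pos + 1)
    return hits


# Invented image: zero bytes with two pieces of text at known addresses
image = bytes(0x200) + b"NOW is the time" + bytes(0x100) + b"NOW"
hits = find_all(image, b"NOW")

print([f"&{addr:04X}" for addr in hits])
```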
WinHex Probably the most popular tool to examine hard disks readout quite similar to debug -d data presented byte-by-byte according to catalogue address range of options for extracting, overwriting data, and (like debug) writing consecutive raw data items to a file