Chapter 12 – Mass Storage Structures (Pgs )
Magnetic Disk Terminology
  Transfer Rate: Rate at which data flows between the drive and main memory
  Positioning Time (time before I/O can occur) = Seek Time (time to move the head) + Rotational Latency (time for the correct sector to come under the head)
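The positioning-time formula can be tried out with a small calculation. This is only a sketch: it assumes the average rotational latency is half a revolution, and the 7200 RPM and 9 ms figures are made-up illustrative values, not data for any particular drive.

```python
def rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms, assuming half a revolution on average."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def positioning_time_ms(seek_ms: float, rpm: float) -> float:
    """Positioning time = seek time + rotational latency (both in ms)."""
    return seek_ms + rotational_latency_ms(rpm)


if __name__ == "__main__":
    # Hypothetical drive: 7200 RPM with a 9 ms average seek time.
    print(f"{positioning_time_ms(seek_ms=9.0, rpm=7200):.2f} ms")
```

With these assumed numbers, rotation contributes roughly 4.17 ms on top of the seek time, so neither term can be ignored.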
Disk Construction
  Block: 512 bytes (typical)
Disk Bus Formats
  Always changing and improving
  SCSI: Small Computer System Interface
  ATA: Advanced Technology Attachment
  SATA: Serial ATA
  EIDE: Enhanced Integrated Drive Electronics
  USB: Universal Serial Bus
  FC: Fibre Channel
The Future of Disks
  Surviving for the near future
    Slower than persistent RAM
    Cheaper than persistent RAM
  Older technology tends to remain available if it fulfills a role
    e.g., magnetic tapes for second-level backups of financial data (still in use on mainframe systems; cost per bit is very low compared to disks)
Head Scheduling Concerns
  Synchronicity: Must requests be processed in order?
  Average Wait Time: How long is a process typically blocked?
  Maximum Wait Time: How long could a process be blocked? (Important in an RTOS)
  Priority: Is all I/O equally important?
  ** Almost identical to process scheduling **
Scheduling Algorithms
  First Come, First Served (FCFS): Fair, but not the most efficient
  Shortest Seek Time First (SSTF): Better, but far from optimal
  SCAN (Elevator): Head sweeps back and forth across the disk, servicing requests in cylinder order
  C-SCAN (Circular SCAN): Same as SCAN, but requests are not serviced on the return trip
  LOOK (and C-LOOK): Same as SCAN (or C-SCAN), but the head travels only as far as the last request in each direction
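The difference between these algorithms is easiest to see by counting head movement. The sketch below models FCFS, SSTF, LOOK, and C-LOOK as functions that return a service order. The function names, the request queue, and the starting cylinder (53) are illustrative assumptions, and real controllers also account for rotational position, which is ignored here.

```python
def fcfs(head, requests):
    """First Come, First Served: service requests in arrival order."""
    return list(requests)


def sstf(head, requests):
    """Shortest Seek Time First: always pick the pending request nearest the head."""
    pending = list(requests)
    order = []
    pos = head
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


def look(head, requests, direction="down"):
    """LOOK: sweep one way as far as the last request, then reverse."""
    if direction == "down":
        first = sorted((c for c in requests if c <= head), reverse=True)
        second = sorted(c for c in requests if c > head)
    else:
        first = sorted(c for c in requests if c >= head)
        second = sorted((c for c in requests if c < head), reverse=True)
    return first + second


def c_look(head, requests):
    """C-LOOK: service only while moving up, then jump back to the lowest request."""
    upper = sorted(c for c in requests if c >= head)
    lower = sorted(c for c in requests if c < head)
    return upper + lower


def total_movement(head, order):
    """Total cylinders travelled when servicing requests in the given order."""
    distance = 0
    pos = head
    for cyl in order:
        distance += abs(cyl - pos)
        pos = cyl
    return distance


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical pending cylinders
    start = 53                                   # hypothetical head position
    for name, algo in [("FCFS", fcfs), ("SSTF", sstf), ("LOOK", look), ("C-LOOK", c_look)]:
        print(name, total_movement(start, algo(start, queue)))
```

SCAN and C-SCAN differ from LOOK and C-LOOK only in that the head continues to the edge of the disk before turning around, so their totals for the same queue are at least as large.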
Choosing One
  Often you don't have to: head scheduling is now built into the disk controller
  SSTF or LOOK are most common
    Simple, decent performance
  Sophisticated systems change algorithm based on the properties of the data (much as sorting functions in libraries do)
Disk Management
  Much of the low-level management (blocks, ECC, bad-sector replacement) is done by the disk controller
  Blocks are typically 512 bytes, but 256 and 1024 bytes are sometimes possible
  The OS uses "clusters", which are larger than blocks and function as virtual blocks
    Increases efficiency, better supports large files, reduces overhead, maps to page size
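As a rough illustration of how clusters sit on top of blocks, the sketch below maps a byte offset to a cluster and block, and counts the clusters a file would occupy. It assumes 512-byte blocks (the typical size) and a hypothetical cluster of 8 blocks, chosen so that one cluster equals a 4 KiB page. Actual cluster sizes depend on the file system.

```python
BLOCK_SIZE = 512        # bytes per disk block (typical)
BLOCKS_PER_CLUSTER = 8  # hypothetical: 8 x 512 B = 4096 B, a common page size
CLUSTER_SIZE = BLOCK_SIZE * BLOCKS_PER_CLUSTER


def locate(byte_offset):
    """Return (cluster index, block within cluster, byte within block)."""
    block, byte_in_block = divmod(byte_offset, BLOCK_SIZE)
    cluster, block_in_cluster = divmod(block, BLOCKS_PER_CLUSTER)
    return cluster, block_in_cluster, byte_in_block


def clusters_needed(file_size):
    """Number of whole clusters allocated to a file of file_size bytes."""
    return -(-file_size // CLUSTER_SIZE)  # ceiling division


def slack_bytes(file_size):
    """Bytes allocated but unused in the file's last cluster."""
    return clusters_needed(file_size) * CLUSTER_SIZE - file_size
```

Under these assumptions a 10,000-byte file is tracked as 3 allocation units instead of 20 blocks, which is where the reduced overhead comes from. The cost is 2,288 bytes of unused space in the last cluster.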
Bad Blocks
  Cheap disks don't do much
  Newer/high-end disks handle bad blocks in the disk controller
    Spare sectors exist
    Remapping to a spare can cause problems with head scheduling
    Controllers try to allocate replacements on the same cylinder for this reason
  Small errors are often repairable using ECC
RAID
  Redundant Array of Independent Disks
  Improved reliability
  Higher data transfer rate
  Less usable storage (due to redundancy) than if the disks were used separately
  Often provided as a "unit" with its own independent controller
Mirroring
  Simplest form of redundancy
  Both disks have to fail to lose data
  About the most effective method for providing redundancy
  Highly expensive with respect to resources
  "Independent failure" is not fully realistic
    Bad batches, aging, fire/flood, etc.
  NOTHING prevents some data loss due to power failure
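To see why mirroring is so effective when failures really are independent, a common approximation for the mean time to data loss of a mirrored pair is MTTF^2 / (2 x MTTR): data is lost only if the second disk fails while the first is still being repaired. The sketch below applies that approximation. The 100,000-hour MTTF and 10-hour MTTR are illustrative assumptions, and correlated failures (bad batches, fire, and so on) make the real figure much lower.

```python
HOURS_PER_YEAR = 24 * 365


def mean_time_to_data_loss_hours(mttf_hours, mttr_hours):
    """Approximate mean time to data loss for a two-disk mirror.

    Assumes independent disk failures, which is optimistic in practice.
    """
    return mttf_hours ** 2 / (2 * mttr_hours)


if __name__ == "__main__":
    # Hypothetical disks: 100,000 h mean time to failure, 10 h mean time to repair.
    hours = mean_time_to_data_loss_hours(100_000, 10)
    print(f"{hours:.0f} hours, about {hours / HOURS_PER_YEAR:.0f} years")
```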
Striping
  Various levels
    Block-level (most common)
    Byte-level
    Bit-level
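Block-level striping amounts to a simple round-robin mapping from logical block numbers to disks. The sketch below shows one common layout; the function name and the choice of one block per stripe unit are assumptions made for illustration.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block to (disk index, block index on that disk).

    Consecutive logical blocks go to consecutive disks, round-robin.
    """
    stripe, disk = divmod(logical_block, num_disks)
    return disk, stripe


if __name__ == "__main__":
    # With 4 disks, logical blocks 0..7 land on disks 0,1,2,3,0,1,2,3.
    for block in range(8):
        print(block, locate_block(block, num_disks=4))
```

Because neighbouring blocks live on different disks, a large sequential read can be served by all disks in parallel, which is where the higher transfer rate comes from.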
RAID Levels
  0: Block striping, no redundancy
  1: Complete mirroring
  2: Bit-level striping with memory-style ECC on dedicated disks; corrects single-bit errors
  3: Bit-interleaved parity (relies on the controller's per-sector error detection, so a single parity disk replaces the ECC disks of RAID 2)
  4: Block-interleaved parity (RAID 0 plus a dedicated parity disk)
  5: Distributed block-interleaved parity: same as 4, but parity and data are spread across all disks
  6: P+Q scheme: same as 5, but with extra redundancy such as Reed-Solomon coding (tolerates two disk failures)
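The parity used by levels 3 to 5 is typically a bitwise XOR across the data blocks in a stripe, which lets any one missing block be recomputed from the survivors. The following is a minimal sketch of that idea. The function names and the tiny 4-byte blocks are illustrative, and it does not model how a real controller lays out or updates parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    stripe = [b"AAAA", b"BBBB", b"CCCC"]  # hypothetical data blocks on 3 disks
    parity = make_parity(stripe)
    # Pretend the disk holding stripe[1] has failed.
    rebuilt = reconstruct([stripe[0], stripe[2]], parity)
    print(rebuilt == stripe[1])
```

XOR parity can recover only one lost block per stripe, which is why level 6 adds a second, independent code to survive two failures.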
RAID: Combining Levels 0 and 1
  RAID 0: Striping, improves performance
  RAID 1: Mirroring, improves reliability
  Put together they are highly effective
  Only half the storage capacity is available due to mirroring
  Simple to implement and maintain
Selecting a RAID Level
  Rebuild time: How long does it take to restore redundancy after a disk fails?
  Need for reliability?
  Cost effectiveness (mirroring costs more per bit)?
  Performance concerns
  Type of protection: RAID protects against hardware failures, not against software failures
  Properties of file systems: stability, volumes, sizes
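The cost-effectiveness question can be made concrete by comparing usable capacity across levels. The sketch below is a simplification under stated assumptions: all disks are the same size, mirroring is two-way, and the level names are just dictionary-style keys chosen for this example.

```python
def usable_capacity(level, num_disks, disk_size):
    """Usable capacity of an array of equally sized disks.

    level is one of "0", "1", "0+1", "4", "5", "6" (illustrative keys).
    """
    raw = num_disks * disk_size
    if level == "0":
        return raw                          # striping only, no redundancy
    if level in ("1", "0+1"):
        return raw // 2                     # two-way mirroring halves capacity
    if level in ("4", "5"):
        return (num_disks - 1) * disk_size  # one disk's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_size  # two disks' worth of redundancy
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # Hypothetical array: four 1000 GB disks.
    for level in ("0", "0+1", "5", "6"):
        print(level, usable_capacity(level, 4, 1000), "GB usable")
```

Capacity is only one axis. Rebuild time, write performance, and the number of failures tolerated all pull in different directions.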
Other Storage
  NVRAM: As a solid-state disk or as a write-back cache
  Optical disks (CD/DVD): Various technologies for CD-ROM, CD-R ("WORM"), CD-RW
  Magnetic tapes: Cartridges (kind of like VHS), slow but very economical
Storage Performance Issues
  Speed: Bandwidth and latency
    Effective Bandwidth: Overall transfer rate (including latency)
    Sustained Bandwidth: Rate of flow once the transfer is under way
    Average Latency: Time from request until transfer begins
  Reliability (redundancy): Number and kind of failures from which recovery can occur
  Cost: Price per bit
  Lifespan: Will the data be required in 20 years?
  Access Frequency: How often will the data be accessed?
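The relationship between the three speed measures can be sketched as a short calculation: effective bandwidth is the amount transferred divided by the total time, latency included. The numbers in the example (10 ms latency, 100 MB/s sustained rate) are illustrative assumptions only.

```python
def effective_bandwidth(transfer_bytes, latency_s, sustained_bytes_per_s):
    """Bytes per second over the whole request, including the initial latency."""
    total_time_s = latency_s + transfer_bytes / sustained_bytes_per_s
    return transfer_bytes / total_time_s


if __name__ == "__main__":
    latency = 0.010            # hypothetical 10 ms average latency
    sustained = 100_000_000    # hypothetical 100 MB/s sustained bandwidth
    for size in (4_096, 1_000_000, 1_000_000_000):
        rate = effective_bandwidth(size, latency, sustained)
        print(f"{size:>13} bytes: {rate / 1e6:8.2f} MB/s effective")
```

For small transfers latency dominates and the effective rate is a small fraction of the sustained rate. For very large transfers the two converge.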
To Do:
  Work on Assignment 2 (due next week)
  Complete Lab 7 (optional lab)
  Read Chapter 12 (pgs ; this lecture)