Introduction to Database Systems
The Storage Hierarchy and Magnetic Disks
Storage Technology: Topic 1

The Storage Hierarchy
[Figure: storage levels listed as cache, main memory, reliable memory, magnetic disk, optical disk, magnetic tape; axis labels: speed, price]

What is the “speed” of storage?
• Total response time is what counts
• Response time = Latency + Transfer time
• Latency = Time until data starts getting transferred
• Transfer time depends on bandwidth
• Average vs. worst-case response time (depends on access patterns)
• Speed of memory channel also matters
• Contention for memory also matters
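
The relation Response time = Latency + Transfer time can be made concrete with a small calculation. This is only an illustrative sketch: the function name and the sample device (10 ms latency, 4,096,000 bytes/s bandwidth) are assumptions chosen for round numbers, not figures from these slides.

```python
def response_time_ms(latency_ms, size_bytes, bandwidth_bytes_per_s):
    """Response time = latency + transfer time, all in milliseconds."""
    transfer_ms = size_bytes / bandwidth_bytes_per_s * 1000.0
    return latency_ms + transfer_ms


# Hypothetical device: 10 ms latency, 4,096,000 bytes/s bandwidth.
small = response_time_ms(10.0, 4096, 4_096_000)        # one 4 KB block: about 11 ms
large = response_time_ms(10.0, 100 * 4096, 4_096_000)  # 100 blocks at once: about 110 ms
```

Under these assumed numbers, latency dominates the small request, while one large request pays the latency only once.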

Main Memory and Cache
• 32 MB L2 cache ≈ $500
• 32 MB memory ≈ $175
• Uniform access times for random accesses
– Not quite true with virtual memory
– Latency involved in VM translations

Why Not Store Everything in Main Memory?
• Costs too much. $1000 will buy you either 128 MB of RAM or 9 GB of disk.
• Main memory is volatile. We want data to be persistent.
• Typical storage hierarchy:
– Main memory for the subset of data in use.
– Disk for the main database (secondary).
– Tapes for archiving older versions of the data (tertiary).
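
The price comparison above can be worked out as cost per megabyte. The sketch below uses only the slide's figures ($1000 for 128 MB of RAM or 9 GB of disk); the variable names and the convention 1 GB = 1024 MB are assumptions.

```python
BUDGET_DOLLARS = 1000
RAM_MB = 128
DISK_MB = 9 * 1024  # 9 GB, assuming 1 GB = 1024 MB

ram_cost_per_mb = BUDGET_DOLLARS / RAM_MB         # 7.8125 dollars per MB
disk_cost_per_mb = BUDGET_DOLLARS / DISK_MB       # about 0.11 dollars per MB
price_ratio = ram_cost_per_mb / disk_cost_per_mb  # RAM costs about 72x more per MB
```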

Magnetic Disks
• Secondary storage device of choice
• Main advantage over tapes: random access vs. sequential access.
• Data is stored and retrieved in units called disk blocks or pages. Why?
• Unlike RAM, time to retrieve a disk page varies depending upon location.
– Implication?

Components of a Disk
• The platters spin (7200 rpm).
• The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!).
• Only one head reads/writes at any one time.
• Block size is a multiple of sector size (which is fixed).
[Figure: disk diagram with labels: platters, spindle, disk head, arm movement, arm assembly, tracks, sector]

Disk Interfaces
• Disk controller: special processor at disk
• Disk drive interface: links controller to host along computer bus
• EIDE: used in PCs; cheaper, lower performance
• SCSI: used in Macs, workstations
• Disk is viewed as an array of sectors (logical)
• Request scheduling: elevator algorithm (location of smarts varies for SCSI and EIDE)
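
The elevator algorithm mentioned above can be sketched in a few lines. This is a simplified illustration, not any controller's actual implementation: it is the variant that reverses at the last pending request (often called LOOK), and the function names, the starting head position, and the request list are all hypothetical.

```python
def elevator_order(head, requests, moving_up=True):
    """Return pending track numbers in the order one elevator sweep serves them."""
    if moving_up:
        # Serve tracks at or above the head going up, then the rest going down.
        ahead = sorted(t for t in requests if t >= head)
        behind = sorted((t for t in requests if t < head), reverse=True)
        return ahead + behind
    # Moving down: serve tracks at or below the head first, then sweep up.
    below = sorted((t for t in requests if t <= head), reverse=True)
    above = sorted(t for t in requests if t > head)
    return below + above


def total_seek_distance(head, order):
    """Sum of track-to-track distances when serving requests in this order."""
    distance = 0
    for track in order:
        distance += abs(track - head)
        head = track
    return distance


pending = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical request queue
fifo = total_seek_distance(53, pending)                       # arrival order: 640 tracks
sweep = total_seek_distance(53, elevator_order(53, pending))  # elevator order: 299 tracks
```

For this made-up queue, the sweep less than halves the arm travel compared with serving requests in arrival order.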

Servicing a Disk Request
• Time to access (read/write) a disk block:
– seek time (moving arms to position disk head on track)
– rotational delay (waiting for block to rotate under head)
– transfer time (actually moving data to/from disk surface)
• Seek time and rotational delay dominate.
– Seek time: 1 to 20 msec (components?)
– Rotational delay: 0 to 10 msec
– Transfer rate: 1 msec per 4 KB block (based on rpm)
• Key to lower I/O cost: reduce seek/rotation delays! Hardware vs. software solutions?
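
The cost breakdown above can be turned into a rough model. This is a sketch under stated assumptions: the 10 ms seek is a hypothetical value inside the 1 to 20 msec range, the spin speed is the 7200 rpm quoted earlier, the transfer cost is the slide's 1 msec per 4 KB block, and the sequential case ignores track and cylinder switches. The function names are made up for illustration.

```python
def avg_rotational_delay_ms(rpm):
    """On average the target block is half a revolution away."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def access_time_ms(seek_ms, rotational_ms, blocks, transfer_ms_per_block=1.0):
    """One seek and one rotational wait, then `blocks` consecutive transfers."""
    return seek_ms + rotational_ms + blocks * transfer_ms_per_block


rot = avg_rotational_delay_ms(7200)               # about 4.17 ms
random_100 = 100 * access_time_ms(10.0, rot, 1)   # 100 scattered blocks: about 1517 ms
sequential_100 = access_time_ms(10.0, rot, 100)   # 100 adjacent blocks: about 114 ms
```

Under these assumptions, reading the same 100 blocks is more than ten times faster when they are adjacent, because seek and rotational delay are paid once instead of 100 times.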

Hardware Solutions
• Disk buffer: 512K to 1024K (volatile?)
• Read-ahead in disk buffer
• Zoned bit recording
• Track skewing
• Embedded servo positioning
• Higher spin speeds (but more power and vibration)
• Higher recording density

Software Solutions
• 'Next' block concept:
– blocks on same track, followed by
– blocks on same cylinder, followed by
– blocks on adjacent cylinder
• Blocks in a file should be arranged sequentially on disk (by 'next'), to minimize seek and rotational delay. Why? Locality!
• For a sequential scan, pre-fetching several pages at a time is a big win!
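
One way to picture the 'next' ordering is as a mapping from consecutive block numbers to disk positions: fill a track, then move to the next track in the same cylinder, then to the adjacent cylinder. The sketch below assumes an idealized geometry with the same number of blocks on every track (so it ignores zoned bit recording); the names `BlockAddress`, `block_address`, `surfaces`, and `blocks_per_track` are hypothetical.

```python
from typing import NamedTuple


class BlockAddress(NamedTuple):
    cylinder: int
    surface: int  # which track (head) within the cylinder
    slot: int     # position of the block within its track


def block_address(n, surfaces, blocks_per_track):
    """Map logical block number n to a position that follows the 'next' order."""
    blocks_per_cylinder = surfaces * blocks_per_track
    cylinder, rest = divmod(n, blocks_per_cylinder)
    surface, slot = divmod(rest, blocks_per_track)
    return BlockAddress(cylinder, surface, slot)


# Assumed geometry: 4 surfaces, 8 blocks per track.
same_track = block_address(7, 4, 8)      # BlockAddress(cylinder=0, surface=0, slot=7)
same_cylinder = block_address(8, 4, 8)   # BlockAddress(cylinder=0, surface=1, slot=0)
next_cylinder = block_address(32, 4, 8)  # BlockAddress(cylinder=1, surface=0, slot=0)
```

With this layout, a file stored in consecutive block numbers needs an arm movement only once every `surfaces * blocks_per_track` blocks.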

Disk Space Management
• Lowest layer of DBMS software manages space on disk.
• Higher levels call upon this layer to:
– allocate/de-allocate a page
– read/write a page
• A request for a sequence of pages should be satisfied by pages that are sequential on disk!
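
The interface described above might look like the following toy sketch. It is not taken from any real DBMS: the class name, method names, and the in-memory list standing in for the disk are all assumptions. The point it illustrates is that a request for several pages is served from one contiguous run of free pages.

```python
class DiskSpaceManager:
    """Toy in-memory stand-in for the lowest DBMS layer (hypothetical API)."""

    def __init__(self, num_pages, page_size=4096):
        self.page_size = page_size
        self.pages = [bytes(page_size)] * num_pages
        self.free = [True] * num_pages

    def allocate(self, count=1):
        """Reserve `count` contiguous free pages and return the first page id."""
        if count < 1:
            raise ValueError("count must be at least 1")
        run = 0  # length of the current run of free pages
        for page_id, is_free in enumerate(self.free):
            run = run + 1 if is_free else 0
            if run == count:
                start = page_id - count + 1
                for p in range(start, page_id + 1):
                    self.free[p] = False
                return start
        raise RuntimeError("no contiguous run of free pages")

    def deallocate(self, page_id):
        self.free[page_id] = True

    def read(self, page_id):
        return self.pages[page_id]

    def write(self, page_id, data):
        if len(data) != self.page_size:
            raise ValueError("data must be exactly one page")
        self.pages[page_id] = bytes(data)


manager = DiskSpaceManager(num_pages=8)
first = manager.allocate(3)    # pages 0, 1, 2
second = manager.allocate(2)   # pages 3, 4
manager.deallocate(1)          # leaves a one-page hole
third = manager.allocate(2)    # skips the hole and takes pages 5, 6
```

Skipping the one-page hole keeps the two requested pages adjacent, at the cost of some fragmentation.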

Summary:
• Today’s key idea: Principle of Locality
• Most important principle of database systems