CMPE 180-38 Database Systems Workshop June 16 Class Meeting Department of Computer Engineering San Jose State University Summer 2017 Instructor: Ron Mak www.cs.sjsu.edu/~mak
Computer System Architecture A major disparity exists between CPU speed and memory speed. CPU technology is advancing faster than memory technology. The CPU can process data faster than it can be fetched from comparably priced main memory units. Main memory is the bottleneck in terms of system speed and cost. Systems therefore need to include very fast and expensive cache memory.
Memory Hierarchy Small amount of very fast, expensive, volatile cache memory (volatile: contents disappear when the power goes off). Gigabytes of medium-speed, medium-price, volatile main memory (RAM: random access memory). Gigabytes or terabytes of slow, cheap, non-volatile disk storage.
Memory/Storage Speed Comparisons Suppose your computer ran at human speeds, with 1 CPU cycle taking 1 second. Then the time to retrieve one byte would be: SRAM, 5 seconds; DRAM, 2 minutes; Flash, 1 day; Hard drive, 2 months; Tape, 1,000 years.
B-Trees A B-tree is a tree data structure suitable for disk drives. It may take up to 11 ms to access data on disk. Modern CPUs can execute billions of instructions per second. Therefore, it is worth spending extra CPU cycles to reduce the number of disk accesses. B-trees are often used to implement databases.
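The trade-off above can be checked with a back-of-the-envelope calculation. This is only a sketch: the slide says "billions" of instructions per second, so the rate of exactly 1 billion used here is an assumption, and the function name is hypothetical.

```python
# Rough estimate: how many instructions could the CPU execute
# while waiting for a single disk access?

DISK_ACCESS_SECONDS = 0.011               # 11 ms, the figure quoted above
INSTRUCTIONS_PER_SECOND = 1_000_000_000   # assumed: 1 billion instructions/s


def instructions_per_disk_access(access_seconds, instructions_per_second):
    """Return the number of instructions executed during one disk access."""
    return round(access_seconds * instructions_per_second)


if __name__ == "__main__":
    n = instructions_per_disk_access(DISK_ACCESS_SECONDS, INSTRUCTIONS_PER_SECOND)
    print(f"{n:,} instructions per disk access")
```

Under these assumptions, one disk access costs about as much time as 11 million instructions, which is why a data structure that saves even one disk access can afford a lot of extra computation.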
B-Trees A B-tree is an m-ary tree. Mark Allen Weiss Data Structures and Algorithm Analysis in Java (c) 2006 Pearson Education, Inc. All rights reserved. 0-13-257627-9
B-Trees A B-tree of order 5 for a disk drive (figure from Weiss).
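A search in such a tree can be sketched as follows. This is a minimal in-memory model, not the textbook's implementation: the `Node` class and `search` function are hypothetical names, all data is assumed to live in the leaves, and each node visited stands in for one disk access.

```python
from bisect import bisect_right


class Node:
    """Hypothetical in-memory stand-in for one disk block of a B-tree."""

    def __init__(self, keys, children=None):
        self.keys = keys            # sorted keys (in a leaf: the stored items)
        self.children = children    # None for a leaf, else len(keys) + 1 subtrees

    @property
    def is_leaf(self):
        return self.children is None


def search(root, key):
    """Return (found, nodes_visited); each node visited models one disk access."""
    node, visited = root, 1
    while not node.is_leaf:
        # keys[i] is the smallest key in children[i + 1], so bisect_right
        # picks the subtree whose key range contains `key`.
        node = node.children[bisect_right(node.keys, key)]
        visited += 1
    return key in node.keys, visited


# A tiny two-level tree: one internal node routing to three leaves.
leaves = [Node([2, 4, 6]), Node([8, 10, 12]), Node([14, 16, 18])]
root = Node([8, 14], leaves)

print(search(root, 10))
print(search(root, 9))
```

Because each node has many children, the tree stays shallow, and the number of disk accesses per search equals the height of the tree rather than growing with the number of items in a node.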
B-Tree Insertion
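The core step of insertion, adding a key to a leaf and splitting the leaf when it overflows, might look like the sketch below. The function name, the leaf capacity of 5, and the choice to give the left half the extra item are assumptions made for illustration; duplicate keys and the propagation of splits up to the parent are not handled.

```python
from bisect import insort


def insert_into_leaf(items, key, capacity):
    """Insert key into a sorted leaf; split the leaf if it overflows.

    Returns (left, right, separator). When no split is needed, right and
    separator are None. The separator is the smallest key of the new right
    leaf, which the parent would record to route future searches.
    """
    items = list(items)      # work on a copy; the caller's leaf is unchanged
    insort(items, key)       # keep the leaf sorted
    if len(items) <= capacity:
        return items, None, None
    mid = (len(items) + 1) // 2          # left half keeps the extra item, if any
    left, right = items[:mid], items[mid:]
    return left, right, right[0]


print(insert_into_leaf([2, 4], 3, capacity=5))
print(insert_into_leaf([2, 4, 6, 8, 10], 5, capacity=5))
```

In a full B-tree, the parent would then add the separator and a pointer to the new leaf, and would itself split if it overflowed.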
B-Tree Deletion
Price per Megabyte of DRAM Operating System Concepts, 9th edition Silberschatz, Galvin, and Gagne (c) 2013 John Wiley & Sons. All rights reserved. 978-1-118-06333-0
Price per Megabyte of Magnetic Hard Disk
RAID Redundant Array of Independent Disks. Initially, Redundant Array of Inexpensive Disks. The term was coined in 1987 by David Patterson, Randy Katz, and Garth A. Gibson. Photo: RAID II Prototype (FrankenRAID), Computer History Museum.
RAID Redundancy RAID increases reliability by adding redundancy. Mirroring: maintain multiple copies of data. Striping: spread data blocks of a file across multiple disks (this improves performance; striping alone adds no redundancy). Error-correcting codes (ECC): use parity bits and checksums to recover lost data.
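How parity bits recover lost data can be illustrated with XOR parity. This is a minimal sketch assuming three hypothetical data disks holding one 4-byte block each plus a single parity disk; the helper name `xor_blocks` is made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"DATA", b"BASE", b"DISK"]     # blocks on three hypothetical data disks
parity = xor_blocks(data)              # block stored on the parity disk

# Simulate losing disk 1, then rebuild it from the survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

XOR-ing the surviving blocks with the parity block cancels every block except the missing one, so any single lost block can be reconstructed. Losing two blocks at once cannot be repaired with a single parity block.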
RAID Levels Different RAID levels offer different trade-offs between reliability and performance. RAID levels 0 through 6, and RAID level 1+0 (RAID 10). Example (figure): 4 disks of data; C: copy of the data; P: error-correcting bits.
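The cost side of the trade-off can be estimated by counting the redundant disks each level adds to the 4 data disks of the example. The sketch below rests on assumptions not stated on the slide: two-way mirroring for RAID 1 and RAID 10, a Hamming code for RAID 2, and one or two disks' worth of parity for RAID 3 through 6. The function name is hypothetical.

```python
def redundancy_disks(level, data_disks):
    """Extra disks a RAID level adds to `data_disks` disks' worth of data."""
    if level == "0":
        return 0                       # striping only, no redundancy
    if level in ("1", "10"):
        return data_disks              # assumed: one mirror copy (C) per data disk
    if level == "2":
        # Assumed Hamming code: smallest r with 2**r >= data_disks + r + 1.
        r = 0
        while 2 ** r < data_disks + r + 1:
            r += 1
        return r
    if level in ("3", "4", "5"):
        return 1                       # one disk's worth of parity (P)
    if level == "6":
        return 2                       # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


for level in ("0", "1", "2", "3", "4", "5", "6", "10"):
    extra = redundancy_disks(level, 4)
    print(f"RAID {level}: 4 data + {extra} redundant = {4 + extra} disks")
```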
RAID 0 Striped data blocks. No mirroring. Best performance. No fault tolerance. http://searchstorage.techtarget.com/definition/RAID
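Striping can be sketched as a round-robin mapping from logical block numbers to disks. The stripe unit of one block and the function name `locate_block` are assumptions for illustration; real arrays usually use larger stripe units.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks


# Show where the first 8 logical blocks land on a 4-disk array.
for block in range(8):
    disk, offset = locate_block(block, num_disks=4)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Consecutive blocks land on different disks, so a large sequential transfer can keep all the disks busy at once. Because each block exists on exactly one disk, losing any disk loses data.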
RAID 1 Disk mirroring. No striping. Improved read performance.
RAID 2 Striped data blocks. Parity information. Superseded by RAID 3.
RAID 3 Striped data blocks. One drive stores parity information. An I/O operation addresses all the drives at the same time, so I/O operations cannot overlap. Best for single-user systems with long-record applications.
RAID 4 Large stripes. Entire records can be read from any single drive, so reads can overlap. All writes have to update the parity drive, so writes cannot overlap.
RAID 5 Block-level striping. Striped (distributed) parity. The array can function even if one drive fails. Poor write performance, because each write must also update parity.
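Distributed parity can be pictured by printing which disk holds the parity block in each stripe. The rotation rule used here (parity on disk `stripe % num_disks`) is just one possible layout chosen for illustration, and `raid5_layout` is a hypothetical name; actual controllers and software RAID implementations use various rotations.

```python
def raid5_layout(stripe, num_disks):
    """Return a list describing what each disk holds in the given stripe."""
    parity_disk = stripe % num_disks          # assumed rotation rule
    next_block = stripe * (num_disks - 1)     # each stripe holds n - 1 data blocks
    row = []
    for disk in range(num_disks):
        if disk == parity_disk:
            row.append("P")
        else:
            row.append(f"D{next_block}")
            next_block += 1
    return row


for stripe in range(4):
    print(stripe, raid5_layout(stripe, num_disks=4))
```

Because the parity block moves from stripe to stripe, no single disk becomes the bottleneck for every write, unlike the dedicated parity drive of RAID 4.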
RAID 6 Similar to RAID 5. Includes a second parity scheme that is distributed across the drives in the array. Continues to function even if two disks fail simultaneously. Higher cost per gigabyte and often slower write performance than RAID 5.
RAID 10 (1+0) Combines RAID 1 and RAID 0. The data is mirrored and the mirrors are striped. Higher performance than RAID 1, but at a much higher cost.
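"Mirrored, then striped" can be sketched as a two-step mapping. The sketch assumes two-way mirrors, with disks 2p and 2p + 1 forming mirrored pair p, and a stripe unit of one block; the function names are hypothetical.

```python
def raid10_disks(logical_block, num_pairs):
    """Return the two physical disks holding copies of a logical block."""
    pair = logical_block % num_pairs       # RAID 0: stripe across the pairs
    return 2 * pair, 2 * pair + 1          # RAID 1: both disks of the pair


def survives(failed_disks, num_pairs):
    """The array survives as long as no mirrored pair has lost both disks."""
    failed = set(failed_disks)
    return not any({2 * p, 2 * p + 1} <= failed for p in range(num_pairs))


print(raid10_disks(3, num_pairs=2))
print(survives([0, 2], num_pairs=2))
print(survives([0, 1], num_pairs=2))
```

The array tolerates multiple failures only when they fall in different mirrored pairs; losing both disks of one pair loses data.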
RAID Tutorials http://www.thegeekstuff.com/2010/08/raid-levels-tutorial/ http://searchstorage.techtarget.com/definition/RAID
Solid-State Drives (SSDs) No moving parts. Fail less often than hard drives. Lessens the need for RAID. Use wear leveling instead for data protection. Each block can tolerate a finite number of program/erase cycles before becoming unreliable. Wear leveling distributes writes (program/erase cycles) evenly among all the blocks of the device.
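The idea behind wear leveling can be shown with a toy model that always writes to the least-worn block. The `WearLeveler` class is hypothetical and greatly simplified: real SSD controllers also remap logical addresses, relocate static data, and perform garbage collection.

```python
class WearLeveler:
    """Toy model: always write to the block with the fewest erase cycles."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks

    def write(self):
        """Pick the least-worn block, count one program/erase cycle, return its index."""
        block = min(range(len(self.erase_counts)),
                    key=self.erase_counts.__getitem__)
        self.erase_counts[block] += 1
        return block


ssd = WearLeveler(num_blocks=4)
for _ in range(10):
    ssd.write()
print(ssd.erase_counts)
```

With this policy the erase counts of any two blocks never differ by more than one, so no block reaches its program/erase limit long before the others.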