CS5226 Hardware Tuning

2 Application Programmer (e.g., business analyst, data architect); Sophisticated Application Programmer (e.g., SAP admin); DBA, Tuner. Hardware [Processor(s), Disk(s), Memory]; Operating System; Concurrency Control; Recovery; Storage Subsystem; Indexes; Query Processor; Application

3 Outline Part 1: Tuning the storage subsystem RAID storage system Choosing a proper RAID level Part 2: Enhancing the hardware configuration

4 Magnetic Disks
1956: IBM (RAMAC) first disk drive 5 Mb – Mb/in $/year 9 Kb/sec
1980: SEAGATE first 5.25’’ disk drive 5 Mb – 1.96 Mb/in2 625 Kb/sec
1999: IBM MICRODRIVE first 1’’ disk drive 340Mb 6.1 MB/sec
Disk components: controller, read/write head, disk arm, tracks, platter, spindle, actuator, disk interface

5 Magnetic Disks
Access Time (2001): controller overhead (0.2 ms), seek time (4 to 9 ms), rotational delay (2 to 6 ms), read/write time (10 to 500 KB/ms)
Disk Interface: IDE (16 bits, Ultra DMA - 25 MHz); SCSI: width (narrow 8 bits vs. wide 16 bits) - frequency (Ultra MHz).
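The access-time components listed above add up to the cost of one random disk request. The following is a minimal sketch of that sum, not a measured model: the default seek and rotational values are simply the midpoints of the slide's ranges, and the 100 KB/ms transfer rate and the request sizes are assumptions chosen for illustration.

```python
def access_time_ms(request_kb, overhead_ms=0.2, seek_ms=6.5,
                   rotation_ms=4.0, transfer_kb_per_ms=100.0):
    """Estimate the time (ms) to service one random disk request.

    Defaults are illustrative assumptions: seek and rotational delay are
    midpoints of the 2001 ranges on the slide, and the transfer rate is
    one value picked from the 10 to 500 KB/ms range.
    """
    transfer_ms = request_kb / transfer_kb_per_ms
    return overhead_ms + seek_ms + rotation_ms + transfer_ms


if __name__ == "__main__":
    # Hypothetical request sizes: a small page, a larger extent, a 1 MB read.
    for size_kb in (8, 64, 1024):
        print(f"{size_kb:>5} KB request: {access_time_ms(size_kb):.2f} ms")
```

Under these assumptions, seek and rotational delay dominate small requests, which is why sequential I/O and fewer, larger requests pay off.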

6 Storage Metrics
Columns: DRAM, Disk, Tape Robot
Unit Capacity: 2GB, 18GB, 14x70Gb
Unit Price: 1600$, 467$, 20900$
$/Gb
Latency (sec): 1.E-8, 2.E-3 (15k RPM), 3.E+1
Bandwidth (Mbps): (up to 160) 40 (up to 100)
Kaps: 1.E E-2
Maps: 1.E+3, 23, 3.E-2
Scan time (sec/Tb)

7 Hardware Bandwidth
System Bandwidth Yesterday, in megabytes per second (not to scale!)
Hard Disk | SCSI | PCI | Memory | Processor (surviving figure values: 40; 15 per disk)
The familiar bandwidth pyramid: the farther from the CPU, the less the bandwidth.
Slide courtesy of J. Gray/L. Chung

8 Hardware Bandwidth
System Bandwidth Today, in megabytes per second (not to scale!)
Hard Disk | SCSI | PCI | Memory | Processor (surviving figure value: 1,600)
The familiar pyramid is gone! PCI is now the bottleneck! In practice, 3 disks can reach saturation using sequential I/O.
Slide courtesy of J. Gray/L. Chung

9 RAID Storage System Redundant Array of Inexpensive Disks Combine multiple small, inexpensive disk drives into a group to yield performance exceeding that of one large, more expensive drive Appear to the computer as a single virtual drive Support fault-tolerance by redundantly storing information in various ways

10 RAID Types Five types of array architecture, RAID 1 to 5, with different disk fault tolerance and different trade-offs in features and performance. A non-redundant array of disk drives is often referred to as RAID 0. Only RAID 0, 1, 3 and 5 are commonly used; RAID 2 and 4 do not offer any significant advantages over these other types. Certain combinations are possible (10, 35, etc.): RAID 10 = RAID 1 + RAID 0.

11 RAID 0 - Striping No redundancy No fault tolerance High I/O performance Parallel I/O
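To make the parallel I/O of striping concrete, here is a small sketch of how a logical block number could be mapped to a disk and an offset on that disk. The round-robin layout, the function name `locate_block` and the `stripe_unit_blocks` parameter are assumptions for illustration; real controllers may lay data out differently.

```python
def locate_block(logical_block, num_disks, stripe_unit_blocks=1):
    """Map a logical block to (disk index, block offset on that disk).

    Assumes a plain round-robin RAID 0 layout: consecutive stripe units
    go to consecutive disks, wrapping around after the last disk.
    """
    stripe_unit = logical_block // stripe_unit_blocks   # which stripe unit
    disk = stripe_unit % num_disks                      # round-robin disk choice
    stripe_row = stripe_unit // num_disks               # how many full rows precede it
    offset = stripe_row * stripe_unit_blocks + logical_block % stripe_unit_blocks
    return disk, offset


if __name__ == "__main__":
    # Eight consecutive logical blocks spread over 4 disks.
    for block in range(8):
        print(block, locate_block(block, num_disks=4))
```

Consecutive blocks land on different disks, so a large sequential request can be served by all disk arms at once; losing any one disk loses part of every large file.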

12 RAID 1 – Mirroring Provide good fault tolerance Works ok if one disk in a pair is down One write = a physical write on each disk One read = either read both or read the less busy one Could double the read rate

13 RAID 3 - Parallel Array with Parity Fast read/write All disk arms are synchronized Speed is limited by the slowest disk

14 Parity Check - Classical An extra bit is added to a byte to detect errors in storage or transmission. Even (odd) parity means that the parity bit is set so that there is an even (odd) number of one bits in the word, including the parity bit. A single parity bit reliably detects only single-bit errors (more generally, any odd number of flipped bits): if an even number of bits are wrong, the overall parity does not change and the error goes unnoticed. It is not possible to tell which bit is wrong.
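The behaviour described above can be checked with a few lines of code. This is an illustrative sketch of even parity over one byte; the helper names and the sample byte are hypothetical.

```python
def even_parity_bit(byte):
    """Return the parity bit that makes the total number of one bits even."""
    return bin(byte & 0xFF).count("1") % 2


def check_even_parity(byte, parity_bit):
    """Return True if the byte plus its parity bit has an even number of ones."""
    return (bin(byte & 0xFF).count("1") + parity_bit) % 2 == 0


if __name__ == "__main__":
    data = 0b10110010                 # sample byte with four one bits
    parity = even_parity_bit(data)
    one_flip = data ^ 0b00000001      # single-bit error
    two_flips = data ^ 0b00000011     # double-bit error
    print("stored ok:       ", check_even_parity(data, parity))
    print("1-bit error seen:", not check_even_parity(one_flip, parity))
    print("2-bit error seen:", not check_even_parity(two_flips, parity))
```

The single flip is detected, while the double flip passes the check, which is the limitation stated on the slide.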

15 RAID 5 – Parity Checking For error detection, rather than full redundancy Each stripe unit has an extra parity stripe Parity stripes are distributed

16 RAID 5 Read/Write Read: parallel stripes read from multiple disks Good performance Write: 2 reads + 2 writes Read old data stripe; read parity stripe (2 reads) XOR old data stripe with new data stripe. XOR result into parity stripe. Write new data stripe and new parity stripe (2 writes).
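The 2 reads + 2 writes sequence can be sketched with byte strings standing in for stripe units. The helper names and sample data are hypothetical; the point is that XOR-ing the old data, the new data and the old parity gives the same parity as recomputing it over the whole stripe, and that the parity also lets a lost stripe unit be rebuilt.

```python
from functools import reduce


def xor_blocks(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity of a whole stripe: XOR of all its data blocks."""
    return reduce(xor_blocks, blocks)


def small_write(old_data, old_parity, new_data):
    """RAID 5 small write: the caller has read old_data and old_parity
    (2 reads) and will write back the returned pair (2 writes)."""
    delta = xor_blocks(old_data, new_data)      # XOR old data with new data
    new_parity = xor_blocks(old_parity, delta)  # XOR the result into the parity
    return new_data, new_parity


if __name__ == "__main__":
    stripe = [b"\x0f\x0f", b"\xf0\x01", b"\x33\x55"]   # three data stripe units
    parity = full_parity(stripe)
    stripe[1], parity = small_write(stripe[1], parity, b"\xaa\xbb")
    print("parity consistent:", parity == full_parity(stripe))
    # Rebuild stripe unit 1 from the parity and the surviving units.
    rebuilt = full_parity([parity, stripe[0], stripe[2]])
    print("rebuilt lost unit:", rebuilt == stripe[1])
```

Only the changed data unit and the parity unit are touched, so the other disks in the stripe stay free for other requests.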

17 RAID 10 – Striped Mirroring RAID 10 = striping + mirroring: a striped array of RAID 1 arrays. High performance of RAID 0 and high fault tolerance of RAID 1 (at the cost of doubling the disks). More information about RAID disks at

18 Hardware vs. Software RAID
Software RAID: runs on the server’s CPU. Directly dependent on server CPU performance and load. Occupies host system memory and CPU cycles, degrading server performance.
Hardware RAID: runs on the RAID controller’s CPU. Does not occupy any host system memory. Is not operating system dependent. The host CPU can execute applications while the array adapter's processor simultaneously executes array functions: true hardware multi-tasking.

19 RAID Levels - Data Settings: accounts( number, branchnum, balance); create clustered index c on accounts(number); rows Cold Buffer Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.

20 RAID Levels - Transactions No Concurrent Transactions: Read Intensive: select avg(balance) from accounts; Write Intensive, e.g. typical insert: insert into accounts values (690466,6840, ); Writes are uniformly distributed.

21 RAID Levels SQL Server 7 on Windows 2000 (SoftRAID means striping/parity at host). Read-intensive: using multiple disks (RAID 0, RAID 10, RAID 5) increases throughput significantly. Write-intensive: without cache, RAID 5 suffers; with cache, it is OK.

22 Comparing RAID Levels
Read: RAID 0 High; RAID 1 2X; RAID 5 and RAID 10 High
Write: RAID 0 High; RAID 1 1X; RAID 5 Medium; RAID 10 High
Fault tolerance: RAID 0 No; RAID 1, RAID 5 and RAID 10 Yes
Disk utilization: RAID 0 High; RAID 1 Low; RAID 5 High; RAID 10 Low
Key problems: RAID 0: data lost when any disk fails; RAID 1: uses double the disk space; RAID 5: lower throughput with disk failure; RAID 10: very expensive, not scalable
Key advantages: RAID 0: high I/O performance; RAID 1: very high I/O performance; RAID 5: a good overall balance; RAID 10: high reliability with good performance
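The "disk utilization" and "write" rows can be put into rough numbers. The sketch below is an illustration, not a benchmark: it assumes two-way mirroring for RAID 1 and RAID 10, one parity unit per stripe for RAID 5, and no controller cache, and it counts physical I/Os per small logical write following the mirroring and RAID 5 read/write slides. The function and table names are hypothetical.

```python
# Physical disk I/Os caused by one small logical write (no controller cache).
PHYSICAL_IOS_PER_WRITE = {
    "RAID 0": 1,    # one write to one disk
    "RAID 1": 2,    # one physical write on each disk of the mirror pair
    "RAID 5": 4,    # 2 reads + 2 writes (old data, old parity, new data, new parity)
    "RAID 10": 2,   # mirrored write, as in RAID 1
}


def usable_fraction(level, num_disks):
    """Fraction of raw disk capacity available for data."""
    if level == "RAID 0":
        return 1.0
    if level in ("RAID 1", "RAID 10"):
        return 0.5                       # assumes two-way mirroring
    if level == "RAID 5":
        return (num_disks - 1) / num_disks
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    disks, disk_gb = 4, 18               # matches the 4x18Gb test machine
    for level, ios in PHYSICAL_IOS_PER_WRITE.items():
        usable = usable_fraction(level, disks) * disks * disk_gb
        print(f"{level:8} usable {usable:5.1f} GB, {ios} I/Os per small write")
```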

23 Controller Pre-fetching No, Write-back Yes
Read-ahead: prefetching at the disk controller level. The controller has no information on the access pattern, so it is better to let the database management system do it.
Write-back vs. write-through:
Write-back: transfer terminated as soon as data is written to cache. Batteries guarantee write-back in case of power failure.
Write-through: transfer terminated as soon as data is written to disk.

24 SCSI Controller Cache - Data Settings: employees(ssnum, name, lat, long, hundreds1, hundreds2); create clustered index c on employees(hundreds2); Employees table partitioned over two disks; Log on a separate disk; same controller (same channel) rows per table Database buffer size limited to 400 Mb. Dual Xeon (550MHz,512Kb), 1Gb RAM, Internal RAID controller from Adaptec (80Mb), 4x18Gb drives (10000RPM), Windows 2000.

25 SCSI (not disk) Controller Cache - Transactions No Concurrent Transactions: update employees set lat = long, long = lat where hundreds2 = ?; cache friendly: update of 20,000 rows (~90Mb) cache unfriendly: update of 200,000 rows (~900Mb)

26 SCSI Controller Cache SQL Server 7 on Windows. Adaptec ServerRaid controller: 80 Mb RAM, write-back mode. Updates: controller cache increases throughput whether the operation is cache friendly or not. Efficient replacement policy!

27 Which RAID Level to Use?
Data and Index Files: RAID 5 is best suited for read-intensive apps, or if the RAID controller cache is effective enough. RAID 10 is best suited for write-intensive apps.
Log File: RAID 1 is appropriate. Fault tolerance with high write throughput. Writes are synchronous and sequential. No benefits in striping.
Temporary Files: RAID 0 is appropriate. No fault tolerance. High throughput.
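These rules of thumb can be written down as a small decision function. This is only a sketch of the slide's guidance: the function name, the file-kind labels and the two boolean flags are hypothetical, and a real choice would also weigh cost and capacity.

```python
def recommend_raid(file_kind, write_intensive=False, controller_cache_effective=False):
    """Return the RAID level suggested by the rules of thumb above.

    file_kind is one of "data", "index", "log" or "temp" (labels assumed
    for this example).
    """
    if file_kind == "log":
        return "RAID 1"      # fault tolerance; sequential writes gain nothing from striping
    if file_kind == "temp":
        return "RAID 0"      # throughput matters, contents can be rebuilt
    if file_kind in ("data", "index"):
        if write_intensive and not controller_cache_effective:
            return "RAID 10"  # avoids the RAID 5 small-write penalty
        return "RAID 5"       # read-intensive, or the cache hides the write cost
    raise ValueError(f"unknown file kind: {file_kind}")


if __name__ == "__main__":
    print(recommend_raid("log"))
    print(recommend_raid("data", write_intensive=True))
    print(recommend_raid("index", write_intensive=True, controller_cache_effective=True))
```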

28 What RAID Provides Fault tolerance It does not prevent disk drive failures It enables real-time data recovery High I/O performance Mass data capacity Configuration flexibility Lower protected storage costs Easy maintenance

29 Enhancing Hardware Config.
Add memory: cheapest option to get better performance. Can be used to enlarge the DB buffer pool (better hit ratio). If used to enlarge the OS buffer (as disk cache), it benefits other apps as well.
Add disks
Add processors
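Why a better hit ratio matters can be seen from the average cost of a page access. The sketch below uses assumed costs (about 10 ms for a random disk read, 0.0001 ms for a buffer hit) purely for illustration; the function name is hypothetical.

```python
def avg_page_access_ms(hit_ratio, memory_ms=0.0001, disk_ms=10.0):
    """Average page access time for a given buffer pool hit ratio.

    memory_ms and disk_ms are illustrative assumptions, not measurements.
    """
    return hit_ratio * memory_ms + (1.0 - hit_ratio) * disk_ms


if __name__ == "__main__":
    for ratio in (0.80, 0.90, 0.99):
        print(f"hit ratio {ratio:.2f}: {avg_page_access_ms(ratio):.3f} ms per page")
```

Because a miss is so much more expensive than a hit, the average is driven almost entirely by the miss rate: under these assumptions, going from a 90% to a 99% hit ratio cuts the average access time by roughly a factor of ten.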

30 Add Disks Larger disk ≠ better performance: the bottleneck is disk bandwidth. Add disks for: a dedicated disk for the log; switching from RAID 5 to RAID 10 for update-intensive apps; moving secondary indexes to another disk for write-intensive apps; partitioning read-intensive tables across many disks. Consider intelligent disk systems: automatic replication and load balancing.

31 Add Processors
Function parallelism: use different processors for different tasks (GUI, query optimisation, TT&CC, different types of apps, different users). Operation pipelines, e.g., scan, sort, select, join… Easy for RO apps, hard for update apps.
Data partition parallelism: partition data, thus the operation on the data.

32 Parallelism Some tasks are easier to parallelize E.g., join phase of GRACE hash join E.g., scan, join, sum, min Some tasks are not so easy E.g., sorting, avg, nested-queries
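A small sketch of data partition parallelism shows the difference. Sum and min combine directly from per-partition results, while avg needs each partition to return its sum and row count, because averaging the partition averages is wrong when partitions differ in size. The partition contents and function names are made up for illustration, and a thread pool stands in for separate processors.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(partition):
    """Per-partition work: return (sum, row count, min) for one partition."""
    return sum(partition), len(partition), min(partition)


def parallel_aggregate(partitions):
    """Aggregate each partition in its own task, then combine the partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_aggregate, partitions))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return {
        "sum": total,
        "min": min(p[2] for p in partials),
        "avg": total / count,   # combine sums and counts, not averages
    }


if __name__ == "__main__":
    # Hypothetical account balances split unevenly over three partitions.
    partitions = [[10, 20], [30], [40, 50, 60]]
    print(parallel_aggregate(partitions))
    naive = sum(sum(p) / len(p) for p in partitions) / len(partitions)
    print("average of partition averages (incorrect):", naive)
```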

33 Summary We have covered: The storage subsystem RAID: what are they and which one to use? Memory, disks and processors When to add what?

34 Database Tuning Database Tuning is the activity of making a database application run more quickly. “More quickly” usually means higher throughput, though it may mean lower response time for time-critical applications.

35 Tuning Principles Think globally, fix locally. Partitioning breaks bottlenecks (temporal and spatial). Start-up costs are high; running costs are low. Render unto server what is due unto server. Be prepared for trade-offs (indexes and inserts).

36 Tuning Mindset
1. Set reasonable performance tuning goals
2. Measure and document current performance
3. Identify current system performance bottleneck
4. Identify current OS bottleneck
5. Tune the required components, e.g., application, DB, I/O, contention, OS, etc.
6. Track and exercise change-control procedures
7. Measure and document current performance
8. Repeat steps 3 through 7 until the goal is met

37 Goals Met? Appreciation of DBMS architecture Study the effect of various components on the performance of the systems Tuning principle Troubleshooting techniques for chasing down performance problems Hands-on experience in Tuning