Slide 1 Initial Availability Benchmarking of a Database System
Aaron Brown
2001 Winter ISTORE Retreat
Slide 2 Motivation
Extend availability benchmarks to new areas
  – explore generality and limitations of approach
  – gain more understanding of system failure modes
Why look at database availability?
  – databases hold the critical hard state for most enterprise and e-business applications
    » the most important system component to keep available
  – we trust databases to be highly reliable. Should we?
    » how do DBMSs react to hardware faults/failures?
    » what is the user-visible impact of such failures?
Slide 3 Approach
Use our availability benchmarking methodology to evaluate database robustness
  – focus on storage system failures
  – study 3-tier OLTP workload
    » back-end: commercial database
    » middleware: transaction monitor & business logic
    » front-end: web-based form interface
  – measure availability in terms of performance
    » also possible to look at consistency of data
Slide 4 Refresher: availability benchmarks
Goal: quantify variation in quality of service as system availability is compromised
Leverage existing performance benchmark
  – to measure & trace quality of service metrics
  – to generate fair workloads
Use fault injection to compromise system
Observe results graphically
Slide 5 Availability metrics for databases
Possible OLTP quality of service metrics
  – transaction throughput
  – transaction response time
    » better: % of transactions longer than a fixed cutoff
  – rate of transactions aborted due to errors
  – consistency of database
  – fraction of database content available
Our experiments focused on throughput
  – rates of normal and failed transactions
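The throughput-oriented metrics listed on this slide can be sketched as a small per-minute aggregation over a transaction trace. This is only an illustration: the `Txn` record layout, the function name `qos_per_minute`, and the 5-second cutoff are assumptions, not part of the benchmark kit or of the measurement harness used in the experiments.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Txn:
    """One completed transaction (hypothetical record layout)."""
    finish_s: float    # completion time, in seconds since the start of the run
    response_s: float  # response time, in seconds
    ok: bool           # True if it completed correctly, False if aborted with an error


def qos_per_minute(txns, cutoff_s=5.0):
    """Bucket transactions by minute and compute per-minute QoS metrics.

    Returns {minute: (ok_per_min, aborted_per_min, pct_slow)}, where pct_slow
    is the percentage of correct transactions slower than cutoff_s (assumed cutoff).
    """
    buckets = defaultdict(list)
    for t in txns:
        buckets[int(t.finish_s // 60)].append(t)

    result = {}
    for minute, group in sorted(buckets.items()):
        ok = [t for t in group if t.ok]
        aborted = len(group) - len(ok)
        slow = sum(1 for t in ok if t.response_s > cutoff_s)
        pct_slow = 100.0 * slow / len(ok) if ok else 0.0
        result[minute] = (len(ok), aborted, pct_slow)
    return result


# Tiny made-up trace: three transactions in minute 0, one in minute 1.
trace = [
    Txn(10.0, 0.4, True),
    Txn(30.0, 6.0, True),
    Txn(50.0, 0.5, False),
    Txn(70.0, 0.3, True),
]
metrics = qos_per_minute(trace)
```

Minutes in which no transaction completes are simply absent from the result; a harness that needs to plot a hung system would have to fill those minutes with zeros.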
Slide 6 Fault injection
Disk subsystem faults only
  – realistic fault set based on Tertiary Disk study
    » correctable & uncorrectable media errors, hardware errors, power failures, disk hangs/timeouts
    » both transient and “sticky” faults
    » note: similar fault set to RAID benchmarks
  – injected via an emulated SCSI disk (~0.5ms overhead)
  – faults injected in one of two partitions:
    » database data partition
    » database’s write-ahead log partition
Slide 7 Experimental setup
Database
  – Microsoft SQL Server 2000, default configuration
Middleware/front-end software
  – Microsoft COM+ transaction monitor/coordinator
  – IIS 5.0 web server with Microsoft’s tpcc.dll HTML terminal interface and business logic
  – Microsoft BenchCraft remote terminal emulator
TPC-C-like OLTP order-entry workload
  – 10 warehouses, 100 active users, ~860 MB database
Measured metrics
  – throughput of correct NewOrder transactions/min
  – rate of aborted NewOrder transactions (txn/min)
Slide 8 Experimental setup (2)
Database installed in one of two configurations:
  – data on emulated disk, log on real (IBM) disk
  – data on real (IBM) disk, log on emulated disk
[Diagram of the three-machine testbed; labels recovered from the figure:]
  – Front End: Intel P-III/ MB DRAM, Windows 2000 AS, MS BenchCraft RTE, IIS + MS tpcc.dll, MS COM+, SCSI system disk
  – DB Server: AMD K6-2/ MB DRAM, Windows 2000 AS, SQL Server 2000, IDE system disk, Adaptec 3940, IBM 18 GB 10k RPM disk, DB data/log disks
  – Disk Emulator: Intel P-II/ MB DRAM, Windows NT 4.0, Adaptec 2940, emulator backing disk (NTFS) on IBM 18 GB 10k RPM, AdvStor ASC-U2W UltraSCSI, ASC VirtualSCSI lib., SCSI system disk
  – Other labels: Emulated Disk; 100 Mb Ethernet; “=” denotes Fast/Wide SCSI bus, 20 MB/sec
Slide 9 Results
All results are from single-fault micro-benchmarks
14 different fault types
  – injected once for each of data and log partitions
4 categories of behavior detected
  1) normal
  2) transient glitch
  3) degraded
  4) failed
Slide 10 Type 1: normal behavior
System tolerates fault
Demonstrated for all sector-level faults except:
  – sticky uncorrectable read, data partition
  – sticky uncorrectable write, log partition
Slide 11 Type 2: transient glitch
One transaction is affected, aborts with error
Subsequent transactions using same data would fail
Demonstrated for one fault only:
  – sticky uncorrectable read, data partition
Slide 12 Type 3: degraded behavior
DBMS survives error after running log recovery
Middleware partially fails, resulting in degraded performance
Demonstrated for one fault only:
  – sticky uncorrectable write, log partition
Slide 13 Type 4: failure
DBMS hangs or aborts all transactions
Middleware behaves erratically, sometimes crashing
Demonstrated for all fatal disk-level faults
  – SCSI hangs, disk power failures
Example behaviors (10 distinct variants observed)
[Plots: “Disk hang during write to data disk”; “Simulated log disk power failure”]
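One way to picture the four behavior types is as a rule applied to the per-minute throughput traces of a single-fault run. The sketch below is an illustrative heuristic only: the slides classify runs by observing the DBMS and middleware, not by this rule, and the function name `classify_run`, the 10% tolerance, and the 3-minute settling window are all assumptions.

```python
def classify_run(ok_per_min, aborted_per_min, fault_minute, tolerance=0.1, settle=3):
    """Label one single-fault run with one of the four behavior types.

    ok_per_min / aborted_per_min: per-minute transaction counts for the whole run.
    fault_minute: index of the minute in which the fault was injected (>= 1).
    tolerance: allowed fractional drop from the pre-fault baseline (assumed value).
    settle: number of final minutes averaged as the post-fault steady state (assumed).
    """
    # Baseline: mean correct throughput before the fault was injected.
    baseline = sum(ok_per_min[:fault_minute]) / fault_minute

    after_ok = ok_per_min[fault_minute:]
    after_aborted = aborted_per_min[fault_minute:]

    # Steady state: mean correct throughput over the last few minutes of the run.
    tail = after_ok[-settle:]
    steady = sum(tail) / len(tail)

    if steady == 0:
        return "failed"            # no correct transactions complete any more
    if steady < (1 - tolerance) * baseline:
        return "degraded"          # system keeps running, but below baseline
    if sum(after_aborted) > 0:
        return "transient glitch"  # some aborts, throughput otherwise unaffected
    return "normal"                # fault tolerated with no visible effect


# Made-up traces, fault injected at minute 2 in each case.
no_aborts = [0, 0, 0, 0, 0, 0]
normal = classify_run([100] * 6, no_aborts, 2)
glitch = classify_run([100] * 6, [0, 0, 1, 0, 0, 0], 2)
degraded = classify_run([100, 100, 40, 60, 60, 60], no_aborts, 2)
failed = classify_run([100, 100, 20, 0, 0, 0], [0, 0, 30, 50, 50, 50], 2)
```

A rule like this cannot tell a hung DBMS from crashed middleware, since both show up as zero throughput at the client.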
Slide 14 Results: summary
DBMS was robust to a wide range of faults
  – tolerated all transient and recoverable errors
  – tolerated some unrecoverable faults
    » transparently (e.g., uncorrectable data writes)
    » or by reflecting fault back via transaction abort
    » these were not tolerated by the SW RAID systems
Overall, DBMS is significantly more robust to disk faults than software RAID systems!
Slide 15 Results: discussion
DBMS’s extra robustness comes from:
  – redundant data representation in form of log
  – transactions (compare RAID: blocks don’t let you do this)
    » standard mechanism for reporting errors (txn abort)
    » encapsulate meaningful unit of work, providing consistent rollback upon failure
But, middleware was not robust, compromising overall system availability
  – crashed or behaved erratically when DBMS recovered or returned errors
  – user cannot distinguish DBMS and middleware failure
  – system is only as robust as its weakest component!
Slide 16 Discussion of methodology
General availability benchmarking methodology does work on more than just RAID systems
Issues in adapting the methodology
  – defining appropriate metrics
  – measuring non-performance availability metrics
  – understanding layered (multi-tier) systems with only end-to-end instrumentation
Slide 17 Discussion of methodology
General availability benchmarking methodology does work on more than just RAID systems
Issues in adapting the methodology
  – defining appropriate metrics
    » metrics to capture database ACID properties
    » adapting “binary” metrics such as data consistency
  – measuring non-performance availability metrics
    » existing benchmarks (like TPC-C) may not do this
  – understanding layered (multi-tier) systems with only end-to-end instrumentation
    » teasing apart availability impact of different layers
DO NOT PROJECT THIS SLIDE!
Slide 18 Future directions
Last retreat: James Hamilton proposed availability/maintainability extensions to TPC
This work is a (small) step toward that goal
  – exposed limitations, capabilities of disk fault injection
  – revealed importance of middleware, which clearly must be considered as part of the benchmark
  – hints at poor state-of-the-art in TPC-C benchmark middleware fault handling
Next:
  – expand metrics, including tests of ACID properties
  – consider other fault injection points besides disks
  – investigate clustered database designs
  – study issues in benchmarking layered systems
Slide 19 Thanks!
Microsoft SQL Server group
  – for generously providing access to SQL Server 2000 and the Microsoft TPC-C Benchmark Kit
  – James Hamilton
  – Jamie Redding and Charles Levine
Slide 20 Backup slides
Slide 21 Example results: failing data disk
  – Transient, correctable read fault (system tolerates fault)
  – Sticky, uncorrectable read fault (transaction is aborted with error)
  – Disk hang between SCSI commands (DBMS hangs, middleware returns errors)
  – Disk hang during a data write (DBMS hangs, middleware crashes)
Slide 22 Example results: failing log disk
  – Transient, correctable write fault (system tolerates fault)
  – Sticky, uncorrectable write fault (DBMS recovers, middleware degrades)
  – Simulated disk power failure (DBMS aborts all txns with errors)
  – Disk hang between SCSI commands (DBMS hangs, middleware hangs)