Disk Array Performance Estimation AGH University of Science and Technology Department of Computer Science Jacek Marmuszewski Darin Nikołow, Marek Pogoda, Renata Slota, Jacek Kitowski
Outline
- Introduction
  - motivation for performance estimation of disk array
  - problems connected with estimation of disk array
  - requirements for disk array estimator
- Solution
- Environment
- Disk Array Performance Tests
- Estimation Model
- Estimation Quality Tests
- Future work
Introduction
motivation for performance estimation of disk array
Proper and efficient performance estimation of storage systems is essential for many processes occurring in distributed computational environments, such as:
- replica selection,
- new replica creation,
- creating a VO,
- specifying data storage performance requirements in an SLA,
- guaranteeing the fulfillment of the SLA within a VO.
Introduction
problems connected with estimation of disk array
- Complexity of algorithms used to determine the best solution for storing data
- Shared resources
- Virtualization
Introduction
requirements for disk array estimator
- Estimator response time
- Estimation quality
Solution Model identification via active experiments
Environment general view - Disk Array - Host / Server - User / Application
Test Environment
Disk Array 1: Infortrend A16F-G2430
- 2 GB cache
- 16x 1 TB HDD – SATA, RAID6
- 2x 4/8 Gbit/s Fibre Channel interface
Server 1
- Xeon QuadCore
- 4 GB RAM
Disk Array 2: Intel Entry Storage System SS4200-E
- 4x 500 GB HDD – SATA, RAID5
- 1 Gbit Ethernet
Server 2
- Intel Core2Duo E Ghz
- 2 GB RAM
Test Environment - Disk Array - Host / Server Monitoring daemon Estimator Service & database Sending data using ICE ICE – Internet Communications Engine from ZeroC
Disk Array Performance Tests
- Tests written in C
- Using 'fwrite'
- Synchronizing (flushing) once, just before the test ends
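A minimal sketch of what such a write test might look like. The helper name `measure_write_mbps`, its parameters, the fill pattern, and the use of `fflush` plus POSIX `fsync` as the single synchronization step are assumptions made for illustration; the slides only state that the tests use `fwrite` and synchronize once before the test ends.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: writes total_bytes to path in block_size chunks
 * using fwrite, synchronizes once at the end, and returns the measured
 * throughput in MB/s (or -1.0 on any error). */
double measure_write_mbps(const char *path, size_t total_bytes, size_t block_size)
{
    if (total_bytes == 0 || block_size == 0)
        return -1.0;

    char *buf = malloc(block_size);
    if (buf == NULL)
        return -1.0;
    memset(buf, 0xA5, block_size);          /* arbitrary fill pattern */

    FILE *f = fopen(path, "wb");
    if (f == NULL) {
        free(buf);
        return -1.0;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    size_t left = total_bytes;
    int ok = 1;
    while (left > 0 && ok) {
        size_t chunk = left < block_size ? left : block_size;
        ok = (fwrite(buf, 1, chunk, f) == chunk);
        left -= chunk;
    }

    /* Single synchronization point, just before the test ends
     * (assumption: stdio flush followed by fsync). */
    if (ok && (fflush(f) != 0 || fsync(fileno(f)) != 0))
        ok = 0;

    clock_gettime(CLOCK_MONOTONIC, &t1);
    fclose(f);
    free(buf);
    if (!ok)
        return -1.0;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;
    return ((double)total_bytes / 1e6) / secs;
}
```

Because the synchronization is inside the timed region, the result reflects the time until the data is handed to the storage device rather than only the time spent filling host-side buffers. Repeating the call for a range of `total_bytes` values would give one speed sample per transfer size.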
Disk Array Performance Tests
Model
How to obtain those values automatically?
Model
[Figure: speed versus size, with regions labeled "stored in cache" and "stored on HDDs" and a "Cache size" marker]
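One way to read this slide is as a two-regime model: data that fits in the free part of the cache is accepted at cache speed, and the rest is limited by the HDDs. The sketch below encodes that reading. The struct, the function names, and the simplification that the HDDs do not drain the cache during the transfer are assumptions for illustration, not the authors' stated model.

```c
/* Illustrative two-regime model. Sizes in MB, speeds in MB/s.
 * All names are hypothetical. */
typedef struct {
    double cache_free_mb;  /* cache space available to absorb the write */
    double cache_speed;    /* speed while data still fits in the cache  */
    double hdd_speed;      /* sustained speed once the cache is full    */
} array_model;

/* Estimated time to transfer size_mb; returns 0.0 for invalid input. */
double estimate_transfer_seconds(const array_model *m, double size_mb)
{
    if (size_mb <= 0.0 || m->cache_speed <= 0.0 || m->hdd_speed <= 0.0)
        return 0.0;

    double in_cache = size_mb < m->cache_free_mb ? size_mb : m->cache_free_mb;
    if (in_cache < 0.0)
        in_cache = 0.0;
    double on_hdd = size_mb - in_cache;

    return in_cache / m->cache_speed + on_hdd / m->hdd_speed;
}

/* Effective speed over the whole transfer, in MB/s. */
double estimate_speed(const array_model *m, double size_mb)
{
    double t = estimate_transfer_seconds(m, size_mb);
    return t > 0.0 ? size_mb / t : 0.0;
}
```

With 2000 MB of free cache, a cache speed of 400 MB/s and an HDD speed of 100 MB/s (made-up numbers), a 1000 MB transfer is estimated at 400 MB/s, while a 4000 MB transfer takes 5 s + 20 s and averages 160 MB/s.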
Model
Cache usage estimation
- Monitoring I/O operations on every host!
- Knowledge of NIC speed and HDD speed
- Best way: get this information directly from the disk array
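When the disk array does not report its cache state, the fill level could be approximated from the monitored host traffic. The sketch below assumes a simple fluid model: the cache fills at the total host write rate (capped by the NIC speed) and drains at the HDD speed. The function name, its parameters, and the model itself are illustrative assumptions.

```c
#include <stddef.h>

/* Hypothetical update step: returns the estimated cache fill (MB) after
 * dt_s seconds, given the monitored per-host write rates (MB/s). */
double update_cache_fill(double fill_mb, double cache_size_mb,
                         const double *host_write_mbps, size_t n_hosts,
                         double nic_mbps, double hdd_mbps, double dt_s)
{
    double incoming = 0.0;
    for (size_t i = 0; i < n_hosts; i++)
        incoming += host_write_mbps[i];

    /* The array cannot receive data faster than its interface allows. */
    if (incoming > nic_mbps)
        incoming = nic_mbps;

    /* Net change: data arriving minus data flushed to the HDDs. */
    double next = fill_mb + (incoming - hdd_mbps) * dt_s;

    /* Fill level stays within [0, cache size]. */
    if (next < 0.0)
        next = 0.0;
    if (next > cache_size_mb)
        next = cache_size_mb;
    return next;
}
```

Calling this once per monitoring interval keeps a running estimate of the cache state, which is the quantity the cache-versus-HDD speed model needs.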
Model
Multiple users – average bandwidth usage
- Divide the bandwidth equally among all hosts
- On each host, divide the bandwidth equally among all users
- If a host/user is not using all of its bandwidth, divide the remainder among the others
- Use this value as "Max Disk Array Speed"
- Do the same for the "Max HDD's speed" value
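The sharing rule above resembles max-min fair allocation: everyone gets an equal share, and whatever a consumer does not need is redistributed among the rest. The following sketch shows one way to compute it; the function name `fair_share` and the representation of hosts or users as an array of non-negative demands are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical max-min fair split of `capacity` among n consumers.
 * demand[i] is the bandwidth consumer i is currently trying to use
 * (assumed non-negative); alloc[i] receives its share. */
void fair_share(const double *demand, double *alloc, size_t n, double capacity)
{
    double remaining = capacity;
    size_t active = n;

    for (size_t i = 0; i < n; i++)
        alloc[i] = -1.0;                 /* -1 marks "not decided yet" */

    while (active > 0) {
        double share = remaining / (double)active;
        size_t settled = 0;

        /* Consumers needing no more than the equal share get their
         * full demand; their unused part stays in `remaining`. */
        for (size_t i = 0; i < n; i++) {
            if (alloc[i] < 0.0 && demand[i] <= share) {
                alloc[i] = demand[i];
                remaining -= demand[i];
                settled++;
            }
        }
        active -= settled;

        if (settled == 0) {
            /* Everyone left wants more than the share:
             * split what remains equally and stop. */
            for (size_t i = 0; i < n; i++)
                if (alloc[i] < 0.0)
                    alloc[i] = share;
            active = 0;
        }
    }
}
```

For demands of 10, 50 and 100 MB/s and a capacity of 90 MB/s, the first consumer keeps its 10 MB/s and the other two receive 40 MB/s each. The two-level rule from the slide could be approximated by running the function once across hosts and then once per host across its users.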
Model
Multiple users – estimating future r/w speed
- Don't use the current speed!
- Use a weighted mean of historical r/w data.
[Figure: speed over time and its history, with samples s0, s1, s2, s3]
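The slide does not say how the historical samples are weighted. As one possible illustration, the sketch below assumes geometrically decaying weights, so that the most recent sample s0 counts most; the function name and the `decay` parameter are hypothetical.

```c
#include <stddef.h>

/* Hypothetical weighted mean of n historical speed samples.
 * s[0] is the most recent sample; sample i gets weight decay^i
 * (assumption: the slides do not specify the weighting scheme). */
double weighted_mean_speed(const double *s, size_t n, double decay)
{
    double weight = 1.0;
    double sum = 0.0;
    double weight_sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        sum += weight * s[i];
        weight_sum += weight;
        weight *= decay;
    }
    return weight_sum > 0.0 ? sum / weight_sum : 0.0;
}
```

With samples of 100 and 50 MB/s and a decay of 0.5, the estimate is (100 + 0.5 · 50) / 1.5 ≈ 83.3 MB/s, which sits between the latest reading and the older history instead of following the current speed alone.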
Estimation quality tests
Estimation quality tests for multiple users
- Average absolute performance estimation error: 34.8 MB/s
- Maximal absolute performance estimation error: 64.4 MB/s
- Average absolute performance estimation error: 8.2 %
- Maximal absolute performance estimation error: 13.9 %
- Average estimator response time (+ ICE) = ~1.2 ms
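For reference, error figures of this kind can be computed from pairs of measured and estimated speeds as sketched below. The struct and function names are hypothetical, and expressing the percentage relative to the measured value is an assumption; the slides do not state which reference value was used.

```c
#include <stddef.h>

/* Hypothetical container for the four error figures. */
typedef struct {
    double avg_abs;  /* average absolute error, MB/s */
    double max_abs;  /* maximal absolute error, MB/s */
    double avg_pct;  /* average absolute error, % of measured value */
    double max_pct;  /* maximal absolute error, % of measured value */
} error_stats;

error_stats estimation_errors(const double *measured,
                              const double *estimated, size_t n)
{
    error_stats e = {0.0, 0.0, 0.0, 0.0};
    if (n == 0)
        return e;

    for (size_t i = 0; i < n; i++) {
        double diff = estimated[i] - measured[i];
        double abs_err = diff < 0.0 ? -diff : diff;
        /* Skip the percentage for non-positive measurements. */
        double pct = measured[i] > 0.0 ? 100.0 * abs_err / measured[i] : 0.0;

        e.avg_abs += abs_err;
        e.avg_pct += pct;
        if (abs_err > e.max_abs)
            e.max_abs = abs_err;
        if (pct > e.max_pct)
            e.max_pct = pct;
    }
    e.avg_abs /= (double)n;
    e.avg_pct /= (double)n;
    return e;
}
```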
Summary and Future Work
- Collecting more data directly from disk array administration / diagnostic tools
- Analyzing more data: searching for patterns in disk array usage
Thank You