Answering the Database Scale Out Problem: SSDs in the Data Center
April 14, 2010
Dan Marriott, Director - Production Operations
Answers.com: The world’s leading Q&A site
Rank in top Web properties:
- #18 in the U.S. (02/2010) (1)
- #31 worldwide (02/2010) (1)
Unique monthly visitors:
- 50 million in the U.S. (02/2010) (1)
- 72 million worldwide (02/2010) (1)
(1) Source: comScore – Hybrid Measurement Methodology (U.S. only) beginning August 2009
ReferenceAnswers
WikiAnswers: Q&A the Wiki Way
Database layer MySQL b20-percona MySQL a
Challenges
- Keep the site fast while site traffic and stored data are ever-increasing
- Replication lag must be 0, or users get stale data
- Forever being forced to further optimize queries, constantly vying for dev resources to do this
- Controlling hardware growth (CapEx & OpEx $$$): regularly adding servers to handle growth
Handling high growth – database tier
- Separate reads and writes
- Add more read DB slaves
- Use Memcached where possible
- Optimize queries
- Partition large databases
Started hitting a wall: replication lag, even when servers were handling a modest number of queries/sec
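A minimal sketch of the "separate reads and writes" step combined with the replication-lag constraint, in Python. This is not the Answers.com implementation. The `Server` class, the `pick_server` function and the server names are hypothetical, and the lag value is assumed to come from the `Seconds_Behind_Master` field of MySQL's `SHOW SLAVE STATUS`, which is NULL when replication is not running.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    # Lag in seconds as reported by Seconds_Behind_Master;
    # None stands for NULL (replication not running).
    seconds_behind_master: Optional[int] = 0


def is_write(sql: str) -> bool:
    """Crude classification: anything that is not a SELECT is a write."""
    return not sql.lstrip().upper().startswith("SELECT")


def pick_server(sql: str, master: Server, replicas: List[Server],
                max_lag: int = 0) -> Server:
    """Send writes to the master and reads to a caught-up read slave."""
    if is_write(sql):
        return master
    fresh = [r for r in replicas
             if r.seconds_behind_master is not None
             and r.seconds_behind_master <= max_lag]
    # If every slave is lagging, fall back to the master
    # rather than serve stale data.
    return random.choice(fresh) if fresh else master


master = Server("db-master")
replicas = [Server("db-read-1", 0), Server("db-read-2", 45)]
read_target = pick_server("SELECT * FROM answers", master, replicas)
write_target = pick_server("UPDATE answers SET views = views + 1", master, replicas)
```

With these hypothetical values the read goes to `db-read-1`, the only slave with zero lag, and the write goes to `db-master`. The fallback shows why lag is the wall: every lagging slave pushes its share of reads onto the remaining servers.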
Typical DB read cluster
Fusion-io for HP Blade Servers
- March ’09: HP announces IO Accelerator card for blades (manufactured by Fusion-io)
- Sizes: 80 GB and 160 GB SLC; 320 GB MLC
- April ’09: received two cards and began testing
Easy to install
One-man job. Takes 60 seconds.
Performance Tests

Test                                          Blade Server – SAS HDDs   Blade Server – Fusion-io card   Improvement
Replication catch-up time (after restore)     > 6 hours                 12½ mins                        3,000%
Max queries/sec (Seconds_Behind_Master: 0)    350 Q/sec                 3,500 Q/sec                     900%
Application response time                     100 ms                    70 ms                           30%
Full DB server recovery                       > 8 hours                 55 mins                         800%

Additionally, CPU load dropped from 30% to 18% (even with Fusion-io driver overhead).
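The slide does not say how the improvement percentages were derived. The sketch below shows one plausible reading, assuming that throughput and the time-based rows are reported as a percentage increase in speed, and that response time is reported as a percentage reduction. The "> 6 hours" and "> 8 hours" entries are lower bounds, so 6 and 8 hours are used as minimum values. The function names are illustrative.

```python
def pct_faster(before_time: float, after_time: float) -> float:
    """Percentage increase in speed when a task's duration shrinks."""
    return (before_time / after_time - 1) * 100


def pct_increase(before: float, after: float) -> float:
    """Percentage increase of a rate such as queries/sec."""
    return (after - before) / before * 100


def pct_reduction(before: float, after: float) -> float:
    """Percentage by which a value dropped."""
    return (before - after) / before * 100


catch_up = pct_faster(6 * 60, 12.5)    # minutes; at least this much
max_qps = pct_increase(350, 3500)
response = pct_reduction(100, 70)      # milliseconds
recovery = pct_faster(8 * 60, 55)      # minutes; at least this much
cpu_load = pct_reduction(30, 18)       # percentage points of CPU load

for label, value in [("catch-up", catch_up), ("max Q/sec", max_qps),
                     ("response time", response), ("recovery", recovery),
                     ("CPU load", cpu_load)]:
    print(f"{label}: {value:.0f}%")
```

Under these assumptions the lower bounds come out at roughly 2,780% for catch-up and 773% for recovery, consistent with the rounded 3,000% and 800% on the slide given that the original times exceeded 6 and 8 hours. The 900% and 30% figures match directly, and the CPU load drop from 30% to 18% is a 40% relative reduction.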
Typical DB read cluster
Fusion-io Value Add for Answers.com
- Scalability: more than twice the performance capacity on 1/4 of the servers
- 100% ROI on day of purchase (repurpose the other 3/4)
- 75% reduction in operating costs:
  - Rack space, power and cooling
  - Server administration
  - Database administration
- 75% fewer failure points
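The consolidation claim can be checked with simple arithmetic. The sketch below assumes that read capacity scales linearly with the number of servers and that the measured per-server maximum (350 vs. 3,500 queries/sec) holds across the cluster. The 16-server cluster size is a hypothetical figure chosen for illustration, not one from the slides.

```python
HDD_QPS = 350         # max queries/sec per server on SAS HDDs (from the tests)
FUSION_QPS = 3500     # max queries/sec per server with the Fusion-io card

servers_before = 16                   # hypothetical cluster size
servers_after = servers_before // 4   # keep 1/4 of the servers

capacity_before = servers_before * HDD_QPS
capacity_after = servers_after * FUSION_QPS

capacity_ratio = capacity_after / capacity_before
server_reduction_pct = (1 - servers_after / servers_before) * 100

print(f"capacity: {capacity_before} -> {capacity_after} Q/sec "
      f"({capacity_ratio:.1f}x)")
print(f"servers: {servers_before} -> {servers_after} "
      f"({server_reduction_pct:.0f}% fewer)")
```

Under these assumptions a quarter of the servers delivers 2.5 times the original capacity, which agrees with "more than twice", and the 75% drop in server count lines up with the 75% figures for operating costs and failure points.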
Other SSD uses in the Data Center
- Varnish (Web caching layer)
- DB backup servers
- Log analysis
- Data warehouse
Thank you.