Published by Herbert McCormick. Modified over 9 years ago.
1
Scalable Data Management@facebook
2
Scale
3
▪ #2 site on the Internet (time on site)
▪ >200 billion monthly page views
▪ Over 1 million developers in 180 countries
▪ Over 300 million active users
▪ More than 2^32 photos …
▪ 100 million search queries per day
▪ >3.9 trillion feed actions processed per day
▪ 2 billion pieces of content per week
▪ 6 billion minutes per day
4
Growth Rate
[Chart: active users, 300M in 2009]
5
Social Networks
6
nikos | METIS | 2012
OSNs are popular!
▪ OSNs have become wildly popular over the last few years: FB > 800M, Twitter > 230M, etc.
▪ Distributed across the planet
▪ Changed how content is created and consumed: inherently long-tailed, as only 'friends' are interested
▪ Explosion of smartphones: photos/HD videos are easy to shoot and share
7
Scaling Social Networks
▪ Much harder than typical websites where...
  ▪ Typically 1-2% online: easy to cache the data
  ▪ Partitioning & scaling relatively easy
▪ What do you do when everything is interconnected?
8
[Figure: a grid of items, each labeled "name, status, privacy, profile photo" or "name, status, privacy, video thumbnail"]
9
System Architecture
10
Overall architecture: Facebook
▪ Facebook has 2 datacenters, 1 per coast
  ▪ Reads spread across both
  ▪ Writes only to W. Coast; periodically (~10 minutes) replicated to E. Coast
▪ >2000 MySQL servers, >25TB RAM for memcached
▪ Challenge: inconsistency due to stale data
  ▪ I change status message => friends on East Coast datacenter don't see the change for 10 min
  ▪ What if an E. Coast person changes their own status?
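The stale-read problem on the slide above can be sketched with a toy model. This is only an illustration under stated assumptions: the class `ReplicatedStore`, its method names, and the use of plain dictionaries in place of MySQL are all hypothetical, and the 600-second lag simply mirrors the "~10 minutes" figure from the slide.

```python
class ReplicatedStore:
    """Toy primary/replica pair with periodic (not immediate) replication."""

    def __init__(self, lag_seconds=600):
        self.lag_seconds = lag_seconds  # ~10 minutes, per the slide
        self.primary = {}   # stands in for the West Coast datacenter (all writes)
        self.replica = {}   # stands in for the East Coast datacenter (reads only)
        self.pending = []   # (write_time, key, value) not yet replicated

    def write(self, key, value, now):
        # Every write goes to the primary, wherever the user is located.
        self.primary[key] = value
        self.pending.append((now, key, value))

    def replicate(self, now):
        # Apply every pending write that is at least lag_seconds old.
        still_pending = []
        for written_at, key, value in self.pending:
            if now - written_at >= self.lag_seconds:
                self.replica[key] = value
            else:
                still_pending.append((written_at, key, value))
        self.pending = still_pending

    def read_west(self, key):
        return self.primary.get(key)

    def read_east(self, key, now):
        self.replicate(now)
        return self.replica.get(key)


store = ReplicatedStore()
store.write("status:alice", "at the beach", now=0)
fresh = store.read_west("status:alice")           # sees the new status
stale = store.read_east("status:alice", now=60)   # replica has not caught up
later = store.read_east("status:alice", now=600)  # visible after the lag
```

The same model shows why the last bullet is awkward: an East Coast user who writes and then immediately reads from the local replica would not see their own change until replication runs.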
11
Web at 100 feet: georeplication & CDNs
Source: "How Facebook Works", Technology Review, Jul/Aug 2008
12
Architecture
▪ Load Balancer (assigns a web server)
▪ Web Server (PHP assembles data)
▪ Memcache (fast, simple)
▪ Database (slow, persistent)
13
Memcache
▪ Simple in-memory hash table
▪ Supports get/set, delete, multiget, multiset
▪ Not a write-through cache
▪ Pros and Cons
  ▪ The Database Shield!
  ▪ Low latency, very high request rates
  ▪ Can be easy to corrupt, inefficient for very small items
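A minimal sketch of how a cache that is not write-through can still shield the database. The slide does not say how writes are handled; deleting the cached key on update (a common look-aside pattern) is an assumption here, and `LookAsideCache` with its dictionary-backed "database" is a hypothetical stand-in, not the memcache client API.

```python
class LookAsideCache:
    """Look-aside caching: the application, not the cache, talks to the database."""

    def __init__(self, database):
        self.database = database  # dict standing in for MySQL
        self.cache = {}           # dict standing in for memcache
        self.db_reads = 0         # counts how often the database is hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]        # served from memory, database untouched
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value       # populate the cache on a miss
        return value

    def update(self, key, value):
        self.database[key] = value
        # Assumption: invalidate rather than write through, so the next
        # read repopulates the cache from the database.
        self.cache.pop(key, None)


db = {"user:1": "Ann"}
shield = LookAsideCache(db)
shield.get("user:1")
shield.get("user:1")            # second read is a cache hit
shield.update("user:1", "Anne")
current = shield.get("user:1")  # miss, then refilled with the new value
```

Repeated reads of a hot key cost one database read per invalidation, which is the "database shield" effect named on the slide.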
14
Memcache Optimization
▪ Multithreading and efficient protocol code - 50k req/s
▪ Polling network drivers - 150k req/s
▪ Breaking up stats lock - 200k req/s
▪ Batching packet handling - 250k req/s
▪ Breaking up cache lock - future
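The idea behind "breaking up" a lock can be sketched as follows: instead of one global lock guarding a counter, the counter is split into shards, each with its own lock, so threads rarely contend with each other. This is only an illustration of the general technique; `ShardedCounter` is a hypothetical name and this is not memcached's actual C implementation.

```python
import threading


class ShardedCounter:
    """A statistics counter split into independently locked shards."""

    def __init__(self, num_shards=8):
        self.locks = [threading.Lock() for _ in range(num_shards)]
        self.counts = [0] * num_shards

    def increment(self):
        # Each thread maps to one shard, so different threads usually
        # take different locks instead of queueing on a single one.
        shard = threading.get_ident() % len(self.counts)
        with self.locks[shard]:
            self.counts[shard] += 1

    def total(self):
        # Reads are rarer than updates, so they pay the cost of
        # visiting every shard.
        result = 0
        for lock, index in zip(self.locks, range(len(self.counts))):
            with lock:
                result += self.counts[index]
        return result


def worker(counter, times):
    for _ in range(times):
        counter.increment()


requests_served = ShardedCounter()
threads = [threading.Thread(target=worker, args=(requests_served, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off is that updates become cheap while reading the total becomes more expensive, which suits counters that are written on every request but read only occasionally.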
15
Network Incast
[Diagram: PHP Client, Switch, Memcache; many small get requests]
16
Network Incast
[Diagram: Memcache, Switch, PHP Client; many big data packets]
17
Network Incast
[Diagram: Memcache, Switch, PHP Client]
18
Network Incast
[Diagram: Memcache, Switch, PHP Client]
19
Memcache Clustering
[Diagram: a PHP client fetching objects from three memcache servers holding 3, 4, and 3 objects]
▪ 3 round trips total
▪ 1 round trip per server
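The "1 round trip per server" point can be sketched as a client-side multiget that groups keys by the server that owns them and sends one batched request to each. The hashing scheme (CRC32 modulo the server count) and the names `server_for` and `multiget` are assumptions for illustration; the slide does not say how keys are assigned to servers.

```python
import zlib
from collections import defaultdict


def server_for(key, num_servers):
    # Assumed placement rule: a stable hash of the key picks the server.
    return zlib.crc32(key.encode("utf-8")) % num_servers


def multiget(keys, servers):
    """Fetch many keys with one batched request per server.

    `servers` is a list of dicts, each standing in for one memcache server.
    Returns the found values and the number of round trips used.
    """
    by_server = defaultdict(list)
    for key in keys:
        by_server[server_for(key, len(servers))].append(key)

    results = {}
    round_trips = 0
    for index, server_keys in by_server.items():
        round_trips += 1  # one batched request covers all keys on this server
        for key in server_keys:
            if key in servers[index]:
                results[key] = servers[index][key]
    return results, round_trips


servers = [{}, {}, {}]
keys = ["photo:%d" % i for i in range(10)]
for key in keys:
    servers[server_for(key, len(servers))][key] = "data for " + key

found, trips = multiget(keys, servers)
```

With three servers the number of round trips is bounded by three no matter how many keys are requested, whereas fetching each key individually would cost one round trip per key.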
20
MySQL has played a role from the beginning
[Diagram: repeated "Scribe" labels]
▪ Thousands of MySQL servers in two datacenters
21
Photos
22
Photos + Social Graph = Awesome!
23
Photos: Scale
▪ 20 billion photos x4 = 80 billion
▪ Would wrap around the world more than 10 times!
▪ Over 40M new photos per day
▪ 600K photos / second
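A quick back-of-the-envelope check of the figures above. The variable names are illustrative, the x4 factor is taken from the slide as a per-photo multiplier, and treating the 600K photos / second figure as a serving rate (rather than an upload rate) is an assumption based on how much larger it is than the daily upload count.

```python
SECONDS_PER_DAY = 24 * 60 * 60

photos = 20_000_000_000          # 20 billion photos
copies_per_photo = 4             # the slide's "x4" factor
new_photos_per_day = 40_000_000  # over 40M new photos per day
photos_per_second = 600_000      # assumed to be the serving rate

stored_images = photos * copies_per_photo
uploads_per_second = new_photos_per_day / SECONDS_PER_DAY
serve_to_upload_ratio = photos_per_second / uploads_per_second

print(f"stored images:      {stored_images:,}")
print(f"uploads per second: {uploads_per_second:.0f}")
print(f"serves per upload:  {serve_to_upload_ratio:.0f}")
```

Under these assumptions the workload is heavily read-dominated: uploads arrive at a few hundred per second while reads run three orders of magnitude higher.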
24
Photos Scaling - The easy wins
▪ Upload tier - handles uploads, scales images, stores on NFS
▪ Serving tier - images served from NFS via HTTP
▪ However...
  ▪ File systems are not good at supporting a large number of files
  ▪ Metadata is too large to fit in memory, causing too many I/Os for each file read
  ▪ Limited by I/O, not storage density
▪ Easy wins
  ▪ CDN
  ▪ Cachr (HTTP server + caching)
  ▪ NFS file handle cache
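One way to picture the file handle cache from the list above: keep recently used path-to-handle mappings in memory so that repeated reads of a popular photo skip the expensive metadata lookup. The slide gives no implementation details, so the LRU eviction policy, the `HandleCache` class, and the `lookup` callback are all assumptions made for this sketch.

```python
from collections import OrderedDict


class HandleCache:
    """LRU cache from file path to an (opaque) file handle."""

    def __init__(self, lookup, capacity):
        self.lookup = lookup          # expensive function: path -> handle
        self.capacity = capacity
        self.handles = OrderedDict()  # least recently used entry comes first
        self.lookups = 0              # how many expensive lookups were needed

    def get(self, path):
        if path in self.handles:
            self.handles.move_to_end(path)  # mark as most recently used
            return self.handles[path]
        self.lookups += 1
        handle = self.lookup(path)
        self.handles[path] = handle
        if len(self.handles) > self.capacity:
            self.handles.popitem(last=False)  # evict the least recently used
        return handle


def slow_lookup(path):
    # Stand-in for the metadata I/O that a real file system would perform.
    return "handle:" + path


handle_cache = HandleCache(slow_lookup, capacity=2)
for path in ["a.jpg", "b.jpg", "a.jpg", "c.jpg", "b.jpg"]:
    handle_cache.get(path)
```

A cache like this only helps the popular head of the distribution; the long tail of rarely viewed photos still pays the metadata cost on every read.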