Published by Samson Thompson. Modified over 9 years ago.
1
THE LITTLE ENGINE(S) THAT COULD: SCALING ONLINE SOCIAL NETWORKS
B99106017, Library and Information Science (3rd year), 謝宗昊
2
Outline Background SPAR Evaluation and Comparison Conclusion
3
Outline Background SPAR Evaluation and Comparison Conclusion
4
New challenge for system design
- Online Social Networks (OSNs) are hugely interconnected
- OSNs grow rapidly in a short period of time
  - Twitter grew by 1382% between Feb and Mar 2009
  - Such growth forces costly re-architecting of the service
- Conventional vertical scaling is not a good solution
- Horizontal scaling leads to interconnection issues
  - Performance bottleneck
5
Full Replication
6
Random Partition (DHT)
7
Random Partition (DHT) with replication of the neighbors
8
SPAR
9
Designer’s Dilemma
- Commit resources to developing features
  + Appealing features attract new users
  - Risk of “death by success”
- Ensure scalability first
  + Low risk of “death by success”
  - Hard to compete with more creative competitors
10
Outline Background SPAR Evaluation and Comparison Conclusion
11
SPAR: a Social Partitioning And Replication middleware for social applications.
12
What does SPAR do and not do?
DO
- Solves the Designer’s Dilemma for early-stage OSNs
- Avoids performance bottlenecks in established OSNs
- Minimizes the effect of provider lock-in
NOT DO
- Not designed for the distribution of content such as pictures and videos
- Not a solution for storage or for batch data analysis such as Hadoop
13
How does SPAR do it?
- Provides local semantics
- Handles node and edge dynamics with minimal overhead
- Serves more requests while reducing network traffic
14
Problem Statement
- Maintain local semantics
- Balance loads
- Be resilient to machine failures
- Be amenable to online operation
- Be stable
- Minimize replication overhead
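The first and last requirements can be made concrete with a small sketch. It assumes that "local semantics" means every neighbor of a node has a copy (master or slave) on the partition holding that node's master, and that replication overhead is measured as the average number of slave replicas per node. The function names and the toy graph are illustrative only and do not come from the slides.

```python
from collections import defaultdict

def required_replicas(edges, master):
    """Return {node: set of partitions that need a slave replica of node}.

    Assumed local-semantics rule: for every edge (u, v), the partition
    holding u's master must also hold a copy of v, and vice versa.
    """
    replicas = defaultdict(set)
    for u, v in edges:
        if master[u] != master[v]:
            replicas[v].add(master[u])  # slave of v next to u's master
            replicas[u].add(master[v])  # slave of u next to v's master
    return replicas

def replication_overhead(edges, master):
    """Average number of slave replicas per node (assumed definition)."""
    replicas = required_replicas(edges, master)
    return sum(len(parts) for parts in replicas.values()) / len(master)

# Toy graph: a triangle a-b-c plus a node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
master = {"a": 0, "b": 0, "c": 0, "d": 1}
print(replication_overhead(edges, master))  # 0.5
```

Only the edge c-d crosses partitions, so two slaves are needed (c on partition 1, d on partition 0) for four nodes.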
15
Why not graph/social partitioning?
- Not incremental
- Community detection is too sensitive
- Reducing inter-partition edges ≠ reducing the number of replicas
16
Description
- Node addition/removal
- Edge addition/removal
- Server addition/removal
17
Node addition/removal
- Node Addition
  - The new node's master goes to the partition with the fewest master replicas
- Node Removal
  - The master and all of its slaves are removed
  - Nodes that had an edge to the removed node are updated
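A minimal sketch of these two operations, assuming masters are tracked in a plain dictionary and slaves are derived from the edges rather than stored. `add_node`, `remove_node`, and the tie-breaking rule (lowest partition index wins) are hypothetical choices for illustration.

```python
def add_node(node, master, num_partitions):
    """Place a new node's master on the partition with the fewest masters."""
    counts = [0] * num_partitions
    for p in master.values():
        counts[p] += 1
    # Ties are broken by lowest partition index (an assumption).
    target = min(range(num_partitions), key=lambda p: counts[p])
    master[node] = target
    return target

def remove_node(node, master, adj):
    """Drop a node's master and edges.

    Slaves are implied by the edges in this sketch, so deleting the edges
    also removes the need for the node's slaves. Returns the former
    neighbors, whose replica sets must be refreshed.
    """
    neighbors = adj.pop(node, set())
    for n in neighbors:
        adj[n].discard(node)
    del master[node]
    return neighbors

master = {"a": 0, "b": 0, "c": 1}
print(add_node("d", master, num_partitions=3))  # 2
```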
18
Edge addition/removal
- Edge Addition: three possible configurations
  - No movement of masters
  - The master of u moves to the partition containing the master of v
  - The master of v moves to the partition containing the master of u
- Edge Removal
  - Remove the replica of u in the partition holding the master of v if no other node there requires it
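The three configurations for edge addition can be compared by counting how many slave replicas each would require and keeping the cheapest. The sketch below makes several simplifying assumptions: it ignores load balancing and redundancy constraints, recounts the whole graph instead of updating incrementally, and prefers "no movement" on ties. All names are illustrative.

```python
def total_replicas(adj, master):
    """Count slaves needed so every master has all its neighbors locally."""
    needed = set()
    for u, neighbors in adj.items():
        for v in neighbors:
            if master[v] != master[u]:
                needed.add((v, master[u]))  # slave of v on u's partition
    return len(needed)

def add_edge(u, v, adj, master):
    """Add edge (u, v), then apply the cheapest of the three configurations."""
    adj[u].add(v)
    adj[v].add(u)
    candidates = [
        dict(master),              # 1. no movement of masters
        {**master, u: master[v]},  # 2. master of u moves to v's partition
        {**master, v: master[u]},  # 3. master of v moves to u's partition
    ]
    # min() keeps the first candidate on ties, so "no movement" is preferred.
    best = min(candidates, key=lambda m: total_replicas(adj, m))
    master.update(best)
    return total_replicas(adj, master)

# a, b, c live on partition 0; d is alone on partition 1.
adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
master = {"a": 0, "b": 0, "c": 0, "d": 1}
print(add_edge("c", "d", adj, master), master["d"])  # 0 0
```

Here moving d next to c removes the need for any replica, whereas leaving the masters in place or moving c would each need two.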
19
Edge addition
20
Server addition/removal
- Server Addition: two options
  - Force an immediate redistribution of masters for load balance
  - Let masters be redistributed gradually through normal node/edge operations
- Server Removal
  - Highly connected nodes choose their new server first
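The first option (immediate redistribution) might look roughly like the following. This is a deliberately naive sketch: it takes masters from whichever existing server is currently most loaded and does not consider how many replicas each move creates, which a real policy would need to do. `add_server` and its selection rule are assumptions, not the method described in the slides.

```python
def add_server(master, num_servers):
    """Add one server and move masters onto it until it holds an equal share.

    Simplification: donors are chosen purely by load, not by replica cost.
    """
    new_server = num_servers
    by_server = {s: [] for s in range(num_servers + 1)}
    for node in sorted(master):
        by_server[master[node]].append(node)
    target = len(master) // (num_servers + 1)
    while len(by_server[new_server]) < target:
        # Take one master from the most loaded existing server.
        donor = max(range(num_servers), key=lambda s: len(by_server[s]))
        node = by_server[donor].pop()
        by_server[new_server].append(node)
        master[node] = new_server
    return new_server

master = {f"n{i}": i % 2 for i in range(6)}  # 3 masters on each of 2 servers
add_server(master, num_servers=2)
print([list(master.values()).count(s) for s in range(3)])  # [2, 2, 2]
```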
21
Implementation
- SPAR is a middleware (MW) between the datacenter and the application
- SPAR includes 4 components:
  - Directory Service (DS)
  - Local Directory Service (LDS)
  - Partition Manager (PM)
  - Replication Manager (RM)
22
Implementation
23
Outline Background SPAR Evaluation and Comparison Conclusion
24
Evaluation methodology
- Metrics
  - Replication overhead
  - K-redundancy requirement
- Datasets
  - Twitter: 12M tweets generated by 2.4M users
  - Facebook: 60,290 nodes and 1,545,686 edges
  - Orkut: 3M nodes and 223M edges
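The two metrics interact: replicas created for local semantics also count toward redundancy, and extra slaves are only added where a node falls short. The sketch below assumes that K counts slave replicas per node and that extra slaves go to the least loaded eligible partition. Both are assumptions made for illustration, as is the name `enforce_k_redundancy`.

```python
def enforce_k_redundancy(master, slaves, num_partitions, k):
    """Add slaves until every node has at least k; return the overhead.

    Overhead is taken to be the average number of slaves per node.
    """
    # Current number of copies (masters and slaves) on each partition.
    load = [0] * num_partitions
    for p in master.values():
        load[p] += 1
    for parts in slaves.values():
        for p in parts:
            load[p] += 1
    for node in sorted(master):
        have = slaves.setdefault(node, set())
        while len(have) < k:
            free = [p for p in range(num_partitions)
                    if p != master[node] and p not in have]
            if not free:
                raise ValueError("not enough partitions for k-redundancy")
            p = min(free, key=lambda q: load[q])  # least loaded partition
            have.add(p)
            load[p] += 1
    return sum(len(s) for s in slaves.values()) / len(master)

master = {"a": 0, "b": 0, "c": 1}
slaves = {"a": {1}}  # slave already required by local semantics
print(enforce_k_redundancy(master, slaves, num_partitions=3, k=2))  # 2.0
```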
25
Evaluation methodology
- Algorithms for comparison
  - Random Partitioning
    - The solution used by Facebook and Twitter
  - Graph Partitioning (METIS)
    - Minimizes inter-partition edges
  - Modularity Optimization (MO+) algorithm
    - Community detection
26
Evaluation of replication overhead
27
SPAR Versus Random for K=2
28
Dynamic operations and SPAR
30
Adding/Removing Server
Adding a server has two policies:
1. Wait for new arrivals to fill up the server
2. Re-distribute existing masters from the other servers onto the new server

Policy                     Replication overhead
Wait for new arrivals      2.78
Re-distribute              2.82
Started with 32 servers    2.74
31
Adding/Removing Server
- Removing a server
  - Average number of movements: 485k
  - Overhead increases from 2.74 to 2.87
  - Overhead can be reduced to 2.77 at the cost of an additional 180k transmissions
  - Painful, but scaling down is not common
32
SPAR IN THE WILD
- Testbed: 16 low-end commodity servers
  - Pentium Duo CPU, 2.33 GHz
  - 2 GB of RAM
  - Single hard drive
- Evaluation with Cassandra
- Evaluation with MySQL
33
Evaluation with Cassandra
34
Evaluation with MySQL

99th percentile of the response time, without insertion of updates
  Full Replication: 152 ms at 16 req/s
  SPAR:             150 ms at 2,500 req/s

95th percentile of the response time, with insertion of updates
  Full Replication: N/A (too poor)
  SPAR:             260 ms at 50 read req/s; 380 ms at 200 read req/s
35
Outline Background SPAR Evaluation and Comparison Conclusion
36
- Preserving local semantics has many benefits
- SPAR achieves it with low replication overhead
- SPAR deals gracefully with the dynamics experienced by an OSN
- Evaluation with an RDBMS (MySQL) and a key-value store (Cassandra) shows that SPAR offers significant gains in throughput (req/s) while reducing network traffic
- To sum up, SPAR would be a good solution for scaling OSNs
37
Reference
Arman Idani: http://www.cl.cam.ac.uk/~ey204/teaching/ACS/R202_2011_2012/presentation/S6/Arman_SPAR.pdf
38
Q&A time
39
Thanks for Listening