Slide 1: Brahms: Byzantine-Resilient Random Membership Sampling
Bortnikov, Gurevich, Keidar, Kliot, and Shraer. April 2008.
Slide 2: Edward (Eddie) Bortnikov, Maxim (Max) Gurevich, Idit Keidar, Gabriel (Gabi) Kliot, and Alexander (Alex) Shraer.
Slide 3: Why Random Node Sampling
- Gossip partners: random choices are what make gossip protocols work.
- Unstructured overlay networks (e.g., among super-peers): random links provide robustness and expansion.
- Gathering statistics: probe random nodes.
- Choosing cache locations.
Slide 4: The Setting
- Many nodes, n: 10,000s, 100,000s, 1,000,000s, ...
- Nodes come and go (churn).
- Every joining node knows some others (connectivity).
- Full network, like the Internet.
- Byzantine failures.
Slide 5: Adversary Attacks
- Faulty nodes (a portion f of the ids) attack other nodes.
- They may want to bias samples: isolate or DoS nodes, promote themselves, bias statistics.
Slide 6: Previous Work
- Benign gossip membership: small (logarithmic) views, robust to churn and benign failures.
  - Empirical studies [Lpbcast, Scamp, Cyclon, PSS] and an analytical study [Allavena et al.].
  - Never proven to give uniform samples; spatial correlation among neighbors' views [PSS].
- Byzantine-resilient gossip: full views [MMR, MS, Fireflies, Drum, BAR]; small views with some resilience [SPSS].
  - We are not aware of any analytical work.
Slide 7: Our Contributions
1. Gossip-based attack-tolerant membership:
  - tolerates a linear portion f of failures with O(n^(1/3))-size partial views;
  - correct nodes remain connected;
  - mathematically analyzed, validated in simulations.
2. Random sampling:
  - a novel memory-efficient approach that converges to proven independent uniform samples;
  - the view is not all bad: better than benign gossip.
Slide 8: Brahms
- Sampling: the local component (a Sampler that outputs the sample).
- Gossip: the distributed component (maintains the view, which feeds the Sampler).
Slide 9: Sampler Building Block
- Input: a data stream, one element at a time, possibly biased (some values appear more often than others); used here with the stream of gossiped ids.
- Output: a uniform random sample of the unique elements seen so far, converging one element at a time, independent of other Samplers.
Slide 10: Sampler Implementation
- Memory: stores one element at a time.
- Uses a random hash function h from a min-wise independent family [Broder et al.]: for each set X and each x ∈ X, Pr[h(x) = min h(X)] = 1/|X|.
- init: choose a random hash function.
- next: keep the id with the smallest hash seen so far (see the sketch below).
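A minimal Python sketch of this building block. The paper only requires a min-wise independent hash family; the salted SHA-1 here is an illustrative stand-in for it, and everything else is plumbing:

```python
import hashlib
import os

class Sampler:
    """Keep the id whose (randomly keyed) hash is smallest so far.

    With h drawn from a min-wise independent family, every unique id in
    the stream is equally likely to attain the minimal hash, so the kept
    id is a uniform sample of the unique ids seen, however biased the
    stream is.
    """

    def __init__(self):
        self.init()

    def init(self):
        # Choose a fresh "random hash function": here, a random salt
        # for SHA-1 (an illustrative stand-in for a min-wise
        # independent family [Broder et al.]).
        self._salt = os.urandom(16)
        self._min_hash = None
        self._elem = None

    def next(self, node_id: bytes) -> None:
        # Feed one stream element; keep it if its hash is the new minimum.
        h = hashlib.sha1(self._salt + node_id).digest()
        if self._min_hash is None or h < self._min_hash:
            self._min_hash, self._elem = h, node_id

    def sample(self):
        # Current sample: uniform over the unique ids seen since init().
        return self._elem
```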
Slide 11: Component S: Sampling and Validation
- S is an array of Samplers, all fed by the id stream coming from gossip.
- A validator continuously checks each sample using pings; a Sampler whose sample fails validation is re-initialized (sketched below).
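A sketch of how S might wire the Samplers together, building on the Sampler sketch above. The ping predicate is caller-supplied and assumed (e.g., a UDP ping), not part of the paper's interface:

```python
class SamplingComponent:
    """An array of independent Samplers plus a validator (a sketch)."""

    def __init__(self, num_samplers: int, is_alive):
        self.samplers = [Sampler() for _ in range(num_samplers)]
        self.is_alive = is_alive  # assumed ping predicate: id -> bool

    def update(self, id_stream):
        # Feed every gossiped id to every Sampler; each converges
        # independently to its own uniform sample.
        for node_id in id_stream:
            for s in self.samplers:
                s.next(node_id)

    def validate(self):
        # Ping each current sample; a Sampler holding an unresponsive
        # id is re-initialized so it can converge again.
        for s in self.samplers:
            if s.sample() is not None and not self.is_alive(s.sample()):
                s.init()

    def samples(self):
        return [s.sample() for s in self.samplers]
```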
Slide 12: Gossip Process
- Provides the stream of ids for S.
- Needs to ensure connectivity.
- Uses a bag of tricks to overcome attacks.
Slide 13: Gossip-Based Membership Primer
- Small (sub-linear) local view V; V constantly changes, which is essential due to churn.
- Typically evolves in (unsynchronized) rounds.
- Push: send my id to some node in V; reinforces underrepresented nodes.
- Pull: retrieve the view of some node in V; spreads knowledge within the network.
- [Allavena et al. '05]: both are essential for a low probability of partitions and star topologies.
Slide 14: Brahms Gossip Rounds
Each round:
- send pushes and pulls to random nodes from V;
- wait to receive pulls and pushes;
- update S with all received ids;
- (sometimes) re-compute V. This last step is tricky: beware of adversary attacks (a round sketch follows).
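A hypothetical round loop in Python. The node object, its fanout, and its send/receive helpers are assumed plumbing rather than the paper's API; update_view is sketched after Trick 4 below:

```python
import random

def gossip_round(node):
    """One Brahms round for `node` (a sketch under assumed plumbing)."""
    # Send pushes and pull requests to nodes drawn at random from V.
    for target in random.choices(node.view, k=node.fanout):
        node.send_push(target, node.my_id)
        node.send_pull_request(target)

    pushed = node.receive_pushes()        # ids pushed to us this round
    pulled = node.receive_pull_replies()  # views returned for our pulls

    # Every received id feeds the sampling component S.
    node.S.update(pushed + pulled)

    # (Sometimes) re-compute V; see the view-update sketch further below.
    node.view = update_view(node.view, pushed, pulled, node.S.samples())
```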
Slide 15: Problem 1: Push Drowning
[Diagram: a handful of correct pushes (Alice, Bob, Carol, Dana, Ed) is drowned out by a flood of faulty pushes (Mallory, M&M, Malfoy).]
Slide 16: Trick 1: Rate-Limit Pushes
- Use limited messages (e.g., computational puzzles or virtual currency) to bound the number of faulty pushes system-wide.
- Faulty nodes can then send only a portion p of all pushes.
- Views won't be all bad.
Slide 17: Problem 2: Quick Isolation
[Diagram: the adversary concentrates its pushes (Mallory, M&M, Malfoy) on a single node until her view is all faulty: "Ha! She's out! Now let's move on to the next guy!"]
Slide 18: Trick 2: Detection & Recovery
- Do not re-compute V in rounds when too many pushes are received ("Hey! I'm swamped! I'd better ignore all of 'em pushes...").
- This slows down isolation; it does not prevent it.
Slide 19: Problem 3: Pull Deterioration
[Diagram: a pull-only example in which the fraction of faulty ids in correct nodes' views grows from 50% to 75% in a single round.]
Slide 20: Trick 3: Balance Pulls & Pushes
- Control the contribution of push (α|V| ids) versus pull (β|V| ids) with parameters α, β.
- Pull-only: eventually all ids in views are faulty.
- Push-only: quick isolation of an attacked node.
- Push ensures that, system-wide, not all ids in views are bad; pull slows down (but does not prevent) isolation.
Slide 21: Trick 4: History Samples
- The attacker influences both push and pull.
- Feed back γ|V| random ids from S, with parameters α + β + γ = 1.
- The attacker loses control: samples are eventually perfectly uniform (view update sketched below).
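A sketch of the view re-computation combining Tricks 2-4. The parameter values follow the simulations later in the deck; the exact swamping threshold is an assumption, since the slides only say "too many pushes":

```python
import random

ALPHA, BETA, GAMMA = 0.45, 0.45, 0.1  # alpha + beta + gamma = 1

def update_view(view, pushed, pulled, history):
    """Re-compute V from alpha|V| pushed ids, beta|V| pulled ids, and
    gamma|V| history samples (a sketch, not the paper's exact rule)."""
    size = len(view)
    history = [h for h in history if h is not None]  # skip empty samplers
    # Trick 2: if we received suspiciously many pushes this round,
    # or nothing at all, skip the update entirely (detection & recovery).
    if not pushed or not pulled or len(pushed) > ALPHA * size:
        return view
    def take(ids, frac):
        return random.sample(ids, min(len(ids), int(frac * size)))
    return take(pushed, ALPHA) + take(pulled, BETA) + take(history, GAMMA)
```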
Slide 22: View and Sample Maintenance
[Diagram: α|V| pushed ids, β|V| pulled ids, and γ|V| ids from the sample S are mixed into the view V; all pushed and pulled ids also feed S.]
Slide 23: Key Property
- Samples take time to help; assume the attack starts when the samples are empty.
- With appropriate parameters, time to isolation > time to convergence:
  - a lower bound on isolation time is proven using Tricks 1-3 (not using samples yet);
  - an upper bound is proven on the time until some good sample persists forever.
- Self-healing from partitions.
Slide 24: History Samples: Rationale
- Judicious use is essential: bootstrap, avoid slow convergence, deal with churn.
- With a little bit of history samples (10%) we can cope with any adversary: amplification!
Slide 25: Analysis
1. Sampling: mathematical analysis.
2. Connectivity: analysis and simulation.
3. Full-system simulation.
Slide 26: Connectivity ⇒ Sampling
Theorem: if the overlay remains connected indefinitely, then samples are eventually uniform.
Slide 27: Sampling ⇒ Connectivity Ever After
- A perfect sample of a sampler with hash h: the id with the lowest h(id) system-wide.
- If correct, it sticks once the sampler sees it.
- A correct perfect sample ⇒ self-healing from partitions ever after.
- We analyze PSP(t), the probability of a perfect sample at time t.
Slide 28: Convergence to the 1st Perfect Sample
[Plot: n = 1000, f = 0.2, 40% unique ids in the stream.]
Slide 29: Scalability
The analysis says: for scalability, we want a convergence time that is small and constant, independent of the system size n.
Slide 30: Connectivity Analysis 1: Balanced Attacks
- Attack all nodes the same: this maximizes the number of faulty ids in views system-wide in any single round.
- If repeated, the system converges to a fixed-point ratio of faulty ids in views, which is < 1 if
  - γ = 0 (no history) and p < 1/3, or
  - history samples are used, for any p.
- There are always good ids in the views!
Slide 31: Fixed Point Analysis: Push
[Diagram: a push from a correct node is lost when sent to a faulty entry in its view.]
- x(t): the portion of faulty ids in correct nodes' views at round t.
- Faulty nodes contribute a portion p of all pushes, while a correct node's push reaches a correct node only with probability 1 − x(t); hence the portion of faulty pushes among those received by correct nodes is p / (p + (1 − p)(1 − x(t))).
Slide 32: Fixed Point Analysis: Pull
[Diagram: at time t, node 1 pulls the view of node i.]
- A pull lands on a faulty node with probability x(t) (and then returns only faulty ids); a pull from a correct node returns a faulty id with probability x(t).
- Combining push (weight α), pull (weight β), and history samples (weight γ, faulty fraction f):
  E[x(t+1)] = α · p / (p + (1 − p)(1 − x(t))) + β · (x(t) + (1 − x(t)) · x(t)) + γ · f
  (iterated numerically below).
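The recurrence can be iterated numerically to approximate the fixed point; a short sketch, using the parameter values from slide 34:

```python
def next_x(x, p, alpha, beta, gamma, f):
    """One application of the balanced-attack recurrence above."""
    push = p / (p + (1 - p) * (1 - x))  # faulty fraction among received pushes
    pull = x + (1 - x) * x              # faulty fraction among pulled ids
    return alpha * push + beta * pull + gamma * f

def fixed_point(p, alpha, beta, gamma, f, x=0.0, rounds=1000):
    # Repeated application converges to the fixed point (cf. slide 43).
    for _ in range(rounds):
        x = next_x(x, p, alpha, beta, gamma, f)
    return x

# Slide 34's setting: p = 0.2, alpha = beta = 0.5, gamma = 0.
print(fixed_point(p=0.2, alpha=0.5, beta=0.5, gamma=0.0, f=0.2))
```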
Slide 33: Faulty Ids in the Fixed Point
- With a few history samples, any portion of bad nodes can be tolerated.
- The fixed points and the convergence are fully validated in simulation.
- History samples are assumed perfectly uniform in the analysis; the simulations use the real history.
Slide 34: Convergence to the Fixed Point
[Plot: n = 1000, p = 0.2, α = β = 0.5, γ = 0.]
Slide 35: Connectivity Analysis 2: Targeted Attack – Roadmap
- Step 1: analysis without history samples; isolation in logarithmic time, but not too fast, thanks to Tricks 1-3.
- Step 2: analysis of history-sample convergence; time-to-perfect-sample < time-to-isolation.
- Step 3: putting it all together; empirical evaluation shows that no isolation happens.
Slide 36: Targeted Attack – Step 1
- Q: how fast (lower bound) can an attacker isolate one node from the rest?
- Worst-case assumptions:
  - no use of history samples (γ = 0);
  - an unrealistically strong adversary that observes the exact number of correct pushes and complements it to α|V|;
  - the attacked node is not represented in any view initially;
  - a balanced attack on the rest of the system.
Slide 37: Isolation Without History Samples
[Plot: n = 1000, p = 0.2, α = β = 0.5, γ = 0; the curves depend on α, β, p; isolation time marked for |V| = 60.]
Slide 38: Step 2: Sample Convergence
[Plot: a perfect sample is obtained in 2-3 rounds; n = 1000, p = 0.2, α = β = 0.5, γ = 0, 40% unique ids. Empirically verified.]
Slide 39: Step 3: Putting It All Together: No Isolation with History Samples
[Plot: works well despite a small PSP; n = 1000, p = 0.2, α = β = 0.45, γ = 0.1.]
Slide 40: Sample Convergence (Balanced)
[Plot: p = 0.2, α = β = 0.45, γ = 0.1; convergence roughly twice as fast.]
Slide 41: Summary
- O(n^(1/3))-size views.
- Resist attacks and failures by a linear portion of the nodes.
- Converge to proven uniform samples.
- Precise analysis of the impact of attacks.
Slide 42: Balanced Attack Analysis (1)
- Assume (roughly) equal initial node degrees.
- x(t) = the portion of faulty ids in correct nodes' views at time t.
- Compute E[x(t+1)] as a function of x(t), p, α, β, γ.
- Result #1 (short-term optimality): any non-balanced attack schedule yields a smaller faulty-id ratio in a single round.
Slide 43: Balanced Attack Analysis (2)
- Result #2 (existence of a fixed point x̂): E[x(t+1)] = x(t) = x̂.
  - We analyze x̂ as a function of p, α, β, γ, with conditions for its uniqueness.
  - For α = β = 0.5 and p < 1/3, there exists x̂ < 1: the view is not entirely poisoned, so history samples are not essential here.
- Result #3 (convergence to the fixed point): from any initial portion < 1 of faulty ids, by [Hillam 1975] (sequence convergence).