1
An Analysis of Parallel Mixing with Attacker-Controlled Inputs
Nikita Borisov (formerly of UC Berkeley)
2
Definitions
- "Parallel Mixing": a latency optimization for synchronous re-encryption mixnets [Golle & Juels 2004]
- "Attacker-Controlled Inputs": inputs to a mixnet that can be linked to their corresponding outputs, either because they are directly controlled by attackers or because they are discovered through other means
- "Analysis": anonymity is low if most inputs are known; if few inputs are known, the anonymity loss can be amplified by repeated mixings
3
Synchronous Re-encryption Mixes
- Messages are mixed by all mix servers
- Each server re-encrypts every message under the same decryption key
[Figure: messages M1..M4 enter Mix 1 and exit as M'1..M'4, then enter Mix 2 and exit as M''1..M''4]
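Since the deck assumes familiarity with re-encryption, here is a minimal sketch of ElGamal re-encryption under a fixed key, the primitive these mixes rely on. The toy modulus and generator are my choices for illustration, not parameters from the paper:

```python
# Minimal sketch of ElGamal re-encryption under a fixed key (toy modulus
# and generator, chosen for illustration only; not secure parameters).
import random

p = 2**127 - 1                     # a Mersenne prime, far too small for real use
g = 3
x = random.randrange(2, p - 1)     # decryption (secret) key
h = pow(g, x, p)                   # public key

def encrypt(m):
    r = random.randrange(2, p - 1)
    return (pow(g, r, p), m * pow(h, r, p) % p)

def reencrypt(c):
    # Re-randomize (c1, c2) with fresh randomness; needs neither m nor x
    r = random.randrange(2, p - 1)
    c1, c2 = c
    return (c1 * pow(g, r, p) % p, c2 * pow(h, r, p) % p)

def decrypt(c):
    c1, c2 = c
    return c2 * pow(c1, p - 1 - x, p) % p   # c2 / c1^x via Fermat's little theorem

c = encrypt(42)
c_mixed = reencrypt(c)             # looks unrelated to c without the key
assert decrypt(c) == decrypt(c_mixed) == 42
```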
4
Parallel Mixing
[Figure: inputs M1, M2 and M3, M4 are split into two batches handled by Mix 1 and Mix 2 in parallel; rotation rounds pass each batch between the mixes, then a distribution round splits each batch across both mixes]
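A small sketch of the batch flow may help. Plain shuffles stand in for re-encrypt-and-shuffle, and the round structure below (T rotations, one distribution, T more rotations) is my reading of the figure and of the next slide:

```python
# Sketch of parallel mixing's batch flow: shuffles stand in for
# re-encrypt-and-shuffle, "rotation" hands each batch to the next server,
# and "distribution" splits every batch evenly across all servers.
import random

def shuffle_all(batches):
    for b in batches:
        random.shuffle(b)

def rotate(batches):
    return batches[-1:] + batches[:-1]        # batch of server i moves to i+1

def distribute(batches):
    m = len(batches)
    k = len(batches[0]) // m                  # N/M^2 messages per destination
    return [sum((batches[j][i * k:(i + 1) * k] for j in range(m)), [])
            for i in range(m)]

def parallel_mix(messages, m, t):
    n = len(messages)
    batches = [messages[i * n // m:(i + 1) * n // m] for i in range(m)]
    for _ in range(t):                        # T rotation rounds
        shuffle_all(batches)
        batches = rotate(batches)
    shuffle_all(batches)                      # 1 distribution round
    batches = distribute(batches)
    for _ in range(t):                        # T more rotation rounds
        shuffle_all(batches)
        batches = rotate(batches)
    shuffle_all(batches)                      # 2(T+1) shuffle passes in total
    return [msg for b in batches for msg in b]

print(parallel_mix(list(range(8)), m=2, t=1))
```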
5
Properties
- An initial public permutation assigns inputs to batches
- T rotation rounds, followed by 1 distribution round, followed by T more rotation rounds
- Defends against up to T dishonest mixes
- Latency is 2(T+1)·N/M re-encryptions, where N is the number of messages and M the number of mix servers
- Even with T = M-1, this is faster than a conventional cascade, which needs N·M re-encryptions (for M > 2); see the check below
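The comparison is simple arithmetic; a quick numeric check of the slide's two formulas (N = 1008 and M = 4 are arbitrary example values, not from the paper):

```python
# Quick check of the slide's latency formulas.
def parallel_latency(n, m, t):
    return 2 * (t + 1) * n // m    # re-encryptions on the critical path

def cascade_latency(n, m):
    return n * m

n, m = 1008, 4
t = m - 1                          # tolerate up to M-1 dishonest mixes
print(parallel_latency(n, m, t))   # 2016
print(cascade_latency(n, m))       # 4032: parallel mixing wins for M > 2
```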
6
Attacker-Controlled Inputs
[Figure: the same two-mix network, now with known inputs 1 and 2 (one per batch) traced to their known positions among the outputs]
7
Overview
- Introduction
- Analysis methods
- Analysis results
- Multiple-round analysis
- Open problems
- Conclusions
8
Theorem 1: Definitions
- α(j) = number of known inputs in batch j (here α(1) = 1)
- β(j') = number of known outputs in batch j' (here β(1) = 1)
- γ(j,j') = number of known inputs in batch j matching outputs in batch j' (here γ(1,1) = 0)
[Figure: the two-mix example, with known input 1 in input batch 1 and known output 1 in output batch 1]
9
Theorem 1
[Figure: the same two-mix example]
Pr[s1 → s1] = (1 − 0) / ((2 − 1)(2 − 1)) = 1
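The transcript preserves only this worked example. A plausible general form, reconstructed from the example and from the fact that each input batch sends exactly N/M² messages to each output batch (my inference, not quoted from the paper): for an unknown input s in batch j and an unknown output s' in batch j',

```latex
\Pr[s \to s'] \,=\, \frac{N/M^{2} - \gamma(j,j')}{\bigl(N/M - \alpha(j)\bigr)\,\bigl(N/M - \beta(j')\bigr)}
```

With N = 4, M = 2, α(1) = 1, β(1) = 1, γ(1,1) = 0 this gives (1 − 0)/((2 − 1)(2 − 1)) = 1, matching the slide, and the probabilities sum to 1 over all unknown outputs.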
10
Anonymity Metrics
- Anon [Golle and Juels '04]
- Entropy [SD'02, DSCP'02]
- Either metric can be computed using Theorem 1 (see the sketch below)
- Requires knowing α(j), β(j'), and γ(j,j') for each j, j'
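As a concrete illustration, a sketch of the entropy metric for a single unknown input, using the reconstructed Theorem 1 probabilities above; this is my illustration, not the paper's code:

```python
# Entropy of one unknown input's output distribution, from the reconstructed
# Theorem 1 probabilities (alpha, beta, gamma as defined on slide 8).
import math

def link_prob(n, m, alpha_j, beta_j2, gamma_jj2):
    # Pr[a given unknown input in batch j -> a given unknown output in batch j']
    return (n / m**2 - gamma_jj2) / ((n / m - alpha_j) * (n / m - beta_j2))

def entropy_for_input(n, m, alpha_j, beta, gamma_row):
    h = 0.0
    for j2 in range(m):
        p = link_prob(n, m, alpha_j, beta[j2], gamma_row[j2])
        unknown_outputs = int(n / m - beta[j2])
        if p > 0:
            h -= unknown_outputs * p * math.log2(p)
    return h

# The 4-message example above: alpha(1)=1, beta=(1,1), gamma(1,.)=(0,1)
print(entropy_for_input(4, 2, 1, [1, 1], [0, 1]))   # 0.0 bits: s1 fully linked
```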
11
Scenarios
Given a scenario:
- the number of known inputs
- the distribution of known inputs among input batches
- the distribution of known outputs among output batches
we can compute α(j), β(j'), and γ(j,j'), and from them the anonymity metrics.
What's a typical scenario? We want the distribution of the anonymity metrics.
12
Combinatorial Enumeration
Given the number of known inputs, enumerate all scenarios:
- all initial permutations
- all mix shuffle choices
Compute α(j), β(j'), and γ(j,j') for each possibility.
Improvements:
- partition states into equivalence classes
- count the classes by combinatorial enumeration (see the sketch below)
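Assuming a scenario is one initial permutation of all N inputs plus an independent shuffle of each batch by each of the M servers in each of the 2(T+1) rounds (my inference about the factorization), the raw count is N! · ((N/M)!)^(2M(T+1)), which reproduces the number on the next slide:

```python
# Size of the raw scenario space under the assumed factorization:
# one permutation of all N inputs, then one shuffle of each batch
# (size N/M) per server per round, over 2(T+1) rounds.
from math import factorial

def scenario_count(n, m, t):
    return factorial(n) * factorial(n // m) ** (2 * m * (t + 1))

# 3 mixes, 18 inputs, T = M-1 = 2: matches the ~1.7e67 count on the next slide
print(scenario_count(18, 3, t=2))
```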
13
3 Mixes, 18 Inputs
17311151454831150294756284149883771654705774592000000000000000000000 (≈ 1.7 × 10^67) possible scenarios
14
Sampling
- Full enumeration is still impractical for large systems
- Instead, we use sampling:
  - given a number of known inputs, simulate a random scenario
  - compute α(j), β(j'), γ(j,j') and the anonymity metrics
  - repeat
- This yields a sampled distribution of the metrics
- Sampling misses the tail of the distribution, but we don't care about the tail here (a sketch of one sampling step follows)
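A sketch of one sampling step, reusing parallel_mix from the earlier sketch; again an illustrative reconstruction, not the paper's simulator:

```python
# Simulate one random scenario and tabulate alpha, beta, and gamma
# (names as on slide 8). Requires parallel_mix from the sketch above.
import random

def sample_scenario(n, m, t, k):
    known = set(random.sample(range(n), k))   # which inputs the attacker knows
    inputs = list(range(n))
    random.shuffle(inputs)                    # public initial permutation
    outputs = parallel_mix(inputs, m, t)
    b = n // m
    in_batch = [set(inputs[j * b:(j + 1) * b]) for j in range(m)]
    out_batch = [outputs[j * b:(j + 1) * b] for j in range(m)]
    alpha = [len(in_batch[j] & known) for j in range(m)]
    beta = [sum(msg in known for msg in out_batch[j]) for j in range(m)]
    gamma = [[sum(msg in known and msg in in_batch[j] for msg in out_batch[j2])
              for j2 in range(m)] for j in range(m)]
    return alpha, beta, gamma

# The "1008 inputs, 900 unknown" setting from the next slides
print(sample_scenario(1008, 4, 3, 108))
```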
15
[Graph: sampled distribution of the anonymity metrics for 1008 inputs, 900 unknown]
16
[Graph: sampled distribution of the anonymity metrics for 1008 inputs, 100 unknown]
17
Multiple-Round Analysis
- Anonymity may be short of optimal, but with Anon > 10, who cares?
- Consider repeated mixing of the same inputs:
  - unlikely to happen with e-voting
  - likely if parallel mixing is used for TCP forwarding
- Each mixing is a new, random observation and reveals new information each time
- Over time, the input-output correspondence is identified with high probability (see the sketch below)
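One way to see the accumulation effect is a Bayesian sketch: treat each round's Theorem-1-style link probabilities as independent likelihoods and multiply them. The per-round model below is hypothetical (near-uniform, with the true pair mildly favored), purely to illustrate how quickly repeated observations concentrate the posterior:

```python
# Bayesian accumulation across rounds: multiply each round's link
# probabilities and renormalize per input. Hypothetical round model.
import numpy as np

def accumulate(rounds):
    posterior = np.ones_like(rounds[0])
    for p in rounds:
        posterior *= p                                     # independent rounds
        posterior /= posterior.sum(axis=1, keepdims=True)  # renormalize per input
    return posterior

rng = np.random.default_rng(0)
n = 16
true = rng.permutation(n)                 # the hidden input->output matching

def one_round():
    p = np.full((n, n), 1.0 / n)
    p[np.arange(n), true] *= 2.0          # each round mildly favors the truth
    return p / p.sum(axis=1, keepdims=True)

post = accumulate([one_round() for _ in range(10)])
print((post.argmax(axis=1) == true).mean())   # 1.0: every input linked
```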
18
[Graph: repeated mixing with 500 unknown inputs. Note: all mixes here are honest!]
19
Open Problem: Classification Attacks
- We may not be able to link individual inputs and outputs, but we can link classes of them:
  - e.g. voting: all votes for one party
  - e.g. intersection attack: all previously unseen streams
  - e.g. dummy inputs
- Attackers still gain some information, but Theorem 1 no longer applies: γ(j,j') cannot be computed
20
Open Problem: Extra Rounds
- Add another distribution round and T more rotation rounds (a 50% latency increase)
- This can be shown to generate all permutations, but not with equal probabilities
- Small-scale analysis shows anonymity much closer to optimal
- Large-scale analysis has so far been intractable
- Once again, Theorem 1 cannot be used
21
Conclusions
- Parallel mixing reveals information when attackers control some inputs
- This is a big problem if most inputs are controlled
- When fewer inputs are known, repeated mixings may still be a problem
- These problems exist even if all mixes are honest
- Statistical approximations should be checked by simulations