Published by Terence Dalton. Modified over 9 years ago.
1
GraphSC: Parallel Secure Computation Made Easy
Kartik Nayak
With Xiao Shaun Wang, Stratis Ioannidis, Udi Weinsberg, Nina Taft, Elaine Shi
2
Data Mining on User Data
Users send data to a data mining engine, which produces a model.
Privacy concern!
3
Computing on Private Data
Companies hold a graph representing social connections and a graph representing professional connections.
Compute a user’s influence in both circles.
4
Companies want to run machine learning algorithms.
Users/companies do NOT want to reveal data.
Can we enable this in practice?
5
Cryptography to the rescue: Secure Multiparty Computation
Ensures that we learn only the outcome
6
Key Challenges
1. Generic Solutions
A lot of work has gone into improving individual algorithms.
Departure from the one-at-a-time approach.
7
Key Challenges
2. Convert Program to Run on Secure Computation (cost of obliviousness)
8
Key Challenges
3. Parallelizability
There’s a lot of data: maintain the benefits of parallelism available in the insecure setting.
With cryptography, computation is expensive.
9
Key Contributions
10
Generic Framework for “Graph-parallel” Algorithms
Challenge: Generic Solutions
Pregel by [logo]
Examples: PageRank, Matrix Factorization using gradient descent, Matrix Factorization using ALS, Risk Minimization using ADMM, and many more
11
Key Contributions
Efficiently Convert Graph-parallel Programs to Oblivious Programs
Challenge: Convert program to run on Secure Computation
Total work blowup is O(log |V|)
Blowup for naïve solution: O(|V|) for sparse graphs
12
Key Contributions
Maintain Parallelizability
Challenge: Parallelizability
Depth of the computation is O(log |V|)
Matrix Factorization, 4K ratings, 32 threads: 1.4 hours [NIWJTB’13] vs. < 4 mins
13
Key Contributions
1. Generic Framework for Graph-parallel Algorithms
2. Efficiently Convert to Oblivious Programs
3. Maintain Parallelizability
14
Programmer’s favorite model:
function bs(val, s, t)
  if (s == t) return s;
  mid = (s + t) / 2;
  if (val < mem[mid]) return bs(val, s, mid)
  else return bs(val, mid+1, t)
Cryptographer’s favorite model: [Figure: circuit]
15
Programmer’s model: Programs
Intermediate step: Oblivious Programs
Cryptographer’s model: Circuits
Intuitively, program traces should not depend on input data
16
Programmer’s favorite model:
function bs(val, s, t)
  if (s == t) return s;
  mid = (s + t) / 2;
  if (val < mem[mid]) return bs(val, s, mid)
  else return bs(val, mid+1, t)
Cryptographer’s favorite model: [Figure: circuit]
17
Programmer’s model: Programs
Intermediate step: Oblivious Programs
Cryptographer’s model: Circuits
Intuitively, program traces should not depend on input data
Oblivious programs to circuits: easy. Programs to oblivious programs: hard.
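A minimal Python sketch of what "traces should not depend on input data" can mean in practice, assuming a sorted, non-empty list `mem` as in the binary-search slide. The function names and the linear-scan replacement are illustrative assumptions, not the transformation GraphSC itself performs. Both functions return the index of the first element greater than `val` (or the last index if there is none), but only the second touches the same memory locations for every input.

```python
def binary_search_trace(mem, val):
    """Non-oblivious: which indices are read depends on val."""
    trace = []
    s, t = 0, len(mem) - 1
    while s < t:
        mid = (s + t) // 2
        trace.append(mid)          # data-dependent memory access
        if val < mem[mid]:
            t = mid
        else:
            s = mid + 1
    return s, trace


def oblivious_search_trace(mem, val):
    """Oblivious stand-in: a fixed linear scan over indices 0..n-2.

    The result is the number of scanned elements that are <= val,
    which equals the index returned by binary_search_trace on sorted input.
    """
    trace = []
    result = 0
    for i in range(len(mem) - 1):
        trace.append(i)                 # same access pattern for every val
        result += int(mem[i] <= val)    # arithmetic update, no secret-dependent branch
    return result, trace


mem = [1, 3, 5, 7]
print(binary_search_trace(mem, 0)[1], binary_search_trace(mem, 10)[1])        # traces differ
print(oblivious_search_trace(mem, 0)[1], oblivious_search_trace(mem, 10)[1])  # traces match
```

The linear scan costs O(n) instead of O(log n), which is one concrete instance of the "cost of obliviousness" mentioned earlier in the talk.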
18
Achieving Parallelism
Goal: Low-Depth Circuits
Oblivious Parallel RAM [BCP’14]: polylogarithmic blowup, not practical
GraphSC: O(log |V|) blowup
19
“Graph-parallel” algorithms [LGKB’10, GLGBG’12, MABDHLC’10, ZCF’10]
Pregel by [logo]
20
Graph-parallel Algorithms
Scatter: Send data to edges
Gather: Aggregate data from edges
Apply: Perform some computation
[Figure: example graph on vertices A, B, C, D with numeric vertex and edge data]
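The three operations can be sketched in plain (insecure) Python as follows. This is only an illustration of the programming model: the function names, the dictionary/list representation, and the example graph and values are assumptions, not the GraphSC API or the figure's data.

```python
def scatter(vertex_data, edges):
    """Scatter: every edge (u, v) receives a copy of its source vertex's data."""
    return [(u, v, vertex_data[u]) for (u, v) in edges]


def gather(vertex_data, edge_data, combine, identity):
    """Gather: every vertex aggregates the data on its incoming edges."""
    acc = {v: identity for v in vertex_data}
    for (_u, v, d) in edge_data:
        acc[v] = combine(acc[v], d)
    return acc


def apply_step(vertex_data, gathered, update):
    """Apply: every vertex performs a local computation on its own data."""
    return {v: update(vertex_data[v], gathered[v]) for v in vertex_data}


# Hypothetical example: each vertex adds the sum of its in-neighbours' values.
vertex_data = {"A": 1, "B": 2, "C": 4, "D": 5}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

edge_data = scatter(vertex_data, edges)
gathered = gather(vertex_data, edge_data, combine=lambda a, b: a + b, identity=0)
new_data = apply_step(vertex_data, gathered, update=lambda old, g: old + g)
print(new_data)  # {'A': 1, 'B': 3, 'C': 7, 'D': 9}
```

An iterative algorithm such as PageRank would repeat these three steps, changing only `combine` and `update`.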
21
Obliviousness of Graph-parallel Algorithms
Do not reveal edge/vertex data
Do not reveal structure of the graph
Naïve Solution: O(|V|^2)
Our Solution: O(|E| log |V|)
[Figure: example graph on vertices A, B, C, D with numeric data]
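To get a feel for the gap between the two bounds, the short calculation below plugs in a hypothetical sparse graph. The graph size, the average degree of 8, and the choice to ignore constant factors are all assumptions made for illustration; they are not measurements from the talk.

```python
import math


def naive_cost(num_vertices):
    """Naive oblivious solution: on the order of |V|^2 operations."""
    return num_vertices ** 2


def sorted_cost(num_vertices, num_edges):
    """Sort-based solution: on the order of |E| log |V| operations."""
    return num_edges * math.log2(num_vertices)


V = 1 << 20   # hypothetical: about one million vertices
E = 8 * V     # hypothetical: sparse graph, average degree 8

ratio = naive_cost(V) / sorted_cost(V, E)
print(f"naive / sort-based = {ratio:.1f}")  # naive / sort-based = 6553.6
```

For dense graphs, where |E| approaches |V|^2, the advantage disappears, which is why the comparison is stated for sparse graphs.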
22
Oblivious Gather: Key Trick
[Figure: elements labeled 3, 4, 1, 2]
23
Oblivious Gather: Key Trick
Oblivious sort with key (v, isVertex), followed by a single pass
Sort: O(|E| log |V|)
Single pass: O(|E|)
Oblivious Gather: O(|E| log |V|)
Gather in clear: O(|E|)
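The sort-then-scan idea can be sketched as below, assuming sum as the aggregation and one record per vertex and per edge. Two caveats: Python's built-in `sort` is used only as a stand-in for an oblivious sorting network, and the record layout and function name are hypothetical. The single pass itself uses arithmetic selection so that the same operations run on every record, whether it is a vertex or an edge.

```python
def oblivious_gather_sketch(vertices, edges):
    """vertices: iterable of vertex ids; edges: list of (src, dst, data).

    Returns a dict mapping each vertex to the sum of data on its incoming edges.
    """
    # One record per vertex and per edge: [key, is_vertex, data].
    records = [[v, 1, 0] for v in vertices]
    records += [[dst, 0, d] for (_src, dst, d) in edges]

    # ASSUMPTION: stand-in for an oblivious sort on (v, isVertex).
    # Edges into v (is_vertex = 0) end up directly before vertex v (is_vertex = 1).
    records.sort(key=lambda r: (r[0], r[1]))

    # Single pass with a running aggregate; no branch on record type.
    agg = 0
    for r in records:
        _key, is_vertex, data = r
        r[2] = is_vertex * agg + (1 - is_vertex) * data   # vertex receives the aggregate
        agg = (1 - is_vertex) * (agg + data)              # reset after each vertex

    # Reading the result out in the clear, for inspection only.
    return {r[0]: r[2] for r in records if r[1] == 1}


edges = [("A", "B", 1), ("A", "C", 1), ("B", "C", 2), ("C", "D", 4)]
print(oblivious_gather_sketch(["A", "B", "C", "D"], edges))
# {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

The pass visits all |V| + |E| records in a fixed order, so the cost is dominated by the sort, matching the O(|E| log |V|) bound on the slide.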
24
Complexity of Our Algorithms
                                      Scatter / Gather    Apply
Sequential Insecure (Total Work)      O(|E|)              O(|V|)
Parallel Oblivious (Total Work)       O(|E| log |V|)      O(|E|)
Parallel Oblivious (Parallel Time)    O(log |V|)          O(1)
Naïve Oblivious (Total Work)          O(|V|^2)            O(|E|)
25
Algorithms on GraphSC
Histogram computation
PageRank
Matrix Factorization using gradient descent
Matrix Factorization using alternating least squares
Bellman-Ford shortest path
Bipartite matching
Parallel empirical risk minimization through alternating direction method of multipliers (ADMM)
26
Experimental Setup
Cloud 1 (Garblers), Cloud 2 (Evaluators)
Two scenarios:
1. LAN
2. Across data centers (WAN)
27
Key Evaluation Results
Algorithm                                   Input Size    Parallel Time (32 processors)
Histogram                                   1K – 0.5M     4 sec – 34 min
PageRank (1 iteration)                      4K – 128K     20 sec – 15.5 min
Matrix Factorization using GD (1 iter.)     1K – 32K      47 sec – 34 min
Matrix Factorization using ALS (1 iter.)    64 – 4K       2 min – 2.35 hours
28
Running at Scale
Matrix Factorization using gradient descent: 1M ratings, 6K users, 4K movies [KBV’09]
7-machine cluster, 128 processors, 525 GB RAM
Time taken: ~13 hours (1 iteration)
[NIWJTB’13]: max 16K ratings (64x smaller data)
4K ratings, 32 threads: 1.4 hours [NIWJTB’13] vs. < 4 mins
We used only 7 machines! 13 hours -> a few mins by using more machines
29
Across Data Centers
PageRank
Garblers: Oregon
Evaluators: N. Virginia
Bandwidth provisioned: 2 Gbps
Time reduces linearly with increasing processors
30
Conclusion
GraphSC is a parallel secure computation framework for graph-parallel algorithms
www.oblivm.com
kartik@cs.umd.edu
Thank You!