Optimistic Concurrency Control for Distributed Learning. Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael I. Jordan.

1 Optimistic Concurrency Control for Distributed Learning. Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael I. Jordan

2 Data Model Parameters Machine Learning Algorithm

3 Data Model Parameters Distributed Machine Learning

4 Data Model Parameters Distributed Machine Learning. Concurrency: more machines = less time. Correctness: serial equivalence.

5 Data Model Parameters Coordination-free

6 Data Model Parameters Mutual Exclusion

7 Data Model Parameters Mutual Exclusion

8 Correctness vs. Concurrency. Coordination-free: high concurrency, low correctness. Mutual exclusion: low concurrency, high correctness. Optimistic Concurrency Control: ?

9 Data Model Parameters Optimistic Concurrency Control. Optimistic updates. Validation: detect conflict. Resolution: fix conflict. Concurrency and correctness. Hsiang-Tsung Kung and John T. Robinson. On optimistic methods for concurrency control. ACM Transactions on Database Systems (TODS), 6(2):213–226, 1981.
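
The three steps on this slide (optimistic update, validation, resolution) can be illustrated with a minimal, generic sketch in Python. It is not code from the talk: `VersionedCell`, `optimistic_update`, and `run_demo` are hypothetical names, and it assumes the simplest resolution strategy, which is to retry against the fresh value.

```python
import threading


class VersionedCell:
    """A shared value plus a version counter (hypothetical helper)."""

    def __init__(self, value):
        self._lock = threading.Lock()  # guards only the short read/commit steps
        self.value = value
        self.version = 0

    def read(self):
        with self._lock:
            return self.value, self.version

    def try_commit(self, new_value, seen_version):
        """Validation: commit only if nobody committed since we read."""
        with self._lock:
            if self.version != seen_version:
                return False  # conflict detected
            self.value = new_value
            self.version += 1
            return True


def optimistic_update(cell, fn):
    """Apply fn optimistically; return how many retries were needed."""
    retries = 0
    while True:
        value, version = cell.read()   # optimistic read
        new_value = fn(value)          # work is done without holding the lock
        if cell.try_commit(new_value, version):
            return retries
        retries += 1                   # resolution (assumed here): redo the work


def run_demo(n_threads=4, n_increments=1000):
    """Several threads increment one cell; no update may be lost."""
    cell = VersionedCell(0)

    def worker():
        for _ in range(n_increments):
            optimistic_update(cell, lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cell.value
```

When conflicts are rare, almost every call commits on the first attempt, so the cost stays close to the coordination-free case while the final value still matches a serial execution.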

10 Optimistic Concurrency Control Application: Clustering. Natural domain for parallelization. K-means: popular algorithm, but its fixed number of clusters is not a good fit for Big Data. Big Data solution: DP-means + OCC.

11 Example

12 Example: K-means Bad!

13 Example: DP-means. Correct clusters, but sequential! Brian Kulis and Michael I. Jordan. Revisiting k-means: New algorithms via Bayesian nonparametrics. In Proceedings of the 29th International Conference on Machine Learning, 2012.
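
To make the "sequential" point concrete, here is a rough sketch of serial DP-means in plain Python. It follows the usual description of the algorithm (open a new cluster whenever a point's squared distance to its nearest center exceeds a threshold), but the function names, the tuple-based point representation, and the parameter name `lam` are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def dp_means(points, lam, max_iters=100):
    """Serial DP-means sketch: returns (centers, assignments)."""
    centers = [mean(points)]          # start with one cluster at the global mean
    assignments = [0] * len(points)
    for _ in range(max_iters):
        changed = False
        # Each point sees the centers created by the points before it,
        # which is what makes this pass inherently sequential.
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > lam:        # too far from every center: open a new one
                centers.append(p)
                k = len(centers) - 1
            if assignments[i] != k:
                assignments[i] = k
                changed = True
        # Recompute centers as cluster means, dropping empty clusters.
        members = {}
        for i, k in enumerate(assignments):
            members.setdefault(k, []).append(points[i])
        kept = sorted(members)
        relabel = {old: new for new, old in enumerate(kept)}
        centers = [mean(members[old]) for old in kept]
        assignments = [relabel[k] for k in assignments]
        if not changed:
            break
    return centers, assignments
```

For example, `dp_means([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], lam=4)` finds two clusters without being told the number of clusters in advance.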

14 OCC DP-means Validation Resolution
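
One epoch of the validation and resolution steps might look like the sketch below. This is an illustrative reading of the slide, not the authors' implementation: workers are simulated one after another rather than run in parallel, the recomputation of cluster means is omitted, and all names (`worker_pass`, `validate`, `occ_dp_means_epoch`, `lam`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def worker_pass(block, centers, lam):
    """Optimistic phase: assign points to known centers, propose the rest."""
    assigned, proposals = [], []
    for p in block:
        if centers:
            k = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            if sq_dist(p, centers[k]) <= lam:
                assigned.append((p, k))
                continue
        proposals.append(p)           # optimistically propose p as a new center
    return assigned, proposals


def validate(proposals, centers, lam):
    """Serial validation: check each proposal against centers accepted this epoch."""
    first_new = len(centers)
    resolved = []
    for p in proposals:
        new_ids = range(first_new, len(centers))
        k = min(new_ids, key=lambda j: sq_dist(p, centers[j]), default=None)
        if k is not None and sq_dist(p, centers[k]) <= lam:
            # Conflict: another proposal already covers p.
            # Resolution: reject the proposal and reuse the accepted center.
            resolved.append((p, k))
        else:
            centers.append(p)         # no conflict: accept the new center
            resolved.append((p, len(centers) - 1))
    return resolved


def occ_dp_means_epoch(blocks, centers, lam):
    """Run one epoch over per-worker blocks; mutates centers, returns assignments."""
    snapshot = list(centers)          # every worker sees the same centers
    results = [worker_pass(b, snapshot, lam) for b in blocks]  # parallel in a real system
    assignments = [pair for assigned, _ in results for pair in assigned]
    proposals = [p for _, props in results for p in props]
    assignments.extend(validate(proposals, centers, lam))
    return assignments
```

Only the proposed centers pass through the serial validation step, so if most points fall near an existing center the serial work stays small relative to the parallel pass.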

15 Evaluation: Amazon EC2. OCC DP-means runtime vs. projected linear scaling: 2x #machines ≈ ½x runtime. ~140 million data points; 1, 2, 4, 8 machines.

16 Optimistic Concurrency Control. High concurrency: conflicts rare, validation easy, resolution cheap. OCCified algorithms: online facility location; BP-means (feature modeling). Ongoing: stochastic gradient descent; collapsed Gibbs sampling.

17 What can OCC do for you? See us @ poster session! xinghao@eecs.berkeley.edu. Optimistic Concurrency Control. Big Learning @ NIPS 2013, http://biglearn.org. Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, and Michael I. Jordan. Optimistic concurrency control for distributed unsupervised learning. ArXiv e-prints arXiv:1307.8049, 2013.

