Optimistic Concurrency Control for Distributed Learning. Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael I. Jordan.


Machine Learning (diagram: Data → Algorithm → Model Parameters)

Distributed Machine Learning (diagram: data and model parameters across machines)

Distributed Machine Learning (diagram: conflicting updates to shared model parameters)
Concurrency: more machines = less time
Correctness: serial equivalence

Coordination-free (diagram: data and model parameters)

Mutual Exclusion (diagram: data and model parameters)


Correctness vs. concurrency:
Coordination-free: high concurrency, low correctness
Mutual exclusion: low concurrency, high correctness
Optimistic Concurrency Control: ?

Optimistic Concurrency Control:
Optimistic updates
Validation: detect conflicts
Resolution: fix conflicts
Achieves both concurrency and correctness.
Hsiang-Tsung Kung and John T. Robinson. On optimistic methods for concurrency control. ACM Transactions on Database Systems (TODS), 6(2):213–226, 1981.
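The three OCC phases above can be sketched in a few lines. This is a minimal single-process illustration, not the paper's system: the store, the per-key version counters, and the names `Store` and `transaction` are all illustrative assumptions.

```python
# Minimal sketch of the OCC pattern: optimistic read, validation,
# retry-based resolution. All names here are illustrative.

class Store:
    def __init__(self):
        self.data = {}
        self.version = {}  # per-key version counters for validation

    def read(self, key):
        return self.data.get(key), self.version.get(key, 0)

def transaction(store, key, update):
    while True:
        # 1. Optimistic phase: read and compute without taking locks.
        value, seen_version = store.read(key)
        new_value = update(value)
        # 2. Validation: has anyone written this key since we read it?
        if store.version.get(key, 0) == seen_version:
            # 3. Commit (atomic here because we are single-threaded).
            store.data[key] = new_value
            store.version[key] = seen_version + 1
            return new_value
        # Conflict detected: resolve by retrying against the fresh value.

store = Store()
transaction(store, "count", lambda v: (v or 0) + 1)
transaction(store, "count", lambda v: (v or 0) + 1)
print(store.data["count"])  # → 2
```

When conflicts are rare, the validation check almost always passes and no work is wasted, which is exactly the regime the slides argue machine learning algorithms often occupy.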

Application: Clustering
Clustering is a natural domain for parallelization. K-means is a popular algorithm, but its fixed number of clusters makes it a poor fit for Big Data. Big Data solution: DP-means + OCC.

Example

Example: K-means produces a bad clustering!

Example: DP-means finds the correct clusters, but the algorithm is sequential!
Brian Kulis and Michael I. Jordan. Revisiting k-means: New algorithms via Bayesian nonparametrics. In Proceedings of the 29th International Conference on Machine Learning, 2012.
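To make the sequential bottleneck concrete, here is a hedged sketch of serial DP-means on 2-D points. The threshold `lam` (λ, compared against squared distance) and the helper names are assumptions for illustration; the point is that opening a new cluster depends on every cluster opened so far, which forces a serial scan.

```python
# Sketch of serial DP-means (Kulis & Jordan, 2012), pure Python, 2-D
# points. `lam` is the cluster-penalty threshold compared against
# squared distance; helper names are illustrative.

def sqdist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def dp_means(points, lam, n_iter=10):
    centers = [list(points[0])]            # start with a single cluster
    assign = []
    for _ in range(n_iter):
        assign = []
        for x in points:                   # inherently sequential loop
            d2 = [sqdist(x, c) for c in centers]
            j = min(range(len(centers)), key=lambda k: d2[k])
            if d2[j] > lam:                # farther than λ from every center:
                centers.append(list(x))    # open a new cluster
                j = len(centers) - 1
            assign.append(j)
        for j in range(len(centers)):      # recompute centers as means
            members = [x for x, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign

X = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
centers, assign = dp_means(X, lam=1.0)
print(len(centers), assign)  # → 2 [0, 0, 1, 1]
```

Unlike k-means, the number of clusters is not fixed in advance: with λ between the intra- and inter-cluster scales, two well-separated blobs yield exactly two centers.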

OCC DP-means: validation and resolution of optimistically created clusters.
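A hedged sketch of one OCC epoch for DP-means, under assumptions of my own (the `worker`/`validate` split and all names are illustrative, not the paper's implementation): each "machine" checks its points optimistically against a snapshot of the centers and proposes new clusters, and a cheap serial validation pass accepts a proposal only if no previously accepted center already covers it.

```python
# Illustrative sketch (not the paper's code) of one OCC epoch for
# DP-means: optimistic proposals per machine, then serial validation
# and resolution of the rare conflicts.

def sqdist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def worker(points, snapshot, lam):
    # Optimistic phase: assume no other machine opens a nearby cluster.
    return [x for x in points
            if min(sqdist(x, c) for c in snapshot) > lam]

def validate(proposals, centers, lam):
    accepted = []
    for x in proposals:
        if min(sqdist(x, c) for c in centers + accepted) > lam:
            accepted.append(list(x))   # validation passed: keep new cluster
        # else: conflict; resolve by assigning x to the existing center
    return centers + accepted

centers = [[0.0, 0.0]]
batches = [[(5.0, 5.0)], [(5.1, 5.0)]]   # two machines, one point each
proposals = [p for b in batches for p in worker(b, centers, lam=1.0)]
centers = validate(proposals, centers, lam=1.0)
print(len(centers))  # → 2
```

Here both machines optimistically propose a new cluster near (5, 5); validation detects the conflict and keeps only one, reproducing what the serial algorithm would have done.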

Evaluation on Amazon EC2 (plot: OCC DP-means runtime vs. projected linear scaling): 2x #machines ≈ ½x runtime. ~140 million data points; 1, 2, 4, 8 machines.

Optimistic Concurrency Control pays off under high concurrency: conflicts are rare, validation is easy, and resolution is cheap.
OCCified algorithms: online facility location; BP-means (feature modeling).
Ongoing: stochastic gradient descent; collapsed Gibbs sampling.

What can OCC do for you? See the poster session!
Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, and Michael I. Jordan. Optimistic concurrency control for distributed unsupervised learning. ArXiv e-prints, 2013.