Petuum: An Iterative-Convergent Distributed Machine Learning Framework
Su Yuxin, Jan 20, 2014
Outline: Introduction, Implementation, Questions, Demo
Introduction to Petuum
Bulk Synchronous Parallel
Asynchronous: parameters are read / updated at any time
Stale Synchronous Parallel
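To make the contrast with BSP and fully asynchronous execution concrete, here is a minimal sketch of the SSP read rule, written as my own illustration rather than Petuum code: a worker at clock c may serve a read from its local cache only if the cached copy already reflects all updates up to clock c - staleness (BSP is the special case staleness = 0).

    #include <cstdint>

    // Illustration of the SSP read rule (not Petuum code): a cached row is
    // fresh enough if it already contains every update made before
    // worker_clock - staleness; otherwise the worker must refresh it.
    bool CachedRowIsFreshEnough(int32_t cached_row_clock,
                                int32_t worker_clock,
                                int32_t staleness) {
      return cached_row_clock >= worker_clock - staleness;
    }

    int main() {
      // staleness = 0 degenerates to BSP; an unbounded staleness behaves
      // like fully asynchronous reads.
      const int32_t worker_clock = 10, staleness = 3;
      return CachedRowIsFreshEnough(/*cached_row_clock=*/8,
                                    worker_clock, staleness) ? 0 : 1;
    }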
Convergence
Programming interface: read(table, row, col), inc(table, row, col, value), iteration()
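As a usage sketch built only on the three calls listed above: a hypothetical single-worker SGD loop over a 4x4 parameter table. The in-process stand-ins for read/inc/iteration, the table id, and the placeholder gradient are all my assumptions, not part of the real client library.

    #include <cstdio>
    #include <unordered_map>

    // Toy in-process stand-ins for the three calls on the slide; a real
    // Petuum program would link against the client library instead.
    static std::unordered_map<long long, double> g_table;  // (row, col) -> value
    static long long Key(int row, int col) { return 1000LL * row + col; }

    double read(int table, int row, int col) { (void)table; return g_table[Key(row, col)]; }
    void inc(int table, int row, int col, double value) { (void)table; g_table[Key(row, col)] += value; }
    void iteration() { /* end of one clock: buffered updates become visible to other workers */ }

    int main() {
      const int kTable = 0, kRows = 4, kCols = 4, kIters = 10;
      const double kLearningRate = 0.1;
      for (int it = 0; it < kIters; ++it) {
        for (int r = 0; r < kRows; ++r) {
          for (int c = 0; c < kCols; ++c) {
            double w = read(kTable, r, c);
            double grad = w - 1.0;                   // placeholder gradient
            inc(kTable, r, c, -kLearningRate * grad);
          }
        }
        iteration();                                 // advance this worker's clock
      }
      printf("w[0][0] = %f\n", read(kTable, 0, 0));
      return 0;
    }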
Implementation
Overview: Logical View
Overview: Physical View
Main Components
Table
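As shared context for the next slides, here is a simplified and assumed layout for a client-side cached table (row id mapped to its values plus the clock of that copy); the real Petuum table classes carry more state (locks, operation logs, and so on).

    #include <cstdint>
    #include <unordered_map>
    #include <vector>

    // Simplified, assumed layout of a client-side cached table (not the
    // actual Petuum classes).
    struct CachedRow {
      std::vector<double> values;   // one value per column
      int32_t clock = -1;           // server clock this copy reflects
    };

    struct ClientTable {
      int32_t table_id = 0;
      int32_t staleness = 0;                        // SSP bound for this table
      std::unordered_map<int32_t, CachedRow> rows;  // row id -> cached copy
    };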
ConsistencyController::DoGet()
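A hedged reconstruction of what the client-side read path might do under SSP, reusing the CachedRow / ClientTable types sketched above; FetchRowFromServer is a placeholder for an RPC that blocks until the server can return a copy at or beyond the required clock. This is my illustration of the rule, not the Petuum source.

    // Placeholder for the blocking RPC; a real client would contact the
    // server and wait for a reply whose clock is >= min_clock.
    CachedRow FetchRowFromServer(int32_t table_id, int32_t row_id, int32_t min_clock) {
      (void)table_id; (void)row_id;
      CachedRow reply;
      reply.values.assign(16, 0.0);   // stand-in payload
      reply.clock = min_clock;
      return reply;
    }

    double DoGet(ClientTable& table, int32_t worker_clock,
                 int32_t row_id, int32_t col_id) {
      const int32_t min_clock = worker_clock - table.staleness;
      auto it = table.rows.find(row_id);
      if (it == table.rows.end() || it->second.clock < min_clock) {
        // Cache miss, or the cached copy is too stale for the SSP bound:
        // fetch a fresher copy from the server and wait for it.
        table.rows[row_id] = FetchRowFromServer(table.table_id, row_id, min_clock);
        it = table.rows.find(row_id);
      }
      return it->second.values[col_id];
    }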
ConsistencyController::iterate()
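A matching reconstruction of the end-of-iteration path, again my own sketch rather than the Petuum source: flush the inc() operations buffered since the last clock, advance the worker's clock, and tell the server. OpLogEntry and the two Send helpers are assumed placeholders.

    #include <cstdint>
    #include <vector>

    struct OpLogEntry { int32_t table, row, col; double delta; };

    // Placeholders: real code would serialize these and ship them over the network.
    void SendOpLogToServer(const std::vector<OpLogEntry>& oplog) { (void)oplog; }
    void SendClockToServer(int32_t new_clock) { (void)new_clock; }

    struct WorkerState {
      int32_t clock = 0;
      std::vector<OpLogEntry> pending_ops;  // inc() calls buffered since the last clock
    };

    void Iterate(WorkerState& w) {
      SendOpLogToServer(w.pending_ops);     // push buffered updates
      w.pending_ops.clear();
      ++w.clock;                            // advance this worker's clock
      SendClockToServer(w.clock);           // let the server track worker progress
    }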
Server::GetRow()
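On the server side, a GetRow request whose required clock the server has not yet reached plausibly has to be held back until enough clock messages arrive. The single-process sketch below illustrates that idea with a condition variable; it is not the actual Petuum server code.

    #include <condition_variable>
    #include <cstdint>
    #include <mutex>
    #include <unordered_map>
    #include <vector>

    // Illustration only: GetRow(row_id, min_clock) replies once the slowest
    // worker has clocked past min_clock, so the returned row is fresh enough
    // for the requester's SSP bound.
    class ToyServer {
     public:
      std::vector<double> GetRow(int32_t row_id, int32_t min_clock) {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [&] { return min_worker_clock_ >= min_clock; });
        return rows_[row_id];
      }

      void ApplyInc(int32_t row_id, int32_t col, double delta) {
        std::lock_guard<std::mutex> lock(mu_);
        auto& row = rows_[row_id];
        if (row.size() <= static_cast<size_t>(col)) row.resize(col + 1, 0.0);
        row[col] += delta;
      }

      void Clock(int32_t min_worker_clock) {
        std::lock_guard<std::mutex> lock(mu_);
        min_worker_clock_ = min_worker_clock;
        cv_.notify_all();  // wake readers waiting for this clock
      }

     private:
      std::mutex mu_;
      std::condition_variable cv_;
      int32_t min_worker_clock_ = 0;
      std::unordered_map<int32_t, std::vector<double>> rows_;
    };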
Least-Recently-Used (LRU) Strategy
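Since the client cannot keep every row resident, old rows are evicted from the cache. The block below is a generic LRU cache of rows illustrating the policy named on the slide, not Petuum's actual cache implementation.

    #include <cstdint>
    #include <list>
    #include <unordered_map>
    #include <vector>

    // Generic LRU cache of rows: the least recently touched row is evicted
    // when capacity is exceeded.
    class LruRowCache {
     public:
      explicit LruRowCache(size_t capacity) : capacity_(capacity) {}

      // Returns nullptr on miss; on hit, marks the row as most recently used.
      const std::vector<double>* Get(int32_t row_id) {
        auto it = index_.find(row_id);
        if (it == index_.end()) return nullptr;
        order_.splice(order_.begin(), order_, it->second);
        return &it->second->second;
      }

      void Put(int32_t row_id, std::vector<double> row) {
        auto it = index_.find(row_id);
        if (it != index_.end()) {
          it->second->second = std::move(row);
          order_.splice(order_.begin(), order_, it->second);
          return;
        }
        if (index_.size() >= capacity_) {      // evict least recently used
          index_.erase(order_.back().first);
          order_.pop_back();
        }
        order_.emplace_front(row_id, std::move(row));
        index_[row_id] = order_.begin();
      }

     private:
      size_t capacity_;
      std::list<std::pair<int32_t, std::vector<double>>> order_;  // front = newest
      std::unordered_map<int32_t, decltype(order_)::iterator> index_;
    };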
Questions
Is Lock-Free Possible? Real-time data exchange? next …
Is Auto-Rescheduling Possible? A sub-centralized server could reduce communication cost.
Is Auto-Partitioning Possible? Run ML algorithms as if on a single thread; a solution for all ML algorithms.
In-Memory or In-Storage? Data size can exceed memory capacity, so memory should act as a cache for disk storage. Solutions for disk-backed storage: Hadoop, Spark, ….
A New Scheme to Reduce the Upper Bound?
STRADS Scheduler: variable correlations, auto-parallelization, dynamic prioritization (monitor each variable's contribution to the objective function), and load balancing across tasks
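For the dynamic-prioritization point, a hedged sketch of the idea (my illustration, not STRADS code): remember how much each variable's last update moved the objective, and sample the next variables to update with probability roughly proportional to that contribution.

    #include <cmath>
    #include <cstddef>
    #include <random>
    #include <vector>

    // Priority-based variable scheduling: variables whose recent updates
    // moved the objective more are scheduled more often.
    std::vector<size_t> PickVariables(const std::vector<double>& recent_delta_objective,
                                      size_t how_many, std::mt19937& rng) {
      std::vector<double> weights(recent_delta_objective.size());
      for (size_t i = 0; i < weights.size(); ++i)
        weights[i] = std::abs(recent_delta_objective[i]) + 1e-6;  // keep every variable schedulable
      std::discrete_distribution<size_t> pick(weights.begin(), weights.end());
      std::vector<size_t> chosen(how_many);
      for (size_t k = 0; k < how_many; ++k) chosen[k] = pick(rng);
      return chosen;
    }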
Demo: switch to my laptop …