Distributed Computation Framework for Machine Learning
Yuxin Su
Big Data Group
Apr. 25, 2014
Popular Frameworks
- Hadoop / MapReduce
    - Data-oriented model
    - Unfriendly to iterative algorithms
- GraphLab
    - Dependence-oriented model
I’m focusing on Petuum
- A new extension of the Bulk Synchronous Parallel (BSP) model
- Tolerates error in order to reduce communication demand
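One way to picture how tolerating error can cut communication is a bounded-staleness rule: a worker keeps computing on slightly out-of-date parameters as long as it is not too far ahead of the slowest worker. The sketch below assumes that this is the kind of relaxation meant here; the function name `may_proceed` and its parameters are hypothetical and are not part of Petuum's API.

```python
def may_proceed(worker_clock, all_clocks, staleness):
    """Return True if a worker may start its next iteration without waiting.

    worker_clock: number of iterations this worker has completed
    all_clocks:   completed-iteration counts of every worker (assumed known)
    staleness:    maximum allowed lead over the slowest worker (assumption)
    """
    # With staleness == 0 this is a BSP-style barrier: nobody may run ahead.
    return worker_clock - min(all_clocks) <= staleness


# Three workers; the second one is one iteration behind.
clocks = [4, 3, 4]
print(may_proceed(4, clocks, staleness=0))  # strict barrier: must wait
print(may_proceed(4, clocks, staleness=1))  # one step of slack: may continue
```

A larger `staleness` means fewer forced waits and less synchronization traffic, at the cost of computing on older parameter values.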
Non-negative Matrix Factorization (NMF)
- A commonly used algorithm
- Hard to scale up
- Constraints: W ≥ 0, H ≥ 0
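For reference, the non-negativity constraints on W and H usually accompany a least-squares objective. The formulation below is the common Frobenius-norm version; the name V for the data matrix, its size m × n, and the rank k are assumptions, since the slide shows only the constraints.

```latex
\min_{W \in \mathbb{R}^{m \times k},\; H \in \mathbb{R}^{k \times n}}
\; \tfrac{1}{2}\,\lVert V - WH \rVert_F^2
\quad \text{subject to} \quad W \ge 0,\; H \ge 0
```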
NMF on Petuum
- My goal: find an efficient approach to handling huge matrices with billions of entries
- Current work
    - Design a parallel coordinate descent method to solve NMF
    - Analyze the convergence of my approach on Petuum
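To make the coordinate descent idea concrete, here is a minimal single-machine sketch in NumPy. It uses a standard block coordinate update (often called HALS) on the Frobenius-norm objective, which is an assumption: the slides do not specify the update rule, and this is not the Petuum implementation. The function name and parameters are illustrative only.

```python
import numpy as np


def nmf_coordinate_descent(V, k, n_iters=200, eps=1e-10, seed=0):
    """Approximate a non-negative m x n matrix V as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iters):
        # Update H one row at a time while W is held fixed.
        WtV = W.T @ V  # shape (k, n)
        WtW = W.T @ W  # shape (k, k)
        for j in range(k):
            # Negative gradient of the objective with respect to row j of H.
            step = WtV[j] - WtW[j] @ H
            # Exact minimizer along this row, projected onto H >= 0.
            H[j] = np.maximum(0.0, H[j] + step / (WtW[j, j] + eps))
        # Update W one column at a time while H is held fixed.
        VHt = V @ H.T  # shape (m, k)
        HHt = H @ H.T  # shape (k, k)
        for j in range(k):
            step = VHt[:, j] - W @ HHt[:, j]
            W[:, j] = np.maximum(0.0, W[:, j] + step / (HHt[j, j] + eps))
    return W, H


# Small synthetic example: a non-negative matrix of rank 3.
rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 15))
W, H = nmf_coordinate_descent(V, k=3)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

Within one row (or column) update, the individual entries do not depend on each other, so they could in principle be split across workers. That independence is the natural starting point for a parallel version, though how the work is actually partitioned on Petuum is not stated in the slides.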