Published by Vivien Oliver. Modified over 9 years ago.
1
Experimental Perspectives on Lasso-related Algorithms on Parallel Computing Frameworks
Jichuan Zeng
2
Experimental Perspectives on Lasso-related Algorithms on Parallel Computing Frameworks
Big Data era (4V: Volume, Variety, Variability, Velocity).
A comparison of state-of-the-art Big Data frameworks on specific problems with large-scale datasets is still lacking.
Lasso-related algorithms: large scale, sparsity, and slow convergence.
What are the main features and differences of these distributed ML frameworks?
Are these distributed ML frameworks capable of solving lasso-related optimization problems on huge-scale datasets?
What is the trade-off between the performance of the frameworks and the sparsity of the data?
What is the main factor in each framework that slows lasso-related algorithms as the scale of the data soars?
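For reference, the lasso problem the questions above refer to is usually written as minimizing (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1. The following is a minimal single-machine sketch of cyclic coordinate descent for that objective, assuming dense list-of-lists input. The function names `soft_threshold` and `lasso_cd` are illustrative and are not taken from the presentation or from any of the frameworks it compares.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iters=100):
    """Cyclic coordinate descent for (1/(2n))*||y - Xw||^2 + lam*||w||_1.

    X is a list of n rows with d features each; y is a list of n targets.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # residual = y - Xw, and w starts at zero
    # Mean squared value of each column, used to scale the coordinate update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero column never affects the objective
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual consistent with the updated weight.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w
```

The soft-thresholding step is what sets weakly correlated coefficients to exactly zero, which is the source of the sparsity mentioned above. Each coordinate update reads and writes the shared residual, which hints at why such algorithms are sensitive to how a framework handles shared state.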
3
Distributed Machine Learning Frameworks
GraphLab: graph-based
Petuum: parameter server, Stale Synchronous Parallel (SSP)
Spark: general-purpose, Resilient Distributed Datasets (RDDs)
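The Stale Synchronous Parallel idea listed for Petuum can be illustrated with a small clock-tracking sketch. It assumes one common convention: a worker may start its next iteration only while it is at most `staleness` completed iterations ahead of the slowest worker, so `staleness = 0` reduces to fully synchronous execution. The class `SSPClock` and its methods are hypothetical and do not reflect Petuum's actual API.

```python
class SSPClock:
    """Tracks per-worker iteration clocks under a staleness bound."""

    def __init__(self, n_workers, staleness):
        self.clocks = [0] * n_workers  # completed iterations per worker
        self.staleness = staleness

    def can_start(self, worker):
        # The worker may run ahead of the slowest worker by at most
        # `staleness` completed iterations.
        return self.clocks[worker] - min(self.clocks) <= self.staleness

    def tick(self, worker):
        """Try to run one iteration; return False if the worker must wait."""
        if not self.can_start(worker):
            return False
        self.clocks[worker] += 1
        return True
```

With two workers and `staleness=2`, a fast worker can complete three iterations before it has to wait for the slow one. A real parameter server would block the worker at that point rather than return `False`.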
4
Lasso on ML Frameworks
Environment: Arcane multi-core servers running VMware. Each virtual machine was configured with 4 cores (2.5 GHz each) and 16 GB of RAM.
Dataset: Arcene. The sparser dataset contains 10K features and 54K non-zero entries.
GraphLab performs poorly in shared-variable applications compared with Petuum and Spark, which are deployed in master/worker mode.
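The master/worker pattern with a shared weight vector can be sketched as one data-parallel proximal gradient step for the lasso: each worker computes a partial gradient on its shard of rows, and the master sums the results and applies soft-thresholding. This is an illustrative single-process sketch, not the implementation benchmarked in the presentation. The names `partial_gradient` and `master_step`, the step size, and the shard layout are all assumptions.

```python
def partial_gradient(rows, targets, w):
    """Worker step: gradient of 0.5 * sum((x.w - y)^2) over one shard."""
    grad = [0.0] * len(w)
    for x, y in zip(rows, targets):
        err = sum(xj * wj for xj, wj in zip(x, w)) - y
        for j, xj in enumerate(x):
            if xj != 0.0:  # only non-zero entries contribute to the gradient
                grad[j] += err * xj
    return grad


def master_step(shards, w, lam, step, n_total):
    """Master step: sum worker gradients, then soft-threshold the weights."""
    total = [0.0] * len(w)
    for rows, targets in shards:  # each call would run on a separate worker
        g = partial_gradient(rows, targets, w)
        total = [t + gj for t, gj in zip(total, g)]
    new_w = []
    for wj, gj in zip(w, total):
        z = wj - step * gj / n_total  # gradient step on the smooth part
        mag = abs(z) - step * lam     # shrinkage from the L1 penalty
        new_w.append(0.0 if mag <= 0.0 else (mag if z > 0.0 else -mag))
    return new_w
```

Every worker reads the same `w` and every update flows back through the master, so `w` is the shared variable in this pattern.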
5
Future Work
More lasso-related models: group lasso, elastic net, fused lasso, graphical lasso
Try to improve the current distributed ML frameworks: load balancing, heuristic updates
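For orientation, the lasso variants named here differ mainly in the penalty added to the loss. The block below gives their standard textbook forms. The notation (coefficients \beta, groups g of size p_g, precision matrix \Theta, sample covariance S, tuning parameters \lambda) is assumed here and does not come from the presentation.

```latex
\begin{align*}
\text{Lasso:}           \quad & \lambda \lVert \beta \rVert_1 \\
\text{Group lasso:}     \quad & \lambda \sum_{g} \sqrt{p_g}\, \lVert \beta_g \rVert_2 \\
\text{Elastic net:}     \quad & \lambda_1 \lVert \beta \rVert_1 + \lambda_2 \lVert \beta \rVert_2^2 \\
\text{Fused lasso:}     \quad & \lambda_1 \lVert \beta \rVert_1
                                + \lambda_2 \sum_{j=2}^{p} \lvert \beta_j - \beta_{j-1} \rvert \\
\text{Graphical lasso:} \quad & \max_{\Theta \succ 0}\;
                                \log\det\Theta - \operatorname{tr}(S\Theta)
                                - \lambda \lVert \Theta \rVert_1
\end{align*}
```

The first four are penalties added to a regression loss, while the graphical lasso is written as a full objective because it estimates a sparse inverse covariance matrix rather than a coefficient vector.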