Practical Message-passing Framework for Large-scale Combinatorial Optimization Inho Cho, Soya Park, Sejun Park, Dongsu Han, and Jinwoo Shin KAIST 2015.

Presentation transcript:

Practical Message-passing Framework for Large-scale Combinatorial Optimization
Inho Cho, Soya Park, Sejun Park, Dongsu Han, and Jinwoo Shin (KAIST)
2015 IEEE International Conference on Big Data

Introduction: Large-scale Real-time Optimizations Are Becoming More Important for Processing Big Data

- Virtual machine placement in data centers [1]
- Multi-path network routing in SDN [2]
- Resource allocation on cloud [3]
- Virtual network resource assignment [4]

Problem sizes are becoming large, and decisions need to be made in real time.

[1] Meng, et al. "Improving the scalability of data center networks with traffic-aware virtual machine placement." INFOCOM.
[2] Kotronis, et al. "Outsourcing the routing control logic: Better Internet routing based on SDN principles." Hot Topics in Networks.
[3] Rai, et al. "Generalized resource allocation for the cloud." ACM Symposium on Cloud Computing.
[4] Zhu, et al. "Algorithms for Assigning Substrate Network Resources to Virtual Network Components." INFOCOM.

Introduction: Traditional Attempts to Solve Combinatorial Optimization

There is a trade-off among accuracy, time complexity, and generality. Our goal is to develop a parallelizable framework that solves large-scale combinatorial optimization problems with low time complexity and high accuracy.

[Figure: accuracy (poor to high) versus time complexity (low to high), positioning Greedy, Exact Algorithm, Integer Programming, Approximation Algorithm, and GOAL. Legend: General Algorithm, Problem-specific Algorithm.]

Our Contribution: Our Approach

Many combinatorial optimization problems can be expressed as an Integer Programming (IP) formulation. We solve the resulting optimization problem with the Belief Propagation (BP) algorithm.

[Figure: the Maximum Weight Matching problem written as an IP formulation and as a BP formulation over edges and vertices, with a message update rule and a decision rule that labels each edge as selected, undecided, or unselected. The equations themselves are not reproduced in this transcript.]
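The equations behind the message update rule and decision rule on this slide did not survive in the transcript. As a point of reference, the standard IP formulation of Maximum Weight Matching is: maximize the sum of w_e * x_e over all edges e, subject to the sum of x_e over the edges incident to each vertex being at most 1, with x_e in {0, 1}. The sketch below uses the simplified max-product update commonly given for weighted matching in the literature (the line of work cited as [6]); it is an assumption that the slide shows the same form. The function name `bp_matching` and the edge-dictionary input are hypothetical.

```python
from collections import defaultdict


def bp_matching(edges, iterations=50):
    """Simplified max-product BP for maximum weight matching (illustrative).

    edges: dict mapping an undirected edge (u, v) to its weight.
    Returns (messages, decisions), where decisions maps each edge to
    "selected", "unselected", or "undecided".
    """
    # Adjacency map: nbrs[i][k] is the weight of edge {i, k}.
    nbrs = defaultdict(dict)
    for (u, v), w in edges.items():
        nbrs[u][v] = w
        nbrs[v][u] = w

    # msg[(i, j)] is the message from vertex i to its neighbor j.
    msg = {(i, j): 0.0 for i in nbrs for j in nbrs[i]}

    for _ in range(iterations):  # fixed number of iterations
        new_msg = {}
        for (i, j) in msg:
            # Best gain i could obtain from any neighbor other than j.
            best = 0.0
            for k, w_ik in nbrs[i].items():
                if k != j:
                    best = max(best, w_ik - msg[(k, i)])
            new_msg[(i, j)] = best
        msg = new_msg  # synchronous update: all messages change together

    # Decision rule: compare the edge weight with the two opposing messages.
    decisions = {}
    for (u, v), w in edges.items():
        total = msg[(u, v)] + msg[(v, u)]
        if w > total:
            decisions[(u, v)] = "selected"
        elif w < total:
            decisions[(u, v)] = "unselected"
        else:
            decisions[(u, v)] = "undecided"
    return msg, decisions


if __name__ == "__main__":
    path = {("a", "b"): 1, ("b", "c"): 3, ("c", "d"): 1}
    _, decisions = bp_matching(path, iterations=10)
    print(decisions)
```

On the path a-b-c-d with weights 1, 3, 1 the heavy middle edge is the unique optimum, and the messages reach a fixed point after a few iterations. The "undecided" label corresponds to a tie between the edge weight and the sum of the two messages.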

Our Contribution: Belief Propagation (BP)

BP is a message-passing algorithm.
- Easy to parallelize [5] and easy to implement.
- Widely used due to its empirical success in various fields, e.g., error-correcting codes, computer vision, language processing, and statistical physics.

Previous work on BP for combinatorial optimization:
- Analytic studies are too theoretical, i.e., not practical [6-7].
- Empirical studies are problem-specific [8-9].

[5] Gonzalez, et al. "Residual splash for optimally parallelizing belief propagation." AISTATS.
[6] S. Sanghavi, et al. "Belief propagation and LP relaxation for weighted matching in general graphs." Information Theory.
[7] N. Ruozzi and S. Tatikonda. "s-t paths using the min-sum algorithm." Allerton.
[8] S. Ravanbakhsh, et al. "Augmentative message passing for traveling salesman problem and graph partitioning." NIPS.
[9] M. Bayati, et al. "Statistical mechanics of Steiner trees." Physical Review Letters, vol. 101, no. 3, 2008.

Our Contribution: Challenges of BP & Our Solution

(1) BP's convergence is too slow for practical instances.
→ Run a fixed number of BP iterations.
(2) BP may not produce a feasible solution.
→ Introduce a generic “rounding” scheme that enforces feasibility via weight transformation and post-processing.
(3) The solution may have poor accuracy.
→ Use careful message initialization, hybrid damping, and asynchronous message updates.

Algorithm Design: Overview of Our Generic BP-based Framework

[Figure: framework pipeline from Input to Output in two stages. (1) BP: Message Initialization, Noise Addition, BP Iterations (Damping, Asynchronous Message Update), Weight Transforming. (2) Post-Processing: Heuristic Algorithm. Data labels: Original Weight, Transformed Weight, Feasible Solution.]

After a fixed number of BP iterations, the weights are transformed so that the BP messages are taken into account. Post-processing then uses the transformed weights to produce a feasible solution.
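The two-stage pipeline described on this slide can be sketched for Maximum Weight Matching as follows. Several pieces are assumptions: the transcript does not give the weight transformation, so `transform_weights` uses the gap between an edge weight and its two opposing messages as a hypothetical score; the noise scale, the zero message initialization, and all function names are likewise illustrative. Greedy is used for post-processing because the evaluation slides name it for that role. Edge weights are assumed non-negative.

```python
import random


def add_noise(edges, scale=1e-6, seed=0):
    """Perturb weights slightly to break ties (scale is an assumed value)."""
    rng = random.Random(seed)
    return {e: w + rng.uniform(0.0, scale) for e, w in edges.items()}


def run_bp(edges, iterations):
    """Fixed number of synchronous max-product iterations for matching."""
    nbrs = {}
    for (u, v), w in edges.items():
        nbrs.setdefault(u, {})[v] = w
        nbrs.setdefault(v, {})[u] = w
    msg = {(i, j): 0.0 for i in nbrs for j in nbrs[i]}
    for _ in range(iterations):
        msg = {
            (i, j): max([0.0] + [w - msg[(k, i)]
                                 for k, w in nbrs[i].items() if k != j])
            for (i, j) in msg
        }
    return msg


def transform_weights(edges, msg):
    """Hypothetical transformation: weight minus the two opposing messages."""
    return {(u, v): w - msg[(u, v)] - msg[(v, u)]
            for (u, v), w in edges.items()}


def greedy_matching(scores):
    """Post-processing: scan edges by descending score, keep feasible ones."""
    matched, result = set(), []
    for (u, v), _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if u not in matched and v not in matched:
            result.append((u, v))
            matched.update((u, v))
    return result


def solve(edges, iterations=20):
    noisy = add_noise(edges)
    msg = run_bp(noisy, iterations)          # stage (1): BP
    scores = transform_weights(noisy, msg)   # transformed weights
    matching = greedy_matching(scores)       # stage (2): post-processing
    return matching, sum(edges[e] for e in matching)


if __name__ == "__main__":
    path = {("a", "b"): 2, ("b", "c"): 3, ("c", "d"): 2}
    print(solve(path))
```

On the path with weights 2, 3, 2, greedy on the original weights would take the middle edge for a total of 3, while greedy on the transformed scores takes the two outer edges for a total of 4. The output is always a feasible matching because the greedy scan never reuses a vertex.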

Algorithm Design: Message Initialization & Hybrid Damping

BP convergence speed can be significantly improved by careful message initialization and hybrid damping.
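The transcript does not define the "hybrid" damping scheme or the initialization rule, so the sketch below shows only plain damping, where each new message is a convex combination of the old message and the undamped candidate, plus a configurable initial message value. The parameter names `damping` and `init` are hypothetical, and the update is the simplified max-product rule for matching, used here as an assumption.

```python
def run_bp(edges, iterations, damping=0.0, init=0.0):
    """Max-product BP for matching with plain damping (illustrative).

    damping=0.0 gives the undamped update; damping close to 1.0 keeps
    most of the old message. init is the starting value of every message.
    """
    nbrs = {}
    for (u, v), w in edges.items():
        nbrs.setdefault(u, {})[v] = w
        nbrs.setdefault(v, {})[u] = w

    msg = {(i, j): init for i in nbrs for j in nbrs[i]}
    for _ in range(iterations):
        new_msg = {}
        for (i, j), old in msg.items():
            # Undamped candidate from the current messages.
            candidate = max([0.0] + [w - msg[(k, i)]
                                     for k, w in nbrs[i].items() if k != j])
            # Damped update: blend the old message with the candidate.
            new_msg[(i, j)] = damping * old + (1.0 - damping) * candidate
        msg = new_msg
    return msg


if __name__ == "__main__":
    # Triangle with unit weights: undamped messages flip between 0 and 1.
    triangle = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
    print(set(run_bp(triangle, 10).values()))               # {0.0}
    print(set(run_bp(triangle, 11).values()))               # {1.0}
    print(set(run_bp(triangle, 10, damping=0.5).values()))  # {0.5}
    print(set(run_bp(triangle, 1, init=0.5).values()))      # {0.5}
```

On the unit-weight triangle the undamped messages oscillate forever, damping by 0.5 settles them at 0.5 after one step, and starting at 0.5 needs no iterations to settle. This is a toy case, but it shows why both knobs can affect how many iterations a fixed budget needs.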

Evaluation: Evaluation Setup

Combinatorial optimization problems: Maximum Weight Matching, Minimum Weight Vertex Cover, Maximum Weight Independent Set, and Travelling Salesman Problem.

Data sets: benchmark data sets [10], real-world data sets [11], and synthetic data sets with Erdős-Rényi random graphs.

Number of samples:
- Synthetic data sets: 100 samples for up to 100k vertices, 10 samples for up to 500k vertices, and 1 sample for up to 50M vertices.
- Benchmark data sets: 5 samples per data set.

Metrics: running time, accuracy (approximation ratio), and scalability over large-scale input.

[10] BHOSLIB benchmark set.
[11] Davis, et al. "The University of Florida sparse matrix collection." TOMS 2011.
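To make the accuracy metric concrete, the sketch below generates a small Erdős-Rényi graph and computes an approximation ratio for Maximum Weight Matching. The weight distribution (uniform on [1, 10]), the graph size, and all function names are assumptions, since the transcript does not state them. A greedy matching stands in for the solver under test, and the optimum comes from brute force, which is only practical for tiny instances.

```python
import itertools
import random


def erdos_renyi(n, p, seed=0):
    """G(n, p) random graph with assumed uniform edge weights in [1, 10]."""
    rng = random.Random(seed)
    edges = {}
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            edges[(u, v)] = rng.uniform(1.0, 10.0)
    return edges


def greedy_matching_weight(edges):
    """Heaviest-edge-first matching, used here as the solver under test."""
    matched, total = set(), 0.0
    for (u, v), w in sorted(edges.items(), key=lambda kv: kv[1], reverse=True):
        if u not in matched and v not in matched:
            matched.update((u, v))
            total += w
    return total


def brute_force_matching_weight(edges):
    """Exact optimum by enumerating edge subsets (tiny graphs only)."""
    items = list(edges.items())

    def best(idx, used):
        if idx == len(items):
            return 0.0
        (u, v), w = items[idx]
        skip = best(idx + 1, used)
        if u in used or v in used:
            return skip
        return max(skip, w + best(idx + 1, used | {u, v}))

    return best(0, frozenset())


def approximation_ratio(found, optimal):
    """Ratio for a maximization problem; 1.0 means the optimum was found."""
    return 1.0 if optimal == 0 else found / optimal


if __name__ == "__main__":
    graph = erdos_renyi(7, 0.5, seed=1)
    opt = brute_force_matching_weight(graph)
    got = greedy_matching_weight(graph)
    print(len(graph), "edges, ratio", approximation_ratio(got, opt))
```

For minimization problems such as Minimum Weight Vertex Cover the ratio is usually taken the other way around (found cost divided by optimal cost), so that 1.0 still means optimal.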

Evaluation: Running Time

Our framework runs more than 70 times faster (71x) than Blossom V, an exact algorithm, on Maximum Weight Matching with a randomly generated data set.

Accuracy: Blossom 100%, BP >99.9%.

Experiment environment:
- Two Intel Xeon E5 CPUs (16 cores)
- Language: C++
- Pthreads for parallelization
- Post-processing: Greedy
- Randomly generated data set

Evaluation: Accuracy

Our framework reduces the error ratio by more than 40% compared with existing heuristic algorithms on Minimum Weight Vertex Cover, using the frb-series benchmark data from BHOSLIB [10].

Experiment environment:
- Two Intel Xeon E5 CPUs (16 cores)
- Language: C++
- Pthreads for parallelization
- Benchmark data set

[10] BHOSLIB benchmark set.

Evaluation: Scalability over Large-scale Input

Our framework can handle more than 2.5 billion variables (50M vertices), while existing schemes can handle up to 300 million variables on the same machine.

[Figure: largest problem size handled by each scheme, with data labels 50M, 300M, and >2.5B and running times (158h), (102h), and (>200h).]

Experiment environment:
- i7 CPU (4 cores) and 24GB memory
- Language: C++
- GraphChi implementation

[12] A. Kyrola, et al. "GraphChi: Large-scale graph computation on just a PC." OSDI.
[13] V. Kolmogorov. "Blossom V: a new implementation of a minimum cost perfect matching algorithm." Mathematical Programming Computation.
[14] Gurobi Optimizer. gurobi.com (2012).

Conclusion

We proposed the first practical and general BP-based framework. On synthetic Maximum Weight Matching data with 20M vertices, it achieves above 99.9% accuracy and runs more than 70x faster than existing algorithms by allowing a parallel implementation. The framework reduces the error rate by more than 40% on Minimum Weight Vertex Cover benchmark data. It is applicable to any large-scale combinatorial optimization task. Code is available on