Hardware Accelerator for Combinatorial Optimization Fujian Li Advisor: Dr. Areibi
Case Study for Implementing the Traveling Salesman Problem
Keywords: combinatorial problem, heuristic method, local search, simulated annealing algorithm, Traveling Salesman Problem (TSP), list representation
Combinatorial Problems
  Combinatorial problems: Traveling Salesman Problem, Partitioning, ...
  Heuristic methods: Local Search, Simulated Annealing, Tabu Search, ...
    Not guaranteed to find the global optimum
    Move through the search space using local information to guide each move
    Time consuming
Local Search
  [Figure: cost function plotted over the solution space for a minimization problem, marking the local optima and the global optimum]
  Local search is the basis for heuristic methods
Local Search
  Generate Initial Feasible Solution S0
  Repeat
    Select a Move from Neighborhood: S = N(S0)
    Compute Change in Cost: f(S) - f(S0)
    If accept then S0 = S
  Until Stopping Condition
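The loop above can be sketched in Python. This is a minimal illustration, not the presented design: the slide leaves the acceptance test and the stopping condition open, so the sketch assumes "accept only if the cost decreases" and "stop after a fixed number of iterations". The names `local_search`, `neighbor`, `cost`, and `max_iters` are hypothetical.

```python
import random

def local_search(initial, neighbor, cost, max_iters=1000, seed=0):
    """Greedy local search (assumed acceptance rule: improvements only)."""
    rng = random.Random(seed)
    current = initial                  # S0: initial feasible solution
    current_cost = cost(current)
    for _ in range(max_iters):         # assumed stopping condition
        candidate = neighbor(current, rng)        # S = N(S0)
        delta = cost(candidate) - current_cost    # f(S) - f(S0)
        if delta < 0:                  # accept: S0 = S
            current, current_cost = candidate, current_cost + delta
    return current, current_cost

# Toy usage: minimize (x - 3)^2 over the integers, moving one step at a time.
result = local_search(
    initial=10,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
    cost=lambda x: (x - 3) ** 2,
)
```

Because only improving moves are accepted, this variant stops making progress at the first local optimum it reaches, which is the weakness simulated annealing is meant to address.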
Simulated Annealing Algorithm
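A minimal Python sketch of simulated annealing may help here. It assumes the usual Metropolis acceptance rule (accept a worse move with probability exp(-delta / T)) and a geometric cooling schedule; neither detail is stated on the slide. All names and default parameters (`t_start`, `t_end`, `alpha`, `moves_per_temp`) are hypothetical.

```python
import math
import random

def simulated_annealing(initial, neighbor, cost, t_start=10.0, t_end=0.01,
                        alpha=0.95, moves_per_temp=100, seed=0):
    """Simulated annealing with Metropolis acceptance (assumed) and
    geometric cooling t <- alpha * t (assumed)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            delta = cost(candidate) - current_cost
            # Always accept improvements; accept uphill moves with
            # probability exp(-delta / t), which shrinks as t falls.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha   # lower the temperature
    return best, best_cost

# Toy usage: minimize (x - 3)^2 over the integers.
best, best_cost = simulated_annealing(
    initial=20,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
    cost=lambda x: (x - 3) ** 2,
)
```

Unlike plain local search, the uphill moves accepted at high temperature let the search leave a local optimum.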
Traveling Salesman Problem
  Given n cities and the distances between each pair of cities, find a cycle that visits each city exactly once and minimizes the total distance traveled.
  Combinatorial problem
  Complexity of exhaustive search: O(n!)
Traveling Salesman Problem
  Swap two cities to move through the solution space
  [Figure: cities Xi and Xj with their tour neighbors X(i-1), X(i+1), X(j-1), X(j+1); the affected distances are D[Xi, X(i-1)], D[Xi, X(i+1)], D[Xj, X(j-1)], D[Xj, X(j+1)]]
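Swapping two cities changes only the edges that touch them, so the change in cost can be computed from a handful of distances instead of re-summing the whole tour. The Python sketch below illustrates this under assumed conventions: `tour` is a list of city indices, `D` is a distance matrix, and `i`, `j` are positions in the tour. The function names and the sample data are hypothetical.

```python
def tour_length(tour, D):
    """Total length of the closed tour."""
    n = len(tour)
    return sum(D[tour[k]][tour[(k + 1) % n]] for k in range(n))

def swap_delta(tour, D, i, j):
    """Change in tour length if the cities at positions i and j are swapped."""
    n = len(tour)
    # Edge k joins positions k and k + 1 (cyclically). Only edges touching
    # position i or j change; the set avoids double counting when the two
    # positions are adjacent.
    edges = {(i - 1) % n, i, (j - 1) % n, j}

    def city(k, swapped):
        k %= n
        if swapped:
            if k == i:
                return tour[j]
            if k == j:
                return tour[i]
        return tour[k]

    def partial(swapped):
        return sum(D[city(k, swapped)][city(k + 1, swapped)] for k in edges)

    return partial(True) - partial(False)

# Sample data (made up for illustration): 5 cities, symmetric distances.
D = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]
tour = [0, 1, 2, 3, 4]
delta = swap_delta(tour, D, 1, 3)
```

At most four old and four new distances enter the sum, which is why a small, fixed-size adder can evaluate a move regardless of the number of cities.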
Traveling Salesman Problem Adder Tree
Implementation
  List data structure is good for heuristic algorithms
    Grows and shrinks easily
    Implements common operations easily
      Move: an item is moved from one list to the end of another list.
      Swap: two items, from the same list or from different lists, are swapped.
      Reposition: the position of an item in a list is changed.
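The three list operations can be sketched with ordinary Python lists. This is only an illustration of their semantics as described above; the function names and index-based signatures are assumptions, and the hardware stores its lists in memory devices rather than in a structure like this.

```python
def move(src, item, dst):
    """Move: take item out of src and append it to the end of dst."""
    src.remove(item)
    dst.append(item)

def swap(list_a, index_a, list_b, index_b):
    """Swap: exchange two items; list_a and list_b may be the same list."""
    list_a[index_a], list_b[index_b] = list_b[index_b], list_a[index_a]

def reposition(lst, old_index, new_index):
    """Reposition: change the position of one item within a list."""
    lst.insert(new_index, lst.pop(old_index))

a, b = [1, 2, 3], [4, 5]
move(a, 2, b)        # a == [1, 3],  b == [4, 5, 2]
swap(a, 0, b, 1)     # a == [5, 3],  b == [4, 1, 2]
reposition(b, 0, 2)  # b == [1, 2, 4]
swap(a, 0, a, 1)     # a == [3, 5]
```

For the TSP, a single list holds the tour and the swap operation within that list is the move used to explore the solution space.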
Implementation
  Memory for storing solutions
  A unit for generating a move
  A unit for computing the change in cost
  A unit for evaluating the Boltzmann equation
Implementation
Implementation
  An Aptix Ap4 reconfigurable logic board
    Xilinx XC4010 FPGAs, memory devices, ...
  FPGAs implement
    A move generator
    The adder tree
    Update unit
    Finite state machine
    ...
Implementation
  A lookup table for the Boltzmann equation
    Exponential computation on an FPGA is difficult
    The table contains the values of the Boltzmann equation for different values of the change in cost
    The table is reloaded each time the temperature changes
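The lookup-table idea can be sketched in Python. The sketch assumes the Boltzmann equation takes the usual simulated annealing form exp(-delta / T), that the table is indexed by an integer change in cost, and that changes beyond the end of the table are rejected outright. These are assumptions for illustration, as are the names `build_boltzmann_table`, `accept`, and `max_delta`.

```python
import math
import random

def build_boltzmann_table(temperature, max_delta):
    """Entry d holds exp(-d / temperature) for d = 0 .. max_delta.
    Rebuilt (reloaded) whenever the temperature changes."""
    return [math.exp(-d / temperature) for d in range(max_delta + 1)]

def accept(delta, table, rng):
    """Decide on a move using only a table lookup and a comparison."""
    if delta <= 0:
        return True                 # improvements are always accepted
    if delta >= len(table):
        return False                # assumption: off-table means probability 0
    return rng.random() < table[delta]

rng = random.Random(0)
hot = build_boltzmann_table(temperature=10.0, max_delta=50)
cold = build_boltzmann_table(temperature=1.0, max_delta=50)  # after cooling
decision = accept(5, hot, rng)
```

The exponential is evaluated only when the table is built, so the per-move work reduces to one memory read and one comparison against a random number.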
Implementation
  FPGAs are not suitable for implementing large memories
  Memory devices store
    The multiple copies of the solution list
    Distances
Comparison
  Implementation      Average time per iteration
  Software-IBM RS
  Hardware            0.35
  Speedup over RS     times
Future Improvement
  Pipeline scheme
  Very high speed memory
  Block RAM on FPGA
  Interconnection delay between devices on Aptix
Thank You & Questions?
Application of FPGA
  Satisfactory density and speed of FPGAs
  Efficient synthesis tools