Robust Asynchronous Optimization Using Volunteer Computing Grids

Robust Asynchronous Optimization Using Volunteer Computing Grids
Travis Desell, Boleslaw Szymanski, Carlos Varela, Nathan Cole, Heidi Newberg, Malik Magdon-Ismail
Rensselaer Polytechnic Institute, Department of Computer Science
BOINC Workshop 2009, October 22, Barcelona, Spain

Overview
- Motivation
- What is Optimization?
- Astro-Informatics at Milkyway@Home
- Making Optimization Asynchronous
- Partial Verification Strategies
- Results
- Future Work

Motivation
- Distribution is essential for modern scientific computing: scientific models are becoming increasingly complex, and rates of data acquisition far exceed increases in computing power.
- Scientists need easily accessible distributed optimization tools.
- Traditional optimization strategies are not well suited to large-scale computing: they lack scalability and fault tolerance.

What is Optimization?
- What parameters x' give the maximum (or minimum) value of f(x)?
- f is typically very complex, with multiple minima.
- Values of x can be continuous or discrete.
- This talk focuses on continuous optimization.

Astro-Informatics
What is the structure and origin of the Milky Way galaxy?
- Being inside the Milky Way provides 3D data: the Sloan Digital Sky Survey has collected over 10 TB of data.
- We can determine the Milky Way's structure, which is not possible for other galaxies.
- It is very expensive: evaluating a single model of the Milky Way with a single set of parameters can take hours or days on a typical high-end computer.
- Models determine where different star streams are in the Milky Way, which helps us better understand its structure and how it was formed.

Milkyway@Home Progress

Traditional Optimization Strategies
- Traditional continuous optimization strategies are evolutionary, imitating biology: individual members or entire populations improve monotonically through recombination.
- Individual-based evolution: Differential Evolution, Particle Swarm Optimization
- Population-based evolution: Genetic Search

Issues With Traditional Optimization
- Traditional global optimization techniques are dependent and iterative: the current population (or individual) is used to generate the next population (or individual).
- These dependencies and iterations limit scalability and impact performance.
- With volatile hosts, what if an individual in the next generation is lost? Redundancy is expensive.
- Scalability is limited by the population size.

Asynchronous Optimization Strategy
Use an asynchronous methodology:
- No dependencies on unknown results
- No iterations
- Continuously updated population
The population is seeded with N randomly generated individuals; work requests are fulfilled by applying recombination operators to the population, and the population is updated with reported results.
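
A minimal sketch of this generate/report loop, assuming a maximization problem and a simple averaging recombination operator (the class and method names are illustrative, not the actual MilkyWay@Home server API):

```python
import random

class AsynchronousSearch:
    def __init__(self, bounds, population_size=300):
        self.bounds = bounds          # [(min, max), ...] per parameter
        self.size = population_size
        self.population = []          # list of (fitness, parameters)

    def random_individual(self):
        return [random.uniform(lo, hi) for lo, hi in self.bounds]

    def request_work(self):
        # Until the initial population of N individuals is full,
        # hand out randomly generated individuals.
        if len(self.population) < self.size:
            return self.random_individual()
        # Otherwise apply a recombination operator to the population
        # (averaging shown here; double shot or randomized simplex,
        # described below, would slot in the same way).
        a = random.choice(self.population)[1]
        b = random.choice(self.population)[1]
        return [(x + y) / 2.0 for x, y in zip(a, b)]

    def report_result(self, fitness, parameters):
        # No iterations: a result is inserted whenever it arrives,
        # and only if it improves the population.
        if len(self.population) < self.size:
            self.population.append((fitness, parameters))
            return
        worst = min(self.population, key=lambda fp: fp[0])
        if fitness > worst[0]:
            self.population.remove(worst)
            self.population.append((fitness, parameters))
```

Because request_work never blocks on outstanding results, any number of volunteered hosts can be kept busy regardless of how slowly individual results are returned.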

Asynchronous Search Architecture
[Diagram: BOINC clients (workers performing fitness evaluation) request and receive work units; reported results are validated and assimilated into the population, which stores each individual with its fitness alongside a pool of unevaluated individuals. New work is generated when the queue of work units ready to send drops below 500.]

Genetic Search
- Generate an initial random population.
- Iteratively generate new populations:
  - The N best individuals survive through 'selection'.
  - M individuals are mutated.
  - O individuals are generated through 'recombination'.

Genetic Search Example
Optimize the sum of squares: f(p_i) = p_i[0]^2 + p_i[1]^2 + p_i[2]^2

Initial population:
  f(p_i)  p_i
  25      0, 4, -3
  14      2, 3, -1
  5       0, 1, -2
  13      -2, 0, 3
  26      -3, 1, -4

Iteration 1 (selection: 1 best; mutation: 1 random; recombination: average 3 pairs):
  f(p_i)  p_i
  5       0, 1, -2
  9       2, -2, -1
  6.75    -2.5, .5, -.5
  12.5    0, 2.5, -2.5
  10.5    -.5, 2, -2.5

Iteration 2:
  f(p_i)  p_i
  5       0, 1, -2
  10      0, 1, 3
  1.1875  -.25, -.75, -.75
  3.625   .75, 0, -1.75
  4.6875  -1.25, 0.25, -1.75
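
A small sketch reproducing this scheme in code (which pairs are averaged and how the mutated individual is produced are assumptions; the slide only fixes the counts):

```python
import random

def fitness(p):
    # Sum of squares: f(p) = p[0]^2 + p[1]^2 + p[2]^2
    return sum(x * x for x in p)

def gs_iteration(population, n_best=1, n_mutate=1, n_recombine=3):
    ranked = sorted(population, key=fitness)
    survivors = ranked[:n_best]                       # selection
    mutants = [[random.uniform(-4, 4) for _ in range(3)]
               for _ in range(n_mutate)]              # mutation
    children = []
    for _ in range(n_recombine):                      # recombination
        a, b = random.sample(population, 2)
        children.append([(x + y) / 2.0 for x, y in zip(a, b)])
    return survivors + mutants + children

population = [[0, 4, -3], [2, 3, -1], [0, 1, -2], [-2, 0, 3], [-3, 1, -4]]
for _ in range(2):
    population = gs_iteration(population)
print(sorted(fitness(p) for p in population))
```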

Alternate Recombination
Double Shot: two parents generate three children:
- the average of the parents,
- a point outside the less fit parent, equidistant to that parent and the average,
- a point outside the more fit parent, equidistant to that parent and the average.
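
A sketch of the double shot operator under one reading of 'equidistant': each outside child is the average reflected across its parent, so it lands outside the parent at the same distance from it as the average (this placement is an assumption):

```python
def double_shot(more_fit, less_fit):
    """Two parents (lists of floats) generate three children."""
    avg = [(a + b) / 2.0 for a, b in zip(more_fit, less_fit)]
    # Reflect the average across each parent to land outside it.
    outside_less = [2 * p - a for p, a in zip(less_fit, avg)]
    outside_more = [2 * p - a for p, a in zip(more_fit, avg)]
    return [avg, outside_less, outside_more]
```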

Alternate Recombination (2)
Randomized Simplex: N parents generate one or more children, placed randomly along the line through the worst parent and the centroid (average) of the remaining parents.
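
A sketch using the range quoted on the 'Search Parameters' slide at the end of this deck, which generates points between -1.5 * (worst - centroid) and 1.5 * (worst - centroid):

```python
import random

def randomized_simplex(parents, fitness, limit=1.5):
    """N parents (lists of floats) generate one child (fitness is maximized)."""
    ranked = sorted(parents, key=fitness)    # ascending: worst parent first
    worst, rest = ranked[0], ranked[1:]
    centroid = [sum(col) / len(rest) for col in zip(*rest)]
    # Random point along the worst-to-centroid line, within the limit.
    r = random.uniform(-limit, limit)
    return [c + r * (w - c) for w, c in zip(worst, centroid)]
```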

Steady State and Asynchronous GS
Steady-state GS is less parallel than classical GS:
- Generate an initial random population.
- Randomly choose mutation or recombination to generate a new individual.
- If the new individual improves the population, insert it and remove the worst member.
We modify this approach for asynchronous GS:
- Randomly choose mutation or recombination to generate new individuals for work requests.
- When fitness is reported, insert members if they improve the population.

Asynchronous vs Iterative Genetic Search

Particle Swarm Optimization
- Particles 'fly' around the search space.
- They move according to their previous velocity and are pulled toward the best position found globally and the best position each has found locally.
- Analogies: cognitive intelligence (local best knowledge) and social intelligence (global best knowledge).

Particle Swarm Optimization
PSO update:
  v_i(t+1) = w * v_i(t) + c1 * r1 * (l_i - p_i(t)) + c2 * r2 * (g - p_i(t))
  p_i(t+1) = p_i(t) + v_i(t+1)
where:
  w, c1, c2 = constants
  r1, r2 = random floats between 0 and 1
  v_i(t) = velocity of particle i at iteration t
  p_i(t) = position of particle i at iteration t
  l_i = best position found by particle i
  g = global best position found by all particles
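
A direct transcription of this update for one particle (the constant values below are common defaults, not necessarily those used in the experiments):

```python
import random

def pso_update(p, v, l, g, w=0.7, c1=2.0, c2=2.0):
    """One velocity/position update; p, v, l, g are lists of floats."""
    new_v, new_p = [], []
    for j in range(len(p)):
        r1, r2 = random.random(), random.random()
        vj = w * v[j] + c1 * r1 * (l[j] - p[j]) + c2 * r2 * (g[j] - p[j])
        new_v.append(vj)
        new_p.append(p[j] + vj)
    return new_p, new_v
```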

Particle Swarm Optimization (Example)
[Diagram: a particle's possible new positions as the sum of the inertia term w * v_i(t) applied to its current position p_i(t), the pull c1 * (l_i - p_i(t)) toward its local best, and the pull c2 * (g - p_i(t)) toward the global best.]

Differential Evolution (In Brief)
Many variations: best/n/bin, rand/n/bin, best/n/exp, rand/n/exp, current/n/bin, current/n/exp
In general: perform binary or exponential recombination between the current individual and another individual modified by a scaled difference between n pairs of other individuals.

Differential Evolution (Details)
DE (best/1/bin):
  p_i,j(t+1) = g_j(t) + c * (p_r1,j(t) - p_r2,j(t))   if r3 == j or r4 < cr
             = p_i,j(t)                               otherwise
  If f(p_i(t+1)) < f(p_i(t)), then p_i(t+1) = p_i(t) (keep the old member when the trial is worse; fitness is maximized).
where:
  p_i,j(t) = jth parameter of the ith member of the population at iteration t
  g_j = jth parameter of the global best member at iteration t
  c = scaling factor
  r1, r2 = random ints between 0 and the population size, r1 != r2
  r3 = random int between 0 and the number of parameters
  r4 = random float between 0 and 1
  cr = crossover rate
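
A sketch of this trial-vector generation (best/1/bin, fitness maximized; the c and cr defaults are illustrative):

```python
import random

def de_best_1_bin(population, fitnesses, i, c=0.5, cr=0.9):
    """Generate a trial vector for population[i]."""
    n = len(population[i])
    best = max(range(len(population)), key=lambda k: fitnesses[k])
    g = population[best]
    r1, r2 = random.sample([k for k in range(len(population)) if k != i], 2)
    r3 = random.randrange(n)     # guarantees at least one mutated parameter
    trial = []
    for j in range(n):
        if j == r3 or random.random() < cr:
            trial.append(g[j] + c * (population[r1][j] - population[r2][j]))
        else:
            trial.append(population[i][j])
    return trial   # replaces population[i] only if its fitness is not worse
```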

Asynchronous DE & PSO
Note that generating new positions does not necessarily require the fitness of the previous position:
1. Generate new particle or individual positions to fill the work queue.
2. Update local and global bests on results:
   - DE: if a result improves an individual, update that individual's position.
   - PSO: if a result improves a particle's local best, update the local best and set the particle's position and velocity from the result.
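
A sketch of the PSO result handler this implies, using plain dicts for the particle and swarm state (hypothetical structure; fitness is maximized):

```python
def report_pso_result(particle, swarm, fitness, position, velocity):
    """Asynchronously fold a reported evaluation back into the swarm."""
    if fitness > particle["best_fitness"]:
        # The result improves the particle's local best: adopt it as the
        # new local best and move the particle (position and velocity).
        particle.update(best_fitness=fitness, best=position,
                        position=position, velocity=velocity)
    if fitness > swarm["best_fitness"]:
        swarm.update(best_fitness=fitness, best=position)
```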

Optimization Method Comparison
- Tracked the best fitness across 5 separate searches for each combination of search parameters.
- Used Sagittarius stripe 22: 100,789 observed stars, 3 streams, 20 optimization parameters.

Optimization Method Comparison
[Plots: Genetic Search (simplex & mutation), Particle Swarm, DE best/n/bin, DE rand/n/bin]

Latency Effects
- Is BOINC a good platform for optimization?
- Fast turnaround is required to keep populations evolving.
- Many clients are slow -- are these resources wasted?

Operator Examination (1) - BlueGene

Operator Examination (2) - BOINC

Operator Examination (3) - BOINC

Operator Examination (4) - BOINC

Operator Examination (5) - BOINC

Operator Examination (6) - BOINC

Operator Examination (7) - BOINC

Partial Verification
- Only results that will be inserted into the population need to be verified, yet BOINC verifies every work unit.
- Partial verification: ignore false negatives (results that won't be inserted) and verify only the results which potentially improve the search.

Partial Verification Strategies (2)
- Required combining assimilation and validation, because slow validation of good results slows convergence.
- Strategy: queue potentially good results, then randomly decide whether to send out verification or optimization work at a given verification rate. Prematurely terminate unvalidated results if better results are received -- particularly beneficial for DE & PSO.
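
A sketch of this strategy, assuming the verification rate v used on the following slides and the hypothetical request_work from the earlier server sketch:

```python
import random

def next_work(pending, search, v=0.3):
    """Serve either a verification of a queued result or new search work."""
    if pending and random.random() < v:
        return ("verify", pending[0]["parameters"])   # re-evaluate a queued result
    return ("optimize", search.request_work())        # normal optimization work

def on_improved_result(pending, new_fitness):
    # Prematurely terminate queued, unvalidated results that a newer,
    # better result has made irrelevant (fitness is maximized).
    pending[:] = [r for r in pending if r["fitness"] > new_fitness]
```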

Limiting Redundancy (Genetic Search)
[Plots: Genetic Search with v = 0.3, 0.6, and 0.9]

Limiting Redundancy (PSO)
[Plots: Particle Swarm with v = 0.3, 0.6, and 0.9]

Limiting Redundancy (DE best/n/bin)
[Plots: DE best/n/bin with v = 0.3, 0.6, and 0.9]

Limiting Redundancy (DE rand/n/bin)
[Plots: DE rand/n/bin with v = 0.3, 0.6, and 0.9]

Conclusions
- BOINC is good for optimization.
- BOINC's redundancy is not optimal for optimization.
- Global optimization requires lots of tuning.
- Verifying results quickly can be especially important for optimization.

Future Work
- DNA@Home: discrete parameter optimization
- A generic optimization framework for BOINC
- Comparing limited verification to BOINC's verification
- Adaptive verification strategies
- Meta-heuristics simulation with benchmark test functions

Questions?

Thanks!
http://wcl.cs.rpi.edu
http://milkyway.cs.rpi.edu
Work partially supported by: NSF AST No. 0607618, NSF IIS No. 0612213, NSF MRI No. 0420703, NSF CAREER CNS Award No. 0448407

Extra Slides

Search Parameters
- Population size: 300
- Mutation rate: 0.3
- Simplex: 1 child, 2 to 5 parents, with points generated between -1.5 * (worst - centroid) and 1.5 * (worst - centroid)

Asynchronous GS-Simplex on BlueGene

Asynchronous GS-Simplex on BOINC

Simplex Operator Analysis
- Even with a long time to report, results can still improve the population.
- Generation near the reflection has the highest insert rate.
- Generation near the centroid provides the most population improvement for fast report times.
- Generation near the reflection provides the most population improvement for long report times.

Simplex Operator Improvement (2)

Simplex Operator Improvement (3)