Computational Intelligence


Computational Intelligence
Winter Term 2016/17
Prof. Dr. Günter Rudolph
Lehrstuhl für Algorithm Engineering (LS 11)
Fakultät für Informatik, TU Dortmund

Design of Evolutionary Algorithms

Three tasks:
1. Choice of an appropriate problem representation.
2. Choice / design of variation operators acting in the problem representation.
3. Choice of strategy parameters (includes initialization).

ad 1) different "schools":
(a) operate on a binary representation and define a genotype/phenotype mapping
    + can use standard algorithms
    – mapping may induce an unintentional bias in the search
(b) no doctrine: use the "most natural" representation
    – must design variation operators for the specific representation
    + if the design is done properly, no bias in the search

Design of Evolutionary Algorithms

ad 1a) genotype-phenotype mapping

original problem f: X → ℝ^d
scenario: no standard algorithm for the search space X is available

    B^n --g--> X --f--> ℝ^d

standard EA performs variation on binary strings b ∈ B^n
fitness evaluation of an individual b via (f ∘ g)(b) = f(g(b)),
where g: B^n → X is the genotype-phenotype mapping
selection operation is independent of the representation
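
A minimal Python sketch of this setup (the names decode, f and evaluate are illustrative, not from the lecture): the EA varies bit strings, and the objective is always evaluated through the mapping g.

import random

def decode(bits):                 # g: B^n -> X, here X = [0, 1] as a toy example
    n = len(bits)
    return int("".join(map(str, bits)), 2) / (2 ** n - 1)

def f(x):                         # original objective f: X -> R (toy example)
    return (x - 0.3) ** 2

def evaluate(bits):               # fitness of an individual b via (f o g)(b)
    return f(decode(bits))

b = [random.randint(0, 1) for _ in range(10)]
print(b, evaluate(b))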

Design of Evolutionary Algorithms

Genotype-phenotype mapping B^n → [L, R] ⊂ ℝ

Standard (binary) encoding for b ∈ B^n → problem: Hamming cliffs

genotype:   000 001 010 011 100 101 110 111
phenotype:    0   1   2   3   4   5   6   7      (L = 0, R = 7, n = 3)

Adjacent phenotypes may differ in 1, 2 or even 3 bits of their genotypes;
e.g. the neighbors 3 (011) and 4 (100) differ in all 3 bits: a Hamming cliff.
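
A small sketch in Python (illustrative helper names) that decodes the standard encoding and exposes the cliffs via the Hamming distance of adjacent phenotypes:

def std_decode(bits):
    """Standard binary encoding B^n -> {0, ..., 2^n - 1}."""
    return int("".join(map(str, bits)), 2)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

enc = {v: [int(c) for c in format(v, "03b")] for v in range(8)}
# adjacent phenotypes 3 and 4 sit on a 3-bit Hamming cliff: 011 vs 100
print(hamming(enc[3], enc[4]))   # -> 3
print(hamming(enc[1], enc[2]))   # -> 2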

Design of Evolutionary Algorithms

Genotype-phenotype mapping B^n → [L, R] ⊂ ℝ

Gray encoding: let a ∈ B^n be standard encoded. Then
    b_i = a_i              if i = 1
    b_i = a_{i−1} ⊕ a_i    if i > 1        (⊕ = XOR)

genotype:   000 001 011 010 110 111 101 100
phenotype:    0   1   2   3   4   5   6   7

OK, no Hamming cliffs any longer:
small changes in phenotype "lead to" small changes in genotype.
But since we consider evolution in terms of Darwin (not Lamarck), we need the reverse:
small changes in genotype should lead to small changes in phenotype!
but: the 1-bit change 000 → 100 jumps from phenotype 0 to phenotype 7.
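
A sketch of Gray encoding/decoding in Python (illustrative helper names), reproducing the table above and the 1-bit jump from 0 to 7:

def gray_encode(a):
    """Standard-encoded bits a -> Gray code b: b1 = a1, bi = a(i-1) XOR ai."""
    return [a[0]] + [a[i - 1] ^ a[i] for i in range(1, len(a))]

def gray_decode(b):
    """Inverse mapping: a1 = b1, ai = a(i-1) XOR bi."""
    a = [b[0]]
    for i in range(1, len(b)):
        a.append(a[-1] ^ b[i])
    return a

def to_int(bits):
    return int("".join(map(str, bits)), 2)

# neighboring phenotypes always differ in exactly one Gray bit ...
print([to_int(gray_decode([int(c) for c in g])) for g in
       ["000", "001", "011", "010", "110", "111", "101", "100"]])   # 0 .. 7
# ... but a single bit flip can still cause a large phenotype jump:
print(to_int(gray_decode([1, 0, 0])))   # 000 -> 100 decodes to 7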

Design of Evolutionary Algorithms

Genotype-phenotype mapping B^n → permutations (example only)

e.g. standard encoding for b ∈ B^n; individual:

genotype:  010 101 111 000 110 001 100
index:       1   2   3   4   5   6   7

Consider each index and its associated genotype entry as a unit / record / struct;
sort the units with respect to the genotype value; the old indices yield a permutation:

genotype:   000 001 010 100 101 110 111
old index:    3   5   7   1   6   4   2   = permutation
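
A small sketch of this sort-based decoding in Python (hypothetical helper name; grouping into 3-bit keys as in the example above, ties broken by index):

import random

def bits_to_permutation(bits, group=3):
    # split the bit string into groups and interpret each group as an integer key
    keys = [int("".join(map(str, bits[i:i + group])), 2)
            for i in range(0, len(bits), group)]
    # sort the indices by key; the old indices form the permutation (1-based as on the slide)
    order = sorted(range(len(keys)), key=lambda i: (keys[i], i))
    return [i + 1 for i in order]

genotype = [random.randint(0, 1) for _ in range(21)]   # 7 groups of 3 bits
print(bits_to_permutation(genotype))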

Design of Evolutionary Algorithms

ad 1a) genotype-phenotype mapping

typically required: strong causality
→ small changes in an individual lead to small changes in its fitness
→ small changes in genotype should lead to small changes in phenotype

but: how to find a genotype-phenotype mapping with that property?

necessary conditions:
1) g: B^n → X can be computed efficiently (otherwise it is pointless)
2) g: B^n → X is surjective (otherwise we might miss the optimal solution)
3) g: B^n → X preserves closeness (otherwise strong causality is endangered),
   i.e., with a metric d(·, ·) on B^n and a metric dX(·, ·) on X:
   ∀ x, y, z ∈ B^n : d(x, y) ≤ d(x, z)  ⇒  dX(g(x), g(y)) ≤ dX(g(x), g(z))

Design of Evolutionary Algorithms

ad 1b) use the "most natural" representation

typically required: strong causality
→ small changes in an individual lead to small changes in its fitness
→ need variation operators that obey that requirement

but: how to find variation operators with that property?
⇒ need design guidelines ...

Design of Evolutionary Algorithms

ad 2) design guidelines for variation operators

reachability:
  every x ∈ X should be reachable from an arbitrary x0 ∈ X after a finite number of
  repeated variations, with positive probability bounded away from 0

unbiasedness:
  unless knowledge about the problem has been gathered, the variation operator should
  not favor particular subsets of solutions
  ⇒ formally: maximum entropy principle

control:
  the variation operator should have parameters affecting the shape of its distribution;
  known from theory: weaken the variation strength when approaching the optimum

Design of Evolutionary Algorithms

ad 2) design guidelines for variation operators in practice

binary search space X = B^n
variation by k-point or uniform crossover and subsequent mutation

a) reachability:
regardless of the output of crossover, we can move from x ∈ B^n to y ∈ B^n in one
mutation step with probability
    p(x, y) = p_m^{H(x,y)} · (1 − p_m)^{n − H(x,y)}  >  0
(bit-flip probability p_m ∈ (0,1) per position), where H(x, y) is the Hamming distance
between x and y. Since min{ p(x, y) : x, y ∈ B^n } = ε > 0, we are done.
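
A sketch of bit-flip mutation and the transition probability above in Python (pm denotes the per-bit flip probability; illustrative names):

import random

def mutate(x, pm):
    """Bit-flip mutation: flip each bit independently with probability pm."""
    return [b ^ (random.random() < pm) for b in x]

def transition_prob(x, y, pm):
    """P(mutation turns x into y) = pm^H(x,y) * (1-pm)^(n-H(x,y))."""
    n = len(x)
    h = sum(a != b for a, b in zip(x, y))
    return pm ** h * (1 - pm) ** (n - h)

x = [0, 1, 1, 0, 1]
print(mutate(x, pm=1 / len(x)))
print(transition_prob(x, [1, 1, 1, 0, 0], pm=0.2))   # > 0 for every pair x, y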

Design of Evolutionary Algorithms

b) unbiasedness:
don't prefer any direction or subset of points without reason
⇒ use a maximum entropy distribution for sampling!

properties:
- distributes the probability mass as uniformly as possible
- additional knowledge can be included as constraints:
  → under the given constraints, sample as uniformly as possible

Design of Evolutionary Algorithms

Formally:

Definition:
Let X be a discrete random variable (r.v.) with p_k = P{ X = x_k } for some index set K.
The quantity
    H(X) = − ∑_{k ∈ K} p_k · log p_k
is called the entropy of the distribution of X. If X is a continuous r.v. with p.d.f. f_X(·),
then the entropy is given by
    H(X) = − ∫ f_X(x) · log f_X(x) dx.
The distribution of a random variable X for which H(X) is maximal is termed a
maximum entropy distribution. ■
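
The discrete entropy as a short Python function (natural logarithm; a sketch for illustration):

import math

def entropy(p):
    """Entropy of a discrete distribution p (convention: 0 * log 0 = 0)."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))   # uniform: log(4) ~ 1.386
print(entropy([0.7, 0.1, 0.1, 0.1]))       # less uniform -> smaller entropy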

Excursion: Maximum Entropy Distributions

Knowledge available:
Discrete distribution with support { x1, x2, …, xn } with x1 < x2 < … < xn < ∞

⇒ leads to a nonlinear constrained optimization problem:
    maximize  H(p) = − ∑_{k=1}^{n} p_k · log p_k
    s.t.      ∑_{k=1}^{n} p_k = 1

solution: via Lagrange (find a stationary point of the Lagrangian function)

Excursion: Maximum Entropy Distributions

Lagrangian function:
    L(p; a) = − ∑_{k=1}^{n} p_k · log p_k + a · ( ∑_{k=1}^{n} p_k − 1 )

partial derivatives:
    ∂L/∂p_k = − log p_k − 1 + a = 0   ⇒   p_k = e^{a−1}   (the same for all k)
    ∂L/∂a  = ∑_{k=1}^{n} p_k − 1 = 0   ⇒   p_k = 1/n

⇒ uniform distribution
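
A quick numerical cross-check (a sketch using scipy.optimize, not part of the lecture): maximizing the entropy subject only to normalization indeed returns the uniform distribution.

import numpy as np
from scipy.optimize import minimize

n = 5

def neg_entropy(p):
    p = np.clip(p, 1e-12, None)        # avoid log(0)
    return np.sum(p * np.log(p))

res = minimize(neg_entropy,
               x0=np.random.dirichlet(np.ones(n)),
               constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1}],
               bounds=[(0, 1)] * n)
print(np.round(res.x, 4))              # ~ [0.2, 0.2, 0.2, 0.2, 0.2]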

Excursion: Maximum Entropy Distributions

Knowledge available:
Discrete distribution with support { 1, 2, …, n } with p_k = P{ X = k } and E[X] = μ

⇒ leads to a nonlinear constrained optimization problem:
    maximize  H(p) = − ∑_{k=1}^{n} p_k · log p_k
    s.t.      ∑_{k=1}^{n} p_k = 1   and   ∑_{k=1}^{n} k · p_k = μ

solution: via Lagrange (find a stationary point of the Lagrangian function)

Excursion: Maximum Entropy Distributions

Lagrangian function:
    L(p; a, b) = − ∑_{k=1}^{n} p_k · log p_k + a · ( ∑_{k=1}^{n} p_k − 1 ) + b · ( ∑_{k=1}^{n} k · p_k − μ )

partial derivatives:
    ∂L/∂p_k = − log p_k − 1 + a + b·k = 0   ⇒   p_k = e^{a−1} · (e^{b})^{k}

(continued on next slide)

Excursion: Maximum Entropy Distributions

Set q = e^{b}; the normalization constraint ∑_{k=1}^{n} p_k = 1 then yields

    p_k = q^k / ∑_{j=1}^{n} q^j        ⇒ discrete Boltzmann distribution

The value of q depends on μ via the third condition (the mean constraint):

    ∑_{k=1}^{n} k · q^k / ∑_{j=1}^{n} q^j = μ
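
A sketch in Python (illustrative names) that builds p_k ∝ q^k on {1, …, n} and finds q for a prescribed mean μ by bisection; this works because the mean is monotonically increasing in q:

def boltzmann(q, n):
    """p_k proportional to q^k on the support {1, ..., n}."""
    w = [q ** k for k in range(1, n + 1)]
    s = sum(w)
    return [wk / s for wk in w]

def mean(p):
    return sum(k * pk for k, pk in enumerate(p, start=1))

def solve_q(mu, n, lo=1e-9, hi=1e9, iters=200):
    """Bisection on q > 0 (log scale, since q spans many orders of magnitude)."""
    for _ in range(iters):
        mid = (lo * hi) ** 0.5
        if mean(boltzmann(mid, n)) < mu:
            lo = mid
        else:
            hi = mid
    return (lo * hi) ** 0.5

n = 9
for mu in (2, 5, 8):
    print(mu, round(solve_q(mu, n), 4))   # mu = 5 = (n+1)/2 gives q ~ 1, i.e. uniform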

Excursion: Maximum Entropy Distributions

[Figure: Boltzmann distributions on {1, …, 9} for μ = 2, 3, 4, 5, 6, 7, 8 (n = 9);
 for μ = 5 the distribution specializes to the uniform distribution, as expected.]

Excursion: Maximum Entropy Distributions

Knowledge available:
Discrete distribution with support { 1, 2, …, n } with E[X] = μ and V[X] = σ²

⇒ leads to a nonlinear constrained optimization problem:
    maximize  H(p) = − ∑_{k=1}^{n} p_k · log p_k
    s.t.  ∑_{k=1}^{n} p_k = 1,   ∑_{k=1}^{n} k · p_k = μ   and   ∑_{k=1}^{n} (k − μ)² · p_k = σ²

solution:
in principle via Lagrange (find a stationary point of the Lagrangian function),
but very complicated analytically, if possible at all
⇒ consider special cases only

note: the constraints are linear equations in the p_k

Excursion: Maximum Entropy Distributions

Special case: n = 3 with E[X] = 2 and V[X] = σ²

The linear constraints uniquely determine the distribution:
    I.   p1 + p2 + p3 = 1
    II.  1·p1 + 2·p2 + 3·p3 = 2
    III. (1−2)²·p1 + (2−2)²·p2 + (3−2)²·p3 = σ²   (insertion of μ = 2 in III gives p1 + p3 = σ²)

    II − I:   p2 + 2·p3 = 1
    I − III:  p2 = 1 − σ²
    ⇒ p3 = σ²/2 and, from III, p1 = σ²/2

⇒ p1 = p3 = σ²/2, p2 = 1 − σ²:
   unimodal for σ² < 2/3, uniform for σ² = 2/3, bimodal for σ² > 2/3
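
A quick check of this special case in Python (a sketch; valid for 0 < σ² ≤ 1):

def maxent_n3(sigma2):
    """Max-entropy distribution on {1,2,3} with mean 2 and variance sigma2."""
    p1 = p3 = sigma2 / 2
    p2 = 1 - sigma2
    return p1, p2, p3

for s2 in (0.4, 2 / 3, 0.9):
    p = maxent_n3(s2)
    m = sum(k * pk for k, pk in enumerate(p, 1))
    v = sum((k - m) ** 2 * pk for k, pk in enumerate(p, 1))
    print(s2, [round(x, 3) for x in p], round(m, 3), round(v, 3))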

Excursion: Maximum Entropy Distributions

Knowledge available:
Discrete distribution with unbounded support { 0, 1, 2, … } and E[X] = μ

⇒ leads to an infinite-dimensional nonlinear constrained optimization problem:
    maximize  H(p) = − ∑_{k=0}^{∞} p_k · log p_k
    s.t.      ∑_{k=0}^{∞} p_k = 1   and   ∑_{k=0}^{∞} k · p_k = μ

solution: via Lagrange (find a stationary point of the Lagrangian function)

Excursion: Maximum Entropy Distributions

Lagrangian function:
    L(p; a, b) = − ∑_{k=0}^{∞} p_k · log p_k + a · ( ∑_{k=0}^{∞} p_k − 1 ) + b · ( ∑_{k=0}^{∞} k · p_k − μ )

partial derivatives:
    ∂L/∂p_k = − log p_k − 1 + a + b·k = 0   ⇒   p_k = e^{a−1} · (e^{b})^{k}

(continued on next slide)

Excursion: Maximum Entropy Distributions   insert  set and insists that  geometrical distribution for it remains to specify q; to proceed recall that G. Rudolph: Computational Intelligence ▪ Winter Term 2016/17 23

Excursion: Maximum Entropy Distributions

The value of q depends on μ via the third condition (the mean constraint):

    E[X] = ∑_{k=0}^{∞} k · (1 − q) · q^k = (1 − q) · q / (1 − q)² = q / (1 − q) = μ

⇒ q = μ / (1 + μ)
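
In Python (a sketch): compute q from μ and the resulting geometric probabilities:

def geometric_maxent(mu, kmax=8):
    """Max-entropy distribution on {0,1,2,...} with mean mu: p_k = (1-q) q^k, q = mu/(1+mu)."""
    q = mu / (1 + mu)
    return [(1 - q) * q ** k for k in range(kmax + 1)]

print([round(x, 4) for x in geometric_maxent(mu=3)])
# empirical check of the mean (long truncation of the infinite sum):
print(sum(k * pk for k, pk in enumerate(geometric_maxent(3, kmax=2000))))   # ~ 3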

Excursion: Maximum Entropy Distributions

[Figure: geometric distributions with E[X] = μ for μ = 1, 2, …, 7;
 p_k shown for k = 0, 1, …, 8 only.]

Excursion: Maximum Entropy Distributions

Overview:

support { 1, 2, …, n }                  ⇒ discrete uniform distribution
  ... and require E[X] = μ              ⇒ Boltzmann distribution
  ... and require V[X] = σ²             ⇒ N.N. (not the Binomial distribution)

support ℕ (alone)                       ⇒ not defined!
  ... and require E[X] = μ              ⇒ geometric distribution
  ... and require V[X] = σ²             ⇒ ?

support ℤ (alone)                       ⇒ not defined!
  ... and require E[|X|] = θ            ⇒ bi-geometric distribution (discrete Laplace distr.)
  ... and require E[|X|²] = σ²          ⇒ N.N. (discrete Gaussian distr.)

positive definite: x ≠ 0 : x‘Cx > 0 Excursion: Maximum Entropy Distributions support [a,b]  R  uniform distribution support R+ with E[X] =   Exponential distribution support R with E[X] = , V[X] = 2  normal / Gaussian distribution N(, 2) support Rn with E[X] =  and Cov[X] = C  multinormal distribution N(, C) expectation vector  Rn covariance matrix  Rn,n positive definite: x ≠ 0 : x‘Cx > 0 G. Rudolph: Computational Intelligence ▪ Winter Term 2016/17 27

Excursion: Maximum Entropy Distributions

What about maximum entropy for permutation distributions?
→ uniform distribution on all possible permutations

set v[j] = j for j = 1, 2, ..., n
for i = n to 1 step -1
    draw k uniformly at random from { 1, 2, ..., i }
    swap v[i] and v[k]
endfor

generates a permutation uniformly at random in Θ(n) time

Guideline:
Only if you know something about the problem a priori, or if you have learnt something
about the problem during the search, include that knowledge in the search / mutation
distribution (via constraints!)
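
The pseudocode above is the classic Fisher-Yates shuffle; a direct Python translation (0-based list indexing) could look like this:

import random

def random_permutation(n):
    """Fisher-Yates shuffle: uniform over all n! permutations, Theta(n) time."""
    v = list(range(1, n + 1))          # v[j] = j
    for i in range(n - 1, 0, -1):      # i = n-1 down to 1 (0-based)
        k = random.randint(0, i)       # k uniform in {0, ..., i}
        v[i], v[k] = v[k], v[i]        # swap v[i] and v[k]
    return v

print(random_permutation(7))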

Design of Evolutionary Algorithms

ad 2) design guidelines for variation operators in practice

integer search space X = ℤⁿ

requirements: a) reachability, b) unbiasedness, c) control

- every recombination results in some z ∈ ℤⁿ
- mutation of z may then lead to any z* ∈ ℤⁿ with positive probability in one step

ad a) the support of the mutation should be ℤⁿ
ad b) need a maximum entropy distribution over the support ℤⁿ
ad c) control variability by a parameter
      → formulate it as a constraint of the maximum entropy distribution

Design of Evolutionary Algorithms

ad 2) design guidelines for variation operators in practice, X = ℤⁿ

task: find a (symmetric) maximum entropy distribution over ℤ with E[ |Z| ] = θ > 0
⇒ need an analytic solution of a 1-dimensional, nonlinear optimization problem with constraints!

    H(p) = − ∑_{k ∈ ℤ} p_k · log p_k  →  max!
    s.t.  p_k = p_{−k}  for all k ∈ ℤ      (symmetry w.r.t. 0)
          ∑_{k ∈ ℤ} p_k = 1                (normalization)
          ∑_{k ∈ ℤ} |k| · p_k = θ          (control of "spread")
          p_k ≥ 0  for all k ∈ ℤ           (nonnegativity)

Design of Evolutionary Algorithms

result: a random variable Z with support ℤ whose probability distribution is symmetric
w.r.t. 0, unimodal, with spread manageable by q, and which has maximum entropy ■

generation of pseudo random numbers:
    Z = G1 − G2,
where G1 and G2 are stochastically independent, identically (geometrically) distributed
random variables.
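
A sketch of the generator in Python, assuming (as in Rudolph's PPSN 1994 construction) that G1 and G2 are i.i.d. geometric random variables on {0, 1, 2, …} with parameter q:

import random

def geometric(q):
    """Number of successes before the first failure; P(G = k) = (1-q) q^k."""
    g = 0
    while random.random() < q:
        g += 1
    return g

def two_sided_geometric(q):
    """Z = G1 - G2: symmetric, unimodal, maximum entropy distribution on Z."""
    return geometric(q) - geometric(q)

sample = [two_sided_geometric(0.6) for _ in range(10000)]
print(sum(abs(z) for z in sample) / len(sample))   # empirical E|Z|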

Design of Evolutionary Algorithms

[Figure: probability distributions of Z for different mean step sizes E|Z| = θ.]

Design of Evolutionary Algorithms

[Figure: probability distributions of Z for different mean step sizes E|Z| = θ (continued).]

Design of Evolutionary Algorithms

How to control the spread?
We must be able to adapt q ∈ (0,1) for generating Z with variable E|Z| = θ!

self-adaptation of q directly in the open interval (0,1)?
better: make the mean step size adjustable! Use a mapping ℝ+ → (0,1):
θ is adjustable by mutative self-adaptation → get q from θ,
like the mutative step size control of σ in EAs with search space ℝⁿ!
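
One possible realization of this mapping (a sketch, not taken verbatim from the slides: assuming p_k ∝ q^{|k|}, one gets E|Z| = 2q/(1 − q²), and solving for q yields q = (√(1 + θ²) − 1)/θ; the self-adaptation of θ uses the usual log-normal rule with an illustrative learning rate):

import math, random

def q_from_theta(theta):
    """Map the mean step size theta = E|Z| > 0 to q in (0,1).
    Derived from E|Z| = 2q / (1 - q^2) for p_k proportional to q^|k|
    (assumed parameterization, cf. Rudolph, PPSN 1994)."""
    return (math.sqrt(1.0 + theta * theta) - 1.0) / theta

def sample_Z(theta):
    """Draw Z = G1 - G2 with i.i.d. geometric G's, so that E|Z| ~ theta."""
    q = q_from_theta(theta)
    def geom():
        g = 0
        while random.random() < q:
            g += 1
        return g
    return geom() - geom()

theta = 2.0
theta *= math.exp(0.3 * random.gauss(0.0, 1.0))    # mutative self-adaptation of theta
print(round(q_from_theta(theta), 4), [sample_Z(theta) for _ in range(10)])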

Design of Evolutionary Algorithms

n-dimensional generalization:
random vector Z = (Z1, Z2, ..., Zn) with Zi = G1,i − G2,i (stochastically independent);
the parameter q is equal for all G1,i, G2,i

[Figure: example of the resulting distribution for n = 2.]

Design of Evolutionary Algorithms

n-dimensional generalization:
⇒ the n-dimensional distribution is symmetric w.r.t. the ℓ1 norm!
⇒ all random vectors with the same step length ‖z‖1 have the same probability!

Design of Evolutionary Algorithms

How to control E[ ‖Z‖1 ]?

    E[ ‖Z‖1 ] = E[ |Z1| + … + |Zn| ]          (by def.)
              = E[ |Z1| ] + … + E[ |Zn| ]     (linearity of E[·])
              = n · E[ |Z1| ]                 (identical distributions for the Zi)
              = n · θ

⇒ self-adaptation of the mean step size; calculate q from θ.

Design of Evolutionary Algorithms

(Rudolph, PPSN 1994)

Excursion: Maximum Entropy Distributions

ad 2) design guidelines for variation operators in practice

continuous search space X = ℝⁿ
- reachability
- unbiasedness
- control
⇒ leads to the CMA-ES!