Computational Intelligence Winter Term 2011/12 Prof. Dr. Günter Rudolph Lehrstuhl für Algorithm Engineering (LS 11) Fakultät für Informatik TU Dortmund

Lecture 11: Design of Evolutionary Algorithms
Three tasks:
1. Choice of an appropriate problem representation.
2. Choice / design of variation operators acting in the problem representation.
3. Choice of strategy parameters (includes initialization).
ad 1) different “schools“:
(a) operate on a binary representation and define a genotype/phenotype mapping
+ can use standard algorithms
– the mapping may induce an unintentional bias in the search
(b) no doctrine: use the “most natural” representation
– must design variation operators for the specific representation
+ if the design is done properly, there is no bias in the search

Design of Evolutionary Algorithms
ad 1a) genotype-phenotype mapping
original problem f: X → ℝ^d
scenario: no standard algorithm for search space X available
[diagram: B^n → X → ℝ^d, with g mapping B^n to X and f mapping X to ℝ^d]
standard EA performs variation on binary strings b ∈ B^n
fitness evaluation of individual b via (f ∘ g)(b) = f(g(b)), where g: B^n → X is the genotype-phenotype mapping
selection operation independent from the representation

Design of Evolutionary Algorithms
Genotype-Phenotype Mapping B^n → [L, R] ⊂ ℝ
● Standard encoding for b ∈ B^n → Problem: Hamming cliffs
Example (L = 0, R = 7, n = 3):
genotype:  000 001 010 011 100 101 110 111
phenotype:   0   1   2   3   4   5   6   7
bits flipped between neighboring phenotypes: 1, 2, 1, 3, 1, 2, 1
→ the step from 3 to 4 (011 → 100) changes all 3 bits: a Hamming cliff
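The Hamming-cliff effect is easy to reproduce; the following sketch (plain Python, function names chosen for illustration only) counts how many bits differ between the standard encodings of adjacent phenotype values for n = 3:

```python
def hamming(a, b):
    """Number of differing bits between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))

def standard_encode(value, n):
    """Standard binary encoding of an integer 0 <= value < 2**n."""
    return format(value, f"0{n}b")

n = 3
for v in range(2**n - 1):
    g1, g2 = standard_encode(v, n), standard_encode(v + 1, n)
    print(f"{g1} -> {g2}: phenotype step 1, genotype distance {hamming(g1, g2)}")
# The transition 011 -> 100 requires flipping all 3 bits: a Hamming cliff.
```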

Design of Evolutionary Algorithms
Genotype-Phenotype Mapping B^n → [L, R] ⊂ ℝ
● Gray encoding for b ∈ B^n
Let a ∈ B^n be standard encoded. Then
b_i = a_i, if i = 1
b_i = a_{i-1} ⊕ a_i, if i > 1   (⊕ = XOR)
OK, no Hamming cliffs any longer: small changes in the phenotype “lead to“ small changes in the genotype.
But since we consider evolution in terms of Darwin (not Lamarck), what we need is the converse: small changes in the genotype should lead to small changes in the phenotype!
but: the 1-bit change 000 → 100 in the genotype moves the phenotype from 0 to 7.
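A minimal sketch of the Gray encoding rule b_1 = a_1, b_i = a_{i-1} ⊕ a_i (helper names are illustrative): adjacent phenotypes now differ in exactly one genotype bit, while a single genotype bit flip can still cause a large phenotype jump.

```python
def gray_encode(a):
    """Gray code b from standard-encoded bit list a: b[0]=a[0], b[i]=a[i-1]^a[i]."""
    return [a[0]] + [a[i - 1] ^ a[i] for i in range(1, len(a))]

def gray_decode(b):
    """Inverse mapping: a[0]=b[0], a[i]=a[i-1]^b[i]."""
    a = [b[0]]
    for i in range(1, len(b)):
        a.append(a[-1] ^ b[i])
    return a

def to_bits(v, n):
    return [int(c) for c in format(v, f"0{n}b")]

n = 3
for v in range(2**n - 1):
    g1, g2 = gray_encode(to_bits(v, n)), gray_encode(to_bits(v + 1, n))
    assert sum(x != y for x, y in zip(g1, g2)) == 1   # no Hamming cliffs
# But one genotype bit flip can still change the phenotype a lot:
# 000 and 100 differ in one bit, yet decode to 0 and 7.
print(int("".join(map(str, gray_decode([1, 0, 0]))), 2))   # -> 7
```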

Design of Evolutionary Algorithms
Genotype-Phenotype Mapping B^{n·log(n)} → 𝒫_n (permutations)
● e.g. standard encoding: the genotype holds one (log n)-bit value per index
individual: consider each index and its associated genotype entry as a unit / record / struct;
sort the units with respect to their genotype values; the old indices then yield the permutation
(example only: genotype values paired with old indices, sorted by value = permutation)
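The sort-based decoding can be sketched as follows (a random-keys style mapping; the helper name is illustrative). Each index is paired with its genotype value; sorting by value and reading off the old indices yields a permutation:

```python
def decode_permutation(keys):
    """Map a list of genotype values (one per index) to a permutation:
    sort (value, index) records by value; the old indices form the permutation."""
    units = sorted(range(len(keys)), key=lambda i: keys[i])
    return [i + 1 for i in units]          # 1-based indices, as on the slide

# example only: genotype values for indices 1..5
print(decode_permutation([13, 2, 41, 7, 29]))   # -> [2, 4, 1, 5, 3]
```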

Design of Evolutionary Algorithms
ad 1a) genotype-phenotype mapping
typically required: strong causality
→ small changes in an individual lead to small changes in its fitness
→ small changes in the genotype should lead to small changes in the phenotype
but: how to find a genotype-phenotype mapping with that property?
necessary conditions:
1) g: B^n → X can be computed efficiently (otherwise it is senseless)
2) g: B^n → X is surjective (otherwise we might miss the optimal solution)
3) g: B^n → X preserves closeness (otherwise strong causality is endangered)
Let d(·, ·) be a metric on B^n and d_X(·, ·) be a metric on X.
∀ x, y, z ∈ B^n: d(x, y) ≤ d(x, z) ⇒ d_X(g(x), g(y)) ≤ d_X(g(x), g(z))

Design of Evolutionary Algorithms
ad 1b) use the “most natural“ representation
typically required: strong causality
→ small changes in an individual lead to small changes in its fitness
→ need variation operators that obey this requirement
but: how to find variation operators with that property?
⇒ need design guidelines ...

Design of Evolutionary Algorithms
ad 2) design guidelines for variation operators
a) reachability
every x ∈ X should be reachable from an arbitrary x₀ ∈ X after a finite number of repeated variations, with a probability bounded away from 0
b) unbiasedness
unless knowledge about the problem has been gathered, the variation operator should not favor particular subsets of solutions
⇒ formally: maximum entropy principle
c) control
the variation operator should have parameters affecting the shape of its distribution;
known from theory: weaken the variation strength when approaching the optimum

Design of Evolutionary Algorithms
ad 2) design guidelines for variation operators in practice
binary search space X = B^n
variation by k-point or uniform crossover and subsequent mutation
a) reachability:
regardless of the output of crossover, bitwise mutation with rate p_m ∈ (0, 1) can move from x ∈ B^n to y ∈ B^n in one step with probability
p(x, y) = p_m^{H(x,y)} · (1 − p_m)^{n − H(x,y)} > 0,
where H(x, y) is the Hamming distance between x and y.
Since min{ p(x, y) : x, y ∈ B^n } = ε > 0, we are done.
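A small numerical check of the reachability bound, assuming bitwise mutation with rate p_m (the value 1/n below is only a common default chosen for illustration, not prescribed by the slide):

```python
def transition_probability(x, y, p_m):
    """Probability that bitwise mutation with rate p_m turns bit string x into y."""
    n = len(x)
    h = sum(a != b for a, b in zip(x, y))          # Hamming distance H(x, y)
    return p_m**h * (1 - p_m)**(n - h)

n = 10
p_m = 1 / n
# Worst case: all n bits must flip; the probability is still strictly positive.
worst = transition_probability("0" * n, "1" * n, p_m)
print(worst)   # = p_m**n > 0, so every point is reachable in one mutation step
```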

Design of Evolutionary Algorithms
b) unbiasedness
do not prefer any direction or subset of points without reason
⇒ use a maximum entropy distribution for sampling!
properties:
- distributes the probability mass as uniformly as possible
- additional knowledge can be included as constraints:
→ under the given constraints, sample as uniformly as possible

Design of Evolutionary Algorithms
Formally:
Definition: Let X be a discrete random variable (r.v.) with p_k = P{ X = x_k } for some index set K. The quantity
H(X) = − ∑_{k ∈ K} p_k · log p_k
is called the entropy of the distribution of X. If X is a continuous r.v. with p.d.f. f_X(·), then the entropy is given by
H(X) = − ∫ f_X(x) · log f_X(x) dx.
The distribution of a random variable X for which H(X) is maximal is termed a maximum entropy distribution. ■
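For concreteness, a tiny helper computing the entropy of a discrete distribution (natural logarithm; the choice of base only rescales H):

```python
import math

def entropy(p):
    """Entropy H = -sum p_k log p_k of a discrete distribution (0 log 0 := 0)."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))   # uniform on 4 points: log 4 ≈ 1.386
print(entropy([0.7, 0.1, 0.1, 0.1]))       # less uniform -> smaller entropy
```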

Excursion: Maximum Entropy Distributions
Knowledge available: discrete distribution with finite support { x_1, x_2, …, x_n }, x_1 < x_2 < … < x_n < ∞
⇒ leads to the nonlinear constrained optimization problem:
maximize H(p) = − ∑_{k=1}^{n} p_k · log p_k   s.t.   ∑_{k=1}^{n} p_k = 1
solution: via Lagrange (find a stationary point of the Lagrangian function)
L(p; λ) = − ∑_{k=1}^{n} p_k · log p_k + λ · ( ∑_{k=1}^{n} p_k − 1 )

Excursion: Maximum Entropy Distributions
partial derivatives:
∂L/∂p_k = − log p_k − 1 + λ = 0   ⇒   p_k = e^{λ − 1}, i.e. identical for all k
together with ∑_{k=1}^{n} p_k = 1 this yields p_k = 1/n
⇒ uniform distribution

Excursion: Maximum Entropy Distributions
Knowledge available: discrete distribution with support { 1, 2, …, n }, p_k = P{ X = k }, and prescribed mean E[X] = ν
⇒ leads to the nonlinear constrained optimization problem:
maximize H(p) = − ∑_{k=1}^{n} p_k · log p_k   s.t.   ∑_{k=1}^{n} p_k = 1   and   ∑_{k=1}^{n} k · p_k = ν
solution: via Lagrange (find a stationary point of the Lagrangian function)

Excursion: Maximum Entropy Distributions
partial derivatives (with multipliers λ for normalization and η for the mean constraint):
∂L/∂p_k = − log p_k − 1 + λ + η·k = 0   ⇒   p_k = e^{λ − 1} · (e^{η})^k    (*)
(continued on next slide)

Excursion: Maximum Entropy Distributions
set q = e^{η}; the normalization constraint ∑_{k=1}^{n} p_k = 1 then gives
⇒ p_k = q^k / (q + q² + … + q^n)
⇒ discrete Boltzmann distribution
⇒ the value of q depends on ν via the third condition ∑_{k=1}^{n} k · p_k = ν
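The parameter q has no closed form in general, but the mean of the Boltzmann distribution is monotone in q, so a simple bisection on the condition E[X] = ν suffices. A sketch on support {1, …, n} (function names are illustrative):

```python
def boltzmann(n, q):
    """p_k proportional to q**k on the support {1, ..., n}."""
    w = [q**k for k in range(1, n + 1)]
    s = sum(w)
    return [wk / s for wk in w]

def solve_q(n, nu, lo=1e-6, hi=1e6, iters=200):
    """Bisection on q so that the Boltzmann mean matches nu (requires 1 < nu < n)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        mean = sum(k * pk for k, pk in enumerate(boltzmann(n, mid), start=1))
        lo, hi = (mid, hi) if mean < nu else (lo, mid)
    return (lo + hi) / 2

print(round(solve_q(9, 5.0), 4))       # ≈ 1.0: nu = (n+1)/2 recovers the uniform distribution
print(boltzmann(9, solve_q(9, 3.0)))   # probability mass shifted towards small values
```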

Excursion: Maximum Entropy Distributions
[figure: Boltzmann distributions on { 1, …, 9 } for ν = 2, 3, 4, 5, 6, 7, 8]
specializes to the uniform distribution if ν = 5 (as expected)

Excursion: Maximum Entropy Distributions
Knowledge available: discrete distribution with support { 1, 2, …, n }, E[X] = ν and V[X] = σ²
⇒ leads to the nonlinear constrained optimization problem:
maximize H(p) = − ∑_{k=1}^{n} p_k · log p_k   s.t.   ∑ p_k = 1,  ∑ k · p_k = ν  and  ∑ (k − ν)² · p_k = σ²
solution: in principle via Lagrange (find a stationary point of the Lagrangian function),
but very complicated analytically, if possible at all
⇒ consider special cases only
note: the constraints are linear equations in the p_k

Excursion: Maximum Entropy Distributions
Special case: n = 3, E[X] = 2 and V[X] = σ²
The linear constraints uniquely determine the distribution:
I.   p_1 + p_2 + p_3 = 1
II.  p_1 + 2 p_2 + 3 p_3 = 2
III. p_1 + 4 p_2 + 9 p_3 = σ² + 4
II − I:  p_2 + 2 p_3 = 1
III − I: 3 p_2 + 8 p_3 = 3 + σ²
inserting p_2 = 1 − 2 p_3 yields:  p_3 = σ²/2,  p_2 = 1 − σ²,  p_1 = σ²/2
→ unimodal for σ² < 2/3, uniform for σ² = 2/3, bimodal for σ² > 2/3
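The closed-form solution is easy to verify numerically; a quick check, assuming the derivation p_1 = p_3 = σ²/2, p_2 = 1 − σ² from above:

```python
def three_point_distribution(sigma2):
    """Distribution on {1, 2, 3} with mean 2 and variance sigma2, as uniquely
    determined by the linear constraints (requires 0 <= sigma2 <= 1)."""
    return [sigma2 / 2, 1 - sigma2, sigma2 / 2]

for s2 in (0.3, 2 / 3, 0.9):                  # unimodal, uniform, bimodal
    p = three_point_distribution(s2)
    mean = sum(k * pk for k, pk in enumerate(p, start=1))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(p, start=1))
    print(p, round(mean, 3), round(var, 3))   # mean stays 2, variance equals s2
```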

Excursion: Maximum Entropy Distributions
Knowledge available: discrete distribution with unbounded support { 0, 1, 2, … } and E[X] = ν
⇒ leads to the infinite-dimensional nonlinear constrained optimization problem:
maximize H(p) = − ∑_{k=0}^{∞} p_k · log p_k   s.t.   ∑_{k=0}^{∞} p_k = 1   and   ∑_{k=0}^{∞} k · p_k = ν
solution: via Lagrange (find a stationary point of the Lagrangian function)

Excursion: Maximum Entropy Distributions
partial derivatives (as before):
∂L/∂p_k = − log p_k − 1 + λ + η·k = 0   ⇒   p_k = e^{λ − 1} · (e^{η})^k for all k ≥ 0    (*)
(continued on next slide)

Excursion: Maximum Entropy Distributions
set q = e^{η}; the normalization constraint insists that 0 < q < 1, so that the geometric series converges:
∑_{k ≥ 0} q^k = 1 / (1 − q)
insert into (*): p_k = (1 − q) · q^k for k = 0, 1, 2, …
⇒ geometrical distribution
it remains to specify q; to proceed, recall that ∑_{k ≥ 0} k · q^k = q / (1 − q)²

Excursion: Maximum Entropy Distributions
the value of q depends on ν via the third condition:
E[X] = ∑_{k ≥ 0} k · (1 − q) · q^k = (1 − q) · q / (1 − q)² = q / (1 − q) = ν
⇒ q = ν / (1 + ν)
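A sampling sketch using NumPy: numpy's geometric distribution counts trials until the first success (support {1, 2, …}), so subtracting 1 yields the variant on {0, 1, 2, …} used here; with success probability 1 − q and q = ν/(1 + ν) the sample mean comes out as ν. Function name and defaults are illustrative.

```python
import numpy as np

def sample_max_entropy_geometric(nu, size, rng=None):
    """Sample p_k = (1 - q) * q**k on {0, 1, 2, ...} with E[X] = nu, q = nu / (1 + nu)."""
    rng = rng or np.random.default_rng()
    q = nu / (1.0 + nu)
    # numpy's geometric lives on {1, 2, ...}; draw with success probability 1 - q
    # and shift by 1 to land on {0, 1, 2, ...}
    return rng.geometric(p=1.0 - q, size=size) - 1

samples = sample_max_entropy_geometric(nu=3.0, size=100_000)
print(samples.mean())   # ≈ 3.0
```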

Excursion: Maximum Entropy Distributions
[figure: geometrical distributions with E[X] = ν for ν = 1, 2, …, 7; p_k shown only for k = 0, 1, …, 8]

Excursion: Maximum Entropy Distributions
Overview:
support { 1, 2, …, n }:
  no further constraints → discrete uniform distribution
  require E[X] = ν → Boltzmann distribution
  additionally require V[X] = σ² → N.N. (not the binomial distribution)
support ℕ:
  no further constraints → not defined!
  require E[X] = ν → geometrical distribution
  additionally require V[X] = σ² → ?
support ℤ:
  no further constraints → not defined!
  require E[|X|] = ν → bi-geometrical distribution (discrete Laplace distribution)
  require E[|X|²] = σ² → N.N. (discrete Gaussian distribution)

Excursion: Maximum Entropy Distributions
support [a, b] ⊂ ℝ → uniform distribution
support ℝ⁺ with E[X] = ν → exponential distribution
support ℝ with E[X] = μ and V[X] = σ² → normal / Gaussian distribution N(μ, σ²)
support ℝⁿ with E[X] = μ and Cov[X] = C → multinormal distribution N(μ, C)
  with expectation vector μ ∈ ℝⁿ and covariance matrix C ∈ ℝ^{n×n}, positive definite: ∀ x ≠ 0 : x′Cx > 0
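In the continuous case, sampling from the maximum entropy distribution for a given mean and covariance simply means drawing from the multivariate normal; a NumPy sketch with illustrative values for μ and C:

```python
import numpy as np

rng = np.random.default_rng()

mu = np.array([0.0, 0.0])                      # expectation vector in R^n
C = np.array([[1.0, 0.5],                      # covariance matrix: symmetric,
              [0.5, 2.0]])                     # positive definite (x'Cx > 0 for x != 0)

# e.g. mutation offsets for a population of 1000 individuals
z = rng.multivariate_normal(mean=mu, cov=C, size=1000)
print(z.mean(axis=0))                          # ≈ mu
print(np.cov(z, rowvar=False))                 # ≈ C
```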

Excursion: Maximum Entropy Distributions
for permutation distributions?
→ uniform distribution on all possible permutations:
  set v[j] = j for j = 1, 2, ..., n
  for i = n to 1 step −1
    draw k uniformly at random from { 1, 2, ..., i }
    swap v[i] and v[k]
  endfor
generates a permutation uniformly at random in Θ(n) time
Guideline: only if you know something about the problem a priori, or if you have learnt something about the problem during the search, include that knowledge in the search / mutation distribution (via constraints!).
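The pseudocode above is the Fisher-Yates shuffle; a direct Python transcription (the 1-based slide indices are mapped to 0-based lists, and the final i = 1 step is skipped because it can only swap an element with itself):

```python
import random

def random_permutation(n):
    """Uniformly random permutation of 1..n in Theta(n) time (Fisher-Yates)."""
    v = list(range(1, n + 1))          # set v[j] = j
    for i in range(n - 1, 0, -1):      # for i = n down to 2
        k = random.randint(0, i)       # draw k uniformly at random from {0, ..., i}
        v[i], v[k] = v[k], v[i]        # swap v[i] and v[k]
    return v

print(random_permutation(8))
```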

Excursion: Maximum Entropy Distributions
continuous search space X = ℝⁿ
ad 2) design guidelines for variation operators in practice
a) reachability
b) unbiasedness
c) control
⇒ leads to the CMA-ES