Computer Science and Mathematical Basics (Chap. 3)
Presenter: 김정집
Introduction
• Fundamental notions of computer science and mathematics necessary for understanding the GP approach
• Leading questions
  - What are the mathematical and information-processing contexts of GP?
  - What are the tools from these contexts that GP has to work with?
3.1 The Importance of Randomness in Evolutionary Learning
• Evolution in Nature vs. Evolution in Computers
  - In nature, mutation is basically “free”
• The Costs of Variation
  - In nature, sexual reproduction is not “free”
• GP as a General Search Process
  - GP is a “non-deterministic” algorithm: its behavior depends on randomness
3.2 Mathematical Basics
• Randomness and Probability
  - Random events play a prominent role in GP
3.2.1 Combinatorics and the Search Space
• Permutation
  - The N different elements constituting the set E can be ordered in N! different permutations
• Combination
• Variation
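The three counting notions above can be checked numerically. The following is a minimal Python sketch, assuming the usual textbook definitions (a combination is an unordered selection of K out of N elements, a variation an ordered one); the values N = 5 and K = 2 are illustrative and not taken from the slides.

```python
import math

N, K = 5, 2  # illustrative values, not from the slides

permutations = math.factorial(N)  # N! orderings of N distinct elements
combinations = math.comb(N, K)    # N! / (K! (N-K)!): unordered choices of K
variations = math.perm(N, K)      # N! / (N-K)!: ordered choices of K

print(permutations, combinations, variations)  # 120 10 20
```

Even for small N these counts grow quickly, which is why the search space of programs cannot be enumerated exhaustively.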
3.2.2 Random Numbers
• Pseudo-random (quasi-random) number generator
• Linear Congruential Method
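The linear congruential method produces each number from the previous one via x_{n+1} = (a * x_n + c) mod m. The sketch below is one possible implementation; the function name `lcg` and the constants a, c, m (a commonly quoted 32-bit parameter set) are assumptions for illustration, not values from the slides.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield pseudo-random floats in [0, 1) via x_{n+1} = (a*x_n + c) mod m."""
    x = seed % m
    while True:
        x = (a * x + c) % m
        yield x / m  # scale the integer state to the unit interval

gen = lcg(seed=42)
sample = [next(gen) for _ in range(5)]
print(sample)
```

The sequence is fully determined by the seed, so a GP run can be reproduced exactly by reusing the same seed.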
Randomness test
• χ² Test
  - A test for randomness: if χ² is close to k, the random number generator is good
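A χ² check of this kind can be sketched as follows, assuming that k denotes the number of equal-width bins into which samples from [0, 1) are sorted. With k bins the statistic has k - 1 degrees of freedom, so for a good generator it should land near k when k is large. The helper name `chi_square_uniform` and the use of Python's built-in generator as the test subject are illustrative choices.

```python
import random

def chi_square_uniform(samples, k):
    """Chi-square statistic for samples in [0, 1) sorted into k equal bins."""
    counts = [0] * k
    for s in samples:
        counts[min(int(s * k), k - 1)] += 1
    expected = len(samples) / k  # expected count per bin under uniformity
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(0)
k = 100
stat = chi_square_uniform([rng.random() for _ in range(100_000)], k)
print(f"chi-square = {stat:.1f} for k = {k} bins")
```

A value far above k suggests the numbers are not uniformly distributed; a value far below k suggests they are suspiciously regular.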
3.2.3 Probability
• Elementary Events
  - Random experiment: flipping a coin
  - Events: “heads” or “tails”
• Relative Frequency
• Probability
• Random Variables and Probability Distributions
  - Probability distribution p(x) of a random variable x
• Expectation Value and Variance
  - Moment quantities
  - Expectation value
  - Variance
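For a discrete random variable, the expectation value is the sum of x * p(x) and the variance is the sum of (x - E[x])^2 * p(x). A minimal sketch, using a fair six-sided die as a hypothetical example (the die is not from the slides):

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, p(x) = 1/6 for x = 1..6.
p = {x: Fraction(1, 6) for x in range(1, 7)}

# E[x] = sum over x of x * p(x)
expectation = sum(x * px for x, px in p.items())
# Var[x] = sum over x of (x - E[x])^2 * p(x)
variance = sum((x - expectation) ** 2 * px for x, px in p.items())

print(expectation, variance)  # 7/2 35/12
```

Exact fractions are used here only to keep the result free of rounding error.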
• Bernoulli Process and the Binomial Distribution
• Probability Density Functions
• Normal Distribution
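The binomial distribution gives the probability of exactly k successes in n independent Bernoulli trials with success probability p. A short sketch, assuming the standard formula C(n, k) p^k (1 - p)^(n - k); the function name `binomial_pmf` and the values n = 10, p = 0.5 are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # illustrative: ten fair coin flips
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(pmf[5])  # 0.24609375 (= 252 / 1024)
```

For large n the shape of this distribution approaches the bell curve of the normal distribution.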
• Multiplicative Variation and the Log-Normal Distribution
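When a quantity is changed by many independent multiplicative factors, its logarithm is a sum of independent terms and is therefore approximately normal, which makes the quantity itself approximately log-normal. The simulation below is a rough sketch of that idea; the factor range 0.9 to 1.1, the number of steps, and the function name are assumptions chosen for illustration.

```python
import math
import random

def multiplicative_sample(rng, steps=50):
    """Product of many independent positive random factors."""
    value = 1.0
    for _ in range(steps):
        value *= rng.uniform(0.9, 1.1)  # hypothetical factor range
    return value

rng = random.Random(1)
values = [multiplicative_sample(rng) for _ in range(10_000)]
logs = [math.log(v) for v in values]  # approximately normally distributed

mean_log = sum(logs) / len(logs)
mean_val = sum(values) / len(values)
median_val = sorted(values)[len(values) // 2]
print(mean_log, mean_val, median_val)
```

A log-normal variable is skewed to the right, so its mean lies above its median, which is what the last two printed values let you compare.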
Three distributions
3.3 Computer Science Background and Terminology
• The Turing Machine, Turing Completeness, and Universal Computation
• Turing Completeness
  - A programming language is Turing complete if it allows one to write a program that emulates the behavior of an arbitrary TM
• Structure and Function of a TM
• Universal TM and Universal Computation
  - A universal TM U can emulate any TM T
  - U is then said to be able to perform universal computation
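To make the idea of emulating an arbitrary TM concrete, the following is a minimal simulator sketch: the machine to be emulated is passed in as data (a transition table), much as a universal TM receives the description of T. The function name `run_tm`, the table encoding, and the bit-inverting example machine are hypothetical; the step limit is a practical safeguard, not part of the TM definition.

```python
def run_tm(transitions, tape, state="q0", halt="halt", blank="_", max_steps=1000):
    """Simulate a one-tape TM.

    transitions maps (state, symbol) -> (new_state, symbol_to_write, move),
    where move is "L" or "R".
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, blank)
        state, write, move = transitions[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    out = "".join(cells[i] for i in sorted(cells))
    return out.strip(blank)

# Hypothetical machine: invert every bit, halt at the first blank.
invert = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("halt", "_", "R"),
}
print(run_tm(invert, "1011"))  # 0100
```

Because Python can host such a simulator for any transition table, it satisfies the definition of Turing completeness given above (ignoring finite memory).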
3.3.2 The Halting Problem
• Halting Theorem
  - There is no program that can determine the termination properties of all programs
  - Hence time-bounded execution in GP
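Since termination cannot be decided in general, a GP system typically runs each candidate program under a step budget and treats exhaustion of the budget as failure. The sketch below illustrates this with a tiny interpreter; the instruction set ("inc", "dec", "jnz", "halt") and the function name `run_bounded` are invented for the example.

```python
def run_bounded(program, max_steps):
    """Run a tiny accumulator program; give up after max_steps instructions.

    Hypothetical instruction set: ("inc",), ("dec",), ("jnz", target), ("halt",).
    Returns (halted, accumulator).
    """
    acc, pc = 0, 0
    for _ in range(max_steps):
        if pc >= len(program):
            return True, acc  # ran off the end: normal termination
        op = program[pc]
        if op[0] == "halt":
            return True, acc
        if op[0] == "inc":
            acc += 1
        elif op[0] == "dec":
            acc -= 1
        elif op[0] == "jnz" and acc != 0:
            pc = op[1]  # jump if the accumulator is non-zero
            continue
        pc += 1
    return False, acc  # budget exhausted: treated as non-terminating

terminating = [("inc",), ("inc",), ("halt",)]
looping = [("inc",), ("jnz", 0)]
print(run_bounded(terminating, 100))  # (True, 2)
print(run_bounded(looping, 100))      # (False, 50)
```

The budget does not prove that a program loops forever; it only bounds the cost of evaluating it.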
3.3.3 Complexity
• Complexity measure
  - Number of nodes, number of bits needed to express a program in linear form, or number of instructions
• Kolmogorov Complexity and Generalization
  - Kolmogorov complexity: the “complexity of a computable object”
    - The length of the shortest program that produces the object upon execution
  - If two programs model the same data, the shorter one can be argued to have a higher probability of being general
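Two of these measures can be illustrated briefly. The sketch below counts the nodes of an expression tree written as nested tuples (a representation assumed here for illustration) and uses zlib-compressed length as a crude stand-in for description length. Kolmogorov complexity itself is not computable, so the compressed size is only an upper-bound proxy, not the real quantity.

```python
import zlib

def count_nodes(tree):
    """Number of nodes in an expression tree given as nested tuples."""
    if isinstance(tree, tuple):
        # tree[0] is the function symbol, tree[1:] are its argument subtrees
        return 1 + sum(count_nodes(child) for child in tree[1:])
    return 1  # terminal (variable or constant)

# Hypothetical program: (x * x) + (x * 2)
program = ("+", ("*", "x", "x"), ("*", "x", 2))
print(count_nodes(program))  # 7

def compressed_size(data: bytes) -> int:
    """Length of a zlib-compressed encoding: a computable proxy only."""
    return len(zlib.compress(data, 9))

regular = b"ab" * 500
print(compressed_size(regular) < len(regular))  # True
```

A highly regular string compresses well because a short description ("repeat ab 500 times") reproduces it, which is the intuition behind preferring shorter programs.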
Different complexity measures
3.4 Computer Hardware
• Von Neumann machine
  - A computer in which the program resides in the same storage as the data used by that program
3.4.1 Von Neumann Machines
• The Processor
• RISC/CISC
  - RISC (SPARC or PowerPC)
    - extensive use of registers
  - CISC (Pentium)
Schematic view of CPU
3.4.2 Evolvable Hardware
• FPGAs
• EHW
  - When the hardware has failures, there is no need to discard the entire hardware; instead, one simply reprograms the chip
3.5 Computer Software
• Elementary representation of software
  - Machine language, assembly language
  - Higher languages
  - Data structures
3.5.1 Machine Language and Assembler Language
• Machine Language
  - A sequence of integers
  - Impractical to use
  - Numbers for instructions are not natural to remember
• Assembly
  - Very simple grammar
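The relationship between the two levels can be sketched with a toy assembler that turns mnemonic lines into the flat sequence of integers a machine would execute. The opcode table and the mnemonics below are hypothetical and do not correspond to any real instruction set.

```python
# Hypothetical opcode table; real instruction sets differ.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "HALT": 0}

def assemble(source):
    """Translate 'MNEMONIC operand' lines into a flat list of integers."""
    code = []
    for line in source.strip().splitlines():
        parts = line.split()
        code.append(OPCODES[parts[0]])              # mnemonic -> opcode
        code.extend(int(arg) for arg in parts[1:])  # operands stay numeric
    return code

source = """
LOAD 10
ADD 11
STORE 12
HALT
"""
print(assemble(source))  # [1, 10, 2, 11, 3, 12, 0]
```

The grammar is simple enough that the whole translation is a table lookup, yet the mnemonic form is far easier to read than the integer sequence.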
3.5.2 Higher Languages
• Imperative Languages
  - BASIC, C, FORTRAN, Pascal, SIMULA
  - Program statements explicitly order (Latin imperare) the computer how to perform a certain task
• Functional, Applicative Languages
  - LISP, LOGO, ML, BETA
  - A program represents a function that maps input data and internal data into output data
  - Using a function on its arguments is called application, so a functional language is also called applicative
• Predicative Languages
  - PROLOG
  - Programming means describing to the computer what is wanted as a result
• Object-Oriented Languages
  - SMALLTALK-80, C++, JAVA
  - The principle behind these languages is modeling a system by objects
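The difference between the imperative and the functional (applicative) style can be shown on one small task, here a sum of squares chosen purely for illustration. Python is used for both versions only for convenience; it is not one of the languages listed above.

```python
from functools import reduce

data = [1, 2, 3, 4]

# Imperative style: statements tell the machine step by step what to do.
total = 0
for x in data:
    total += x * x

# Functional (applicative) style: the result is a function applied to its arguments.
total_fn = reduce(lambda acc, x: acc + x * x, data, 0)

print(total, total_fn)  # 30 30
```

The imperative version updates a variable over time, while the functional version has no assignment to mutable state at all.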
Language classes
3.5.3 Data Structures
• Aggregation
  - Cartesian product of structures
• Generalization
  - Unites structures
• Recursion
• Graph, Tree, List
• Power Set
• Function Space
• Selector
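Several of these constructions appear together in an ordinary expression tree. The sketch below is one way to read them in Python terms: a record as aggregation, a union type as generalization, a self-referencing type as recursion, and field access as selection. The class names `Leaf` and `Node` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:            # aggregation: a record (product) of its fields
    value: float

@dataclass
class Node:            # recursion: a Node contains further Trees
    op: str
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]   # generalization: a Tree is a Leaf or a Node

def evaluate(tree: Tree) -> float:
    if isinstance(tree, Leaf):
        return tree.value
    # selectors: .left and .right pick components out of the aggregate
    a, b = evaluate(tree.left), evaluate(tree.right)
    return a + b if tree.op == "+" else a * b

expr = Node("+", Leaf(2.0), Node("*", Leaf(3.0), Leaf(4.0)))
print(evaluate(expr))  # 14.0
```

Tree structures of this kind are also a common way to represent programs in GP.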
3.5.4 Manual versus Genetic Programming
• From Bits to Mnemonic Code
• From Assembler to High-Level Languages
• From High-Level Languages to Algebraic Specification
• A Programmer’s Heuristics
  - “Cut and paste” strategy
    - cut and paste: crossover
    - generation of new segments: mutation
    - debugging and testing: selection
    - unused code: introns
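The analogy between a programmer's habits and GP operators can be sketched on linear programs represented as lists of instruction names. Everything below is illustrative: the instruction set, the fitness function, and the operator details are assumptions, not the operators of any particular GP system.

```python
import random

def crossover(a, b, rng):
    """'Cut and paste': splice the head of one program onto the tail of another."""
    cut_a, cut_b = rng.randint(0, len(a)), rng.randint(0, len(b))
    return a[:cut_a] + b[cut_b:]

def mutate(program, instructions, rng):
    """Generate a new segment: replace one instruction at random."""
    if not program:
        return [rng.choice(instructions)]
    child = list(program)
    child[rng.randrange(len(child))] = rng.choice(instructions)
    return child

def select(population, fitness, rng, size=2):
    """'Debugging and testing': a tournament keeps the fitter candidate."""
    return max(rng.sample(population, size), key=fitness)

rng = random.Random(0)
INSTRUCTIONS = ["inc", "dec", "nop"]  # hypothetical; "nop" plays the role of unused code
parent_a, parent_b = ["inc"] * 4, ["dec"] * 4
child = mutate(crossover(parent_a, parent_b, rng), INSTRUCTIONS, rng)
best = select([parent_a, parent_b, child], lambda p: p.count("inc"), rng)
print(child, best)
```

Instructions such as "nop" that do not affect the result correspond to the introns mentioned in the last bullet.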
• The main difference
  - GP can afford to evolve a population of programs simultaneously
  - A programmer would only work in this way if
    - the environment changed only slightly between applications, or
    - the programming language was hard to handle
  - It is hard for a GP system to generate code
    - without any idea of what a given argument or function could mean to the output