OPTIMIZATION OF MODELS: LOOKING FOR THE BEST STRATEGY
Pavel Kordík, Oleg Kovářík, Miroslav Šnorek
Department of Computer Science and Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, Czech Republic
kordikp@fel.cvut.cz (Pavel Kordík)
Motivation
- Continuous optimization: several methods are available.
- Which is the best?
- Is there any strategy to choose the best method for a given task?
Our task: the FAKE GAME research project (Fully Automated Knowledge Extraction using Group of Adaptive Models Evolution)
The GAME engine for automated data mining: how does it work inside?
The GAME engine: building a model
- Group of Adaptive Models Evolution (GAME)
- Inductive model composed of heterogeneous units
- A niching genetic algorithm (explained later) is employed in each layer to optimize the topology of GAME networks.
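A compact, runnable sketch of this inductive, layer-by-layer construction, heavily simplified: linear least-squares units and plain elitist selection stand in for GAME's heterogeneous units and niching genetic algorithm, and every name here is illustrative rather than the actual GAME code.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_linear_unit(inputs, target):
    """Fit y = w.x + b by least squares; return (weights, training RMS)."""
    A = np.hstack([inputs, np.ones((len(inputs), 1))])
    w, *_ = np.linalg.lstsq(A, target, rcond=None)
    rms = np.sqrt(np.mean((A @ w - target) ** 2))
    return w, rms

def build_layer(features, target, population=20, survivors=4):
    """One GAME-style layer: a population of candidate units, each wired to
    a random subset of inputs; keep the best few (real GAME uses a niching
    GA over heterogeneous unit types, omitted here for brevity)."""
    candidates = []
    for _ in range(population):
        idx = rng.choice(features.shape[1],
                         size=rng.integers(1, features.shape[1] + 1),
                         replace=False)
        w, rms = train_linear_unit(features[:, idx], target)
        candidates.append((rms, idx, w))
    candidates.sort(key=lambda c: c[0])
    return candidates[:survivors]

def layer_outputs(features, layer):
    """Outputs of surviving units become extra features for the next layer."""
    cols = []
    for _, idx, w in layer:
        A = np.hstack([features[:, idx], np.ones((len(features), 1))])
        cols.append(A @ w)
    return np.column_stack(cols)

# Toy usage: grow two layers on synthetic data.
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=100)
layer1 = build_layer(X, y)
X2 = np.hstack([X, layer_outputs(X, layer1)])
layer2 = build_layer(X2, y)
print("best RMS after two layers:", layer2[0][0])
```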
Heterogeneous units in GAME
Optimization of coefficients (learning)
A unit with Gaussian transfer function (GaussianNeuron), inputs x_1, …, x_n and output y':
$$y' = (1 + a_{n+1})\, e^{-\sum_{i=1}^{n}(x_i - a_i)^2} + a_{n+2}$$
- We are looking for optimal values of the coefficients a_0, a_1, …, a_{n+2}.
- We have inputs x_1, x_2, …, x_n and the target output y in the training data set.
- The difference between the unit output y' and the target value y should be minimal for all vectors from the training data set.
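A minimal sketch of such a unit in code, assuming the transfer function as reconstructed above (0-based indexing; not the actual GAME implementation):

```python
import numpy as np

def gaussian_unit(x, a):
    """Gaussian unit: a[:n] are the centres a_1..a_n, a[n] corresponds to
    a_{n+1} (scales the peak), a[n+1] to a_{n+2} (output offset)."""
    n = len(x)
    s = np.sum((x - a[:n]) ** 2)
    return (1.0 + a[n]) * np.exp(-s) + a[n + 1]

def unit_error(a, X, y):
    """Sum of squared differences between unit outputs and targets."""
    return sum((gaussian_unit(x, t_x := x if False else x) - t) ** 2
               if False else (gaussian_unit(x, a) - t) ** 2
               for x, t in zip(X, y))
```

(The error function is what the optimization methods below minimize.)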
What is an analytic gradient and how to derive it?
- Error of the unit on the training data (the energy surface)
- Gradient of the error
- Unit with Gaussian transfer function
- Partial derivative of the error in the direction of coefficient a_i
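Written out (my reconstruction of the quantities the slide names, for m training vectors and coefficient vector a):

```latex
E(\mathbf{a}) = \sum_{j=1}^{m} \bigl( y'(\mathbf{x}_j;\mathbf{a}) - y_j \bigr)^2,
\qquad
\nabla E = \left( \frac{\partial E}{\partial a_1}, \dots, \frac{\partial E}{\partial a_{n+2}} \right),
\qquad
\frac{\partial E}{\partial a_i} = \sum_{j=1}^{m} 2 \bigl( y'(\mathbf{x}_j;\mathbf{a}) - y_j \bigr)\, \frac{\partial y'(\mathbf{x}_j;\mathbf{a})}{\partial a_i}.
```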
Partial derivatives of the Gauss unit
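For the transfer function as reconstructed above (so these follow my reading of the garbled formula, writing $S = \sum_{k=1}^{n}(x_k - a_k)^2$), the chain rule gives:

```latex
\frac{\partial y'}{\partial a_i} = 2\,(1 + a_{n+1})\, e^{-S} (x_i - a_i), \quad i = 1, \dots, n,
\qquad
\frac{\partial y'}{\partial a_{n+1}} = e^{-S},
\qquad
\frac{\partial y'}{\partial a_{n+2}} = 1.
```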
Optimization of their coefficients
In both cases the loop repeats: starting from the given initial values, the optimization method proposes new values of the coefficients a_1, a_2, …, a_n, the unit computes its error on the training data, and the method iterates until it returns the final values.
a) The unit does not provide an analytic gradient, just its error; the optimization method has to estimate the gradient from error evaluations.
b) The unit provides the analytic gradient together with its error, so the optimization method can use it directly.
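The two loops can be contrasted with a standard optimizer. A sketch using scipy.optimize.minimize on a toy Gaussian unit (the data and all names are invented for illustration): passing no jac makes BFGS estimate the gradient from error evaluations, as in case (a), while jac=gradient corresponds to case (b).

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for a unit's training data: a 1-D Gaussian bump.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 1))
y = 1.5 * np.exp(-np.sum((X - 0.3) ** 2, axis=1)) - 0.1

def error(a):
    """Sum of squared errors of the Gaussian unit (case a: error only)."""
    n = X.shape[1]
    s = np.sum((X - a[:n]) ** 2, axis=1)
    pred = (1.0 + a[n]) * np.exp(-s) + a[n + 1]
    return np.sum((pred - y) ** 2)

def gradient(a):
    """Analytic gradient of error() (case b: supplied by the unit)."""
    n = X.shape[1]
    d = X - a[:n]                      # (m, n) differences x_i - a_i
    s = np.sum(d ** 2, axis=1)
    e = np.exp(-s)
    pred = (1.0 + a[n]) * e + a[n + 1]
    r = 2.0 * (pred - y)               # dE/dpred for each training vector
    g = np.empty_like(a)
    g[:n] = (r * (1.0 + a[n]) * e) @ (2.0 * d)    # w.r.t. centres a_1..a_n
    g[n] = r @ e                                  # w.r.t. a_{n+1}
    g[n + 1] = r.sum()                            # w.r.t. a_{n+2}
    return g

a0 = np.zeros(3)
res_estimated = minimize(error, a0, method="BFGS")               # gradient estimated
res_analytic = minimize(error, a0, jac=gradient, method="BFGS")  # gradient supplied
print(res_estimated.nfev, "vs", res_analytic.nfev, "error evaluations")
```

On such a toy problem the analytic-gradient run typically needs far fewer error evaluations (res.nfev), which is what the next slide reports for the hybrid networks.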
Very efficient gradient-based training for hybrid networks developed!
[Chart: Quasi-Newton method with the gradient estimated numerically vs. the gradient supplied analytically]
Optimization methods available in GAME
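Whatever the concrete list of methods, the engine can treat them as interchangeable strategies behind one interface. A hypothetical sketch of such a pluggable design (names invented, not the actual GAME API):

```python
from typing import Callable, Protocol
import numpy as np

class Optimizer(Protocol):
    """Hypothetical common interface: a method receives the unit's error
    function and an initial coefficient vector, and returns the best
    coefficients it found."""
    def optimize(self, error: Callable[[np.ndarray], float],
                 a0: np.ndarray) -> np.ndarray: ...

class RandomSearch:
    """Trivial stand-in showing how one method plugs into the interface;
    gradient-based or nature-inspired methods would slot in the same way."""
    def __init__(self, iters: int = 1000, step: float = 0.1):
        self.iters, self.step = iters, step

    def optimize(self, error, a0):
        best, best_err = a0.copy(), error(a0)
        for _ in range(self.iters):
            candidate = best + self.step * np.random.randn(len(best))
            err = error(candidate)
            if err < best_err:
                best, best_err = candidate, err
        return best
```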
Experimental results of competing optimization methods on the Building data set: RMS error on the testing data set, averaged over 5 runs.
RMS error on the Boston data set
Classification accuracy [%] on the Spiral data set
Evaluation on diverse data sets: which method is the best overall?
Remember the niching genetic algorithm optimizing the structure of GAME? The same evolution can also select which optimization method trains each unit; a sketch of that idea follows.
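A minimal sketch of that selection step, assuming each unit's genome carries a method gene (the method names and genome layout are illustrative, not the actual GAME set):

```python
import random

# Illustrative pool of training methods; the actual set in GAME differs.
METHODS = ["quasi-newton", "conjugate-gradient", "differential-evolution", "pso"]

def mutate_method(genome, p=0.1):
    """Evolutionary variant: occasionally swap the training method, letting
    selection favour genomes whose method trains units well."""
    if random.random() < p:
        genome["method"] = random.choice(METHODS)
    return genome

def random_method(genome):
    """Baseline from the conclusion: pick the method uniformly at random."""
    genome["method"] = random.choice(METHODS)
    return genome
```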
Conclusion
- It is wise to combine several different optimization strategies for the training of inductive models.
- Evolution of optimization methods works, but it is not significantly better than a random selection of methods.
- Nature-inspired methods are slow for this problem (they do not exploit the analytic gradient).
- Future work: utilize the gradient in nature-inspired methods.