PSMS for Neural Networks on the Agnostic vs Prior Knowledge Challenge Hugo Jair Escalante, Manuel Montes and Enrique Sucar Computer Science Department, National Institute of Astrophysics, Optics and Electronics (INAOE), México IJCNN-2007 ALvsPK Challenge, Orlando, Florida, August 17, 2007
Outline Introduction Particle swarm optimization Particle swarm model selection Results Conclusions
Introduction: model selection Agnostic learning –General-purpose methods –No knowledge of the task at hand or of machine learning is required Prior knowledge –Prior knowledge can increase a model's accuracy –Domain expertise is needed
Introduction Problem: given a set of preprocessing, feature selection and learning algorithms (CLOP), select the best combination of them, together with their hyperparameters Solution: a bio-inspired search strategy (PSO), inspired by bird flocking and fish schooling
Particle swarm optimization (PSO) A population of individuals is created (swarm) Each individual (particle) represents a solution to the problem at hand Particles fly through the search space guided by the best global and individual solutions found so far A fitness function is used for evaluating solutions
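The flight of the particles described above follows the standard PSO update rules (a generic sketch; the slide does not give the exact constants used in PSMS):

```latex
v_i^{t+1} = w\, v_i^{t} + c_1 r_1 \left(p_i - x_i^{t}\right) + c_2 r_2 \left(p_g - x_i^{t}\right),
\qquad
x_i^{t+1} = x_i^{t} + v_i^{t+1}
```

where $x_i$ is the position of particle $i$, $v_i$ its velocity, $p_i$ its personal best, $p_g$ the swarm leader, $w$ the inertia weight, $c_1, c_2$ acceleration constants, and $r_1, r_2 \sim U(0,1)$.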
Particle swarm optimization (PSO)
Begin
–Initialize swarm
–Locate leader (p_g)
–it = 0
–While it < max_it
    For each particle
        –Update position (Eq. 2)
        –Evaluate fitness
        –Update particle's best (p)
    EndFor
    Update leader (p_g)
    it++
–EndWhile
End
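The pseudocode above can be sketched as a minimal, self-contained implementation (a generic PSO for real-valued minimization; the parameter defaults, bounds, and the sphere-function example are illustrative assumptions, not the exact PSMS settings):

```python
import random

def pso(fitness, dim, n_particles=10, max_it=50,
        w=0.5, c1=2.0, c2=2.0, bounds=(-1.0, 1.0)):
    """Minimize `fitness` over `dim` dimensions with a basic PSO."""
    lo, hi = bounds
    # Initialize swarm: random positions, zero velocities
    pos = [[random.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                   # each particle's best (p)
    pbest_fit = [fitness(p) for p in pos]
    # Locate leader (p_g)
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(max_it):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Velocity and position update (Eq. 2)
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = fitness(pos[i])                   # evaluate fitness
            if f < pbest_fit[i]:                  # update particle's best
                pbest_fit[i], pbest[i] = f, pos[i][:]
                if f < gbest_fit:                 # update leader
                    gbest_fit, gbest = f, pos[i][:]
    return gbest, gbest_fit

# Usage: minimize the sphere function over 3 dimensions
best, best_fit = pso(lambda x: sum(v * v for v in x), dim=3, max_it=100)
```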
PSO for model selection (PSMS) Each particle encodes a CLOP model Cross-validation BER is used for evaluating models
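As a concrete reference for the fitness criterion, the balanced error rate can be computed as below (a minimal sketch assuming ±1 class labels; in PSMS this score is averaged over the cross-validation folds):

```python
def balanced_error_rate(y_true, y_pred):
    """BER: mean of the per-class error rates (classes labeled +1 / -1)."""
    err = {}
    for cls in (1, -1):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        err[cls] = sum(1 for i in idx if y_pred[i] != cls) / len(idx)
    return 0.5 * (err[1] + err[-1])

# Example: half the positives and none of the negatives are misclassified
ber = balanced_error_rate([1, 1, -1, -1], [1, -1, -1, -1])  # -> 0.25
```

Unlike plain error rate, BER weights both classes equally, which matters on strongly imbalanced datasets such as HIVA.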
Experimental settings Standard PSO parameters 10 particles per swarm PSMS applied to ADA, GINA, HIVA and SYLVA 5-fold cross-validation was used
Results up to March 1st Corrida_final 500 iterations for ADA 100 iterations for HIVA and GINA 50 iterations for SYLVA Trial and error for NOVA
Results up to March 1st Best average BER still held by Reference (Gavin Cawley) with "the bad". Note that the best entry for each dataset is not necessarily the best entry overall. Some of the best agnostic entries on individual datasets were made as part of prior knowledge entries (the bottom four); there is no corresponding overall agnostic ranking. Agnostic learning best-ranked entries as of March 1st, 2007
Results up to March 1st [Figures: results after 100 iterations, and after 500 iterations (ADA)]
Results up to August 1st

Dataset | Entry name          | Model
ADA     | Corrida_final_10CV  | chain({standardize({'center=0'}), normalize({'center=1'}), shift_n_scale({'take_log=0'}), neural({'units=5', 'shrinkage=1.4323', 'balance=0', 'maxiter=257'}), bias})
GINA    | AdaBoost *          | chain({normalize({'center=0'}), svc({'coef0=0.1', 'degree=5', 'gamma=0', 'shrinkage=0.01'}), bias})
HIVA    | Corrida_final       | chain({standardize({'center=1'}), normalize({'center=0'}), neural({'units=5', 'shrinkage=3.028', 'balance=0', 'maxiter=448'}), bias})
NOVA    | AdaBoost *          | chain({normalize({'center=0'}), gentleboost(neural({'units=1', 'shrinkage=0.2', 'balance=1', 'maxiter=50'}), {'units=10', 'rejNum=3'}), bias})
SYLVA   | PSMS_100_4all_NCV   | chain({standardize({'center=0'}), normalize({'center=0'}), shift_n_scale({'center=1'}), neural({'units=8', 'shrinkage=1.2853', 'balance=0', 'maxiter=362'}), bias})
Overall | PSMS_100_4all_NCV   | Same as Corrida_final except for SYLVA's model

* Models selected by trial and error
Results up to August 1st Best average BER still held by Reference (Gavin Cawley) with "the bad". Note that the best entry for each dataset is not necessarily the best entry overall. The blue-shaded entries did not count towards the prize (participant part of a group, or not wishing to be identified). Agnostic learning best-ranked entries as of August 1st, 2007
Results up to August 1st [Figures: test BER and test AUC]
Conclusions Competitive and simple models are obtained with PSMS No knowledge of the problem at hand nor of machine learning is required PSMS is easy to implement It suffers from the same limitations as other search algorithms