1/23 Ant Colony Optimization for Hyperbox Clustering and its Application to HPV Virus Classification. Department of Computational Intelligence and Systems Science, Hirota Laboratory. Guilherme Novaes RAMOS 04M35692
2/23 Motivation. Pattern recognition: text, speech, image, customer profile, chemical compounds, microarrays, …
3/23 Human Papillomavirus. HPV virus; HPV symptom; cervical HPVs; oral HPVs; research is not very advanced; proper treatment; local risk profile; cancer; early diagnosis
4/23 Proposal Hyperbox
5/23 Background (1/3): Ant Colony Optimization, Dorigo [IEEE, 97]. Characteristics: versatile, robust, population based
6/23 Background (2/3): Hyperboxes, Simpson [91]. Defines a region in an n-dimensional space. Described by 2 vectors. Simplest classifier: if x ∈ H_1 then x ∈ Class 1
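A minimal sketch of the hyperbox classifier rule on this slide, assuming the 2 describing vectors are the minimum and maximum corners of the box. The names `Hyperbox`, `lower`, `upper`, and `classify` are illustrative and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Hyperbox:
    """Axis-aligned region in n-dimensional space (assumed min/max corners)."""
    lower: Tuple[float, ...]  # minimum corner (hypothetical name)
    upper: Tuple[float, ...]  # maximum corner (hypothetical name)

    def contains(self, x: Sequence[float]) -> bool:
        # x is inside when every coordinate lies between the two corners.
        return all(lo <= xi <= hi
                   for lo, xi, hi in zip(self.lower, x, self.upper))


def classify(x: Sequence[float], boxes: Dict[str, Hyperbox]) -> Optional[str]:
    """Return the label of the first hyperbox containing x, or None."""
    for label, box in boxes.items():
        if box.contains(x):
            return label
    return None
```

For example, with `H1 = Hyperbox((0.0, 0.0), (1.0, 1.0))`, the call `classify((0.5, 0.5), {"Class 1": H1})` applies the rule "if x ∈ H_1 then x ∈ Class 1".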
7/23 Background (3/3): Existing applications. ACO: cemetery approach, partition matrix. Hyperbox: min-max fuzzy neural networks, pattern classification. Clustering; classifiers
8/23 Hyperbox clustering with Ant Colony Optimization. Ants scatter hyperboxes in the feature space. Objective: maximize hyperbox density
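The slides do not give the density formula, so the sketch below uses one plausible definition as an assumption: the number of samples covered by at least one hyperbox, divided by the summed volume of the boxes. It also assumes equal-sized boxes described by a center and a side length. The names `in_box` and `solution_density` are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def in_box(x: Point, center: Point, side: float) -> bool:
    """True when x lies inside the axis-aligned box of the given side length."""
    return all(abs(xi - ci) <= side / 2 for xi, ci in zip(x, center))


def solution_density(samples: Sequence[Point],
                     centers: Sequence[Point],
                     side: float) -> float:
    """Assumed fitness: covered samples per unit of hyperbox volume."""
    covered = sum(1 for x in samples
                  if any(in_box(x, c, side) for c in centers))
    # Summed volume of all boxes; overlapping regions are counted twice.
    volume = len(centers) * side ** len(samples[0])
    return covered / volume
```

Under this assumed measure, adding a box that covers few new samples lowers the density, which rewards placing boxes on tight groups of samples.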
9/23 HACO (1/6): Initialization. Load data; define C; initialize pheromone. Flowchart: Start -> Initialization -> Build solution -> Local optimization -> Update pheromone -> Criteria? (N: back to Build solution; Y: Define Clusters) -> Stop
10/23 HACO (2/6): Build solution. Exploitation / exploration; probability; assign hyperbox
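One common way to balance exploitation and exploration in ACO is the pseudo-random proportional rule from Ant Colony System. The sketch below assumes HACO uses a rule of this kind; the slides do not confirm it. The parameter `q0` and the function `choose_position` are hypothetical names.

```python
import random
from typing import Sequence


def choose_position(pheromone: Sequence[float],
                    q0: float,
                    rng: random.Random) -> int:
    """Pick a candidate position index for one hyperbox.

    With probability q0, exploit: take the position with the most pheromone.
    Otherwise, explore: sample a position in proportion to its pheromone.
    """
    indices = range(len(pheromone))
    if rng.random() < q0:
        return max(indices, key=lambda j: pheromone[j])
    return rng.choices(indices, weights=pheromone, k=1)[0]
```

Usage: `choose_position([0.1, 0.7, 0.2], q0=0.9, rng=random.Random(0))` returns an index into the candidate positions; a higher `q0` makes the ants greedier.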
11/23 HACO (3/6): Local optimization. Hyperbox density. Sub-steps: Probability; Generate neighbor; Density? (Y: Change solution)
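The local optimization step can be sketched as a simple hill climb: generate a neighbor, compare densities, and change the solution only when the neighbor is denser. How HACO generates a neighbor is not stated on the slide; moving one hyperbox center by a small random offset is an assumption, and `local_search`, `step`, and `tries` are hypothetical names.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def local_search(solution: Sequence[Point],
                 density: Callable[[List[Point]], float],
                 step: float,
                 tries: int,
                 rng: random.Random) -> Tuple[List[Point], float]:
    """Keep a perturbed neighbor only when its density is higher."""
    best = list(solution)
    best_density = density(best)
    for _ in range(tries):
        neighbor = list(best)
        i = rng.randrange(len(neighbor))
        # Assumed neighborhood: shift one hyperbox center by a random offset.
        neighbor[i] = tuple(c + rng.uniform(-step, step) for c in neighbor[i])
        d = density(neighbor)
        if d > best_density:  # "Density?" -> Y: change solution
            best, best_density = neighbor, d
    return best, best_density
```

The `density` argument is any fitness function over a list of hyperbox centers, so the same loop works whichever density definition is used.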
12/23 HACO (4/6): Update pheromone. τ_ij: pheromone value; ρ: trail persistence; term with subscript best: hyperbox density of the best solution
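The update equation itself did not survive on this slide, only its legend. A minimal sketch, assuming the usual ACO form in which every pheromone value is scaled by the trail persistence and the entries used by the best solution are reinforced by that solution's hyperbox density. The exact HACO rule may differ; `update_pheromone`, `rho`, and `best_density` are illustrative names.

```python
from typing import List, Sequence, Tuple


def update_pheromone(tau: Sequence[Sequence[float]],
                     best_solution: Sequence[Tuple[int, int]],
                     best_density: float,
                     rho: float) -> List[List[float]]:
    """Return a new pheromone matrix.

    tau[i][j] is the pheromone for placing hyperbox i at position j.
    best_solution lists the (i, j) pairs used by the best solution (assumed
    encoding). rho is the trail persistence, between 0 and 1.
    """
    used = set(best_solution)
    return [[rho * t + (best_density if (i, j) in used else 0.0)
             for j, t in enumerate(row)]
            for i, row in enumerate(tau)]
```

With `rho = 0.5`, an entry of 1.0 that is not part of the best solution decays to 0.5, while an entry used by a best solution of density 2.0 becomes 2.5.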
13/23 HACO (5/6): Stop criteria. Fitness (density); number of iterations; comparison with previous solutions; … Plot: fitness (density) versus iteration for HACO and ACO
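The three criteria listed above can be combined in one check. This is a sketch under assumptions: the slide names the criteria but not their thresholds, so `max_iterations`, `patience`, and `target` are hypothetical parameters.

```python
from typing import Optional, Sequence


def should_stop(history: Sequence[float],
                max_iterations: int,
                patience: int,
                target: Optional[float] = None) -> bool:
    """history holds the best fitness (density) found in each iteration."""
    # Criterion: number of iterations.
    if len(history) >= max_iterations:
        return True
    # Criterion: fitness reached an (optional) target density.
    if target is not None and history and max(history) >= target:
        return True
    # Criterion: comparison with previous solutions; stop when the last
    # `patience` iterations did not improve on the earlier ones.
    if len(history) > patience:
        if max(history[-patience:]) <= max(history[:-patience]):
            return True
    return False
```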
14/23 HACO (6/6): Define clusters. Overlapping; nearest neighbor
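A sketch of how the two keywords on this slide could work together, assuming that overlapping hyperboxes are merged into one cluster and that samples outside every box take the label of the nearest hyperbox. Both readings are assumptions, as are the equal-sized boxes and the names `overlaps`, `cluster_boxes`, and `assign`.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def overlaps(c1: Point, c2: Point, side: float) -> bool:
    """Two equal-sized boxes overlap when centers differ by < side on every axis."""
    return all(abs(a - b) < side for a, b in zip(c1, c2))


def cluster_boxes(centers: Sequence[Point], side: float) -> List[int]:
    """Give overlapping hyperboxes the same cluster label (union-find)."""
    parent = list(range(len(centers)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            if overlaps(centers[i], centers[j], side):
                parent[find(i)] = find(j)
    roots: Dict[int, int] = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(centers))]


def assign(sample: Point, centers: Sequence[Point],
           labels: Sequence[int], side: float) -> int:
    """Label a sample by its containing box, else by the nearest box center."""
    for c, label in zip(centers, labels):
        if all(abs(x - ci) <= side / 2 for x, ci in zip(sample, c)):
            return label
    nearest = min(range(len(centers)),
                  key=lambda i: math.dist(sample, centers[i]))
    return labels[nearest]
```

Merging by overlap lets a cluster take a shape that a single hyperbox could not describe, and the nearest-neighbor fallback guarantees that every sample receives a label.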
15/23 Experiments. Specifications: Pentium M 1.6 GHz, 512 MB of RAM; C++; SUSE Linux. Data sets: 3 computer generated; HPV
16/23 Experiments - Dataset 1 (1/6): 150 samples, 2 dimensions. Figure panels: NN, FCM, ACO, HACO
17/23 Experiments - Dataset 2 (2/6): 302 samples, 2 dimensions. Figure panels: NN, FCM, ACO, HACO
18/23 Experiments - Dataset 3 (3/6): 600 samples, 2 dimensions. Figure panels: NN, FCM, ACO, HACO
19/23 Experiments - Results (4/6). Table: Time, Fitness, and Accuracy (%) for NN, FCM, ACO, and HACO (D = 1, D = 2, D = 3), one row per computer-generated data set (DS); cell values missing
20/23 Experiments - HPV Data (5/6). Department of Stomatology, Dentistry School. Characteristics: 199 samples, 42 attributes
21/23 Experiments - Results (6/6). Table: Time, Fitness, and Accuracy (%) for NN, FCM, ACO, and HACO (D = 1, D = 2, D = 3) on the HPV data; cell values missing
22/23 Conclusions. Pattern recognition: probable HPV risk profile. Advantages: higher accuracy, competitive runtime. Versus ACO (HPV): 29.1% more accurate, 82.6% faster
23/23 Perspectives. Test with larger data sets; automatic parameter setting; hyperbox shape optimization; compare/apply other tools (GA, SOM, …)
24/23 Thank you for your attention
25/23 HPV statistics. Over 100 viruses; 500,000 new cases of cancer diagnosed each year; 200,000 deaths each year
26/23 Parameters
27/23 Hyperbox Number (1/6) : search space ratio; n: attributes; D_k: k-th dimension length; x_k: k-th attribute of samples