Ant Colony Optimization for Hyperbox Clustering and its Application to HPV Virus Classification • Department of Computational Intelligence and Systems Science, Hirota Laboratory • Guilherme Novaes RAMOS 04M35692
Motivation • Pattern recognition • Text • Speech • Image • Customer profile • Chemical compounds • Microarrays • …
Human Papillomavirus (HPV) • Cervical HPVs • Oral HPVs • Research is not very advanced • Early diagnosis • Proper treatment • Local risk profile • [Figure labels: HPV virus, HPV symptom, cancer]
Hyperbox Proposal
1/3 Background: Ant Colony Optimization • Dorigo [IEEE, 97] • Characteristics • Versatile • Robust • Population based
2/3 Background: Hyperboxes • Simpson [91] • Defines a region in an n-dimensional space • Described by 2 vectors • Simplest classifier: if x ∈ H1, then x ∈ Class 1
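The hyperbox classifier on this slide can be illustrated with a minimal Python sketch. It assumes the two describing vectors are the minimum and maximum corners of an axis-aligned box; the names `Hyperbox`, `contains`, and `classify` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class Hyperbox:
    """Axis-aligned region described by two vectors (assumed: min and max corners)."""
    low: Sequence[float]
    high: Sequence[float]

    def contains(self, x: Sequence[float]) -> bool:
        # x is inside when every coordinate lies between the two corners.
        return all(lo <= xi <= hi for lo, xi, hi in zip(self.low, x, self.high))


def classify(x: Sequence[float], boxes: Dict[str, Hyperbox]) -> Optional[str]:
    """Return the label of the first hyperbox containing x, or None if no box does."""
    for label, box in boxes.items():
        if box.contains(x):
            return label
    return None
```

For example, with `boxes = {"Class 1": Hyperbox((0, 0), (1, 1))}`, the point `(0.5, 0.5)` is assigned to "Class 1", while a point outside every box stays unlabeled.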
3/3 Background: Existing applications • ACO: cemetery approach, partition matrix • Hyperbox: min-max fuzzy neural networks (pattern classification, clustering), classifiers
Hyperbox clustering with Ant Colony Optimization • Ants scatter hyperboxes in the feature space • Objective: maximize hyperbox density
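The slides do not define hyperbox density, so the following Python sketch rests on a stated assumption: every hyperbox has the same fixed edge lengths, and the density of a solution is the number of samples covered by at least one box. The function names and the `half_widths` parameter are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]


def covers(center: Point, half_widths: Point, x: Point) -> bool:
    """True when x lies inside the box centred at `center` (assumed fixed size)."""
    return all(abs(xi - ci) <= hi for xi, ci, hi in zip(x, center, half_widths))


def solution_density(centers: Sequence[Point], half_widths: Point,
                     samples: Sequence[Point]) -> int:
    """Assumed objective: count the samples covered by at least one hyperbox."""
    return sum(
        1 for x in samples
        if any(covers(c, half_widths, x) for c in centers)
    )
```

Under this assumption, placing boxes where samples concentrate raises the objective, which matches the stated goal of maximizing hyperbox density.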
1/6 HACO: Initialization • Load data • Define C • Initialize pheromone • [Flowchart: Start → Initialization → Build solution → Local optimization → Update pheromone → Criteria? (N: repeat; Y: Define clusters) → Stop]
2/6 HACO: Build solution • Probability • Exploration • Exploitation • Assign hyperbox
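The exploration/exploitation choice can be sketched with the pseudo-random proportional rule commonly used in ant colony systems. This is an assumption about how the step might work, not the author's stated rule; `q0` (the exploitation probability) and `choose_position` are hypothetical names.

```python
import random
from typing import Sequence


def choose_position(pheromone: Sequence[float], q0: float,
                    rng: random.Random) -> int:
    """Pick a candidate index for the next hyperbox.

    With probability q0 the ant exploits (takes the strongest trail);
    otherwise it explores by sampling proportionally to the pheromone values.
    """
    indices = range(len(pheromone))
    if rng.random() < q0:
        # Exploitation: candidate with the highest pheromone value.
        return max(indices, key=lambda j: pheromone[j])
    # Exploration: roulette-wheel selection weighted by pheromone.
    return rng.choices(indices, weights=pheromone, k=1)[0]
```

A higher `q0` makes ants follow established trails more often; a lower value spreads hyperboxes more widely across the feature space.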
3/6 HACO: Local optimization • Hyperbox density • Probability • Generate neighbor • Density? (Y: change solution; N: keep solution)
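The "generate neighbor, change solution" loop can be sketched as a simple hill climb. This Python sketch assumes a solution is a list of candidate indices (one per hyperbox) and that a neighbor differs in exactly one hyperbox; `score` stands in for the hyperbox density, and all names are hypothetical.

```python
import random
from typing import Callable, List, Tuple


def local_optimize(solution: List[int], score: Callable[[List[int]], float],
                   n_candidates: int, rng: random.Random,
                   tries: int = 20) -> Tuple[List[int], float]:
    """Hill climbing: move one hyperbox at a time, keep the move if density improves."""
    best = list(solution)
    best_score = score(best)
    for _ in range(tries):
        # Generate a neighbor by reassigning one randomly chosen hyperbox.
        neighbor = list(best)
        neighbor[rng.randrange(len(neighbor))] = rng.randrange(n_candidates)
        neighbor_score = score(neighbor)
        if neighbor_score > best_score:
            # Change the solution only when the density is higher.
            best, best_score = neighbor, neighbor_score
    return best, best_score
```

Because a neighbor is accepted only on strict improvement, the returned score is never lower than the score of the starting solution.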
4/6 HACO: Update pheromone • τij: pheromone value • ρ: trail persistence • best: hyperbox density of the best solution • [Equation: pheromone update rule, not preserved in this transcript]
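The update equation itself is missing from this slide, so the following Python sketch uses a generic ACO-style rule as a stand-in: each trail keeps a fraction `rho` (the trail persistence) of its value, and the entries used by the best solution receive a deposit proportional to that solution's hyperbox density. The indexing (row i for a hyperbox, column j for a candidate position) and the `(1 - rho)` deposit factor are assumptions.

```python
from typing import List, Sequence


def update_pheromone(tau: Sequence[Sequence[float]], best_solution: Sequence[int],
                     best_density: float, rho: float) -> List[List[float]]:
    """Return a new pheromone matrix after evaporation and reinforcement.

    Assumed rule: tau_ij <- rho * tau_ij, plus (1 - rho) * best_density
    on the entries (i, j) chosen by the best solution.
    """
    # Evaporation: every trail keeps the fraction rho of its value.
    new_tau = [[rho * t for t in row] for row in tau]
    # Reinforcement: only the trails of the best solution are strengthened.
    for i, j in enumerate(best_solution):
        new_tau[i][j] += (1.0 - rho) * best_density
    return new_tau
```

With `rho` close to 1, trails change slowly and the search stays exploratory for longer; smaller values make the colony commit quickly to the best solution found so far.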
5/6 HACO: Stopping criteria • Fitness (density) • Number of iterations • Comparison with previous solutions • … • [Figure: density (fitness) versus iteration for ACO and HACO]
6/6 HACO: Define clusters • Overlapping • Nearest neighbor
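One possible reading of this step is that overlapping hyperboxes are merged into the same cluster and that samples outside every hyperbox are assigned by nearest neighbor. The Python sketch below follows that reading; representing a box as a `(low, high)` pair of corners, measuring distance to box centers, and all function names are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[Sequence[float], Sequence[float]]  # (low corner, high corner)


def overlaps(a: Box, b: Box) -> bool:
    """Two axis-aligned boxes overlap when their intervals intersect in every dimension."""
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a[0], a[1], b[0], b[1]))


def cluster_boxes(boxes: Sequence[Box]) -> List[int]:
    """Give overlapping boxes (directly or through a chain) the same cluster label."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(boxes))]


def assign_sample(x: Sequence[float], boxes: Sequence[Box],
                  labels: Sequence[int]) -> int:
    """Label x by the box that contains it, else by the nearest box center."""
    for box, label in zip(boxes, labels):
        if all(lo <= xi <= hi for lo, xi, hi in zip(box[0], x, box[1])):
            return label

    def sq_dist(box: Box) -> float:
        return sum((xi - (lo + hi) / 2.0) ** 2
                   for xi, lo, hi in zip(x, box[0], box[1]))

    nearest = min(range(len(boxes)), key=lambda k: sq_dist(boxes[k]))
    return labels[nearest]
```

The union-find structure keeps the merge step simple even when overlaps form long chains of hyperboxes.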
Experiments • Specifications: Pentium M 1.6 GHz, 512 MB of RAM; C++; SUSE Linux • Data sets: 3 computer generated; HPV
1/6 Experiments - Dataset 1 • 150 samples, 2 dimensions • [Figure panels: ACO, HACO, NN, FCM]
2/6 Experiments - Dataset 2 • 302 samples, 2 dimensions • [Figure panels: ACO, HACO, NN, FCM]
3/6 Experiments - Dataset 3 • 600 samples, 2 dimensions • [Figure panels: ACO, HACO, NN, FCM]
4/6 Experiments - Results
5/6 Experiments - HPV Data • Department of stomatology • Dentistry School • Characteristics • 199 samples • 42 attributes
6/6 Experiments - Results
Conclusions • Pattern recognition • Probable HPV risk profile • Advantages • Higher accuracy • Competitive runtime • Compared with ACO on the HPV data • 29.1% - 36.3% more accurate • 82.6% - 97.6% faster
Perspectives • Test with larger data sets • Automatic parameter setting • Hyperbox shape optimization • Compare/Apply other tools • GA • SOM • …
HPV statistics • Over 100 viruses • 500,000 new cases of cancer diagnosed each year • 200,000 deaths each year
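1/6 Hyperbox Number • (symbol lost): search space ratio • n: number of attributes • Dk: length of the k-th dimension • xk: k-th attribute of the samples • [Equation: number of hyperboxes, not preserved in this transcript]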