1
Evolutionary Learning: Solving Machine Learning Problems via Evolutionary Optimization
Yang Yu
National Key Laboratory for Novel Software Technology, Institute of Machine Learning and Data Mining, Nanjing University
University of Birmingham
2
Evolutionary algorithms
Genetic Algorithms [J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975.]
Evolution Strategies [I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog Verlag, Stuttgart, 1973.]
Evolutionary Programming [L. J. Fogel, A. J. Owens, and M. J. Walsh. Artificial Intelligence through Simulated Evolution. John Wiley, 1966.]
and many other nature-inspired algorithms ...
General cycle: random initialization → archive → problem-independent reproduction → new solutions → evaluation & selection → back to the archive
Reproduction operators for a binary vector:
mutation: [1,0,0,1,0] → [1,1,0,1,0]
crossover: [1,0,0,1,0] + [0,1,1,1,0] → [0,1,0,1,0] + [1,0,1,1,0]
for a real vector: mutation by adding a random perturbation
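The cycle above can be written down in a few lines. Below is a minimal Python sketch of a generic evolutionary loop, assuming bit-flip mutation, one-point crossover, and truncation-style survival selection on a user-supplied fitness f; the population size, mutation rate, and the OneMax demo fitness are illustrative choices, not settings from any of the cited algorithms.

```python
import random

def mutate(x, p=None):
    """Flip each bit of x independently with probability p (default 1/len(x))."""
    p = 1.0 / len(x) if p is None else p
    return [b ^ (random.random() < p) for b in x]

def crossover(x, y):
    """One-point crossover of two equal-length binary vectors."""
    k = random.randint(1, len(x) - 1)
    return x[:k] + y[k:], y[:k] + x[k:]

def evolve(f, n, pop_size=20, generations=100):
    """Maximize a black-box fitness f over binary vectors of length n."""
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        a, b = random.sample(pop, 2)          # pick two parents from the archive
        c1, c2 = crossover(a, b)              # problem-independent reproduction
        offspring = [mutate(c1), mutate(c2)]
        # evaluation & selection: keep the best pop_size solutions
        pop = sorted(pop + offspring, key=f, reverse=True)[:pop_size]
    return max(pop, key=f)

if __name__ == "__main__":
    # demo: OneMax (count of ones); only f(x) needs to be computable
    best = evolve(f=sum, n=20)
    print(best, sum(best))
```

Note that the problem enters only through f: swapping in a different fitness function changes nothing else in the loop.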
10
Evolutionary algorithms
Genetic Algorithms, Evolution Strategies, Evolutionary Programming, and many other nature-inspired algorithms follow the same cycle: random initialization → archive → problem-independent reproduction → new solutions → evaluation & selection
The only problem-dependent step is evaluating solutions ⇒ one only needs to calculate f(x)!
11
Applications
Series 700 → Series N700 nose design: "this nose ... has been newly developed ... using the latest analytical technique (i.e. genetic algorithms)"
"N700 cars save 19% energy ... % increase in the output ... This is a result of adopting the ... nose shape"
12
Applications
Evolved antennas: 93% efficiency
QHAs (human designed): 38% efficiency
13
Evolutionary optimization vs. machine learning
Alan Turing [A. M. Turing. Computing machinery and intelligence. Mind, 49:433-460, 1950.]:
"We cannot expect to find a good child machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution, by the identifications
Structure of the child machine = Hereditary material
Changes of the child machine = Mutations
Judgment of the experimenter = Natural selection"
D. E. Goldberg and J. H. Holland [Genetic algorithms and machine learning. Machine Learning, 3:95-99, 1988.]
Leslie Valiant (ACM Turing Award) [Evolvability. Journal of the ACM, 56(1), 2009.]
14
Evolutionary optimization vs. machine learning
Selective ensemble: [Z.-H. Zhou, J. Wu, and W. Tang. Ensembling neural networks: Many could be better than all. Artificial Intelligence, 2002]
The GASEN approach: minimize the estimated generalization error directly by a genetic algorithm
(Regression and classification results on training and test data; figures from [Zhou et al., AIJ'02].)
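As a rough illustration of the GASEN idea (evolve a weight vector over the base learners to minimize the estimated generalization error, then keep the members whose weight exceeds a threshold), here is a hedged Python sketch. It substitutes a simple mutation-based evolutionary loop for the genetic algorithm of the original paper, and the mean-squared validation error and the 1/N threshold are assumptions made for the example, not details taken from [Zhou et al., AIJ'02].

```python
import numpy as np

def gasen_sketch(preds, y, pop_size=30, gens=200, thresh=None, rng=None):
    """GASEN-style pruning sketch: evolve a weight vector over the base learners
    to minimize the validation MSE of the weighted ensemble, then keep the
    learners whose normalized weight exceeds a threshold (default 1/N).
    preds: array of shape (N, m), predictions of N regressors on m validation points."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_learners = preds.shape[0]
    thresh = 1.0 / n_learners if thresh is None else thresh

    def error(w):
        w = w / w.sum()                              # normalize the weights
        ens = (w[:, None] * preds).sum(axis=0)       # weighted ensemble prediction
        return float(np.mean((ens - y) ** 2))        # estimated generalization error (MSE)

    pop = rng.random((pop_size, n_learners)) + 1e-6
    for _ in range(gens):
        parent = pop[rng.integers(pop_size)]
        child = np.clip(parent + rng.normal(0.0, 0.1, n_learners), 1e-6, None)
        worst = max(range(pop_size), key=lambda i: error(pop[i]))
        if error(child) < error(pop[worst]):
            pop[worst] = child                       # replace the worst individual
    best = min(pop, key=error)
    return np.where(best / best.sum() > thresh)[0]   # indices of the kept learners
```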
15
Fundamental Problems Unsolved
When do we get the result? How good is the result? Which operators affect the result? ...
16
A summary of our work
When do we get the result: we developed a general tool for first-hitting-time analysis [AIJ'08] and disclosed a general performance bound
How good is the result: we derived a general approximation performance bound [AIJ'12] and showed that EAs can be the best-so-far approximation algorithm with practical advantages
Which operators affect the result: we proved the usefulness of crossover for multi-objective optimization [AIJ'13], on synthetic as well as NP-hard problems
And more: optimization under noise [Qian et al., PPSN'14], class-wise analysis [Qian, Yu and Zhou, PPSN'12], a statistical view [Yu and Qian, CEC'14] ...
17
Outline
Approximation ability
Pareto vs. Penalty
Evolutionary learning: Pareto Ensemble Pruning, Pareto Sparse Regression
Conclusion
18
In applications...
"... roughly a fourfold improvement..." / "...save 19% energy ... % increase in the output..." / "...38% efficiency ... resulted in 93% efficiency..."
Maximum Matching: find the largest possible number of non-adjacent edges
A simple EA takes exponential time to find an optimal solution, but only polynomial time to find an approximate solution [Giel and Wegener, STACS'03]
19
Exact vs. approximate
Approximate optimization: obtain good enough solutions, i.e., solutions x with a close-to-optimal objective value f(x)
Measure of the goodness (for minimization): r = f(x) / f(x*) is called the approximation ratio of x; x is an r-approximate solution
20
Previous studies
On Minimum Vertex Cover (MVC), an NP-hard problem:
a simple EA: arbitrarily bad!
Pareto optimization (multi-objective reformulation): good [Friedrich et al., ECJ'10]
(Illustration: solutions plotted by price vs. performance; a solution is Pareto-optimal if no other solution is better in both price and performance, and the set of such solutions forms the Pareto front.)
The multi-objective evolutionary algorithm follows the same cycle: random initialization → archive → problem-independent reproduction → new solutions → evaluation & selection
Is this a special case or a general principle?
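To make the multi-objective reformulation concrete, here is a small Python sketch (an illustration only, not the algorithm analysed in [Friedrich et al., ECJ'10]): a candidate vertex cover is scored by two objectives, (number of uncovered edges, number of selected vertices), and the archive keeps only non-dominated solutions. The function names are mine.

```python
def mvc_objectives(x, edges):
    """Bi-objective score of a candidate cover x (0/1 list over vertices):
    (number of uncovered edges, number of selected vertices), both minimized."""
    uncovered = sum(1 for (u, v) in edges if not (x[u] or x[v]))
    return (uncovered, sum(x))

def dominates(a, b):
    """a dominates b: no worse in every objective and strictly better in one (minimization)."""
    return all(ai <= bi for ai, bi in zip(a, b)) and any(ai < bi for ai, bi in zip(a, b))

def update_archive(archive, cand, edges):
    """Add cand unless it is dominated; drop archive members that cand dominates."""
    fc = mvc_objectives(cand, edges)
    if any(dominates(mvc_objectives(a, edges), fc) for a in archive):
        return archive
    return [a for a in archive if not dominates(fc, mvc_objectives(a, edges))] + [cand]
```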
21
Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
We propose the SEIP framework for analyzing approximation ratios
Isolation function: isolates the competition among solutions
Only one isolation ⇒ the bad EA; properly configured isolations ⇒ the multi-objective reformulation
Partial ratio: measures the quality of infeasible solutions
22
Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
Theorem: SEIP can find c-approximate solutions within a running time bounded in terms of the number of isolations q and the size of an isolation, where c is the best conditional partial ratio over the isolations
bad EAs ⇒ only one isolation ⇒ c is very large
multi-objective EAs ⇒ balance q and c
23
Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
On the minimum set cover problem, a typical NP-hard problem for approximation studies: n elements in E, m weighted sets in C, k is the size of the largest set
For the minimum k-set cover problem: SEIP finds H_k-approximate solutions in expected polynomial time
For the unbounded minimum set cover problem: SEIP finds H_n-approximate solutions in expected polynomial time
⇒ EAs can be the best-so-far approximation algorithm
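Here H_k is the k-th harmonic number, a standard quantity (the notation below is not specific to the paper):

```latex
H_k \;=\; \sum_{i=1}^{k} \frac{1}{i} \;\le\; \ln k + 1
```

So an H_k-approximation guarantee is roughly a (ln k)-approximation guarantee, matching the classical greedy bound for set cover.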
24
Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
Greedy algorithm: bad! it cannot do better than its worst-case ratio
SEIP: given more running time, it finds solutions with better approximation ratios (an anytime algorithm)
(Plot: approximation ratio vs. time.)
⇒ EAs can be the best-so-far approximation algorithm, with practical advantages
25
Outline
Approximation ability
Pareto vs. Penalty
Evolutionary learning: Pareto Ensemble Pruning, Pareto Sparse Regression
Conclusion
26
For constrained optimization
Constrained optimization: minimize f(x) subject to the constraints being satisfied
Penalty method: minimize f(x) plus a penalty term for constraint violation (a single objective)
Pareto method (multi-objective reformulation): minimize f(x) and the constraint violation simultaneously
When is the Pareto method better?
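In symbols, as a generic illustration only (the violation measure v(x) ≥ 0, which is zero exactly on feasible solutions, and the penalty coefficient λ are my notation, not taken from the slides):

```latex
\text{constrained:}\quad \min_x\, f(x)\ \ \text{s.t.}\ v(x)=0
\qquad
\text{penalty:}\quad \min_x\, f(x) + \lambda\, v(x)
\qquad
\text{Pareto:}\quad \min_x\, \bigl(f(x),\, v(x)\bigr)
```

The penalty method collapses the two goals into one weighted objective, while the Pareto method keeps them separate and relies on dominance-based selection.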
27
Problem Class 1: Minimum Matroid Problem
[C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
Given a matroid (U, S) with rank function r, let x be the subset indicator vector of U; the minimum matroid problem asks for a smallest subset whose rank r(x) reaches the rank of the whole matroid
e.g. minimum spanning tree, maximum bipartite matching
28
Problem Class 1: Minimum Matroid Problem
[C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
Comparison of the worst-case (over problem instances) expected running time to find optimal solutions, for the Penalty Function Method and for the Pareto Optimization Method; the results favor the Pareto Optimization Method (see the paper for the exact bounds)
29
Problem Class 2: Minimum Cost Coverage
[C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
Minimum cost coverage: given U, let x be the subset indicator vector of U; given a monotone and submodular function f and some threshold value q, minimize the cost of x subject to f(x) ≥ q
e.g. minimum submodular cover, minimum set cover
30
Problem Class 2: Minimum Cost Coverage
[C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
Comparison of the worst-case (over problem instances) expected running time to find H_q-approximate solutions, for the Penalty Function Method and for the Pareto Optimization Method; again the results favor the Pareto Optimization Method (see the paper for the exact bounds)
31
Outline
Approximation ability
Pareto vs. Penalty
Evolutionary learning: Pareto Ensemble Pruning, Pareto Sparse Regression
Conclusion
32
Previous approaches
Ordering-based methods (OEP):
error minimization [Margineantu and Dietterich, ICML'97]
diversity-like criterion maximization [Banfield et al., Info Fusion'05] [Martínez-Muñoz, Hernández-Lobato, and Suárez, TPAMI'09]
combined criterion [Li, Yu, and Zhou, ECML'12]
Optimization-based methods (SEP):
semi-definite programming [Zhang, Burer and Street, JMLR'06]
quadratic programming [Li and Zhou, MCS'09]
genetic algorithms [Zhou, Wu and Tang, AIJ'02]
artificial immune algorithms [Castro et al., ICARIS'05]
33
Back to selective ensemble
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
Multi-objective reformulation: selective ensemble can be divided into two goals: reduce error and reduce size
Pareto Ensemble Pruning (PEP):
1. randomly generate a pruned ensemble and put it into the archive
2. loop
 | pick an ensemble randomly from the archive
 | randomly change it to make a new one
 | if the new one is not dominated
 | | put it into the archive
 | | put its good neighbors into the archive
3. when the loop terminates, select an ensemble from the archive
34
Back to selective ensemble
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
PEP instantiates the general evolutionary cycle: random initialization → archive → reproduction (randomly change an ensemble picked from the archive) → new solutions → evaluation & selection (keep non-dominated ensembles and their good neighbors)
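The numbered steps above translate almost directly into code. The following is a minimal, hedged Python sketch of a PEP-style loop; it is simplified (the "good neighbors" step is omitted), and the 0/1 majority-vote error estimate and the bit-flip mutation rate are illustrative choices, not the exact settings of [Qian, Yu and Zhou, AAAI'15].

```python
import random

def pep_sketch(preds, y, iters=2000, rng=None):
    """Bi-objective ensemble pruning sketch: minimize (validation error, ensemble size).
    preds[i][j] is the 0/1 prediction of base learner i on validation example j."""
    rng = rng or random.Random(0)
    n = len(preds)

    def objectives(sel):
        k = sum(sel)
        if k == 0:
            return (float("inf"), 0)                 # empty ensemble is invalid
        votes = [sum(preds[i][j] for i in range(n) if sel[i]) for j in range(len(y))]
        err = sum((2 * v > k) != yy for v, yy in zip(votes, y)) / len(y)
        return (err, k)

    def dominates(a, b):
        return all(x <= z for x, z in zip(a, b)) and a != b

    archive = [[rng.randint(0, 1) for _ in range(n)]]           # step 1: random initial selector
    for _ in range(iters):                                      # step 2: loop
        parent = rng.choice(archive)                            # pick an ensemble from the archive
        child = [b ^ (rng.random() < 1.0 / n) for b in parent]  # randomly change it
        fc = objectives(child)
        if not any(dominates(objectives(a), fc) for a in archive):
            archive = [a for a in archive if not dominates(fc, objectives(a))] + [child]
    return min(archive, key=objectives)                         # step 3: pick the final ensemble
```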
39
Theoretical advantages
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
Can we have theoretical comparisons now? Yes, for the first time:
PEP is at least as good as ordering-based methods
PEP can be better than ordering-based methods
PEP / ordering-based methods can be better than the direct use of heuristic search
43
Empirical comparison: pruning bagged ensembles of 100 base learners
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
(Result tables compare the methods on error and on size.)
44
Empirical comparison: pruning ensembles of different bagging sizes
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
45
Application: mobile human activity recognition
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
3 times more than the runner-up; saves more than 20% storage and testing time compared with the runner-up
Compared with the previous overall accuracy of 89.3% [Anguita et al., IWAAL'12], we achieve 90.2%
46
Outline
Approximation ability
Pareto vs. Penalty
Evolutionary learning: Pareto Ensemble Pruning, Pareto Sparse Regression
Conclusion
47
Sparse regression
Regression: find a weight vector w minimizing the mean squared error
Sparse regression (sparsity k): the same, subject to ||w||_0 ≤ k, where ||w||_0 denotes the number of non-zero elements in w
Previous methods:
Greedy methods [Gilbert et al., 2003; Tropp, 2004]: Forward Regression (FR), Forward-Backward (FoBa), Orthogonal Matching Pursuit (OMP), ...
FR holds the current best approximation ratio, 1 - e^(-γ), on R^2 [Das and Kempe, ICML'11]
Convex relaxation methods [Tibshirani, 1996; Zou & Hastie, 2005]
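For contrast with the Pareto approach introduced next, here is a hedged Python sketch of greedy forward selection, the FR baseline in spirit; the exact scoring and stopping rule of [Das and Kempe, ICML'11] may differ, and `mse_on_subset` with its column-subset least-squares fit is an illustrative helper of mine.

```python
import numpy as np

def mse_on_subset(X, y, S):
    """Least-squares MSE of predicting y from only the feature columns indexed by S."""
    if not S:
        return float(np.mean(y ** 2))
    w, *_ = np.linalg.lstsq(X[:, S], y, rcond=None)
    return float(np.mean((X[:, S] @ w - y) ** 2))

def forward_regression(X, y, k):
    """Greedily add, k times, the single feature that most reduces the MSE."""
    S = []
    for _ in range(k):
        rest = [j for j in range(X.shape[1]) if j not in S]
        S.append(min(rest, key=lambda j: mse_on_subset(X, y, S + [j])))
    return S
```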
48
Our approach: multi-objective reformulation
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Sparse regression can be divided into two goals: reduce MSE and reduce size
49
Our approach
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
POSS follows the same evolutionary cycle: random initialization → archive → reproduction → new solutions → evaluation & selection
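A hedged Python sketch of a POSS-style loop follows, reusing `mse_on_subset` from the forward-regression sketch above; the iteration count, the bit-flip mutation rate, and the final selection rule (best archived subset with at most k features) follow the general description here, not necessarily the exact settings of [Qian, Yu and Zhou, NIPS'15].

```python
import random

def poss_sketch(X, y, k, iters=5000, rng=None):
    """Bi-objective subset selection sketch: minimize (MSE, number of selected features).
    Assumes mse_on_subset from the forward-regression sketch above is in scope."""
    rng = rng or random.Random(0)
    d = X.shape[1]

    def objectives(sel):
        S = [j for j in range(d) if sel[j]]
        return (mse_on_subset(X, y, S), len(S))

    def dominates(a, b):
        return all(x <= z for x, z in zip(a, b)) and a != b

    empty = tuple([0] * d)
    archive = {empty: objectives(list(empty))}                  # start from the empty subset
    for _ in range(iters):
        parent = list(rng.choice(list(archive)))
        child = [b ^ (rng.random() < 1.0 / d) for b in parent]  # flip each bit with prob. 1/d
        fc = objectives(child)
        if not any(dominates(fv, fc) for fv in archive.values()):
            archive = {s: fv for s, fv in archive.items() if not dominates(fc, fv)}
            archive[tuple(child)] = fc
    feasible = [s for s in archive if archive[s][1] <= k]       # at most k features
    return min(feasible, key=lambda s: archive[s][0])           # best MSE among feasible
```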
50
Theoretical advantages
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Is POSS as good as the previously best method (FR)? Yes, POSS achieves the same approximation ratio
Can POSS be better? Yes, POSS finds exact (optimal) solutions on problem subclasses where FR cannot
51
Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Select 8 features, report R^2 (the larger the better), averaged over 100 runs
52
Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Comparison of optimization performance with different sparsities
53
Empirical comparison
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Comparison of test error with different sparsities, with L2 regularization
54
Empirical comparison: POSS running time vs. performance
[C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
(Plot: POSS performance against running time, with the best greedy performance and the theoretical running time marked for reference.)
55
Outline
Approximation ability
Pareto vs. Penalty
Evolutionary learning: Pareto Ensemble Pruning, Pareto Sparse Regression
Conclusion
56
Conclusion
Motivated by hard optimization problems in machine learning, we studied the theoretical foundations of evolutionary algorithms
Leveraging this optimization power, we can solve learning problems better: convex formulations are limited, and non-convex formulations are rising
57
Thank you! yuy@nju.edu.cn http://cs.nju.edu.cn/yuy
Collaborators: Chao Qian, Prof. Xin Yao, Prof. Zhi-Hua Zhou