Nanjing University, National Key Laboratory for Novel Software Technology, Institute of Machine Learning and Data Mining (LAMDA)


Evolutionary Learning: Handling Machine Learning Problems via Evolutionary Optimization
Yang Yu
National Key Laboratory for Novel Software Technology, Institute of Machine Learning and Data Mining (LAMDA), Nanjing University http://lamda.nju.edu.cn
Talk given at the University of Birmingham

Evolutionary algorithms
Genetic Algorithms [J. H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975.]
Evolution Strategies [I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart, 1973.]
Evolutionary Programming [L. J. Fogel, A. J. Owens, M. J. Walsh. Artificial Intelligence through Simulated Evolution. John Wiley, 1966.]
and many other nature-inspired algorithms ...
The shared loop: random initialization → an archive of solutions → problem-independent reproduction of new solutions → evaluation & selection back into the archive.
Reproduction for a binary vector: mutation [1,0,0,1,0] → [1,1,0,1,0]; crossover [1,0,0,1,0] + [0,1,1,1,0] → [0,1,0,1,0] + [1,0,1,1,0]. For a real vector, mutation perturbs each coordinate (e.g., by Gaussian noise).
The loop only needs to evaluate solutions ⇒ calculate f(x)!
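Because the loop is problem-independent, it fits in a few lines. Below is a minimal, hedged Python sketch of such a loop on binary vectors; the fitness function f, the population size, and the budget are illustrative assumptions, not the talk's code.

```python
# A minimal sketch of the generic evolutionary loop on binary vectors.
# pop_size, budget, and uniform crossover are assumptions for illustration;
# the only problem-dependent part is evaluating f(x).
import random

def mutate(x):
    """Bit-wise mutation: flip each bit with probability 1/len(x)."""
    return [b ^ (random.random() < 1.0 / len(x)) for b in x]

def crossover(x, y):
    """Uniform crossover: swap each position with probability 1/2."""
    pairs = [(a, b) if random.random() < 0.5 else (b, a) for a, b in zip(x, y)]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def simple_ea(f, n, pop_size=20, budget=2000):
    # initialization: a random archive of solutions
    archive = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(budget):
        # problem-independent reproduction: crossover + mutation
        p1, p2 = random.sample(archive, 2)
        c1, c2 = crossover(p1, p2)
        # evaluation & selection: only f(x) values are ever needed
        archive = sorted(archive + [mutate(c1), mutate(c2)],
                         key=f, reverse=True)[:pop_size]
    return max(archive, key=f)

# usage: maximize the number of 1-bits (OneMax)
print(simple_ea(f=sum, n=30))
```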

Applications
Series 700 vs. Series N700: "this nose ... has been newly developed ... using the latest analytical technique (i.e. genetic algorithms)". N700 cars save 19% energy, with a 30% increase in the output. This is a result of adopting the new nose shape.

Applications
Evolved antennas reach 93% efficiency, where the human-designed QHAs (quadrifilar helical antennas) reach 38% efficiency.

Evolutionary optimization vs. machine learning
Alan Turing (1912-1954) [Computing Machinery and Intelligence. Mind, 59:433-460, 1950.]:
"We cannot expect to find a good child machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse. There is an obvious connection between this process and evolution, by the identifications:
Structure of the child machine = Hereditary material
Changes of the child machine = Mutations
Judgment of the experimenter = Natural selection"
D. E. Goldberg and J. H. Holland [Genetic Algorithms and Machine Learning. Machine Learning, 3:95-99, 1988.]
Leslie Valiant (ACM Turing Award) [Evolvability. Journal of the ACM, 56(1), 2009.]

Evolutionary optimization vs. machine learning
Selective ensemble [Z.-H. Zhou, J. Wu, and W. Tang. Ensembling neural networks: Many could be better than all. Artificial Intelligence, 2002.]:
the GASEN approach minimizes the estimated generalization error directly by a genetic algorithm; regression and classification results on training and test data (figures from [Zhou et al., AIJ'02]).

Fundamental problems, unsolved:
When do we get the result? How good is the result? Which operators affect the result? ...

A summary of our work
When do we get the result: we developed a general tool for first-hitting-time analysis [AIJ'08] and disclosed a general performance bound.
How good is the result: we derived a general approximation performance bound [AIJ'12] and disclosed that EAs can be the best-so-far algorithm, with practical advantages.
Which operators affect the result: we proved the usefulness of crossover for multi-objective optimization [AIJ'13], on synthetic as well as NP-hard problems.
And more: optimization under noise [Qian et al., PPSN'14], class-wise analysis [Qian, Yu and Zhou, PPSN'12], a statistical view [Yu and Qian, CEC'14] ...

Outline: Approximation ability; Pareto vs. Penalty; Evolutionary learning (Pareto Ensemble Pruning, Pareto Sparse Regression); Conclusion

In applications...
"... roughly a fourfold improvement ..." "... save 19% energy ... 30% increase in the output ..." "... 38% efficiency ... resulted in 93% efficiency ..."
Maximum Matching: find the largest possible number of non-adjacent edges. A simple EA takes exponential time to find an optimal solution, but only polynomial time to find a (1+ε)-approximate solution [Giel and Wegener, STACS'03].

Exact vs. approximate
Approximate optimization: obtain good-enough solutions, i.e., an x whose objective value f(x) is close to the optimum f(x*).
Measure of the goodness (for minimization): r = f(x)/f(x*) is called the approximation ratio of x, and x is an r-approximate solution.
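In symbols, a restatement of the slide's definition, assuming minimization:

```latex
% Approximation ratio for a minimization problem; x^* is an optimal solution.
r \;=\; \frac{f(x)}{f(x^*)} \;\ge\; 1,
\qquad
x \text{ is } r\text{-approximate} \iff f(x) \le r \cdot f(x^*).
```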

Previous studies
On Minimum Vertex Cover (MVC, an NP-hard problem): a simple EA can be arbitrarily bad!
Pareto optimization: [figure: a price-vs-performance trade-off among solutions A, B, C, where A has better performance and C a better price; the non-dominated solutions form the Pareto front] On MVC, a multi-objective evolutionary algorithm, following the same loop of random initialization, archive, problem-independent reproduction, and evaluation & selection, performs well [Friedrich et al., ECJ'10].
Is this a special case or a general principle?

Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
We propose the SEIP framework for analyzing approximation ratios.
Isolation function: isolates the competition among solutions. Only one isolation ⇒ the bad EA; a properly configured isolation ⇒ the multi-objective reformulation.
Partial ratio: measures the quality of infeasible solutions on their way toward feasibility.

Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
Theorem. SEIP can find c-approximate solutions in time bounded in terms of the number of isolations q, the size of an isolation, and the best conditional partial ratio c within the isolations.
Bad EAs ⇒ only one isolation ⇒ c is very large; multi-objective EAs ⇒ balance q and c.

Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
On the minimum set cover problem, a typical NP-hard problem for approximation studies: n elements in E, m weighted sets in C, and k the size of the largest set.
For the minimum k-set cover problem: SEIP finds H_k-approximate solutions in expected polynomial time.
For the unbounded minimum set cover problem: SEIP finds H_n-approximate solutions in expected polynomial time.
EAs can be the best-so-far approximation algorithm.
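Here H_k denotes the k-th harmonic number, the classic set-cover approximation ratio, so both guarantees are logarithmic in the relevant set size:

```latex
% Harmonic number: grows like ln k, hence a logarithmic approximation ratio.
H_k \;=\; \sum_{i=1}^{k} \frac{1}{i} \;\le\; 1 + \ln k .
```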

Our work [Y. Yu, X. Yao, and Z.-H. Zhou. On the approximation ability of evolutionary optimization with application to minimum set cover. Artificial Intelligence, 2012.]
The greedy algorithm is stuck at its fixed approximation ratio, no better. SEIP matches that ratio and is an anytime algorithm: given more running time, it can keep improving the approximation ratio of the solutions it holds.
EAs can be the best-so-far approximation algorithm, with practical advantages.

Outline: Approximation ability; Pareto vs. Penalty; Evolutionary learning (Pareto Ensemble Pruning, Pareto Sparse Regression); Conclusion

For constrained optimization
Constrained optimization: minimize f(x) subject to the constraints being satisfied.
Penalty method: minimize f(x) + λ · violation(x), a single penalized objective.
Pareto method (multi-objective reformulation): minimize the pair (f(x), violation(x)) and keep the non-dominated solutions.
When is the Pareto method better?
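To make the contrast concrete, here is a small illustrative Python sketch (not from the paper) of the two reformulations, assuming a user-supplied objective f and a non-negative constraint-violation measure:

```python
# Penalty method: fold the constraint into one scalar objective.
# The weight lam is an extra knob that must be tuned.
def penalty_objective(f, violation, lam):
    return lambda x: f(x) + lam * violation(x)

# Pareto method: keep (f, violation) as a vector and compare by dominance;
# no penalty weight is needed.
def dominates(u, v):
    """u dominates v: no worse in every objective, strictly better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def pareto_objectives(f, violation):
    return lambda x: (f(x), violation(x))
```

The penalty method must choose λ well, while the Pareto method sidesteps that choice; this is one intuition behind the comparisons below.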

Problem Class 1: Minimum Matroid Problem [C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
Given a matroid (U, S), let x be the subset indicator vector of U; minimize the number of selected elements subject to the rank constraint rank(x) = rank(U).
Examples: minimum spanning tree, maximum bipartite matching.

Problem Class 1: Minimum Matroid Problem [C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
The worst-case expected running time to find optimal solutions is derived for the penalty function method and for the Pareto optimization method; the bounds (formulas on the slide) favor the Pareto optimization method.

Problem Class 2: Minimum Cost Coverage [C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
Given a ground set U, a monotone and submodular function f, and a value q, let x be the subset indicator vector of U; minimize the cost of x subject to f(x) ≥ q.
Examples: minimum submodular cover, minimum set cover.
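For intuition, set coverage is the canonical monotone submodular function; a toy Python version (illustrative, not the paper's code):

```python
# f(x) = number of elements covered by the selected sets.
# Monotone: adding a set never decreases coverage.
# Submodular: a set's marginal gain shrinks as more sets are already selected.
def coverage(sets, x_sel):
    covered = set()
    for s, picked in zip(sets, x_sel):
        if picked:
            covered |= s
    return len(covered)

# usage: coverage([{1, 2}, {2, 3}, {4}], [1, 0, 1]) == 3
```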

Problem Class 2: Minimum Cost Coverage [C. Qian, Y. Yu and Z.-H. Zhou. On Constrained Boolean Pareto Optimization. IJCAI'15]
The worst-case expected running time to find H_q-approximate solutions is derived for the penalty function method and for the Pareto optimization method; again the bounds (formulas on the slide) favor the Pareto optimization method.

Outline: Approximation ability; Pareto vs. Penalty; Evolutionary learning (Pareto Ensemble Pruning, Pareto Sparse Regression); Conclusion

Previous approaches
Ordering-based methods (OEP): error minimization [Margineantu and Dietterich, ICML'97]; diversity-like criterion maximization [Banfield et al., Info. Fusion'05] [Martínez-Muñoz, Hernández-Lobato, and Suárez, TPAMI'09]; combined criterion [Li, Yu, and Zhou, ECML'12].
Optimization-based methods (SEP): semi-definite programming [Zhang, Burer and Street, JMLR'06]; quadratic programming [Li and Zhou, MCS'09]; genetic algorithms [Zhou, Wu and Tang, AIJ'02]; artificial immune algorithms [Castro et al., ICARIS'05].

Back to selective ensemble [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
Multi-objective reformulation: selective ensemble can be divided into two goals, reduce error and reduce size.
Pareto Ensemble Pruning (PEP), sketched in code below:
1. Randomly generate a pruned ensemble and put it into the archive.
2. Loop:
   2.1 pick an ensemble randomly from the archive;
   2.2 randomly change it to make a new one;
   2.3 if the new one is not dominated:
       2.3.1 put it into the archive;
       2.3.2 put its good neighbors into the archive.
3. On termination, select an ensemble from the archive.
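A minimal Python sketch of steps 1 to 2.3.1 above (step 2.3.2, archiving good neighbors, is omitted for brevity; err() and the budget are illustrative assumptions, not the paper's implementation):

```python
import random

def pep(err, n, budget=10000):
    """Bi-objective pruning over selector bit-vectors x of the n base learners."""
    def objs(x):
        return (err(x), sum(x))                    # the two goals to reduce

    def dominates(u, v):                           # u no worse everywhere, not equal
        return all(a <= b for a, b in zip(u, v)) and u != v

    archive = [[random.randint(0, 1) for _ in range(n)]]       # step 1
    for _ in range(budget):                        # step 2
        x = random.choice(archive)                 # 2.1 pick randomly
        y = [b ^ (random.random() < 1.0 / n) for b in x]       # 2.2 random change
        if not any(dominates(objs(a), objs(y)) for a in archive):  # 2.3
            archive = [a for a in archive if not dominates(objs(y), objs(a))]
            archive.append(y)                      # 2.3.1 keep the new one
    return min(archive, key=objs)                  # step 3: lowest (error, size)
```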

Back to selective ensemble [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
PEP instantiates the general evolutionary loop: random initialization into the archive, reproduction by random changes, and evaluation & selection by Pareto dominance.

Theoretical advantages [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
Can we have theoretical comparisons now?
PEP is at least as good as ordering-based methods.
PEP can be better than ordering-based methods.
PEP/ordering-based methods can be better than the direct use of heuristic search.
These comparisons are established for the first time.

Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
Pruning bagged ensembles of 100 base learners; the result tables (omitted here) compare the methods on both error and ensemble size.

Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
On pruning bagged ensembles of different sizes.

Application: mobile human activity recognition [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Ensemble Pruning. AAAI'15]
PEP wins three times more often than the runner-up, and saves more than 20% storage and testing time compared with the runner-up. Against the previously reported overall accuracy of 89.3% [Anguita et al., IWAAL'12], we achieve 90.2%.

Outline: Approximation ability; Pareto vs. Penalty; Evolutionary learning (Pareto Ensemble Pruning, Pareto Sparse Regression); Conclusion

Sparse regression
Regression: find weights w that minimize the mean squared error (MSE) on the data.
Sparse regression (sparsity k): minimize the MSE subject to ||w||_0 ≤ k, where ||w||_0 denotes the number of non-zero elements in w.
Previous methods:
Greedy methods [Gilbert et al., 2003; Tropp, 2004]: Forward Regression (FR), Forward-Backward (FoBa), Orthogonal Matching Pursuit (OMP), ... FR holds the current best approximation ratio on R² [Das and Kempe, ICML'11].
Convex relaxation methods [Tibshirani, 1996; Zou & Hastie, 2005].

Our approach [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Multi-objective reformulation: sparse regression can be divided into two goals, reduce the MSE and reduce the size.

Our approach [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
POSS follows the same evolutionary loop: random initialization into the archive, reproduction of new solutions, and evaluation & selection by Pareto dominance over (MSE, size).
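POSS can reuse the PEP-style loop sketched earlier, only swapping in the objective pair below. This is a hypothetical NumPy objective for illustration (names and conventions are assumptions, not the paper's code); X is the data matrix, y the targets, x_sel the feature-selector bit-vector:

```python
import numpy as np

def poss_objectives(X, y, x_sel):
    """Return (MSE, size) for the feature subset indicated by x_sel."""
    idx = np.flatnonzero(x_sel)
    if idx.size == 0:
        return (np.inf, 0)                 # empty subset: no fit possible
    w, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)  # least squares on subset
    mse = float(np.mean((X[:, idx] @ w - y) ** 2))
    return (mse, int(idx.size))
```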

Theoretical advantages [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Is POSS as good as the previously best method (FR)? Yes, POSS achieves the same approximation ratio.
Can POSS be better? Yes, POSS can find optimal solutions on problem subclasses where FR cannot.

Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Select 8 features and report R² (the larger the better), averaged over 100 runs.

Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Comparing optimization performance at different sparsity levels.

Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
Comparing test error at different sparsity levels, with ℓ2 regularization.

Empirical comparison [C. Qian, Y. Yu and Z.-H. Zhou. Pareto Optimization for Subset Selection. NIPS'15]
POSS running time vs. performance, annotated with the best greedy performance and POSS's theoretical running time.

Outline: Approximation ability; Pareto vs. Penalty; Evolutionary learning (Pareto Ensemble Pruning, Pareto Sparse Regression); Conclusion

Conclusion
Motivated by hard optimization problems in machine learning (convex formulations are limited; non-convex formulations are rising), we did research on the theoretical foundation of evolutionary algorithms.
With the leveraged optimization power, we can solve learning problems better.

Thank you! yuy@nju.edu.cn http://cs.nju.edu.cn/yuy
Collaborators: Chao Qian (钱超), Prof. Xin Yao, Prof. Zhi-Hua Zhou (周志华)