
1 A More Principled Approach to Machine Learning. Michael R. Smith, Brigham Young University, Department of Computer Science. 2 February 2015

2 Machine Learning
- Learn from past experience
- Change behavior without being explicitly programmed
- Optimization techniques: maximize accuracy, minimize error
- Mine data

3 Machine Learning Example: I, Robot

4 Machine Learning

5 Machine Learning
Training Data:
Weight  Height  Blood Press.  Temp   Has Disease
205     78      good          98.2   yes
157     65      bad           100.7  yes
185     71      mod           99.5   no

A learning algorithm trained on these examples then predicts the label of a new instance:
Weight  Height  Blood Press.  Temp   Has Disease
172     67      bad           100.1  ?
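The disease example on this slide can be sketched in code. This is a minimal illustration, not the talk's actual system: it uses a simple 1-nearest-neighbor rule, and the ordinal encoding of the blood-pressure values (bad=0, mod=1, good=2) is an assumption.

```python
import math

# Assumed ordinal encoding of the categorical blood-pressure feature.
bp = {"bad": 0.0, "mod": 1.0, "good": 2.0}

# Training rows from the slides: (weight, height, blood press., temp) -> label.
train = [([205, 78, bp["good"], 98.2], "yes"),
         ([157, 65, bp["bad"], 100.7], "yes"),
         ([185, 71, bp["mod"], 99.5], "no")]

def predict(x):
    # 1-nearest-neighbor: return the label of the closest training instance.
    return min(train, key=lambda t: math.dist(t[0], x))[1]

print(predict([172, 67, bp["bad"], 100.1]))
```

Note that with unscaled features the weight column dominates the distance; a real system would normalize each feature first.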

6 Machine Learning

7 Machine Learning
Weight  Height  Blood Press.  Temp   Has Disease
205     78      good          98.2   yes
157     65      bad           100.7  yes
185     71      mod           99.5   no

New instance: Weight 172, Height 67, Blood Press. bad, Temp 100.1, Has Disease ?

Meta-data:
Data Set  # Features  # Classes  Entropy  …  # Nodes  Learning Rate  …  Accuracy
Disease   4           2          0.24     …  3        0.1            …  83.4
Iris      4           3          0.76     …  7        0.2            …  97.4

8 Meta-Learning
Data Set  # Features  # Classes  Entropy  …  # Nodes  Learning Rate  …  Accuracy
Disease   4           2          0.24     …  3        0.1            …  83.4
Iris      4           3          0.76     …  7        0.2            …  97.4

Meta-features of a new data set:
Data Set  # Features  # Classes  Entropy  …
Ecology   17          3          0.5      …
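Some of the meta-features in this table are easy to compute directly. Here is a small sketch (the dictionary keys are mine, and real meta-learning systems use many more measures than these three):

```python
import math
from collections import Counter

def meta_features(X, y):
    """Simple data-set-level meta-features like those in the slide's table."""
    counts = Counter(y)
    n = len(y)
    # Shannon entropy of the class distribution, in bits.
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
    return {"#features": len(X[0]), "#classes": len(counts), "entropy": entropy}

X = [[205, 78, 2, 98.2], [157, 65, 0, 100.7], [185, 71, 1, 99.5]]
y = ["yes", "yes", "no"]
print(meta_features(X, y))
```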

9 Meta-Learning

10 Meta-Learning

11 Previous Work
- Random Search

12 Instance Hardness
- Learning algorithms are generally evaluated at the data set level
- Are some instances intrinsically hard to classify?
- Why are some instances misclassified?
- Are there instances which are misclassified that should not be?
- Are some instances misclassified by all learning algorithms? If so, why?

13 Data Set

14 Overfit

15 Linear Classifier

16 Detrimental Instances

17 Instance Hardness
- Better intuition of learning algorithms and why instances are misclassified
- Can learning algorithms be improved? Where?
- Informed analysis of learning algorithm performance: is the classification reasonable?
- Where can the quality of the data be improved?
- Empirical analysis of the classification of 57 data sets by 9 learning algorithms
  - 10-fold cross-validation
  - 178,109 instances
  - 5,310 models created

18 Instance Hardness

19 Instance Hardness
- 9 learning algorithms: C4.5, MLP, RIPPER, NNge, Ridor, 5NN, Random Forest, LWL, Naïve Bayes
- Unsupervised Meta-learning
  - Cluster learning algorithms based on diversity
  - Intuition for all of the algorithms in the cluster
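Clustering learning algorithms "based on diversity" needs a diversity measure. A common choice (an assumption here, not necessarily the talk's exact metric) is the pairwise disagreement rate between the algorithms' predictions:

```python
def disagreement(p1, p2):
    """Fraction of instances on which two algorithms' predictions differ."""
    return sum(a != b for a, b in zip(p1, p2)) / len(p1)

# Toy predictions for three of the nine algorithms on four instances.
preds = {"C4.5": [0, 1, 1, 0],
         "5NN":  [0, 1, 0, 0],
         "NB":   [1, 1, 0, 0]}

names = list(preds)
# Pairwise diversity matrix; similar algorithms can then be clustered together.
matrix = {(a, b): disagreement(preds[a], preds[b])
          for a in names for b in names if a < b}
print(matrix)
```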

20 Existence of Instance Hardness
- 53% of instances correctly classified by all algorithms
- 5% misclassified by all algorithms
- Learning algorithms disagree on 42% of the instances
- 15% misclassified by the majority of algorithms
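Statistics like these come from comparing many classifiers' predictions per instance. Instance hardness can then be operationalized as the fraction of algorithms that misclassify each instance, as in this simplified sketch:

```python
def instance_hardness(predictions, labels):
    """Fraction of learning algorithms that misclassify each instance.

    predictions: one list of predicted labels per algorithm.
    """
    return [sum(p[i] != y for p in predictions) / len(predictions)
            for i, y in enumerate(labels)]

labels = [0, 1, 1, 0]
preds = [[0, 1, 0, 1],   # algorithm A's predictions
         [0, 1, 1, 1],   # algorithm B
         [0, 1, 0, 1]]   # algorithm C
print(instance_hardness(preds, labels))  # last instance is hardest
```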

21 Modeling Detrimental Instances
- True class label is generally ignored
  - Regularization
  - Validation sets
  - Pruning

22 Modeling Detrimental Instances

23 Instance Quality Learning

24 Inequality Learning

25 Inequality Learning (figure values: 0.00019, 0.678, 0.054)
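One way to use per-instance quality weights like the values shown on this slide (the exact weighting scheme here is my assumption, not the talk's method) is to scale each instance's contribution to the gradient during training:

```python
import math

def train_weighted(X, y, q, lr=0.1, epochs=200):
    """Logistic-regression SGD where instance i's gradient is scaled by its
    quality weight q[i]: q[i] = 0 ignores the instance, q[i] = 1 keeps it."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi, qi in zip(X, y, q):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of class 1
            g = qi * (p - yi)                # quality-weighted gradient term
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

# The third instance contradicts the second but has quality weight 0,
# so it does not pull the model away from the clean pattern.
w, b = train_weighted([[0.0], [1.0], [1.0]], [0, 1, 0], [1.0, 1.0, 0.0])
```

Using a continuous weight rather than a hard remove/keep decision is exactly the binary-versus-continuous distinction discussed a few slides later.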

26 Results: Original
         MLP     C4.5    5-NN    LWL      NB      NNge    RandF   Ridor   RIP
Orig     80.7    80.1    79      69.4     75.7    79.4    81.6    76.6    77.8
QW-L     83.8    80.1    80      70.4     77.2    79.4    83.3    78.6    79.7
 p-val   <0.001  0.045   0.015   0.014    <0.001  0.788   <0.001  0.036   <0.001
 g,e,l   47,0,5  32,0,20 35,1,16 28,10,14 35,1,16 20,1,27 33,1,18 31,1,19 38,0,14
QW-B     84.6    82.3    80.3    68.2     75.2    79.4    83.5    78.6    78.8
 p-val   <0.001  <0.001  0.016   0.590    0.858   0.877   <0.001  0.013   <0.001
 g,e,l   49,0,3  37,1,14 32,0,20 22,12,18 19,1,32 21,1,26 32,2,18 34,1,16 37,3,12
Filter   82.9    81.8    82.3    70.0     77.3    82.4    83.2    79.5    79.7
 p-val   <0.001  <0.001  <0.001  0.032    <0.001  <0.001  <0.001  <0.001  <0.001
 g,e,l   39,0,13 38,3,11 38,4,10 26,12,14 36,1,15 40,0,12 33,1,18 35,3,14 40,2,10


28 Inequality Learning
- Increases the accuracy for all of the investigated learning algorithms
- Advantage to using a continuous quality value rather than a binary one
- Most effective for global learning algorithms such as backpropagation; this could be a side effect of how instance quality was integrated into the learning algorithm (future work)
- Focusing on the data: how does it compare with hyper-parameter optimization (HPO)?

29 Comparison of HPO and Filtering

30 K-Fold Cross-Validation
- Create K partitions of the data set
- Each partition is used once as the test set, with the remaining K-1 partitions used for training
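The procedure on this slide can be sketched directly (interleaved index folds for brevity; real implementations usually shuffle or stratify the data first):

```python
def k_fold_splits(n, k):
    """Yield (test_idx, train_idx) pairs for K-fold cross-validation:
    each of the K partitions is used once as the test set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield test, train

for test, train in k_fold_splits(6, 3):
    print(test, train)
```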

31 K-Fold Cross-Validation
- Use a validation set to determine which set of hyper-parameters to use

32 Experimental Methodology
- Hyper-parameter optimization
  - Bayesian optimization (more than 512 hyper-parameter settings explored for most learning algorithms)
  - Standard: uses the accuracy on a validation set
  - Optimistic: uses the 10-fold cross-validation accuracy
- Filtering
  - Ensemble Filter (L-Filter): removes instances that are misclassified by the majority of a set of learning algorithms
  - Adaptive Filter (A-Filter): greedy search among candidate learning algorithms
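A minimal sketch of the ensemble-filter idea described above, assuming each algorithm's (e.g. cross-validated) predictions for the training set are already computed; the majority threshold matches the description on the slide:

```python
def ensemble_filter(instances, labels, predictions):
    """Keep only instances that are NOT misclassified by a majority of the
    ensemble; `predictions` holds one list of predicted labels per algorithm."""
    keep = []
    for i, (x, y) in enumerate(zip(instances, labels)):
        errors = sum(p[i] != y for p in predictions)
        if errors <= len(predictions) / 2:   # removed only if a majority errs
            keep.append((x, y))
    return keep

labels = [0, 1, 1, 0]
preds = [[0, 1, 0, 1],
         [0, 1, 1, 1],
         [0, 1, 0, 1]]
print(ensemble_filter(["a", "b", "c", "d"], labels, preds))
```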

33 Results: Standard Approach
VS Orig  L-Filter  HPO
MLP      44,1,7    47,0,5
C4.5     45,1,6    39,0,13
kNN      44,2,6    41,2,9
NB       42,0,10   42,1,9
RF       38,3,11   37,2,13
RIP      50,0,2    47,1,4

34 Results: Optimistic Approach
VS HPO   L-Filter  A-Filter
MLP      27,3,22   45,0,7
C4.5     33,4,15   48,2,2
kNN      30,2,20   51,0,1
NB       22,2,28   34,0,18
RF       27,1,24   46,0,6
RIP      34,1,17   48,0,4

Not one filtering approach is best for all data sets and learning algorithms.

35 Why does filtering have such a significant effect?
- Recall: maximize the probability of the hypothesis given the data
- At the instance level:
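The instance-level view the slide points to can be written out. Assuming i.i.d. training instances (my notation, with hypothesis space H and data set D of N labeled instances), the likelihood factors per instance, so a few very-low-probability (detrimental) instances can dominate which hypothesis is preferred, and filtering them out changes the maximizer:

```latex
\begin{aligned}
h^{*} &= \operatorname*{arg\,max}_{h \in H} \; p(h \mid \mathcal{D})
       = \operatorname*{arg\,max}_{h \in H} \; p(\mathcal{D} \mid h)\, p(h) \\
      &= \operatorname*{arg\,max}_{h \in H} \; p(h) \prod_{i=1}^{N} p(y_i \mid x_i, h)
\end{aligned}
```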

36 Example Data Set

37 A Need for Better Understanding
- Filtering has a much higher potential than HPO
- But there has been no principled examination

38 The Need for a Repository

39 The Need for a Repository

40 The Need for a Repository

41 Benefits of a Repository
- Better science
  - Reproducible/saved results
  - Save time
  - Build reputation
  - Easier to compare with other work
- Gives a snapshot of the current state
  - Overall
  - Specific data set
- Meta-learning
  - Provides a data set

42 Machine Learning Results Repository

43 Machine Learning Results Repository
- Data set-level
- Learning algorithm-level
- Instance-level

44 Future Directions and Projects
- MLRR
  - Data quality
  - Linking with papers
  - Creating user profiles
  - Anonymous postings for supplemental material
- Meta-learning
  - Combine learning with optimization techniques
  - Meta-features
  - Deep learning
  - Collaborative filtering
  - Automate machine learning

45 Future Directions and Projects
- Incorporate information into the learning process
- Use cases of machine learning
  - How is machine learning actually used?
  - How can it be made easier to use?
- Collaboration/application to other fields
  - Bioinformatics
  - Social media
  - Sports statistics

46 Thank you

