
Presentation transcript:

Protein Sequence Classification in Proteome Analyst
Zhiyong Lu*, Xiaomeng Wu and Russ Greiner, University of Alberta
This work was partially funded by grants from PENCE and NSERC.

Feature extractor
- An unknown sequence is run through PSI-BLAST to find the sequences in the SwissProt database to which it has high similarity (e.g. unknown-1 -> AK1H_ECOLI).
- The SwissProt records of the most similar sequences (e.g. AK1H_ECOLI, AK1H_SERMA, AKH1_MAIZE, AKH2_MAIZE and other bifunctional aspartokinase/homoserine dehydrogenase I entries) supply the raw text from which features are extracted, including fields such as KEYWORDS (Transferase; Kinase; Oxidoreductase; Threonine biosynthesis; NADP; ...) and the InterPro, Pfam and PROSITE cross-references.
[Figure: PSI-BLAST hit list for unknown-1 and the corresponding SwissProt records (LOCUS, DEFINITION, ACCESSION, DBSOURCE, xrefs, KEYWORDS, SOURCE fields).]

Classifier learning and classification
- Learning procedure: labeled data -> Tokenizer -> thousands of features -> feature ordering by Information Content -> Feature Selection (wrapper model) -> Build Classifier (e.g. TAN + Wrapper) -> (predicted) protein class labels.
- Classification: unlabeled data -> Tokenizer -> classifier -> output.
[Figure: system diagram of the learning procedure and the classification path.]

Five classifiers are compared: Naïve Bayesian Net (NB), Tree Augmented Naïve Bayesian Net (TAN), Support Vector Machine (SVM), Artificial Neural Net (ANN) and Logistic Regression (LR).

Naïve Bayesian Net (NB)
- Assumption: features are conditionally independent, given the class label.
- Structure: a one-level tree; the class label is the root and the features are the leaf nodes.
- Input: feature vector F = (F1, F2, …, Fn).
- Prediction: choose the class c that maximizes P(c) ∏i P(Fi | c).
[Figure: NB structure, class node C with arcs to feature nodes F1, F2, …, Fn.]

Tree Augmented Naïve Bayesian Net (TAN)
- Assumption: some additional edges between features are allowed, capturing simple correlations between the features.
- Structure: the interactions among features are approximated by a tree structure over the features, together with a link from the class to each feature.
- Structure learning: compute the conditional mutual information between every two features Fi and Fj, given C, I(Fi; Fj | C) = Σ P(fi, fj, c) log [ P(fi, fj | c) / (P(fi | c) P(fj | c)) ], then build the tree with the algorithm of Chow and Liu, 1968.
- Input: feature vector F = (F1, F2, …, Fn).
- Prediction: choose the class c that maximizes P(c) ∏i P(Fi | c, Parent(Fi)); a small sketch of NB/TAN-style prediction follows this section.
[Figure: TAN structure over C, F1, F2, F3, F4.]
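The NB and TAN prediction rules above both reduce to an argmax over per-class log probabilities. Below is a minimal sketch of NB-style prediction over binary token features, assuming Laplace-smoothed counts; the function names and data layout are illustrative and not the Proteome Analyst code, and the TAN case would additionally condition each P(Fi | c) on the feature's tree parent.

```python
# Minimal sketch of NB prediction over binary token features (illustrative only,
# not the Proteome Analyst implementation). Laplace smoothing is assumed.
from collections import defaultdict
import math

def train_nb(examples):
    """examples: list of (feature_tuple, class_label) with 0/1 features."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(lambda: defaultdict(int))  # feat_counts[c][i] = #(Fi=1, C=c)
    for feats, c in examples:
        class_counts[c] += 1
        for i, f in enumerate(feats):
            feat_counts[c][i] += f
    return class_counts, feat_counts

def predict_nb(feats, class_counts, feat_counts):
    """Return the class c maximizing log P(c) + sum_i log P(Fi = fi | c)."""
    n_total = sum(class_counts.values())
    best_c, best_score = None, float("-inf")
    for c, n_c in class_counts.items():
        score = math.log(n_c / n_total)                    # log P(c)
        for i, f in enumerate(feats):
            p1 = (feat_counts[c][i] + 1) / (n_c + 2)       # P(Fi = 1 | c), Laplace smoothed
            score += math.log(p1 if f == 1 else 1.0 - p1)  # log P(Fi = f | c)
        if score > best_score:
            best_c, best_score = c, score
    return best_c

# Toy usage with two binary features and two hypothetical class labels:
data = [((1, 0), "cytoplasm"), ((1, 1), "cytoplasm"), ((0, 1), "membrane"), ((0, 0), "membrane")]
cc, fc = train_nb(data)
print(predict_nb((1, 0), cc, fc))  # -> "cytoplasm"
```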
Support Vector Machine (SVM)
- Input vectors are separated into positive vs. negative instances by a hyperplane w.x + b = 0: the half-space w.x + b > 0 is class +1 and the half-space w.x + b < 0 is class -1.
- Data points that lie on the margin (the bounding hyperplanes H1 and H2) are the "support vectors".
- Inputs can be mapped to a new feature space with kernels such as a polynomial function or an RBF.
[Figure: separating hyperplane w.x + b = 0 with margin hyperplanes H1 and H2 and the support vectors.]

Artificial Neural Net (ANN)
- Perceptron: linear separation of the input space, h(x) = sign(w.x + b).
- Input nodes: the feature vector F = (F1, F2, …, Fn); one fully connected hidden layer; one output node per class, each giving the prediction for that class.
- Trained with the backpropagation algorithm.
- Classification error: 1 if the predicted class differs from the true class, 0 otherwise; the empirical classification error is the average of this error over the training data.
- Aims to produce the smallest empirical classification error: a gradient-descent algorithm sets the parameters, descending in the direction of the total derivative computed from the training data.
- Practical for learning real-valued and vector-valued functions over continuous and discrete-valued features; robust to noise in the training data; successfully applied in many other fields.
[Figure: network with input, hidden and output layers.]

Logistic Regression (LR)
- Logistic Regression = discriminative learning of NB: the CPTable entries of the given NB structure are learned so as to produce a larger empirical LCL score and hence a smaller error.
- Log conditional likelihood (LCL): log P(c | f); empirical LCL: the sum of log P(c | f) over the training data.
- Procedure: start from an initial CPTable; for each training example, calculate the partial derivative; sum them to get the total derivative; update each CPTable entry with the gradient-descent algorithm; the result is a better conditional likelihood ("MORE ACCURATE"!).

Experiments
- Three data sets: Ecoli, Yeast, Fly.
- Each classifier is evaluated using 5-fold cross validation (a small evaluation sketch appears at the end of this transcript).
Results
- Feature selection (wrapper model) improves accuracy.
- ANN and SVM give the best performance.

Acknowledgement: Dept. of Computing Science, PENCE, Dr. Duane Szafron, Dr. Paul Lu, James Redford, Roman Eisner.
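To make the evaluation protocol in the Experiments block concrete, here is a hedged sketch of 5-fold cross validation combined with a wrapper-style greedy forward feature selection that scores candidate feature subsets by cross-validated accuracy. The scikit-learn estimator, the stopping rule and the random toy data are assumptions for illustration; they stand in for the Ecoli, Yeast and Fly data and the classifiers compared on the poster.

```python
# Sketch of 5-fold CV with wrapper-model feature selection (illustrative only).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB

def wrapper_forward_selection(X, y, estimator, max_feats=20, cv=5):
    """Greedily add the feature whose inclusion most improves cross-validated accuracy."""
    selected, best_score = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_feats:
        scores = []
        for f in remaining:
            cols = selected + [f]
            acc = cross_val_score(estimator, X[:, cols], y, cv=cv).mean()
            scores.append((acc, f))
        acc, f = max(scores)
        if acc <= best_score:  # stop when no candidate feature improves accuracy
            break
        best_score, selected = acc, selected + [f]
        remaining.remove(f)
    return selected, best_score

# Toy usage with random binary token features (placeholder for the real protein data):
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50))
y = rng.integers(0, 2, size=200)
feats, acc = wrapper_forward_selection(X, y, BernoulliNB())
print("selected features:", feats, "CV accuracy:", round(acc, 3))
```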