The Chicken Project: Dimension Reduction-Based Penalized Logistic Regression for Cancer Classification Using Microarray Data, by L. Shen and E.C. Tan

Dimension Reduction-Based Penalized Logistic Regression for Cancer Classification Using Microarray Data, by L. Shen and E.C. Tan. Presented by Kung-Hua Chang, July 8, 2005. SoCalBSI, California State University, Los Angeles.

Background Microarray data have the characteristic that the number of samples is much smaller than the number of variables (genes). This causes the "curse of dimensionality" problem. To address it, dimension reduction methods such as Singular Value Decomposition (SVD) and Partial Least Squares (PLS) are used.

Background (cont'd) Singular Value Decomposition. Given an m x n matrix X that stores all of the gene expression data, X can be approximated by a truncated SVD: X ≈ U_k Σ_k V_kᵀ, where Σ_k contains the k largest singular values and U_k, V_k the corresponding singular vectors.
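The SVD approximation described above can be sketched in NumPy. This is a minimal illustration on a random stand-in matrix, not the paper's actual data or code:

```python
import numpy as np

# Toy "expression matrix": m genes x n samples (random values as a stand-in).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 30))

# Full SVD: X = U @ diag(s) @ Vt, with singular values s sorted descending.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Rank-k approximation: keep only the k largest singular values/vectors.
k = 5
X_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Each sample is now represented by k numbers instead of 200.
Z = np.diag(s[:k]) @ Vt[:k, :]   # shape (k, n)
```

By the Eckart–Young theorem, X_k is the best rank-k approximation of X in the Frobenius norm, which is why truncated SVD is a natural dimension reduction step.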


Logistic regression and least-squares regression. Both are ways to fit a linear model that approximates a set of data points: least-squares regression fits a continuous response, while logistic regression fits the probability of class membership.

Background (cont'd) The difference is that logistic regression equations are solved iteratively: a trial equation is fitted and adjusted over and over to improve the fit, and iterations stop when the improvement from one step to the next is suitably small. Least-squares regression, by contrast, can be solved explicitly in closed form.
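The iterative fitting idea can be sketched with plain gradient descent on the logistic negative log-likelihood, stopping when the improvement per step is tiny. The data, step size, and tolerance below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Synthetic binary classification data (assumed, for illustration only).
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))
true_b = np.array([1.5, -2.0, 0.5])
y = (X @ true_b + 0.3 * rng.standard_normal(100) > 0).astype(float)

def neg_log_likelihood(b):
    z = X @ b
    # log(1 + exp(z)) - y*z, summed over samples, computed stably.
    return np.sum(np.logaddexp(0.0, z) - y * z)

b = np.zeros(3)
lr, prev = 0.01, np.inf
for _ in range(10_000):
    p = 1.0 / (1.0 + np.exp(-(X @ b)))   # current predicted probabilities
    b -= lr * (X.T @ (p - y))            # gradient step on the likelihood
    cur = neg_log_likelihood(b)
    if prev - cur < 1e-8:                # improvement suitably small: stop
        break
    prev = cur
```

Each pass "tweaks" the coefficient vector b; there is no single formula that produces the final b in one step, which is the contrast with least squares.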

Background (cont'd) Penalized logistic regression is ordinary logistic regression except that a penalty term is added to the cost function (for example, a ridge penalty on the size of the coefficients), which discourages overfitting when there are many more variables than samples.
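A minimal sketch of the penalty idea, assuming a ridge (L2) penalty λ‖β‖² added to the logistic cost; the data and λ value are made up for illustration and the fit uses plain gradient descent rather than the paper's solver:

```python
import numpy as np

# Perfectly separable toy data, where unpenalized coefficients want to blow up.
rng = np.random.default_rng(2)
X = rng.standard_normal((50, 10))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

def fit_logistic(X, y, lam, iters=500, lr=0.02):
    """Gradient descent on the penalized cost: NLL(b) + (lam/2) * ||b||^2."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ b)))
        b -= lr * (X.T @ (p - y) + lam * b)   # extra lam*b term = the penalty
    return b

b_plain = fit_logistic(X, y, lam=0.0)   # no penalty
b_pen = fit_logistic(X, y, lam=5.0)     # ridge-penalized
```

The only change from unpenalized logistic regression is the `lam * b` term in the gradient; it shrinks the coefficient vector, keeping the fit stable in high dimensions.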

Background (cont'd) Support Vector Machine (SVM)  SVM tries to find a hyperplane that separates the different classes of data with the largest possible margin.  With a nonlinear kernel, it is not a linear model.
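The hyperplane-finding idea can be sketched from scratch with Pegasos-style subgradient descent on the linear SVM objective. This is an illustrative toy (linear kernel, no bias term, made-up separable clusters), not the SVM implementation used in the paper:

```python
import numpy as np

# Two linearly separable clusters; labels in {-1, +1}.
rng = np.random.default_rng(3)
X = np.vstack([rng.standard_normal((30, 2)) + 3,
               rng.standard_normal((30, 2)) - 3])
y = np.r_[np.ones(30), -np.ones(30)]

# Subgradient descent on the SVM objective:
#   (lam/2)*||w||^2 + mean(max(0, 1 - y * (X @ w)))
lam, w = 0.01, np.zeros(2)
for t in range(1, 2001):
    eta = 1.0 / (lam * t)                      # decaying step size
    margin = y * (X @ w)
    viol = margin < 1                          # points inside the margin
    grad = lam * w - (y[viol] @ X[viol]) / len(y)
    w -= eta * grad

# w now defines the separating hyperplane w . x = 0.
```

Points with margin >= 1 contribute nothing to the hinge-loss subgradient; only the margin violators (the "support vectors" in spirit) pull on the hyperplane.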

Hypothesis  The combination of dimension reduction- based penalized logistic regression has the best performance compared to support vector machine and least squares regression.

Data Analysis The table (from the paper; not reproduced here) shows the number of training/testing cases in the seven publicly available cancer data sets.


Data Analysis

Generally, the partial least squares-based classifier uses less time than the singular value decomposition-based classifier.
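For contrast with the SVD reduction shown earlier, the PLS reduction can be sketched with the classic NIPALS algorithm for a single response. This is a from-scratch illustration on random data, not the paper's implementation:

```python
import numpy as np

def pls1_components(X, y, k):
    """Extract k PLS score vectors via NIPALS (single response y)."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    scores = []
    for _ in range(k):
        w = X.T @ y                     # direction of max covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                       # sample scores for this component
        p = X.T @ t / (t @ t)           # loadings
        X = X - np.outer(t, p)          # deflate X before the next component
        y = y - t * (y @ t) / (t @ t)   # deflate y
        scores.append(t)
    return np.column_stack(scores)

# Toy high-dimensional data: 20 samples, 100 "genes" (random stand-ins).
rng = np.random.default_rng(5)
X = rng.standard_normal((20, 100))
y = rng.standard_normal(20)
T = pls1_components(X, y, 3)
```

Unlike SVD, the PLS directions use the response y, so they are aimed at prediction rather than at reconstructing X; each component costs only a few matrix-vector products, consistent with PLS being the faster reduction.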

Data Analysis (cont'd) The penalized logistic regression training requires solving a set of linear equations iteratively until convergence, while the least-squares regression training requires solving a set of linear equations only once. So it is reasonable that penalized logistic regression uses more time than least-squares regression.
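The "only once" point is concrete in code: least squares reduces to a single linear solve of the normal equations. A minimal sketch on made-up data:

```python
import numpy as np

# Toy regression data (assumed coefficients, for illustration only).
rng = np.random.default_rng(4)
X = rng.standard_normal((40, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(40)

# One linear solve of the normal equations X^T X b = X^T y -- no iteration.
b = np.linalg.solve(X.T @ X, X.T @ y)
```

Compare this single `solve` call with the loop in the logistic-regression sketch earlier: the iterative method repeats a comparable amount of linear algebra at every step, which is the source of the timing gap reported here.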

Data Analysis (cont'd) The overall time required by the partial least squares and SVD-based regression methods is much less than that of the support vector machine.


Conclusion Dimension reduction combined with penalized logistic regression has the best performance compared to the support vector machine and least-squares regression.

References
[1] L. Shen and E.C. Tan, "Dimension Reduction-Based Penalized Logistic Regression for Cancer Classification Using Microarray Data," IEEE/ACM Trans. Computational Biology and Bioinformatics (to appear, June 2005).
[2] SoCalBSI:
[3] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer-Verlag, New York, 2001.