Regression Usman Roshan CS 698 Machine Learning

Regression
Same problem as classification except that the target variable y_i is continuous. Popular solutions:
– Linear regression (perceptron)
– Support vector regression
– Logistic regression (for regression)

Linear regression
Suppose target values are generated by a function y_i = f(x_i) + e_i. We will estimate f(x_i) by g(x_i, θ). Suppose each e_i is generated by a Gaussian distribution with mean 0 and variance σ² (the same variance for all e_i). We can then ask for the probability of y_i given the input x_i and the parameters θ, denoted p(y_i | x_i, θ). This is normally distributed with mean g(x_i, θ) and variance σ².
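For reference, the Gaussian noise assumption can be written out explicitly; this density is standard and is implied by, rather than shown on, the slide:

```latex
% Likelihood of one target y_i under the model g(x_i, \theta) with Gaussian noise
p(y_i \mid x_i, \theta) \;=\; \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{\bigl(y_i - g(x_i,\theta)\bigr)^2}{2\sigma^2} \right)
```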

Linear regression
Apply maximum likelihood to estimate g(x, θ). Assume the pairs (x_i, y_i) are i.i.d. Then the probability of the data given the model (the likelihood) is P(X | θ) = p(x_1, y_1) p(x_2, y_2) … p(x_n, y_n), where each p(x_i, y_i) = p(y_i | x_i) p(x_i). Maximizing the log likelihood gives us least squares (linear regression).
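A minimal sketch of this step in NumPy, assuming a linear model g(x, θ) = w·x + w_0 and synthetic data (the data and names here are illustrative, not from the slides):

```python
# Least squares as the maximum-likelihood fit under Gaussian noise (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                      # 100 examples, 3 features
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)   # y_i = f(x_i) + e_i with Gaussian e_i

Xb = np.hstack([X, np.ones((X.shape[0], 1))])      # append a column of ones for w_0

# theta_hat minimizes ||Xb @ theta - y||^2, i.e. the negative log likelihood up to a constant
theta_hat, *_ = np.linalg.lstsq(Xb, y, rcond=None)
print(theta_hat)                                   # close to [1.5, -2.0, 0.5, 0.0]
```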

Logistic regression
Similar to the linear regression derivation: minimize the sum of squares between predicted and actual values. However:
– the prediction is given by the sigmoid function, and
– y_i is constrained to the range [0, 1] (see the sketch below).
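A hedged sketch of this variant with synthetic data and plain gradient descent; the parameter names and learning rate are illustrative, not from the slides:

```python
# Squared error with sigmoid predictions for targets y_i in [0, 1] (illustrative sketch).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = sigmoid(X @ np.array([2.0, -1.0]) + 0.3)       # synthetic targets in (0, 1)

w, w0, lr = np.zeros(2), 0.0, 0.1
for _ in range(2000):
    p = sigmoid(X @ w + w0)
    # gradient of sum_i (p_i - y_i)^2, chain-ruled through the sigmoid: 2 (p - y) p (1 - p)
    g = 2.0 * (p - y) * p * (1.0 - p)
    w -= lr * (X.T @ g) / len(y)
    w0 -= lr * g.mean()
print(w, w0)                                       # roughly recovers [2.0, -1.0] and 0.3
```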

Support vector regression
Makes no assumptions about the probability distribution of the data and output (like the support vector machine). Change the loss function in the support vector machine problem to the ε-insensitive loss to obtain support vector regression.

Support vector regression
Solved by applying Lagrange multipliers, as in the SVM. The solution w is given by a linear combination of support vectors (as in the SVM). The solution w can also be used for ranking features. From regularized risk minimization, the objective would be the ε-insensitive loss plus the regularizer, i.e. minimize (1/2)||w||² + C Σ_i max(0, |y_i − (w·x_i + w_0)| − ε).
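A minimal sketch with scikit-learn (assumed to be available; the data are synthetic): a linear-kernel SVR whose weight vector w can be read off and used to rank features.

```python
# Linear support vector regression with the epsilon-insensitive loss (illustrative sketch).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))
y = X @ np.array([3.0, 0.0, -1.0, 0.0, 0.5]) + rng.normal(scale=0.2, size=150)

svr = SVR(kernel="linear", C=1.0, epsilon=0.1)
svr.fit(X, y)

w = svr.coef_.ravel()                  # w expressed through the support vectors
ranking = np.argsort(-np.abs(w))       # rank features by |w_j|
print(w)
print(ranking)                         # informative features (0, 2, 4) should come first
```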

Application
Prediction of continuous phenotypes in mice from genotype ("Predicting unobserved phen…"). Data are vectors x_i where each feature takes values 0, 1, or 2, denoting the number of alleles of a particular single nucleotide polymorphism (SNP). The output y_i is a phenotype value, for example coat color (represented by integers) or chemical levels in blood.

Mouse phenotype prediction from genotype
Rank SNPs by the Wald test (sketched in code below):
– First perform linear regression y = wx + w_0 for each SNP.
– Calculate a p-value on w using a t-test: t = (w − w_null)/stderr(w) with w_null = 0, i.e. t = w/stderr(w), where stderr(w) = √( Σ_i (y_i − wx_i − w_0)² / ((n − 2) Σ_i (x_i − mean(x))²) ).
– Rank SNPs by p-values, or by the residual sum of squares Σ_i (y_i − wx_i − w_0)².
Alternatively, rank SNPs by support vector regression (the w vector in SVR).
Then perform linear regression on the top k ranked SNPs under cross-validation.
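A sketch of the SNP-ranking step with SciPy; the genotype matrix and effect sizes are made up, and linregress computes exactly the per-SNP slope, its standard error, and the t-test p-value described above:

```python
# Rank SNPs by the p-value of the regression slope (Wald/t-test), illustrative sketch.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, m = 200, 50
X = rng.integers(0, 3, size=(n, m)).astype(float)          # genotypes coded 0/1/2
y = 1.2 * X[:, 7] - 0.8 * X[:, 19] + rng.normal(size=n)    # phenotype driven by two SNPs

pvals = np.empty(m)
for j in range(m):
    res = stats.linregress(X[:, j], y)                      # slope, intercept, r, p-value, stderr
    pvals[j] = res.pvalue                                   # p-value for the test w = 0

top_k = np.argsort(pvals)[:10]                              # keep the k best-ranked SNPs
print(top_k)                                                # SNPs 7 and 19 should appear
```

The top k SNPs would then be fed into a final linear regression evaluated under cross-validation, as the slide describes.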

Prediction of MCH in mouse

Prediction of CD8 in mouse