Published by Suzanna Moore. Modified over 8 years ago.
1
Regression
Usman Roshan
CS 698 Machine Learning
2
Regression
Same problem as classification, except that the target variable y_i is continuous.
Popular solutions
– Linear regression (perceptron)
– Support vector regression
– Logistic regression (for regression)
3
Linear regression
Suppose target values are generated by a function y_i = f(x_i) + e_i.
We will estimate f(x_i) by g(x_i, θ).
Suppose each e_i is generated by a Gaussian distribution with mean 0 and variance σ^2 (the same variance for all e_i).
Now we can ask: what is the probability of y_i given the input x_i and parameters θ, denoted p(y_i | x_i, θ)?
This is normally distributed with mean g(x_i, θ) and variance σ^2.
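Written out, the Gaussian noise assumption on this slide gives the following conditional density and log density. This only restates the slide's assumption in formula form and adds no new modelling choice.

```latex
\begin{align*}
p(y_i \mid x_i, \theta)
  &= \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left( -\frac{\bigl(y_i - g(x_i,\theta)\bigr)^2}{2\sigma^2} \right) \\
\log p(y_i \mid x_i, \theta)
  &= -\tfrac{1}{2}\log\!\left(2\pi\sigma^2\right)
     - \frac{\bigl(y_i - g(x_i,\theta)\bigr)^2}{2\sigma^2}
\end{align*}
```

Only the second term of the log density depends on θ, and it is a negatively scaled squared error.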
4
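Linear regression
Apply maximum likelihood to estimate the parameters θ of g(x, θ).
Assume the pairs (x_i, y_i) are i.i.d.
Then the probability of the data given the model (the likelihood) is P(X | θ) = p(x_1, y_1) p(x_2, y_2) … p(x_n, y_n).
Each p(x_i, y_i) = p(y_i | x_i) p(x_i), and p(x_i) does not depend on θ.
Maximizing the log likelihood is therefore equivalent to minimizing Σ_i (y_i − g(x_i, θ))^2, which gives us least squares (linear regression).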
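The least-squares estimate can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a linear model g(x, θ) = w·x + w_0. The function name `fit_least_squares` and the synthetic data are hypothetical and not part of the slides.

```python
import numpy as np

def fit_least_squares(X, y):
    """Return (w, w0) minimizing sum_i (y_i - w.x_i - w0)^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the intercept w0 is estimated together with w.
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

# Synthetic data (an assumption for illustration): linear signal plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.7 + rng.normal(scale=0.1, size=200)

w, w0 = fit_least_squares(X, y)
print("estimated w:", w, "estimated w0:", w0)
```

With noise this small, the recovered weights should land close to `true_w` and the intercept close to 0.7.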
5
Logistic regression
Similar to the linear regression derivation.
Minimize the sum of squares between predicted and actual values.
However
– the predicted value is given by the sigmoid function, and
– y_i is constrained to the range [0,1]
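One way to read this slide is as a squared-error fit through a sigmoid, which can be sketched with plain gradient descent. The optimizer, the learning rate, the iteration count and the name `fit_sigmoid_least_squares` are assumptions for illustration. The slides do not say how the minimization is carried out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_sigmoid_least_squares(X, y, lr=0.5, n_iter=5000):
    """Minimize mean_i (sigmoid(w.x_i + w0) - y_i)^2 by gradient descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape
    w = np.zeros(d)
    w0 = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + w0)
        # Chain rule: d/dz (p - y)^2 = 2 (p - y) p (1 - p).
        g = 2.0 * (p - y) * p * (1.0 - p) / n
        w -= lr * (X.T @ g)
        w0 -= lr * g.sum()
    return w, w0

# Synthetic targets in [0, 1] (an assumption for illustration).
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = sigmoid(X @ np.array([1.0, -1.0]) + 0.2)

mse_before = np.mean((0.5 - y) ** 2)  # prediction of the all-zero starting point
w, w0 = fit_sigmoid_least_squares(X, y)
mse_after = np.mean((sigmoid(X @ w + w0) - y) ** 2)
print("MSE before:", mse_before, "MSE after:", mse_after)
```

Unlike ordinary least squares, this objective is not convex in w, so gradient descent is only guaranteed to reach a local minimum.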
6
Support vector regression
Makes no assumptions about the probability distribution of the data and output (like the support vector machine).
Change the loss function in the support vector machine problem to the ε-insensitive loss to obtain support vector regression.
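The ε-insensitive loss ignores residuals smaller than ε and grows linearly beyond that. A minimal sketch, where the function name and the example values are illustrative assumptions:

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-example loss max(0, |y - prediction| - epsilon)."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - epsilon)

# The first prediction is inside the epsilon tube and costs nothing,
# the other two are penalized by how far they fall outside it.
losses = epsilon_insensitive_loss([1.0, 1.0, 1.0], [1.05, 1.3, 0.5], epsilon=0.1)
print(losses)
```

Predictions that cost nothing do not influence the fit, which is why the solution ends up depending only on a subset of the training points.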
7
Support vector regression
Solved by applying Lagrange multipliers, as in SVM.
The solution w is given by a linear combination of support vectors (as in SVM).
The solution w can also be used for ranking features.
From regularized risk minimization the loss would be
8
Application
Prediction of continuous phenotypes in mice from genotype (Predicting unobserved phen…)
Data are vectors x_i where each feature takes on the values 0, 1, or 2 to denote the number of alleles of a particular single nucleotide polymorphism (SNP).
Output y_i is a phenotype value, for example coat color (represented by integers) or chemical levels in blood.
9
Mouse phenotype prediction from genotype
Rank SNPs by Wald test
– First perform linear regression y = wx + w_0
– Calculate the p-value on w using a t-test
  t-test: (w − w_null)/stderr(w); with w_null = 0 this is w/stderr(w)
  stderr(w) is given by sqrt( [Σ_i (y_i − w x_i − w_0)^2 / (n − 2)] / Σ_i (x_i − mean(x))^2 )
– Rank SNPs by p-values
– OR by Σ_i (y_i − w x_i − w_0)^2
Rank SNPs by support vector regression (w vector in SVR)
Perform linear regression on the top k ranked SNPs under cross-validation.
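The Wald-test ranking can be sketched per SNP with NumPy alone. This sketch ranks by |t| rather than by p-value, which gives the same order when every SNP is tested on the same n samples, assuming no missing genotypes. The function names, the handling of constant SNPs and the synthetic genotype matrix are assumptions for illustration, not the pipeline used in the slides.

```python
import numpy as np

def wald_t_statistic(x, y):
    """t = w / stderr(w) for the simple regression y = w*x + w0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)
    if sxx == 0.0:
        return 0.0  # constant SNP: slope undefined, treated as uninformative (assumption)
    w = np.sum(x_centered * (y - y.mean())) / sxx
    w0 = y.mean() - w * x.mean()
    rss = np.sum((y - w * x - w0) ** 2)
    stderr = np.sqrt(rss / (n - 2) / sxx)
    return w / stderr

def rank_snps(G, y):
    """Return SNP indices ordered from largest to smallest |t|, plus all t values."""
    t = np.array([wald_t_statistic(G[:, j], y) for j in range(G.shape[1])])
    return np.argsort(-np.abs(t)), t

# Synthetic genotypes coded 0/1/2; only SNP 7 affects the phenotype.
rng = np.random.default_rng(2)
G = rng.integers(0, 3, size=(100, 20))
y = 2.0 * G[:, 7] + rng.normal(size=100)

order, t = rank_snps(G, y)
print("top-ranked SNP:", order[0])
```

The top k entries of `order` would then be the SNPs passed on to the cross-validated linear regression.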
10
Prediction of MCH in mouse
11
Prediction of CD8 in mouse