Slide 1: Maximum Likelihood ("Frequentist" Inference)

x_1, x_2, ..., x_n ~ iid N(μ, σ²)

The joint pdf of the whole random sample, viewed as a function of the parameters for the fixed observed sample, is called the likelihood function:

L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)

The maximum likelihood estimates of the model parameters μ and σ² are the numbers that maximize this likelihood function, i.e. the joint pdf evaluated at the fixed sample.
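A minimal R sketch of the ML estimates for this model, using simulated data (all names and numbers here are illustrative, not from the slides). The MLE of μ is the sample mean, and the MLE of σ² divides by n rather than n − 1:

set.seed(1)
x <- rnorm(20, mean = 1, sd = 2)    # hypothetical sample
mu.hat     <- mean(x)               # MLE of mu: the sample mean
sigma2.hat <- mean((x - mu.hat)^2)  # MLE of sigma^2 (not var(x), which divides by n-1)

# Equivalently, maximize the log-likelihood numerically (p = (mu, log sigma)):
negloglik <- function(p) -sum(dnorm(x, mean = p[1], sd = exp(p[2]), log = TRUE))
fit <- optim(c(0, 0), negloglik)
c(fit$par[1], exp(2 * fit$par[2]))  # close to c(mu.hat, sigma2.hat)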
Slide 2: Sampling Distributions

x_1, x_2, ..., x_n ~ iid N(μ, σ²)

A "sample statistic" is a numerical summary of a random sample (e.g. the sample mean x̄). As functions of random variables, sample statistics are themselves random variables.

The "sampling distribution" is the probability distribution (statistical model) of a sample statistic; it can be derived from the probability distribution of the experimental outcomes.
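A minimal R sketch that approximates the sampling distribution of the sample mean by simulation; under the model above, theory gives x̄ ~ N(μ, σ²/n) (the numbers are illustrative):

set.seed(1)
n <- 10
xbar <- replicate(5000, mean(rnorm(n, mean = 0, sd = 2)))
c(mean(xbar), var(xbar))  # approximately 0 and 2^2/10 = 0.4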
Slide 3: "Frequentist" Inference

Assume that the parameters of the model describing the probability of the experimental outcome are unknown but fixed values.

Given a random sample of experimental outcomes (data), we make inferences (i.e. probabilistic statements) about the values of the underlying parameters, based on the sampling distributions of the parameter estimates and other sample statistics.

Since the model parameters are not random variables, these statements are somewhat contrived. For example, we do not talk about p(μ > 0), but about p(t > t* | μ = 0). However, for simple situations this works just fine, and the arguments are mostly philosophical.
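A minimal R sketch of such a statement: the one-sample t-test p-value is exactly a tail probability of the sampling distribution of t under the fixed null value μ = 0 (the data are simulated):

set.seed(1)
x <- rnorm(10, mean = 0.5)  # hypothetical sample
t.test(x, mu = 0, alternative = "greater")$p.value
# This p-value is P(t > t_obs | mu = 0), a statement about the statistic t,
# not a probability statement about mu itself.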
Slide 4: Bayesian Inference

Assumes that parameters are random variables; this is the key difference from the frequentist view.

Inference is based on the posterior distribution of the parameters given the data.

Prior distribution: defines the prior knowledge or ignorance about the parameter.

Posterior distribution: the prior belief modified by the data.
Slide 5: Bayesian Inference

Prior distribution of μ: p(μ)
Data model given μ: p(data | μ)
Posterior distribution of μ given the data (Bayes theorem):

p(\mu \mid \text{data}) = \frac{p(\text{data} \mid \mu)\, p(\mu)}{p(\text{data})}

With the posterior in hand we can make direct probability statements about the parameter, such as P(μ > 0 | data).
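A minimal sketch of this posterior calculation in R, assuming the standard conjugate normal model with known σ (all numbers are illustrative):

set.seed(1)
sigma <- 1; m0 <- 0; tau0 <- 10        # known sd, prior mean, prior sd
x <- rnorm(10, mean = 0.5, sd = sigma) # hypothetical data
n <- length(x)

# Standard conjugate update for the posterior of mu:
tau.n2 <- 1 / (1 / tau0^2 + n / sigma^2)             # posterior variance
m.n    <- tau.n2 * (m0 / tau0^2 + sum(x) / sigma^2)  # posterior mean

# A direct probability statement about the parameter:
pnorm(0, mean = m.n, sd = sqrt(tau.n2), lower.tail = FALSE)  # P(mu > 0 | data)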
Slide 6: Bayesian Estimation

The Bayesian point estimate is the expected value of the parameter under its posterior distribution given the data.

In some cases the expectation of the posterior distribution can be difficult to assess; it is often easier to find the value of the parameter that maximizes the posterior distribution given the data: the Maximum a Posteriori (MAP) estimate.

Since the denominator of the posterior distribution in Bayes theorem, p(data), is constant in the parameter, this is equivalent to maximizing the product of the likelihood and the prior pdf.
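A minimal sketch of the MAP estimate, found by numerically maximizing log-likelihood plus log-prior (same illustrative conjugate setup as in the previous sketch):

set.seed(1)
sigma <- 1; m0 <- 0; tau0 <- 10
x <- rnorm(10, mean = 0.5, sd = sigma)

log.post <- function(mu) {
  sum(dnorm(x, mean = mu, sd = sigma, log = TRUE)) +  # log-likelihood
    dnorm(mu, mean = m0, sd = tau0, log = TRUE)       # log-prior
}
optimize(log.post, interval = c(-10, 10), maximum = TRUE)$maximum
# For this normal-normal model the posterior is symmetric,
# so the MAP estimate coincides with the posterior mean.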
Slide 7: Alternative Prior for the Normal Model

A degenerate (flat) uniform prior for μ assumes that every prior value is equally likely. This is clearly unrealistic; we usually know more than that.

Under this prior, the MAP estimate of μ is identical to the maximum likelihood estimate, as the sketch below illustrates.

Bayesian point estimation and maximum likelihood are therefore very closely related.
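A minimal sketch of why this holds: with a flat prior the log-prior term is constant in μ, so maximizing the posterior is the same as maximizing the likelihood (illustrative data):

set.seed(1)
x <- rnorm(10, mean = 0.5)
log.post.flat <- function(mu) sum(dnorm(x, mean = mu, log = TRUE))  # + constant log-prior
optimize(log.post.flat, c(-10, 10), maximum = TRUE)$maximum          # equals mean(x), the MLE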
Slide 8: Hierarchical Bayesian Models and Empirical Bayes Inference: Motivation

x_{ij} ~ ind N(μ_j, σ_j²), where i = 1, ..., n indexes the replicated observations and j = 1, ..., T indexes the genes.

Each gene has its own mean and variance.
Usually n is small in comparison to T.
We want to use information from all genes to estimate the variance of the individual gene measurements.
Slide 9: Hierarchical Bayesian Models and Empirical Bayes Inference: Solution

Postulate a "hierarchical" Bayesian model in which the individual variances of the different genes are assumed to be generated by a single distribution.

Estimate the parameters of this distribution using the Empirical Bayes approach.

Estimate each individual gene's variance using Bayesian estimation, with the prior parameters obtained in the Empirical Bayes step; a sketch of the resulting shrinkage estimator follows.
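A minimal sketch of the shrinkage estimator. In limma-style moderation the posterior variance is a degrees-of-freedom-weighted average of the gene's own sample variance and the Empirical Bayes prior value (the numbers below are hypothetical, chosen to echo the console outputs shown later):

shrink.var <- function(s2, df, s2.prior, df.prior) {
  # weighted average of the gene's variance and the common prior variance
  (df.prior * s2.prior + df * s2) / (df.prior + df)
}
# A hypothetical low-variance gene: its variance is pulled up toward the prior.
shrink.var(s2 = 0.001, df = 3, s2.prior = 0.035, df.prior = 4.5)  # about 0.021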
Slide 10: Hierarchical Bayesian Models and Empirical Bayes Inference

Test the hypothesis μ_j = 0 by calculating a moderated t-statistic, in which the gene's own variance estimate is replaced by the posterior (shrunken) variance estimate.

limma operates on linear models

y_j = X \alpha_j + \epsilon_j, \quad \epsilon_{1j}, \dots, \epsilon_{nj} \sim N(0, \sigma_j^2)

and Empirical Bayes estimation is applied to estimate σ_j² for each gene.
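A minimal sketch of the corresponding limma workflow; `exprs`, `group`, and `design` are hypothetical names and data, but lmFit(), eBayes(), and topTable() are the actual limma entry points:

library(limma)
set.seed(1)
exprs  <- matrix(rnorm(1000 * 6), nrow = 1000)  # hypothetical genes-x-samples matrix
group  <- factor(c("A", "A", "A", "B", "B", "B"))
design <- model.matrix(~ group)

fit  <- lmFit(exprs, design)  # per-gene linear models y_j = X alpha_j + eps_j
efit <- eBayes(fit)           # Empirical Bayes moderation of the variances
topTable(efit, coef = 2)      # moderated t-statistics and p-values for the group effect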
Slide 11: Effects of Using Empirical Bayes Modifications

> attributes(FitLMAD)
$names
 [1] "coefficients"     "stdev.unscaled"   "sigma"            "df.residual"      "cov.coefficients"
 [6] "pivot"            "method"           "design"           "genes"            "Amean"

$class
[1] "MArrayLM"
attr(,"package")
[1] "limma"

> attributes(EFitLMAD)
$names
 [1] "coefficients"     "stdev.unscaled"   "sigma"            "df.residual"      "cov.coefficients"
 [6] "pivot"            "method"           "design"           "genes"            "Amean"
[11] "df.prior"         "s2.prior"         "var.prior"        "proportion"       "s2.post"
[16] "t"                "p.value"          "lods"             "F"                "F.p.value"

$class
[1] "MArrayLM"
attr(,"package")
[1] "limma"

Comparing the two objects shows what eBayes() adds: the Empirical Bayes prior quantities (df.prior, s2.prior, var.prior, proportion), the posterior variances (s2.post), and the moderated test results (t, p.value, lods, F, F.p.value).
Slide 12: Effects of Using Empirical Bayes Modifications

> EFitLMAD$s2.prior
[1] 0.03466463
> EFitLMAD$df.prior
[1] 4.514814
Slide 13: Effects of Using Empirical Bayes Modifications

> AnovadB$s2.prior
[1] 0.0363576
> AnovadB$df.prior
[1] 5.134094

Empirical Bayes "inflates" the variances of the low-variability genes.
This reduces the proportion of false positives that result from low variance estimates.
It biases the chance of being called differentially expressed towards genes with larger observed differential expression.
It has been shown to improve the overall proportion of true positives among the genes pronounced significant.
"Stein effect": for any individual gene we cannot improve over the simple t-test, but by looking at all genes at the same time, this method turns out to work better. A numeric illustration follows.
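A simplified one-group illustration (hypothetical numbers; not limma's exact moderated t, which also involves the design matrix) of how the inflated variance tames a spuriously large t-statistic; s2.prior and df.prior echo the values printed above:

xbar <- 0.1; s2 <- 1e-4; n <- 4                         # hypothetical low-variance gene
t.ord <- xbar / sqrt(s2 / n)                            # ordinary t: very large
s2.post <- (5.13 * 0.0364 + (n - 1) * s2) / (5.13 + (n - 1))  # shrunken variance
t.mod <- xbar / sqrt(s2.post / n)                       # moderated t: much smaller
c(t.ord, t.mod)                                         # roughly 20 vs 1.3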