Classical regression review


Classical regression review

Important equations:
- Functional form of the regression
- Regression coefficients
- Standard error
- Coefficient of determination
- Variance of coefficients
- Variance of regression
- Variance of prediction

Practice example

Example problem:

% Data
y = [0.95, 1.08, 1.28, 1.23, 1.42, 1.45]';
x = [0 0.2 0.4 0.6 0.8 1.0]';

Results:
be = [0.9871 0.4957]    % regression coefficients
se = 0.0627             % standard error
R2 = 0.9162             % coefficient of determination
sebe = [0.0454 0.0750]  % standard errors of the coefficients
corr = -0.8257          % correlation between the coefficients

In the figure, the confidence interval is the red line and the prediction interval is the magenta line.
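The numbers on this slide can be reproduced outside MATLAB as well. The following is a minimal Python/NumPy sketch of the same least-squares computation; the variable names be, se, sebe mirror the slide's MATLAB output.

```python
import numpy as np

# Data from the practice example
y = np.array([0.95, 1.08, 1.28, 1.23, 1.42, 1.45])
x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
n = len(y)

X = np.column_stack([np.ones(n), x])        # design matrix [1, x]
be, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients

resid = y - X @ be
se2 = resid @ resid / (n - 2)               # error variance (n-2 dof)
se = np.sqrt(se2)                           # standard error

R2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

cov_be = se2 * np.linalg.inv(X.T @ X)       # covariance matrix of the coefficients
sebe = np.sqrt(np.diag(cov_be))
corr = cov_be[0, 1] / (sebe[0] * sebe[1])

print(be)    # ≈ [0.9871, 0.4957]
print(se)    # ≈ 0.0627
print(R2)    # ≈ 0.9162
print(sebe)  # ≈ [0.0454, 0.0750]
print(corr)  # ≈ -0.8257
```

The negative correlation between be[1] and be[2] on the slide reflects that raising the slope lowers the intercept needed to fit the same data.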

Bayesian analysis of classical regression

Remark
- Classical regression is turned into a Bayesian problem: the unknown coefficients b are estimated conditional on the observed data set (x, y).
- With a non-informative prior for b, the solution is the same as the classical one.
- With informative priors, however, there is in general no closed-form solution.
- As before, we can practice the Bayesian approach and validate the results against the classical solution in the non-informative-prior case.

Statistical definition of the data
- Assuming the data are normally distributed with mean given by the regression equation, the data distribution is
  y_i ~ N(b1 + b2*x_i, s2).

Parameters to be estimated
- Regression coefficients b = [b1, b2] (playing a role analogous to the mean m in the normal-distribution problem) and the variance s2.

Joint posterior pdf of b, s2

- Non-informative prior: p(b, s2) proportional to 1/s2.
- Likelihood of the data y: p(y | b, s2) proportional to s^(-n) * exp( -sum_i (y_i - b1 - b2*x_i)^2 / (2*s2) ).
- Joint posterior pdf of b = [b1, b2] and s2 (a three-parameter problem): p(b, s2 | y) proportional to p(b, s2) * p(y | b, s2).
- Compare with the posterior pdf of the normal-distribution parameters m, s2.

Joint posterior pdf of b, s2

Analytical procedure
- Factorization: p(b, s2 | y) = p(b | s2, y) * p(s2 | y).
- Marginal pdf of s2: s2 | y follows a scaled inverse-chi^2 distribution with n-2 dof and scale equal to the classical error variance.
- Conditional pdf of b: b | s2, y ~ N(b_hat, s2*(X'X)^-1), where b_hat is the least-squares estimate.
- Posterior predictive distribution of ỹ at a new point.

Sampling method based on the factorization approach
1. Draw a random s2 from the scaled inverse-chi^2 distribution.
2. Draw a random b from the conditional pdf b | s2.
3. Draw a predictive ỹ at a new point using the distribution of ỹ | y.
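The three sampling steps can be sketched in Python/NumPy. The data and the new point x0 = 0.5 are taken from the practice example (x0 is an illustrative choice, not from the slides), and the scaled inverse-chi^2 draw is implemented as (n-2)*s2/chi2(n-2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Data from the practice example
y = np.array([0.95, 1.08, 1.28, 1.23, 1.42, 1.45])
x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
n = len(y)
X = np.column_stack([np.ones(n), x])
k = X.shape[1]

# Classical quantities that appear in the factorized posterior
XtX_inv = np.linalg.inv(X.T @ X)
b_hat = XtX_inv @ X.T @ y
s2 = np.sum((y - X @ b_hat) ** 2) / (n - k)

N = 10_000
# Step 1: s2 | y ~ scaled inverse-chi^2(n-k, s2), i.e. (n-k)*s2 / chi2(n-k)
sig2 = (n - k) * s2 / rng.chisquare(n - k, size=N)

# Step 2: b | s2, y ~ N(b_hat, s2*(X'X)^-1), via a Cholesky factor of (X'X)^-1
L = np.linalg.cholesky(XtX_inv)
b = b_hat + np.sqrt(sig2)[:, None] * (rng.standard_normal((N, k)) @ L.T)

# Step 3: predictive y~ at a new point x0 = 0.5 (illustrative)
x0 = np.array([1.0, 0.5])
y_pred = b @ x0 + np.sqrt(sig2) * rng.standard_normal(N)

print(b.mean(axis=0))   # close to b_hat ≈ [0.987, 0.496]
```

Because every draw of b uses its own sig2, the marginal samples of b have the heavier-tailed t-like spread that the factorization implies, not a plain normal spread.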

Practice example

Joint posterior pdf of b, s2
- This is a function of three parameters. In order to draw the shape of the pdf, let's fix s = 0.06.
- The location of the maximum over be = [b1 b2] is near [1 0.5], which agrees with the true values.

Data:
y = [0.95, 1.08, 1.28, 1.23, 1.42, 1.45]';
x = [0 0.2 0.4 0.6 0.8 1.0]';
X = [ones(n,1) x];

Practice example

Sampling by MCMC
- Using N = 1e4 samples, starting from b = [0; 0] and s = 1, the MCMC iterations converge for both b and s.
- The samples from the initial stage, however, should be discarded; this is called burn-in.
- The posterior mode of b is found near [1; 0.5], and that of s near 0.06, which agree with the true values.
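The slide's MATLAB implementation is not shown, so as a hedged sketch, a random-walk Metropolis sampler for (b1, b2, s) in Python might look like the following. The proposal width w = 0.05 and the prior p(b, s) proportional to 1/s (equivalent to the non-informative 1/s2 prior on s2) are assumptions; the starting point b = [0; 0], s = 1 matches the slide.

```python
import numpy as np

rng = np.random.default_rng(1)

y = np.array([0.95, 1.08, 1.28, 1.23, 1.42, 1.45])
x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
n = len(y)

def log_post(b1, b2, sig):
    """Log joint posterior: normal likelihood plus prior p(b, sig) ~ 1/sig."""
    if sig <= 0:
        return -np.inf
    resid = y - b1 - b2 * x
    return -n * np.log(sig) - resid @ resid / (2 * sig**2) - np.log(sig)

N, w = 20_000, 0.05                        # chain length and proposal width (tuning choice)
chain = np.empty((N, 3))
theta = np.array([0.0, 0.0, 1.0])          # start at b = [0; 0], s = 1 as on the slide
lp = log_post(*theta)
for i in range(N):
    prop = theta + w * rng.standard_normal(3)   # random-walk proposal
    lp_prop = log_post(*prop)
    if np.log(rng.random()) < lp_prop - lp:     # Metropolis accept/reject
        theta, lp = prop, lp_prop
    chain[i] = theta

burned = chain[N // 2:]                    # discard the first half as burn-in
print(burned.mean(axis=0))                 # posterior means of b1, b2, s
```

The symmetric Gaussian proposal keeps the acceptance ratio to a simple posterior ratio; with an asymmetric proposal the full Metropolis-Hastings correction would be needed.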

Practice example

Sampling by MCMC
- Using N = 1e4 samples, the MCMC run is repeated ten times. The variance across the ten runs is favorably small, which shows that the sampled distribution can be accepted as the solution.

Practice example

Sampling by MCMC
- A poorly chosen width w of the proposal pdf leads to convergence failure: too small a w explores the posterior very slowly, while too large a w causes most proposals to be rejected.

Practice example

Sampling by MCMC
- Running chains from different starting points of b is recommended, to check convergence and to confirm that the same result is obtained.

Practice example

Posterior analysis
- Posterior distribution of the regression: using the samples B1 and B2, samples of ym are generated, where ym = B1 + B2*x. The blue curve is the mean of ym; the red curves are the confidence bounds of ym (the 2.5% and 97.5% quantiles of the samples).
- Posterior predictive distribution: using the samples of ym and S, samples of the predicted y are generated, i.e., yp ~ N(ym, s2).
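This posterior analysis can be sketched in Python. Here the samples of b and s2 are regenerated with the factorization approach rather than MCMC (an assumption; both target the same posterior), and the 2.5%/97.5% bounds are read off with percentiles over a grid of x values.

```python
import numpy as np

rng = np.random.default_rng(2)

y = np.array([0.95, 1.08, 1.28, 1.23, 1.42, 1.45])
x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])
n = len(y)
X = np.column_stack([np.ones(n), x])
k = X.shape[1]

# Posterior samples via the factorization p(s2|y) p(b|s2,y)
XtX_inv = np.linalg.inv(X.T @ X)
b_hat = XtX_inv @ X.T @ y
s2 = np.sum((y - X @ b_hat) ** 2) / (n - k)
N = 10_000
sig2 = (n - k) * s2 / rng.chisquare(n - k, size=N)
L = np.linalg.cholesky(XtX_inv)
B = b_hat + np.sqrt(sig2)[:, None] * (rng.standard_normal((N, k)) @ L.T)

xg = np.linspace(0, 1, 11)
# Samples of the regression line ym = B1 + B2*x at each grid point
ym = B[:, [0]] + B[:, [1]] * xg
# Predictive samples yp ~ N(ym, s2)
yp = ym + np.sqrt(sig2)[:, None] * rng.standard_normal((N, len(xg)))

conf = np.percentile(ym, [2.5, 97.5], axis=0)   # confidence bounds ("red curves")
pred = np.percentile(yp, [2.5, 97.5], axis=0)   # predictive bounds

print(ym.mean(axis=0)[5])   # mean regression value at x = 0.5, ≈ 1.235
```

Plotting ym.mean(axis=0) against xg together with conf and pred reproduces the blue mean curve and the two pairs of bounds described above.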

Confidence vs prediction interval

Classical regression
- The confidence interval comes from the variance of the regression.
- The prediction interval comes from the variance of the prediction.

Bayesian approach to regression
- The confidence interval comes from the posterior distribution of the regression.
- The predictive interval comes from the posterior predictive distribution.

Bayesian approach to the normal distribution
- The confidence interval comes from the t-distribution with n-1 dof, with mean ȳ and variance s2/n.
- The predictive interval comes from the t-distribution with n-1 dof, with mean ȳ and variance s2/n + s2.
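For the normal-distribution case in the last two bullets, both t-intervals can be computed directly. This Python sketch uses the slide's practice data purely as an illustrative sample; the t quantile for n-1 = 5 dof is taken from a standard t-table.

```python
import numpy as np

y = np.array([0.95, 1.08, 1.28, 1.23, 1.42, 1.45])
n = len(y)
ybar = y.mean()
s2 = y.var(ddof=1)        # sample variance with n-1 dof

tq = 2.5706               # t_{0.975, 5}: 97.5% quantile of t with n-1 = 5 dof (t-table)

# Confidence interval for the mean: t-distribution with variance s2/n
ci = (ybar - tq * np.sqrt(s2 / n), ybar + tq * np.sqrt(s2 / n))

# Predictive interval for a new observation: variance s2/n + s2
pi = (ybar - tq * np.sqrt(s2 / n + s2), ybar + tq * np.sqrt(s2 / n + s2))

print(ci)  # ≈ (1.03, 1.44)
print(pi)  # ≈ (0.70, 1.77)
```

The extra s2 term in the predictive variance accounts for the scatter of a single new observation on top of the uncertainty in the mean, which is why the predictive interval always contains the confidence interval.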