A Bayesian χ² test for goodness of fit (10/23/09, Multilevel RIT)


Overview: Talk about the basic χ² test and review it with some examples, then talk about the paper, again with examples.

Basic  2 test The  2 test is used to test if a sample of data came from a population with a specific distribution. An attractive feature of the  2 goodness-of-fit test is that it can be applied to any univariate distribution for which you can calculate the CDF. y1y1 y3y3 y2y2 y4y4 ynyn y5y5

The value of the χ² statistic depends on how you partition the support, and the sample size needs to be large enough for the asymptotic χ² approximation to be valid.

The χ² statistic, in the case of the simple hypothesis, is

R = Σ_{k=1}^{K} (m_k − n p_k)² / (n p_k),

where n is the sample size, K is the number of partitions or bins specified over the sample space, p_k is the probability assigned by the null model to the k-th interval, and m_k is the number of observations within the k-th bin. As n goes to infinity, R converges in distribution to a χ² random variable with K − 1 degrees of freedom.
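As a rough illustration of this statistic (not code from the talk), the sketch below computes R with K equiprobable cells under a fully specified null; the helper name and the choice of cells are my own.

# Minimal sketch of the classical chi-square goodness-of-fit statistic for a
# simple (fully specified) null hypothesis, using K equiprobable cells.
import numpy as np
from scipy import stats

def chi_square_stat(y, K, null_cdf):
    """R = sum_k (m_k - n*p_k)^2 / (n*p_k), compared to chi^2 with K-1 df."""
    y = np.asarray(y)
    n = len(y)
    u = null_cdf(y)                          # probability integral transform
    edges = np.linspace(0.0, 1.0, K + 1)     # cell boundaries a_0 < ... < a_K
    m, _ = np.histogram(u, bins=edges)       # observed counts m_k
    p = np.diff(edges)                       # null cell probabilities p_k = 1/K
    R = np.sum((m - n * p) ** 2 / (n * p))
    p_value = stats.chi2.sf(R, df=K - 1)     # K - 1 degrees of freedom
    return R, p_value

# Example: 1000 standard-normal draws tested against N(0, 1) itself
rng = np.random.default_rng(0)
print(chi_square_stat(rng.normal(size=1000), K=20, null_cdf=stats.norm.cdf))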

Four examples. We generate four sets of random variables:
1) 1000 normal
2) 1000 double exponential
3) 1000 t-distributed with 3 degrees of freedom
4) 1000 lognormal
We use the χ² test to see whether each data set fits a normal distribution. H_0: the data come from a normal distribution. (A simulation sketch follows below.)
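A sketch of that simulation (the seed and number of cells are my choices, and it reuses chi_square_stat from the previous sketch). Each sample is standardized by its own mean and standard deviation, so the null is fitted rather than fully specified; the appropriate degrees-of-freedom correction for that case is the subject of the next slide.

# Reuses chi_square_stat from the sketch above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
samples = {
    "normal":             rng.normal(size=1000),
    "double exponential": rng.laplace(size=1000),
    "t, 3 df":            rng.standard_t(df=3, size=1000),
    "lognormal":          rng.lognormal(size=1000),
}
for name, y in samples.items():
    z = (y - y.mean()) / y.std(ddof=1)       # crude fit of the normal null
    R, p = chi_square_stat(z, K=20, null_cdf=stats.norm.cdf)
    print(f"{name:18s}  R = {R:8.1f}   p = {p:.3g}")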

The χ² statistic, in the case of a composite hypothesis, is

R̂ = Σ_{k=1}^{K} (m_k − n p̂_k)² / (n p̂_k),

where the p̂_k are estimates of the bin probabilities based either on the MLE for the grouped data or on the minimum χ² method. As n goes to infinity, R̂ converges in distribution to a χ² random variable with K − s − 1 degrees of freedom, where s is the dimension of the underlying parameter vector θ.

[Slide showing a computed value of the statistic: 5.73.]

The MLE for the grouped data maximizes the grouped-data (multinomial) likelihood, proportional to Π_k p_k(θ)^{m_k}, with respect to θ, while minimum χ² estimation finds the value of θ that minimizes a χ² criterion R_g(θ). The sketch below contrasts the two estimators.
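This is a hedged sketch with my own helper names, for grouped normal data: it fits (μ, σ) both by maximizing the grouped-data multinomial log-likelihood and by minimizing the χ² criterion R_g(θ). The parameterization uses log σ so the scale stays positive during optimization.

# Sketch contrasting grouped-data MLE and minimum chi-square estimation for a
# normal model; theta = (mu, log sigma).
import numpy as np
from scipy import stats, optimize

def bin_probs(theta, edges):
    mu, log_sigma = theta
    cdf = stats.norm.cdf(edges, loc=mu, scale=np.exp(log_sigma))
    return np.clip(np.diff(cdf), 1e-12, None)

def grouped_neg_loglik(theta, m, edges):
    return -np.sum(m * np.log(bin_probs(theta, edges)))      # multinomial log-likelihood

def chi_square_criterion(theta, m, edges):
    n, p = m.sum(), bin_probs(theta, edges)
    return np.sum((m - n * p) ** 2 / (n * p))                 # R_g(theta)

rng = np.random.default_rng(2)
y = rng.normal(loc=1.0, scale=2.0, size=1000)
edges = np.concatenate(([-np.inf], np.quantile(y, np.linspace(0.1, 0.9, 9)), [np.inf]))
m = np.array([np.sum((y > lo) & (y <= hi)) for lo, hi in zip(edges[:-1], edges[1:])])
start = np.array([y.mean(), np.log(y.std(ddof=1))])
for name, crit in [("grouped MLE", grouped_neg_loglik), ("minimum chi-square", chi_square_criterion)]:
    mu, log_sigma = optimize.minimize(crit, start, args=(m, edges), method="Nelder-Mead").x
    print(f"{name}: mu = {mu:.3f}, sigma = {np.exp(log_sigma):.3f}")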

A Bayesian χ² statistic. Let y_1, ..., y_n (= y) denote scalar-valued, continuous, identically distributed, conditionally independent observations drawn from the pdf f(y|θ), where θ is an s-dimensional parameter vector with θ ∈ Θ ⊂ R^s. We want to generate a sampled value θ̃ from the posterior p(θ | y). To do that, we can apply the inverse of the probability integral transform method.

Set up the corresponding posterior-CDF integrals and then solve for the θ̃'s. Generally, in practice, the θ̃'s are calculated using the Gibbs sampler.

Notation considerations: θ̃ denotes a value of θ sampled from the posterior distribution based on y, while θ̂ denotes the MLE.

This is interesting because, if you contrast R^B with R̂, you see that R̂ has K − s − 1 degrees of freedom while R^B has K − 1 degrees of freedom: the reference distribution of R^B is independent of the number of parameters.

The process is:
1) Have data y_1, ..., y_n.
2) Generate θ̃ from the posterior given y_1, ..., y_n (by the inverse probability integral transform or by the Gibbs sampler).
3) Create the bin counts m_k(θ̃).
4) Calculate R^B.
5) Repeat steps 2 to 4 to get many R^B values.
6) By the law of large numbers, the empirical distribution of these R^B values approximates the posterior distribution of R^B, which can then be compared with the reference χ²_{K−1} distribution.
(A generic sketch of steps 2 through 6 follows.)
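A minimal, generic sketch of steps 2 through 6 (the names and defaults are mine, not the paper's). The caller supplies draw_posterior, for example a Gibbs sampler or a direct draw, and null_cdf, the model CDF evaluated at a posterior draw; the cell boundaries a_0 < ... < a_K are placed on the probability scale so the cells are equiprobable under the fitted model.

import numpy as np
from scipy import stats

def bayesian_chi_square(y, draw_posterior, null_cdf, K=10, n_draws=1000, rng=None):
    """Return sampled R^B values and the fraction exceeding the chi^2_{K-1} 0.95 quantile."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y)
    n = len(y)
    edges = np.linspace(0.0, 1.0, K + 1)              # a_0 < a_1 < ... < a_K
    p = np.diff(edges)                                # p_k = a_k - a_{k-1}
    R_B = np.empty(n_draws)
    for j in range(n_draws):
        theta = draw_posterior(y, rng)                # step 2: theta~ from p(theta | y)
        u = null_cdf(y, theta)                        # step 3: data through the fitted CDF
        m, _ = np.histogram(u, bins=edges)            #         bin counts m_k(theta~)
        R_B[j] = np.sum((m - n * p) ** 2 / (n * p))   # step 4
    exceed = np.mean(R_B > stats.chi2.ppf(0.95, df=K - 1))   # steps 5-6
    return R_B, exceed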

We can then report the proportion of R^B values that exceeded the 95th percentile of the reference χ² distribution with K − 1 degrees of freedom. If the proportion is higher than expected, the excess can be attributed to dependence between the R^B values or to lack of fit. If the R^B values did represent independent draws from the χ² distribution, then the proportion of values falling in the critical region of the test would exactly equal the size of the test.

The statistic A is used in the event that formal significance tests must be performed to assess model adequacy. A is related to a commonly used quantity in signal detection theory and represents the area under the ROC curve [e.g., Hanley and McNeil (1982)] for comparing the joint posterior distribution of the R^B values to a χ²_{K−1} random variable. The expected value of A, taken with respect to the joint sampling distribution of y and the posterior distribution of θ given y, is 0.5. Large deviations of the expected value of A from 0.5, when the expectation is taken with respect to the posterior distribution of θ for a fixed value of y, indicate lack of model fit.
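One illustrative way to estimate such an area (not the paper's exact computation) is the Mann-Whitney form of the AUC, comparing the sampled R^B values to an equal number of independent χ²_{K−1} draws; the function name and this Monte Carlo comparison are my own.

import numpy as np
from scipy import stats

def roc_area(R_B, K, rng=None):
    """Estimate A = P(R^B > X) + 0.5*P(R^B = X) for X ~ chi^2_{K-1} (Mann-Whitney / AUC form)."""
    rng = np.random.default_rng() if rng is None else rng
    R_B = np.asarray(R_B, dtype=float)
    ref = stats.chi2.rvs(df=K - 1, size=len(R_B), random_state=rng)
    diff = R_B[:, None] - ref[None, :]        # all pairwise comparisons
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)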

Some things to keep in mind:
- Unfortunately, approximating the sampling distribution of A can be a lot of trouble.
- How do you decide how many bins to use, and how do you assign probabilities to these bins? Consistency of tests against general alternatives requires that K → ∞ as n → ∞.
- Having too many bins can result in loss of power.
- Mann and Wald suggested using 3.8(n − 1)^0.4 equiprobable cells (a one-line helper follows).
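The Mann-Wald rule of thumb quoted in the last bullet, as a small helper (the rounding and the floor of 3 cells are my choices).

def mann_wald_bins(n):
    """Suggested number of equiprobable cells: about 3.8 * (n - 1) ** 0.4."""
    return max(3, int(round(3.8 * (n - 1) ** 0.4)))

print(mann_wald_bins(1000))   # roughly 60 cells for n = 1000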

Example. Let y = (y_1, ..., y_n) denote a random sample from a normal distribution with unknown μ and σ². Assume a joint prior for (μ, σ²) proportional to 1/σ².

For a given data vector y and posterior sample (μ̃, σ̃), the bin counts m_k(μ̃, σ̃) are determined by counting the number of observations y_i that fall into the interval (σ̃ Φ⁻¹(a_{k−1}) + μ̃, σ̃ Φ⁻¹(a_k) + μ̃), where Φ⁻¹(·) denotes the standard normal quantile function. Based on these counts, R^B(μ̃, σ̃) is calculated according to

R^B(μ̃, σ̃) = Σ_{k=1}^{K} [m_k(μ̃, σ̃) − n p_k]² / (n p_k).
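A worked sketch of this example (the seed, bin count, and choice of non-normal test data are mine). Under the prior proportional to 1/σ², the posterior factorizes as σ² | y ~ Inv-χ²(n−1, s²) and μ | σ², y ~ N(ȳ, σ²/n), so posterior draws are direct and no Gibbs sampler is needed here; counting the y_i in (σ̃ Φ⁻¹(a_{k−1}) + μ̃, σ̃ Φ⁻¹(a_k) + μ̃] is equivalent to binning Φ((y_i − μ̃)/σ̃) with the edges a_k.

import numpy as np
from scipy import stats

def draw_normal_posterior(y, rng):
    """One draw of (mu, sigma) under the prior proportional to 1/sigma^2."""
    n, ybar, s2 = len(y), y.mean(), y.var(ddof=1)
    sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)       # scaled inverse chi-square
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    return mu, np.sqrt(sigma2)

rng = np.random.default_rng(3)
y = rng.standard_t(df=3, size=1000)                    # deliberately non-normal data
K = 10
a = np.linspace(0.0, 1.0, K + 1)                       # equiprobable a_k
p = np.diff(a)
R_B = np.empty(1000)
for j in range(len(R_B)):
    mu, sigma = draw_normal_posterior(y, rng)
    u = stats.norm.cdf((y - mu) / sigma)               # same counts as the interval formula above
    m, _ = np.histogram(u, bins=a)
    R_B[j] = np.sum((m - len(y) * p) ** 2 / (len(y) * p))
print("fraction of R^B above the chi^2_{K-1} 0.95 quantile:",
      np.mean(R_B > stats.chi2.ppf(0.95, df=K - 1)))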

Power calculation. The next figure displays the proportion of times, in 10,000 draws of t samples, that the test statistic A was larger than the 0.95 quantile of the sampled values of A_pp (A_pp comes from posterior predictive observations of y).

Main advantages: Goodness-of-fit tests based on the statistic R^B provide a simple way of assessing the adequacy of model fit in many Bayesian models; essentially, the only requirement for their use is that the observations be conditionally independent. Values of R^B generated from a posterior distribution may prove useful both as a convergence diagnostic for MCMC algorithms and for detecting errors in the computer code written to implement those algorithms. From a computational perspective, such statistics can be calculated in a straightforward way using output from existing MCMC algorithms.

There is a later paper written in 2007 that uses the same methodology, but applied to censored data.

Bayesian chi-square TTE fit: using Bayesian chi-square tests to assess goodness of fit for time-to-event data. This software computes the Bayesian chi-square test of Valen Johnson [1] for right-censored time-to-event data. It tests the goodness of fit of the best fit to the data from the following distribution families: exponential, gamma, inverse gamma, Weibull, log-normal, log-logistic, and log odds rate.

Bayesian chi-square test results (first run).
Input options: file sample1.txt; number of bins: 16 (default); discrete time: yes; RNG seed: from system time; notation: 0 for alive and 1 for dead.
Bayesian chi-square and related statistics: for each fitted distribution (gamma, log odds rate, log-logistic, log-normal, Weibull, inverse gamma, exponential), the output reports mean X2, var X2, 95th percentile, p-value bound, BIC, DIC, and DIC # parameters. [The numeric entries of this and the following output tables were not preserved in the transcript.]

Definitions of the output columns:
- mean X2 is the Bayesian chi-square (BCS) value, the mean of the chi-square values from 1000 samples from the posterior.
- var X2 is the corresponding sample variance of the chi-square values.
- 95th percentile is this order statistic of the chi-square samples.
- p-value bound is the upper bound on the p-value corresponding to the order statistic, using Rychlik's inequality.
- BIC is the 'Bayesian' information criterion.
- DIC is the deviance information criterion.
- DIC # parameters is the number of effective parameters as measured by the DIC.

Distribution parameters: for each distribution (gamma, log odds rate, log-logistic, log-normal, Weibull, inverse gamma, exponential), the output lists the fitted values of param1, param2, and param3. This output was produced by BCSTTE, Bayesian Chi-Square TTE fit.

Bayesian chi-square test results (second run).
Input options: file sample2.txt; number of bins: 5; discrete time: no; RNG seed: 12345; notation: 0 for uncensored and 1 for censored.
Bayesian chi-square and related statistics: for each fitted distribution (gamma, log-logistic, log odds rate, log-normal, Weibull, inverse gamma, exponential), the output reports the same columns as above (mean X2, var X2, 95th percentile, p-value bound, BIC, DIC, DIC # parameters).

Distribution parameters: fitted values of param1, param2, and param3 for each of gamma, log-logistic, log odds rate, log-normal, Weibull, inverse gamma, and exponential.

Here is the math. That’s most of it…

Thanks for coming to the talk. Reference: Cao, J., Moosman, A., and Johnson, V. E. (2008). 'A Bayesian Chi-Squared Goodness-of-Fit Test for Censored Data Models.' UT MD Anderson Cancer Center Department of Biostatistics Working Paper Series.