P Values - part 3 The P value as a ‘statistic’ Robin Beaumont 1/03/2012 With much help from Professor Geoff Cumming.

Presentation transcript:


P values - Putting it all together

Review: summary so far. A P value is a conditional probability which considers a range of outcomes, shown as an ‘area’ in a graph. The SEM formula allows us to predict the accuracy of our estimate (i.e. the mean value of our sample) across an infinite number of samples!

What is a statistic? Summary so far: a statistic is just a summary measure. Technically, we have reduced a set of data to one or two values. Examples: range (smallest to largest); mean, median etc.; inter-quartile range, SD, variance; Z score, T value, chi square value, F value etc.; P value.

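T value. The T statistic comes in different types; the simplest is the one sample version: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{\text{observed difference}}{\text{SEM}}$, where $\bar{x}$ is the sample mean, $\mu_0$ the hypothesised population mean, $s$ the sample SD and $n$ the sample size. So when t = 0 (0/anything = 0), the estimated and hypothesised population means are equal. When t = 1, the observed difference is the same size as the SEM. When t = 10, the observed difference is much greater than the SEM (ten times as large).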

T statistic example. Serum amylase values from a random sample of 15 apparently healthy subjects: mean = 96, SD = 35 units/100 ml. How likely is it that such an ‘unusual’ sample would be obtained from a population of serum amylase determinations with a mean of 120? (Adapted from Daniel 1991, p. 202.) This looks like a rare occurrence? The population value = the null hypothesis.

[Figure: t density with the tail areas shaded, annotated with the standard error, n, t and the shaded area, and with a second axis in original units.]

What does the shaded area mean? In t units: given that the sample was obtained from a population with a mean of 120, a sample (n = 15) with a T statistic of about −2.66 or +2.66, or one more extreme, will occur 1.8% of the time, that is just under two samples per hundred on average. In original units: given that the sample was obtained from a population with a mean of 120, a sample of 15 producing a mean of 96 (120 − x where x = 24) or 144 (120 + x where x = 24), or one more extreme, will occur 1.8% of the time. This is the P value.

P value = $2 \cdot P(t_{n-1} < t \mid H_0 \text{ is true})$ = 2 · [area to the left of t under a t distribution with df = n − 1], where the observed t is negative, as it is here.
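The shaded area can be checked numerically. The following is a minimal sketch, not the author's own calculation: it uses only the summary figures from the slide (mean 96, SD 35, n 15, hypothesised mean 120), and the helper names `t_pdf`, `t_cdf` and `one_sample_t` are made up for illustration. The t distribution area is obtained by Simpson's rule so that no external library is assumed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=20000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry around zero."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def one_sample_t(mean, sd, n, mu0):
    """Return (SEM, t, two-sided P value) for a one sample t test."""
    sem = sd / math.sqrt(n)               # standard error of the mean
    t = (mean - mu0) / sem                # observed difference / SEM
    p = 2 * t_cdf(-abs(t), n - 1)         # both tails, df = n - 1
    return sem, t, p

# Summary figures taken from the serum amylase slide.
sem, t, p = one_sample_t(mean=96, sd=35, n=15, mu0=120)
print(f"SEM = {sem:.3f}, t = {t:.3f}, two-sided P = {p:.4f}")
```

The printed P value should land just under 0.02, in line with the "just under two samples per hundred" reading of the shaded area.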

P value and probability for the one sample t statistic. P value = $2 \times P(t_{n-1} \text{ values more extreme than the obtained } t_{n-1} \mid H_0 \text{ is true})$ = 2 × [area to the left of t under a t distribution whose shape is set by n − 1 degrees of freedom]. The chain is: statistic -> sampling distribution -> PDF -> P value. No sampling distribution? Create a virtual one.

P value variability. If we take another random sample, the P value will be different. How different? It does not follow a normal distribution. It depends upon the probability of the null hypothesis being true! Remember that we have assumed so far that the null hypothesis is true. See the "dance of the p values" (Geoff Cumming).

Simplified dance of the p values when the null hypothesis is true. Example from Geoff Cumming's dance of the p values. The take home message is that we can obtain very small p values even when the null hypothesis is true.
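A small simulation makes the same point. This is only a sketch under simplifying assumptions that are not on the slide: the data are normal, the population SD is treated as known (so a z test stands in for the t test), and the population mean 120 and SD 35 are borrowed from the serum amylase example purely for illustration. The function name `simulate_p_values` is hypothetical.

```python
import random
from statistics import NormalDist

def simulate_p_values(n_experiments=10000, n=15, mu0=120.0, sigma=35.0, seed=1):
    """Two-sided P values from repeated samples drawn with the null TRUE."""
    rng = random.Random(seed)             # fixed seed so runs are repeatable
    std_normal = NormalDist()
    sem = sigma / n ** 0.5                # known-sigma standard error (assumption)
    p_values = []
    for _ in range(n_experiments):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / sem
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))
    return p_values

p_values = simulate_p_values()
share_below_05 = sum(p < 0.05 for p in p_values) / len(p_values)
share_below_01 = sum(p < 0.01 for p in p_values) / len(p_values)
print(f"smallest P value seen: {min(p_values):.5f}")
print(f"share of P values below 0.05: {share_below_05:.3f}")
print(f"share of P values below 0.01: {share_below_01:.3f}")
```

When the null hypothesis is true the P value is uniformly distributed between 0 and 1, so roughly 5% of the simulated experiments should fall below 0.05 and roughly 1% below 0.01, even though there is no real effect.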

The P value is a statistic, but not all statistics represent values that are reflected in a population value. Why is there no CI for the P value if it varies across trials? Other ways of getting an idea of variability across trials: the Reproducibility Probability Value (RP). Sources: Goodman 1992 and also 2001 (journal articles); Hung, O’Neill, Bauer & Kohne 1997 (Biometrics); Shao & Chow 2002 (Statistics in Medicine); Boos & Stefanski 2011 (The American Statistician); Cumming, and his book.

Cumming’s Reproducibility (replication) Probability Value. Given an obtained P = 0.05, what is the interval in which we are likely to see 80% of subsequent P values? Answer: we have an 80% chance of seeing subsequent P values fall within the interval (0, 0.22) [one sided]. This means that there is a 20% chance of a subsequent P value being > 0.22.
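The 0.22 boundary can be reproduced with a short calculation. This is a sketch under assumptions that the slide does not spell out and that are stated here only as a plausible reading of Cumming's approach: normally distributed data with known SD, a replication of the same size as the original, a replication z value that differs from the obtained z by a normal deviate with SD √2, an obtained P that is two-tailed, and a replication P that is one-tailed. The function name `one_sided_p_interval` is hypothetical.

```python
from statistics import NormalDist

def one_sided_p_interval(p_obs_two_tailed, coverage=0.80):
    """Upper limit below which `coverage` of replication P values should fall."""
    std_normal = NormalDist()
    # z value matching the obtained two-tailed P (1.96 for P = 0.05)
    z_obs = std_normal.inv_cdf(1 - p_obs_two_tailed / 2)
    # Assumption: z_replication - z_obs is normal with SD sqrt(2), so the
    # smallest replication z we expect with probability `coverage` is:
    z_floor = z_obs - std_normal.inv_cdf(coverage) * 2 ** 0.5
    # Assumption: the replication P value is one-tailed.
    upper_p = 1 - std_normal.cdf(z_floor)
    return 0.0, upper_p

lower, upper = one_sided_p_interval(0.05)
print(f"80% one-sided p interval: ({lower:.2f}, {upper:.2f})")
```

Under these assumptions the upper limit comes out close to 0.22, matching the boundary quoted on the slide; changing `coverage` or the obtained P shows how wide the spread of replication P values can be.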

What about when the null hypothesis is not true?