
To Add: Make it clearer that excluding variables from a model because they are not “predictive” removes all meaning from a CI, since a CI is defined over infinite repetitions. Talk about statistics being based on sampling variability: this assumes the sample is a random sample of some super-population (even if narrowly defined), but our samples are not random samples; subjects self-select, so we could not have drawn infinite samples.

Random Error I: p-values, confidence intervals, hypothesis testing, etc. Matthew Fox Advanced Epidemiology

Do you like/use p-values?

What is a relative risk? What is a p-value?

Table 1 of a randomized trial of asbestos and LC

Factor        Asbestos   No Asbestos   p-value
Female           10%         25%        0.032
Smoking          60%         40%        0.351
>60 yrs           5%          7%        0.832
HBP              25%         24%        0.765
Alcohol use      37%         45%        0.152

Which result is more precise? RR 2.0 (95% CI: 1.0 – 4.0) RR 5.0 (95% CI: 2.5 – 10.0)

RR 2.0 (95% CI: 1.0 – 4.0) What are the chances the true result is between 1.0 and 4.0?

If yes, what does it mean to be “by chance”? What is it that is caused by chance? In a randomized trial, could the finding be by chance?

This Morning Randomization – Why do we do it? P-values – What are they? – How do we calculate them? – What do they mean? Confidence Intervals

Last Session Selection bias – Results from selection into or out of a study related to both exposure and outcome – Structural: conditioning on common effects – Adjustment for selection proportions – Weighting for LTFU Matching – In a case-control study, matching creates selection bias by design, so it must be controlled in the analysis

“There’s a certain feeling of ease and pleasure for me as a scientist that any way you slice the data, it’s statistically significant,” said Dr. Anthony S. Fauci, a top AIDS expert in the United States government, which paid most of the trial’s costs.

Randomization Randomization lends meaning to likelihoods, p-values and confidence intervals It can reduce the probability of severe confounding to an acceptable level But randomization does not prevent confounding

Greenland: Randomization, statistics and causal inference Objective: – Clarify the meaning and limitations of inferential statistics in the absence of randomization Example: lidocaine therapy after acute MI – Patient 1: doomed – Patient 2: immune – Lidocaine therapy assigned at random – The two possible results are equally likely

Greenland: Randomization, statistics and causal inference True RD = 0, so both possible results are confounded Expectation = 0 = (1 + -1)/2 – Statistically unbiased (expectation equals truth) Conclusions – Randomization does not prevent confounding – Randomization does provide a known probability distribution for the possible results under a specified hypothesis about the effect – Statistical unbiasedness of randomized exposure corresponds to an average confounding of zero over the distribution of results

Probability Theory With an assigned probability distribution, we can calculate an expectation The expectation does not have to be in the set of possible outcomes – Here, the expectation equals zero
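The doomed/immune example from the previous slide can be checked numerically; a minimal sketch in Python (values taken from the lidocaine example):

```python
# Two patients: one "doomed" (dies whether or not treated) and one
# "immune" (survives either way), so the true risk difference is 0.
# Randomization assigns lidocaine to one of them, each with probability 1/2.

# Assignment A: doomed patient treated  -> observed RD = 1 - 0 = +1
# Assignment B: immune patient treated  -> observed RD = 0 - 1 = -1
possible_estimates = [+1, -1]

# The expectation over the randomization distribution is zero (statistically
# unbiased), even though every individual result is confounded and zero
# itself is not among the possible outcomes.
expectation = sum(possible_estimates) / len(possible_estimates)
print(expectation)  # 0.0
```

Note that the expectation (0.0) is not in the set of possible results, which is exactly the point of the slide above.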

Probability If we randomize and assume the null is true (as we do when calculating p-values) – We expect half of the subjects to be exposed and half of the events to be among the exposed If there is truly no effect of exposure, all data combinations (permutations) are possible – Everyone was either type 1 (doomed) or type 4 (immune) – All the events (deaths) would occur regardless of whether the subject was assigned the exposure or not

Probability Theory The probability of each possible data result in a 2x2 table is: – A function of the number of combinations (a permutation implies order matters; here it does not) – The probability of each result is the number of ways to assign X subjects to exposure out of Y, and A events to the exposed out of B total events – Assumes the margins are fixed

        E+    E-    Total
D+       ?     ?      100
D-       ?     ?      900
Total                1000

With fixed margins, how many parameters (cells) do I need to estimate to fill in the entire table?

Greenland: Randomization, statistics and causal inference What comfort does this provide scientists trying to interpret a single result? We can make the probability of severe confounding small by increasing the sample size

[2x2 table of exposure (E+/E-) by disease (D+/D-) with per-group risks; cell counts lost in extraction; RR = 0.43, RD = -0.08]

Greenland: Randomization, statistics and causal inference Given there were 100 cases and an even distribution of exposed and unexposed, how many cases would we expect to be exposed?

[2x2 table as on the previous slide; RR = 0.43, RD = -0.08]

Greenland: Randomization, statistics and causal inference What comfort does this provide scientists trying to interpret a single result? Can make probability of severe confounding small by increasing the sample size Probability under the null that randomization would yield a result with at least as much downward confounding as the observed result

Back to the counterfactual If the association we measure differs from the truth, even if by chance, what explains it? – The unexposed can’t stand in for what would have happened to the exposed had they been unexposed – This is confounding – But on average there is zero confounding – This gives us a probability distribution for calculating the probability that confounding explains the results – This is a p-value

Randomized trial of E on D in 4 patients We find:

        E+   E-
D+       2    0
D-       0    2
Total    2    2

Randomized trial of E on D in 4 patients If the null is true, what CST types must they be?

        E+   E-
D+       2    0
D-       0    2
Total    2    2

Hypergeometric distribution The hypergeometric distribution:

P(X = x) = C(M, x) × C(N - M, n - x) / C(N, n),  where C(a, b) = a! / (b! (a - b)!)

Here X = the random variable, x = the number of exposed cases, n = the exposed population, M = the total cases, and N = the total population.
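As a sketch, the formula above can be evaluated directly with Python's `math.comb`, using the 4-patient trial from the previous slides as input:

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x): probability that x of the M total cases fall among the
    n exposed subjects, out of N subjects total, with all margins fixed."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# 4-patient trial: N = 4 subjects, n = 2 exposed, M = 2 cases (deaths),
# and we observed x = 2 exposed cases.
p = hypergeom_pmf(x=2, n=2, M=2, N=4)
print(p)  # 1/6: of the C(4,2) = 6 equally likely assignments,
          # only one puts both cases in the exposed group
```

This 1/6 is the probability, under the null, of the observed table arising from randomization alone.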

Spreadsheet

Greenland: Randomization, statistics and causal inference When treatment is assigned by the physician, the expectation depends on physician behavior – The expectation does not necessarily equal the truth For observational data we DON’T have a probability distribution for confounding – When E isn’t randomized, statistics don’t provide valid probability statements about exposure effects, because p-values, CIs, and likelihoods are calculated under the assumption that all data interchanges are equally likely

Greenland: Randomization, statistics and causal inference Alternatives – Limit statistics to data description (e.g., visual summaries, tables of risks or rates, etc.) – Influence analysis: explore the degree to which effect estimates would change under small perturbations of the data, such as interchanging a few subjects – Employ more elaborate statistical models – Sensitivity analysis – At the very least, interpret conventional statistics as minimum estimates of the error

(1) The p-value is: The probability, under the test hypothesis (usually the null), that a test statistic would be ≥ its observed value, assuming no bias in data collection or analysis – Why the null? Our job is to measure – The 1-sided upper p-value is P(test stat ≥ observed value) – The 1-sided lower p-value is P(test stat ≤ observed value) – The mid-p assigns only half the probability of the observation to the 1-sided upper p-value – The 2-sided p-value is twice the smaller of the two 1-sided p-values
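These definitions can be made concrete with the hypergeometric distribution from the earlier slides; a minimal sketch computing the upper, lower, mid-p, and 2-sided p-values for the 4-patient trial:

```python
from math import comb

def pmf(x, n, M, N):
    # Hypergeometric probability of x exposed cases with fixed margins
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def exact_p_values(x_obs, n, M, N):
    support = range(max(0, n + M - N), min(n, M) + 1)
    upper = sum(pmf(x, n, M, N) for x in support if x >= x_obs)  # stat >= observed
    lower = sum(pmf(x, n, M, N) for x in support if x <= x_obs)  # stat <= observed
    mid_p = upper - 0.5 * pmf(x_obs, n, M, N)  # half weight on the observed table
    two_sided = 2 * min(upper, lower)          # twice the smaller; can exceed 1
    return upper, lower, mid_p, two_sided

# 4-patient trial: observed x = 2 exposed cases, n = 2 exposed,
# M = 2 total cases, N = 4 subjects
upper, lower, mid_p, two_sided = exact_p_values(2, 2, 2, 4)
print(upper, lower, mid_p, two_sided)  # 1/6, 1, 1/12, 1/3
```

Note that the 2-sided value is simply twice the smaller 1-sided value; it is not itself the probability of any event.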

(2) The p-value is not: The probability that a test hypothesis (null hypothesis) is true – It is calculated assuming that the test hypothesis is true – We cannot calculate the probability of an event that is assumed in the calculation The probability of observing the result under the test hypothesis (null) [the likelihood] – The p-value also includes the probability of results more extreme

(3) The p-value is not: An α-level (the Type 1 error rate) – More on that later A significance level – The term is used to refer to both p-values and Type 1 error rates – It should be avoided to prevent confusion

(4) The 2-sided p-value is not: Because the 2-sided p-value is twice the smaller of the lower and upper 1-sided p-values, which may not be equal, it can exceed 1, and so it is not the: – Probability that the data would show as strong an association as observed, or stronger, if the null hypothesis were true – Probability that a point estimate would be as far or farther from the test value as observed

Significance testing: Compares the p-value to an arbitrary or conventional Type 1 error rate – α = 0.05 Emphasizes decision making, not measurement – Derives from agricultural and industrial applications of statistics – Reflects the roots of epidemiology as the union of statistics and medicine

JNCI announces materials to “help journalists get it right”

Response They acknowledged the definitions were incorrect, however: “We were not convinced that working journalists would find these definitions user-friendly, so we sacrificed precision for utility. We will add references to standard textbooks for journalists who want to learn more.”

Frequentist Statistics

Alternatives to p-values Two studies: which is more precise? – RR 10.0, p = – RR 1.3, p = The p-value conflates the size of the effect and its precision – RR 10.0, p = 0.039, 95% CI: – RR 1.3, p = 0.062, 95% CI:

Frequentist intervals (1) Definition: – If the statistical model is correct and there is no bias, a confidence interval derived from a valid test will, over unlimited repetitions of the study, contain the true parameter with a frequency no less than its confidence level (e.g., 95%) But the statistical model is only correct under randomization We CANNOT say that the probability the interval includes the truth equals the interval’s coverage probability (e.g., 95%)

Confidence Interval Simulation
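The repeated-sampling interpretation can be illustrated with a small simulation; a minimal sketch assuming a binomial outcome with a hypothetical true proportion of 0.3 and ordinary Wald intervals:

```python
import math
import random

random.seed(1)
TRUE_P, N_PER_STUDY, N_STUDIES = 0.3, 500, 2000

covered = 0
for _ in range(N_STUDIES):
    # One "repetition of the study": draw a sample and build a 95% Wald CI
    cases = sum(random.random() < TRUE_P for _ in range(N_PER_STUDY))
    p_hat = cases / N_PER_STUDY
    se = math.sqrt(p_hat * (1 - p_hat) / N_PER_STUDY)
    lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
    covered += lo <= TRUE_P <= hi

# Over many repetitions, roughly 95% of the intervals contain the truth;
# any single interval either contains it or it doesn't.
print(covered / N_STUDIES)
```

The coverage statement is a property of the procedure over repetitions, not of any one interval, which is exactly the distinction the previous slide draws.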

Frequentist intervals (2) Advantages – Provides more information than significance tests or p-values: direction, magnitude, and variability – Economical compared with the p-value function Disadvantages – Less information than the p-value function – Underlying assumptions (a valid statistical model, no bias, repeated experiments)

Approximations: Test-based

Approximations: Wald

Standard Errors (basic over i strata):
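The formulas on these three slides did not survive extraction. As a sketch, the standard Wald interval for a risk ratio builds the SE of ln(RR) from the cell counts; the counts below are hypothetical, chosen to be consistent with the Greenland example's RR of 0.43 and RD of -0.08:

```python
import math

# Hypothetical 2x2 counts (consistent with RR 0.43, RD -0.08):
# a exposed cases of n1 exposed; b unexposed cases of n0 unexposed
a, n1 = 30, 500
b, n0 = 70, 500

rr = (a / n1) / (b / n0)                        # (30/500) / (70/500), about 0.43
se_log_rr = math.sqrt(1/a - 1/n1 + 1/b - 1/n0)  # Wald SE of ln(RR)
lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
hi = math.exp(math.log(rr) + 1.96 * se_log_rr)
print(rr, lo, hi)  # roughly 0.43 (0.28, 0.65)
```

The interval is built on the log scale and then exponentiated, which is why CI width for ratio measures is judged by the ratio of the limits (next slide).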

How do we measure precision? The width of the confidence interval Measured how? – If I tell you the 95% CI for an RR is 2 to 8, can you tell me the point estimate? – Yes: sqrt(U × L) = sqrt(2 × 8) = 4 – For difference measures, just subtract Remember that relative measures are on the log scale, so the width of a CI is measured by the RATIO of the upper to the lower limit
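The sqrt(U*L) trick and the ratio measure of width can be sketched in a couple of lines, using the 2-to-8 interval from the slide:

```python
import math

lo, hi = 2.0, 8.0            # 95% CI for an RR
point = math.sqrt(lo * hi)   # geometric midpoint: sqrt(2 * 8) = 4.0
ratio_width = hi / lo        # width on the ratio (log) scale: 8 / 2 = 4.0
print(point, ratio_width)    # 4.0 4.0
```

The geometric midpoint works because the interval is symmetric on the log scale, the same reason width is measured as a ratio.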

Frequentist Intervals (4): Interpretation

Conclusion about confidence intervals Using a CI for hypothesis testing is an abuse of the CI – The goal is precision, not significance The goal of epi is precision, not significance – A precise null estimate is just as important as a precise significant estimate – An imprecise, statistically significant estimate is as useless as an imprecise, non-statistically significant estimate

What results are published or highlighted in publications? Find a publication with multiple results Rank them in order of precision Then see what is highlighted in the abstract

CIPRA Trial A trial of nurse- vs. doctor-managed HIV care For the primary results, co-investigators wanted p-values and confidence intervals I didn’t want hypothesis testing, even though I was aware people would do it anyway I fought initially, and lost the debate We put in both

CIPRA Trial Reviewer comment: Table 3: Column with p-values can be dropped given that 95% confidence intervals are presented; perhaps mark significance as * (e.g. for p<0.025) and ** (e.g. for p<0.005) after the 95% CI's.

Summary Randomization gives meaning to statistics – It gives a probability distribution for confounding When randomization doesn’t hold, we have no probability distribution P-values aren’t the probability of chance, of the null, etc. CIs allow us to assess precision – But they are based on infinite repetitions – They do not contain the true value with 95% probability