Biostatistics in Practice Youngju Pak Biostatistician Peter D. Christenson Session 1: Quantitative and Inferential Issues


Biostatistics in Practice Youngju Pak Biostatistician Peter D. Christenson Session 1: Quantitative and Inferential Issues

Class Note We will typically have many more slides than are covered in class.

Session 1 Objectives: General quantitative needs in biological research; statistical software; protocol examples, with statistical sections; overview of statistical issues using a published paper.

General Quantitative Needs Descriptive: Appropriate summarization to meet scientific questions, e.g., changes, or % changes, or reaching a threshold? Mean, or minimum, or range of response? Average time to death, or chances of dying by a fixed time?
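As a rough illustration of how the choice of summary changes the answer, the sketch below summarizes the same made-up pre/post values three ways. The data and the threshold of 18 are assumptions for illustration only, not values from any study.

```python
from statistics import mean

# Hypothetical pre/post measurements for five subjects (illustrative only).
pre = [20.0, 25.0, 40.0, 50.0, 10.0]
post = [15.0, 20.0, 30.0, 45.0, 9.0]

# Absolute change per subject.
changes = [b - a for a, b in zip(pre, post)]

# Percent change per subject, relative to that subject's own baseline.
pct_changes = [100.0 * (b - a) / a for a, b in zip(pre, post)]

# Proportion of subjects whose post value is below an assumed threshold.
THRESHOLD = 18.0
prop_below = mean([1.0 if b < THRESHOLD else 0.0 for b in post])

mean_change = mean(changes)
mean_pct_change = mean(pct_changes)

print(f"mean change: {mean_change:.1f}")
print(f"mean % change: {mean_pct_change:.1f}")
print(f"proportion below threshold: {prop_below:.2f}")
```

The three summaries answer different scientific questions, so the one to report should be fixed by the question rather than by which looks most favorable.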

General Quantitative Needs, Cont’d Inferential: Could results be spurious, a fluke, due to “natural” variations or chance? Sensitivity/Power: How many subjects are needed? Validity: Issues such as bias and valid inference are general scientific ones, but can be addressed statistically.

Professional Statistics Software Package Output. Enter code; syntax. Stored data; accessible.

Typical Statistics Software Package Select Methods from Menus Output after menu selection Data in spreadsheet

Microsoft Excel for Statistics Primarily for descriptive statistics. Limited output. No analyses for %s.

Almost Free On-Line Statistics Software Run from browser, not local. Can store data and results on the StatCrunch server. $5 per 6 months of usage.

Free Statistics Software: Mystat

Free Study Size Software

Typical Statistics Section of Protocol: Overview of study design and goals; randomization/treatment assignment; study size; missing data / subject withdrawal or incompletion; definitions / outcomes; analysis populations; data analysis methods; interim analyses.

Public Protocol Registration An attempt to make the public aware of studies, including those that may turn out to be negative. Many journals now require registration as a condition for considering later publication.

Example of Protocol --- Displayed in Class ---

Statistical Issues: Subject selection; randomization; efficiency from study design; summarizing study results; making comparisons; study size; attributability of results; efficacy vs. effectiveness; exploring vs. proving.

Paper with Common Statistical Issues Case Study:

McCann, et al., Lancet 2007 Nov 3;370(9598): Food additives and hyperactive behaviour in 3-year-old and 8/9-year-old children in the community: a randomised, double-blinded, placebo-controlled trial.
Target population: children 3-4 and 8-9 years old
Study design: randomized, double-blinded, controlled, crossover trial
Sample size: 153 (3 years), 144 (8-9 years) in Southampton, UK
Objective: test whether intake of artificial food color and additives (AFCA) affects childhood behavior
Sampling: stratified sampling based on SES
Baseline measure: 24-hour recall by the parent of the child’s pretrial diet
Groups: three (mix A, mix B, placebo)
Outcomes: ADHD rating scale IV by teachers, WWP hyperactivity score by parents, classroom observation code, Conners continuous performance test II (CPTII) → GHA score

Selecting Study Subjects

Representative or Random Samples How were the children to be studied selected (second column on the first page)? The authors purposely selected "representative" social classes. Is this better than a "randomly" chosen sample that ignores social class? One often hears: Non-random = Non-scientific.

Case Study: Participant Selection No mention of random samples.

Case Study: Participant Selection It may be that only a few schools are needed to get sufficient individuals. If, among all possible schools, there are few that are lower SES, none of these schools may be chosen. So, a random sample of schools is chosen from the lower SES schools, and another random sample from the higher SES schools.
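The two-stratum selection described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the school labels, stratum sizes, and the helper name `stratified_sample` are all hypothetical.

```python
import random

def stratified_sample(schools_by_stratum, n_per_stratum, seed=0):
    """Draw a separate simple random sample of schools within each stratum."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return {
        stratum: rng.sample(schools, n_per_stratum[stratum])
        for stratum, schools in schools_by_stratum.items()
    }

# Hypothetical sampling frame: few lower-SES schools, many higher-SES schools.
schools = {
    "lower_SES": [f"L{i}" for i in range(1, 6)],
    "higher_SES": [f"H{i}" for i in range(1, 41)],
}

chosen = stratified_sample(schools, {"lower_SES": 2, "higher_SES": 2})
print(chosen)
```

Sampling within each stratum guarantees that lower-SES schools appear in the study, which a single random draw from all 45 schools would not.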

Selection by Over-Sampling It is not necessary that the % lower SES in the study is the same as in the population. There may still be too few subjects in a rare subgroup to get reliable data. Can “over-sample” a rare subgroup, and then weight overall results by proportions of subgroups in the population. The CDC NHANES studies do this.
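The weighting step can be sketched with a small calculation. The subgroup means, sample sizes, and population shares below are made-up numbers, and `weighted_mean` is a hypothetical helper; real survey weighting (as in NHANES) is more elaborate than this.

```python
def weighted_mean(subgroup_means, population_shares):
    """Combine subgroup means using each subgroup's share of the population."""
    total_share = sum(population_shares.values())
    return sum(
        subgroup_means[g] * population_shares[g] for g in subgroup_means
    ) / total_share

# Hypothetical study that over-samples the rarer lower-SES subgroup.
sample_means = {"lower_SES": 12.0, "higher_SES": 8.0}
sample_sizes = {"lower_SES": 100, "higher_SES": 100}
population_shares = {"lower_SES": 0.10, "higher_SES": 0.90}

# Naive overall mean treats the sample mix as if it were the population mix.
unweighted = sum(
    sample_means[g] * sample_sizes[g] for g in sample_means
) / sum(sample_sizes.values())

# Weighted overall mean restores the population proportions.
weighted = weighted_mean(sample_means, population_shares)

print(f"unweighted: {unweighted:.1f}, weighted: {weighted:.1f}")
```

Over-sampling gives a reliable estimate within the rare subgroup, while the weights keep the overall estimate from being distorted by the over-representation.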

Random Samples vs. Randomization We have been discussing the selection of subjects to study, often a random sample. An observational study would, well, just observe them. An interventional study assigns each subject to one or more treatments in order to compare treatments. Randomization refers to making these assignments in a random way.

Why Randomize? Plant breeding example: Compare yields of varieties A and B, planting each to 18 plots. Which design is better?
Systematic: ABABAB BABABA ABABAB BABABA ABABAB BABABA
Randomized: BABABB AABABA BABBAA BBAABB ABABBA AABABA

Why Randomize? So that groups will be similar except for the intervention. So that, when enrolling, we will not unconsciously choose an “appropriate” treatment for a particular subject. Minimizes the chances of introducing bias when attempting to systematically remove it, as in the plant yield example.
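A randomized allocation like the one in the plant yield example can be generated by shuffling a balanced list of treatment labels. This is a minimal sketch; the function name `randomize` and the seed are assumptions, and real trials typically use more structured schemes such as permuted blocks.

```python
import random

def randomize(unit_ids, treatments, seed=None):
    """Assign treatments to units at random, with equal numbers per treatment."""
    if len(unit_ids) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    # Build a balanced list of labels, then shuffle it into a random order.
    labels = list(treatments) * (len(unit_ids) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(unit_ids, labels))

# 36 plots, varieties A and B, 18 plots each (as in the plant breeding example).
plots = list(range(1, 37))
layout = randomize(plots, ["A", "B"], seed=2024)

# Print the layout as six rows of six plots.
for row_start in range(0, 36, 6):
    print("".join(layout[p] for p in plots[row_start:row_start + 6]))
```

Because the order comes from a random shuffle rather than a pattern, no systematic feature of the field (or of the person enrolling subjects) can line up with the treatment.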

Basic Study Designs 1. Prospective (longitudinal) 2. Retrospective (case-control) 3. Cross-sectional 4. Randomized controlled

Case Study: Crossover Design Each child is studied on 3 occasions under different diets. Is this better than three separate groups of children? Why, intuitively? How could you scientifically prove your intuition?

Blocked vs. Unblocked Studies AKA matched vs. unmatched. AKA paired vs. unpaired. Block = Pair = Set receiving all treatments. Set could be an individual at multiple times (pre and post), or left and right arms for sunscreen comparison; twins or family; centers in multi-center study, etc. Block ↔ Homogeneous. Blocking is efficient because treatment differences are usually more consistent among subjects than each separate treatment is.
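One way to make the efficiency claim concrete is to compare the variance of the estimated treatment difference under the two designs. The sketch below assumes n blocks (or n subjects per group in the unblocked design), treatment variances σ_A² and σ_B², and a within-block correlation ρ; these symbols are introduced here for illustration and do not appear on the slide.

```latex
\operatorname{Var}\!\left(\bar{\Delta}\right)_{\text{blocked}}
  = \frac{\sigma_A^2 + \sigma_B^2 - 2\rho\,\sigma_A\sigma_B}{n},
\qquad
\operatorname{Var}\!\left(\bar{B} - \bar{A}\right)_{\text{unblocked}}
  = \frac{\sigma_A^2 + \sigma_B^2}{n}.
```

When measurements within a block are positively correlated (ρ > 0), the blocked variance is smaller. With σ_A = σ_B, the ratio of blocked to unblocked variance is 1 − ρ, so homogeneous blocks (ρ near 1) give the largest gain.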

Potential Efficiency Due to Pairing [Figure: two panels. Unpaired: A and B in separate groups. Paired: A and B within each paired set, with Δ = B − A. A difference of 3 is marked in the figure.]

Outcome Measures Generally, how were the outcome measures defined (third page)? They are more complicated here than for most studies. What are the units (e.g., kg, mmol, $, years)? Outcome measures are specific and pre-defined. Aims and goals may be more general.

Summarization / Data Reduction How are the outcome measures summarized? e.g., Table 2:

Case Study: Statistical Comparisons How might you intuitively decide from the summarized results whether the additives have an effect? Different Enough? Clinically? Statistically?

Statistical Comparisons: Figure 3

Statistical Comparisons and Tests of Hypotheses Engineering analogy: Signal and Noise. Signal = diet effect. Noise = degree of precision. Statistical tests: The effect is probably real if the signal-to-noise ratio, Signal/Noise, is large enough. Reducing “noise”, which incorporates subject variability and N, is therefore important.
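The signal-to-noise idea can be sketched numerically. The version below assumes the noise is the standard error of a mean, SD/√N, which is one common form; the function name and the example numbers are hypothetical.

```python
from math import sqrt

def signal_to_noise(mean_effect, sd, n):
    """Ratio of the observed mean effect to its standard error, sd / sqrt(n)."""
    return mean_effect / (sd / sqrt(n))

# Same signal (3 units) under different amounts of noise.
ratio_small_n = signal_to_noise(3.0, sd=12.0, n=16)        # noise = 12 / 4 = 3
ratio_large_n = signal_to_noise(3.0, sd=12.0, n=64)        # noise = 12 / 8 = 1.5
ratio_less_variable = signal_to_noise(3.0, sd=6.0, n=16)   # noise = 6 / 4 = 1.5

print(ratio_small_n, ratio_large_n, ratio_less_variable)
```

Quadrupling N and halving subject variability have the same effect on the ratio here, which is why both study size and design choices that reduce variability matter.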

Back to Efficiency of Design [Figure: the same two panels as before, unpaired (A and B in separate groups) vs. paired (A and B within each paired set, Δ = B − A), now annotated with the noise in each design and Signal = 3.]
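A small numerical sketch of the same comparison: identical data analyzed as paired and as unpaired, with a signal of 3 in both cases. The four pairs of values are made up for illustration, chosen so that subjects differ a lot from each other while the B − A difference is consistent.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Made-up measurements for four blocks (e.g., the same four subjects under A and B).
a = [10.0, 20.0, 30.0, 40.0]
b = [13.0, 22.0, 34.0, 43.0]
n = len(a)

# The signal is the same in both analyses.
signal = mean(b) - mean(a)

# Paired analysis: noise comes from the spread of the within-block differences.
deltas = [y - x for x, y in zip(a, b)]
noise_paired = stdev(deltas) / sqrt(n)

# Unpaired analysis: noise comes from the spread within each separate group.
noise_unpaired = sqrt(variance(a) / n + variance(b) / n)

ratio_paired = signal / noise_paired
ratio_unpaired = signal / noise_unpaired

print(f"signal = {signal:.1f}")
print(f"paired:   noise = {noise_paired:.2f}, signal/noise = {ratio_paired:.2f}")
print(f"unpaired: noise = {noise_unpaired:.2f}, signal/noise = {ratio_unpaired:.2f}")
```

The large subject-to-subject variation inflates the unpaired noise but cancels out of the paired differences, so the paired ratio is far larger for the same signal.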

Number of Subjects The authors discuss this in the second column on the fourth page [passage shown on the slide]. Intuitively, what should go into selecting the study size? We will make this intuition rigorous in Session 4.
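As a preview of the ingredients involved, the sketch below uses one common normal-approximation formula for comparing two group means: n per group = 2·((z₁₋α/₂ + z_power)·SD/δ)². This is an assumed textbook formula for illustration, not necessarily the calculation used by the authors or the one developed in Session 4.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects per group to detect a mean difference `delta`
    between two independent groups (two-sided test, normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = z.inv_cdf(power)          # about 0.84 for 80% power
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

# Detect a difference of half a standard deviation.
n_half_sd = n_per_group(delta=0.5, sd=1.0)
# Detect a difference half as large.
n_quarter_sd = n_per_group(delta=0.25, sd=1.0)

print(n_half_sd, n_quarter_sd)
```

The inputs are exactly the intuitive ones: the size of effect worth detecting, subject variability, and how much risk of a false positive or a missed effect is acceptable. Halving the detectable difference roughly quadruples the required size.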

Other Effects, Potential Biases The top of the second column on the fourth page mentions other effects on diet [passage shown on the slide]. The issue here is: Could apparent diet differences (e.g., B vs. placebo) be attributable to something else?

Non-Completing or Non-Adhering Subjects What is the most relevant group of studied subjects: all randomized, mostly adherent, fully adherent? Study Goal: Scientific effect? Societal impact?
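The three candidate analysis populations can be compared on a toy data set. Everything below is hypothetical: the records, the 0.8 cutoff for "mostly adherent", and the helper `arm_difference` are assumptions for illustration only.

```python
from statistics import mean

# (arm, fraction of assigned treatment actually taken, outcome); made-up records.
subjects = [
    ("active", 1.0, 8.0), ("active", 1.0, 9.0),
    ("active", 0.85, 10.0), ("active", 0.3, 12.0),
    ("placebo", 1.0, 12.0), ("placebo", 1.0, 13.0),
    ("placebo", 0.9, 12.0), ("placebo", 0.5, 11.0),
]

def arm_difference(records, min_adherence=0.0):
    """Mean outcome difference (active minus placebo) among subjects
    whose adherence is at least `min_adherence`."""
    kept = [r for r in records if r[1] >= min_adherence]
    active = mean([r[2] for r in kept if r[0] == "active"])
    placebo = mean([r[2] for r in kept if r[0] == "placebo"])
    return active - placebo

all_randomized = arm_difference(subjects)        # everyone, as randomized
mostly_adherent = arm_difference(subjects, 0.8)  # assumed cutoff of 80%
fully_adherent = arm_difference(subjects, 1.0)

print(all_randomized, mostly_adherent, fully_adherent)
```

In this toy example the estimated effect grows as the population is restricted to more adherent subjects. The all-randomized analysis speaks to societal impact under real-world adherence, while the restricted analyses are closer to the scientific effect but give up the protection of randomization.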

Multiple and Mid-Study Analyses Many more analyses could have been performed on each of the individual behavior ratings that are described in the first column of the 3rd page. Wouldn’t it be negligent not to do them, and miss something? Is there a downside to doing them? Should effects be monitored as more and more subjects complete?

Multiple Analyses [Diagram: GHA (Global Hyperactivity Aggregate) is built from many separate measures: Teacher ADHD, Parent ADHD, Class ADHD, Conners, ...] Torture data long enough and it will confess to something.
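The downside of many separate analyses can be quantified under a simplifying assumption: if k tests are independent and each uses significance level α, the chance of at least one false positive is 1 − (1 − α)^k. The sketch below also shows the Bonferroni adjustment as one simple way to account for it; both function names are hypothetical, and the measures in this study are unlikely to be truly independent.

```python
def familywise_error(alpha, k):
    """Chance of at least one false positive among k independent tests."""
    return 1 - (1 - alpha) ** k

def bonferroni_alpha(alpha, k):
    """Per-test significance level that keeps the overall rate at most alpha."""
    return alpha / k

for k in (1, 4, 10, 20):
    print(f"{k:2d} tests: familywise error = {familywise_error(0.05, k):.3f}, "
          f"Bonferroni per-test alpha = {bonferroni_alpha(0.05, k):.4f}")
```

With 10 independent tests at the 5% level, the chance of some spurious "finding" is about 40%, which is the sense in which tortured data will confess.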

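Mid-Study Analyses [Figure: estimated effect (vertical axis, with 0 marked) plotted against number of subjects enrolled over time; an early excursion is labeled "Wrong early conclusion".] Too many analyses, as on previous slide. Need to monitor, but also account for many analyses.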
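A small simulation can show why unadjusted mid-study looks are a problem. The sketch below assumes there is truly no effect, outcomes are standard normal with known SD of 1, and the data are tested at five assumed enrollment points using the usual 1.96 cutoff each time. The look schedule, number of simulated studies, and seed are arbitrary choices.

```python
import random
from math import sqrt

def simulate_false_positive_rates(n_studies=2000, looks=(20, 40, 60, 80, 100),
                                  z_crit=1.96, seed=1):
    """Simulate studies with NO true effect and return two rates:
    (declared 'significant' at any look, declared 'significant' at the final look only)."""
    rng = random.Random(seed)
    any_look = 0
    final_only = 0
    for _ in range(n_studies):
        total, n, z = 0.0, 0, 0.0
        rejected_at_some_look = False
        for look in looks:  # looks must be in increasing order
            while n < look:  # enroll subjects up to this look
                total += rng.gauss(0.0, 1.0)
                n += 1
            z = total / sqrt(n)  # z-statistic for the mean with known SD = 1
            if abs(z) > z_crit:
                rejected_at_some_look = True
        if rejected_at_some_look:
            any_look += 1
        if abs(z) > z_crit:  # z now holds the final-look statistic
            final_only += 1
    return any_look / n_studies, final_only / n_studies

rate_any_look, rate_final_only = simulate_false_positive_rates()
print(f"false positives, testing at every look: {rate_any_look:.3f}")
print(f"false positives, testing once at end:   {rate_final_only:.3f}")
```

Testing once at the end holds the false positive rate near the nominal 5%, while stopping at the first "significant" look inflates it well beyond that. Formal monitoring plans address this by using stricter cutoffs at each look.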

Bad Science That May Seem Good 1. Re-examining data, or using many outcomes, seeming to be due diligence. 2. Adding subjects to a study that is showing marginal effects; stopping early due to strong results. 3. Emphasizing effects in subgroups. Actually bad? Could be negligent NOT to do these, but need to account for doing them.