Designing experiments


Designing experiments How to collect the data you need

OUTLINE of topics Avoid obvious problems with: the question, sampling, variables. How to do sampling. Principles of experimental design.

Data collection Consider the following 3 research questions: What is the average mercury content of swordfish in the Atlantic Ocean? Over the last 5 years, what is the average time to degree for Duke undergrads? Does a new drug reduce the number of deaths in patients with severe heart disease? Each question has a specific target population. It is usually impossible to study the entire target population, so a sample is studied instead.

Anecdotal evidence A man on the news got mercury poisoning from eating swordfish, so the average mercury concentration in swordfish must be dangerously high. I met two students who took more than 7 years to graduate from Duke, so it must take longer to graduate at Duke than at many other colleges. My friend’s dad had a heart attack and died after they gave him a new heart disease drug, so the drug must not work. The evidence may well be true and verifiable but is not likely to represent the target population very well.

Sampling from a population Ex: How long does it take Duke students to graduate? Random selection is vital. How might this sample have been collected?

What happened here? This is a convenience sample: the cases that were easiest to reach were the ones selected. It probably introduces bias into the sample.

Other forms of bias Non-response bias: the sampling protocol may be random, but bias is still introduced unintentionally when only some of the selected cases respond. Most often seen in surveys: a survey is sent to a random sample of the population but is answered only by a certain subset of it. *pause - Q: If 50% of the online reviews of a product are negative, do you think this means that 50% of buyers are dissatisfied?

Explanatory and response variables The explanatory variable is thought to affect the response variable. A causal relationship is NOT guaranteed. The labels are used only to keep track of which variable might affect the other. There can be multiple explanatory variables.

Observational studies Data are collected without interfering in how the data arise. Ex: collect data from surveys or medical records, or follow a cohort (a group of similar individuals) over time. Can demonstrate association, NOT causation. Ex: An observational study found that increased sunscreen usage was associated with increased skin cancer. True, verifiable data, but what does it mean? *pause - What else might be going on?

Confounding variables A confounding variable is correlated with both the explanatory and the response variable. Can you think of any confounding variables that would explain the relationship between sunscreen usage and skin cancer?

How to get random samples Almost all statistical methods are based on having random samples from a population. 3 types: simple random sampling, stratified sampling, cluster sampling.

Simple random sampling Every member of the population has an equal chance of being sampled. Knowing that one member is in the sample provides no information about which other members are in it.
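A minimal sketch of simple random sampling, using only Python's standard `random` module. The function name `simple_random_sample`, the ID range, and the seed are illustrative assumptions, not part of the slides; the seed is there only so the draw can be repeated.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct cases so that every case is equally likely to be chosen."""
    rng = random.Random(seed)  # seeded generator, only so the draw is reproducible
    return rng.sample(list(population), n)  # sampling without replacement


# Hypothetical sampling frame: ID numbers for 1,000 graduates.
student_ids = range(1, 1001)
sample = simple_random_sample(student_ids, n=50, seed=1)
print(sorted(sample)[:10])
```

Because `sample` draws without replacement from the whole frame, no case is favoured and no case can appear twice.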

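Stratified sampling First create strata: similar cases are grouped together. Strata are often based on a categorical variable. Then take a random sample from every stratum, typically a simple random sample of the same size from each (sampling in proportion to stratum size is a common alternative).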
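A rough sketch of stratified sampling with an equal-sized simple random sample from each stratum. The data (200 students labelled by class year), the helper name `stratified_sample`, and the per-stratum size are hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(cases, stratum_of, n_per_stratum, seed=None):
    """Group cases into strata, then take a simple random sample from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[stratum_of(case)].append(case)

    sample = {}
    for name, members in strata.items():
        k = min(n_per_stratum, len(members))  # a small stratum cannot give more than it has
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical data: (student id, class year), 50 students in each year.
YEARS = ["first-year", "sophomore", "junior", "senior"]
students = [(i, YEARS[i % 4]) for i in range(200)]

picked = stratified_sample(students, stratum_of=lambda s: s[1], n_per_stratum=5, seed=2)
for year, members in picked.items():
    print(year, [student_id for student_id, _ in members])
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the same total size would not guarantee.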

Cluster sampling Cases are placed into clusters. Some clusters are randomly picked, and then a simple random sample is taken from each selected cluster. Most useful when: inter-cluster variability is low (the clusters resemble one another) and intra-cluster variability is high (each cluster is internally diverse).
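The two-step procedure described above (pick clusters at random, then sample within the picked clusters) might look like the following sketch. The dorm data, the function name `cluster_sample`, and the sample sizes are assumptions for illustration only.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly pick some clusters, then take a simple random sample within each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # step 1: pick whole clusters
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen  # step 2: sample cases inside each chosen cluster
    }


# Hypothetical population: 10 dorms with 30 residents each.
dorms = {
    f"dorm_{d}": [f"dorm_{d}_student_{i}" for i in range(30)]
    for d in range(10)
}

picked = cluster_sample(dorms, n_clusters=3, n_per_cluster=5, seed=3)
for dorm, residents in picked.items():
    print(dorm, residents)
```

Only the selected clusters need to be visited, which is the practical appeal of the method when clusters resemble one another.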

Principles of experimental design Treatments are assigned to cases, in contrast with observational studies. Randomization is necessary to show a causal connection between variables. 4 principles: control, randomization, replication, blocking.

Controlling: minimize or eliminate any differences between groups other than the treatment. Ex: the drug is administered to the experimental group in pill form. How do you manage control? Randomization: individuals are randomly assigned to groups to minimize the influence of other factors. Ex: some individuals might be more susceptible to disease due to diet; random assignment mixes individuals with high- and low-quality diets across the groups.
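A minimal sketch of the randomization step: shuffle the cases and split them into two groups. The patient labels, group names, and the helper `randomize` are hypothetical.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical study with 40 enrolled patients.
patients = [f"patient_{i}" for i in range(40)]
groups = randomize(patients, seed=4)
print(len(groups["treatment"]), len(groups["control"]))
```

Since chance alone decides who lands in which group, characteristics such as diet tend to balance out across groups as the number of cases grows.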

Replication: the more cases that are studied, the better we can understand how the explanatory and response variables are related. Large samples act as replicates. Replicating the entire experiment is even better if money, etc. allows. Blocking: other (non-explanatory) variables may influence the response, and this must be controlled for. First group cases into blocks of individuals who share a characteristic. Then randomly assign members from each block to the control and experimental groups.
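Blocking followed by randomization within each block could be sketched as below. The blocking variable (a made-up "high risk" / "low risk" label), the patient list, and the function name `blocked_assignment` are assumptions for illustration.

```python
import random
from collections import defaultdict


def blocked_assignment(cases, block_of, seed=None):
    """Group cases into blocks, then randomize to the two groups within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for case in cases:
        blocks[block_of(case)].append(case)

    groups = {"treatment": [], "control": []}
    for members in blocks.values():
        rng.shuffle(members)  # randomize separately inside every block
        half = len(members) // 2
        groups["treatment"].extend(members[:half])
        groups["control"].extend(members[half:])
    return groups


# Hypothetical patients: every third one is labelled high risk (20 high, 40 low).
patients = [(i, "high risk" if i % 3 == 0 else "low risk") for i in range(60)]
groups = blocked_assignment(patients, block_of=lambda p: p[1], seed=5)

for name, members in groups.items():
    high = sum(1 for _, risk in members if risk == "high risk")
    print(name, "high risk:", high, "low risk:", len(members) - high)
```

Each group ends up with the same number of high-risk patients by construction, so the blocking variable cannot differ between groups by chance.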

Is there any bias that might arise? Is there anything that has not been controlled?

Blind studies and placebos From the previous example: if patients are aware of treatment vs. no treatment, this may lead to emotional effects that are hard to quantify and that may influence the response variable. Make the study blind: patients do not know whether they are receiving treatment or placebo. Double blind: researchers also do not know who receives treatment until the study has concluded.