Chapter 13: Experiments and Observational Studies


1 Chapter 13: Experiments and Observational Studies

2 Music, Anyone? In a 1981 study conducted at Mission Viejo High School in California, researchers compared the scholastic performance of music students with that of non-music students. The music students had a much higher overall grade point average than the non-music students, 3.59 to Not only that, a whopping 16% of the music students had all A’s compared with only 5% of the non-music students. Can we draw any conclusions based on this?

3 Observational Studies
An observational study is a study based on data in which no manipulation of factors has been employed: researchers don’t assign choices; they simply observe “what happened.” In a retrospective study, subjects are selected and then their previous conditions or behaviors are determined. Retrospective studies are not based on random samples and usually focus on estimating differences between groups or associations between variables. Because retrospective records are based on historical data, they can contain errors.

4 Observational Studies
Valuable for discovering trends and possible relationships. Widely used in public health and marketing. Retrospective studies often try to discover variables related to rare outcomes. Problematic – usually restricted to a small part of the entire population. More problems – based on historical data and could have errors.

5 Prospective Study In a prospective study, subjects are followed to observe future outcomes. Prospective studies typically focus on estimating differences among groups that might appear as the groups are followed during the course of the study.

6 Randomized, Comparative Experiments
Is it EVER possible to find a cause-and-effect relationship?! What if we randomly assigned ½ the 3rd graders at a school to take music lessons and forbade the other ½ from taking them, then looked at the results when the students graduated from high school? This would be an experiment. An experiment manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels.

7 Experiments An experiment requires random assignment of subjects to treatments. Experiments study the relationship between two or more variables. The researcher must identify at least one explanatory variable, or factor, to manipulate and at least one response variable to measure.

8 Experiments What distinguishes experiments from other types of investigations is that the experimenter actively and deliberately manipulates the factors to control the details of the possible treatments and assigns the subjects to those treatments at random.

9 The Who’s Humans who are experimented on are commonly called subjects or participants. Other individuals (rats, bacteria, days, etc.) are commonly referred to by the more generic term experimental unit.

10 More Terms in Experiments
The specific values that the experimenter chooses for a factor are called the levels of the factor. Example: An experiment that studies sleep deprivation might assign participants to sleep for 4, 6, or 8 hours. These are the levels of the factor sleep.

11 More Terms in Experiments
The combination of specific levels from all the factors that an experimental unit receives is known as its treatment. Treatments must ALWAYS be assigned to participants randomly!

12 The Four Principles of Experimental Design
Control – we control sources of variation other than the factors we are testing by making conditions as similar as possible for all treatments. Randomize – allows us to equalize the effects of unknown or uncontrollable sources of variation (lurking variables). *If experimental units are not assigned to treatments at random, you will NOT be able to draw conclusions from your study!!

13 The Four Principles of Experimental Design
Replicate – repeat, repeat, repeat. Within the experiment – apply the treatments to a number of subjects (the outcome of an experiment on a single subject is an anecdote, not data). The entire experiment – an essential step in science (the result of a single experiment doesn’t “prove” anything).

14 The Four Principles of Experimental Design
Block – to reduce the effects of identifiable attributes of the subjects that cannot be controlled. Blocking is to an experiment what stratifying is to a survey. The variable we block on is called a blocking variable. Group similar individuals together and then randomize within each block to remove much of the variability due to the differences among the blocks.

15 Diagrams Diagrams emphasize the random allocation of subjects to treatment groups and show a visual representation of the experiment. [Diagram: Random Allocation -> Group 1 (Treatment 1) and Group 2 (Treatment 2) -> Compare]

16 Designing an Experiment
An ad for OptiGro plant fertilizer claims that with this product you will grow “juicier, tastier” tomatoes. You’d like to test this claim, and wonder whether you might be able to get by with half the specified dose. Set up an experiment to test the claim. Let’s work through setting up this design using a completely randomized experiment in one factor.

17 Think Plan – We want to know whether tomato plants grown with OptiGro yield juicier, tastier tomatoes than plants raised in otherwise similar circumstances, but without fertilizer. Response Variable – We’ll evaluate the juiciness and taste of the tomatoes by asking a panel of judges to rate them on a scale from 1 to 7 for juiciness and for taste. Treatments – the factor is fertilizer, specifically OptiGro fertilizer. We’ll grow tomatoes at 3 different factor levels: some with no fertilizer, some with half the specified amount of OptiGro, and some with the full dose of OptiGro. These are the three treatments.

18 Think Experimental Units – We’ll use 24 tomato plants of the same variety from a local garden store. Experimental Design – Control – We’ll use farm plots near each other so that the plants get similar amounts of sun and rain and experience similar temperatures. We’ll weed the plots equally and otherwise treat the plants alike. Randomize – We’ll randomly assign the plants into three groups using an SRS and random numbers from a table. Replicate – There are 8 plants in each treatment group.
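
The random assignment described on this slide can be sketched in a few lines of Python. This is only an illustration: it swaps the slide's random-number table for Python's pseudo-random generator, and the function name `completely_randomized`, the plant labels 1 to 24, and the seed are assumptions made for the example.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out evenly across the treatments."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)          # a seed makes the assignment reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)              # every ordering is equally likely
    size = len(shuffled) // len(treatments)
    return {
        treatment: sorted(shuffled[i * size:(i + 1) * size])
        for i, treatment in enumerate(treatments)
    }

# Hypothetical labels: plants numbered 1 to 24, three fertilizer levels.
plants = list(range(1, 25))
groups = completely_randomized(plants, ["control", "half dose", "full dose"], seed=13)
for treatment, members in groups.items():
    print(treatment, members)
```

Each plant lands in exactly one group and each group has 8 plants, which matches the replication described above.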

19 Think Make a picture – [Diagram: 24 tomato plants from a garden store, randomly assigned to Group 1 (8 plants, Treatment 1: control), Group 2 (8 plants, Treatment 2: ½ dose), and Group 3 (8 plants, Treatment 3: full fertilizer); compare juiciness and tastiness]

20 Think Experimental Design – We will grow the plants until the tomatoes are mature, as judged by reaching a standard color. We’ll harvest the tomatoes when ripe and store them for evaluation. We’ll set up a numerical scale of juiciness and one of tastiness for the taste testers. Several people will taste slices of tomato and rate them.

21 Show We will display the results with side-by-side boxplots to compare the three treatment groups. We will then compare the means of the three groups.

22 Tell If the differences in taste and juiciness among the groups are greater than I would expect by knowing the usual variation among tomatoes, I may be able to conclude that these differences can be attributed to treatment with the fertilizer.

23 Does the Difference Make a Difference?
Statistical Significance – When an observed difference is too large for us to believe that it is likely to have occurred naturally, we consider the difference to be statistically significant. We will cover this idea in great depth later on. For now, we judge loosely based on what we see.
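
One informal way to ask whether a difference is "too large to have occurred naturally" is to reshuffle the group labels many times and see how often chance alone produces a difference at least as big. The sketch below is not the method the slides develop later; the function name `permutation_p_value` and the ratings are hypothetical and made up purely for illustration.

```python
import random

def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=None):
    """Fraction of random relabelings whose mean difference is at least
    as large as the observed one (two-sided)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)                      # pretend labels are arbitrary
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical juiciness ratings on the 1-to-7 scale (invented data).
control_ratings = [3, 4, 3, 5, 4, 3, 4, 4]
full_dose_ratings = [5, 6, 5, 6, 4, 6, 5, 6]
p_value = permutation_p_value(control_ratings, full_dose_ratings, seed=13)
print(p_value)
```

A small fraction suggests the observed difference would rarely arise from the random assignment alone.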

24 Just Checking At one time a method called “gastric freezing” was used to treat people with peptic ulcers. An inflatable bladder was inserted down the esophagus and into the stomach, and then a cold liquid was pumped into the bladder. Now you can find the following notice on the internet site of a major insurance company: [Our company] does not cover gastric freezing (intragastric hypothermia) for chronic peptic ulcer disease… Gastric freezing for chronic peptic ulcer disease is a non-surgical treatment which was popular about 20 years ago but now is seldom performed. It has been abandoned due to a high complication rate, only temporary improvement experienced by patients, and a lack of effectiveness when tested by double-blind, controlled clinical trials.

25 Just Checking What was the factor in this experiment?
What was the response variable? What were the treatments? How did researchers decide which subjects received which treatment? Were the results statistically significant?

26 Experiments and Samples
Both experiments and sample surveys use randomization to get unbiased data (observational studies are not random). Sample surveys attempt to estimate population parameters, while experiments assess the effectiveness of treatments. Though random, experiments don’t necessarily have a sampling frame equal to the population. If an experiment is testing the effectiveness of a certain blood pressure medication, the medical researcher will probably only deal with patients that have high blood pressure. Be careful generalizing an experiment to the entire population – results can only be generalized to the population studied.

27 Control Treatments Suppose we want to test a $300 piece of software designed to shorten download times. Wouldn’t we want to compare using this software to NOT using it? Control group – the experimental units assigned to a “baseline” treatment level, typically either the default treatment or a placebo. Their responses provide a basis for comparison. Control treatment – the treatment the control group receives.

28 Blinding Humans are notoriously susceptible to errors in judgment. If, for example, an experimenter was interested in whether people prefer Pepsi or Coca Cola, would he tell the subjects of an experiment which cola they were drinking? Why or why not? Blinding – any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups is said to be blind.

29 Types of Blinding There are two classes of individuals that can affect the outcome of the experiment: those who can influence the results (subjects, treatment administrators, or technicians) and those who evaluate the results (judges, treating physicians, etc.). Single-blind – all the individuals in one of the classes are blinded. Double-blind – everyone in both classes is blinded.

30 Placebo Often, simply applying ANY treatment can induce improvement. A “fake” treatment that looks just like the treatment being tested is a placebo. A common placebo is a sugar pill. The placebo effect is the tendency of many humans (often 20% or more of experimental subjects) to show a response even when administered a placebo.

31 The “Best” Experiments
are usually… randomized, double-blind, comparative, and placebo-controlled.

32 Blocking We wanted to use 24 tomato plants of the same variety for our experiment, but suppose the garden store only had 18 plants left. We drove to another nursery and bought 6 more plants of the same variety. Should we worry that these plants came from a different nursery and perhaps were exposed to differences in rain, sunlight, soil, etc.? Why don’t we block to isolate the variability attributable to the possible differences between the different tomato plants?

33 Back to the Tomatoes [Diagram: 18 tomato plants from store A and 6 from store B. Block A (18 tomato plants): Group 1 (6 plants, Treatment 1: control), Group 2 (6 plants, Treatment 2: ½ dose), Group 3 (6 plants, Treatment 3: full dose); compare juiciness and tastiness. Block B (6 tomato plants): Group 4 (2 plants, Treatment 1: control), Group 5 (2 plants, Treatment 2: ½ dose), Group 6 (2 plants, Treatment 3: full dose); compare juiciness and tastiness]
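
A minimal sketch of the blocked tomato design, assuming the 18 plants from store A and the 6 from store B are labeled A1 to A18 and B1 to B6. The labels, the function name `randomized_block`, and the seed are hypothetical. The point it illustrates is that shuffling happens separately inside each block, so every treatment gets the same share of plants from each store.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomize units to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"block {block_name} does not divide evenly")
        shuffled = list(units)
        rng.shuffle(shuffled)          # shuffle only within this block
        size = len(shuffled) // len(treatments)
        for i, treatment in enumerate(treatments):
            assignment[(block_name, treatment)] = shuffled[i * size:(i + 1) * size]
    return assignment

# Hypothetical plant labels for the two nurseries.
blocks = {
    "A": [f"A{n}" for n in range(1, 19)],   # 18 plants from store A
    "B": [f"B{n}" for n in range(1, 7)],    # 6 plants from store B
}
treatments = ["control", "half dose", "full dose"]
assignment = randomized_block(blocks, treatments, seed=13)
for (block_name, treatment), members in assignment.items():
    print(block_name, treatment, members)
```

Block A yields three groups of 6 and block B three groups of 2, so store-to-store differences cannot line up with the fertilizer levels.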

34 Different Experimental Designs
Randomized Block Design – randomization occurs only within the blocks. Completely Randomized Design – all experimental units have an equal chance of receiving any treatment. Either can be single-factor, two-factor, three-factor, etc.

35 Back to Observational Studies
Stratified = surveys; blocking = experiments. Matching – a pairing of subjects because of similarities not under study in a retrospective or prospective study. For example, a retrospective study on music education may match two students of the same socioeconomic background and grades, where one plays an instrument and one does not.

36 Adding More Factors There are two kinds of gardeners. Some water frequently, making sure that the plants are never dry. Others let Mother Nature take her course and leave the watering up to the rain. The makers of OptiGro want to ensure their product will work under a wide variety of watering conditions. Let’s reevaluate the experiment about OptiGro by adding a second factor, irrigation.

37 Back to the Tomatoes Two factors: fertilizer at three levels (none, half, full) and irrigation at two levels (water, no water). How many treatments?

                 No Fert   Half Fert   Full Fert
No Added Water      1          2           3
Daily Watering      4          5           6
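
In a two-factor design, each treatment is one combination of a fertilizer level with an irrigation level, so the number of treatments is the product of the numbers of levels. A short sketch that enumerates the combinations in the same order as the slide's numbering (the variable names are illustrative assumptions):

```python
from itertools import product

fertilizer_levels = ["none", "half", "full"]
irrigation_levels = ["no added water", "daily watering"]

# Irrigation varies slowest so the numbering runs across each watering row.
treatments = {
    number: combo
    for number, combo in enumerate(product(irrigation_levels, fertilizer_levels), start=1)
}

for number, (irrigation, fertilizer) in treatments.items():
    print(number, irrigation, fertilizer)
print(len(treatments), "treatments")
```

With 24 plants and 6 treatments, an even split gives 4 plants per treatment.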

38 Back to the Tomatoes [Diagram: 24 tomato plants assigned to Group 1 (4 plants, Treatment 1: control/no water), Group 2 (4 plants, Treatment 2: ½ dose/no water), Group 3 (4 plants, Treatment 3: full/no water), Group 4 (4 plants, Treatment 4: control/water), Group 5 (4 plants, Treatment 5: ½ dose/water), and Group 6 (4 plants, Treatment 6: full/water); compare juiciness and tastiness]

39 Confounding When the levels of one factor are associated with the levels of another factor, we say the factors are confounded. Example: Professor Stephen Ceci of Cornell University performed an experiment to investigate the effect of a teacher’s classroom style on student evaluations. He taught a class in developmental psychology during two successive terms to a total of 472 students in two very similar classes. He kept everything identical (the syllabi, office hours, text, etc.) and only modified his style – using a subdued demeanor in the fall and expansive gestures/enthusiasm in the spring. At the end of the term, the fall students rated him as an average teacher and the spring students rated him as an excellent teacher. The “how much you learned on a scale from 0-5” rating went from 2.93 in the fall to 4.05 in the spring. However, how much of the difference he observed was due to his difference in manner and how much might have been due to the season of the year? Fall (into winter) is dark, cold, and gloomy, while spring is warm, flowers bloom, etc.

40 Lurking vs. Confounding
A lurking variable is usually thought of as a prior cause of both x and y that makes it appear that x may be causing y. A confounding variable is usually associated in a noncausal way with a factor and affects the response. Because of the confounding, we find that we can’t tell whether any effect we see was caused by our factor or by the confounding variable – or even by both working together.

41 What Can Go Wrong? Don’t give up just because you can’t run an experiment – it may not be possible to control the factors, it may be unethical, etc. When you can’t run an experiment, consider an observational study. Beware of confounding – randomization helps minimize this, but some cannot be avoided. Bad things can happen even to good experiments – record as much additional information as possible. Don’t spend your entire budget on the first run – use the first attempt as a “practice” run.

