Aim – How do we assess the differences between an experiment and an observational study?
H.W. – pg 259 #45 – 47, 49 – 51, 53
CHAPTER 4 Designing Studies
4.2 Experiments
Experiments
DISTINGUISH between an observational study and an experiment.
EXPLAIN the concept of confounding.
IDENTIFY the experimental units, explanatory and response variables, and treatments in an experiment.
EXPLAIN the purpose of comparison, random assignment, control, and replication in an experiment.
DESCRIBE a completely randomized design for an experiment.
DESCRIBE the placebo effect and the purpose of blinding in an experiment.
INTERPRET the meaning of statistically significant in the context of an experiment.
EXPLAIN the purpose of blocking in an experiment.
DESCRIBE a randomized block design or a matched pairs design for an experiment.
Observational Study vs. Experiment
An observational study observes individuals and measures variables of interest but does not attempt to influence the responses. An experiment deliberately imposes some treatment on individuals to measure their responses. When our goal is to understand cause and effect, experiments are the only source of fully convincing data. The distinction between observational study and experiment is one of the most important in statistics. In contrast to observational studies, experiments don’t just observe individuals or ask them questions. They actively impose some treatment in order to measure the response.
Observation vs. Experiment
Soy good for you? The November 2009 issue of Nutrition Action discusses what the current research tells us about the supposed benefits of soy. For a long time, scientists have believed that the soy foods in Asian diets explain the lower rates of breast cancer, prostate cancer, osteoporosis, and heart disease in places like China and Japan. However, when experiments were conducted, soy either had no effect or a very small effect on the health of the participants. For example, several different studies randomly assigned elderly women to either soy or placebo, and none of the studies showed that soy was more beneficial for preventing osteoporosis. So what explains the lower rates of osteoporosis in Asian cultures? We still don’t know. It could be due to genetics, other dietary factors, or any other difference between Asian cultures and non-Asian cultures.
Example cont’d In this example, the explanatory variable was soy consumption, particularly in the regions where soy is part of the regular diet. The response variable was the health outcomes themselves (e.g., osteoporosis). But in the end, nothing was found to be conclusive. As the study stated, there were other variables to consider, such as genetics or other dietary factors. Whenever more than one variable affects the response variable, and we can’t distinguish their individual impacts on the result, we call such a variable a confounding variable.
Check our understanding so far
Does reducing screen brightness increase battery life in laptop computers? To find out, researchers obtained 30 new laptops of the same brand. They chose 15 of the computers at random and adjusted their screens to the brightest setting. The other 15 laptop screens were left at the default setting – moderate brightness. Researchers then measured how long each machine’s battery lasted. Answer the following: Was this an observational study or an experiment? Justify your answer. Solution: Experiment, because a treatment (screen brightness) was imposed on the laptops.
Observational Study vs. Experiment
Observational studies of the effect of an explanatory variable on a response variable often fail because of confounding between the explanatory variable and one or more other variables. Well-designed experiments take steps to prevent confounding. Confounding occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other.
A.P. Exam tip If you are asked to identify a possible confounding variable in a given setting, you are expected to explain how the variable you chose: Is associated with the explanatory variable. Affects the response variable.
Cont’d Does eating dinner with their families improve students’ academic performance? According to an ABC News article, “Teenagers who eat with their families at least 5 times a week are more likely to get better grades in school.” This finding was based on a sample survey conducted by researchers at Columbia University. Was this an observational study or an experiment? Justify your answer. Solution: Observational study, because students were not assigned to eat a particular number of meals with their families per week.
Cont’d. What are the explanatory and response variables? Solution:
Explanatory – number of meals per week eaten with family. Response – GPA. 2. Explain clearly why such a study cannot establish a cause-and-effect relationship. Suggest a variable that may be confounded with whether families eat dinner together. There are probably other variables that influence the response variable. For example, students who have part-time jobs may not be able to eat many meals with their families and may not have much time to study, leading to possibly lower grades.
Exit Ticket Look at the following and determine if it is an observational study or an experimental study: Find 100 women, age 20, who do not smoke. Randomly assign 50 of the 100 women to the smoking treatment and the other 50 to the no smoking treatment. Those in the smoking group smoke one pack a day for 10 years while those in the other group are smoke free for 10 years. Measure lung capacity for the 100 women. Analyze, interpret, and draw conclusions from data.
Exit Ticket continued:
Find 100 women, age 30, of which 50 have been smoking a pack a day for 10 years, while the other 50 have been smoke free for 10 years. Measure lung capacity for each of the 100 women. Analyze, interpret, and draw conclusions from data.
Aim – How can we interpret the language of experimental design?
H.W. – pg. 260 – 261 #59, 66 – 68
The Language of Experiments
An experiment is a statistical study in which we actually do something (a treatment) to people, animals, or objects (the experimental units) to observe the response. Here is the basic vocabulary of experiments. A specific condition applied to the individuals in an experiment is called a treatment. If an experiment has several explanatory variables, a treatment is a combination of specific values of these variables. The experimental units are the smallest collection of individuals to which treatments are applied. When the units are human beings, they often are called subjects.
Ex – experimental vocabulary:
A study published in the New England Journal of Medicine (March 11, 2010) compared two medicines to treat head lice: an oral medication called ivermectin and a topical lotion containing malathion. Researchers studied 812 people in 376 households in seven areas around the world. Of the 185 households randomly assigned to ivermectin, 171 were free from head lice after 2 weeks compared with only 151 of the 191 households randomly assigned to malathion. Problem: Identify the experimental units, explanatory and response variables, and the treatments in this experiment. Solution: The experimental units are the 376 households, not the 812 people, because the treatments were assigned to entire households, not separately to individuals within the household. The explanatory variable is type of medication, and the response variable is whether the household was lice-free. The treatments were ivermectin and malathion.
Another example – if needed
Researchers at the University of North Carolina were concerned about the increasing dropout rate in the state’s high schools, especially for low-income students. Surveys of recent dropouts revealed that many of these students had started to lose interest during middle school. They said they saw little connection between what they were studying in school and their future plans. To change this perception, researchers developed a program called Career Start. The big idea of the program is that teachers show students how the topics they learn get used in specific careers. To test the effectiveness of Career Start, the researchers recruited 14 middle schools in Forsyth County to participate in an experiment. Seven of the schools, determined at random, used Career Start along with the district’s curriculum. The other seven schools just followed the district’s curriculum.
Example continued: Researchers followed both groups of students for several years, collecting data on students’ attendance, behavior, standardized test scores, level of engagement in school, and whether or not the students graduated from high school. Results: Students at schools that used Career Start generally had better attendance and fewer discipline problems, earned higher test scores, reported greater engagement in their classes, and were more likely to graduate. Problem – Identify the experimental units, explanatory and response variables, and the treatments in the Career Start experiment. Solution: The experimental units are the 14 middle schools in Forsyth County, NC. The explanatory variable is whether the school used the Career Start program with its students.
Example continued There were several response variables: test scores, attendance, behavior, student engagement, and graduation rate. This experiment compares two treatments: the first is the standard middle school curriculum; the second is the curriculum plus Career Start. Again, just as in the previous example, the experimental units are the schools, not the students themselves. Experimental units are the smallest collection of individuals to which treatments are applied. The treatments were given to entire schools, so those are the experimental units.
Advantage of experiments
Experiments can give good evidence for causation. In an experiment, we study the effects of the specific treatments we are interested in, while trying to control the effects of other variables. The explanatory variables in an experiment are often called factors. Many experiments study the joint effects of several factors. In such an experiment, each treatment is formed by combining a specific value (often called a level) of each factor.
Example of multifactor experiment:
Growing tomatoes Does adding fertilizer affect the productivity of tomato plants? How about the amount of water given to the plants? To answer these questions, a gardener plants 24 similar tomato plants in identical pots in his greenhouse. He will add fertilizer to the soil in half of the pots. Also, he will water 8 of the plants with 0.5 gallon of water per day, 8 of the plants with 1 gallon of water per day, and the remaining 8 plants with 1.5 gallons of water per day. At the end of 3 months, he will record the total weight of tomatoes produced on each plant. Identify the experimental units or subjects, explanatory and response variables, and the treatments. Solution: The experimental units are the tomato plants. The two explanatory variables (factors) are whether fertilizer is applied and amount of water. The response variable is the weight of tomatoes produced. There are 6 treatments: (1) fertilizer, 0.5 gallon; (2) fertilizer, 1 gallon; (3) fertilizer, 1.5 gallons; (4) no fertilizer, 0.5 gallon; (5) no fertilizer, 1 gallon; (6) no fertilizer, 1.5 gallons.
Key ideas: To determine the number of treatments, multiply the number of levels for each factor in the experiment. In the previous example, there were two explanatory variables (fertilizer and water): the factor fertilizer had two levels (fertilizer, no fertilizer) and the factor water had three levels (0.5 gallon, 1 gallon, and 1.5 gallons). This means there are 2 × 3 = 6 treatments.
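As a quick check on the multiplication rule, the treatment combinations can be listed directly. This is a minimal sketch in Python (not part of the original lesson); the level labels are written out only for illustration:

```python
from itertools import product

# The two factors from the tomato example and their levels.
fertilizer_levels = ["fertilizer", "no fertilizer"]  # 2 levels
water_levels = [0.5, 1.0, 1.5]                       # 3 levels, gallons per day

# Each treatment is one combination of a level from every factor.
treatments = list(product(fertilizer_levels, water_levels))
print(len(treatments))  # 2 x 3 = 6 treatments
```

Adding a third factor (say, two greenhouse temperatures) would double the count again, to 12 treatments.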
How to Experiment Badly
Many laboratory experiments use a design like the one in the online SAT course example: Experimental Units → Treatment → Measure Response. In the lab environment, simple designs often work well. Field experiments and experiments with animals or people deal with more variable conditions. Outside the lab, badly designed experiments often yield worthless results because of confounding.
Example of experimenting badly:
Many students regularly consume caffeine to help them stay alert. So, it seems plausible that taking caffeine might increase an individual’s pulse rate. Is this true? One way to investigate this claim is to ask volunteers to measure their pulse rates, drink some cola with caffeine, measure their pulse rates again after 10 minutes, and calculate the increase in pulse rate. Unfortunately, even if the pulse rate of every student went up, we couldn’t attribute the increase to caffeine. Perhaps the excitement of being in an experiment made their pulse rates increase. Perhaps it was the sugar in the cola and not the caffeine. Perhaps their teacher told them a funny joke during the 10-minute waiting period and made everyone laugh! In other words, there are many other variables that are potentially confounded with taking caffeine.
How to Experiment Well The remedy for confounding is to perform a comparative experiment in which some units receive one treatment and similar units receive another. Most well-designed experiments compare two or more treatments. Comparison alone isn’t enough; if the treatments are given to groups that differ greatly, bias will result. The solution to the problem of bias is random assignment. In an experiment, random assignment means that experimental units are assigned to treatments using a chance process.
Old example revisited:
Suppose you have a class of 30 students who volunteer to be subjects in the caffeine experiment described earlier. Problem: Explain how you would randomly assign 15 students to each of the two treatments: (a) Using 30 identical slips of paper Solution: Using 30 identical slips of paper, write A on 15 pieces of paper and B on the other 15. Mix them thoroughly in a hat and have each student select one slip of paper. Then ask each student who received an A to drink the cola with caffeine and each student who received a B to drink the cola without caffeine.
Continued: (b) Using technology
Number the students from 1 to 30. Use randInt(1,30,15) to select 15 different numbers from 1 to 30. These students will drink the cola with caffeine and the remaining 15 will drink the cola without caffeine. (c) Using Table D Number the students from 01 to 30. Use a line from Table D and read two-digit numbers moving from left to right. The first 15 different numbers from 01 to 30 will identify the students who will drink the cola with caffeine. The remaining 15 students will drink the cola without caffeine.
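The slips-of-paper and randInt methods can also be mimicked in a few lines of code. A hedged sketch, with Python standing in for the calculator (the seed is fixed only so a run is reproducible, not part of the method):

```python
import random

random.seed(2024)  # any seed works; fixed only for a reproducible illustration

students = list(range(1, 31))  # students numbered 1 to 30
# sample() picks 15 distinct numbers, like reading Table D until
# 15 different two-digit numbers from 01 to 30 appear.
caffeine_group = sorted(random.sample(students, 15))
control_group = [s for s in students if s not in caffeine_group]

# Every student lands in exactly one group of 15.
print(len(caffeine_group), len(control_group))  # 15 15
```

The key property shared by all three methods is that every set of 15 students is equally likely to end up in the caffeine group.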
A.P. Exam Tip The method of assigning treatments to experimental units must be random. The method must be described in enough detail that someone knowledgeable in statistics could follow your description and carry out the method in exactly the same way. (Always remember: be as detailed as possible.)
Aim – How do we assess other methods of experimental design?
H.W. – pg. 262 – 263 #69 – 72, 77, 78a, 79, 81 and 85
Principles of Experimental Design
The basic principles for designing experiments are as follows:
1. Comparison. Use a design that compares two or more treatments.
2. Random assignment. Use chance to assign experimental units to treatments. Doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among the treatment groups.
3. Control. Keep other variables that might affect the response the same for all groups.
4. Replication. Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups.
Randomized comparative experiments are designed to give good evidence that differences in the treatments actually cause the differences we see in the response.
A well – designed experiment:
Researchers in Canada performed an experiment with university students to examine the effects of in-class laptop use on student learning. All participants in the study were asked to attend a university-style lecture and take notes with their laptops. Half of the participants were assigned to complete other non-lecture-related online tasks during the lecture. These tasks were meant to imitate typical student Web browsing during classes. The remaining students simply took notes with their laptops. To assign the treatments, the researchers printed 40 papers with instructions (20 with multitasking and 20 without), shuffled them, and handed them out at random to students in the classroom. At the end of the lecture, all participants took a comprehension test to measure how much they learned from it. The results: students who were assigned to multitask did significantly worse (11%) than students who were not assigned to multitask.
A well designed experiment continued:
Explain how each of the four principles of experimental design was used in the multitasking study. Solution: Comparison: The researchers compared students who were multitasking to other students who were not multitasking. Random assignment: It was determined at random which students (and which seat locations) received the instructions to multitask and which students (and seat locations) received the regular instructions. Control: The experiment used undergraduate students from the same university in Canada. All participants listened to the same lecture and took the same comprehension test at the end of the lecture. Replication: There were 20 students in each treatment group. Although this is not a very large number of subjects, the difference in performance was big enough to determine that the difference was due to multitasking and not the chance variation in random assignment.
Completely Randomized Design
In a completely randomized design, the treatments are assigned to all the experimental units completely by chance. Some experiments may include a control group that receives an inactive treatment or an existing baseline treatment. Diagram: Experimental Units → Random Assignment → Group 1 (Treatment 1) and Group 2 (Treatment 2) → Compare Results.
Ex – television and young people:
Psychologists want to know how different types of television shows impact young children. They recruit 60 four-year-olds and have them watch 9 minutes of a fast-paced children’s program (defined as scene changes every 10–15 seconds), watch 9 minutes of a slow-paced children’s program (defined as scene changes every 30–45 seconds), or draw for 9 minutes. After the 9 minutes, each child will complete several tasks, including tests for mental ability and impulse control. Problem: Describe a completely randomized design for this experiment. Write a few sentences describing how you would implement your design.
Ex – solution: Randomly assign the 60 children to three groups of 20: one group watches the fast-paced program, one watches the slow-paced program, and one draws. To implement the design, use 60 equally sized slips of paper. Label 20 of the slips “1,” 20 of the slips “2,” and 20 of the slips “3.” Mix the numbers in a hat and have each child draw a number without looking. The number that each child chooses will be the group to which he or she is assigned. At the end of the study, compare the mental ability and impulse control for the three treatment groups. (Read the similar example in your textbook.) BTW – Why do you think the researchers had 20 of the children draw for 9 minutes when the study was about different types of television viewing? Answer – this was the control group. If there were no control group, the researchers could compare fast-paced and slow-paced television shows, but they would not be able to determine whether either type of show was worse (or better) than not watching TV at all.
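The hat-and-slips procedure above can also be sketched in code: shuffling the children and cutting the list into thirds is equivalent to drawing slips from a hat. A minimal sketch, not the researchers' actual method:

```python
import random

random.seed(7)  # fixed only so the illustration is reproducible

children = list(range(1, 61))  # the 60 four-year-olds, numbered 1 to 60
random.shuffle(children)       # the chance ordering plays the role of the hat

fast_paced = children[:20]     # watch the fast-paced program
slow_paced = children[20:40]   # watch the slow-paced program
drawing = children[40:]        # draw for 9 minutes (the control group)
```

Every child ends up in exactly one group, and every split into three groups of 20 is equally likely.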
Check your understanding:
Music students often don’t evaluate their own performance accurately. Can small-group discussions help? The subjects were 29 students preparing for the end-of-semester performance that is an important part of their grade. The 15 students in one group each videotaped a practice performance, evaluated it themselves, and then discussed the tape with a small group of other students. The remaining 14 watched and evaluated their tapes alone. At the end of the semester, the discussion-group students evaluated their final performances more accurately. Describe a completely randomized design for this experiment. What is the purpose of the control group in this experiment?
Answers: Randomly assign the 29 students to two treatments: evaluating the performance in small groups or evaluating the performance alone. The response variable will be the accuracy of their final performance evaluations. To implement the design, use 29 equally sized slips of paper. Label 15 of them small group and 14 of them alone. Then shuffle the papers and hand them out at random to the 29 students to assign them to a treatment. The purpose of the control group is to provide a baseline for comparison. Without a group to compare to, it is impossible to determine if the small group treatment is more effective.
Experiments: What Can Go Wrong?
The logic of a randomized comparative experiment depends on our ability to treat all the subjects the same in every way except for the actual treatments being compared. Good experiments, therefore, require careful attention to details to ensure that all subjects really are treated identically. The response to a dummy treatment is called the placebo effect. In a double-blind experiment, neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received.
Another definition: Single-blind means one of two things:
Either the individuals who are interacting with the subjects and measuring the response variables don’t know who is receiving which treatment. Or: The subjects are unaware of which treatment they are receiving, but the people interacting with them and measuring the response variable do know.
Check our understanding:
In an experiment, researchers examined the effect of ultrasound on birth weight. Pregnant women participating in the study were randomly assigned to one of two groups. The first group of women received an ultrasound; the second group did not. When the subjects’ babies were born, their birth weights were recorded. The women who received ultrasounds had heavier babies. Did the experimental design take the placebo effect into account? Why is this important? No. Maybe seeing the image of their unborn child encouraged the mothers who had ultrasounds to eat a better diet, resulting in healthier babies.
Continued: Was the experiment double-blind? Why is this important?
No, because the mothers knew whether they had the ultrasounds or not. Based on your answers to these questions, describe an improved design for this experiment. Treat all mothers as if they had an ultrasound, but for some mothers the ultrasound machine wouldn’t be turned on. To prevent mothers from figuring out that the machine was turned off, the ultrasound screen would have to be turned away from all mothers.
Inference for Experiments
In an experiment, researchers usually hope to see a difference in the responses so large that it is unlikely to happen just because of chance variation. We can use the laws of probability, which describe chance behavior, to learn whether the treatment effects are larger than we would expect to see if only chance were operating. If they are, we call them statistically significant. An observed effect so large that it would rarely occur by chance is called statistically significant. A statistically significant association in data from a well-designed experiment does imply causation.
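One way to see what "unlikely to happen just because of chance variation" means is to re-randomize the group labels many times and count how often chance alone produces a difference at least as large as the observed one. The sketch below uses invented response values purely for illustration; it is a simple re-randomization idea, not a procedure from the lesson:

```python
import random

random.seed(1)  # fixed only so the illustration is reproducible

# Invented responses for two treatment groups of 6 units each.
group_a = [78, 85, 90, 74, 88, 82]
group_b = [70, 72, 68, 75, 71, 69]
observed = sum(group_a) / 6 - sum(group_b) / 6  # observed difference in means

pooled = group_a + group_b
trials = 10_000
extreme = 0
for _ in range(trials):
    random.shuffle(pooled)  # re-do the random assignment by chance alone
    diff = sum(pooled[:6]) / 6 - sum(pooled[6:]) / 6
    if diff >= observed:
        extreme += 1

p_value = extreme / trials  # a tiny value -> statistically significant
```

With these made-up data the observed difference of 12 almost never occurs under re-randomization, so it would be called statistically significant; a difference that reappears frequently under shuffling would not be.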
Blocking When a population consists of groups of individuals that are “similar within but different between,” a stratified random sample gives a better estimate than a simple random sample. This same logic applies in experiments. A block is a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments. In a randomized block design, the random assignment of experimental units to treatments is carried out separately within each block. Completely randomized designs are the simplest statistical designs for experiments. But just as with sampling, there are times when the simplest method doesn’t yield the most precise results.
Ex: A popcorn lover wants to know if it is better to use the “popcorn button” on her microwave oven or use the amount of time recommended on the bag of popcorn. To measure how well each method works, she will count the number of unpopped kernels remaining after popping. To obtain the experimental units, she goes to the store and buys 10 bags each of 4 different varieties of microwave popcorn (butter, cheese, natural, and kettle corn), for a total of 40 bags. Explain why a randomized block design might be preferable to a completely randomized design for this experiment.
Ex – cont’d. A completely randomized design ignores the difference between the four types of popcorn, which will probably result in a great deal of variability in the number of unpopped kernels for both treatments. For example, if there are many more unpopped kernels in bags of cheese popcorn and kettle corn than the other two varieties, it will be harder to tell if there is a difference in the methods of popping. A randomized block design considers each variety of popcorn separately, which allows us to account for the variability in the number of unpopped kernels created by the difference in varieties. Outline a randomized block design for this experiment. Describe how you would carry out the random assignment required by your design.
Ex continued: We will randomly assign 5 bags of each variety to each of the two treatments. To perform the random assignment, place all 10 bags of a particular variety in a large grocery bag. Shake the bag and then randomly select 5 bags to be popped using the “popcorn button.” The remaining 5 bags will be popped using the instructions on the bags. Repeat this process for the remaining 3 varieties. After popping each of the 40 bags in random order, count the number of unpopped kernels in each bag and compare the results within each variety. Then combine the results from the 4 varieties after accounting for the difference in average response for each variety.
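The within-block randomization described above is mechanical enough to sketch in code: randomize separately inside each variety, never across varieties. A hedged sketch with invented bag labels:

```python
import random

random.seed(3)  # fixed only so the illustration is reproducible

varieties = ["butter", "cheese", "natural", "kettle corn"]  # the four blocks
design = {}
for variety in varieties:
    bags = [f"{variety} bag {i}" for i in range(1, 11)]  # 10 bags per block
    # Random assignment happens separately within each block:
    button = random.sample(bags, 5)               # 5 bags -> "popcorn button"
    timed = [b for b in bags if b not in button]  # 5 bags -> time on the bag
    design[variety] = {"popcorn button": button, "bag time": timed}
```

Contrast this with a completely randomized design, which would pool all 40 bags and could, by chance, send most of one variety to a single treatment.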
Blocking sounds a lot like stratifying, doesn’t it?
How are they different? Pretend you have a mountain of dirty clothes. You want to see the difference between washing dark and light colored clothing with warm and cold water. You need a sample because washing all of them takes too long. Stratifying means you separate the clothes into dark and light colors and pick a random sample from each color pile. Then you combine them into a miniature sample and use random assignment to assign the clothes in the sample either the cold or warm water treatment.
Block vs. stratifying cont’d.
Blocking also means that you separate the clothes into dark and light colors and you pick a simple random sample from each pile. However, instead of combining them to make a miniature sample, you keep them separate and use random assignment within these blocks. For instance – for the dark color pile, you use random assignment to assign half to cold and half to warm water washing, and you repeat the same process for the light pile, keeping them separate the entire time.
Matched Pairs Design A common type of randomized block design for comparing two treatments is a matched pairs design. The idea is to create blocks by matching pairs of similar experimental units. A matched pairs design is a randomized block experiment in which each block consists of a matching pair of similar experimental units. Chance is used to determine which unit in each pair gets each treatment. Sometimes, a “pair” in a matched pairs design consists of a single unit that receives both treatments. Since the order of the treatments can influence the response, chance is used to determine which treatment is applied first for each unit.
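For the single-unit version of a matched pairs design, the order randomization is literally a coin flip per subject. A minimal sketch with hypothetical subject and treatment names (none of these come from the lesson):

```python
import random

random.seed(4)  # fixed only so the illustration is reproducible

subjects = ["S1", "S2", "S3", "S4", "S5", "S6"]  # hypothetical subjects
order = {}
for s in subjects:
    # A coin flip decides which treatment this subject receives first;
    # the other treatment is given second.
    first = random.choice(["treatment A", "treatment B"])
    second = "treatment B" if first == "treatment A" else "treatment A"
    order[s] = (first, second)
```

Every subject still receives both treatments; only the order is left to chance, so order effects cannot be confounded with the treatment effect.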
Ex – cont’d: see the example in the textbook.