Chapter 4 Gathering Data.

Presentation on theme: "Chapter 4 Gathering Data."— Presentation transcript:

1 Chapter 4 Gathering Data

2 Looking Back In Chapters 2 & 3 we learned how to describe data both graphically and numerically. For these statistical analyses to be useful, we must have good data. In fact, the way a study is designed (how we gather data) can have a major impact on the results of the study. The purpose of this course is for you to learn what you can conclude about an entire population given a sample from that population. If a study is poorly designed and implemented, the results may be meaningless or misleading.

3 Two Scenarios
Study 1: A U.S. study (2000) compared 469 patients with brain cancer to 422 patients who did not have brain cancer. The patients’ cell phone use was measured using a questionnaire. The two groups’ use of cell phones was similar.
Study 2: An Australian study (1997) used 200 transgenic mice. One hundred were exposed for two 30-minute periods a day to the same kind of microwaves, at roughly the same power, as those transmitted from a cell phone. The other 100 mice were not exposed. After 18 months, the brain tumor rate for the exposed mice was twice as high as that for the unexposed mice.
Example taken from Statistics: The Art and Science of Learning from Data

6 Questions to Consider
How do the two studies differ?
Study 1: No treatments assigned. Patients merely questioned.
Study 2: Uses mice in hopes of generalizing to humans.

9 Questions to Consider
Why do the results of different medical studies sometimes disagree? Differing types of studies, data collection or sample frames.
Could the second study be performed on human beings? No, because it would be unethical to knowingly expose humans to possibly harmful waves.

11 Questions to Consider Suppose a friend recently diagnosed with brain cancer was a frequent cell phone user. Is this strong evidence that frequent cell phone use increases the likelihood of getting brain cancer? Informal observations of this type are called anecdotal evidence. You should rely on reputable research studies, not anecdotes.

14 Two Main Ways to Gather Data
Observational Study: The researcher observes values of the response and explanatory variables for the sampled subjects without imposing any treatments. Example: Study 1
Experiment: The researcher assigns experimental conditions (also called treatments) to subjects (also called experimental units) and then observes outcomes on the response variable. Treatments correspond to values of the explanatory variable. Example: Study 2

16 Advantages of Experiments over Observational Studies
In an observational study, there can always be lurking variables affecting the results. This means that observational studies can never show causation. It is easier to adjust for lurking variables in an experiment. In general, we can study the effect of an explanatory variable on a response variable more accurately with an experiment than with an observational study.

20 Disadvantages of Experiments
They can be unethical to perform on the subjects in which you are interested.
It can be difficult to monitor subjects to ensure that they are doing what they are told.
They can take many years, even decades, to complete.
Results of experiments that use animals may not generalize to humans.
They are unnecessary when the question of interest does not involve trying to assess causality.

23 Example 4.1 A large study of student drug use and how it depends on drug testing enrolled 76,000 middle and high school students. Each student in the study filled out a questionnaire. One question asked whether the student used drugs. The study found that drug use was not affected by student drug testing. This is an example of an observational study. Could there be any lurking variables? Frequency of drug testing, whether testing is random, etc. Example taken from Statistics: The Art and Science of Learning from Data

26 Example 4.2
A researcher buys seeds of two different varieties of corn. He randomly selects 30 seeds of each variety and plants them in his backyard, making sure to label the location of each seed and its type. He then measures how long it takes each seed to sprout. At the end of the study he compares the average germination time of the different varieties. This is an example of an experiment. Could there be any lurking variables? Soil quality, temperature.
Used with permission from Dr. Ellen Toby

29 Example 4.3
A researcher has seeds of only one variety of tomato. She has 60 nearly identical pots of soil and plants one tomato seed in each. She randomly selects 30 pots and keeps them at 75° F. The other 30 pots she keeps at 65° F. Aside from temperature, she provides the same growing conditions to all pots. She then measures how long it takes for the seeds to sprout. At the end of the study she compares the average germination time of the different temperature groups. This is an example of an experiment. Are there any lurking variables? No, everything has been controlled here.
Used with permission from Dr. Ellen Toby

30 Types of Observational Studies
Retrospective: Observational studies that look back in time. This is sometimes done to find risk factors for certain diseases.
Cross-Sectional: Observational studies that take a cross section of the population at the current time.
Prospective: Observational studies in which subjects are followed into the future.

32 Sampling Designs for Observational Studies
Simple Random Sampling (SRS) A simple random sample of n subjects from a population is one in which each possible sample of that size has the same chance of being selected.

34 Sampling Designs for Observational Studies
Stratified Sampling A stratified random sample divides the population into separate groups, called strata, and then selects an SRS of subjects from each stratum.

36 Sampling Designs for Observational Studies
Cluster Sampling A cluster random sample can be used if the target population naturally divides into groups (clusters), each of which is representative of the entire target population. In this method, an SRS of clusters is taken. Every member of the selected clusters is put into the sample.

37 Sampling Designs for Observational Studies
Systematic Sampling A systematic sample selects every kth person from the sample frame. The researcher randomly selects a number between 1 and k in order to know which person to select first, then selects every kth person after this.
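
The four sampling designs described in these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 100 students, the two strata, and the ten classroom clusters are all hypothetical, and the function names are made up for this sketch.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: every possible sample of size n is equally likely."""
    return rng.sample(frame, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of subjects from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Take an SRS of clusters and keep every member of each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [subject for name in chosen for subject in clusters[name]]

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k subjects, then every kth after."""
    start = rng.randrange(k)  # 0-based position of the first selected subject
    return frame[start::k]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame, strata, and clusters (not from the slides).
frame = [f"student_{i:03d}" for i in range(1, 101)]
strata = {"lower": frame[:50], "upper": frame[50:]}
clusters = {f"class_{c}": frame[c * 10:(c + 1) * 10] for c in range(10)}

srs = simple_random_sample(frame, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
syst = systematic_sample(frame, 10, rng)
```

Note the difference between the stratified and cluster versions: the stratified sample draws a few subjects from every group, while the cluster sample draws a few groups and keeps every subject in them.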

39 Advantages of the Various Sampling Designs
Simple Random Sampling (SRS) It is the easiest and most widespread form of sampling. Each subject has an equal chance to be in the sample. The sample enables us to determine how likely it is that descriptive statistics (like the sample mean) fall close to corresponding values for which we would like to make inference (like the population mean).

42 Advantages of the Various Sampling Designs
Stratified Sampling: It ensures that there are enough subjects in each group that you want to compare.
Cluster Sampling: It does not require a sampling frame of subjects. It is less expensive to implement.

45 Bias in Sampling
A sampling method is biased if the sample tends to favor some parts of the population over others. In other words, the results from the sample are not representative of the population. Obviously, unbiased samples are our goal.

46 Types of Bias
Undercoverage: Occurs when a sampling frame leaves out some groups in the population.
Nonresponse bias: Occurs when some sampled subjects cannot be reached, refuse to participate or fail to answer some questions.
Response bias: Occurs when the subject gives an incorrect response or when the question wording or the way the interviewer asks the questions is confusing or misleading.

49 Examples of Poor Samples that Result in Bias
Convenience Samples: Sampling friends; sampling at the mall
Voluntary Response Samples: Internet surveys; call-in surveys

51 Example 4.4 In 1987 in her book Women and Love, Shere Hite presented results of a survey mailed to 100,000 women in the United States. One of her conclusions was that 70% of women who had been married at least five years have extramarital affairs. She based this conclusion on the replies of only 4,500 women (a 4.5% response rate). This is an example of nonresponse bias. Example taken from Statistics: The Art and Science of Learning from Data

53 Example 4.5
Ann Landers asked readers, “If you had it to do over again, would you have children?” A few weeks later, her column was headlined, “70% OF PARENTS SAY KIDS NOT WORTH IT.” Of the nearly 10,000 parents who wrote in, 70% said they would not have children if they could go back in time. This is an example of voluntary response sampling.
Used with permission from Dr. Ellen Toby
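
A small simulation can show why voluntary response samples mislead. Everything numeric here is an assumption made up for illustration (the 30% true rate and both write-in probabilities); none of it comes from the Ann Landers column. The sketch compares the share of "no" answers among people who choose to write in with the share in a simple random sample of the same population.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: assume 30% of parents would answer "no".
TRUE_NO_RATE = 0.30
population = [rng.random() < TRUE_NO_RATE for _ in range(100_000)]  # True = "no"

# Assumed write-in probabilities: unhappy parents respond far more often.
P_WRITE_IF_NO = 0.20
P_WRITE_IF_YES = 0.02

# Voluntary response: each person decides for themselves whether to write in.
volunteers = [says_no for says_no in population
              if rng.random() < (P_WRITE_IF_NO if says_no else P_WRITE_IF_YES)]

# Simple random sample: the researcher chooses who is asked.
srs = rng.sample(population, 1_000)

def share_no(responses):
    """Proportion of responses that are 'no'."""
    return sum(responses) / len(responses)

print(f"voluntary response: {share_no(volunteers):.2f}")
print(f"simple random sample: {share_no(srs):.2f}")
```

Under these assumptions the write-in group overstates the "no" rate by a wide margin, while the simple random sample lands near the assumed 30%.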

55 Example 4.6 In 1936, the Literary Digest conducted a poll to predict the winner of the presidential election. Alf Landon and Franklin Roosevelt were both running for president. The sample frame for the poll was constructed from telephone directories, country club memberships and automobile registrations. The Digest predicted that Landon would win, but in reality FDR won by a landslide. This is an example of convenience sampling that resulted in undercoverage. Example taken from Statistics: The Art and Science of Learning from Data

56 Example 4.7
An experiment involving adolescent males (ages 15-19) appeared in Science. The purpose of the study was to determine whether there was an association between survey techniques and the desire to give socially acceptable answers. The participants were randomly assigned to one of two different survey forms, each of which had identical questions concerning sexual practices and drug habits.
Used with permission from Dr. Ellen Toby

57 Example 4.7 The two versions of the survey were
Paper: participants put answers in an envelope with an ID# on it and returned it in person.
Computer: participants listened to questions in headphones and then answered on laptops.

60 Types of Experimental Studies
Completely Randomized Design: The subjects are randomly assigned to one of the treatments.
Matched Pairs Design: Each subject is matched up with another subject who is similar in terms of age, health, etc. This creates a matched pair. The treatments are then randomly assigned to the subjects in each pair. This ensures that the treatment groups are essentially identical.

62 Types of Experimental Studies
Crossover Design: The subjects cross over during the experiment from one treatment to another.
Randomized Block Design: Similar subjects are matched up to create a large set of experimental units. This is called a block. The treatments are then randomly assigned to units within the blocks.
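
The random assignment step in three of these designs can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the subject labels, the two treatments, the pairs, and the age blocks are hypothetical, and the matched pairs helper assumes exactly two treatments.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

def matched_pairs(pairs, treatments, rng):
    """Within each matched pair, randomly decide who gets which treatment.

    Assumes exactly two treatments.
    """
    assignment = {}
    for a, b in pairs:
        first, second = rng.sample(treatments, 2)
        assignment[a], assignment[b] = first, second
    return assignment

def randomized_block(blocks, treatments, rng):
    """Run a completely randomized assignment separately inside each block."""
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical subjects, treatments, pairs, and blocks.
treatments = ["drug", "placebo"]
subjects = [f"s{i}" for i in range(8)]
pairs = [("s0", "s1"), ("s2", "s3")]
blocks = {"under_40": subjects[:4], "40_plus": subjects[4:]}

crd = completely_randomized(subjects, treatments, rng)
mp = matched_pairs(pairs, treatments, rng)
rb = randomized_block(blocks, treatments, rng)
```

In the block version each block ends up with the same number of subjects on each treatment, which is what keeps the blocking variable from being mixed up with the treatment effect.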

66 Elements of a Good Experiment
Control group: Allows us to compare against an existing treatment. Enables us to control the placebo effect. The placebo effect occurs when patients seem to improve regardless of the treatment they receive.
Randomization: Eliminates bias that can result when researchers assign treatments to the subjects. Balances the group on variables that you know affect the response. Balances the group on lurking variables that may be unknown to you.

70 Elements of a Good Experiment
Blinding: Increases reliability of the results.
Single-blind: subjects do not know the treatment assignment.
Double-blind: neither the subjects nor those in contact with the subjects know the treatment assignment.
Replication: Assigns several experimental units to each treatment.

72 Example 4.9 A pharmaceutical company has developed a new drug for treating high blood pressure. To determine the effectiveness of the drug, the company conducted an experiment in which subjects with a history of high blood pressure were treated with the new drug. A later experiment randomly divided subjects with a history of high blood pressure into two groups. Group A was treated with the new drug as before. Group B received the most popular drug on the market at that time. The subjects were unaware of which treatment they received. 60% of the patients in Group A improved, while 63% of the patients in Group B improved. The second experiment is better because it employs a control group and blinding.

73 Example 4.10 To investigate whether antidepressants help smokers to quit smoking, one study used 429 men and women who were 18 or older and had smoked 15 cigarettes or more per day in the previous year. They were all highly motivated to quit and in good health. They were assigned to one of two groups: one group took an antidepressant called Zyban, while the other group did not take anything. At the end of a year, the study observed whether each subject had successfully abstained from smoking. Example taken from Statistics: The Art and Science of Learning from Data

74 Logic Behind Randomized Comparative Experiments
Randomization ensures that the groups of subjects are similar in all respects before the treatments are applied. Using a control group for comparison ensures that external influences operate equally on both groups. If the groups are large enough, natural differences in subjects will average out. This means that there will be little difference in the results for the groups unless the treatments themselves actually cause the difference.
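
The claim that natural differences average out in large randomized groups can be checked with a short simulation. The setup is hypothetical: 2,000 subjects with a single lurking variable (age, drawn uniformly between 18 and 80), split once at random and once by a deliberately bad rule that puts the oldest half in one group.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical lurking variable: each subject's age.
ages = [rng.uniform(18, 80) for _ in range(2000)]

# Random assignment: shuffle, then split into two equal groups.
shuffled = ages[:]
rng.shuffle(shuffled)
random_a, random_b = shuffled[:1000], shuffled[1000:]
random_gap = abs(mean(random_a) - mean(random_b))

# Non-random assignment: the oldest half all land in one group.
by_age = sorted(ages)
biased_a, biased_b = by_age[1000:], by_age[:1000]
biased_gap = abs(mean(biased_a) - mean(biased_b))

print(f"mean age gap with randomization: {random_gap:.2f} years")
print(f"mean age gap without randomization: {biased_gap:.2f} years")
```

With random assignment the two groups have nearly the same average age, so age cannot explain a difference in outcomes. With the non-random split, age and treatment are tangled together.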

77 Did You Know? Observational studies can also have control groups.
These are called case-control studies. The cases are people who have a certain disease or condition, and the controls are people who do not have the disease. Their purpose is to see if one of the explanatory variables is related to the disease. Study 1 from the beginning of these notes is an example of a case-control study.

78 Important Points
Observational studies
Types: Retrospective, Cross-Sectional, Prospective
Sampling Designs: Simple random sample (SRS), Stratified random sample, Cluster sample, Systematic sample
Bias Types: Undercoverage, Response bias, Nonresponse bias
Sources of Bias: Convenience sampling, Voluntary response sampling

79 Important Points
Experiments
Types: Completely randomized design, matched pairs design, crossover design, randomized block design
Elements of Good Experiments: Control group, randomization, blinding and replication
Advantages: Can show causation
Disadvantages: Can be unethical; can take decades to complete

80 Important Points If a group is underrepresented in the sample, we cannot make inference about it. We must be careful when interpreting the results of observational studies. For comparison of several treatments to be valid, you must apply all treatments to similar groups of experimental units. Interesting questions are usually pretty tough to answer. This is due in part to the fact that no single experiment or observational study can determine causation.

