Section 1-1 Review and Preview

Section 1-1 Review and Preview

Preview Polls, studies, surveys and other data collecting tools collect data from a small part of a larger group so that we can learn something about the larger group. This is a common and important goal of statistics: Learn about a large group by examining data from some of its members.

Data Data is a collection of observations. (such as measurements, genders, survey responses).

Statistics Statistics is the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.

Population Population is the complete collection of all individuals to be studied (scores, people, measurements, and so on); the collection is complete in the sense that it includes all of the individuals to be studied.

Census versus Sample Census Collection of data from every member of a population. Sample Subcollection of members selected from a population.

Chapter Key Concepts Sample data must be collected in an appropriate way, such as through a process of random selection. If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical calculations can salvage them.

Section 1-2 Statistical Thinking

Key Concept This section introduces basic principles of statistical thinking used throughout this book. Whether conducting statistical analysis of data that we have collected, or analyzing a statistical analysis done by someone else, we should not rely on blind acceptance of mathematical calculation. We should consider these factors:

Key Concept (continued)
Context of the data Source of the data Sampling method Conclusions Practical implications

Context (Questions to be asked about the context)
What do the values represent? Where did the data come from? Why were they collected? An understanding of the context will directly affect the statistical procedure used.

Source of data (Questions to be asked about the source)
Is the source objective (unbiased)? Is the source biased? Is there some incentive to distort or spin results to support some self-serving position? Is there something to gain or lose by distorting results? Be vigilant and skeptical of studies from sources that may be biased.

Sampling Method (Questions to be asked about the sample)
Does the method chosen greatly influence the validity of the conclusion? Voluntary response (or self-selected) samples often have bias (those with special interest are more likely to participate). These samples’ results are not necessarily valid. Other methods are more likely to produce good results.

Conclusions (How to draw conclusions)
Make statements that are clear to those without an understanding of statistics and its terminology. Avoid making statements not justified by the statistical analysis.

Examples 1 – 4, use common sense to determine whether the given event is: (a) impossible; (b) possible, but very unlikely; (c) possible and likely Example 1: The Chicago Cubs win the 2013 World Series. Impossible; It is 2016 and they did not win. Example 2: The Chicago Cubs win the World Series within the next five years. Possible, but very unlikely (statistically speaking) Example 3: Halloween will fall on a Thursday this year. Impossible; It is on a Monday this year. Example 4: All of the students in this class will post something on social media this week. Possible and likely

Example 5: Refer to the data in the table below. The x-values are weights (in pounds) of cars; the y-values are the corresponding highway fuel consumption amounts (in miles/gallon). Weight (lb) 4035 3315 4115 3650 3565 Highway Fuel Consumption (mi/gal) 26 31 29 30 a) Are the x-values matched with the corresponding y-values? That is, is each x-value somehow associated with the corresponding y-value in some meaningful way? If the x and y values are matched, does it make sense to use the difference between each x-value and the y-value that is in the same column? Why or why not?

The x values are matched with the y values
The x values are matched with the y values. It does not make sense to use the difference between each x value and the y value that is in the same column. The x values are weights (in pounds) and the y values are fuel consumption amounts (in mi/gal), so the differences are meaningless.

Example 5 Cont.: b) Given the context of the car measurement data, what issue can be addressed by conducting a statistical analysis of the values? Is there a relationship or association between the weight of a car and its fuel consumption amount?

Example 5 Cont.: c) Comment on the source of the data if you are told that car manufacturers supplied the values. Is there an incentive for car manufacturers to report values that are not accurate? Consumers may be inclined to purchase cars with better fuel efficiency. Car manufacturers can then profit by selling cars that appear to have high levels of fuel efficiency, so there would be an incentive to make the fuel consumption amounts appear to be as favorable as possible. In this case, the source of the data would be suspect with a potential for bias.

Example 5 Cont.: d) If we use statistical methods to conclude that there is a correlation (or relationship or association) between the weights of cars and the amounts of fuel consumption, can we conclude that adding weight to a car causes it to consume more fuel? No. A conclusion of a correlation (or association) does not imply that one variable is the cause of the other.

Practical Implications/Practical Significance
State practical implications of the results. There may exist some statistical significance yet there may be NO practical significance. Common sense might suggest that the finding does not make enough of a difference to justify its use or to be practical.

Statistical Significance
Consider the likelihood of getting the results by chance. If results could easily occur by chance, then they are not statistically significant. If the likelihood of getting the results is so small, then the results are statistically significant.

Practical vs. Statistical Significance
In short, statistical significance is mathematical - it comes from the data (sample size) and from your confidence (how confident you want to be in your results). Practical significance is more subjective and is based on other factors like cost, requirements, program goals, etc. Will the difference affect a decision to be made?

Example 6: Form a conclusion about statistical significance. Either use the results provided or make a subjective judgment about the results: In a study of the Ornish weight loss program, 40 subjects lost a mean of 3.3 lb after 12 months. Methods of statistics can be used to show that if this diet had no effect, the likelihood of getting these results is roughly 3 chances in Does the Ornish weight loss program have statistical significance? Does it have practical significance? Why or why not?

The Ornish weight loss program has statistical significance, because the results are so unlikely (3 chance in 1000) to occur by chance. There is no practical significance. Losing 3.3lbs in 12 months is not practical or a very effective diet.

Example 7: Form a conclusion about statistical significance. Do not make any formal calculation. Either use the results provided or make a subjective judgment about the results: One of Gregor Mendel’s famous hybridization experiments with peas yielded 580 offspring with 152 of those peas (or 26%) having yellow pods. According to Mendel’s theory, 25% of the offspring peas should have yellow pods. Do the results of the experiment differ from Mendel’s claimed rate of 25% by an amount that is statistically significant?

Determining whether or not the difference between Mendel’s actual results (26%) and the results predicted by his theory (25%) is statistically significant requires applying techniques presented in future chapters. However, common sense suggests the 1% difference is of no practical difference. Considering the sample size, the actual difference between the observed and expected results is 152 – 145 = 7. Common sense suggests that a discrepancy of 7 (relative to an expected result of 145 plants from a total sample of 580 plants) is within the natural fluctuation inherent in normal biological processes, and that the difference is not statistically significant.

Skills Check Questions
1.) What is a voluntary response sample?

2.) Why is a voluntary response sample generally not suitable for a statistical study?

3.) What is the difference between statistical significance and practical significance?

4.) You have collected a large sample of values. Why is it important to understand the context of the data?

5.) In a study of the Weight Watchers weight loss program, 40 subjects lost a mean of 3.0 lbs after 12 months. Methods of statistics can be used to verify that the diet is effective. Does the Weight Watchers weight loss program have statistical significance? Does it have practical significance? Why or why not?

6.) In the Weight Watchers study of the weight loss program, subjects were found using the method described as the following: “We recruited study candidates from the Greater Boston area using newspaper advertisements and TV publicity.” Is the sample a voluntary response sample? Why or why not?

Section 1-1 Review and Preview

Similar presentations

Presentation on theme: "Section 1-1 Review and Preview"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Section 1-1 Review and Preview

Similar presentations

Presentation on theme: "Section 1-1 Review and Preview"— Presentation transcript:

Similar presentations

About project

Feedback