1 Vocabulary of Statistics Part 2

2 Data Collection
- First problem a statistician faces: how to obtain the data.
- It is important to obtain good, or representative, data.
- Inferences are made based on statistics obtained from the data.
- Inferences can only be as good as the data.

3 Data Collection - Surveys
- Telephone: pros: less costly, more candid; cons: no phone, no call list
- Mailed questionnaire: pros: cover more area, less cost; cons: low response, inappropriate responses
- Personal interview: pros: in-depth responses; cons: training, cost, bias

4 Biased Sampling Method: A sampling method that produces data that systematically differ from the sampled population. An unbiased sampling method is one that produces no such systematic difference. Sampling methods that often result in biased samples: convenience sample, volunteer sample.
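
The difference between a convenience sample and a random sample can be seen in a small simulation. The sketch below is illustrative only: the population (commute times drawn uniformly between 5 and 90 minutes), the sample size, and the function name compare_sampling_methods are assumptions made for the example, not part of the presentation.

```python
import random
from statistics import mean


def compare_sampling_methods(seed=0, pop_size=10_000, n=200):
    """Return (population mean, random-sample mean, convenience-sample mean)."""
    rng = random.Random(seed)

    # Hypothetical population: one commute time (in minutes) per person.
    population = [rng.uniform(5, 90) for _ in range(pop_size)]

    # Unbiased method: every person is equally likely to be chosen.
    random_sample = rng.sample(population, n)

    # Convenience sample: only the quarter of people with the shortest
    # commutes (those closest to the survey site) are ever asked.
    nearby = sorted(population)[: pop_size // 4]
    convenience_sample = rng.sample(nearby, n)

    return mean(population), mean(random_sample), mean(convenience_sample)


true_mean, random_mean, convenience_mean = compare_sampling_methods()
print(f"population mean:         {true_mean:.1f}")
print(f"random sample mean:      {random_mean:.1f}")
print(f"convenience sample mean: {convenience_mean:.1f}")
```

Under these assumptions the convenience sample misses in the same direction on every run (too low), which is what "systematically differs" means. The random sample misses only by chance.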

5 Process of data collection:
1. Define the objectives of the survey or experiment.
2. Define the variable and population of interest.
3. Define the data-collection and data-measuring schemes. This includes sampling procedures, sample size, and the data-measuring device (questionnaire, scale, ruler, etc.).
4. Determine the appropriate descriptive or inferential data-analysis techniques.

6 What Causes a Biased Sample? A biased sample is any sample that is not representative of the target population (the sample differs from the population). This is commonly caused by a difference in the following factors, but can be caused by a difference in other factors: race, gender, household income, religion, geographic location, age.

7 Let’s assume that you were interested in profiling the religious beliefs of the people who live in Oyster Bay. Would you poll the people walking down Anstice St on Sunday morning? Why not?

8 Let’s assume that you were interested in the percentage of Americans who listen to country music. Could you find this information by polling people on the street in New York City? Why not?

9 Types of Bias
Selection Bias: When the method of data collection makes some individuals more or less likely to be selected for the study than others in the population.
Nonresponse Bias: When those who respond to a survey differ significantly from those who do not respond.
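
Nonresponse bias can be sketched with a simulated survey. Everything specific here is an assumption chosen for illustration: a population in which exactly half are dissatisfied, response probabilities of 0.6 for dissatisfied people and 0.2 for satisfied people, and the function name simulate_nonresponse.

```python
import random


def simulate_nonresponse(seed=0, pop_size=10_000,
                         p_respond_dissatisfied=0.6, p_respond_satisfied=0.2):
    """Return (true share dissatisfied, share dissatisfied among respondents)."""
    rng = random.Random(seed)

    # Hypothetical population: exactly half are dissatisfied (True).
    half = pop_size // 2
    population = [True] * half + [False] * (pop_size - half)

    # Each person decides whether to answer; dissatisfied people are
    # assumed to be more likely to send the survey back.
    respondents = []
    for dissatisfied in population:
        p = p_respond_dissatisfied if dissatisfied else p_respond_satisfied
        if rng.random() < p:
            respondents.append(dissatisfied)

    true_share = sum(population) / len(population)
    observed_share = sum(respondents) / len(respondents)
    return true_share, observed_share


true_share, observed_share = simulate_nonresponse()
print(f"true share dissatisfied:      {true_share:.2f}")
print(f"share among survey responses: {observed_share:.2f}")
```

Everyone was invited with equal probability, so there is no selection bias in this sketch. The distortion comes only from who chose to answer.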

10 If you’re trying to find the percentage of high school students who use illegal drugs, can you poll them in front of their parents? What kind of bias is this? How can you change your method to minimize this bias? How will your results be affected?

11 Sampling Frame: A list of the elements (people) belonging to the population from which the sample will be drawn. Note: It is important that the sampling frame be representative of the population. Ex: Phone directory, electoral register

12 Sample Design: The process of selecting sample elements from the sampling frame in order to create the sample. Note: There are many different types of sample designs. Most fit into two categories: judgment samples and probability samples. Ex. (probability samples): Simple Random, Stratified Random, Cluster, Systematic
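
The four probability designs named above can be sketched in a few lines each. This is a minimal illustration, not a survey-grade implementation: the sampling frame (400 students with a grade and a homeroom), the function names, and the choice of equal sampling fractions per stratum are all assumptions made for the example.

```python
import random
from collections import defaultdict


def simple_random_sample(frame, n, rng):
    """Every subset of size n is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start, then take every k-th element of the frame."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]


def _group_by(frame, key):
    groups = defaultdict(list)
    for element in frame:
        groups[key(element)].append(element)
    return groups


def stratified_sample(frame, key, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum."""
    sample = []
    for members in _group_by(frame, key).values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


def cluster_sample(frame, key, n_clusters, rng):
    """Randomly choose whole clusters and keep every element in them."""
    clusters = _group_by(frame, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for c in chosen for element in clusters[c]]


# Hypothetical sampling frame: 400 students as (id, grade, homeroom).
frame = [(i, 9 + i % 4, i // 20) for i in range(400)]
rng = random.Random(0)

srs = simple_random_sample(frame, 40, rng)
systematic = systematic_sample(frame, 40, rng)
stratified = stratified_sample(frame, key=lambda s: s[1], fraction=0.1, rng=rng)
clustered = cluster_sample(frame, key=lambda s: s[2], n_clusters=2, rng=rng)
```

All four calls return 40 students here, but they differ in which groups are guaranteed to appear: the stratified sample always contains every grade, while the cluster sample contains only two homerooms.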

13 Judgment Samples: Samples that are selected on the basis of being “typical.” Items are selected that are representative of the population. The validity of the results from a judgment sample reflects the soundness of the collector’s judgment. Probability Samples: Samples in which the elements (people) to be selected are drawn on the basis of probability. Each element in a population has a certain probability of being selected as part of the sample.

14 Misuse of Statistics

15
- "The pure and simple truth is rarely pure and never simple." Oscar Wilde
- "First get your facts; then you can distort them at your leisure." Mark Twain
- "There are three kinds of lies: lies, damn lies, and statistics." Benjamin Disraeli

16 Three out of four doctors recommend new Zimento

17 Suspect Samples
- How many doctors were actually used? 4? 40? 100? 10,000?
- How were they chosen?
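
One way to see why the number of doctors matters is to attach a rough margin of error to "three out of four" for each sample size on the slide. The sketch below uses the standard normal-approximation formula for a sample proportion, which is an addition for illustration and not part of the presentation. It assumes the doctors were a simple random sample, and the approximation is known to be poor for very small samples such as n = 4.

```python
import math


def approx_margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion
    (normal approximation; assumes a simple random sample)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


for n in (4, 40, 100, 10_000):
    moe = approx_margin_of_error(0.75, n)
    print(f"n = {n:>6}: 75% +/- {100 * moe:.1f} percentage points")
```

Under these assumptions the uncertainty shrinks from roughly 42 percentage points with 4 doctors to under 1 point with 10,000. A large sample does not answer the second question on the slide, though: if the doctors were chosen in a biased way, a small margin of error only makes a wrong number look precise.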

18 Misleading Graphs

21 51% agree 49% disagree

22 Misleading wording in surveys
- Do you support bringing freedom and democracy to the people of Iraq?
- Do you support the unprovoked military action of the U.S. taking place in Iraq?

23 Changing values to represent the same data
- The incumbent states, “During my tenure expenditures have only risen 1%.”
- The challenger states, “During my opponent’s tenure expenditures have risen $10,000,000.”
- Both statements are true, but one uses percentage and the other dollar amounts.
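
The arithmetic behind the two statements can be checked directly: if a 1% rise and a $10,000,000 rise describe the same change, the starting expenditures must have been $10,000,000 × 100 / 1 = $1,000,000,000. The helper names below (implied_base, describe_change) are hypothetical and exist only for this sketch.

```python
def implied_base(absolute_increase, percent_increase):
    """Starting amount for which both descriptions of the increase are true."""
    return absolute_increase * 100 / percent_increase


def describe_change(old, new):
    """Return the same change as (absolute amount, percentage of the old value)."""
    absolute = new - old
    percent = absolute * 100 / old
    return absolute, percent


base = implied_base(10_000_000, 1)
absolute, percent = describe_change(base, base + 10_000_000)
print(f"implied starting expenditures: ${base:,.0f}")
print(f"increase: ${absolute:,.0f} = {percent:.0f}%")
```

Neither figure is wrong. Each speaker simply picks the form that sounds smaller or larger, so a careful reader asks for both the change and the base it is measured against.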

24 Detached Statistics
- Our brand of crackers has one-third fewer calories. Compared to what?
- Zimento works four times faster. Faster than what?

25 Implied connections
- Eating fish may help to reduce your cholesterol.
- Studies suggest that using Zimento will help you reduce your weight.
- Taking calcium will lower blood pressure in some people.

