Lies, Damn Lies, and Statistics.


1 Lies, Damn Lies, and Statistics.
Benjamin Disraeli, twice Prime Minister of Great Britain (1868 and 1874 – 1880), was quoted as saying: "There are three kinds of lies: lies, damned lies, and statistics."

2 Lies, Damn Lies, and Statistics.
"Lies, damned lies, and statistics" is part of a phrase attributed to Benjamin Disraeli and popularised in the United States by Mark Twain: "There are three kinds of lies: lies, damned lies, and statistics." The statement refers to the persuasive power of numbers, the use of statistics to bolster weak arguments, and the tendency of people to disparage statistics that do not support their positions.

3 Lies, Damn Lies, and Statistics.
Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. It also provides tools for prediction and forecasting based on data. It is applicable to a wide variety of academic disciplines, from the natural and social sciences to the humanities, government and business. Statistical methods can be used to summarise or describe a collection of data; this is called descriptive statistics. In addition, patterns in the data may be modelled in a way that accounts for randomness and uncertainty in the observations, and are then used to draw inferences about the process or population being studied; this is called inferential statistics.

4 Yes Prime Minister
Please view the video clip here.

5 Lies, Damn Lies, and Statistics.
Harvard President Lawrence Lowell wrote in 1909 that statistics: "...like veal pies, are good if you know the person that made them, and are sure of the ingredients".

6 Lies, Damn Lies, and Statistics.
Loaded Questions
The answers to surveys can often be manipulated by wording the question in such a way as to induce a prevalence towards a certain answer from the respondent. For example, in polling support for a war, the questions:
Do you support the attempt by (the war-making country) to bring freedom and democracy to other places in the world?
Do you support the unprovoked military action by (the war-making country)?
will likely result in data skewed in different directions, although they are both polling about the support for the war.

7 Loaded Questions (continued)
Another way to do this is to precede the question by information that supports the "desired" answer. For example, more people will likely answer "yes" to the question "Given the increasing burden of taxes on middle-class families, do you support cuts in income tax?" than to the question "Considering the rising federal budget deficit and the desperate need for more revenue, do you support cuts in income tax?"

8 Which survey question below contains the least bias?
Do you support giving intelligence and law enforcement agencies the ability to covertly track and monitor communications of terrorist suspects within our borders, as needed, even if a court order has not yet been obtained?
Do you support giving intelligence and law enforcement agencies additional rights to monitor communications within our country?
Do you support altering the Constitution to allow intelligence and law enforcement agencies to monitor communications of citizens without the need for pesky court orders and warrants?

9 Lies, Damn Lies, and Statistics.
Biased Samples
A biased sample is a statistical sample of a population in which some members of the population are less likely to be included than others. An extreme form of biased sampling occurs when certain members of the population are totally excluded from the sample (that is, they have zero probability of being selected). For example, a survey of high school students to measure teenage use of illegal drugs will be a biased sample because it does not include home-schooled students or dropouts.
A sample is also biased if certain members are under-represented or over-represented relative to others in the population. For example, a "man on the street" interview which selects people who walk by a certain location is going to have an over-representation of healthy individuals who are more likely to be out of the home than individuals with a chronic illness.

10 Examples of biased samples
Online and phone-in polls are biased samples because the respondents are self-selected. Those individuals who are highly motivated to respond, typically individuals who have strong opinions, are over-represented, and individuals who are indifferent or apathetic are less likely to respond. This often leads to a polarization of responses with extreme perspectives being given a disproportionate weight in the summary. As a result, these types of polls are regarded as unscientific.

11 Examples of biased samples
A classic example of a biased sample and the misleading results it produced occurred in 1936. In the early days of opinion polling, the American Literary Digest magazine collected over two million postal surveys and predicted that the Republican candidate in the U.S. presidential election, Alf Landon, would beat the incumbent president, Franklin Roosevelt, by a large margin. The result was the exact opposite. The Literary Digest survey represented a sample collected from readers of the magazine, supplemented by records of registered automobile owners and telephone users. This sample included an over-representation of individuals who were rich, who, as a group, were more likely to vote for the Republican candidate. In contrast, a poll of only 50 thousand citizens selected by George Gallup's organization successfully predicted the result, leading to the popularity of the Gallup poll.

12 The Infamous Literary Digest Poll, and the Election of 1936
In 1936, Franklin Delano Roosevelt had been President for one term. The magazine, The Literary Digest, predicted that Alf Landon would beat FDR in that year's election by 57 to 43 percent.

13 The Infamous Literary Digest Poll, and the Election of 1936
The Digest mailed over 10 million questionnaires to names drawn from lists of automobile and telephone owners, and over 2.3 million people responded - a huge sample.

14 The Infamous Literary Digest Poll, and the Election of 1936
At the same time, a young man named George Gallup sampled only 50,000 people and predicted that Roosevelt would win. Gallup's prediction was ridiculed as naive. After all, the Digest had predicted the winner in every election since 1916, and had based its predictions on the largest response to any poll in history. But Roosevelt won with 62% of the vote. The size of the Digest's error is staggering. How could they have been so far off?
George Gallup (1901 – 1984)

15 The Infamous Literary Digest Poll, and the Election of 1936
The Literary Digest had made two fatal mistakes. Their list of names was biased in favour of those with enough money to buy cars and phones, a much smaller portion of the population in the thirties than it is today. And, more seriously, the Digest had depended on voluntary response. FDR was the incumbent, and those who were unhappy with his administration were more likely to respond to the Digest survey. When a sample is biased, a large number of subjects cannot correct for the error. [The magazine folded, not too much later; some think the wrong prediction was largely responsible. Gallup's scientific polling organisation still exists, highly respected for the quality of their work for over 70 years.]
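The point that sample size cannot rescue a biased sample can be illustrated with a small simulation. This is only a sketch on synthetic data: the population size, the 62% true support, and the response rates of 15% for supporters and 35% for opponents are assumed values chosen for illustration, not figures from the 1936 polls.

```python
import random


def simulate_polls(true_support=0.62, population_size=200_000,
                   small_sample_size=5_000, seed=0):
    """Compare a large voluntary-response sample with a small random sample.

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)

    # Synthetic population: True means the voter supports the incumbent.
    population = [rng.random() < true_support for _ in range(population_size)]

    # Voluntary response: unhappy voters (False) are assumed to be more
    # likely to mail back the questionnaire than supporters (True).
    response_rate = {True: 0.15, False: 0.35}
    biased_sample = [v for v in population if rng.random() < response_rate[v]]

    # Simple random sample: every voter has the same chance of selection.
    random_sample = rng.sample(population, small_sample_size)

    return {
        "biased_n": len(biased_sample),
        "biased_estimate": sum(biased_sample) / len(biased_sample),
        "random_n": len(random_sample),
        "random_estimate": sum(random_sample) / len(random_sample),
    }


result = simulate_polls()
print(result)
```

Under these assumptions the voluntary-response sample is several times larger than the random one, yet its estimate of support falls well below one half, while the small random sample lands close to the true 62%.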

16 Lies, Damn Lies, and Statistics.
Another classic example occurred in the 1948 Presidential Election. On Election night, the Chicago Tribune printed the headline DEWEY DEFEATS TRUMAN, which turned out to be mistaken.

17 Lies, Damn Lies, and Statistics.
In the morning the grinning President-Elect, Harry S. Truman, was photographed holding a newspaper bearing this headline.

18 Examples of biased samples
The reason the Tribune was mistaken is that their editor trusted the results of a phone survey. Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses. (In many cities, the Bell System telephone directory contained the same names as the Social Register.)

19 Statistical corrections for a biased sample
If entire segments of the population are excluded from a sample, then there are no adjustments that can produce estimates that are representative of the entire population. But if some groups are under-represented and the degree of under-representation can be quantified, then sample weights can correct the bias. For example, a hypothetical population might include 10 million men and 10 million women. Suppose that a biased sample of 100 patients included 20 men and 80 women. A researcher could correct for this imbalance by attaching a weight of 2.5 (50/20) for each male and 0.625 (50/80) for each female. This would adjust any estimates to achieve the same expected value as a sample that included exactly 50 men and 50 women, unless men and women differed in their likelihood of taking part in the survey.
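The weighting arithmetic can be sketched in a few lines. Each group's weight is its population share times the sample size, divided by the number of people from that group actually sampled. The function names and the "yes" counts below (10 of the 20 men, 20 of the 80 women) are hypothetical, introduced only to show the effect of the weights.

```python
def sample_weights(sample_counts, population_shares):
    """Weight for each group = expected count in a balanced sample / observed count."""
    n = sum(sample_counts.values())
    return {group: population_shares[group] * n / count
            for group, count in sample_counts.items()}


def weighted_proportion(yes_counts, sample_counts, weights):
    """Weighted share of 'yes' answers across all groups."""
    weighted_yes = sum(yes_counts[g] * weights[g] for g in yes_counts)
    weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return weighted_yes / weighted_total


sample_counts = {"men": 20, "women": 80}        # the biased sample from the text
population_shares = {"men": 0.5, "women": 0.5}  # 10 million men, 10 million women

weights = sample_weights(sample_counts, population_shares)
print(weights)  # {'men': 2.5, 'women': 0.625}

# Hypothetical answers: 10 of 20 men and 20 of 80 women say "yes".
yes_counts = {"men": 10, "women": 20}
unweighted = sum(yes_counts.values()) / sum(sample_counts.values())
weighted = weighted_proportion(yes_counts, sample_counts, weights)
print(unweighted, weighted)  # 0.3 0.375
```

With these made-up answers the unweighted estimate is 30%, while the weighted estimate is 37.5%, the value a 50/50 sample would be expected to give (the average of 50% and 25%).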

20 Lies, Damn Lies, and Statistics.
TRUE or FALSE
A population includes 10 million men and 10 million women. A biased sample of 100 patients included 25 men and 75 women. A researcher could correct for this imbalance by attaching a weight of 2.0 for each male and 2/3 for each female. This would adjust any estimates to achieve the same expected value as a sample that included exactly 50 men and 50 women.

21 Lies, Damn Lies & Statistics…

22 Lies, Damn Lies & Statistics…

23 Which graph better represents the data?

24 Consider this data
                            Tested Positive   Tested Negative   Total
Has the disease                    4                 6             10
Does not have the disease          4               986            990
Total                              8               992          1 000

25 Consider this data
                            Tested Positive   Tested Negative   Total
Has the disease                    4                 6             10
Does not have the disease          4               986            990
Total                              8               992          1 000
In talking up their new test for a disease, a pharmaceutical company claims 99% success. This is not a "lie", but does it give a true indication of how good the test is?
YES NO

26 Consider this data
                            Tested Positive   Tested Negative   Total
Has the disease                    4                 6             10
Does not have the disease          4               986            990
Total                              8               992          1 000
In talking up their new test for a disease, a pharmaceutical company claims 99% success. The test is not really that great. Only 40% of those with the disease test positive (4 of 10), and only half of those who test positive actually have the disease (4 of 8). Lies, damn lies and statistics!
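The three percentages on this slide can be recomputed from the four cells of the table. In the sketch below, the function name and the metric labels are illustrative, and the count of 4 healthy people who tested positive is derived from the row and column totals (990 - 986, or 8 - 4).

```python
def diagnostic_metrics(true_pos, false_neg, false_pos, true_neg):
    """Summary rates for a diagnostic test from its 2x2 table of counts."""
    total = true_pos + false_neg + false_pos + true_neg
    return {
        # Share of all results that are correct: the company's "success" figure.
        "accuracy": (true_pos + true_neg) / total,
        # Share of people with the disease who test positive.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Share of positive results that are real cases.
        "positive_predictive_value": true_pos / (true_pos + false_pos),
    }


metrics = diagnostic_metrics(true_pos=4, false_neg=6, false_pos=4, true_neg=986)
print(metrics)
# {'accuracy': 0.99, 'sensitivity': 0.4, 'positive_predictive_value': 0.5}
```

The 99% figure is dominated by the 986 healthy people who correctly test negative, which is why it says little about how well the test finds the disease when only 10 people in 1 000 have it.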

27 Lies, Damn Lies, and Statistics.
References / Sources:

