By Boon Xuan, Mei Ying and Fatin


1 Problem with Sampling
By Boon Xuan, Mei Ying and Fatin. Based on: "Are First-Borns more likely to attend Harvard?", a case study by Anthony Millner and Raphael Calel (2012).

2 Overview
Background
Michael Sandel
Problems with his Claim
Base rate fallacy
Bayes’ Theorem
Lack of Data
Conclusion

3 Background Michael Sandel claims that 75 to 85 percent of Harvard students are first-borns, and he showed this by asking the students in his class to raise their hands if they were first-born. From this, he suggested that birth order has a significant effect on how much effort a child puts into his or her studies.

4 Michael Sandel Who is he?
He is an American political philosopher and a political philosophy professor at Harvard University. His course “Justice” is the first Harvard course to be made freely available online and on television. It has been viewed by tens of millions of people around the world, including in China, where Sandel was named the “most influential foreign figure of the year.” (China Newsweek).

5 Problem with his claim
Base rate fallacy: Also called base rate neglect or base rate bias. It is a formal fallacy whereby, if presented with related base rate information (general information) and specific information, the mind tends to ignore the general information and focus on the specific information.
Lack of Data
Sampling Bias

6 Base rate fallacy What is Sandel really doing?
He is finding the probability that a Harvard student is a first-born, when what he really wants to find is the probability that a child attends Harvard given that he or she is nth-born: P(1st-born | Harvard) instead of P(Harvard | nth-born).
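One way to see why P(1st-born | Harvard) says little on its own: if birth order had no effect on admission, the share of first-borns at Harvard would simply equal the share of first-borns among all children, and that base rate depends on family sizes. The sketch below computes the base rate for a made-up distribution of family sizes. The numbers in `FAMILY_SIZE_SHARES` and the function name are assumptions for illustration, not data from the case study.

```python
# Hypothetical share of families by number of children (NOT real data).
FAMILY_SIZE_SHARES = {1: 0.30, 2: 0.50, 3: 0.20}


def first_born_base_rate(family_size_shares):
    """Fraction of all children who are first-born.

    Every family contributes exactly one first-born, so the fraction is
    (number of families) / (number of children) = 1 / mean family size.
    """
    families = sum(family_size_shares.values())
    children = sum(size * share for size, share in family_size_shares.items())
    return families / children


base_rate = first_born_base_rate(FAMILY_SIZE_SHARES)
print(f"P(first-born) in this made-up population: {base_rate:.3f}")
```

A show of hands only becomes evidence of a birth-order effect once it is compared against a base rate like this one for the relevant population.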

7 Base rate fallacy Another Example of Base rate fallacy (drunk drivers): A group of police officers have breathalyzers displaying false drunkenness in 5% of the cases in which the driver is sober. However, the breathalyzers never fail to detect a truly drunk person. One in a thousand drivers is driving drunk. Suppose the police officers then stop a driver at random, and force the driver to take a breathalyzer test. It indicates that the driver is drunk. We assume you don't know anything else about him or her. How high is the probability he or she really is drunk? Many would answer as high as 0.95, but the correct probability is about 0.02.

8 Base rate fallacy Another Example of Base rate fallacy (drunk drivers): To find the probability, we can use Bayes’ theorem. An easier explanation is to consider 1000 drivers. 1 driver is drunk, and the breathalyzer is certain to give a true positive result for that driver. The other 999 drivers are not drunk, and 5 percent of them (49.95 drivers) get false positive results. Hence, the probability that a driver with a positive result is really drunk is 1 / (1 + 49.95) ≈ 0.02.
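The counting argument for the 1000 drivers can be sketched in a few lines of Python. This is a minimal illustration using the numbers from the example; the function name and parameter names are made up for this sketch.

```python
def natural_frequency_posterior(drivers, drunk_rate, false_positive_rate):
    """Share of positive breathalyzer results that come from drunk drivers.

    Assumes, as in the example, that the test never misses a drunk driver.
    """
    drunk = drivers * drunk_rate                    # 1 drunk driver
    sober = drivers - drunk                         # 999 sober drivers
    true_positives = drunk                          # every drunk driver tests positive
    false_positives = sober * false_positive_rate   # 49.95 sober drivers test positive
    return true_positives / (true_positives + false_positives)


posterior = natural_frequency_posterior(1000, 0.001, 0.05)
print(f"P(drunk | positive) = {posterior:.4f}")
```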

9 Bayes’ Theorem To really find either the probability of attending Harvard given that you are nth-born, or the probability that the driver is drunk given that the breathalyzer indicates he or she is drunk, we need Bayes’ Theorem. What is it? It describes the probability of an event based on prior knowledge of conditions that might be related to the event.

10 Bayes’ Theorem Usage of Bayes’ theorem to find the probability that a driver is drunk when the breathalyzer shows a positive result:
What we need to find: P(drunk | positive)
Given:
P(drunk) = 0.001
P(sober) = 0.999
P(positive | drunk) = 1.00
P(positive | sober) = 0.05
P(positive) = (1.00 x 0.001) + (0.05 x 0.999) = 0.05095

11 Bayes’ Theorem Usage of Bayes’ theorem to find the probability that a driver is drunk when the breathalyzer shows a positive result:
What we need to find: P(drunk | positive)
Formula: P(drunk | positive) = ( P(positive | drunk) x P(drunk) ) / P(positive)
= (1.00 x 0.001) / 0.05095
≈ 0.0196
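The same calculation can be written as a small Python function, which also makes it easy to see how strongly the answer depends on the base rate P(drunk). This is a sketch under the assumptions of the example; the function name is made up here, and the priors other than 0.001 are hypothetical values chosen only for comparison.

```python
def bayes_posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(condition | positive) from Bayes' theorem.

    The denominator is P(positive), expanded with the law of total probability.
    """
    p_positive = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    return p_pos_given_true * prior / p_positive


# 0.001 is the base rate from the example; 0.01 and 0.1 are hypothetical.
for prior in (0.001, 0.01, 0.1):
    result = bayes_posterior(prior, 1.00, 0.05)
    print(f"P(drunk) = {prior}: P(drunk | positive) = {result:.4f}")
```

With the same breathalyzer, a higher base rate of drunk drivers gives a much higher probability that a positive result is genuine, which is exactly the information the base rate fallacy ignores.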

12 Lack of Data There is a limitless number of intermediate variables, such as fertility rate, that could play a part in explaining whether birth order does in fact affect whether a child is smart enough to enter Harvard University. From the information that Sandel gave us, it is not possible to determine that birth order is the only variable that affects the probability of getting into Harvard.

13 Sampling Bias
What is it? It is a bias in which a sample is collected in such a way that some members of the intended population are less likely to be included than others. It results in a biased sample, a non-random sample of a population (or non-human factors) in which all individuals, or instances, were not equally likely to have been selected. If this is not accounted for, results can be erroneously attributed to the phenomenon under study rather than to the method of sampling.
How did Sandel unknowingly commit this? By restricting the population of study to his class only, he is excluding the other students of Harvard University, and therefore his sample may not be representative of Harvard’s population.
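The effect of polling a single class can be illustrated with a small simulation. Everything here is hypothetical: the student counts, the first-born shares, and the sample size are invented to show the mechanism, not taken from Harvard or from the case study.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical student body (NOT real data): 1 = first-born, 0 = later-born.
one_class = [1] * 800 + [0] * 200          # the single class that was polled
other_students = [1] * 4500 + [0] * 4500   # everyone else at the university
population = one_class + other_students


def share(values):
    """Fraction of first-borns in a list of 0/1 flags."""
    return sum(values) / len(values)


true_share = share(population)
convenience_estimate = share(one_class)                 # poll one class only
random_estimate = share(rng.sample(population, 1000))   # simple random sample

print(f"True share of first-borns:    {true_share:.2f}")
print(f"Estimate from one class:      {convenience_estimate:.2f}")
print(f"Estimate from random sample:  {random_estimate:.2f}")
```

In this made-up population the class estimate is far from the true share because the class differs from the rest of the student body, while a random sample of the same size drawn from the whole population lands close to it.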

14 Sampling Bias Real-life examples of how it may affect us:
In 1936, in the early days of opinion polling, the American magazine The Literary Digest collected over 2 million postal surveys and predicted that the Republican candidate in the US presidential election, Alf Landon, would beat Franklin Roosevelt by a large margin. However, the result was the exact opposite. The sample collected from readers of the magazine over-represented the rich, who as a group were more likely to vote for the Republican candidate.
On the night of the 1948 presidential election, the Chicago Tribune printed the wrong headline because its editor trusted the results of a phone survey. Telephones were not yet widely used at the time, so the survey was not representative of the general population.

15 Sampling Bias How can we reduce Sampling Bias?
Avoid judgement sampling and convenience sampling.
Make sure that the target population is defined properly and that the sampling frame matches it as closely as possible.

16 Conclusion
Be careful not to make the mistake of neglecting the base rate.
Gather enough reliable data to substantiate your claims.
Do make sure to reduce sampling bias.
THANK YOU!

