
1 Professor Chris Wild's material, University of Auckland
P Values. Robin Beaumont, 8/2/2012. With much help from Professor Chris Wild's material, University of Auckland. Hi, Robin Beaumont here. This is the first in a series of 4 YouTube videos talking about P values. What image does the term "P value" conjure up for you? [click 1] A fairly typical response – and hopefully by the end of this series that image will have faded, if only slightly! I must mention that I have made use of some excellent materials available from Professor Chris Wild's website – many thanks.

2 Where do they fit in?
Where do P values fit in? Well, this is the typical research cycle for quantitative research, the diagram being courtesy of Prof. Chris Wild. The penultimate stage, analysis, and the last stage, conclusions, are where P values are frequently used to help formulate sound conclusions. However, if P values are used in isolation there is a distinct danger of drawing erroneous conclusions from your research. P values should always be interpreted in relation to sample size, power and effect size measures. You should also consider the various assumptions and the specific probability distribution that has given rise to the P value or values in your research. Now let's consider why people find understanding P values so difficult and painful.

3 The pain of the P value: probability, sampling, statistic, rule
I think there are three main reasons. First, many introductory books and websites (with the notable exception of Wikipedia) are inaccurate, providing wrong, ambiguous or even obviously illogical and conflicting explanations. Secondly, there are two different ways of using the P value, resulting in opposing interpretations. Finally, understanding what a P value is demands that you also understand: probability [click]; random sampling theory [click]; what a "statistic" is [click]; and what a rule is in this situation [click]. All this requires time and effort to understand, and unfortunately only one of them is an easy topic. But, being one never to avoid pain, let's tackle the most difficult of the topics first – probability.

4 All possible outcomes at any one time must add up to 1
P Value. A P value is NOT the same as a probability value. A P value is a special type of probability: [click] it considers more than one outcome (one event can consist of more than one outcome), and it is a conditional probability. [click] So let's see a probability value. [click] Here is a typical probability value, 0.25; this is the same as saying we have a probability of 25%, and in the UK this is probably the probability that it is going to rain today! [click] A probability must lie between 0 and 1, [click] 0 meaning the event will not occur and 1 meaning it is certain to occur. Taking the example of winning the lottery: if the probability of winning at any one time is 0.000,000,1 per ticket bought, we can calculate the chance of not winning by simply subtracting this value from 1, which gives 0.999,999,9 – by the way, I have made up these values; in reality the probability of winning is even smaller. [click] The take-home message is that, at any one time, the probabilities of all the possible outcomes must add up to 1. Additionally, we can group sets of outcomes together to form events – which we will now consider, remembering that we are interested in the more-than-one-outcome situation because this is one of the characteristics of the P value: it represents a probability for more than one outcome.
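As a quick check on the complement rule, here is a minimal Python sketch using the (made-up) lottery figure from the slide; the variable names are mine, not from the presentation.

```python
# Complement rule: at any one time all possible outcomes must add up to 1,
# so P(not winning) = 1 - P(winning).
p_win = 0.0000001            # illustrative probability of winning per ticket
p_not_win = 1 - p_win        # 0.9999999

assert abs(p_win + p_not_win - 1) < 1e-12   # the two outcomes sum to 1
print(p_not_win)
```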

5 Probabilities are relative frequencies
Here is a histogram which presents the different scores on an exam obtained from 48 students. We can see that 11 students got 40 to 44, whereas one very bright student got a mark in the top range. [click] Now, we can also think of these bars as representing a set of relative frequencies, which results in the total area being equal to one. If this is the case – remembering that the probabilities of all outcomes at any one time must equal one – it sounds like the two are equivalent, and they are, except that the histogram is now called a probability distribution. So now we know we can work out probabilities from a histogram. Let's see how we can use this for an event that consists of more than one outcome – again, remembering we are doing this to help us understand the fact that a P value is a type of probability which represents more than one outcome.
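The conversion from counts to relative frequencies is just a division by the total. A minimal sketch (the individual bin counts below are hypothetical; only the bin of 11 students and the total of 48 come from the slide):

```python
import numpy as np

# Counts of exam scores per histogram bin (hypothetical, summing to 48).
counts = np.array([1, 2, 3, 11, 9, 8, 6, 4, 2, 1, 0, 1])
rel_freq = counts / counts.sum()      # relative frequencies

print(rel_freq.sum())                 # 1.0 -- the total "area" of the histogram
```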

6 Multiple outcomes at any one time
Probability density function of the 48 exam scores (scores from 33 to 87 on the x-axis, density on the y-axis; the total area = 1). P(score < 45) = area A; P(score > 50) = area B; P(score < 45 or score > 50) = just add up the individual outcomes. So here is our probability histogram, but this time I'm assuming that we have recorded the actual score rather than the range it fell in. Doing this means that the name of the graph changes to a probability density function – same idea, different name! We can represent the values using the tree method we used earlier, thinking of it this time on its side. [click] This time it has as many branches as there are different outcomes, and there can be several outcomes with the same value; for example, 3 of our 48 students may have scored 43, each with a probability of 1/48. Let's consider the probability of getting a range of scores, say a score of less than 45; we know this is equivalent to finding area A. Similarly, we know that the probability of getting a score of more than 50 is given by area B. Notice that both area A and area B consider multiple scores – that is, in this instance they consider scores MORE EXTREME than a given value. How do we calculate the probabilities (i.e. areas) for A and B? We simply add the separate outcomes together. Furthermore, if we added areas A and B together [click] we would end up with the probability of obtaining any score which was either less than 45 or more than 50. [click] The important things to remember here are that we can work out probabilities of multiple outcomes simply by adding the individual outcomes together, and that the multiple outcomes might represent outcomes said to be MORE EXTREME than a specific value. Let's look a little more at this "more extreme" idea.
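"Just add up the individual outcomes" can be made concrete with a short Python sketch. The 48 scores below are hypothetical (the slide does not list them individually); each score carries a probability of 1/48:

```python
import numpy as np

# Hypothetical list of 48 exam scores, each with probability 1/48.
scores = np.array([33]*1 + [37]*2 + [43]*3 + [47]*5 + [53]*7 + [57]*8 +
                  [63]*8 + [67]*6 + [73]*4 + [77]*2 + [83]*1 + [87]*1)
assert len(scores) == 48

p_A = np.mean(scores < 45)     # area A: P(score < 45)
p_B = np.mean(scores > 50)     # area B: P(score > 50)

# The two events are disjoint, so their probabilities simply add.
print(p_A, p_B, p_A + p_B)
```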

7 The probability of a value more extreme?
The "more extreme" idea. [Two figures: a normal distribution and a chi-squared distribution with df = 9, each with its "more extreme" tail area shaded.] In the first example we are taking values more extreme than approximately 1.9, so all the values beyond 1.9 and -1.9 are shaded. This is different from our previous example, as "more extreme" here means those values beyond a specific value in both the positive and the negative direction. [Click] In the second example we are only considering extreme values in the positive direction. In both examples we are assuming that the values can continue for ever – in the first in both the negative and positive directions, and in the second only in the positive direction. Clearly, while the probability distributions have most of the values bunched up, there is a possibility, although a very small one, of obtaining these extreme values. Now let's move on to the next aspect of probability – the situation where one event affects another; such probabilities are called conditional probabilities.
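For the two-sided case, the shaded area can be computed directly; a sketch using scipy, assuming a standard normal curve and the approximate cut-off of 1.9 mentioned in the narration (the one-sided chi-squared case is worked through on slide 12):

```python
from scipy import stats

# Two-tailed "more extreme": the area beyond +1.9 plus the area beyond -1.9.
p_more_extreme = stats.norm.sf(1.9) + stats.norm.cdf(-1.9)   # upper tail + lower tail

print(round(p_more_extreme, 4))   # about 0.0574
```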

8 Conditional Probability
What happens if events affect each other? = Conditional probability. Multiply each branch of the tree to get the end value. Example from Taylor – From Patient Data to Medical Knowledge, p160: 20 people in a room, 8 female + 12 male, 4 of whom have a beard. P(bearded) = 4/20 = 0.2; P(male) = 12/20 = 0.6. So does the probability of being a bearded male = 0.2 x 0.6 = 0.12? NO. The probability tree has branches P(male) = 12/20 = 0.6 and P(female) = 8/20 = 0.4, which then split into P(bearded|male) = 4/12 = 0.333 and P(not bearded|male) = 8/12, giving P(bearded AND male) = P(male) x P(bearded|male) = 0.6 x 0.333 = 0.2. So what happens to probabilities when two or more events affect each other? I have taken this example from the excellent book by Paul Taylor – From Patient Data to Medical Knowledge, p160. Say we have 20 people in a room: 8 female and 12 male, 4 of whom have a beard. From this information we can work out various probabilities. For example, [click 1] P(bearded) = 4/20 = 0.2 and [click 2] P(male) = 12/20 = 0.6. [Click 3] So does the probability of being a bearded male = 0.2 x 0.6 = 0.12? The answer, I'm afraid to say, is NO – intuitively we feel that the gender of a person has an effect on the probability of them being bearded. So how do we take this into account? The simple answer is that we consider the subset of the group that we feel has the effect on the characteristic; in this instance we only consider the males. Let's consider the problem in the form of a probability tree. [click 4] The first branch gives us the probability of being male or female. Now let's expand it to consider the beardedness aspect. [click 5] We know there are 12 males, 4 of whom are bearded, so we can say that, given you are male, the probability of being bearded is 4/12 = 0.333. [click 6] We show this as a conditional probability, the vertical line representing "given": P(bearded|male) is read as the probability of being bearded given that you are male, and it equals 4/12 = 0.333. [click 7] So we can now work out the probability of selecting someone in the room who is both male AND bearded by multiplying each of the relevant branches, which in this instance is 0.6 x 0.333, giving 0.2. You may notice that this is the same as the probability of being bearded; these are the same only because we have no bearded females – if we had, the results would have been different. [click 8] Finally, we can show the male-and-bearded result as a probability statement, P(bearded AND male) = P(male) x P(bearded|male), which just reflects each of the relevant branches. Now let's look at a more useful example.
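The branch multiplication can be checked with a few lines of Python (a sketch using the counts given on the slide; variable names are mine):

```python
# Room of 20 people: 12 male, 8 female, and 4 of the males are bearded.
n_total, n_male, n_bearded_male = 20, 12, 4

p_male = n_male / n_total                            # 0.6
p_bearded_given_male = n_bearded_male / n_male       # 0.333...
p_male_and_bearded = p_male * p_bearded_given_male   # multiply along the branches

# Direct counting gives the same answer: 4 bearded males out of 20 people.
assert abs(p_male_and_bearded - n_bearded_male / n_total) < 1e-12
print(round(p_male_and_bearded, 3))                  # 0.2
```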

9 Screening Example
0.1% of the population (i.e. 1 in a thousand) carry a particular faulty gene, predisposing the person to some nasty disease. A test exists for detecting whether an individual is a carrier of the gene. In people who actually carry the gene, the test provides a positive result with probability 0.9 – 90% of the time we get the correct result. In people who don't carry the gene, the test provides a positive result with probability 0.01 – 1% of the time we get an incorrect positive result. Let G = person carries gene, P = test is positive for gene, N = test is negative for gene. Given that someone has a positive result, find the probability that they actually are a carrier of the gene: we want to find P(G|P). We need P(P); looking at the two positive-test branches, P(P) = P(G and P) + P(G' and P) = 0.0009 + 0.00999 = 0.01089. Now the narration: 0.1% of the population (i.e. 1 in a thousand) carry a particular faulty gene. [click 1] A test exists for detecting whether an individual is a carrier of the gene. [click 2] In people who actually carry the gene, the test provides a positive result with probability 0.9 – 90% of the time we get the correct result. [click 3] In people who don't carry the gene, the test provides a positive result with probability 0.01 – 1% of the time we get an incorrect positive result. [click 4] Let G = person carries gene, P = test is positive for gene, N = test is negative for gene. [click 5] Here is the information in the form of a probability tree. G is the branch with the faulty gene and G' is the one without it, so, considering everyone, the probability of a positive test from a person with the gene is 0.0009 – 9 times in 10 thousand; remember this takes into account the fact that the gene occurs in only one in a thousand people. [click 6] [click 7] We also have two error branches representing the false positives and the false negatives; notice that adding these two together gives a higher chance of getting one of these errors than of getting a true positive result! [click 8] Returning to the specific branches, we have the probability of a positive result given that you have the gene as 0.9 – that is, a 90% chance of a positive test given they carry the gene. What would be more useful is to know the probability the other way around, that is, the probability of having the gene given a positive result; by using the probability equation from the previous slide we can achieve this. [click 9] First we need the probability of having a positive test regardless of gene status; this is just the values at the end of the two positive-test branches added together, P(P) = 0.0009 + 0.00999 = 0.01089. Now we can plug this value into the equation and get the value we require. [click 10] The probability of someone having the gene given that they have a positive test is 0.0009 / 0.01089 = 0.083. That is, of those who test positive, only about 8% actually have the gene – if you test positive you are more likely not to have the gene than to have it! This figure is so low because there are a number of false positives, nearly one in every hundred people tested, and the prevalence of the gene is so low. So despite the high accuracy of the test, we end up with a low probability that someone with a positive test has the gene.
[click 11] The other important thing to notice is that order matters when it comes to conditional probabilities: the probability of a positive test given that the person carries the gene is not equal to the probability of the person carrying the gene given that they have a positive test – that is, 0.9 does not equal 0.083. P(P | G) ≠ P(G | P): ORDER MATTERS. Obviously this approach can be applied more generically; let's consider this next.
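The whole calculation is Bayes' theorem applied to the probability tree. A minimal sketch with the slide's numbers (variable names are mine):

```python
# Screening example: P(G) = 0.001, P(P|G) = 0.9, P(P|not G) = 0.01.
p_gene = 0.001                 # prevalence of the faulty gene
p_pos_given_gene = 0.9         # true positive rate
p_pos_given_no_gene = 0.01     # false positive rate

# Total probability of a positive test: add the two positive-test branches.
p_pos = p_gene * p_pos_given_gene + (1 - p_gene) * p_pos_given_no_gene

# Bayes' theorem: P(G | P) = P(G and P) / P(P).
p_gene_given_pos = p_gene * p_pos_given_gene / p_pos

print(round(p_pos, 5))             # 0.01089
print(round(p_gene_given_pos, 3))  # 0.083 -- only about 8% of positives carry the gene
```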

10 Conditional Probability: Disease / Test
Disease / test = conditional probability: P(disease X AND test+) = P(disease) x P(test+|disease). We can think in terms of diseases and tests in general; the tests might be in-house ones or mass screening interventions. They can all be analysed using the same framework.

11 Observed | hypothesised
The two directions of conditioning are: P(hypothesised value | summary value = x), the probability of the hypothesised value GIVEN THAT we obtained the summary value x; and P(summary value = x | hypothesised value), the probability of obtaining summary value x GIVEN THAT we have this hypothesised value. At a more abstract level, we can think of obtaining a summary value from some dataset – such as a mean, standard deviation, chi-square or t value – and also of some specific value called the hypothesised value. We can look at this in terms of our probability tree: what is the probability of the hypothesised value given the summary value obtained from our dataset? Or, the other way round, [click 1] what is the probability of obtaining a specific summary value given a specific hypothesised value? As you now know, in conditional probabilities order matters, and this situation is no different: these two conditional probabilities are not equivalent. [click] Most people, when they carry out experiments, would like to know the probability of the hypothesised value given the summary value x in their sample – the first situation. Unfortunately the P value is concerned with the second situation: the probability of obtaining a summary value x from our sample given the hypothesised value. But the P value does more than that, by considering a range of outcomes for our summary value; specifically, it provides the probability of a value more extreme than that observed. Let's consider this quickly again, adding in this conditional aspect as well.
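Written formally (a sketch of the distinction; here T stands for the summary statistic, t_obs for the value observed in the sample, and H0 for the hypothesised value, with "as or more extreme" taken as the upper tail, as in the one-sided chi-square example on the next slide):

\[
P(H_0 \mid T = t_{\mathrm{obs}}) \;\neq\; P(T = t_{\mathrm{obs}} \mid H_0),
\qquad
\text{P value} = P(T \ge t_{\mathrm{obs}} \mid H_0).
\]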

12 Combining conditional probability + multiple outcomes = P value
Here we have a probability distribution of possible observed values for the chi-square summary statistic GIVEN THAT the hypothesised value is ZERO. [Figure: chi-squared distribution, df = 9, with the area above 15 shaded blue.] The blue region represents all those values greater than 15; its area = 0.0909 – this is the P value. P value = P(observed chi-square value or one more extreme | value = 0). A P value is a conditional probability considering a range of outcomes. [click 1] Here we have a particular probability distribution for a summary statistic known as the chi-square; for various reasons it is shown here for a sample of 10 independent observations. The curve provides the distribution of probabilities given that the actual value is equal to zero. [Click 2] Say we obtained a chi-square value of 15 in our sample: we could work out the probability of obtaining any chi-square value more extreme than this, from a population with a chi-square value of zero, by working out the area – the blue region represents this [click 3] and is equal to 0.0909. [click 4] This is the P value in this instance, and here it represents the probability of an observed chi-square value of 15, or one more extreme, GIVEN THAT the value is equal to zero in the population. [Click 5] So a P value is a conditional probability which considers a range of outcomes. In the above I have mentioned samples and populations; we will revisit this later, as it is another important aspect of the P value, particularly the size of the sample. Let's wrap up this section now.
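The 0.0909 tail area can be reproduced with scipy (a sketch; the observed value of 15 and df = 9 are from the slide):

```python
from scipy import stats

# P value for an observed chi-square of 15 with 9 degrees of freedom:
# the upper-tail area beyond the observed value, given the hypothesised value.
p_value = stats.chi2.sf(15, df=9)    # sf = survival function = 1 - cdf

print(round(p_value, 4))             # about 0.0909, matching the slide
```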

13 Probability summary
All outcomes at any one time add up to 1. Probability histogram: area under the curve = 1 -> specific areas = sets of outcomes. "More extreme than x". Conditional probability – ORDER MATTERS. A P value is a conditional probability which considers a range of outcomes. Here is a summary of the probability aspects of the P value. [click] All outcomes at any one time add up to 1. [click 2] A probability histogram makes the total area under the curve equal to 1, [click 3] therefore we can calculate specific areas, which equal probabilities representing sets of outcomes. [click 4] Specifically, we can consider the "more extreme than x" situation. [click 5] The importance of considering conditional probability, where ORDER MATTERS, was highlighted with the screening example, where there was a greater chance of having a positive test result than of having the condition (this was fictitious data – but it might represent reality in some circumstances). [click 6] Finally, all this allowed us to appreciate the fact that a P value is a conditional probability which considers a range of outcomes. But . .

14 Putting it all together
Probability, P value, sampling, statistic, rule. We have only looked at the first of the 4 aspects of the P value. We can now move on to the next aspect, that of sampling. There we will see how the P value is affected by sampling, specifically sample size, and the importance of understanding the difference between a sample and a population, to further understand what I meant by the terms summary statistic and hypothesised value when discussing the conditional probability aspects of the P value. Bye for now.

