Probability: The Study of Randomness Randomness and Probability Models

Probability: The Study of Randomness Randomness and Probability Models
IPS Chapters 4.1 and 4.2 ? draw coin and leave on WB w prob each; do same with die; do same with 2-die // web page with images for easy switch?  change BB probabilities to reflect NBAers? © 2009 W.H. Freeman and Company

Learning Objectives Randomness and Probability models
Describe the difference between events that are independent, and those that are not. Be able to take a sample space and create a probability model. Be able to interpret a probability model. Describe the difference between events that are DISJOINT and those that are not. Be able to apply the addition rule for disjoint events. Describe the difference between an empirical probability and a theoretical probability.

“Yeah, yeah” Often we look at topics that seem like “common sense” and say to ourselves, ‘yeah, yeah, I get that’. BE CAREFUL! Often there are many ways in which we think we understand something, but there still remain (many!!) gaps in our knowledge and understanding. This is not only true of statistics – happens in all kinds of places/studies. However, it is a particularly common pitfall in stats. Be sure to review the concepts and complement them with lots of problems. There are not shortcuts – if there were I would tell them to you!! Review and lots of problems is the only way to do learn this material!

Randomness and probability
A phenomenon is random if individual outcomes are uncertain, yet given a large number of repetitions, you would see a regular distribution of outcomes. For example, a single individual flip of a coin is random. However a large number of coin-flips will result in about 50% heads and 50% tails. The probability of any outcome of a random phenomenon can be defined as the proportion of times the outcome would occur in a very long series of repetitions. For example, the probability of flipping a heads is 50% given a large number of repetitions (flips).

Coin toss The result of any single coin toss is entirely random. But the result over many tosses IS predictable. The probability of heads is 0.5 = the proportion of times you get heads in many repeated trials. First series of tosses Second series

* Two events are independent if the probability that one event occurs on any given trial of an experiment (e.g. a coin toss) is not affected or changed by the occurrence of the other event. When are trials not independent? Imagine that these coins were spread out so that half the coins showed heads and half showed tails. Close your eyes and pick one. The probability of it being heads is However, if you don’t put it back in the pile, the probability of picking up another coin that is heads up is now less than 0.5. The trials could only be considered independent if you put the coin back each time.

Probability Models Probability models describe, mathematically, the outcome of random processes. A probability model consists of two parts: 1) Sample Space (‘S’): This is the set, of all possible outcomes of a random process. 2) A probability for each possible outcome in the sample space. Example: Probability Model for a Coin Toss: S = {Head, Tail} Probability of heads = 0.5 Probability of tails = 0.5

Example of a Probability Model
Example: A couple wants three children. What are the numbers of girls they could end up with? What are the probabilities for each outcome? Create a probability model. Also: Be sure to note and be comfortable with the way we use variables and symbols such as P(X=2) Sample space: Let X = possible number of girls: {0, 1, 2, 3}  P(X = 0) = P(BBB) = 1/8  P(X = 1) = P(BBG or BGB or GBB) = P(BBG) + P(BGB) + P(GBB) = 3/8  P(X = 2) = P(BGG or GBG or GGB) = P(BBG) + P(BGB) + P(GBB) = 3/8  P(X = 3) = P(GGG) = 1/8 Probability Model:

Terminology Example: A couple wants three children.
NB: It is vitally important to clearly define what X represents. In this case, X is a count of the number of girls that the couple will end up with out of their 3 children. So if your statistics prof gives you the situation in the previous slide and then writes P(X=2) what exactly are you being asked? Answer: “What is the probability of the couple ending up with exactly two girls?” If you were asked P(X>2)? Answer: What is the probability of the couple ending up with exactly 3 girls? What if you were asked P(X>=2)? Answer: What is the probability that the couple ending up with either 2 girls or 3 girls? You must be very clear with this concept. Note that on this slide, I do not spend the time working out the mathematical answer. This is intended as a reminder to you that the most important thing to do first is to clearly understand what the question is asking and to understand precisely what the variables represent.

Terms / Models – Learn ‘em!!!
Sometimes we can intuitively figure things out. What is the probability of drawing an Ace of Spades from a deck of cards? Answer: 1/52 What is the probability of drawing an Ace of Spades from the deck if the card just before was the 5 of Diamonds? Answer: 1/52 Sometimes, however, things become much more subtle and/or complicated. This is why it is important to become very familiar with terms such as: probability models, independent vs non-independent events disjoint vs non-disjoint conditional probability etc etc. Always focus on applying the proper model and you will have a much better chance at understanding the concepts and coming up with the correct answers. Remember, it’s the concept that matters. If you don’t have the concept, you risk incorrect statistics.

** Sample spaces It’s the question being asked that determines the sample space.
H HHH M … M M HHM H HMH M HMM … S = { HHH, HHM, HMH, HMM, MHH, MHM, MMH, MMM } Note: 8 elements, 23 A. A basketball player shoots three free throws. What are the possible sequences of hits (H) and misses (M)? B. A basketball player shoots three free throwsWhat is the number of baskets made? S = { 0, 1, 2, 3 } A single coin flip A single dice roll 2 coin flips Note finite vs infinite sample space C. A nutrition researcher feeds a new diet to a young male white rat. What are the possible outcomes of weight gain (in grams)? S = [0, ∞[ (the ‘[‘ means that infinity is excluded)

Sample Space continued:
What is the sample space for a single coin flip? S(Single Coin Flip) = {H, T} What is the sample space for two coin flips? S(Two coin flips) = {HH, HT, TH, TT} What is the sample space for a single baseball at-bat? S(At bat result) = {Hit, Out} Note that in order to define the sample space, you must read the question carefully!

Probability of Outcomes in the Sample Space
Each outcome in the sample space has a probability of occuring. We notate this outcome as P(….) Eg: Looking at a single coin flip. Let X = the flip result. What is P(X=heads)? Answer: 0.5 Eg: Looking at two coin flips, Let X = the number of heads. What is P(X=2)? Answer is 0.25 – we will explain how we got this number later. The most important point here, is NOT the numerical answer. It is to note the terminology/symbology being used. Key Point: Each of the possible outcomes of a sample space may have different probabilities. That is, they are not necessarily identical. Coin flip outcomes {H,T} do have identical probabilities P(Heads) = 0.5, P(Tails) = 0.5 A good professional basketball player may hit 70% of free throws (well, those whose names aren’t Shaq). P(Basket) = 0.7, P(Miss) = 0.3 An at-bat baseball player’s outcomes {hit, out} have different probabilities P(Hit) = 0.294, P(Out) = 0.706

* Outcomes / Events Proper and specific identification of the event and sample space can sometimes fool you! Be sure to read the question with some care! Without proper identification of the sample space, event, and the probability you are looking for, you’re gonna end up with the wrong answer! An event (also called an outcome) is some subset of the sample space that we are interested in. For the following two examples, define the sample space, the event, and the probability (in terms of symbols): If I roll a single die 50 times, how often will I roll a 1? Sample space = {1, 2, 3, 4, 5, 6} Event = {1} Let X = the number of times a 1 is rolled, i.e. What is P(X=1)? If I roll two die 50 times, how often will I get a 7 or an 11? Sample space = {1/1, 1/2, 1/3, 1/4, 1/5, 1/6, 2/1, 2/2, 2/3, etc, etc} Event = {7, 11} Let X = the number of times a 7 OR an 11 is rolled. What is P(X=7 or X=11)? An event is always some subset of the sample space. Notice that the event is NOT the same thing as the probability. The event is simply the result you are interested in calculating a probability for. The probability refers to the likelihood of that event occuring.

When doing probability work, we are most often interested in determining the likelihood (probability) that a particular event inside the sample space will occur. Classical Probability formula. This is calculated by looking at the sample space and counting all of the times your event occurs. Divide that number by the total number of outcomes in the sample space. P(A) = # of possible outcomes in which event A occurs / total # outcomes in the sample space EXAMPLE: What is the probability of rolling a double in a single roll of the die? The sample space has 36 possible values: {1/1, 1/2, 1/3, 1/4, 1/5, 1/6, 2/1, 2/2, 2/3 ….. 6/5, 6/6} The event you are looking for are doubles {1/1, 2/2, 3/3, 4/4, 5/5, 6/6} If you look through your sample space, you will find 6 occurences of your event. The total number of outcomes in the sample space is 36. Therefore, P(A double is rolled) = 6/36 (0.167). EXAMPLE: What is the probability of rolling a double in 2 rolls of the die? The event you are looking for are doubles in the first roll {1/1, 2/2, 3/3, 4/4, 5/5, 6/6} OR doubles in the second roll. (i.e. You can two tries). P(double in 2 rolls) = 12 possible outcomes that have a double / 36 possible outcomes = 0.333 We will have a slightly more formal discussion of this shortly.

Probability rules Coin Toss Example: S = {Head, Tail} Probability of heads = 0.5 Probability of tails = 0.5 1) Probabilities range from 0 (no chance of the event) to 1 (the event will always happen). For any event A, 0 ≤ P(A) ≤ 1 Probability of getting a Head = 0.5 We write this as: P(Head) = 0.5 P(neither Head nor Tail) = 0 P(getting either a Head or a Tail) = 1 2) Because some outcome must occur on every trial, the sum of the probabilities of all the possible outcomes (the sample space) must be exactly 1. P(sample space) = 1 Coin toss: S = {Head, Tail} P(head) + P(tail) = =1  P(sample space) = 1 Rule #1: A probabilitiy has a value between 0 and 1 Don’t take #2 for granted, it is a useful rule…

If a basketball free-thrower makes the basket 72% of the time:
S = {Hit, Miss} P(Hit) = 0.72 P(Miss) = 0.28 P(Hit + Miss) = 1.0 Recall, all probababilities in the sample space must add up to 1. P(Neither hit nor miss) = 0

New Term: Disjoint Important for our next probability rule…
Disjoint essentially means ‘Mutually Exclusive’ If two events are disjoint, then if one event is true, the other can NOT be true. The two examples here are disjoint since if one of them is true, then the other can not be. Eg: P(born in Canada or born in Mexico) Eg: P(draw Ace of Spades or draw Queen of Hearts)

Probability rules (cont’d )
Venn diagrams: A and B disjoint Probability rules (cont’d ) 3) Two events A and B are disjoint if they have no outcomes in common and can never happen together. If two events A and B are disjoint, then the probability that A or B occurs is the sum of their individual probabilities. P(A or B) aka P(A U B) = P(A) + P(B) This rule is called the addition rule for disjoint events. A and B not disjoint (Later we will learn a different formula to use for NON-disjoint events) When you find yourself using the word “OR” in a probability question, you are probably going to need to use some form of the addition rule. You will need to decide whether to use addition rule for disjoint events, or the addition rule for non-disjoint events.

Example: Addition Rule for Disjoint Events
Example: If you flip two coins, what is the probability of getting ONLY heads or ONLY tails? S = {HH, HT, TH, TT}. Here is the probability table (model): Note that the event {HH} and the event {TT} are disjoint. Therefore, we can use our addition rule for disjoint events. Recall that this rule says that: If two events are disjoint, the probability of one event happening OR the other event happening is simply the sum of the probabilities of each event individually. Answer: So, the probability that you obtain “only heads or only tails” is: P(HH or TT) = P(HH) + P(TT) = = 0.50 Event HH HT TH TT Prob 0.25 Think of a single coin-flip or single dice-roll. What about 2 consecutive coin flips?

The “Addition Rule for Disjoint Events”
This probability rule for disjoint events says that the probability of A happening OR the probability of B happening equals the sum of those two individual probabilities. Put as a fomula: P(A or B) = P(A) + P(B) HOWEVER!  It must be noted that this rule only applies for “disjoint” events. If the events are not disjoint, you can not use this formula to determine the P(A or B). If you want to find the probability of P(A or B) when the events are not disjoint, you must use a different formula. You will encounter rules such as these throughout the course. They key point of this slide: It is important to pay attention to the details! (e.g. “disjoint”)

Example of Non-Disjoint events
Example: P(draw 6 of Spades or draw an even numbered card)  Yes, we see the word ‘OR’ here, which makes it tempting to use the addition rule for disjoint events. However, note that it IS possible for both of these events to be true. Therefore, the rule can not be used. Example: Suppose you have a group of 100 athletes. Of those athletes, 30% are basketball players. In that group of 100 athletes, 25% are over 6-feet tall. What is the probability that an athlete is a basketball player OR is 6 feet tall? As with the previous example, you may be tempted to use our addition rule for disjoint events: P(Basketball Player) + P(6-feet tall) However, because these two events are not disjoint, doing so would give an incorrect result! It is possible for any member of the population to be BOTH a basketball player and also 6 feet tall. The addition rule that we have discussed will NOT give a valid answer in this case. We will soon modify our addition formula to deal with non-disjoint events.

Example: On a single roll of a die, what is the probability of rolling a 1 or a 3 or a 5? You see the word ‘OR’, so you should start thinking about the addition rule. In that case, the first thing you need to ask yourself is, are the events disjoint? Answer: Yes – they can not overlap. That is, on a single roll, you can’t end up with more than one of the possible events at the same time. So, using our addition rule: S = {1,2,3,4,5,6} E = {1, 3, 5} P(1 or 3 or 5) = P(1) + P(3) + P(5) = (1/6) + (1/6) + (1/6) = 3/6 i.e. P(1 or 3 or 5) = 0.5

Another example of a Probability Model:
Example: A couple wants three children. What are the numbers of girls they could end up with? What are the probabilities for each outcome? Create a probability table. Did you try to use the multiplication rule?! To calculate the probabilities for each possible value, we must in this case use the addition rule. Sample space: Let X = possible number of girls: {0, 1, 2, 3}  P(X = 0) = P(BBB) = 1/8  P(X = 1) = P(BBG or BGB or GBB) = P(BBG) + P(BGB) + P(GBB) = 3/8  P(X = 2) = P(BGG or GBG or GGB) = P(BBG) + P(BGB) + P(GBB) = 3/8  P(X = 3) = P(GGG) = 1/8 Important: Note that the sample space in this question is NOT {BBB, BBG, BGB, GBB, GGB, GBG, BGG, GGG} Remember that it is vitally important to properly identify your sample space!!!

Examples using a Probability Table
X = # of girls in a household with 3 children Example: What percentage of households has fewer than 2 girls? Answer: P(X<2) = 3/8 + 1/8 = 0.5 Example: What is the probability that a randomly selected household has at least one girl? Answer: P(X>=1) = 3/8 + 3/8 + 1/8 = 7/8 A better idea would be to use the complement rule: P(X=0)C = 1 – 1/8

Probability rules (cont’d)
Coin Toss Example: S = {Head, Tail} Probability of heads = 0.5 Probability of tails = 0.5 Probability rules (cont’d) 4) The complement of any event A refers to an event in which A does not occur. It is written as Ac. Put as a formula: The complement rule states that the probability of an event not occurring is 1 minus the probability that is does occur. P(not A) = P(Ac) = 1 − P(A) Tailc = not Tail = Head P(Tailc) = 1 − P(Head) = 0.5 Venn diagram: Sample space made up of an event A and its complementary Ac, i.e., everything that is not A. Complement is important… Seems simple, but again, don’t take it for granted. Note the notation we use

Complement: Can be very useful, sometimes it’s just a shortcut. Other times, it is a tool where it would be essentially impossible to solve the problem without it. As a simple example, suppose you want to calculate the probability of rolling a 1 or a 2 or a 3 or a 4 or a 5. This could be determined by calcuating P(1) + P(2) + P(3) + P(4) + P(5) However, an easier way would be to calculate the complement of rolling a 6: P(6C)= 1 – P(6) = 1-1/6 = 5/6

Probabilities: finite number of outcomes
Finite sample spaces deal with discrete data — data that can only take on a limited number of values. For example: Sample space of a roll of a die is finite. {1,2,3,4,5,6} Sample space of people’s heights in not finite. [0”, 110”] Throwing a die: S = {1, 2, 3, 4, 5, 6} As with disjoint/non-disjoint, we will see that there are different probability formulas that are used depending on whether the sample space is discrete v.s. non-discrete.

* M&M candies If you draw an M&M candy at random from a bag, the candy will have one of six colors. The probability of drawing each color depends on the proportions manufactured, as described here: What is the probability that an M&M chosen at random is blue? Color Brown Red Yellow Green Orange Blue Probability 0.3 0.2 0.1 ? S = {brown, red, yellow, green, orange, blue} P(S) = P(brown) + P(red) + P(yellow) + P(green) + P(orange) + P(blue) = 1 P(blue) = 1 – [P(brown) + P(red) + P(yellow) + P(green) + P(orange)] = 1 – [ ] = 0.1 What is the probability that a random M&M is either red, yellow, or orange? P(red or yellow or orange) = P(red) + P(yellow) + P(orange) = = 0.5

M&M candies eg: Blue = 0.06, Fuscia = 0.04? Blue = 0.05, Fuscia= 0.05?
Color Brown Red Yellow Green Orange Blue Fuscia Probability 0.3 0.2 0.1 ? Suppose I told you there was an SEVENTH color, ‘fuscia’. Now what is the probability that an M&M chosen at random is blue? Answer: You can’t tell. The probabilities must add up to 1. All colors except blue and fuscia add up to Blue and fuscia must make up the remaining We have no way of knowing the exact amounts. eg: Blue = 0.06, Fuscia = 0.04? Blue = 0.05, Fuscia= 0.05? Blue = 0.03, Fuscia = 0.07? etc

Methods of determining the probability to an event:
There are only 2 ways of assigning probabilities with some degree of accuracy: Theoretically  from our understanding of the phenomenon and symmetries in the problem For a 6-sided fair die, P(2) = 1/6 For a deck of cards, P(Ace of Hearts) = 1/52 Empirically  We look at data. In other words, we are deriving our knowledge based on numerous similar past events. We determine a probability empirically when there isn’t a theoretical method of determining the probability. EXAMPLE: What is the probability that a random student in IT-223 wil receive an A? This probability can not be determine theoretically the way, say, a coin flip can be. Instead, we’d have to look at past data from IT-223 courses and see how many As were recorded. If in a sample of 600 students, 54 A’s were recorded, then P(grade of A) = 54 / 600. EXAMPLE: What is the probability of a baseball player getting a hit on a given at bat? Again, we would have to look at past data to see his/her batting average. If over the last year, he hit 122 times out of 411 at bats, P(Hit) = 122/400.

See example on next slide.
Shortcut: Determining a probability when all outcomes are equally likely When all outcomes are equally likely (have the same probability), there is a nice quick way of calculating probabilities: If a random phenomenon has a group of equally likely possible outcomes (e.g. die roll, coin flip) then each individual outcome has probability 1/(# of possible outcomes). Eg: For a coin flip, all possible outcomes have an equal probability (0.5) Eg: For a die roll, all outcomes have an equal probability (1/6) Formula: For any event A where all outcomes have the same probability: So when looking at situations where all outcomes have the same likelihood, this is a very convenient way of calculating a probability. See example on next slide.

Dice You toss two dice. What is the probability of the outcomes summing to 5? S: {(1,1), (1,2), (1,3), ……etc.} There are 36 possible outcomes in S, all equally likely (given fair dice). Thus, the probability of any one of them is 1/36. P(sum of 5) = 4 possible outcomes P(outcomes in Sample space) = 36 outcomes in sample space = 4/36 Keep contrasting with finite but non-equal outcomes (e.g. bumped passengers on a plane) - Settimi’s

EXAMPLE: What is the probability of the ball landing in an even number slot?
Answer: All slots have an equal probability. So: P(Even) = count of Even outcomes / count of All outcomes = 18 / 38 = 0.47

Multiplication Rule for Independent Events
Probability rules (cont’d) Multiplication Rule for Independent Events 5) Two events A and B are independent if knowing that one event has occured does not change the probability that the other will occur. The multiplication rule for independent events: P(A and B) = P(A) * P(B) Again, this rule only applies to independent events. Later we will show a second rule that applies to non-independent events If you have two children, what is the probability of having 2 girls?

Independence Flip a coin once and get ‘heads’. Does this result change the probability of rolling a heads again on the second roll? No, they are independent results In a game of Poker, you draw a card from the deck and get an Ace. Does this result change the probability of getting an Ace on a second draw from that same deck? Yes, there are now only 3 aces left in the deck. These events are not independent. (Initially, 4/52 chance, after removing 1 ace, 3/51 chance)

Multiplication Rule for Independent Events
Example Multiplication Rule for Independent Events Example: What is the probability of getting a tails on two consecutive coin tosses? When we encounter the term ‘AND” we typically think of the multiplication rule. At that point we must ask ourselves if the events are independent. Because in two consecutive coin tosses, they are indeed indepdendent, then: P(first = Tail and second = Tail) = P(first Tail) * P(second Tail) = 0.5 * 0.5 = 0.25 If you have two children, what is the probability of having 2 girls?

Example: A couple intends to have three children. What is the likelihood that they will have only boys? Each birth is independent of the next, so we can use the multiplication rule. Example: P(BBB) = P(B)* P(B)* P(B) = (1/2)*(1/2)*(1/2) = 1/8

Example: You are dealt two cards out of a deck. Calculate the probability the the cards are the Ace of Spades and the Ace of Hearts. Solution: You will probably be tempted to use the multiplication rule. However, the rule you’ve been given only applies to independent events. Unfortunately, these events are not independent. Once one event has occurred (e.g. drawing the ace of spades from the deck), the probability for the second event changes. Later, we will learn a slight modification to this rule so that it can also be applied to non-independent events. If, however, we changed the question to state that instead of pulling two cards out of the deck, we pull out one card, then put it back and pull out another card, we could use our multiplication rule, since these are now independent events.

Disjoint vs Independent
The two are often confused. Be sure that you are clear on the meaning of each. One common question has to do with the relationship between them. Is there one? For example, can an event be both disjoint and independent? Disjoint = If two events, A and B are disjoint, we are saying that if P(A) is true, then P(B) can not be true. So let’s say that our event is indeed disjoint. If our event is disjoint, this says that P(B) is affected by P(A). If P(B) is affected by the occurrence of A , then the two events are NOT independent. So the answer to our question is NO! When you think about it, if two events are disjoint, then those events can NOT be independent.

Example: (4.33 p.247) A state lottery’s Pick 3 game asks players to choose a 3-digit number, 000 to The state chooses the winning number at random, so each number has probability 1/ You win if the winning number contains the digits in your number, in any order. Your number is 456. What is your probability of winning? (Write on a piece of paper) Your number is 212. What is your probability of winning. Solution: As always, you have to read the question carefully, and always be sure to watch the fine print! In this case, the three tiny words at the end, “in any order” change the entire nature of the problem. So, instead of winning with 456 where your probability would be 1/1000, you can also win with 465, or 546, or 564 etc. If you look at all possible permutations, you will see that there are 6 distinct possibilities: {456, 465, 546, 564, 645, 654}. Since each one has a 1/1000 chance of winning, and you have six possibilities, the probability of winning is 6/1000 or For the second part, you only have three distinct arrangements: {212, 221, 122}. So your probability of winning is 3 * 1/1000 =

Example (p.247) The PINs for ATMs usually consist of 4 digits. You ntoice that most PINs have at least one 0 and you wonder if the issuers use lots of 0s to make the numbers easy to remember. Suppose that PINs are assigned at random so that all 4-digit numbers are equally likely. How many possible PINs are there? – Write down your answer What is the probability that a PIN assigned at random has at least one 0? – Write. SOLUTION: Possible PINs: 104 = 10,000 The operative word for the second question is “at least”. In other words, what is the probability that 1 or 2 or 3 or all 4 numbers are zero? This turns out to be a surprisingly involved calculation. However, if you remember your trusty complement rule, it becomes surprisingly easy! You realize that you could phrase things another way: If I can determine the probability that NONE of the numbers are 0, then the complement of that equals the probability at AT LEAST one is 0. P(one number = 0) = 0.1, P(not 0)=0.9 P(all 4 numbers are not 0) = 0.9*0.9*0.9*0.9 = Don’t forget that this is the probability that NONE are 0. So the probability that 1 or more are zero = =

Topics we’ve just covered:
Randomness and Probability models Probability and Randomness Sample spaces Probability rules Assigning probabilities: finite number of outcomes Assigning probabilities: equally likely outcomes Independence and multiplication rule Write a short summary of each of these on a sheet Keep that sheet in front of you as you are studying, and refer to it frequently. It is very useful both for review, and to keep from confusing all these different concepts.

Probability: The Study of Randomness Random Variables
IPS Chapters 4.3 and 4.4 © 2009 W.H. Freeman and Company

Objectives (IPS Chapters 4.3 and 4.4)
Random variables Discrete random variables Continuous random variables Normal probability distributions Mean of a random variable Law of large numbers Variance of a random variable Rules for means and variances

Discrete random variables
A random variable is a variable whose value is a numerical outcome of a random phenomenon. A basketball player shoots three free throws. We define the random variable X as the number of baskets successfully made. A discrete random variable X has a finite number of possible values. A basketball player shoots three free throws. The number of baskets successfully made is a discrete random variable (X). X can only take the values 0, 1, 2, or 3. The ‘random’ comes from the fact that we don’t know in advance how many baskets the player will make.

* Probability Distribution
The probability distribution of a random variable X lists the values and their probabilities: The probabilities pi must add up to 1. (Recall Probability Rule #2) A basketball player with a 50% free throw accuracy shoots three free throws. The random variable X is the number of baskets successfully made. Here is a probability table: Value of X Probability 1/8 3/8 3/8 1/8 You want to get good at making a probability distribution. They are not hard, but you must CORRECTLY know the sample space probability for each event (in this case, the event is the number of baskets made) H H HHH M … M M HHM H HMH M HMM … HMM HHM MHM HMH MMM MMH MHH HHH

The probability of any event is the sum of the probabilities pi of the various values of X that make up the event. A basketball player shoots three free throws. He typically makes a basket exacty 50% of the time. The random variable X is the number of baskets successfully made. What is the probability that the player successfully makes at least* two baskets (“at least two” means “two or more”)? Value of X Probability 1/8 3/8 3/8 1/8 HMM HHM MHM HMH MMM MMH MHH HHH P(X≥2) = P(X=2) + P(X=3) = 3/8 + 1/8 = 1/2 In this case, the event is “2 or more free throws”. The possible values of X making up this event are {2, 3} * “at least”: An important concept that has a way of showing up very often in real life (and on exams)

The probability of any event is the sum of the probabilities pi of the [various] values of X that make up the event. A basketball player shoots three free throws. The random variable X is the number of baskets successfully made. Value of X Probability 1/8 3/8 3/8 1/8 What is the probability that the player successfully makes fewer than three baskets? HMM HHM MHM HMH MMM MMH MHH HHH P(X<3) = P(X=0) + P(X=1) + P(X=2) = 1/8 + 3/8 + 3/8 = 7/8 We can also use the complement: P(X<3) = 1 – P(X=3) = 1 – 1/8 = 7/8 What is the event? What are the values of X that make up this event? What is the event? What are the values of X that make up this event? Event: <3 baskets Values of X: {0, 1, 2}

Continuous random variables
A continuous random variable X takes all values in an interval (range). Example: There is an infinity of numbers between 0 and 1 (e.g., 0.001, 0.4, ). How do we assign probabilities to events in a sample space that is infinite?!? Answer: We use density curves and compute the probabilities for intervals. The probability of any event is the area under the density curve for the values of X that make up the event. cf: Discrete random variables “Uniform Density Curve”: means that all values are equally probable. Note this this density curve is a straight line. No one value has a higher likelihood than any other value. Shown here is a uniform density curve for the variable X. The probability that X falls between 0.3 and 0.7 is the area under the density curve for that interval: P(0.3 ≤ X ≤ 0.7) = (0.7 – 0.3)*1 = 0.4 X

“Uniform” Density Curve
“Uniform” means that all probabilities are identical A uniform density curve is a straight line (red line in this diagram) The area under the curve represents the total probability and sums to 1 Recall that density curves can come in all kinds of shapes The bell curve is only one kind of distribution. It is so familiar to us because this kind of distribution shows up very often in the real world people’s heights exam scores intelligence scores etc In probabilities, we frequently encounter situations where all possible outcomes have the same probabilities. (e.g. on a dice roll, all numbers have an equal probability of being rolled). In this case, there will not be any curve to the density curve. Intsead, it will be a straight horizontal line. This is called a “uniform density curve”.

Intervals The probability of a single event is meaningless for a continuous random variable. Only intervals can have a non-zero probability. Again, probability is represented by the area under the density curve for that interval. The probability of a single event is zero: P(X=1) = (1 – 1)*1 = 0 Height = 1 X The probability of an interval is the same whether boundary values are included or excluded: P(0 ≤ X ≤ 0.5) = (0.5 – 0)*1 = 0.5 P(0 < X < 0.5) = (0.5 – 0)*1 = 0.5 P(0 ≤ X < 0.5) = (0.5 – 0)*1 = 0.5 P(X < 0.5 or X > 0.8) = P(X < 0.5) + P(X > 0.8) = 1 – P(0.5 < X < 0.8) = 0.7

Example: Random number between 0-2
In this case, you are generating a single random number between 0 and 2. The density curve is shown below. What is the height? Answer: 0.5  Recall that the total area under a density curve represents the complete range of probabilities. Also recall that the total probability must sum to 1. So if your range is from 0-2, then the total height must be 0.5. What is the probability that a random number generated will be <1.5? Answer: Find the area between 0 and Ie: (1.5-0) * 0.5 = 0.75 Find the probability of a random number between 0.6 and 1.7 Answer: ( )*0.5 =

1 2 1 2 0.5 1.5 Example: Another example of a density curve
We generate two random numbers between 0 and 1 and take Y to be their sum. Therefore, Y can take any value between 0 and 2. The density curve for Y is: Height = 1. We know this because the base = 2, and the area under the curve has to equal 1 by definition. The area of a triangle is ½ (base*height). Y 1 2 What is the probability that Y is < 1? What is the probability that Y < 0.5? Another example of a density curve. Suppose we want to generate a random number between 0 and 2. This is what our density curve would look like. 1 2 0.5 0.25 0.125 1.5

1 2 Question: Is this a uniform density curve?
Answer: No. Recall that a uniform density curve is one in which all possible values in the sample space have the exact same probability. For example, P(1) is different from P(1.3) which is different from P(0.14), etc etc. Y 1 2

Continuous random variable and population distribution
% individuals with X such that x1 < X < x2 There are two ways of looking at the information under a density curve. The shaded area under a density curve shows the proportion, or %, of individuals in a population with values of X between x1 and x2. Because the probability of drawing one individual at random depends on the frequency of this type of individual in the population, the probability is also the shaded area under the curve. Eg: a height between 63 and 64 inches How many women are between 63 and 64 inches? If I pick a random person, what is the probability that she is between 63 and 64 inches?

Because the woman is selected at random, X is a random variable.
What is the probability, if we pick one woman at random, that her height will be some value X? For instance, between 68 and 70 inches P(68 < X < 70)? Because the woman is selected at random, X is a random variable. As before, we calculate the z-scores for 68 and 70. For x = 68", For x = 70", N(µ, s) = N(64.5, 2.5) 0.9192 0.9861 The area under the curve for the interval [68" to 70"] is − = Thus, the probability that a randomly chosen woman falls into this range is 6.69%. P(68 < X < 70) = 6.69%

What is the probability, if we pick one woman at random, that her height will be 70 inches?
Answer: We can NOT say. Recall that with continuous variables, it is not possible to determine the probability of a single value. We can only determine the probability of a range of values.

* Mean of a random variable
The mean (x-bar) of a set of actual observations is their arithmetic average. E.g. The mean exam score of all students in a stats class. The mean µ of a random variable X is a weighted average of the possible values of X, reflecting the fact that all outcomes might not be equally likely. A “50% free throwing” basketball player shoots three free throws. The random variable X is the number of baskets successfully made (“H”). Incidentally, do you think these probabilities are valid? The probabilities in this chart are only if a person randomly hits baskets. Obviously if a professional basketball player was throwing, the probabilities would be very different. Compare with, say, a statistics teacher who is 5’7”… HMM HHM MHM HMH MMM MMH MHH HHH Value of X Probability 1/8 3/8 3/8 1/8 The mean of a random variable X is also called expected value of X.

Mean of a discrete random variable
For a discrete random variable X with the following probability distribution  the mean µ of X is found by multiplying each possible value of X by its probability, and then adding the products. A basketball player shoots three free throws. The random variable X is the number of baskets successfully made. The mean µ of X is µ = (0*1/8) + (1*3/8) + (2*3/8) + (3*1/8) = 12/8 = 3/2 = 1.5 Value of X Probability 1/8 3/8 3/8 1/8

Mean of a continuous random variable
The probability distribution of continuous random variables is described by a density curve. With symmetric curves, such as the Normal curve, the mean lies at the center. Determining the exact mean of a distribution with a skewed density curve is more complex.

** Law of large numbers As the number of randomly drawn observations (n) in a sample increases, the mean of the sample (x-bar) gets closer and closer to the population mean m. This is the law of large numbers. It is valid for any population. Note: Humans typically expect predictability over a few random observations, but this is wrong! (e.g. flipping a coin 10 times will not typically yield 0.5; filip a million times and you’ll get extremely close to 0.5)The law of large numbers only applies to really large numbers.

Standard Deviation (variance) of a random variable
The standard deviation / variance are the measures of spread that accompany the choice of the mean to measure center. Recall that SD is simply the square root of the variance. The standard deviation of a random variable is a weighted average of the deviations of the variable X from its mean. Each outcome is weighted by its probability in order to take into account outcomes that are not equally likely. Yes, this is the same variance and sd from before. However, when calculating these values in a probability situation, we need to factor in the different probabilities of each possible outcome. KP: We need to take each possible outcome and assign a weight to it according to its probability. Unlike dice rolls / coin flips, many cases to NOT have outcomes of identical probability.

Standard Deviation of a discrete random variable
For a discrete random variable X with the probability distribution shown and mean µX, the sd of X is found by multiplying each deviation by its probability and then adding all the products. This formula shows variance: A basketball player shoots three free throws. The random variable X is the number of baskets successfully made. µX = 1.5. ** WEIGH: multiply each random variable by its probability Value of X Probability 1/8 3/8 3/8 1/8 The variance σ2 of X is σ2 = 1/8*(0−1.5)2 + 3/8*(1−1.5)2 + 3/8*(2−1.5)2 + 1/8*(3−1.5)2 = 2*(1/8*9/4) + 2*(3/8*1/4) = 24/32 = 3/4 = .75

Practice on your own: 4.32/4.33 p. 267 and 269 (p.278 & 280 in 6th ed)
Linda is a sales associate at a large auto dealership. At her comission rate of 25% of gross profit on each vehicle she sells, Linda expects to earn $350 for each car sold and $400 for each truck or SUV sold. Linda motivates herself by using probability estimates of her sales. For a sunny Saturday in April, she estimates her car sales as follows: Cars Sold 1 2 3 Prob 0.3 0.4 0.2 0.1 Trucks Sold 1 2 3 Prob 0.4 0.5 0.1 What is Linda’s mean expected income? What is the spread of her expected income?

Objectives (IPS Chapter 4.5)
Generalized probability rules General addition rules Conditional probability General multiplication rules Tree diagrams Bayes’s rule

General addition rule General addition rule for any two events A and B: The probability that A occurs, B occurs, or both events occur is: P(A or B) = P(A) + P(B) – P(A and B) Including non-disjoint events! What is the probability of randomly drawing either an ace or a heart from a deck of 52 playing cards? There are 4 aces in the pack and 13 hearts. However, 1 card is both an ace and a heart. If you’re not careful, you’ll count it twice. So you need to subtract it once. Thus: P(ace or heart) = P(ace) + P(heart) – P(ace and heart) = 4/ /52 - 1/52 = 16/52 ≈ .3

Eg: A die roll: P(even or >2)?
No, because we count a 4 and 6 two times. (Draw a Venn diagram of dice). We must subtract the number of times they show up together: 4 (even and >2) and also 6 (even and >2). In other words, 2/6 times, they are together. So: 7/6 – 2/6 = 5/6

Another example: In a group of 100 athletes, 30% are basketball players. In that same group, 25% are over 6-feet tall. P(Basketball Player) = 0.3 P(>6 feet tall) = 0.25 What is P(BB or >6 feet)? If we count , we will get an artificially high number. This is because of the 25 people who are 6 feet, many are basketball players as well. So we will have counted those people TWICE. In other words, P(BB or 6’): It is NOT P(Basketball Player) + P(>6-feet tall) The reason is that these events are NOT disjoint. That is, it is possible for any member of the population to be BOTH a basketball player and also 6 feet tall. The addition rule that we have discussed will NOT give a valid answer in this case. To get an accurate result, we have to factor in that group that we counted twice, namely, those people who are BB and 6 feet. = P(BB) + P(6’) – P(BB and 6’)

* More than one non-disjoint event
EXAMPLE: What is the probability that a card from a deck is either a King or a Queen or a Diamond? = P(King) + P(Queen) + P(Diamond) – all possible non-disjoint events = P(King) + P(Queen) + P(Diamond) – P(King and Diamond) – P(Queen and Diamond) = 4/ / / – / – /52 = 19 / 52 = 0.365

Conditional probability
Conditional probabilities reflect how the probability of an event can change if we know that some other event has occurred. Example: The probability that a cloudy day will result in rain is different if you live in Los Angeles than if you live in Seattle. Every single day, our brains calculate conditional probabilities, updating our “degree of belief” with each new piece of evidence. Notation: The conditional probability of event B “given” event A is: Note: We assume that P(A) ≠ 0 Spoken as: “Probability of B given A”

P(A|B) is not the same as P(B|A)
P(Rain | Seattle) is not the same thing as P(Seattle | Rain)

General multiplication rule (“And”)
Probability that any two events, A and B, both occur: P(A and B) = P(A) * P(B|A) This is the general multiplication rule. (i.e. NOT limited to independent events) Recall that if A and B are independent, then P(A and B) = P(A) * P(B) (A and B are independent when they have no influence on each other’s occurrence.) Example: What is the probability of randomly drawing either an ace or heart from a deck of 52 playing cards? Answer: Use the addition rule What is the probability of randomly drawing either an ace and a heart from a deck of 52 playing cards? P(ace and heart) = P(ace)* P(heart | ace) = (4/52)*(1/4) = 1/52

Sometimes it may help to switch ‘em around
Recall: P(A and B) = P(A) * P(B | A) Sometimes when trying to figure out P(A and B), you can’t figure out P(B | A). Sometimes though, you do know the other ‘side’: P(A | B). Now observe that P(A and B) is the same thing as saying P(B and A). Therefore, you can say: P(B and A) = P(B) * P(A | B) EXAMPLE: Suppose you visit Seattle 8 weekends out of the year. P(Seattle) = 8/52. You also know that it rains in Seattle 38 weekends out of the year: P(rain | Seattle) = 38/52. What is the probability that you will be in Seattle this weekend and that it will be raining? P(Rain and Seattle) = P(Rain) * P(Seattle | rain)  awkward – in fact, not possible since we don’t know some of the probabilities required here such as P(Rain). P(Seattle and Rain) = P(Seattle) * P(Rain | Seattle)  much easier

Example – 4.43, p.283 Slim is playing poker and holds 3 diamonds. He hopes to draw two diamonds in a row in order to get a flush (all cards of the same suit). Of all the cards showing (those in Slim’s hand and those upturned on the table), he sees 11 cards. 4 of which are diamonds. So 9 of the 41 remaining unseen cards must be diamonds. What is the probability that Slim will draw his two diamonds? P(first card is a diamond) = 9/41 P(second card diamond | given first card diamond) = 8/40 9/41 * 8/40 = 0.044

Conditional probability can be applied to more than two events
What is the probability that a person chosen at random off the street is: Italian, Male and plays the lute? A = probability of being Italian B = male C = plays lute P(A and B and C) = P(A) * P(B | A) * P(C | A and B) Key Point: It can get a bit unweildy, but if you have the data, it is possible to do these calculations.

Probability trees (skip)
Conditional probabilities can get complex, and it is often a good strategy to build a probability tree that represents all possible outcomes graphically and assigns conditional probabilities to subsets of events. 0.47 Internet user Tree diagram for chat room habits of three adult age groups: 18-29, 30-49, >=50. P(C and A1) = P(A1)*P(C|A1) = 0.29*0.47 P(chatting) = = 0.252 About 25% of all adult Internet users visit chat rooms.

Breast cancer screening
If a woman in her 20s gets screened for breast cancer and receives a positive test result, what is the probability that she does have breast cancer? Cancer No cancer Mammography Positive Negative Disease incidence Diagnosis sensitivity Diagnosis specificity False negative False positive 0.0004 0.9996 0.8 0.2 0.1 0.9 Incidence of breast cancer among women ages 20–30 Mammography performance She could either have a positive test and have breast cancer or have a positive test but not have cancer (false positive).

Cancer No cancer Mammography Positive Negative Disease incidence Diagnosis sensitivity Diagnosis specificity False negative False positive 0.8 0.0004 0.2 0.1 0.9996 Incidence of breast cancer among women ages 20–30 0.9 Mammography performance Possible outcomes given the positive diagnosis: positive test and breast cancer or positive test but no cancer (false positive). This value is called the positive predictive value, or PV+. It is an important piece of information but, unfortunately, is rarely communicated to patients.

Bayes’s rule An important application of conditional probabilities is Bayes’s rule. It is the foundation of many modern statistical applications beyond the scope of this course. * If a sample space is decomposed in k disjoint events, A1, A2, … , Ak — none with a null probability but P(A1) + P(A2) + … + P(Ak) = 1, * And if C is any other event such that P(C) is not 0 or 1, then: However, it is often intuitively much easier to work out answers with a probability tree than with these lengthy formulas.

If a woman in her 20s gets screened for breast cancer and receives a positive test result, what is the probability that she does have breast cancer? Cancer No cancer Mammography Positive Negative Disease incidence Diagnosis sensitivity Diagnosis specificity False negative False positive 0.8 0.0004 0.2 0.1 0.9996 Incidence of breast cancer among women ages 20–30 0.9 Mammography performance This time, we use Bayes’s rule: A1 is cancer, A2 is no cancer, C is a positive test result.

Probability: The Study of Randomness Randomness and Probability Models

Similar presentations

Presentation on theme: "Probability: The Study of Randomness Randomness and Probability Models"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Probability: The Study of Randomness Randomness and Probability Models

Similar presentations

Presentation on theme: "Probability: The Study of Randomness Randomness and Probability Models"— Presentation transcript:

Similar presentations

About project

Feedback