Data Analysis and Statistical Software I (323-21-403)
Quarter: Autumn 02/03
Daniela Stan, PhD
Course homepage: http://facweb.cs.depaul.edu/Dstan/csc323
Office hours (no appointment needed):
M, 3:00pm - 3:45pm at LOOP, CST 471
W, 3:00pm - 3:45pm at LOOP, CST 471
2/18/2019 Daniela Stan - CSC323
Outline
Chapter 3: Probability – The Study of Randomness
Randomness
Probability Models
Random Phenomenon
1. Toss a fair coin: sometimes you get heads, sometimes you get tails.
2. Roll a die: the die can land on any of its 6 faces.
3. Waiting time at the dentist: sometimes you wait less than 10 minutes and sometimes you wait longer.
In a random phenomenon, individual outcomes are uncertain.
Random Phenomenon (cont.)
There is still a regular pattern in the phenomenon, which is discovered only after many repetitions.
Tossing a coin: the proportion of heads in n tosses changes as we make more tosses. Eventually it approaches 0.5.
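The long-run pattern can be illustrated with a small simulation. This is only a sketch: the function name `proportion_of_heads` and the seed are assumptions made for illustration, and the coin is modeled with Python's pseudo-random generator rather than a physical coin.

```python
import random

def proportion_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair-coin tosses and return the fraction that land heads."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The proportion wanders for small n and settles near 0.5 as n grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, proportion_of_heads(n))
```

Individual runs differ with the seed; only the tendency toward 0.5 for large n is the point.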
Probability or Chance
The chance or probability of a certain outcome is the percentage of times the outcome is expected to happen when the process is repeated over and over again, independently and under the same conditions.
Tossing a coin: what is the chance of getting a head? It is 1 in 2, that is 50%.
Rolling a die: what is the chance of getting a 3? It is 1 in 6, that is 16.7%.
Probability Models
The sample space S of a random phenomenon is the set of all possible outcomes.
An event is an outcome or a set of outcomes of a random phenomenon (a subset of the sample space).
Chances are between 0% and 100%; equivalently, probabilities are between 0 and 1.
The impossible event never occurs, hence has 0% chance to happen (probability = 0).
The certain event happens every time, hence has 100% chance to happen (probability = 1).
Example: all possible outcomes together must have probability 1: P(S) = 1.
Computing Chances
To calculate the chance or probability of an event when all outcomes are equally likely:
1. Count all the possible outcomes of the random process.
2. Count the outcomes that are favorable to the event.
3. Calculate the chance as the ratio
chance = (# favorable outcomes) / (# all possible outcomes)
Example: one deck of cards is shuffled and the top card is placed face down on the table. What is the chance that the card is a king of hearts? How many cards are in a deck? How many kings of hearts?
Computing Chances (cont.)
What's the probability of selecting a female student in this class?
# of students = 22, # of female students = 5
What's the probability of choosing a red M&M from a bag containing 10 red, 5 blue and 3 yellow candies?
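The counting recipe (favorable outcomes divided by all possible outcomes) can be sketched in a few lines. The helper name `chance` is an assumption for illustration; exact fractions are used so no rounding is involved, and all outcomes are assumed equally likely.

```python
from fractions import Fraction

def chance(favorable, possible):
    """Chance of an event when all outcomes are equally likely."""
    return Fraction(favorable, possible)

king_of_hearts = chance(1, 52)        # one king of hearts among 52 cards
female_student = chance(5, 22)        # 5 female students among 22
red_candy = chance(10, 10 + 5 + 3)    # 10 red candies among 18

for name, p in [("king of hearts", king_of_hearts),
                ("female student", female_student),
                ("red candy", red_candy)]:
    print(f"{name}: {p} = {float(p):.3f}")
```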
Throwing a pair of dice
There are 36 ways of throwing two dice. What is the probability of getting a 7 as the sum of the two faces? Count the favorable outcomes and divide by 36.
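As a sketch of the counting argument, the 36 outcomes can be listed explicitly and the ones that sum to 7 counted. The variable names are illustrative, and "getting a 7" is taken here to mean that the two faces add up to 7.

```python
from fractions import Fraction
from itertools import product

# Every ordered pair (first die, second die): 6 x 6 = 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes favorable to the event "the two faces sum to 7".
favorable = [pair for pair in outcomes if sum(pair) == 7]

p_seven = Fraction(len(favorable), len(outcomes))
print(favorable)   # (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)
print(p_seven)     # 1/6
```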
The Complement Rule
What is the chance that the first card is not a king of hearts? How many cards are in a deck? How many cards are not the king of hearts?
The answer is 51/52 = 1 - 1/52, that is, 1 minus the chance that the first card is a king of hearts; this is an example of the complement rule.
The chance of something happening is equal to 100% minus the chance of the opposite event. This is useful if the chance of the "opposite" event is easier to compute.
The Complement Rule (cont.)
The complement rule states that P(A^c) = 1 - P(A), where A^c (the complement of A) is the event that A does not occur.
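The complement rule can be checked on the card example by counting both the event and its opposite directly. This is a sketch; the deck representation (rank and suit tuples) is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))          # 52 (rank, suit) pairs

king_of_hearts = ("K", "hearts")
p_a = Fraction(sum(card == king_of_hearts for card in deck), len(deck))
p_not_a = Fraction(sum(card != king_of_hearts for card in deck), len(deck))

print(p_a, p_not_a)            # 1/52 51/52
print(p_not_a == 1 - p_a)      # True: P(A^c) = 1 - P(A)
```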
Computing Probabilities (cont.)
A deck of cards is shuffled and the second card is placed face down on the table. What is the probability for the second card to be a king of hearts? It is equal to the chance of the first card being a king of hearts.
What is the probability of the first card being an ace? How many aces are in a deck of cards? How many cards?
Mutually Exclusive Events
If two events cannot occur together, the probability that one or the other occurs is the sum of their individual probabilities: P(A or B) = P(A) + P(B)
What's the probability of getting 2 or 3 in a roll of a die?
P(2 spot face or 3 spot face) = P(2 spot face) + P(3 spot face) = 1/6 + 1/6 = 1/3
Known Probabilities
All human blood can be typed as one of O, A, B, AB. The distribution of the types varies a bit with race. The table displays the probability of each blood type for a randomly chosen white American. The probabilities are calculated as the proportion of individuals with a given blood type in a very large sample of white Americans.
Blood Type     O      A      B      AB
Probability    0.45   0.40   0.11   ?
What's the probability that a white American has type AB blood?
Judy has type B blood. She can safely receive transfusions from people with type O and type B blood. What is the probability that a randomly chosen white American can donate blood to her?
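Both questions follow from rules already stated: the probabilities of all outcomes sum to 1, and the probabilities of mutually exclusive events add. A minimal sketch, with variable names chosen for illustration and exact fractions used to avoid floating-point rounding:

```python
from fractions import Fraction

# Known probabilities from the table; AB is the unknown entry.
known = {"O": Fraction("0.45"), "A": Fraction("0.40"), "B": Fraction("0.11")}

# All outcomes together have probability 1, so AB takes what is left.
p_ab = 1 - sum(known.values())

# Types O and B are mutually exclusive, so their probabilities add.
p_donor_for_judy = known["O"] + known["B"]

print(float(p_ab))               # 0.04
print(float(p_donor_for_judy))   # 0.56
```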
Intranet Design
An Intranet is a private Internet operating on a company's internal network. It is a convenient vehicle within the company for sharing information, documents and databases. In U.S. companies, the Intranet is designed by internal IT personnel, internal graphic artists, consultant IT personnel or consultant graphic artists. The probability of selecting a company whose Intranet was developed by a given type of professional is displayed below.
Intranet developers           Probability
Internal IT personnel         0.40
Internal graphic artists      0.20
Consultant IT personnel       0.25
Consultant graphic artists    0.15
What is the probability that a company's Intranet wasn't designed by a graphic artist?
Conditional Probabilities
You win one dollar if the second card is a queen of spades.
I) What is the chance of winning one dollar? The probability of the second card being a queen of spades is 1/52 = 0.0192.
II) You know that the first card is an ace. What is the chance of winning one dollar? There are 51 cards left in the deck, and there is just one chance out of 51 to get the queen of spades. So the chance is 1/51 = 0.0196.
A conditional probability is the probability of an event, knowing that another event has occurred.
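The two answers can be checked by listing every ordered (first card, second card) pair and counting. This is a sketch under the assumption that the two cards are dealt from a well-shuffled 52-card deck; the deck representation and names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

# All equally likely ordered pairs (first card, second card): 52 * 51 of them.
draws = list(permutations(deck, 2))
queen_of_spades = ("Q", "spades")

# I) Unconditional: second card is the queen of spades.
p_win = Fraction(sum(second == queen_of_spades for _, second in draws), len(draws))

# II) Conditional: restrict attention to draws whose first card is an ace.
first_is_ace = [(first, second) for first, second in draws if first[0] == "A"]
p_win_given_ace = Fraction(
    sum(second == queen_of_spades for _, second in first_is_ace),
    len(first_is_ace),
)

print(p_win, p_win_given_ace)   # 1/52 1/51
```

Conditioning amounts to shrinking the set of outcomes being counted to those where the known event has occurred.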
Example of conditional probability
The lab has 30 PCs; SAS is installed on 20 of them. A student wants to use SAS and chooses a PC at random. What is the probability of choosing a PC that runs SAS? The probability is 20/30.
Two students are already using two PCs and working with SAS. What is the probability of choosing, among the remaining PCs, one that runs SAS?
Pr(SAS | two students use SAS) = 18/28
Multiplication Rule
What is the probability that two students choose two computers that have SAS? Two events need to happen together:
I) the first student selects a PC that has SAS;
II) the second student selects another PC that has SAS (given that the first student is using a PC with SAS).
The chance of I is 20/30. The chance of II is conditional and is 19/29.
The probability of the two students choosing two computers with SAS is the product of I and II: chance = 20/30 × 19/29 = 43.68%
Multiplication Rule (cont.)
The chance of two events happening together is equal to the chance that the first will happen multiplied by the chance that the second will happen given that the first has happened.
P(A occurs & B occurs) = P(A occurs) × P(B occurs | A has occurred)
(the vertical bar "|" is read "given")
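As a sketch, the multiplication rule can be compared with a direct count for the lab example (30 PCs, 20 with SAS, two students choosing different PCs). The list `pcs` and the other names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Multiplication rule: P(first has SAS) * P(second has SAS | first has SAS).
p_rule = Fraction(20, 30) * Fraction(19, 29)

# Direct count over all ordered choices of two different PCs.
pcs = ["SAS"] * 20 + ["no SAS"] * 10
pairs = list(permutations(range(len(pcs)), 2))   # 30 * 29 ordered pairs
both_sas = sum(pcs[i] == "SAS" and pcs[j] == "SAS" for i, j in pairs)
p_count = Fraction(both_sas, len(pairs))

print(p_rule, p_count)            # 38/87 38/87
print(round(float(p_rule), 4))    # 0.4368
```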
Independent Events
Two events are independent if the chance of the second given the first is the same, no matter how the first turns out.
For instance, roll a die 3 times: each roll is independent of the other rolls.
Coin tosses are independent of each other: the probability of getting a head does not change from toss to toss.
Lottery drawings are independent! The probability of winning does not change from drawing to drawing. It is always very, very small.
In the Illinois lotto, which is the "luckier" of these two combinations: 1 2 3 4 5 6 or 3 56 23 67 32 1?
Independent Events (cont.)
If two events are independent, the chance of them happening together is equal to the product of their probabilities.
The chance of getting two aces (one-spot faces) in two rolls of a die is
P(1st roll = ace & 2nd roll = ace) = P(1st roll = ace) × P(2nd roll = ace) = 1/6 × 1/6 = 1/36
Telephone cable
A transatlantic telephone cable contains repeaters at regular intervals to amplify the signal. If a repeater fails, it must be replaced by fishing the cable to the surface at great expense. Each repeater has 0.999 probability of functioning without failure for 10 years. Repeaters fail independently of each other.
What is the probability that two repeaters both last for 10 years?
Pr(no failure for repeaters 1 and 2) = 0.999 × 0.999 = 0.998001
A company has 10 repeaters. What is the probability that exactly one repeater fails in 10 years?
The events are all independent, so the probability that one particular repeater fails and the other 9 do not is the product of the individual probabilities: 0.001 × 0.999^9. Any one of the 10 repeaters could be the one that fails, and these 10 cases are mutually exclusive, so
Pr(exactly one repeater fails) = 10 × 0.001 × 0.999^9 ≈ 0.0099
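The repeater calculations can be sketched numerically. One step is added to the product of individual probabilities: that product is the chance that one specified repeater fails while the other nine survive, and since any of the 10 repeaters could be the one that fails (mutually exclusive cases), the ten equal chances are summed. Variable names are illustrative.

```python
from math import comb

p_ok = 0.999          # one repeater works for 10 years
p_fail = 1 - p_ok     # one repeater fails within 10 years

# Two independent repeaters both last: multiply the probabilities.
p_both_ok = p_ok * p_ok

# One specified repeater fails and the other nine survive.
p_specific_one_fails = p_fail * p_ok ** 9

# Exactly one of the 10 fails: 10 mutually exclusive ways to pick which one.
p_exactly_one_fails = comb(10, 1) * p_specific_one_fails

print(round(p_both_ok, 6))             # 0.998001
print(round(p_exactly_one_fails, 4))   # 0.0099
```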
The Law of Averages
A coin lands heads with chance 50%, thus P(fair coin = heads) = 0.5.
If we toss a coin many times, say 1,000 times, we would expect to get 1,000/2 = 500 heads. In practice we rarely get exactly 500. We will most likely see, for example, 503 heads, 498 heads, 510 heads or 490 heads. This is because of chance variability:
Chance error = number of heads - half the number of tosses
Example
In 1,000 tosses we get 550 heads. The chance error is 550 - 500 = 50 in absolute terms.
Relative to the number of tosses, the chance error is 50/1,000 = 0.05, or 5% of the number of tosses.
Suppose 4 coins were tossed 1,600 times each. The chance error (number of heads - half the number of tosses) was plotted against the number of tosses.
[Figure: number of heads minus half the number of tosses, plotted against number of tosses]
After 400 tosses, the chance errors for the four coins were:
10 = 210 - 200, -8 = 192 - 200, -12 = 188 - 200, 3 = 203 - 200
After 1,600 tosses, the chance errors for the four coins were:
30 = 830 - 800, -26 = 774 - 800, -14 = 786 - 800, 8 = 808 - 800
For the same 4 coins, here are the chance errors expressed as a percentage of the number of tosses.
[Figure: percentage of heads minus 50%]
If a coin is tossed 400 times, the percentage of heads is 50% give or take 4%.
If a coin is tossed 1,600 times, the percentage of heads is 50% give or take 2%.
As the number of tosses increases, the chance error (the difference between the number of heads and half the number of tosses) tends to get bigger.
[Figure: number of heads minus half the number of tosses, plotted against number of tosses]
However, if we consider percentages of the number of tosses: as the number of tosses goes up, the difference between the percentage of heads and 50% tends to get smaller.
[Figure: percentage of heads minus 50%, plotted against number of tosses]
The law of averages
The law of averages is about the chance error. It says that the chance error is likely to be large in absolute value, but small relative to the number of times the process is repeated (e.g. the number of tosses).
A coin lands heads 550 times in 1,000 tosses. The chance error is
550 - 500 = 50 in absolute terms
50 / 1,000 = 5% of the number of tosses
A coin lands heads 499,000 times in 1,000,000 tosses. The chance error is
499,000 - 500,000 = -1,000, that is 1,000 in absolute terms
1,000 / 1,000,000 = 0.1% of the number of tosses
As the number of tosses gets larger, the percentage of heads gets closer to 50%.
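The two worked examples, plus a simulated coin, can be sketched as follows. The function name `chance_error` and the seed are assumptions for illustration; the simulation uses a pseudo-random coin, so the exact errors it prints depend on the seed and only the general pattern matters.

```python
import random

def chance_error(heads, tosses):
    """Return (chance error in absolute terms, chance error as % of tosses)."""
    error = heads - tosses / 2
    return error, 100 * error / tosses

# The two worked examples from the slide.
print(chance_error(550, 1_000))            # (50.0, 5.0)
print(chance_error(499_000, 1_000_000))    # (-1000.0, -0.1)

# One simulated coin followed over a growing number of tosses.
rng = random.Random(323)   # arbitrary seed so the run is repeatable
heads = 0
tossed = 0
for target in (100, 10_000, 1_000_000):
    while tossed < target:
        heads += rng.random() < 0.5   # True counts as 1 head
        tossed += 1
    error, percent = chance_error(heads, tossed)
    print(f"{tossed:>9} tosses: error = {error:+.0f}, {percent:+.3f}% of tosses")
```

The absolute error typically grows with the number of tosses while the percentage shrinks, which is the pattern the law of averages describes.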