The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 5 Probability: What Are the Chances? 5.1 Randomness, Probability, and Simulation
Learning Objectives After this section, you should be able to: INTERPRET probability as a long-run relative frequency. USE simulation to MODEL chance behavior.
What is randomness? Pick a number from 1 to 4. What did you pick? Almost 75% of people will pick 3. About 20% pick 2 or 4. Only about 5% choose 1!
Give an example of a false positive: Give an example of a false negative:
The Idea of Probability Chance behavior is unpredictable in the _________, but has a regular and ____________ in the long run. The law of large numbers says that if we observe more and more ________ of any chance process, the proportion of times that a specific outcome occurs approaches a single value. The probability of any outcome of a chance process is a number between _________ that describes the proportion of times the outcome would occur in a very _____ series of repetitions.
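The law of large numbers can be watched in action with a short simulation. The sketch below is only an illustration, assuming a fair coin modeled by Python's pseudorandom generator; the function name running_proportion_of_heads and the seed value are made up for this example.

```python
import random

def running_proportion_of_heads(num_flips, seed=None):
    """Flip a simulated fair coin num_flips times.

    Returns a list whose entry i is the proportion of heads
    among the first i + 1 flips.
    """
    rng = random.Random(seed)  # seeded so the run can be repeated
    heads = 0
    proportions = []
    for flip in range(1, num_flips + 1):
        if rng.random() < 0.5:  # heads with probability 0.5 (assumed fair coin)
            heads += 1
        proportions.append(heads / flip)
    return proportions

proportions = running_proportion_of_heads(100_000, seed=1)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7,} flips: proportion of heads = {proportions[n - 1]:.4f}")
```

The early proportions can wander well away from 0.5, while the later ones should settle close to it. That is the "unpredictable in the short run, regular in the long run" pattern described on this slide.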
Suppose that 4 friends get together to study at Tim’s house for their next test in AP Statistics. When they go for a snack in the kitchen, Tim’s three-year-old brother makes a tower using their textbooks. Unfortunately, none of the students wrote his name in the book, so when they leave each student takes one of the books at random. When the students returned the books at the end of the year and the clerk scanned their barcodes, the students were surprised that none of the four had their own book. How likely is it that none of the four students ended up with the correct book?
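One way to estimate this probability is to simulate the book mix-up many times. The sketch below assumes every way of handing the four books back is equally likely; the function names and the number of repetitions are choices made for this illustration, not part of the original activity.

```python
import random

def nobody_gets_own_book(num_students, rng):
    """Hand the books out at random; return True if no student gets his own book."""
    books = list(range(num_students))  # book i belongs to student i
    rng.shuffle(books)                 # books[s] is the book student s takes home
    return all(books[s] != s for s in range(num_students))

def estimate_no_match_probability(num_students=4, repetitions=10_000, seed=None):
    """Estimate P(no student ends up with his own book) by repeated simulation."""
    rng = random.Random(seed)
    hits = sum(nobody_gets_own_book(num_students, rng) for _ in range(repetitions))
    return hits / repetitions

estimate = estimate_no_match_probability(seed=2)
print(f"Estimated P(no student gets his own book) = {estimate:.3f}")
```

For comparison, listing all 24 equally likely ways to hand out four books shows that 9 of them leave every student with the wrong book, so the exact probability is 9/24 = 0.375. The simulated estimate should land near that value.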
Another way to interpret the probability of an outcome is as its predicted long-run relative frequency. For example, if we flip a fair coin many times, we expect the proportion of heads to be about 0.5. BUT each flip is random and independent: its result does not depend on any previous flip or set of flips.
Horse race simulation: We are using the sum of the numbers on a roll of two dice to simulate horses moving around a track. You can choose to be horse #2, 3, 4, ..., 12. Which number would you choose? Why?
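To see which horse tends to move most often, we can roll two simulated dice many times and tally the sums. This is a rough sketch that assumes two fair six-sided dice and that each roll advances only the horse whose number matches the sum; the function name simulate_two_dice_sums is hypothetical.

```python
import random
from collections import Counter

def simulate_two_dice_sums(num_rolls, seed=None):
    """Roll two fair six-sided dice num_rolls times and count each sum."""
    rng = random.Random(seed)
    return Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(num_rolls))

num_rolls = 36_000
counts = simulate_two_dice_sums(num_rolls, seed=3)
for total in range(2, 13):
    # Relative frequency with which each horse would get to move
    print(f"horse {total:>2}: {counts[total] / num_rolls:.3f}")
```

Sums in the middle of the range can be made in more ways than sums at the ends (for example, 7 can be rolled six ways but 2 only one way), so the middle horses should move more often in the long run.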
Myths About Randomness The idea of probability seems straightforward. However, there are some myths of chance behavior we must address. The myth of short-run regularity: The idea of probability is that randomness is predictable in the long run. Our intuition tries to tell us random phenomena should also be predictable in the short run. However, probability does not allow us to make short-run predictions. The myth of the “law of averages”: Probability tells us random behavior evens out in the long run. Future outcomes are not affected by past behavior. That is, past outcomes do not influence the likelihood of individual outcomes occurring in the future.
What are some myths about randomness? Myth: Random events are predictable in the short run. Myth: A "hot hand" indicates that a streak is likely to continue. Myth: The "law of averages" says a streak makes other outcomes more likely. Truth: Random events ARE predictable in the long run. Truth: Coins, dice, cards, etc. have no memories. The law of large numbers describes only the long run.
Imagine you are flipping a coin. Write down the results of 50 imaginary flips (e.g. HTTHT…): Use technology to simulate: (Write down the steps and results here) What is the longest run in each set?
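One possible way to do the technology step is sketched below in Python. It assumes a fair coin and uses made-up helper names (simulate_flips, longest_run); a calculator or other software would work just as well.

```python
import random
from itertools import groupby

def simulate_flips(num_flips=50, seed=None):
    """Return a string of simulated fair-coin flips, e.g. 'HTTHT...'."""
    rng = random.Random(seed)
    return "".join(rng.choice("HT") for _ in range(num_flips))

def longest_run(flips):
    """Length of the longest streak of identical consecutive results."""
    # groupby splits the sequence into blocks of equal neighbors
    return max((len(list(block)) for _, block in groupby(flips)), default=0)

flips = simulate_flips(50, seed=4)
print(flips)
print("longest run:", longest_run(flips))
```

The same longest_run function can be applied to your imaginary flips by typing them in as a string, for example longest_run("HTTHT"). Comparing the two results helps show whether made-up sequences avoid the long streaks that real randomness tends to produce.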
HW page 300 (1, 3, 8, 9, 11, 37, 38)
"Dear Abby, My husband and I just had our 8th child. Another girl, and I am really one disappointed woman. I suppose I should thank God she was healthy, but Abby, this one was supposed to have been a boy. Even the doctor told me that the law of averages was in our favor 100 to one." Abigail Van Buren, 1974
Simulation The __________ of chance behavior, based on a model that accurately reflects the situation, is called a simulation. Performing a Simulation State: Ask a question of interest about some chance process. Plan: Describe how to use a chance device to imitate one repetition of the process. Tell what you will record at the end of each repetition. Do: Perform many repetitions of the simulation. Conclude: Use the results of your simulation to answer the question of interest. We can use physical devices, random numbers (e.g. Table D), and technology to perform simulations.
Example: Simulations with technology In an attempt to increase sales, a breakfast cereal company decides to offer a NASCAR promotion. Each box of cereal will contain a collectible card featuring one of these NASCAR drivers: Jeff Gordon, Dale Earnhardt, Jr., Tony Stewart, Danica Patrick, or Jimmie Johnson. The company says that each of the 5 cards is equally likely to appear in any box of cereal. A NASCAR fan decides to keep buying boxes of the cereal until she has all 5 drivers’ cards. She is surprised when it takes her 23 boxes to get the full set of cards. Should she be surprised? Problem: What is the probability that it will take 23 or more boxes to get a full set of 5 NASCAR collectible cards?
Example: Simulations with technology Plan: We need five numbers to represent the five possible cards. Let’s let 1 = Jeff Gordon, 2 = Dale Earnhardt, Jr., 3 = Tony Stewart, 4 = Danica Patrick, and 5 = Jimmie Johnson. We’ll use randInt(1,5) to simulate buying one box of cereal and looking at which card is inside. Because we want a full set of cards, we’ll keep pressing Enter until we get all five of the labels from 1 to 5. We’ll record the number of boxes that we had to open.
Example: Simulations with technology Conclude: We never had to buy more than 22 boxes to get the full set of NASCAR drivers’ cards in 50 repetitions of our simulation. So our estimate of the probability that it takes 23 or more boxes to get a full set is roughly 0. The NASCAR fan should be surprised about how many boxes she had to buy.
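The same plan can be carried out with many more repetitions than is practical by hand. The sketch below is an illustration in Python rather than the calculator steps from the example: rng.randint(1, 5) plays the role of randInt(1,5), and the function names, seed, and choice of 20,000 repetitions are assumptions made for this example.

```python
import random

def boxes_until_full_set(num_cards, rng):
    """One repetition: open boxes until every card label 1..num_cards has appeared."""
    seen = set()
    boxes = 0
    while len(seen) < num_cards:
        seen.add(rng.randint(1, num_cards))  # one box, each card equally likely
        boxes += 1
    return boxes

def estimate_prob_at_least(threshold=23, num_cards=5, repetitions=20_000, seed=None):
    """Estimate P(it takes threshold or more boxes to complete the set)."""
    rng = random.Random(seed)
    count = sum(boxes_until_full_set(num_cards, rng) >= threshold
                for _ in range(repetitions))
    return count / repetitions

estimate = estimate_prob_at_least(seed=5)
print(f"Estimated P(23 or more boxes) = {estimate:.4f}")
```

With this many repetitions the estimate is usually small but not exactly 0, which is consistent with the conclusion that needing 23 boxes would be unusual. An estimate of exactly 0 from only 50 repetitions means the event is rare, not impossible.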
Suppose I want to choose a simple random sample of size 6 from a group of 60 seniors and 30 juniors. To do this, I write each person’s name on an equally sized piece of paper and mix them up in a large grocery bag. Just as I am about to select the first name, a thoughtful student suggests that I should stratify by class. I agree, and we decide it would be appropriate to select 4 seniors and 2 juniors. However, since I already mixed up the names, I don’t want to have to separate them all again. Can I just draw names until I get 4 seniors and 2 juniors? Design and carry out a simulation using Table D to estimate the probability that you must draw 8 or more names to get 4 seniors and 2 juniors.
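After working through the simulation with Table D, the result can be checked with technology. The sketch below swaps Table D for Python's pseudorandom generator, and it assumes that names are drawn without replacement and that drawing stops as soon as at least 4 seniors and at least 2 juniors have come out of the bag. The function names and the repetition count are hypothetical choices for this illustration.

```python
import random

def draws_needed(rng, num_seniors=60, num_juniors=30, need_seniors=4, need_juniors=2):
    """One repetition: draw names without replacement until both quotas are met."""
    bag = ["S"] * num_seniors + ["J"] * num_juniors
    rng.shuffle(bag)  # a shuffled bag gives the order in which names are drawn
    seniors = juniors = 0
    for draws, name in enumerate(bag, start=1):
        if name == "S":
            seniors += 1
        else:
            juniors += 1
        if seniors >= need_seniors and juniors >= need_juniors:
            return draws
    raise ValueError("the bag does not contain enough seniors or juniors")

def estimate_prob_at_least(threshold=8, repetitions=20_000, seed=None):
    """Estimate P(threshold or more names must be drawn)."""
    rng = random.Random(seed)
    count = sum(draws_needed(rng) >= threshold for _ in range(repetitions))
    return count / repetitions

estimate = estimate_prob_at_least(seed=6)
print(f"Estimated P(8 or more draws) = {estimate:.3f}")
```

A Table D simulation follows the same logic: assign two-digit labels to the 90 students, skip unused and repeated labels, and count how many distinct labels are read before the quotas are met.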
What are some common errors when using a table of random numbers? Answer: Every label needs to be the same length. If you are not using all of the labels of a certain length, state that the extra labels will be ignored. If you are sampling without replacement, state that you will ignore any repeated labels.
Section Summary In this section, we learned how to… INTERPRET probability as a long-run relative frequency. USE simulation to MODEL chance behavior.