The Statistical Imagination, Chapter 6: Probability Theory and the Normal Probability Distribution
Probability Theory. Probability theory is the analysis and understanding of chance occurrences.
What is a Probability? A probability is a specification of how frequently a particular event of interest is likely to occur over a large number of trials. The probability of success is the probability of the event occurring; the probability of failure is the probability of the event not occurring, so the two always sum to 1.
The Basic Formula for Calculating a Probability. p [of success] = (the number of successes) / (the number of trials or possible outcomes), where p [of success] is the probability of “the event of interest”.
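The basic formula can be sketched in a few lines of Python. The function name `probability` and the choice of event (drawing an ace from a standard 52-card deck) are illustrative assumptions, not part of the chapter.

```python
from fractions import Fraction


def probability(successes: int, outcomes: int) -> Fraction:
    """Return p [of success] = successes / possible outcomes as an exact fraction."""
    if outcomes <= 0:
        raise ValueError("the number of possible outcomes must be positive")
    if not 0 <= successes <= outcomes:
        raise ValueError("successes must lie between 0 and the number of outcomes")
    return Fraction(successes, outcomes)


# Illustrative event (an assumption): drawing an ace from a 52-card deck with 4 aces.
p_success = probability(4, 52)
p_failure = 1 - p_success  # probability of the event not occurring

print(f"p [ace] = {p_success} (about {float(p_success):.4f})")
print(f"p [not ace] = {p_failure} (about {float(p_failure):.4f})")
```

Exact fractions keep the numerator (successes) and denominator (possible outcomes) visible, which mirrors how the formula is stated.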
Basic Rules of Probability Theory. There are five basic rules of probability that underlie all calculations of probabilities.
Probability Rule 1: Probabilities Always Range Between 0 and 1. Since probabilities are proportions of a total number of possible events, the lower limit is a proportion of zero (or a percentage of 0%) and the upper limit is a proportion of 1.00 (or 100%). A probability of zero means the event cannot happen, e.g., p [of an individual making a free-standing leap of 30 feet into the air] = 0. A probability of 1.00 (or 100%) means that an event will absolutely happen, e.g., p [that a raw egg will break if struck with a hammer] = 1.00.
Probability Rule 2: The Addition Rule for Alternative Events. An alternative event is one in which more than one outcome counts as a success. The addition rule states that the probability of alternative events is equal to the sum of the probabilities of the individual events. For example, for a deck of 52 playing cards: p [ace or jack] = p [ace] + p [jack] = 4/52 + 4/52 = 8/52. The word “or” is a cue to add probabilities; substitute a plus sign for the word “or”.
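As a small check of the addition rule, the ace-or-jack example can be worked with exact fractions. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

DECK_SIZE = 52  # standard deck with 4 aces and 4 jacks

p_ace = Fraction(4, DECK_SIZE)
p_jack = Fraction(4, DECK_SIZE)

# "or" is the cue to add: no single card is both an ace and a jack,
# so there is no double counting to remove here.
p_ace_or_jack = p_ace + p_jack

print(f"p [ace or jack] = {p_ace_or_jack} (about {float(p_ace_or_jack):.4f})")
```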
Probability Rule 3: Adjust for Joint Occurrences. Sometimes a single outcome is successful in more than one way. An example: What is the probability that a randomly selected student in the class is male or single? A single male fits both criteria. We call “single male” a joint occurrence: an event that double counts success. When calculating the probability of alternative events, search for joint occurrences and subtract the double counts, e.g., p [male or single] = p [male] + p [single] - p [single male].
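The adjustment for joint occurrences might look like the following sketch. The class counts are entirely hypothetical (the chapter gives no numbers for this example); they are chosen only so that the double counting is easy to see.

```python
from fractions import Fraction

# Hypothetical class counts, made up for illustration (not from the source).
class_size = 40
males = 18
singles = 25
single_males = 10  # the joint occurrence: counted among males AND among singles

p_male = Fraction(males, class_size)
p_single = Fraction(singles, class_size)
p_single_male = Fraction(single_males, class_size)

# Adding without adjustment counts every single male twice.
naive_sum = p_male + p_single

# Rule 3: subtract the double counts.
p_male_or_single = p_male + p_single - p_single_male

print(f"unadjusted sum   = {naive_sum}")
print(f"p [male or single] = {p_male_or_single}")
```

With these made-up counts the unadjusted sum exceeds 1, which Rule 1 says is impossible for a probability; that is a useful signal that a joint occurrence has been overlooked.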
Probability Rule 4: The Multiplication Rule. The multiplication rule states that the probability of a compound event is equal to the product of the probabilities of the separate parts of the event. A compound event is a multiple-part event, such as flipping a coin twice. E.g., p [queen then jack] = p [queen] • p [jack]. By multiplying, we obtain the number of successes in the numerator and the number of possible outcomes in the denominator.
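To see why multiplying yields successes over possible outcomes, the coin-flip compound event can be computed two ways: by the multiplication rule and by listing every outcome. This sketch assumes a fair coin and picks "heads then heads" as the event of interest; both choices are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumption: a fair coin, so p [heads] = 1/2 on each flip.
p_heads = Fraction(1, 2)

# Multiplication rule: p [heads then heads] = p [heads] * p [heads]
p_two_heads = p_heads * p_heads

# Cross-check by enumerating all two-flip outcomes: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
successes = [o for o in outcomes if o == ("H", "H")]
p_by_counting = Fraction(len(successes), len(outcomes))

print(f"multiplication rule: {p_two_heads}")
print(f"counting outcomes:   {p_by_counting}")
```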
Probability Rule 5: Replacement and Compound Events. With compound events we must stipulate whether replacement is to take place. For example, in drawing a queen and then a jack from a deck of cards, are we to replace the queen before drawing for the jack? The probability “with replacement” will differ from the probability “without replacement”, because without replacement only 51 cards remain for the second draw.
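The queen-then-jack example can be worked both ways to show how replacement changes the answer. This is a minimal sketch assuming a standard 52-card deck; the constant names are illustrative.

```python
from fractions import Fraction

DECK_SIZE = 52
QUEENS = 4
JACKS = 4

# With replacement: the queen goes back, so the second draw is again from 52 cards.
p_with_replacement = Fraction(QUEENS, DECK_SIZE) * Fraction(JACKS, DECK_SIZE)

# Without replacement: the queen stays out, so the second draw is from 51 cards.
p_without_replacement = Fraction(QUEENS, DECK_SIZE) * Fraction(JACKS, DECK_SIZE - 1)

print(f"with replacement:    {p_with_replacement} (about {float(p_with_replacement):.5f})")
print(f"without replacement: {p_without_replacement} (about {float(p_without_replacement):.5f})")
```

Without replacement the probability is slightly higher, since all four jacks are still among the 51 remaining cards.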
Using the Normal Curve as a Probability Distribution. With an interval/ratio variable that is normally distributed, we can compute Z-scores and use them to determine the proportion of a population’s scores falling between any two scores in the distribution. Partitioning the normal curve refers to computing Z-scores and using them to determine any area under the curve.
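Partitioning the curve can be sketched with Python's standard library in place of a printed table. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration, and the helper names `z_score` and `proportion_between` are assumptions of this sketch. It uses the usual definition Z = (X - mean) / standard deviation.

```python
from statistics import NormalDist

# Hypothetical population parameters (not from the source).
MEAN = 100.0
SD = 15.0

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def proportion_between(x_low: float, x_high: float, mean: float, sd: float) -> float:
    """Area under the normal curve between two raw scores."""
    z_low = z_score(x_low, mean, sd)
    z_high = z_score(x_high, mean, sd)
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)


p = proportion_between(85, 115, MEAN, SD)
print(f"Z for X = 85:  {z_score(85, MEAN, SD):+.2f}")
print(f"Z for X = 115: {z_score(115, MEAN, SD):+.2f}")
print(f"proportion of scores between 85 and 115: {p:.4f}")
```

Here the two scores sit one standard deviation either side of the mean, so the result is the familiar figure of roughly 68% of cases.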
Three Ways to Interpret the Symbol p. (1) A distributional interpretation describes the result in relation to the distribution of scores in a population or sample. (2) A graphical interpretation describes the proportion of the area under a normal curve. (3) A probabilistic interpretation describes the probability of a single random drawing of a subject from this population.
Procedure for Computing Areas Under the Normal Curve. (1) Draw and label the normal curve, stipulating values of X and corresponding values of Z. (2) Identify and shade the target area ( p ) under the curve. (3) Compute Z-scores. (4) Locate a Z-score in column A of the normal curve table. (5) Obtain the probability ( p ) from either column B or column C.
Information Provided in the Normal Curve Table. Column A contains Z-scores for one side of the curve or the other. Column B provides areas under the curve ( p ) from the mean of X to the Z-score in column A. Column C provides areas under the curve from the Z-score in column A out into the tail.
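The table's columns B and C can be approximated in code, which may help when checking a hand lookup. The function names `column_b` and `column_c` are labels invented for this sketch; they mirror the column descriptions but are not taken from any library.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def column_b(z: float) -> float:
    """Area from the mean to the Z-score (either side of the curve)."""
    return _STANDARD_NORMAL.cdf(abs(z)) - 0.5


def column_c(z: float) -> float:
    """Area from the Z-score out into the tail (either side of the curve)."""
    return 1.0 - _STANDARD_NORMAL.cdf(abs(z))


z = 1.96  # the value looked up in column A
print(f"Z = {z}: column B = {column_b(z):.4f}, column C = {column_c(z):.4f}")
```

Because the curve is symmetric, the sketch takes the absolute value of Z, and for any Z the two columns add up to 0.5, the area of one half of the curve.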
Critical Z-scores. Critical Z-scores are ones of great importance in statistical procedures and are used very frequently. Some widely used critical Z-scores are 1.64, 1.96, 2.33, 2.58, 3.08, and 3.30.
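One way to get a feel for these critical values is to compute the area each one cuts off in the tail. The chapter does not state here which significance levels the values are paired with, so this sketch only reports the one-tail and two-tail areas and leaves that pairing to the reader.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

# Critical Z-scores as listed in the text.
CRITICAL_Z = [1.64, 1.96, 2.33, 2.58, 3.08, 3.30]

# Area beyond each Z-score in a single tail.
one_tail = {z: 1.0 - STANDARD_NORMAL.cdf(z) for z in CRITICAL_Z}

for z in CRITICAL_Z:
    print(f"Z = {z:.2f}: one tail = {one_tail[z]:.4f}, both tails = {2 * one_tail[z]:.4f}")
```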
Percentiles and the Normal Curve. A percentile rank is the percentage of a sample or population that falls at or below a specified value of a variable. If a distribution of scores is normal in shape, then the normal curve and Z-scores can be used to quickly calculate percentile ranks.
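A percentile rank for a normally distributed variable can be sketched as the area at or below the score's Z-score. The mean of 100, standard deviation of 15, and score of 130 are hypothetical values, and `percentile_rank` is a name invented for this example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def percentile_rank(x: float, mean: float, sd: float) -> float:
    """Percentage of a normal population falling at or below x."""
    z = (x - mean) / sd
    return 100.0 * STANDARD_NORMAL.cdf(z)


# Hypothetical values (not from the source): mean 100, sd 15, score 130.
rank = percentile_rank(130, mean=100, sd=15)
print(f"a score of 130 has a percentile rank of about {rank:.1f}")
```

A score exactly at the mean has Z = 0 and therefore a percentile rank of 50, which is a quick sanity check on any calculation of this kind.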