Randomness and probability
In a random event, outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions. We define the probability of any outcome of a random phenomenon as the proportion of times the outcome would occur in a very long series of repetitions.
The sex of a newborn is random: P(Male) ≈ ??, the proportion of times a newborn is male in many, many births.
Year:
Males: 2,058 | 2,105 | 2,184 | 2,173
Females: 1,964 | 2,007 | 2,081 | 2,074
Proportion of males:
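The definition of probability as a long-run proportion can be illustrated with the birth counts in the table above. This is a minimal sketch: the year labels and the units of the counts did not survive in the table, so the columns are indexed by position only, and the helper name `proportion_male` is made up for illustration.

```python
# Birth counts copied from the table above, in column order.
# The year labels are not available, so columns are numbered 1..4.
males = [2058, 2105, 2184, 2173]
females = [1964, 2007, 2081, 2074]

def proportion_male(m, f):
    """Observed proportion of male births in one column of the table."""
    return m / (m + f)

proportions = [proportion_male(m, f) for m, f in zip(males, females)]
for i, p in enumerate(proportions, start=1):
    print(f"column {i}: proportion of males = {p:.3f}")
```

The proportions are nearly identical from column to column, which is the regularity the definition relies on.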
Probability models
Probability models mathematically describe the outcome of random processes. They consist of two parts:
1) S = Sample Space: This is a list or description of all possible outcomes of a random process. An event is a subset of the sample space.
2) A probability is assigned for each possible simple event in the sample space S.
Probability model for the sex of a newborn: S = {??}; P(Male) = ??; P(Female) = ??
Discrete vs. continuous sample spaces
Discrete variables can take on only certain values (a whole number or a descriptor).
Discrete sample space: Blood types. For a random person: S = {??}
Continuous variables can take on any values over an interval.
Continuous sample space: Cholesterol level. For a random person: S = ??
A. A couple wants three children. What are the possible sequences of boys (B) and girls (G)?
[Tree diagram: each child branches into B or G; leaves shown: BBB, BBG, BGB, BGG, …]
S = {??}
B. What is the number of days last week that a randomly selected teen exercised for at least one hour? S = {??}
C. A researcher designs a new maze for lab rats. What are the possible outcomes for the time to finish the maze (in minutes)? S = ??
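The discrete sample spaces in parts A and B are small enough to enumerate by machine. The sketch below assumes only that each child is recorded as B or G and that a week has 7 days; the variable names are illustrative. Part C is continuous (an interval of times), so it cannot be listed this way.

```python
from itertools import product

# Part A: every ordered sequence of B/G for three children (2 * 2 * 2 of them).
three_children = ["".join(seq) for seq in product("BG", repeat=3)]
print(three_children)
print(len(three_children))

# Part B: a count of days in one week can be any whole number from 0 to 7.
exercise_days = list(range(8))
print(exercise_days)
```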
Probability rules
Probabilities range from 0 (no chance) to 1 (event has to happen): For any event A, 0 ≤ P(A) ≤ 1
The probability of the sample space S must equal 1: P(sample space) = 1
The probability that an event A does not occur (not A) equals 1 minus the probability that A does occur: P(not A) = 1 – P(A)
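These three rules can be written as simple checks. The sketch below is illustrative only: `is_valid_model`, `complement`, and the numbers in `toy_model` are hypothetical and do not come from the text.

```python
import math

def is_valid_model(model):
    """Check that every probability is in [0, 1] and that they sum to 1."""
    in_range = all(0 <= p <= 1 for p in model.values())
    sums_to_one = math.isclose(sum(model.values()), 1.0)
    return in_range and sums_to_one

def complement(p_a):
    """Complement rule: P(not A) = 1 - P(A)."""
    return 1 - p_a

# Hypothetical three-outcome model, for illustration only.
toy_model = {"a": 0.5, "b": 0.3, "c": 0.2}
print(is_valid_model(toy_model))
print(complement(toy_model["a"]))
```

A model whose probabilities do not add up to 1, such as {"a": 0.5, "b": 0.6}, fails the check.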
The 2011 National Youth Risk Behavior Survey provides insight on the physical activity of U.S. high school students. Physical activity was defined as any activity that increases heart rate. Here is the probability model obtained by asking, “During the past 7 days, on how many days were you physically active for a total of at least 60 minutes per day?”
What is the probability that a randomly selected U.S. high school student exercised at least 1 day in the past 7 days?
A) 0.08
B) 0.23
C) 0.77
D) 0.85
E) 0.92
Disjoint events
Two events are disjoint, or mutually exclusive, if they can never happen together (have no outcome in common).
“male” and “pregnant” ??
“male” and “Caucasian” ??
[Venn diagrams: Events A and B are disjoint. Events A and B are NOT disjoint.]
Addition rules
Addition rule for disjoint events: When two events A and B are disjoint: P(A or B) = P(A) + P(B)
General addition rule for any two events A and B: P(A or B) = P(A) + P(B) – P(A and B)
Probability that a random person is type O+: P(O+) = ??
P(all blood types) = ??
Probability that a random person is not type A+: P(not A+) = ??
Probability that a random person is “blood group O”: P(O) = ??
Probability that a random person is “rhesus neg”: P(O- or A- or B- or AB-) = ??
Probability that a random person is either “blood group O” or “rhesus neg”: P(O or -) = ??
Probabilities of hearing impairment and blue eyes among Dalmatian dogs.
HI = Dalmatian is hearing impaired; B = Dalmatian is blue eyed
HI but not B: 0.23
HI and B: 0.05
B but not HI: 0.06
Neither HI nor B: 0.66
Are the traits HI and B disjoint? ??
P(HI) = ??
P(B) = ??
P(HI or B) = ??
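The Dalmatian example can be worked through with the general addition rule, P(A or B) = P(A) + P(B) – P(A and B). The sketch below uses only the four probabilities given in the text; the variable names are illustrative.

```python
import math

# The four non-overlapping outcomes, with probabilities as given in the text.
hi_only = 0.23    # HI but not B
b_only = 0.06     # B but not HI
hi_and_b = 0.05   # HI and B
neither = 0.66    # neither HI nor B

# Together they make up the whole sample space, so they must sum to 1.
assert math.isclose(hi_only + b_only + hi_and_b + neither, 1.0)

p_hi = hi_only + hi_and_b           # all hearing-impaired dogs
p_b = b_only + hi_and_b             # all blue-eyed dogs
p_hi_or_b = p_hi + p_b - hi_and_b   # general addition rule

print(round(p_hi, 2), round(p_b, 2), round(p_hi_or_b, 2))
```

Because P(HI and B) is not zero, simply adding P(HI) and P(B) would count the overlap twice. As a cross-check, P(HI or B) also equals 1 minus the probability of neither trait.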
Continuous random variables
A continuous sample space consists of an interval. We use density curves to model continuous probability distributions. We assign probabilities over the range of values making up the sample space.
Continuous probabilities are assigned for intervals
Events are defined over subintervals within the sample space. Probabilities are computed as areas under the corresponding portion of the density curve for the chosen subinterval.
The total area under a density curve equals 1. (Why?)
The probability of an event being equal to a single numerical value is zero when the sample space is continuous. (Why?)
Let Y be a continuous random variable with a uniform distribution (sample space is the interval from 0 to 1).
[Density curve: uniform over y from 0 to 1, height = 1]
P(y = 0.5) = ??
P(0 ≤ y ≤ 0.5) = ??
P(0 < y < 0.5) = ??
P(0 ≤ y < 0.5) = ??
P(y ≤ 0.5) = ??
P(y > 0.8) = ??
P(y ≤ 0.5 or y > 0.8) = ??
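For a uniform density of height 1 on the interval from 0 to 1, the area over a subinterval is a rectangle, so the probability is just the subinterval's length. The sketch below assumes that setup; `uniform_prob` is a hypothetical helper, not part of any library.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= Y <= b) for Y uniform on [low, high]: area of a rectangle."""
    a, b = max(a, low), min(b, high)   # clip the event to the sample space
    if b <= a:
        return 0.0                     # a single point or empty interval has zero area
    return (b - a) / (high - low)      # width times height, height = 1 / (high - low)

p_point = uniform_prob(0.5, 0.5)   # P(y = 0.5)
p_left = uniform_prob(0.0, 0.5)    # P(y <= 0.5)
p_right = uniform_prob(0.8, 1.0)   # P(y > 0.8)
p_either = p_left + p_right        # disjoint intervals, so the areas add

print(p_point, p_left, round(p_right, 2), round(p_either, 2))
```

Since a single point has zero area, including or excluding an endpoint (≤ versus <) does not change the probability.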
Risk and odds
In the health sciences, probability concepts are often expressed in terms of risk and odds.
The risk of an undesirable outcome of a random phenomenon is the probability of that undesirable outcome.
risk(event A) = P(event A)
The odds of any outcome of a random phenomenon is the ratio of the probability of that outcome over the probability of that outcome not occurring.
odds(event A) = P(event A) / [1 − P(event A)]
Sickle-cell anemia is a serious, inherited blood disease affecting the shape of red blood cells. Individuals carrying only one copy of the defective gene (“sickle-cell trait”) are generally healthy but may pass on the gene to their offspring. If a couple learns from prenatal tests that they both carry the sickle-cell trait, genetic laws of inheritance tell us that there is a 25% chance that they could conceive a child who will suffer from sickle-cell anemia. What are the corresponding risk and odds?
risk of sickle-cell anemia = ??
odds of sickle-cell anemia = ??
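The definitions risk(A) = P(A) and odds(A) = P(A) / [1 − P(A)] translate directly into code. The sketch below applies them to the 25% chance stated in the text; the function name `odds` is illustrative, and exact fractions are used so the ratio is not blurred by rounding.

```python
from fractions import Fraction

def odds(p):
    """odds(A) = P(A) / (1 - P(A)); undefined when P(A) = 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

risk = Fraction(1, 4)   # the 25% chance stated in the text
print(risk, odds(risk))
```

The risk is the probability itself, while the odds compare it with the probability of the outcome not occurring (here 1/4 against 3/4).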