Review Session: Chapters 2-5
Chapter 2
Population and sample
  Population: the entire collection of all objects under study
  Sample: any subset of the population
Data summary statistics and data display
  Location: sample mean, median, quartiles
  Spread: range, interquartile range (IQR), sample variance, sample standard deviation
  Data display: dot diagram, stem-and-leaf diagram, histogram, box plot
Scatter diagram and sample correlation coefficient
  Scatter diagram: a graphical description of the relationship between two variables
  Sample correlation coefficient: a numerical summary of the linear relationship between two variables
Chapter 3: Probability and random variables; continuous random variables; discrete random variables; multiple random variables
Probability
  Some definitions: experiment, sample space, event
  Fundamentals of set theory: union, intersection, complement, mutually exclusive events
  Three conditions of probability
Properties of probability
Example: probability
When driving to campus, there are two intersections with traffic lights on your way. The probability that you must stop at the first signal is 0.30 and the probability that you must stop at the second signal is 0.45. The probability that you must stop at at least one of the signals is 0.50. Define events A = (Stop at first signal), B = (Stop at second signal).
(a) What is the probability that you must stop at both signals?
(b) What is the probability that you must stop at the first signal but not at the second one?
(c) What is the probability that you must stop at exactly one signal?
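The three traffic-signal questions all follow from the addition rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B). A minimal sketch of the arithmetic, using only the numbers given in the example (the variable names are illustrative):

```python
# Probabilities given in the traffic-signal example.
p_a = 0.30        # P(A): stop at the first signal
p_b = 0.45        # P(B): stop at the second signal
p_a_or_b = 0.50   # P(A ∪ B): stop at at least one signal

# Addition rule rearranged: P(A ∩ B) = P(A) + P(B) - P(A ∪ B).
p_both = p_a + p_b - p_a_or_b

# A splits into (A and B) and (A and not B), so subtract the overlap.
p_first_only = p_a - p_both

# "Exactly one" is "at least one" minus "both".
p_exactly_one = p_a_or_b - p_both

print(round(p_both, 2), round(p_first_only, 2), round(p_exactly_one, 2))
```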
Conditional Probability and Independence
Example: A red die and a white die are rolled. Define the events A = 4 on red die, B = sum of two dice is odd. Are these two events independent?
P(A) = 1/6
P(B) = 1/2
P(A ∩ B) = P(4 on red die and sum of two dice is odd) = 1/12
Since P(A ∩ B) = 1/12 = P(A)P(B), the answer is yes: A and B are independent.
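The dice example can be checked by listing all 36 equally likely outcomes and comparing P(A ∩ B) with P(A)P(B). A minimal sketch, assuming fair dice (the helper names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes, written as (red, white).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on (red, white)."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

def is_a(outcome):
    return outcome[0] == 4                      # A: 4 on the red die

def is_b(outcome):
    return (outcome[0] + outcome[1]) % 2 == 1   # B: sum of the dice is odd

p_a = prob(is_a)
p_b = prob(is_b)
p_ab = prob(lambda outcome: is_a(outcome) and is_b(outcome))

# Independence holds exactly when P(A ∩ B) = P(A) P(B).
independent = (p_ab == p_a * p_b)
print(p_a, p_b, p_ab, independent)
```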
Example: Toss a coin three times. Let p be the probability of obtaining a head on each toss. Find P{HHT}. Define the events A = head is observed on the first toss, B = head is observed on the second toss, C = tail is observed on the third toss. Then A ∩ B ∩ C = {HHT}.
Example (cont'd): From the experiment, events A, B, and C are independent. Thus P{HHT} = P(A ∩ B ∩ C) = P(A)P(B)P(C) = p × p × (1 − p) = p^2 (1 − p).
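The product rule for independent tosses can be sketched in a few lines. The value p = 0.6 below is a hypothetical choice for illustration only; the example itself leaves p general.

```python
from itertools import product

def sequence_probability(sequence, p):
    """Probability of a specific H/T sequence when tosses are independent."""
    prob = 1.0
    for toss in sequence:
        prob *= p if toss == "H" else (1 - p)
    return prob

p = 0.6  # hypothetical head probability, not taken from the example
p_hht = sequence_probability("HHT", p)

# Sanity check: the 8 possible sequences should have total probability 1.
total = sum(sequence_probability("".join(seq), p)
            for seq in product("HT", repeat=3))

print(p_hht, p ** 2 * (1 - p), total)
```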
Random variable
Examples: random variable Suppose f(x) = C x^2 for -1 < x < 1 and f(x) = 0 otherwise. Determine C and find the following probabilities.
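A sketch of how C can be found, assuming the density is C x^2 on (-1, 1): the density must integrate to 1, and the integral of x^2 over (-1, 1) is 2/3. The intervals in the usage lines are hypothetical illustrations, not the probabilities requested in the example.

```python
from fractions import Fraction

# Integral of x^2 over (-1, 1) is [x^3 / 3] from -1 to 1 = 2/3.
integral_x2 = Fraction(1, 3) * (1 ** 3 - (-1) ** 3)

# The density must integrate to 1, so C * (2/3) = 1.
C = 1 / integral_x2

def prob_between(a, b):
    """P(a < X < b) for the density C x^2 on (-1, 1)."""
    a = max(Fraction(a), Fraction(-1))   # clip to the support
    b = min(Fraction(b), Fraction(1))
    if b <= a:
        return Fraction(0)
    return C * (b ** 3 - a ** 3) / 3

print(C)                                             # normalising constant
print(prob_between(0, 1))                            # hypothetical interval
print(prob_between(Fraction(-1, 2), Fraction(1, 2))) # hypothetical interval
```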
Examples: random variable
Random variable
Examples: random variable Let X denote the number of patients who suffer an infection within a floor of a hospital per month with the following probabilities:
Examples: random variable Verify that the function f(x) is a probability mass function, and determine the requested values.
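Verifying a probability mass function means checking two things: every value is non-negative, and the values sum to 1. A minimal sketch of that check follows. The mass function f(x) = (2x + 1)/25 on x = 0, ..., 4 is a hypothetical stand-in chosen for illustration, not the function from this example.

```python
from fractions import Fraction

def is_pmf(pmf):
    """True if all probabilities are non-negative and they sum to 1."""
    values = list(pmf.values())
    return all(v >= 0 for v in values) and sum(values) == 1

# Hypothetical mass function used only to illustrate the check.
f = {x: Fraction(2 * x + 1, 25) for x in range(5)}

valid = is_pmf(f)

# Once f is verified, probabilities are sums over the relevant x values.
p_at_most_1 = sum(prob for x, prob in f.items() if x <= 1)
print(valid, p_at_most_1)
```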
Continuous Distribution
Examples: random variable Review homework 2 and 4
Discrete Distribution
Example: Because not all airline passengers show up for their reserved seat, an airline sells 135 tickets for a flight that holds only 130 passengers. The probability that a passenger does not show up is 0.08, and the passengers behave independently. What is the probability that every passenger who shows up can take the flight? What is the probability that the flight departs with empty seats?
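One way to set up the airline example: let X be the number of passengers who show up, so X is Binomial(n = 135, p = 0.92). Every passenger who shows up can fly when X ≤ 130, and the flight departs with empty seats when X ≤ 129. A minimal sketch under that setup, using only the standard library (the helper names are illustrative):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n_tickets = 135
seats = 130
p_show = 1 - 0.08   # probability that a ticketed passenger shows up

# Everyone who shows up gets a seat when at most 130 passengers show up.
p_all_fly = binom_cdf(seats, n_tickets, p_show)

# Empty seats remain when at most 129 passengers show up.
p_empty_seats = binom_cdf(seats - 1, n_tickets, p_show)

print(p_all_fly, p_empty_seats)
```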
Example: In 1898 L. J. Bortkiewicz published a book entitled The Law of Small Numbers. He used data collected over 20 years to show that the number of soldiers killed by horse kicks each year in each corps in the Prussian cavalry followed a Poisson distribution with a mean of 0.61. What is the probability of more than one death in a corps in a year? What is the probability of no deaths in a corps over five years?
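For the horse-kick example, a sketch with X ~ Poisson(0.61) deaths per corps per year. The five-year question additionally assumes that the yearly counts are independent, so the five-year total is Poisson with mean 5 × 0.61 = 3.05; that assumption is not stated in the example.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 0.61   # mean deaths per corps per year, from the example

# More than one death in a year: 1 - P(X = 0) - P(X = 1).
p_more_than_one = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

# No deaths over five years, assuming independent years.
p_none_five_years = poisson_pmf(0, 5 * lam)

print(p_more_than_one, p_none_five_years)
```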
Example: Poisson process (HW 5): Poisson distribution; Poisson process; normal approximation of the Poisson distribution
Linear Combination of R.V.s (HW 6)
Central limit theorem (HW 6)
Chapter 4: Point estimation; hypothesis test for one population; confidence interval for one population; goodness-of-fit test
Point estimation
Hypothesis Testing
One-sample Z test
Sample size
One-sample T test
One-sample chi-square test
One-sample approximate Z test
Testing for goodness of fit
Chapter 5: Hypothesis test for two populations; confidence interval for two populations
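Confidence interval for H0: parameter = null value vs. H1: parameter ≠ null value
  Valid for a large sample size or a sample from a Normal population
  Form: estimate ± multiplier × standard error
  Estimate and standard error: from original data
  Multiplier: Z from N(0,1) or T from the t-distribution
  Caution: the multiplier depends on the significance level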
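A minimal sketch of a two-sided interval of the form estimate ± multiplier × standard error. It assumes the Z (standard Normal) multiplier; the t-distribution case needs a t quantile, which the Python standard library does not provide. The function name and arguments are illustrative.

```python
from statistics import NormalDist

def z_confidence_interval(estimate, standard_error, confidence=0.95):
    """Two-sided Z interval: estimate +/- z_(1 - alpha/2) * standard error."""
    alpha = 1 - confidence
    # The multiplier depends on the significance level alpha.
    multiplier = NormalDist().inv_cdf(1 - alpha / 2)
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin

# Hypothetical numbers for illustration only.
lower, upper = z_confidence_interval(estimate=10.0, standard_error=1.0)
print(lower, upper)
```

Raising the confidence level increases the multiplier and therefore widens the interval.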
Formula for p-values
  Valid for a large sample size or a sample from a Normal population
  Test statistic: (estimate − null value) / standard error
  Estimate: from original data
  Null value: from H0
  Compare z to N(0,1) or t to the t-distribution for the p-value
  Caution: the direction of the tail depends on the alternative hypothesis
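A sketch of how the tail direction enters the p-value, assuming a Z statistic compared to N(0,1). The function name and the labels "less", "greater", and "two-sided" are illustrative choices.

```python
from statistics import NormalDist

def z_p_value(estimate, null_value, standard_error, alternative="two-sided"):
    """p-value for a Z test; the tail depends on the alternative hypothesis."""
    z = (estimate - null_value) / standard_error
    cdf = NormalDist().cdf
    if alternative == "greater":      # H1: parameter > null value
        return 1 - cdf(z)
    if alternative == "less":         # H1: parameter < null value
        return cdf(z)
    if alternative == "two-sided":    # H1: parameter != null value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

# Hypothetical numbers for illustration: z = (11.96 - 10) / 1 = 1.96.
p_two_sided = z_p_value(11.96, 10.0, 1.0)
p_greater = z_p_value(11.96, 10.0, 1.0, alternative="greater")
print(p_two_sided, p_greater)
```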
Decision making for two samples
Columns: Parameter; Distribution; Standard Error (CI); Standard Error (Test)
  Difference in Proportions: Normal
  Difference in Means, variance known: Normal
  Difference in Means, variance unknown but same: pooled t, df = n1 + n2 − 2
  Difference in Means, variance unknown but different: unpooled t, df = min(n1, n2) − 1
  Difference in Means (paired): t, df = n − 1
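The standard errors for these two-sample cases can be sketched as below. These are the usual textbook forms and are an assumption here, as are the function names; check them against the course formula sheet. Only the difference in proportions uses a different standard error for the test (pooled under H0) than for the confidence interval.

```python
from math import sqrt

def se_proportions_ci(p1, n1, p2, n2):
    """Difference in proportions, CI: each sample uses its own proportion."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_proportions_test(x1, n1, x2, n2):
    """Difference in proportions, test: pooled proportion under H0: p1 = p2."""
    pooled = (x1 + x2) / (n1 + n2)
    return sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

def se_means_known(sigma1, n1, sigma2, n2):
    """Difference in means, known population standard deviations."""
    return sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def se_means_pooled(s1, n1, s2, n2):
    """Difference in means, unknown but equal variances (df = n1 + n2 - 2)."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return sqrt(sp2 * (1 / n1 + 1 / n2))

def se_means_unpooled(s1, n1, s2, n2):
    """Difference in means, unknown and different variances."""
    return sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def se_paired(s_d, n):
    """Paired difference in means: s_d is the SD of the n differences."""
    return s_d / sqrt(n)

# Hypothetical sample summaries for illustration only.
print(se_means_unpooled(3, 9, 4, 16), se_paired(2, 16))
```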