Stat 301 – Day 18 Normal probability model (4.2)
Last Time – Sampling cont. Different types of sampling and nonsampling errors. Can only judge sampling bias if we know the right answer and can see whether statistics from repeated samples center at the population value (bias is a systematic tendency to miss it). Random sampling eliminates sampling bias and leads to a predictable pattern (probability distribution) in the sample results (the sampling distribution).
Last Time – Statistical Significance. Method 1: If have a random sample without replacement from a finite population and know (or conjecture) the number of successes in the population, can use the hypergeometric. Method 2: If have a representative sample from a random process (e.g., coin tosses, dice rolls, water measurements (Inv 3.3.1), batting average), can use the binomial (independent observations, constant probability of success). Method 3: If have a random sample without replacement from a large population ($N > 20n$), can use the binomial approximation to the hypergeometric ($n$, $\pi = M/N$). How often would a random sample (under a conjecture) lead to data like this?
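To see how close Method 3 is to Method 1, the two probability formulas can be compared directly. This is a minimal sketch using only the Python standard library; the population size, number of successes, and sample size below are hypothetical values chosen so that N > 20n, not numbers from the course.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k): k successes in a sample of n drawn without replacement
    from a population of N items, M of which are successes."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def binom_pmf(k, n, pi):
    """P(X = k): k successes in n independent trials, success probability pi."""
    return comb(n, k) * pi**k * (1 - pi)**(n - k)

# Hypothetical example: N = 10,000, M = 4,000 successes, sample of n = 20
N, M, n = 10_000, 4_000, 20
pi = M / N  # binomial approximation uses pi = M/N

for k in (5, 8, 11):
    exact = hypergeom_pmf(k, N, M, n)
    approx = binom_pmf(k, n, pi)
    print(f"k={k}: hypergeometric={exact:.5f}  binomial={approx:.5f}")
```

With a population this much larger than the sample, the two columns should agree to several decimal places.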
Last Time – Binomial distribution. Helper/hinderer study: p-value = $P(X \ge 14)$. $P(X = 14)$ = probability of 14 successes and 2 failures, but then also have to pick which of the 16 outcomes will be the 14 successes => $P(X = 14) = C(16,14)\,\pi^{14}(1-\pi)^{2}$. In general: $P(X = k) = C(n,k)\,\pi^{k}(1-\pi)^{n-k}$.
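The helper/hinderer p-value can be computed by adding up binomial probabilities for 14, 15, and 16 helper choices out of 16, assuming the null value of .5 for the probability of picking the helper. A short sketch (the function names are illustrative, not from the course software):

```python
from math import comb

def binom_pmf(k, n, pi):
    """P(X = k) for a binomial with n trials and success probability pi."""
    return comb(n, k) * pi**k * (1 - pi)**(n - k)

def binom_upper_tail(k, n, pi):
    """P(X >= k): sum the binomial probabilities from k up to n."""
    return sum(binom_pmf(j, n, pi) for j in range(k, n + 1))

# 14 or more of 16 infants choosing the helper, if each choice is a 50/50 guess
p_value = binom_upper_tail(14, 16, 0.5)
print(p_value)  # 137/65536, about 0.0021
```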
Null and alternative hypotheses. Helper/hinderer study: Parameter of interest: probability of a baby picking the helper toy. Value unknown; when it’s a proportion, call it $\pi$. The uninteresting case is that the babies are choosing at random: $H_0: \pi = .5$ (no preference), or $H_0: \pi \le .5$. What the researchers are hoping to show is that the infants have a preference for the helper: $H_a: \pi > .5$ (genuine preference for helper). Note: Usually with a process the parameter is the probability of success.
Practice problem (p. 223). Let $\pi$ represent the probability that this individual guesses correctly: $H_0: \pi = .25$ (person is just guessing), $H_a: \pi > .25$ (better than guessing). Let $\pi$ equal the proportion of the population who prefer Kerry: $H_0: \pi = .5$ (not a majority), $H_a: \pi > .5$ (majority). Let $\pi$ represent the probability of a sick day on M/F: $H_0: \pi = .4$ (if the 5 days are equally likely, will be 2/5), $H_a: \pi > .4$ (sick more often on M/F). When sampling from a population, the parameter is the population proportion of successes.
Yet another approach: Often these sampling distributions are bell-shaped and symmetric, and don’t even look that discrete… Not always…
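Normal Distribution. Characteristics: mound-shaped, symmetric, bell-shaped. Parameters: mean, $\mu$, at the peak; standard deviation, $\sigma$, the distance from the mean to the inflection points. [Figure: $N(\mu, \sigma)$ model, density $f(x)$ plotted against $x$.] Area under curve = 1.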
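The three facts on this slide (peak at the mean, inflection points one standard deviation from the mean, total area of 1) can be checked numerically from the normal density formula. This is a rough sketch using a simple midpoint sum for the area; the mean of 100 and standard deviation of 15 are hypothetical values picked only for illustration.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def area(mu, sigma, lo, hi, steps=20_000):
    """Approximate area under the curve between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return width * sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma)
                       for i in range(steps))

def curvature(x, mu, sigma, h=0.01):
    """Numerical second derivative: negative where the curve bends down,
    positive where it bends up."""
    return (normal_pdf(x + h, mu, sigma) - 2 * normal_pdf(x, mu, sigma)
            + normal_pdf(x - h, mu, sigma)) / h**2

mu, sigma = 100, 15  # hypothetical parameters

total_area = area(mu, sigma, mu - 10 * sigma, mu + 10 * sigma)
print("total area:", round(total_area, 6))

# The curve bends down inside mu +/- sigma and bends up outside it,
# so the change in curvature (the inflection point) sits at mu +/- sigma.
print("curvature at mu + 0.5 sigma:", curvature(mu + 0.5 * sigma, mu, sigma))
print("curvature at mu + 1.5 sigma:", curvature(mu + 1.5 * sigma, mu, sigma))
```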
Normal Distribution Properties: Empirical Rule
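The empirical rule says that for any normal distribution roughly 68%, 95%, and 99.7% of the area falls within 1, 2, and 3 standard deviations of the mean. A quick check, sketched with the standard library's error function (the helper name `prob_within` is made up for this example):

```python
from math import erf, sqrt

def prob_within(k):
    """P(mean - k*SD < X < mean + k*SD) for any normal distribution."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```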
Normal Distribution Using the model New handout for Minitab 15 probability distribution graph Data vs. model
For Friday HW 4 Include all graphs! (New HW posted this weekend) PP for Monday now available No Friday Office Hours or late Thursday … Be nice to Dr. Sklar Be working on Lab 3 for Tuesday