Last Time Normal Distribution –Density Curve (Mound Shaped) –Family Indexed by mean and s.d. –Fit to data, using sample mean and s.d. Computation of Normal Probabilities –Using Excel function, NORMDIST –And Big Rules of Probability
Reading In Textbook Approximate Reading for Today’s Material: Pages 61-62, 66-70, 59-61, Approximate Reading for Next Class: Pages ,
Normal Density Fitting Idea: Choose μ and σ to fit normal density to histogram of data. Approach: IF the distribution is “mound shaped” & outliers are negligible THEN a “good” choice of normal model is: μ = sample mean (x̄), σ = sample s.d. (s)
Normal Density Fitting Melbourne Average Temperature Data
Computation of Normal Probs EXCEL Computation: probs given by “lower areas” E.g. for X ~ N(1,0.5) P{X ≤ 1.3} = 0.726
Computation of Normal Probs Computation of upper areas: (use “1 –”, i.e. “not” formula) P{X > x} = 1 - P{X ≤ x}
Computation of Normal Probs Computation of areas over intervals: (use subtraction) P{a < X ≤ b} = P{X ≤ b} - P{X ≤ a}
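For readers working outside Excel, the three area computations can be sketched in Python. This is only a sketch, assuming Python 3.8+ and using the standard-library `statistics.NormalDist` as a stand-in for NORMDIST with its cumulative option; the interval endpoint 0.5 is an illustrative choice, not a value from the slides.

```python
from statistics import NormalDist

# X ~ N(1, 0.5): mean 1, s.d. 0.5 (the slide's example).
X = NormalDist(mu=1, sigma=0.5)

# Lower area, the analogue of NORMDIST(1.3, 1, 0.5, TRUE).
lower = X.cdf(1.3)                   # P{X <= 1.3}

# Upper area via the "1 -" ("not") rule.
upper = 1 - X.cdf(1.3)               # P{X > 1.3}

# Area over an interval via subtraction (0.5 is an illustrative endpoint).
interval = X.cdf(1.3) - X.cdf(0.5)   # P{0.5 < X <= 1.3}

print(round(lower, 3), round(upper, 3), round(interval, 3))
```

The lower area should agree with the 0.726 quoted for this example.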
Z-score view of populations Idea: Reproducible view of “where data point lies in population” Context 1: List of Numbers Context 2: Probability distribution
Z-score view of Lists of #s Idea: Reproducible view of “where data point lies in population”. Thought model: population is Normal, with Population mean: μ and Population standard deviation: σ. Interpret data as “s.d.s away from mean”
Z-score view of Lists of #s Approach: Transform data x_1, …, x_n by subtracting mean & dividing by s.d., to get z_i = (x_i - x̄) / s (gives mean 0, s.d. 1). Interpret as: “x_i is z_i s.d.s above the mean”
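The transformation of a list of numbers can be sketched as follows. This is a minimal sketch, assuming the sample mean and the sample s.d. (n - 1 denominator) are used; the helper name `z_scores` and the data values are hypothetical.

```python
from statistics import mean, stdev

def z_scores(data):
    """Subtract the sample mean and divide by the sample s.d."""
    xbar = mean(data)
    s = stdev(data)          # sample s.d. (n - 1 denominator)
    return [(x - xbar) / s for x in data]

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]   # hypothetical list of numbers
z = z_scores(data)

# The transformed list has mean 0 and s.d. 1 (up to rounding error).
print(mean(z), stdev(z))
```

Each entry of `z` reads as "this data point is that many s.d.s above the mean", with negative values lying below the mean.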
Z-score view of Normal Dist. Approach: For X ~ N(μ,σ), subtract mean & divide by s.d., to get Z = (X - μ) / σ (gives mean 0, s.d. 1, i.e. Standard Normal). Interpret as: “X is Z s.d.s above the mean”
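One consequence of standardizing is that areas can be read off the Standard Normal scale. The sketch below checks this numerically, assuming Python's `statistics.NormalDist`; the parameters 100 and 20 and the cutoff 130 are illustrative choices.

```python
from statistics import NormalDist

mu, sigma = 100, 20          # illustrative parameters
X = NormalDist(mu, sigma)
Z = NormalDist(0, 1)         # Standard Normal

x = 130                      # illustrative cutoff
z = (x - mu) / sigma         # x is z s.d.s above the mean

# The two areas agree: P{X <= x} = P{Z <= z}
print(z, X.cdf(x), Z.cdf(z))
```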
Z-score view of Normal Dist. HW: 1.117
Interpretation of Z-scores Z-scores are on N(0,1) scale, so use areas to interpret them. Important Areas: 1. Within 1 sd of mean: “the majority” ≈ 68%. 2. Within 2 sd of mean: “really most” ≈ 95%. 3. Within 3 sd of mean: “almost all” ≈ 99.7%
Interpretation of Z-scores Summary: these are called the “68-95-99.7% Rule” (figure axis label: Mean – 3 sd’s)
Interpretation of Z-scores Summary: “68-95-99.7% Rule” Excel Calculation From Class Example 9:
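The three areas behind the rule can also be checked outside Excel. The following is a sketch only, assuming Python's standard-library `statistics.NormalDist`; the helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)   # Standard Normal

def area_within(k):
    """Area under the N(0,1) density within k s.d.s of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 1), "%")
```

Because Z-scores are on the N(0,1) scale, the same three areas apply to any normal distribution.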
Interpretation of Z-scores HW: 1.115, (50%, 2.5%, ) 1.119
Inverse Normal Probs Idea: for a given cutoff value, x, calculated P{X < x} as Area under normal density, using Excel function: NORMDIST
Inverse Normal Probs Now for a given P{X < x}, i.e. Area, find corresponding cutoff x. Terminology: Quantile, Percentile
Inverse Normal Probs E.g. Given area = 80%, this x is the 0.8-quantile, i.e. the 80-th percentile
Inverse Normal Probs Now for a given P{X < x}, i.e. Area Find: Quantile Percentile Excel Computation: NORMINV
Inverse Normal Probs Excel Computation: NORMINV (very similar to other Excel functions) (and reasonably well organized)
Inverse Normal Probs Excel Computation: NORMINV Examples in:
Inverse Normal Probs Excel Computation: NORMINV Set: Mean = 0, s.d. = 1, prob = 0.8. Get answer
Inverse Normal Probs Excel Computation: NORMINV or can just type in formula, e.g. =NORMINV(0.8, 0, 1). Get answer
Inverse Normal Probs Now for a given P{X < x}, i.e. Area Find: Quantile Percentile = 0.84
Inverse Normal Probs Excel Computation: NORMINV Another example: for X ~ N(100,20), find x so that 30% = P{X < x}, i.e. the 30-th percentile. Answer: slightly less than mean
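Both NORMINV examples can be reproduced outside Excel. This is a minimal sketch, assuming Python's `statistics.NormalDist.inv_cdf` as the analogue of NORMINV(prob, mean, s.d.); the variable names are hypothetical.

```python
from statistics import NormalDist

# Analogue of NORMINV(0.8, 0, 1): the 0.8-quantile of N(0,1).
q80 = NormalDist(0, 1).inv_cdf(0.8)

# Analogue of NORMINV(0.3, 100, 20): the 30-th percentile of N(100,20).
p30 = NormalDist(100, 20).inv_cdf(0.3)

print(round(q80, 2), round(p30, 1))
```

The first value should match the 0.84 found above, and the second should come out slightly below the mean of 100, as the slide notes.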
Inverse Normal Probs Example: Quality Control
Inverse Normal Probs When a machine works normally, it fills bottles with mean = 25 oz, and SD = 0.2 oz. The machine is “out of control” when it overfills. Choose an “alarm level”, which will give only 1% false alarms. Want: cutoff, x, so that Area above = 1%. Note: Area below = 100% - Area above = 99%
Inverse Normal Probs When a machine works normally, it fills bottles with mean = 25 oz, and SD = 0.2 oz. Want: cutoff, x, so that Area above = 1%. Note: Area below = 100% - Area above = 99%. So set alarm threshold to 25.47 oz
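The alarm-level calculation can be sketched as follows, assuming Python's `statistics.NormalDist.inv_cdf` in place of NORMINV; the variable names are hypothetical.

```python
from statistics import NormalDist

# Fill amounts (oz) when the machine works normally.
fill = NormalDist(mu=25, sigma=0.2)
false_alarm_rate = 0.01

# Area above the cutoff = 1%, so area below = 99%.
threshold = fill.inv_cdf(1 - false_alarm_rate)
print(round(threshold, 2))
```

The key step is converting the upper area (1%) to a lower area (99%) before inverting, since the inverse function works with lower areas.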
Inverse Normal Probs HW: (-0.675, 0.385) (1294)
And Now for Something Completely Different A fun idea. Can you read this? Olny srmat poelpe can raed this. I cdnuolt blveiee that I cluod aulaclty uesdnatnrd what I was rdanieg. The phaonmneal pweor of the hmuan mnid, aoccdrnig to rscheearch at Cmabrigde Uinervtisy.
And Now for Something Completely Different The phaonmneal pweor of the hmuan mnid, aoccdrnig to rscheearch at Cmabrigde Uinervtisy. It deosn't mttaer in what oredr the ltteers in a word are, the olny iprmoatnt tihng is that the first and last ltteer be in the rghit pclae. The rset can be a taotl mses and you can still raed it wouthit a porbelm.
And Now for Something Completely Different The rset can be a taotl mses and you can still raed it wouthit a porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the word as a wlohe. Amzanig huh? Yaeh and I awlyas tghuhot slpeling was ipmorantt!
Checking Normality Idea: For which data sets will the normal distribution be a good model? Recall fitting normal density to data:
Normal Density Fitting Idea: Choose μ and σ to fit normal density to histogram of data. Approach: IF the distribution is “mound shaped” & outliers are negligible THEN a “good” choice of normal model is: μ = sample mean (x̄), σ = sample s.d. (s)
Normal Density Fitting Melbourne Average Temperature Data
Checking Normality Idea: For which data sets will the normal distribution be a good model? Useful graphical device to check: IF the distribution is “mound shaped” & outliers are negligible
Checking Normality Useful graphical device: Quantile – Quantile plot. Varying Terminology: Q-Q plot, Normal Quantile plot (text book)
Checking Normality Q-Q plot Idea: graphical comparison of data distribution vs. normal distribution, as data quantiles vs. normal quantiles
Checking Normality Q-Q plot, implementation: 1. Sort data, to find data quantiles. 2. Assign corresponding probabilities: i / (n+1), for i = 1, …, n (equally spaced, strictly between 0 and 1). 3. Compute corresponding normal quantiles (using NORMINV). 4. Make plot with x-axis = normal quantiles & y-axis = data quantiles
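The implementation can be sketched in a few lines. This is a sketch under stated assumptions: probabilities are taken as i / (n+1), standard normal quantiles go on the x-axis, and Python's `statistics.NormalDist.inv_cdf` stands in for NORMINV. The function name `qq_points` and the data values are hypothetical.

```python
from statistics import NormalDist

def qq_points(data):
    """Return (normal quantile, data quantile) pairs for a Q-Q plot."""
    n = len(data)
    data_quantiles = sorted(data)                    # sorted data
    probs = [i / (n + 1) for i in range(1, n + 1)]   # strictly in (0, 1)
    std_normal = NormalDist(0, 1)
    normal_quantiles = [std_normal.inv_cdf(p) for p in probs]
    # x = normal quantile, y = data quantile
    return list(zip(normal_quantiles, data_quantiles))

points = qq_points([3.1, 0.4, 2.2, 1.7, 2.9])   # hypothetical data
for x, y in points:
    print(round(x, 3), y)
```

Plotting these pairs as a scatter plot gives the Q-Q plot. Dividing by n + 1 rather than n keeps every probability strictly below 1, so the largest normal quantile stays finite.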
Checking Normality Q-Q plot, interpretation: When distribution is normal: –Points lie close to a line –For standard normal quantiles: Y-intercept of line is mean, Slope of line is s.d. For non-normal distribution: –Q-Q plot will curve away from line
Checking Normality Q-Q plot, e.g. Excel analyses available in:
Checking Normality Q-Q plot, e.g. n = 1000 from N(0,1) Data simulated as: Data Tab > Data Analysis > Random Number Generation > Set parameters
Checking Normality Q-Q plot, e.g. n = 1000 from N(0,1) Next sort data: Copy to another column, Highlight, Data Tab, Sort Button. Gives Data Quantiles
Checking Normality Q-Q plot, e.g. n = 1000 from N(0,1) Next compute Normal Quantiles: 1st type indices i = 1, …, n; then Range of probs i / (n+1); then Normal quantiles
Checking Normality Q-Q plot, e.g. n = 1000 from N(0,1) Now plot Data Quantiles vs. Normal Quantiles: Insert Tab, Scatter Button, Fill out menu (as before)
Checking Normality Q-Q plot, e.g. n = 1000 from N(0,1) Results: Looks very linear, as expected. Y-intercept = 0 (= mean), Slope = 1 (= s.d.)
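The simulation can be imitated outside Excel to see the intercept and slope numerically. This is a sketch only: the least-squares line is an assumption made here to summarize the plot (the slides read the line off the picture), the seed is arbitrary, and the helper name `qq_line` is hypothetical.

```python
import random
from statistics import NormalDist

def qq_line(data):
    """Least-squares line through the Q-Q points (x = normal quantiles)."""
    n = len(data)
    ys = sorted(data)
    xs = [NormalDist(0, 1).inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return intercept, slope

rng = random.Random(0)                           # fixed seed, arbitrary choice
sample = [rng.gauss(0, 1) for _ in range(1000)]  # n = 1000 from N(0,1)
intercept, slope = qq_line(sample)
print(round(intercept, 2), round(slope, 2))      # expect values near 0 and 1
```

Replacing the sample with draws from another normal distribution should move the intercept toward that mean and the slope toward that s.d.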
Checking Normality Q-Q plot, e.g. Buffalo Snowfalls Recall Histogram: - Roughly symmetric - Mound shaped - Does Normal Curve fit the data?
Checking Normality Q-Q plot, e.g. Buffalo Snowfalls: Approximately linear, suggests normal. But some wiggles? Due to natural sampling variation? Study with smaller simulation
Checking Normality Q-Q plot, e.g. n = 100 from N(0,1): Approximately linear, some wiggliness. Suggests Buffalo variation is usual. Make this more precise?
Checking Normality Q-Q plot, e.g. British Suicides Recall Histogram: Strong right skewness, so mean >> median. Not mound shaped
Checking Normality Q-Q plot, e.g. British Suicides: Distinct non-linearity (curvature). Conclude data not normal. Characteristic of right skewness
Checking Normality Q-Q plot, e.g. Log10 British Suicides Recall: log10 transformation resulted in mound shape
Checking Normality Q-Q plot, e.g. Log10 British Suicides Recall Histogram: o Much more mound shaped o Check for normality with Q-Q plot
Checking Normality Q-Q plot, e.g. Log10 British Suicides: Looks very linear. Indicates normal distribution is good fit, i.e. transformation worked!
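The effect of the log10 transformation on the Q-Q plot can be illustrated numerically. The British Suicides data are not reproduced here, so this sketch uses synthetic right-skewed (lognormal) data as a stand-in. The correlation between data quantiles and normal quantiles is used as a rough linearity measure, which is a choice made for this sketch and not the slides' method; the name `qq_correlation` and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist

def qq_correlation(data):
    """Correlation between data quantiles and normal quantiles.

    Values near 1 indicate a nearly linear Q-Q plot.
    """
    n = len(data)
    ys = sorted(data)
    xs = [NormalDist(0, 1).inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)                                   # arbitrary seed
skewed = [rng.lognormvariate(0, 1) for _ in range(200)]  # synthetic stand-in
logged = [math.log10(v) for v in skewed]

r_raw = qq_correlation(skewed)   # curved Q-Q plot, lower correlation
r_log = qq_correlation(logged)   # straighter Q-Q plot, higher correlation
print(round(r_raw, 3), round(r_log, 3))
```

A correlation is only a one-number summary; the plot itself shows where the departure from the line occurs, such as the curvature typical of right skewness.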
Checking Normality HW: (a. approx. normal + big outlier; b. close to normal; c. right skew + one big outlier; d. Non-normal with several clusters)