Stat 155, Section 2, Last Time Reviewed Excel Computation of: –Time Plots (i.e. Time Series) –Histograms Modelling Distributions: Densities (Areas) Normal.

Presentation transcript:

Stat 155, Section 2, Last Time Reviewed Excel Computation of: –Time Plots (i.e. Time Series) –Histograms Modelling Distributions: Densities (Areas) Normal Density Curve (very useful model) Fitting Normal Densities (using mean and s.d.)

Reading In Textbook Approximate Reading for Today’s Material: Pages 71-83, Approximate Reading for Next Class: Pages ,

2 Views of Normal Fitting 1. “Fit Model to Data”: Choose μ & σ. 2. “Fit Data to Model”: First standardize data, then use the standard Normal, N(0,1). Note: same thing, just different rescalings (choose scale depending on need)

Normal Distribution Notation The “normal distribution, with mean μ & standard deviation σ” is abbreviated as: N(μ, σ)

Interpretation of Z-scores Recall Z-score Idea: Transform data x_i by subtracting the mean & dividing by the s.d., to get z_i = (x_i - x̄) / s (mean 0, s.d. 1). Interpret as x_i = x̄ + z_i s, i.e. “x_i is z_i sd’s above the mean”
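
The standardization above can be sketched in a few lines of Python. This is only an illustration: the slides work in Excel, and the exam scores below are made up for the example.

```python
from statistics import mean, stdev

def z_scores(data):
    """Standardize data: subtract the sample mean, divide by the sample s.d."""
    m = mean(data)
    s = stdev(data)  # sample s.d. (n - 1 divisor)
    return [(x - m) / s for x in data]

# Hypothetical exam scores, invented for illustration (not from the course).
scores = [62, 75, 81, 90, 97]
z = z_scores(scores)

# Each z-score says how many s.d.'s the score sits above (+) or below (-) the mean.
for x, zi in zip(scores, z):
    print(f"score {x}: {zi:+.2f} sd's from the mean")
```

After standardizing, the z-scores themselves have mean 0 and s.d. 1, which is what puts them on the N(0,1) scale.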

Interpretation of Z-scores Same idea for Normal Curves: Z-scores are on the N(0,1) scale, so use areas to interpret them. Important Areas: 1. Within 1 sd of mean: about 68%, “the majority”

Interpretation of Z-scores 2. Within 2 sd of mean: about 95%, “really most” 3. Within 3 sd of mean: about 99.7%, “almost all”

Interpretation of Z-scores Interactive Version (used for above pics) From Publisher’s Website: Statistical Applets Normal Curve

Interpretation of Z-scores Summary: These relations are called the “68-95-99.7 % Rule” HW: 1.86 (a: , b: 234, 298), 1.87

Computation of Normal Areas Classical Approach: Tables. See inside covers of text. Summarizes area computations, because can’t get these areas by calculus (the Normal density has no closed-form antiderivative). Constructed by “computers” (a job description in the early 1900’s!)

Computation of Normal Areas EXCEL Computation: works in terms of “lower areas” E.g. for Area < 1.3 is

Computation of Normal Areas Interactive Version (used for above pic) From Same Publisher’s Website: Statistical Applets Normal Curve

Computation of Normal Areas EXCEL Computation: (of above e.g.) Use NORMDIST Enter parameters x is “cutoff point” Return is Area below x
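
For readers without Excel, the same “lower area” idea can be sketched with Python's standard library. `lower_area` is a hypothetical helper name, and the standard Normal parameters below are an assumption made for illustration; the slide's own mean and SD are not recoverable from the transcript.

```python
from statistics import NormalDist

def lower_area(x, mu, sigma):
    """Area under the N(mu, sigma) density below the cutoff point x."""
    return NormalDist(mu, sigma).cdf(x)

# Assumed for illustration: a standard Normal, N(0, 1), with cutoff x = 1.3.
area = lower_area(1.3, 0.0, 1.0)
print(round(area, 4))

# An upper area comes from the complement: area above x = 1 - area below x.
upper = 1.0 - area
```

The inputs mirror the ones named on the slide: the cutoff point x plus the mean and SD parameters, with the return value being the area below x.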

Computation of Normal Areas Computation of areas over intervals: (use subtraction) Area between a and b = (Area below b) - (Area below a)

Computation of Normal Areas Computation of areas over intervals: (use subtraction for EXCEL too) E.g. Use Excel to check 68-95-99.7 % Rule
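
The subtraction idea can be checked outside Excel as well. The sketch below uses Python's `statistics.NormalDist`; `interval_area` is a hypothetical helper name, and the standard Normal defaults are an assumption chosen so the cutoffs are in sd units.

```python
from statistics import NormalDist

def interval_area(a, b, mu=0.0, sigma=1.0):
    """Area between a and b = (area below b) - (area below a)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Areas within 1, 2 and 3 sd's of the mean, on the N(0, 1) scale.
within = {k: interval_area(-k, k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"within {k} sd: {100 * area:.1f}%")
```

The three areas come out to roughly 68%, 95% and 99.7%.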

Normal Area HW HW (use Excel): (Hint: the % above 130 = 100% - % below 130) 1.99 (see discussion above) Caution: Don’t just “twiddle EXCEL until answer appears”. Understand it!!!

And Now for Something Completely Different A mind blowing video clip: 8 year old Skateboarding Twins: Do they ever miss? You can explore farther… Thanks to Devin Coley for the link

Inverse of Area Function Inverse of Frequencies: “Quantiles” Idea: Given area, find “cutoff” x I.e. for Area = 80% This x is the “quantile”

Inverse of Area Function EXCEL Computation of Quantiles: Use NORMINV Continue Class Example: “Probability” is “Area” Enter mean and SD parameters

Inverse Area Example When a machine works normally, it fills bottles with mean = 25 oz, and SD = 0.2 oz. The machine is “out of control” when it overfills. Choose an “alarm level”, which will give only 1 % false alarms. Want: cutoff, x, so that Area above = 1% Note: Area below = 100% - Area above = 99%
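
A minimal sketch of this computation in Python, as an analogue of the NORMINV step (the slides themselves use Excel). The mean and SD are the ones given above; the variable names are illustrative.

```python
from statistics import NormalDist

# Fill amounts when the machine works normally: mean 25 oz, SD 0.2 oz.
machine = NormalDist(mu=25.0, sigma=0.2)

# Want area above the cutoff = 1%, so area below = 99%.
alarm_level = machine.inv_cdf(0.99)
print(round(alarm_level, 3))

# Sanity check: the area above the alarm level should be 1%.
false_alarm_rate = 1.0 - machine.cdf(alarm_level)
```

The cutoff lands a little over 2.3 SDs above the mean, i.e. just under 25.5 oz.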

Inverse Area HW 1.95, 1.101, 1.107, a (-0.674, 0.674) (4.3%)

Normal Diagnostic When is the Normal Model “good”? Useful Graphical Device: Q-Q plot = Normal Quantile Plot Idea: look at plot which is approximately linear for data from Normal Model

Normal Quantile Plot Approach, for data x_1, ..., x_n: 1. Sort data 2. Compute “Theoretical Proportions”: 3. Compute “Theoretical Z-scores” 4. Plot Sorted Data (Y-axis) vs. Theoretical Z-scores (X-axis)
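
The four steps can be sketched as follows. The formula for the theoretical proportions did not survive in this transcript, so the sketch assumes the common choice (i - 0.5)/n; the course may have used a different convention, and the function name and data are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (theoretical z-score, sorted data value) pairs for a Q-Q plot."""
    n = len(data)
    ys = sorted(data)                                  # step 1: sort data
    # Step 2: theoretical proportions. ASSUMPTION: (i - 0.5) / n.
    props = [(i - 0.5) / n for i in range(1, n + 1)]
    std_normal = NormalDist()                          # N(0, 1)
    zs = [std_normal.inv_cdf(p) for p in props]        # step 3: theoretical z-scores
    return list(zip(zs, ys))                           # step 4: x = z-score, y = data

# Hypothetical data, invented for illustration.
points = normal_quantile_points([4.1, 5.3, 3.8, 6.0, 4.9])
for z, y in points:
    print(f"{z:+.3f}  {y}")
```

Plotting the pairs as an XY scatter gives the normal quantile plot; data from a Normal model should fall close to a straight line.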

Normal Quantile Plot Several Examples: Show how to compute in Excel Steps as above

Normal Quantile Plot Main Lessons: Melbourne Winter Temperature Data –Gaussian is good, so looks ~ linear –So OK to use normal model for these data –Adding trendline helps in assessing linearity

Normal Quantile Plot Main Lessons: Intro Stat Course Exam Scores Data –Skewed distributions ⇒ nonlinearity –Outliers show up clearly –Normal model unreliable here Combined plot highlights –Mean = Y-intercept –Standard Deviation = Slope
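
The “Mean = Y-intercept, Standard Deviation = Slope” reading of the trendline can be illustrated with idealized data. In this sketch the data are constructed to sit exactly at the quantiles of a Normal with mean 70 and SD 10 (values chosen for illustration, not taken from the course data), and the (i - 0.5)/n proportions are an assumption.

```python
from statistics import NormalDist, mean

def fit_line(xs, ys):
    """Ordinary least-squares line y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

n = 50
std_normal = NormalDist()
# Theoretical z-scores, using the assumed proportions (i - 0.5) / n.
zs = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Idealized "sorted data": exactly mean + SD * z for each theoretical z-score.
ys = [70.0 + 10.0 * z for z in zs]

slope, intercept = fit_line(zs, ys)
print(round(intercept, 3), round(slope, 3))
```

With real data the points scatter around the line, so the fitted intercept and slope only approximate the sample mean and s.d.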

Normal Quantile Plot Main Lessons: Simulated Bimodal Data –Curve is flat near modes –Roughly linear near peaks –Corresponds to two normal subpopulations –Goes up fast at the valley

Normal Quantile Plot Homework:

And now for something completely different Recall Distribution of majors of students in this course:

And now for something completely different How about a biology joke? A seventh grade Biology teacher arranged a demonstration for his class. He took two earthworms and in front of the class he did the following: He dropped the first worm into a beaker of water where it dropped to the bottom and wriggled about. He dropped the second worm into a beaker of ethyl alcohol and it immediately shriveled up and died. He asked the class if anyone knew what this demonstration was intended to show them.

And now for something completely different A boy in the second row immediately shot his arm up and, when called on, said: "You're showing us that if you drink alcohol, you won't have worms."

Variable Relationships Chapter 2 in Text Idea: Look beyond single quantities, to how quantities relate to each other. E.g. How do HW scores “relate” to Exam scores? Section 2.1: Useful graphical device: Scatterplot

Plotting Bivariate Data Toy Example: (1,2) (3,1) (-1,0) (2,-1)

Plotting Bivariate Data Sometimes: Can see more insightful patterns by connecting points

Plotting Bivariate Data Sometimes: Useful to switch off points, and only look at lines/curves

Plotting Bivariate Data Common Name: “Scatterplot” A look under the hood: EXCEL: Chart Wizard (colored bar icon) Chart Type: XY (scatter) Subtype controls points only, or lines Later steps similar to above (can massage the pic!)

Scatterplot E.g. Data from related Intro. Stat. Class (actual scores) A. How does HW score predict Final Exam? x = HW, y = Final Exam i. In top half of HW scores: Better HW ⇒ Better Final ii. For lower HW: Final is much more “random”

Scatterplots Common Terminology: When thinking about “X causes Y”, Call X the “Explanatory Var.” or “Indep. Var.” Call Y the “Response Var.” or “Dep. Var.” (think of “Y as function of X”) (although not always sensible)

Scatterplots Note: Sometimes think about causation, Other times: “Explore Relationship” HW: 2.1

Class Scores Scatterplots B. How does HW predict Midterm 1? x = HW, y = MT1 i. Still better HW ⇒ better Exam ii. But for each HW, wider range of MT1 scores iii. I.e. HW doesn’t predict MT1 as well as Final iv. “Outliers” in scatterplot may not be outliers in either individual variable e.g. HW = 72, MT1 = 94 (bad HW, but good MT1?, fluke???)

Class Scores Scatterplots C. How does MT1 predict MT2? x = MT1, y = MT2 i. Idea: less “causation”, more “exploration” ii. Still higher MT1 associated with higher MT2 iii. For each MT1, wider range of MT2, i.e. “not good predictor” iv. Interesting Outliers: MT1 = 100, MT2 = 56 (oops!) MT1 = 23, MT2 = 74 (woke up!)

Important Aspects of Relations I. Form of Relationship II. Direction of Relationship III. Strength of Relationship

I. Form of Relationship Linear: Data approximately follow a line. Previous Class Scores Example: Final vs. high values of HW is the “best” case. Nonlinear: Data follow a different pattern. Nice Example: Bralower’s Fossil Data

Bralower’s Fossil Data From T. Bralower, formerly of Geological Sci. Studies Global Climate, millions of years ago: Ratios of Isotopes of Strontium Reflects Ice Ages, via Sea Level (50 meter difference!) As function of time Clearly nonlinear relationship

II. Direction of Relationship Positive Association: X bigger ⇒ Y bigger. Negative Association: X bigger ⇒ Y smaller. E.g. X = alcohol consumption, Y = Driving Ability: Clear negative association

III. Strength of Relationship Idea: How close are points to lying on a line? Revisit Class Scores Example: Final Exam is “closely related to HW”; Midterm 1 less closely related to HW; Midterm 2 even less closely related to Midterm 1