Statistics and Data Analysis


Statistics and Data Analysis Professor William Greene Stern School of Business IOMS Department Department of Economics

Statistics and Data Analysis Part 9 – The Normal Distribution

The Normal Distribution Continuous Distributions are models Application – The Exponential Model Computing Probabilities for Continuous Variables Normal Distribution Model Normal Probabilities Reading the Normal Table and Computing Probabilities Applications

The Normal Distribution The most useful distribution in all branches of statistics and econometrics. Strikingly accurate model for elements of human behavior and interaction. Strikingly accurate model for any random outcome that comes about as a sum of small influences.

Applications Biological measurements of all sorts (not just human mental and physical). Accumulated errors in experiments. Numbers of events accumulated in time: amount of rainfall per interval; number of stock orders per (longer) interval (we used the Poisson for short intervals). Economic aggregates of small terms. And on and on...

This is a frequency count of the 1,547,990 scores for students who took the SAT test in 2010.   

The histogram has 181 bars. SAT scores are 600, 610, …, 2390, 2400.

Distribution of 3,226 Birthweights Mean = 3.39kg, Std.Dev.=0.55kg

Continuous Distributions Continuous distributions are models for probabilities of events associated with measurements rather than counts. Continuous distributions do not occur in nature the way that discrete counting rules (e.g., binomial) do. The random variable is a measurement, X. The device is a probability density function, f(x). Probabilities are computed using calculus (and computers).

Application: Light Bulb Lifetimes A box of light bulbs states “Average life is 1500 hours” P[Fails at exactly 1500 hours] is 0.0. Note, this is exactly 1500.000000000…, not 1500.0000000001, … P[Fails in an interval (1000 to 2000)] is provided by the model (as we now develop). The model being used is called the exponential model

Model for Light Bulb Lifetimes This is the exponential model for lifetimes. The function is the exponential density.
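
The density formula was part of the slide image and did not survive in the transcript. Assuming the standard exponential form with mean θ = 1500 hours (the figure printed on the box; the symbol θ is introduced here for illustration), the density and its cumulative probability can be written as follows. This choice reproduces the probabilities .4866 and .7364 quoted on the later slides.

```latex
f(x) = \frac{1}{\theta}\, e^{-x/\theta}, \quad x \ge 0,
\qquad
F(x) = P[X \le x] = 1 - e^{-x/\theta},
\qquad
\theta = 1500 .
```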

Using the Model for Light Bulb Lifetimes The area under the entire curve is 1.0.

A Continuous Distribution The probability associated with an interval such as 1000 < LIFETIME < 2000 equals the area under the curve from the lower limit to the upper. Requires calculus. A partial area will be between 0.0 and 1.0, and will produce a probability. (here, 0.2498)

The Probability of a Single Value Is Zero The probability associated with a single point, such as LIFETIME=2000, equals 0.0.

Probability for a Range of Values Prob(Life < 2000) (.7364) Minus Prob(Life < 1000) (.4866) Equals Prob(1000 < Life < 2000) (.2498) The probability associated with an interval such as 1000 < LIFETIME < 2000 is obtained by computing the entire area to the left of the upper point (2000) and subtracting the area to the left of the lower point (1000).
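
The subtraction above can be checked with a few lines of Python. This is a minimal sketch, assuming the exponential model with mean 1500 hours; the names `THETA` and `exponential_cdf` are illustrative and not from the slides.

```python
import math

THETA = 1500.0  # assumed mean lifetime in hours (the figure printed on the box)

def exponential_cdf(x, theta=THETA):
    """P[Lifetime <= x] for an exponential model with mean theta."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x / theta)

p_below_2000 = exponential_cdf(2000)  # area to the left of the upper point
p_below_1000 = exponential_cdf(1000)  # area to the left of the lower point
p_between = p_below_2000 - p_below_1000

print(f"{p_below_2000:.4f} - {p_below_1000:.4f} = {p_between:.4f}")
# prints: 0.7364 - 0.4866 = 0.2498
```

Only the cumulative probability is needed, which is the same workflow as with a program that reports areas from zero up to a value.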

Computing a Probability with Minitab Minitab cannot compute the probability in a range, only from zero to a value.

Applications of the Exponential Model Time between signals arriving at a switch (telephone, message center, email server, paging switch…) (This is called the “interarrival time.”) Length of survival of transplant patients. (Survival time) Lengths of spells of unemployment Time until failure of electronic components Time until consumers use a product warranty Lifetimes of light bulbs

Lightbulb Lifetimes http://www.gelighting.com...

Median Lifetime Prob(Lifetime < Median) = 0.5
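
For the exponential model the median has a closed form: solving 1 - exp(-m/θ) = 0.5 gives m = θ ln 2. A minimal sketch, again assuming a mean of θ = 1500 hours (variable names are illustrative):

```python
import math

theta = 1500.0                 # assumed mean lifetime in hours
median = theta * math.log(2)   # solves 1 - exp(-m / theta) = 0.5
check = 1.0 - math.exp(-median / theta)  # should come back as 0.5

print(f"median = {median:.1f} hours, P[Lifetime < median] = {check:.3f}")
# prints: median = 1039.7 hours, P[Lifetime < median] = 0.500
```

Under this model the median is well below the mean of 1500 hours, because the exponential density has a long right tail.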

The Normal Distribution Normal Distribution Model Normal Probabilities Reading the Normal Table Computing Normal Probabilities Applications

Try a visit to http://www.netmba.com/statistics/distribution/normal/

Shape and Placement Depend on the Application

The Empirical Rule and the Normal Distribution Dark blue is less than one standard deviation from the mean. For the normal distribution, this accounts for about 68% of the set (dark blue) while two standard deviations from the mean (medium and dark blue) account for about 95% and three standard deviations (light, medium, and dark blue) account for about 99.7%.
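
The three percentages can be verified numerically. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the names `Z` and `shares` are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean
shares = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} sd of the mean: {share:.4f}")
# prints 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3
```

Because every normal distribution standardizes to the same Z, these shares hold for any μ and σ.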

Computing Probabilities P[X = a specific value] = 0. (Always) P[a < X < b] = P[X < b] – P[X < a] (Note, for continuous distributions, < and ≤ are the same because of the first point above.)

Textbooks Provide Tables of Areas for the Standard Normal Econometric Analysis, WHG, 2012, Appendix G Note that values are only given for z ranging from 0.00 to 3.99. No values are given for negative z. There is no simple formula for computing areas under the normal density (curve) as there is for the exponential. It is done using computers and approximations.

Computing Probabilities Standard Normal Tables give probabilities when μ = 0 and σ = 1. For other cases, do we need another table? Probabilities for other cases are obtained by “standardizing.” Standardized variable is Z = (X – μ)/ σ Z has mean 0 and standard deviation 1
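
Written out, standardizing turns any normal probability into a standard normal one. In the sketch below, Φ denotes the standard normal cumulative probability P[Z ≤ z]; that symbol is an assumption of this note and is not used on the slides.

```latex
P[a < X < b]
  = P\!\left[\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right]
  = P\!\left[\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right]
  = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)
```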

Standard Normal Density

Only Half of the Table Is Needed The area to left of 0.0 is exactly 0.5.

Only Half of the Table Is Needed The area left of 1.60 is exactly 0.5 plus the area between 0.0 and 1.60.

Areas Left of Negative Z Area left of -1.6 equals area right of +1.6. Area right of +1.6 equals 1 – area to the left of +1.6.

Prob(Z < 1.03) = .8485

Prob(Z > 0.45) = 1 - .6736 = .3264

Prob(Z < -1.36) = Prob(Z > +1.36) = 1 - .9131 = .0869

Prob(Z > -1.78) = Prob(Z < + 1.78) = .9625

Prob(-.5 < Z < 1.15) = Prob(Z < 1.15) – Prob(Z < -.5) = .8749 – (1 - .6915) = .5664

Prob(.18 < Z < 1.67) = Prob(Z < 1.67) – Prob(Z < .18) = .9525 – .5714 = .3811
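
The table lookups on the preceding slides can be reproduced without a table. This is a minimal sketch, assuming Python 3.8 or later; `phi` and `lookups` are illustrative names. Negative z values are handled through the symmetry rule, exactly as with the half table.

```python
from statistics import NormalDist

phi = NormalDist().cdf  # area to the left of z under the standard normal

lookups = {
    "P(Z < 1.03)": phi(1.03),
    "P(Z > 0.45)": 1 - phi(0.45),
    "P(Z < -1.36)": 1 - phi(1.36),           # symmetry: left of -z = right of +z
    "P(Z > -1.78)": phi(1.78),               # symmetry: right of -z = left of +z
    "P(-0.5 < Z < 1.15)": phi(1.15) - (1 - phi(0.5)),
    "P(0.18 < Z < 1.67)": phi(1.67) - phi(0.18),
}

for label, value in lookups.items():
    print(f"{label} = {value:.4f}")
```

The printed values should agree with the four-decimal table entries, up to rounding in the last digit.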

Normal Distributions The scale and location (on the horizontal axis) depend on μ and σ. The shape of the distribution is always the same. (Bell curve)

Computing Normal Probabilities when μ is not 0 and σ is not 1

Some Benchmark Values (You should remember these.) Prob(Z > 1.96) = .025 Prob(|Z| > 1.96) = .05 Prob(|Z| > 2) ~ .05 Prob(Z < 1) = .8413 Prob(Z > 1) = .1587
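
The benchmarks can also be read in the other direction, from a tail probability back to a cut-off. A minimal sketch, assuming Python 3.8 or later; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

z_975 = Z.inv_cdf(0.975)             # cut-off that leaves 2.5% in the upper tail
two_tail_at_2 = 2 * (1 - Z.cdf(2))   # exact value behind the "~ .05" benchmark
upper_tail_at_1 = 1 - Z.cdf(1)       # Prob(Z > 1)

print(f"{z_975:.2f}  {two_tail_at_2:.4f}  {upper_tail_at_1:.4f}")
# prints: 1.96  0.0455  0.1587
```

The middle figure shows why Prob(|Z| > 2) is only approximately .05, while 1.96 gives .05 to the table's precision.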

Computing Probabilities by Standardizing: Example

Computing Normal Probabilities If SAT scores were scaled to have a normal distribution with mean 1500 and standard deviation of 300, what proportion of students would be expected to score between 1350 and 1800?

Modern Computer Programs Make the Tables Unnecessary Now calculate 0.841344 – 0.308538 = 0.532806 Not Minitab
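
As one illustration of doing this without tables, the SAT question can be answered directly on the original scale. This is a minimal sketch using Python's standard library (3.8 or later), not the program shown on the slide; the names `sat` and `p_between` are illustrative.

```python
from statistics import NormalDist

# Hypothetical scaling from the example: mean 1500, standard deviation 300
sat = NormalDist(mu=1500, sigma=300)

p_below_1800 = sat.cdf(1800)   # same as P[Z < 1.0]
p_below_1350 = sat.cdf(1350)   # same as P[Z < -0.5]
p_between = p_below_1800 - p_below_1350

print(f"P[1350 < X < 1800] = {p_between:.4f}")
# prints: P[1350 < X < 1800] = 0.5328
```

No explicit standardizing step is needed because the distribution object carries μ and σ itself. The result matches the slide's figure to four decimals.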

Application of Normal Probabilities Suppose that an automobile muffler is designed so that its lifetime (in months) is approximately normally distributed with mean 26.4 months and standard deviation 3.8 months. The manufacturer has decided to use a marketing strategy in which the muffler is covered by warranty for 18 months. Approximately what proportion of the mufflers will fail while under warranty? Note the correspondence between the probability that a single muffler will die before 18 months and the proportion of the whole population of mufflers that will die before 18 months. We treat these two notions as equivalent. Then, letting X denote the random lifetime of a muffler, P[ X < 18 ] = P[ (X-26.4)/3.8 < (18-26.4)/3.8 ] ≈ P[ Z < -2.21 ] = P[ Z > +2.21 ] = 1 - P[ Z ≤ 2.21 ] = 1 - 0.9864 = 0.0136. (You could get here directly using Minitab.) From the manufacturer’s point of view, there is not much risk in this warranty.
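
The muffler calculation can be traced step by step. This is a minimal sketch, assuming Python 3.8 or later; it compares the table route (z rounded to two decimals) with the unrounded computation, and all variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, warranty = 26.4, 3.8, 18.0   # months, from the example

z = (warranty - mu) / sigma              # standardized warranty length
p_table = NormalDist().cdf(round(z, 2))  # table route: z rounded to -2.21
p_exact = NormalDist(mu, sigma).cdf(warranty)  # no rounding of z

print(f"z = {z:.4f}")
print(f"table route: {p_table:.4f}, unrounded: {p_exact:.4f}")
```

Both routes give a failure share of a little over 1%. Any small gap between them comes only from rounding z before the lookup.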

A Normal Probability Problem The amount of cash demanded in a bank each day is normally distributed with mean $10M (million) and standard deviation $3.5M. If the bank keeps $15M on hand, what is the probability that it will run out of money for the customers? Let $X = the demand. The question asks for the probability that $X will exceed $15M.
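
The transcript does not include the answer to this problem. One way to work it out is sketched below, assuming Python 3.8 or later; amounts are in millions of dollars and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, on_hand = 10.0, 3.5, 15.0   # $ millions, from the problem statement

z = (on_hand - mu) / sigma              # standardized cash on hand
p_run_out = 1 - NormalDist().cdf(z)     # P[X > 15] = P[Z > z]

print(f"z = {z:.2f}, P[demand exceeds $15M] = {p_run_out:.3f}")
```

Under these assumptions z is about 1.43, and the probability of running out on a given day comes to roughly 7 to 8 percent.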

‘Nonnormality’ comes in two forms in observed data Skewness: Size variables such as Assets or Sales of businesses, or income/wealth of individuals. Kurtosis: Essentially ‘thick tails.’ Seemingly outlying observations are more common than would be expected from a normal distribution. Financial data such as exchange rates sometimes behave this way.

Summary Continuous Distributions Normal Distribution Models of reality The density function Computing probabilities as differences of cumulative probabilities Application to light bulb lifetimes Normal Distribution Background Density function depends on μ and σ The empirical rule Standard normal distribution Computing normal probabilities with tables and tools