Social Science Statistics Module I Gwilym Pryce

Slides:



Advertisements
Similar presentations
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution and Other Continuous Distributions.
Advertisements

Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution Business Statistics: A First Course 5 th.
The Normal Distribution
1 Lecture 2 Calculating z Scores Quantitative Methods Module I Gwilym Pryce
© Copyright McGraw-Hill CHAPTER 6 The Normal Distribution.
1 Lecture 1 Density curves and the CLT Slides available from Statistics & SPSS page of Social Science Statistics Module I.
Chap 6-1 Copyright ©2013 Pearson Education, Inc. publishing as Prentice Hall Chapter 6 The Normal Distribution Business Statistics: A First Course 6 th.
Chapter 6 The Normal Probability Distribution
Section 7.1 The STANDARD NORMAL CURVE
16-1 Copyright  2010 McGraw-Hill Australia Pty Ltd PowerPoint slides to accompany Croucher, Introductory Mathematics and Statistics, 5e Chapter 16 The.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. 1 PROBABILITIES FOR CONTINUOUS RANDOM VARIABLES THE NORMAL DISTRIBUTION CHAPTER 8_B.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution and Other Continuous Distributions.
1 Lecture 3: Introduction to Confidence Intervals Social Science Statistics I Gwilym Pryce
Slide 1 © 2002 McGraw-Hill Australia, PPTs t/a Introductory Mathematics & Statistics for Business 4e by John S. Croucher 1 n Learning Objectives –Identify.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 6 Probability Distributions Section 6.2 Probabilities for Bell-Shaped Distributions.
1 Lecture 1 Density curves and the CLT Quantitative Methods Module I Gwilym Pryce
The Central Limit Theorem © Christine Crisp “Teach A Level Maths” Statistics 1.
Confidence Interval Estimation For statistical inference in decision making:
Chapter 6 The Normal Distribution.  The Normal Distribution  The Standard Normal Distribution  Applications of Normal Distributions  Sampling Distributions.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 6-1 The Normal Distribution.
1 Lecture 5 Introduction to Hypothesis Tests Slides available from Statistics & SPSS page of Social Science Statistics Module.
Chap 6-1 Chapter 6 The Normal Distribution Statistics for Managers.
The Normal Approximation for Data. History The normal curve was discovered by Abraham de Moivre around Around 1870, the Belgian mathematician Adolph.
THE NORMAL DISTRIBUTION
Normal Distribution 1. Objectives  Learning Objective - To understand the topic on Normal Distribution and its importance in different disciplines. 
Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution Business Statistics, A First Course 4 th.
McGraw-Hill Ryerson Copyright © 2011 McGraw-Hill Ryerson Limited. Adapted by Peter Au, George Brown College.
Chapter 6 The Normal Distribution and Other Continuous Distributions
13-5 The Normal Distribution
MATB344 Applied Statistics
Sampling Distributions
NORMAL DISTRIBUTION.
Normal Distribution and Parameter Estimation
Normal Distribution.
The Normal Distribution
Chapter 5 The Normal Curve.
STAT 206: Chapter 6 Normal Distribution.
Chapter 12 Statistics 2012 Pearson Education, Inc.
Descriptive Statistics: Presenting and Describing Data
Chapter 7 Sampling Distributions.
Summary Statistics 9/23/2018 Summary Statistics
CHAPTER 3: The Normal Distributions
Density Curves and Normal Distribution
The Normal Probability Distribution
The normal distribution
Chapter 7 Sampling Distributions.
Econ 3790: Business and Economics Statistics
5.4 Finding Probabilities for a Normal Distribution
Normal Probability Distributions
11-3 Use Normal Distributions
The Standard Normal Distribution
Normal Probability Distributions
Chapter 7 Sampling Distributions.
Lecture 5 Introduction to Hypothesis tests
Chapter 6 Introduction to Continuous Probability Distributions
Statistics for Managers Using Microsoft® Excel 5th Edition
Click the mouse button or press the Space Bar to display the answers.
Chapter 7 Sampling Distributions.
CHAPTER 12 Statistics.
CHAPTER 3: The Normal Distributions
The Normal Curve Section 7.1 & 7.2.
Section 13.6 The Normal Curve
Chapter 5 Normal Probability Distributions.
Chapter 7 Sampling Distributions.
CHAPTER 3: The Normal Distributions
The Normal Distribution
Chapter 12 Statistics.
Presentation transcript:

Social Science Statistics Module I Gwilym Pryce Lecture 2 Calculating z-Scores Slides available from Statistics & SPSS page of www.gpryce.com

Class Reps and Staff Student committee Notices: Register Class Reps and Staff Student committee

Introduction & Overview:

Introduction: We have looked at the characteristics of density functions, & one that particularly interests us, the normal distribution Though we have already looked briefly at the standard normal curve, today we shall look in depth at the practicalities of calculating z scores and using them to work out probabilities.

Aim Objectives by the end of this lecture students should be able to: Aims & Objectives Aim To consider the practicalities of the standard normal curve Objectives by the end of this lecture students should be able to: Work out probabilities associated with z scores Work out zi from given probabilities Derive zi and associated probability from given values of a normally distributed variable x Apply zi scores to sampling distributions

Review of last week’s lecture 1. Find probabilities from zi Plan of L2: Review of last week’s lecture 1. Find probabilities from zi Tables SPSS 2. Find zi from a given probability z that bounds upper or lower tail area ± z that bounds central area 3. Find zi & probabilitiy from xi ~N(m,s) 4. Applying z scores to sampling distributions

Review of Last week’s Lecture (L1): 1. Measures of Central Tendency & Dispersion 2. Density curves & Symmetrical Distributions 3. Normal Distribution 4. Central Limit Theorem

L1: Density curves: idealised histograms (rescaled so that area sums to one)

Properties of a density curve Vertical axis indicates relative frequency over values of the variable X Entire area under the curve is 1 The density curve can be described by an equation Density curves for theoretical probability models have known properties

Area under density curves: the area under a density curve that lies between two numbers = the proportion of the data that lies between these two numbers: e.g. if area between two numbers x1 and x2 = 0.6, then this means 60% of x lies between x1 and x2 when the density curve is symmetrical, we make use of the fact that areas under the curve will also be symmetrical

“Bell-shaped” density curve particularly important Normal Curves “Bell-shaped” density curve particularly important Has some very useful statistical properties Infinite number of poss. normal distributions but they vary only by mean and S.D. so they are all related -- just scaled versions of each other a baseline normal distribution has been invented: called the standard normal distribution has zero mean and standard deviation = 1

a b c Standardise z zc za zb

we can standardise any observation from a normal distribution Standard Normal Curve we can standardise any observation from a normal distribution I.e. show where it fits on the standard normal distribution: All we need is a simple conversion formula A bit like converting lots of different currencies to a baseline common currency (e.g. the dollar) This “currency conversion” uses a v. simple formula: subtract the mean from each value and dividing the result by the standard deviation. This is called the z-score = standardised value of any normally distributed observation. Where m = population mean s = population S.D.

Areas under the standard normal curve between different z-scores are equal to areas between corresponding values on any normal distribution Tables of areas have been calculated for each z-score, so if you standardise your observation, you can find out the area above or below it. But we saw earlier that areas under density functions correspond to probabilities: so if you standardise your observation, you can find out the probability of other observations lying above or below it.

Distribution of means from repeated samples We have looked at how to calculate the sample mean What distribution of means do we get if we take repeated samples? E.g. 1000 samples of 500 people. In each sample of 500 people you calculate the mean income. What would the histogram of the sample mean incomes look like?

Then suppose we ask a random sample of people what their income is. This sample will probably have a similar distribution of income as the population Positive skew: mean is “pulled-up” by the incomes of fat-cat, bourgeois capitalists. Since the median is a “resistant measure”, the mean is greater than the median Then suppose we take a second sample, and then a third; and then compute the mean income of each sample: Sample 1: mean income = £20,500 Sample 2: mean income = £18,006 Sample 3: mean income = £21,230

Why the normal distribution is useful: Even if a variable is not normally distributed, its sampling distribution of means will be normally distributed, provided n is large (I.e. > 30) I.e. some samples will have a mean that is way out of line from population mean, but most will be reasonably close. “Central Limit Theorem”

Review of last week’s lecture 1. Find probabilities from zi Plan Review of last week’s lecture 1. Find probabilities from zi Tables SPSS macro command 2. Find zi from a given probability z that bounds upper or lower tail area ± z that bounds central area 3. Find zi & probabilitiy from xi ~N(m,s) 4. Applying z scores to sampling distributions

Find probabilities from zi Lecture 2: Calculating z-Scores Find probabilities from zi 1.1 Using Published tables Most stats books have z-score tables which allow you to find Prob(z< zi ) Integer and first decimal listed in the row heading Second decimal listed in the column heading Probability value is located at the intersection of this row and column Or sometimes published tables list: Prob(0<z< zi ) Prob(z< zi < 0) Symmetry of the normal curve means that its easy to find any probability from any of these.

Then look-up the first decimal place in the row heading e.g. Prob(z < 1.36) First Draw diagram Then look-up the first decimal place in the row heading Then find the second decimal place in the column heading

Prob(z < 1.36)

 

Using the macro commands: After drawing the diagram: 1st open a new data sheet & enter 0 in first cell 2nd Macros 3rd Open syntax window 4th type in command  pz_lt_zi (1.36) . calculates the probability that z is less than 1.36 & will result in the following output: Prob(z < zi) for a given zi ZI PROB 1.36000 .91309  

Prob(z < 1.36)

pz_gt_zi calculates the probability that z is greater than zi: e.g. pz_gt_zi (-2.897) . will result in the following output: Prob(z > zi) for a given zi ZI PROB -2.89700 .99812 which says that 99.812% of z lie above –2.897. Q/ how would you find this probability using the table?

e.g. pz_lg_zi zil=(-2) ziu=(2). will result in the following output: pz_lg_zi calculates the probability that z is less than ziL or greater than ziU  e.g. pz_lg_zi zil=(-2) ziu=(2). will result in the following output: Prob((z < ziL) OR (z > ziU)) for a given zi ZIL ZIU PROB -2.00000 2.00000 .04550 which can be interpreted as telling us that just 4.55% of z lie outside of the range –2 to 2. Q/ How would you find the probability that z lies above 2 and/or below -2 using the tables?

e.g. pz_gl_zi zil=(-2) ziu=(2). results in the following output: pz_gl_zi calculates the probability that z is greater than ziL AND less than ziU  e.g. pz_gl_zi zil=(-2) ziu=(2). results in the following output: Prob(ziL < z < ziU)) for a given zi ZIL ZIU PROB -2.00000 2.00000 .95450 which tells us that 95.45% of z lie in the range –2 to 2. Q/ How would you find the proportion of z that lies between -2 and 2?

Review of last week’s lecture 1. Find probabilities from zi Plan Review of last week’s lecture 1. Find probabilities from zi Tables SPSS macro command 2. Find zi from a given probability z that bounds upper or lower tail area ± z that bounds central area 3. Find zi & probabilitiy from xi ~N(m,s) 4. Applying z scores to sampling distributions

2. Find zi from a given probability 2.1 Using tables: You can look up the areas in the body of the table and find the z value that bounds that area: You must be careful to restate your problem in a way that fits with the probabilities reported in the table however

E.g. Find zi s.t. Prob(z < zi) = 0.06 This is a small area in the left hand tail so zi is going to be negative Q/ How can we find the value of z even though table only gives positive values of z?

Area = 0.06 z

E.g. Find zi s.t. Prob(z < zi) = 0.06 Table only gives positive values of z. But because the normal distribution is symmetrical, we can look at the upper tail of the same area & know that the z value will be of the same absolute value. I.e. Find zi s.t. Prob(z > zi) = 0.94 The tables only give us the Prob(z < zi) But we can simply switch the sign on zi: i.e. -zi s.t. Prob(z > -zi) = +zi s.t. Prob(z < +zi)

Area = 0.94 z

2.2 Use zi_lt_zp and zi_gt_zp Macros: zi_lt_zp p = (0.06).   Value of zi such that Prob(z < zi) = PROB when PROB is given ZI PROB -1.55477 .06000

zi_gt_zp p = (0.06). ValuValue of zi such that Prob(z > zi) = PROB when PROB is given ZI PROB 1.55477 .06000

2.3 ± z that bounds central area Find the value of zi such that Prob(-zi < z < zi) = 0.99 Q/ How would you do this using tables?

First find half of the central area: Using Tables: First find half of the central area: Area of half of central area = 0.99 / 2 = 0.495 Then add 0.5 = 0.495 + 0.5 = 0.995 Then find z value associated with that area: Look up 0.995 in the body of the table z = 2.575

zi_gl_zp p=(0.99) .   Value of zi such that Prob(-zi < z < zi) = PROB, when PROB is given ZIL ZIU PROB -2.57583 2.57583 .99000 which tells you that the central 99% of z values are bounded by + and – 2.576

Review of last week’s lecture 1. Find probabilities from zi Plan Review of last week’s lecture 1. Find probabilities from zi Tables SPSS macro command 2. Find zi from a given probability z that bounds upper or lower tail area ± z that bounds central area 3. Find zi & probabilitiy from xi ~N(m,s) 4. Applying z scores to sampling distributions

3. Find zi & probabilitiy from xi ~N(m,s) For a particular value xi of a normally distributed variable x, we can calculate the standardised normal value, zi, associated with it by subtracting the population mean, and dividing by the population standard deviation,

So, we can standardise any value of x provided we know the population mean and population standard error of the mean. And once you have standardised a value (i.e. converted it to a z-score), then you can use it to calculate probabilities under the standard normal curve knowing that these probabilities correspond to probabilities under the original distribution of x.

You know that the height of all 18 year old males is normally distributed with a mean of 1.8m and a standard deviation of 1.2m. What proportion of 18year olds are < 2m tall? xi = height of 18 year olds = 2 m = population mean of x = mean height of all 18yr olds = 1.8 s = population standard deviation of x. = 1.2

That is, 56.62% of 18year old males are less than 2m tall. Because height is normally distributed, we know that: Prob(height < 2m) = Prob(z < zi) where zi is the standardised value for x = 2. First we need to calculate zi: Now that we have calculated zi = 0.1667, we can calculate Prob(height < 2m)= Prob(z < 0.1667) Using the pz_lt_zi macro, we get: pz_lt_zi (0.1667). Prob(z < zi) for a given zi ZI PROB .16670 .56620 That is, 56.62% of 18year old males are less than 2m tall.

Review of last week’s lecture 1. Find probabilities from zi Plan Review of last week’s lecture 1. Find probabilities from zi Tables SPSS macro command 2. Find zi from a given probability z that bounds upper or lower tail area ± z that bounds central area 3. Find zi & probabilitiy from xi ~N(m,s) 4. Applying z scores to sampling distributions

4. Applying z scores to sampling distributions “nature’s questionable tendency to normalcy”  limited use of z scores … if it were not for the CLT: Sampling distributions of means are always normally distributed provided n is large.  following formula for z:

If the sample mean inside leg of gerbils is 2 If the sample mean inside leg of gerbils is 2.7cm, the population mean is 3cm and the standard error of the mean is 4, what is the z-score for the sample mean? What proportion of all possible large samples of gerbils have inside leg measurements of less than 2.7cm?

Prob(sample mean < 2.7) = Prob(z < -0.075) zi_lt_zp p = (-0.075). Prob(z < zi) for a given zi ZI PROB -.07500 .47011 That is, 47.01% of all possible sample mean inside leg lengths are less than 2.7cm.

So the CLT + z score allow us to say something about sample means.

1. How to find probabilities from zi Summary: 1. How to find probabilities from zi Tables SPSS 2. How to find zi from a given probability z that bounds upper or lower tail area ± z that bounds central area 3. How to find zi & probabilitiy from xi ~N(m,s) 4. How to z scores can be applied to sampling distributions

M&M section 1.3 and chapter 5. Reading: Pryce, chapter 3 M&M section 1.3 and chapter 5.