Review of Basic Statistics

Parameters and Statistics
Parameters are characteristics of populations, and are knowable only by taking a census. Statistics are estimates of parameters made from samples.

Descriptive Statistics Review
Measures of Location
  - The mean
  - The median
  - The mode
Measures of Dispersion
  - The variance
  - The standard deviation

Mean
The mean (or average) is the basic measure of location or "central tendency" of the data. The sample mean $\bar{x}$ is a sample statistic. The population mean $\mu$ is a population parameter.

Sample Mean
$$\bar{x} = \frac{\sum x_i}{n}$$
where the numerator is the sum of the values of the n observations, or:
$$\sum x_i = x_1 + x_2 + \cdots + x_n$$
The Greek letter Σ is the summation sign.

Example: College Class Size
We have the following sample of data for 5 college classes: 46, 54, 42, 46, 32.
We use the notation $x_1$, $x_2$, $x_3$, $x_4$, and $x_5$ to represent the number of students in each of the 5 classes:
$x_1 = 46$, $x_2 = 54$, $x_3 = 42$, $x_4 = 46$, $x_5 = 32$
Thus we have:
$$\bar{x} = \frac{\sum x_i}{n} = \frac{46 + 54 + 42 + 46 + 32}{5} = \frac{220}{5} = 44$$
The average class size is 44 students.
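The arithmetic in this example can be checked with a few lines of code. The following is a minimal sketch in Python; the variable names are illustrative and not part of the original slides.

```python
# Class sizes from the example above.
class_sizes = [46, 54, 42, 46, 32]

n = len(class_sizes)       # number of observations in the sample
total = sum(class_sizes)   # numerator of the sample mean: the sum of the x_i
sample_mean = total / n    # x-bar = (sum of x_i) / n

print(f"sum = {total}, n = {n}, mean = {sample_mean}")
```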

Population Mean ($\mu$)
$$\mu = \frac{\sum x_i}{N}$$
The number of observations in the population is denoted by the upper case N. The sample mean $\bar{x}$ is a point estimator of the population mean $\mu$.

Median
The median is the value in the middle when the data are arranged in ascending order (from smallest value to largest value).
a. For an odd number of observations, the median is the middle value.
b. For an even number of observations, the median is the average of the two middle values.

The College Class Size Example
First, arrange the data in ascending order: 32, 42, 46, 46, 54. Notice that n = 5, an odd number. Thus the median is given by the middle value. The median class size is 46.
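The odd and even rules for the median can be sketched as a small Python function. The function name `median` and the four-value sample are illustrative assumptions; only the five class sizes come from the example.

```python
def median(values):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(values)   # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average of the middle two

class_sizes = [46, 54, 42, 46, 32]
class_median = median(class_sizes)       # ordered data: 32, 42, 46, 46, 54

# Hypothetical even-sized sample, used only to exercise the second rule.
even_median = median([32, 42, 46, 54])   # average of 42 and 46

print(class_median, even_median)
```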

Median Starting Salary for a Sample of 12 Business School Graduates
A college placement office has obtained the following data for 12 recent graduates:
[Table: Graduate, Starting Salary]

First we arrange the data in ascending order. Notice that n = 12, an even number. Thus we take the average of the middle two observations, that is, the 6th and 7th values in the ordered list.

Mode
The mode is the value that occurs with the greatest frequency.

Soft Drink Example
Soft Drink      Frequency
Coke Classic    19
Diet Coke        8
Dr. Pepper       5
Pepsi Cola      13
Sprite           5
Total           50

The mode is Coke Classic. A mean or median is meaningless for qualitative data.
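Finding the mode of a frequency table amounts to picking the category with the largest count. A minimal Python sketch using the soft drink frequencies above (the dictionary layout and variable names are illustrative):

```python
# Frequency table from the soft drink example.
frequencies = {
    "Coke Classic": 19,
    "Diet Coke": 8,
    "Dr. Pepper": 5,
    "Pepsi Cola": 13,
    "Sprite": 5,
}

total = sum(frequencies.values())              # should match the table total
mode = max(frequencies, key=frequencies.get)   # category with the highest frequency

print(f"total = {total}, mode = {mode}")
```

Note that the categories are labels, not numbers, so no mean or median is computed here.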

Using Excel to Compute the Mean, Median, and Mode
Enter the data into cells A1:B13 for the starting salary example.
To compute the mean, activate an empty cell, enter the following in the formula bar, and click the green checkmark: =AVERAGE(B2:B13)
To compute the median, activate an empty cell, enter the following in the formula bar, and click the green checkmark: =MEDIAN(B2:B13)
To compute the mode, activate an empty cell, enter the following in the formula bar, and click the green checkmark: =MODE(B2:B13)

The Starting Salary Example
Mean: 2940
Median: 2905
Mode: 2880
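The same three measures can also be computed outside Excel. The sketch below uses Python's standard `statistics` module and, as an assumption made for illustration, applies it to the class-size sample from the earlier example rather than to the salary data.

```python
import statistics

# Class sizes from the earlier example (not the starting salary data).
class_sizes = [46, 54, 42, 46, 32]

mean_value = statistics.mean(class_sizes)      # counterpart of Excel's AVERAGE
median_value = statistics.median(class_sizes)  # counterpart of Excel's MEDIAN
mode_value = statistics.mode(class_sizes)      # counterpart of Excel's MODE

print(mean_value, median_value, mode_value)
```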

Variance
The variance is a measure of variability that uses all the data. The variance is based on the difference between each observation ($x_i$) and the mean ($\bar{x}$ for the sample and $\mu$ for the population).

The variance is the average of the squared differences between the observations and the mean value.
For the population:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
For the sample:
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Standard Deviation
The standard deviation of a data set is the square root of the variance. The standard deviation is measured in the same units as the data, making it easy to interpret.

Computing a standard deviation
For the population:
$$\sigma = \sqrt{\sigma^2}$$
For the sample:
$$s = \sqrt{s^2}$$
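As a worked illustration, the variance and standard deviation can be computed for the class-size sample used earlier. This is a sketch, not part of the original slides. It divides the sum of squared deviations by n − 1 for the sample (the usual convention) and by N when the same five values are treated, purely for comparison, as a whole population.

```python
import math

class_sizes = [46, 54, 42, 46, 32]
n = len(class_sizes)
mean = sum(class_sizes) / n                    # 44.0

# Squared difference between each observation and the mean.
squared_deviations = [(x - mean) ** 2 for x in class_sizes]
sum_of_squares = sum(squared_deviations)       # 4 + 100 + 4 + 4 + 144

sample_variance = sum_of_squares / (n - 1)     # s^2, divides by n - 1
sample_sd = math.sqrt(sample_variance)         # s, in students (same units as the data)

# Treating the five classes as the entire population instead (assumption for comparison).
population_variance = sum_of_squares / n       # sigma^2, divides by N
population_sd = math.sqrt(population_variance)

print(sample_variance, sample_sd, population_variance, round(population_sd, 4))
```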

Measures of Association Between Two Variables
  - Covariance
  - Correlation coefficient

Covariance
Covariance is a measure of linear association between variables. Positive values indicate a positive correlation between variables. Negative values indicate a negative correlation between variables.

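To compute a covariance for variables x and y
For populations:
$$\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$
For samples:
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$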
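A minimal Python sketch of the sample covariance follows. The data are hypothetical and chosen only so that y falls as x rises; the function name `sample_covariance` is likewise an illustrative assumption.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x_i - x_bar)(y_i - y_bar), divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross_products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    return sum(cross_products) / (n - 1)

# Hypothetical paired observations: y decreases as x increases.
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]

s_xy = sample_covariance(x, y)
print(s_xy)   # negative, reflecting the negative linear association
```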

[Figure: scatter plot of n = 299 sample points, divided into quadrants I, II, III, and IV]

If the majority of the sample points are located in quadrants II and IV, you have a negative correlation between the variables, as we do in this case. Thus the covariance will have a negative sign.

The (Pearson) Correlation Coefficient
A covariance will tell you whether 2 variables are positively or negatively correlated, but it will not tell you the degree of correlation. Moreover, the covariance is sensitive to the unit of measurement. The correlation coefficient does not suffer from these defects.

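The (Pearson) Correlation Coefficient
For populations:
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
For samples:
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
Note that:
$$-1 \le r_{xy} \le 1$$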
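The sensitivity of the covariance to units, and the insensitivity of the correlation coefficient, can be illustrated with a short Python sketch. The data and helper names are hypothetical; rescaling x by 100 stands in for a change of units such as metres to centimetres.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 divisor."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """Pearson r: covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))   # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_rescaled = [100 * value for value in x]        # same measurements in smaller units

cov_original = sample_covariance(x, y)
cov_rescaled = sample_covariance(x_rescaled, y)  # grows by the factor 100
r_original = sample_correlation(x, y)
r_rescaled = sample_correlation(x_rescaled, y)   # unchanged by the rescaling

print(cov_original, cov_rescaled, round(r_original, 4), round(r_rescaled, 4))
```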

I have 7 hours per week for exercise

Normal Probability Distribution
The normal distribution is by far the most important distribution for continuous random variables. It is widely used for making statistical inferences in both the natural and social sciences.

Normal Probability Distribution
It has been used in a wide variety of applications:
  - Heights of people
  - Scientific measurements
  - Amounts of rainfall
  - Test scores

The Normal Distribution
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$$
Where:
μ is the mean
σ is the standard deviation
π = 3.14159
e = 2.71828
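The density formula translates directly into code. A minimal Python sketch follows; the function name `normal_pdf` is an illustrative assumption, and the standard library's `math.pi` and `math.exp` supply π and e.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Evaluate the curve with mean 0 and standard deviation 1.
peak = normal_pdf(0.0, 0.0, 1.0)     # height at the mean: 1 / sqrt(2*pi)
left = normal_pdf(-1.0, 0.0, 1.0)    # one standard deviation below the mean
right = normal_pdf(1.0, 0.0, 1.0)    # one standard deviation above the mean

print(round(peak, 4), round(left, 4), round(right, 4))
```

The equal heights at −1 and +1 reflect the symmetry of the curve about its mean.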

Normal Probability Distribution: Characteristics
- The distribution is symmetric and bell-shaped.
- The entire family of normal probability distributions is defined by its mean $\mu$ and its standard deviation $\sigma$.
- The highest point on the normal curve is at the mean, which is also the median and mode.
- The mean can be any numerical value: negative, zero, or positive.
- The standard deviation determines the width of the curve: larger values result in wider, flatter curves (for example, $\sigma = 25$ gives a wider, flatter curve than $\sigma = 15$).
- Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right).
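Several of these characteristics can be checked numerically. The sketch below uses `statistics.NormalDist` from Python's standard library (available since Python 3.8). The common mean of 0 is an assumption, while the two standard deviations, 15 and 25, are the values shown on the slide.

```python
from statistics import NormalDist

mu = 0.0                                # assumed mean, chosen only for illustration
narrow = NormalDist(mu=mu, sigma=15)    # smaller standard deviation
wide = NormalDist(mu=mu, sigma=25)      # larger standard deviation

# A larger sigma gives a flatter curve: the peak at the mean is lower.
narrow_peak = narrow.pdf(mu)
wide_peak = wide.pdf(mu)

# Half of the total area lies to the left of the mean, whatever sigma is.
area_left_narrow = narrow.cdf(mu)
area_left_wide = wide.cdf(mu)

# The curve is symmetric: equal heights at equal distances from the mean.
symmetric = abs(wide.pdf(mu - 10) - wide.pdf(mu + 10)) < 1e-15

print(round(narrow_peak, 4), round(wide_peak, 4), area_left_narrow, area_left_wide, symmetric)
```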

The Standard Normal Distribution
The Standard Normal Distribution is a normal distribution with the special properties that its mean is zero and its standard deviation is one.

Standard Normal Probability Distribution
The letter z is used to designate the standard normal random variable.

Cumulative Probability
The probability that z ≤ 1 is the area under the curve to the left of 1.

What is P(z ≤ 1)?
To find out, use the Cumulative Probabilities Table for the Standard Normal Distribution.
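When a printed table is not at hand, the cumulative probability can be computed from the error function, using the identity Φ(z) = ½[1 + erf(z/√2)]. This is a sketch of an alternative to the table lookup, not the method used in the slides; the function name is illustrative.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_z_le_1 = standard_normal_cdf(1.0)
print(round(p_z_le_1, 4))   # table entry for z = 1.00, to four decimals
```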

Area under the curve
About 68.27% of the total area under the curve is within (±) 1 standard deviation of the mean. About 95.45% of the area under the curve is within (±) 2 standard deviations of the mean.
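These percentages hold for every normal distribution, not only the standard normal. A minimal sketch with `statistics.NormalDist`, using a hypothetical mean of 100 and standard deviation of 15 chosen only for illustration:

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical parameters, not taken from the slides
dist = NormalDist(mu=mu, sigma=sigma)

def area_within(k):
    """Area under the curve within k standard deviations of the mean."""
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

within_one = area_within(1)
within_two = area_within(2)

print(f"{within_one:.2%} within 1 sd, {within_two:.2%} within 2 sd")
```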

Exercise
a) What is P(z ≤ 2.46)?
b) What is P(z > 2.46)?
Answer:
a) .9931
b) 1 − .9931 = .0069

Exercise
a) What is P(z ≤ −1.29)?
b) What is P(z > −1.29)?
Answer:
a) 1 − .9015 = .0985
b) .9015
Note that, because of the symmetry, the area to the left of −1.29 is the same as the area to the right of 1.29 (the red-shaded area is equal to the green-shaded area). Note that: P(z > −1.29) = P(z ≤ 1.29) = .9015.

Exercise 3
What is P(.00 ≤ z ≤ 1.00)?
P(.00 ≤ z ≤ 1.00) = .3413
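The answers to the three exercises can be reproduced without a table. This sketch uses `statistics.NormalDist()` with its default parameters (mean 0, standard deviation 1); the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, standard deviation 1

# First exercise: P(z <= 2.46) and its complement.
p_le_246 = z.cdf(2.46)
p_gt_246 = 1.0 - p_le_246

# Second exercise: P(z <= -1.29) and its complement.
p_le_neg129 = z.cdf(-1.29)
p_gt_neg129 = 1.0 - p_le_neg129

# Third exercise: area between 0 and 1.
p_between_0_and_1 = z.cdf(1.0) - z.cdf(0.0)

for label, value in [
    ("P(z <= 2.46)", p_le_246),
    ("P(z > 2.46)", p_gt_246),
    ("P(z <= -1.29)", p_le_neg129),
    ("P(z > -1.29)", p_gt_neg129),
    ("P(0 <= z <= 1)", p_between_0_and_1),
]:
    print(f"{label} = {value:.4f}")
```

By symmetry, `p_gt_neg129` equals `z.cdf(1.29)`, which is the shortcut the second exercise points out.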