B AD 6243: Applied Univariate Statistics Understanding Data and Data Distributions Professor Laku Chidambaram Price College of Business University of Oklahoma.

1 B AD 6243: Applied Univariate Statistics
Understanding Data and Data Distributions
Professor Laku Chidambaram
Price College of Business
University of Oklahoma

2 Summarizing Data

3 Measures of Central Tendency
Mean
–Summarizing sample data (continuous data)
–Estimating the population parameter (µ) from the sample statistic (x̄)
–Arithmetic average (sum of scores/number of scores)
Median
–Midpoint of the distribution
–Can be used in summarizing ordinal data
Mode
–Most frequently occurring value
–Does not consider distribution of all scores
–E.g., more males than females

4 An Example
Take a sample distribution of hourly billing rates for a group of IT workers: 30, 20, 40, 50, 60, 60, 100, 60, 30
Σx = 450
n = 9
Mean = 450/9 = 50
Median: 20, 30, 30, 40, 50, 60, 60, 60, 100 = 50 (the fifth of the nine sorted scores)
Mode = 60
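The figures on this slide can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original slides.

```python
import statistics

# Hourly billing rates from the slide's example
rates = [30, 20, 40, 50, 60, 60, 100, 60, 30]

total = sum(rates)                  # sum of scores
n = len(rates)                      # number of scores
mean = statistics.mean(rates)       # arithmetic average: total / n
median = statistics.median(rates)   # middle value of the sorted scores
mode = statistics.mode(rates)       # most frequently occurring value

print("sum:", total, "n:", n)
print("mean:", mean, "median:", median, "mode:", mode)
```

The results agree with the slide: the sum is 450 over 9 scores, the mean and median are both 50, and the mode is 60.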

5 Measures of Dispersion
Range
–Difference between highest and lowest score
–Does not take into account all scores in distribution
Variance
–Measure of how much scores deviate from the mean on average (the average squared deviation)
–Use sample statistic (s²) to estimate population parameter (σ²)
Standard Deviation
–Square root of the variance
–Measure of consistency of scores

6 Example (contd.)
Table columns: x, x − x̄, (x − x̄)²
Range = 100 − 20 = 80
Σ(x − x̄)² = 4600 (sum of squared differences)
s² = Σ(x − x̄)²/(n − 1) = 4600/8 = 575 (variance)
s = √575 ≈ 24 (standard deviation)
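The dispersion measures for the billing-rate example can be reproduced step by step. The sketch below assumes the same nine rates as the earlier example and uses the sample (n − 1) formula stated on the slide; the variable names are illustrative.

```python
import math
import statistics

# Hourly billing rates from the earlier example slide
rates = [30, 20, 40, 50, 60, 60, 100, 60, 30]

mean = statistics.mean(rates)
value_range = max(rates) - min(rates)               # highest minus lowest score
deviations = [x - mean for x in rates]              # x minus the mean, per score
sum_sq = sum(d ** 2 for d in deviations)            # sum of squared differences
variance = sum_sq / (len(rates) - 1)                # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)                       # sample standard deviation

print("range:", value_range)
print("sum of squares:", sum_sq)
print("variance:", variance)
print("standard deviation:", round(std_dev, 2))
```

The standard library's `statistics.variance` and `statistics.stdev` use the same n − 1 denominator, so they give the same values as the manual calculation.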

7 Frequencies

8 Box Plot

9 Nature of the Distribution
Skewness
–Symmetry of the distribution
–Value is zero in normal distribution
–If skewed positively (pile up of scores on left) or negatively (pile up of scores on right), standardized z scores can be useful
Kurtosis
–Peakedness of the distribution
–Value is zero in normal distribution (when reported as excess kurtosis)
–If positive (peaked) or negative (flat), standardized z scores can be useful
Statistical Tests
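Skewness and kurtosis can be computed from the central moments of the data. The sketch below uses the simple moment-based definitions, which is an assumption: the slides do not say which formula is intended, and statistical packages often report bias-adjusted versions that give slightly different values for small samples. The function names are illustrative.

```python
def central_moment(data, k):
    """Average of (x - mean)**k over all scores."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    """Moment-based skewness: 0 for a symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Billing rates from the earlier example: scores pile up on the left,
# with one high value (100) stretching the right tail
rates = [30, 20, 40, 50, 60, 60, 100, 60, 30]
print("skewness:", round(skewness(rates), 3))
print("excess kurtosis:", round(excess_kurtosis(rates), 3))
```

For the billing rates the skewness is positive, which matches the slide's description of positive skew as a pile-up of scores on the left.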

10 A Summary of Results

11 Histogram with Normal Curve

12 Understanding Data Distributions

13 Data Distributions
A data distribution is a way of representing the frequency of occurrence of values for a variable
Data distributions can be discrete (e.g., Bernoulli, Binomial, Poisson) or continuous (e.g., Normal, Exponential, Uniform)
A histogram, which approximates a probability density function, depicts a data distribution
Data distributions are defined by their functional form and the values of their parameters
Our focus is on the shape of such distributions and their implications for statistical inference

14 Normal Distribution
Refers to a family of distributions (a.k.a. Gaussian distributions) that are bell shaped and:
–Represent a continuous probability distribution
–Are symmetric (with scores concentrated in the middle)
–Can be specified mathematically in terms of two parameters: the mean (μ) and the standard deviation (σ)
–Have one mode
–Are asymptotic (the tails approach the horizontal axis but never touch it)

15 An Example

16 Standard Normal Distribution
The area P under the standard normal probability curve, with the respective z-statistic

17 The z Distribution
The standard normal distribution, sometimes called the z distribution (as indicated by the formula below), is a normal distribution with a mean of 0 and a standard deviation of 1
Normal distributions can be transformed to a standard normal distribution using the formula:
z = (X − μ)/σ
where X is a score from the original normal distribution, μ is its mean and σ is its standard deviation
A z-score represents the number of standard deviations a score lies above or below the mean
Note that the z distribution will only be a normal distribution if the original distribution (X) is normal
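The z transformation subtracts the mean and divides by the standard deviation. The sketch below applies it to a single score and to a whole list. The `z_score` helper and the choice to treat the billing rates as a complete population (so `pstdev` is used) are assumptions made for illustration.

```python
import statistics

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above or below the mean."""
    return (x - mu) / sigma

# A single score: an IQ of 115 with mean 100 and standard deviation 15
print(z_score(115, 100, 15))   # one standard deviation above the mean

# Standardizing a whole set of scores (treated here as a population)
rates = [30, 20, 40, 50, 60, 60, 100, 60, 30]
mu = statistics.mean(rates)
sigma = statistics.pstdev(rates)
z_rates = [z_score(x, mu, sigma) for x in rates]

# After the transformation the scores have mean 0 and standard deviation 1
print(round(statistics.mean(z_rates), 6), round(statistics.pstdev(z_rates), 6))
```

Standardizing changes the location and scale of the scores but not the shape of their distribution, which is why the result is normal only if the original scores were.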

18 Areas Under the Curve
The Empirical Rule: 68-95-99.7 (about 68%, 95% and 99.7% of scores in a normal distribution fall within 1, 2 and 3 standard deviations of the mean)
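The three percentages in the empirical rule are areas under the standard normal curve and can be computed directly. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

areas = {}
for k in (1, 2, 3):
    # Area between -k and +k standard deviations from the mean
    areas[k] = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} SD: {areas[k]:.4f}")
```

The computed areas are approximately 0.6827, 0.9545 and 0.9973, which round to the 68-95-99.7 figures.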

19 An Example
If IQ scores are normally distributed, with a mean of 100 and a standard deviation of 15,
–What proportion of scores would be greater than 125?
–What proportion of scores would fall between 90 and 120?
–What proportion of scores would be less than 85?
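One way to answer the three IQ questions is with the cumulative distribution function of a normal distribution with mean 100 and standard deviation 15. The sketch below uses `statistics.NormalDist` from the Python standard library; the slides themselves presumably intend a z-table lookup, so this is an alternative route to the same areas, and the variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# cdf(x) is the proportion of scores at or below x
p_above_125 = 1 - iq.cdf(125)              # z = (125 - 100)/15 ≈ 1.67
p_90_to_120 = iq.cdf(120) - iq.cdf(90)     # z from ≈ -0.67 to ≈ 1.33
p_below_85 = iq.cdf(85)                    # z = (85 - 100)/15 = -1

print(f"greater than 125: {p_above_125:.3f}")
print(f"between 90 and 120: {p_90_to_120:.3f}")
print(f"less than 85: {p_below_85:.3f}")
```

The proportions come out to roughly 0.048, 0.656 and 0.159. The last value is consistent with the empirical rule: about 68% of scores lie within one standard deviation of the mean, leaving about 16% in each tail.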

20 Some Key Concepts
Central Limit Theorem
–As sample size increases, the sampling distribution of the mean for simple random samples of n cases, taken from a population with a mean equal to μ and a finite variance equal to σ², approximates a normal distribution
Sampling Distribution of the Mean*
Standard Deviation vs. Standard Error of the Mean
Sample Size vs. Number of Samples
Other Distributions
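A small simulation can illustrate the Central Limit Theorem and the difference between sample size and number of samples. The sketch below is an illustration under stated assumptions, not part of the original slides: the population is an exponential distribution with mean 1 and standard deviation 1 (chosen because it is clearly non-normal), and the seed, sample sizes and number of samples are arbitrary.

```python
import math
import random
import statistics

random.seed(6243)        # arbitrary seed so the run is repeatable

POP_MEAN = 1.0           # mean of an exponential population with rate 1
POP_SD = 1.0             # its standard deviation
NUM_SAMPLES = 5000       # number of samples drawn (not the sample size)

def sample_means(n, num_samples):
    """Draw num_samples random samples of size n and return their means."""
    means = []
    for _ in range(num_samples):
        sample = [random.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

results = {}
for n in (2, 30):        # sample size
    means = sample_means(n, NUM_SAMPLES)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    expected_se = POP_SD / math.sqrt(n)   # standard error of the mean
    print(f"n={n}: mean of means={results[n][0]:.3f}, "
          f"SD of means={results[n][1]:.3f}, sigma/sqrt(n)={expected_se:.3f}")
```

The standard deviation of the sample means (the standard error) shrinks as the sample size n grows and stays close to σ/√n, while the standard deviation of the population itself stays fixed at 1. Plotting a histogram of the means for n = 30 would show a shape much closer to a normal curve than the skewed population.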

21 The Central Limit Theorem

