Descriptive Statistics:

Presentation transcript:

Descriptive Statistics: Part II Each slide has its own narration in an audio file. For the explanation of any slide click on the audio icon to start it. Professor Friedman's Statistics Course by H & L Friedman is licensed under a  Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. 

Shape
A third important property of data, after location and dispersion, is its shape. Shape can be described by degree of asymmetry (i.e., skewness).
mean > median: positive or right-skewness
mean = median: symmetric or zero-skewness
mean < median: negative or left-skewness
Positive skewness can arise when the mean is increased by some unusually high values. Negative skewness can arise when the mean is decreased by some unusually low values.
Descriptive Statistics II

Skewness
[Figure: three example distributions labeled left skewed, right skewed, and symmetric. Source: Levine et al., Business Statistics, Pearson, 2013.]

Example: # hours to complete a task
Data (for n = 12 employees): 2 3 8 8 9 10 10 12 15 18 22 63 (this guy took a VERY long time!)
$\bar{X} = 180/12 = 15$ hours
Median = 10 hours
$s^2 = 2868/11 = 260.73$
s = 16.15 hours
CV = 107.6%
The (extremely slow) employee who took 63 hours to complete the task skewed the entire distribution to the right.
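The numbers on this slide can be checked with a few lines of Python. This is only a sketch using the standard library `statistics` module; the variable names are illustrative and not part of the lecture.

```python
import statistics

# Hours taken by each of the n = 12 employees (from the slide).
hours = [2, 3, 8, 8, 9, 10, 10, 12, 15, 18, 22, 63]

mean_hours = statistics.mean(hours)        # 180 / 12
median_hours = statistics.median(hours)    # average of the two middle values
variance = statistics.variance(hours)      # sample variance (divides by n - 1)
std_dev = statistics.stdev(hours)          # square root of the sample variance
cv_percent = 100 * std_dev / mean_hours    # coefficient of variation, in percent

print(f"mean = {mean_hours}, median = {median_hours}")
print(f"s^2 = {variance:.2f}, s = {std_dev:.2f}, CV = {cv_percent:.1f}%")
```

Because the mean (15) sits well above the median (10), the output also illustrates the "mean > median" rule for right-skewness.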

Example Using MS Excel
Scores of 17 students on a national calculus exam.
Data: 0, 0, 10, 12, 15, 18, 20, 25, 30, 33, 34, 41, 56, 87, 92, 94, 95
Open MS Excel. Go to Data Analysis > Analysis Tools > Descriptive Statistics. If you do not have Data Analysis > Analysis Tools, use the Add-in feature to add it to MS Excel. Make sure to check the Summary Statistics box once you are in Descriptive Statistics. See the MS Excel output on the next slide.

Using MS Excel
From the output:
mean is 38.94
median is 30
mode is 0
standard deviation is 33.44
variance is 1118.43
skewness is .78 (positive)
range is 95
n is 17
MS Excel uses a formula, the adjusted Fisher-Pearson coefficient of skewness, to calculate skewness. You do not have to know the formula. If the coefficient is 0 or very close to it, you have a symmetric distribution.
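For readers without MS Excel, the same summary can be approximated in Python. This is a sketch, not the Excel implementation: the skewness line assumes the adjusted Fisher-Pearson formula, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations, which reproduces the .78 shown in the output.

```python
import statistics

# Calculus exam scores for the 17 students (from the slide).
scores = [0, 0, 10, 12, 15, 18, 20, 25, 30, 33, 34, 41, 56, 87, 92, 94, 95]

n = len(scores)
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
std_dev = statistics.stdev(scores)        # sample standard deviation
variance = statistics.variance(scores)    # sample variance
data_range = max(scores) - min(scores)

# Assumption: adjusted Fisher-Pearson skewness coefficient.
skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / std_dev) ** 3 for x in scores)

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"s = {std_dev:.2f}, s^2 = {variance:.2f}, skewness = {skewness:.2f}")
print(f"range = {data_range}, n = {n}")
```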

Standardizing Data: Z-Scores
We can convert the original scores to new scores with $\bar{X} = 0$ and s = 1. This gives us a pure number with no units of measurement.
Any score below the mean will now be negative.
Any score at the mean will be 0.
Any score above the mean will be positive.

Standardizing Data: Z-Scores
To compute the Z-scores: $Z = \frac{X - \bar{X}}{s}$
Example. Data: 0, 2, 4, 6, 8, 10
$\bar{X} = 30/6 = 5$; s = 3.74

X     (X - mean)/s      Z
0     (0 - 5)/3.74      -1.34
2     (2 - 5)/3.74      -.80
4     (4 - 5)/3.74      -.27
6     (6 - 5)/3.74      .27
8     (8 - 5)/3.74      .80
10    (10 - 5)/3.74     1.34
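The table above can be reproduced with a short Python sketch. It uses the sample standard deviation (dividing by n - 1), as the slide does; the variable names are illustrative.

```python
import statistics

data = [0, 2, 4, 6, 8, 10]

mean = statistics.mean(data)   # 30 / 6 = 5
s = statistics.stdev(data)     # sample standard deviation, about 3.74

# Z = (X - mean) / s for every observation.
z_scores = [(x - mean) / s for x in data]

for x, z in zip(data, z_scores):
    print(f"X = {x:2d}  Z = {z:+.2f}")

# The standardized scores themselves have mean 0 and standard deviation 1.
print(round(statistics.mean(z_scores), 10), round(statistics.stdev(z_scores), 10))
```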

Standardizing Data: Z-Scores
Data: Exam Scores

          Original data    Change 7 to 97    Change 23 to 93
X         Z                Z                 Z
65        -0.45            -0.81             -1.40
73        -0.11            -0.38             -0.79
78        0.10             -0.10             -0.40
69        -0.28            -0.60             -1.09
7         -2.89            0.94 (X = 97)     1.07 (X = 97)
23        -2.21            -3.12             0.76 (X = 93)
98        0.94             0.99              1.14
99        0.99             1.05              1.22
97        0.90             0.94              1.07
75        -0.02            -0.27             -0.63
79        0.14             -0.05             -0.32
85        0.40             0.28              0.14
63        -0.53            -0.92             -1.56
67        -0.36            -0.70             -1.25
72        -0.15            -0.43             -0.86
93        0.73             0.72              0.76
95        0.82             0.83              0.91
[Rows whose values did not survive in the transcript are omitted.]
Mean      75.57            79.86             83.19
s         23.75            18.24             12.96

Standardizing Data: Z-Scores
No matter what you are measuring, a Z-score of more than +5 or less than -5 would indicate a very, very unusual score.
For standardized data that are normally distributed, about 95% of the data will be within 2 standard deviations of the mean. More precisely, if the data follow a normal distribution:
95% of the data will fall between -1.96 and +1.96.
99.7% of the data will fall between -3 and +3.
99.99% of the data will fall between -4 and +4.
Worst-case scenario, for any distribution: at least 75% of the data are within 2 standard deviations of the mean (Chebyshev's inequality).
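These percentages can be checked numerically. The sketch below is an illustration rather than part of the lecture: it uses the error function from Python's `math` module for the normal distribution, and the bound 1 - 1/k² for Chebyshev's inequality. The function names are made up for this example.

```python
import math

def normal_coverage(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def chebyshev_lower_bound(k):
    """Minimum share of ANY distribution within k standard deviations (needs k > 1)."""
    return 1 - 1 / k**2

for k in (1.96, 2, 3, 4):
    print(f"within +/-{k}: normal = {100 * normal_coverage(k):.2f}%")

print(f"Chebyshev, k = 2: at least {100 * chebyshev_lower_bound(2):.0f}%")
```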

Five Number Summary
When examining a distribution for shape, sometimes the five-number summary is useful:
Smallest | Q1 | Median | Q3 | Largest
Example: 2 3 8 8 9 10 10 12 15 18 22 63
$\bar{X} = 15$
5-number summary: 2 | 8 | 10 | 16.5 | 63
This data is right-skewed. In right-skewed distributions, the distance from Q3 to Xlargest (16.5 to 63) is significantly greater than the distance from Xsmallest to Q1 (2 to 8).
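A possible way to compute the five-number summary, using the 12 task times from the earlier hours-to-complete-a-task example. The sketch assumes the "median of each half" rule for the quartiles, which matches the slide's Q1 = 8 and Q3 = 16.5; other software may use different quartile conventions and give slightly different values. The function name is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for an odd n, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )

hours = [2, 3, 8, 8, 9, 10, 10, 12, 15, 18, 22, 63]
smallest, q1, median, q3, largest = five_number_summary(hours)
print(smallest, q1, median, q3, largest)

# Right-skew check from the slide: the upper tail is much longer than the lower tail.
print(largest - q3, "vs", q1 - smallest)
```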

Boxplot
The boxplot is a way to graphically portray a distribution of data by means of its five-number summary. A boxplot can be drawn horizontally or vertically. For a horizontal boxplot:
The vertical line drawn within the box is the median.
The vertical line at the left side of the box is Q1.
The vertical line at the right side of the box is Q3.
The line on the left connects the left side of the box with Xsmallest (lower 25% of data).
The line on the right connects the right side of the box with Xlargest (upper 25% of data).

Boxplot
A “bell-shaped” symmetric data distribution would look like this:
[Figure: boxplot of a symmetric, bell-shaped distribution.]

Categorical Data
We summarize categorical data using frequencies and graphical methods.

Working with Frequencies
A frequency distribution records data grouped into classes and the number of observations that fell into each class.
A frequency distribution can be used for:
categorical data
numerical data that can be grouped into intervals
numerical data with repeated observations
A percentage distribution records the percent of the observations that fell into each class.

Working with Frequencies
Example. A sample was taken of 200 professors at a (fictitious) local college. Each was asked for his or her (take-home) weekly salary. The responses ranged from about $520 to $590. If we wanted to display the data in, say, 7 equal intervals, we would use an interval width of $10.
Width of interval = Range / Number of classes = $70 / 7 = $10 per class.
The Frequency / Percentage Distribution:

Take-home pay             Frequency    Percentage
$520 and under $530       6            3%
$530 and under $540       30           15%
$540 and under $550       38           19%
$550 and under $560       52           26%
$560 and under $570       42           21%
$570 and under $580       24           12%
$580 to $590              8            4%
Total                     200          100%
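The interval width and the percentage column can be derived directly from the class frequencies. The following is a minimal sketch with illustrative variable names; the raw salaries are not given on the slide, so it starts from the frequencies.

```python
# Range of responses and desired number of classes (from the slide).
low, high, num_classes = 520, 590, 7
width = (high - low) / num_classes   # $70 / 7 = $10 per class

# Number of professors in each class, in order from $520 upward.
frequencies = [6, 30, 38, 52, 42, 24, 8]
total = sum(frequencies)

# Percentage distribution: each class frequency as a share of the total.
percentages = [100 * f / total for f in frequencies]

for i, (f, p) in enumerate(zip(frequencies, percentages)):
    start = low + i * width
    print(f"${start:.0f} to ${start + width:.0f}: frequency {f}, {p:.0f}%")
print(f"Total: {total}, {sum(percentages):.0f}%")
```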

Working with Frequencies
A Cumulative Distribution focuses on the number or percentage of cases that lie below or above specified values rather than within intervals.

Take-home pay        Cumulative frequency    Cumulative percentage
less than $520       0                       0%
less than $530       6                       3%
less than $540       36                      18%
less than $550       74                      37%
less than $560       126                     63%
less than $570       168                     84%
less than $580       192                     96%
less than $590       200                     100%
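A cumulative distribution is just a running total of the class frequencies. As a sketch (variable names are illustrative), using the take-home pay frequencies from the example:

```python
from itertools import accumulate

# Upper class boundaries and class frequencies from the take-home pay example.
upper_bounds = [530, 540, 550, 560, 570, 580, 590]
frequencies = [6, 30, 38, 52, 42, 24, 8]
total = sum(frequencies)

# Running totals: how many observations lie below each upper boundary.
cumulative = list(accumulate(frequencies))
cumulative_pct = [100 * c / total for c in cumulative]

for bound, c, p in zip(upper_bounds, cumulative, cumulative_pct):
    print(f"less than ${bound}: {c} ({p:.0f}%)")
```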

Working with Frequencies
The Frequency Histogram:
[Figure: frequency histogram.]

Working with Frequencies
The Frequency Polygon
[Figure: frequency polygon.]

Working with Frequencies
The Cumulative Frequency Distribution
[Figure: cumulative frequency distribution.]

Descriptive Statistics: 2 variables
Categorical Data (graphical representation):
  Contingency Table
  Side-by-Side Bar Chart
Numerical Data (looking for relationships in bivariate data):
  Scatter Plot
  Correlation
  The Regression Line

The Contingency Table
Two categorical variables are most easily displayed in a contingency table. This is a table of two-way frequencies.
Example: “Who would you vote for in the next election?”

                        Male    Female    Total
Republican Candidate    250     250       500
Democrat Candidate      150     350       500
Total                   400     600       1000

This also works for two-way percentages.
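The row totals, column totals, and two-way percentages can be computed as in the sketch below. The cell counts come from the slide; the female Republican count of 250 is inferred from the column total of 600 and the row total of 500, and the variable names are illustrative.

```python
# Two-way frequencies: candidate preference by gender.
# Assumption: the female Republican count (250) is inferred from the totals.
counts = {
    "Republican Candidate": {"Male": 250, "Female": 250},
    "Democrat Candidate": {"Male": 150, "Female": 350},
}

# Marginal totals for rows and columns.
row_totals = {row: sum(cells.values()) for row, cells in counts.items()}
col_totals = {}
for cells in counts.values():
    for col, value in cells.items():
        col_totals[col] = col_totals.get(col, 0) + value
grand_total = sum(row_totals.values())

# Two-way percentages: each cell as a percent of the grand total.
percent_of_total = {
    row: {col: 100 * value / grand_total for col, value in cells.items()}
    for row, cells in counts.items()
}

print(row_totals, col_totals, grand_total)
print(percent_of_total)
```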

The Side-by-Side Bar Chart
[Figure: side-by-side bar chart titled "Relative Performance". Source: Microsoft.com.]

The Scatter Plot
What can we do with 2 numerical variables? We can graph them.
Example: Grade and Height (in inches)
Y (Grade): 100 95 90 80 70 65 60 40 30 20
X (Height): 73 79 62 69 74 77 81 63 68

The Scatter Plot
Correlation coefficient is r = .12
Coefficient of determination is $r^2$ = .01
We will learn about the above measures, as well as more about scatter plots, in the topic on CORRELATION.

Homework
Practice, practice, practice. As always, do lots and lots of problems. You can find these in the online lecture notes and homework assignments.