DESCRIPTIVE STATISTICS Chapter 2 BASED ON SCHAUM’S Outline of Probability and Statistics BY MURRAY R. SPIEGEL, JOHN SCHILLER, AND R. ALU SRINIVASAN ABRIDGMENT, EDITOR: MIKE LEVAN

DESCRIPTIVE STATISTICS IN THIS CHAPTER:
✔ Descriptive Statistics
✔ Measures of Central Tendency
✔ Mean
✔ Median
✔ Mode
✔ Measures of Dispersion
✔ Variance and Standard Deviation
✔ Percentiles
✔ Interquartile Range
✔ Skewness

Descriptive Statistics When giving a report on a data set, it is useful to describe the data set with terms familiar to most people. Therefore, we shall develop widely accepted terms that can help describe a data set. We shall discuss ways to describe the center, spread, and shape of a given data set.

Measures of Central Tendency A measure of central tendency gives a single value that acts as a representative or average of the values of all the outcomes of your experiment. The main measure of central tendency we will use is the arithmetic mean. While the mean is used the most, two other measures of central tendency are also employed. These are the median and the mode.

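Mean If we are given a set of n numbers, say $x_1, x_2, \ldots, x_n$, then the mean, usually denoted by $\bar{x}$ or $\mu$, is given by

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Example 2.1. Consider the following set of integers: S = {1, 2, 3, 4, 5, 6, 7, 8, 9}. The mean, $\bar{x}$, of the set S is

$$\bar{x} = \frac{1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9}{9} = \frac{45}{9} = 5$$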
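As a quick check of Example 2.1, the definition of the mean can be sketched in a few lines of Python. The function name `mean` is chosen here for illustration and is not part of the original slides.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# The set S from Example 2.1.
S = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(mean(S))  # 45 / 9 = 5.0
```

Python's standard library offers the same calculation as `statistics.mean`.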

Median The median is that value x for which

$$P(X < x) \le \tfrac{1}{2} \quad \text{and} \quad P(X > x) \le \tfrac{1}{2}$$

In other words, the median is the value where half of the values of $x_1, x_2, \ldots, x_n$ are larger than the median, and half of the values of $x_1, x_2, \ldots, x_n$ are smaller than the median.

Median Example 2.2. Consider the following set of integers: S = {1, 6, 3, 8, 2, 4, 9}. If we want to find the median, we need to find the value, x, where half the values are above x and half the values are below x. Begin by ordering the list: S = {1, 2, 3, 4, 6, 8, 9}. Notice that the value 4 has three scores below it and three scores above it. Therefore, the median, in this example, is 4.

In some instances, it is quite possible that the value of the median will not be one of your observed values.

Median Example 2.3. Consider the following set of integers: S = {1, 2, 3, 4, 6, 8, 9, 12}. Since the set is already ordered, we can skip that step. Notice, however, that we don't have just one value in the middle of the list. Instead, we have two values, namely 4 and 6. Therefore, the median can be any number between 4 and 6. In most cases, the average of the two numbers is reported. So, the median for this set of integers is

$$\frac{4 + 6}{2} = 5$$

In general, if we have n ordered data points, and n is an odd number, then the median is the data point located exactly in the middle of the set. This can be found in location $\frac{n+1}{2}$ of your set. If n is an even number, then the median is the average of the two middle terms of the ordered set. These can be found in locations $\frac{n}{2}$ and $\frac{n}{2} + 1$.
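The odd and even cases can be sketched together in Python. This is a minimal illustration, assuming the usual convention of reporting the average of the two middle values when n is even. The function name `median` is chosen here for illustration, and the 1-based positions (n + 1)/2, n/2, and n/2 + 1 become 0-based list indices in the code.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)           # the data must be ordered first
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: 1-based positions n / 2 and n / 2 + 1 are indices mid - 1 and mid.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 6, 3, 8, 2, 4, 9]))      # Example 2.2: 4
print(median([1, 2, 3, 4, 6, 8, 9, 12]))  # Example 2.3: 5.0
```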

Mode The mode of a data set is the value that occurs most often, or in other words, has the greatest probability of occurring. Sometimes we can have two, three, or more values that have relatively large probabilities of occurrence. In such cases, we say that the distribution is bimodal, trimodal, or multimodal, respectively.

Example 2.4. Consider the following rolls of a ten-sided die: R = {2, 8, 1, 9, 5, 2, 7, 2, 7, 9, 4, 7, 1, 5, 2}. The number that appears the most is the number 2. It appears four times. Therefore, the mode for the set R is the number 2. Note that the number 7 appears three times. If it had appeared one more time, it would have been present four times as well. In this case, we would have had a bimodal distribution, with 2 and 7 as the modes.
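Counting occurrences by hand is error-prone, so Example 2.4 can be checked with a short Python sketch. The helper name `modes` is an illustrative choice. It returns every value tied for the highest count, so a bimodal data set yields two values.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in ascending order."""
    if not values:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# The die rolls from Example 2.4.
R = [2, 8, 1, 9, 5, 2, 7, 2, 7, 9, 4, 7, 1, 5, 2]
print(modes(R))        # [2]    (2 appears four times)
print(modes(R + [7]))  # [2, 7] (one more 7 makes the set bimodal)
```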

Measures of Dispersion Consider the following two sets of integers: S = {5, 5, 5, 5, 5, 5} and R = {0, 0, 0, 10, 10, 10}. If we calculated the mean for both S and R, we would get the number 5 both times. However, these are two vastly different data sets. Therefore, we need another descriptive statistic besides a measure of central tendency, which we shall call a measure of dispersion. We shall measure the dispersion or scatter of the values of our data set about the mean of the data set. If the values tend to be concentrated near the mean, then this measure will be small, while if the values of the data set tend to be distributed far from the mean, then the measure will be large. The two measures of dispersion that are usually used are called the variance and the standard deviation.

Variance and Standard Deviation A quantity of great importance in probability and statistics is called the variance. The variance, denoted by $\sigma^2$, for a set of n numbers $x_1, x_2, \ldots, x_n$, is given by

$$\sigma^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}$$

The variance is a nonnegative number. The positive square root of the variance is called the standard deviation.

Variance and Standard Deviation Example 2.5. Find the variance and standard deviation for the following set of test scores: T = {75, 80, 82, 87, 96}. Since we are measuring dispersion about the mean, we will need to find the mean for this data set.

$$\bar{x} = \frac{75 + 80 + 82 + 87 + 96}{5} = \frac{420}{5} = 84$$

Using the mean, we can now find the variance.

$$\sigma^2 = \frac{(75 - 84)^2 + (80 - 84)^2 + (82 - 84)^2 + (87 - 84)^2 + (96 - 84)^2}{5}$$

Which leads to the following:

$$\sigma^2 = \frac{81 + 16 + 4 + 9 + 144}{5} = \frac{254}{5}$$

Therefore, the variance for this set of test scores is $\sigma^2 = 50.8$. To get the standard deviation, denoted by $\sigma$, simply take the square root of the variance.

$$\sigma = \sqrt{50.8} \approx 7.13$$

The variance and standard deviation are generally the most used quantities to report the measure of dispersion. However, there are other quantities that can also be reported.
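The arithmetic of Example 2.5 can be reproduced with a short Python sketch. It assumes the variance is the average squared deviation from the mean with n in the denominator, consistent with the $\sigma^2$ notation used here. The function names are illustrative.

```python
import math


def variance(values):
    """Average squared deviation from the mean (divides by n, not n - 1)."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance of an empty data set is undefined")
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / n


def standard_deviation(values):
    """Positive square root of the variance."""
    return math.sqrt(variance(values))


# Test scores from Example 2.5.
T = [75, 80, 82, 87, 96]
print(round(variance(T), 2))            # 50.8
print(round(standard_deviation(T), 2))  # 7.13

# The two sets with equal means from the discussion of dispersion.
print(variance([5, 5, 5, 5, 5, 5]))      # 0.0
print(variance([0, 0, 0, 10, 10, 10]))   # 25.0
```

In Python's standard library, `statistics.pvariance` and `statistics.pstdev` use the same divide-by-n convention, while `statistics.variance` and `statistics.stdev` divide by n - 1 and would give different values for these data.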

Percentiles It is often convenient to subdivide your ordered data set by use of ordinates so that the number of data points less than the ordinate is some percentage of the total number of observations. The values corresponding to such percentages are called percentile values, or briefly, percentiles. Thus, for example, the proportion of scores that fall below the ordinate at $x_\alpha$ is $\alpha$. For instance, the proportion of scores less than $x_{0.10}$ would be 0.10, or 10%, and $x_{0.10}$ would be called the 10th percentile. Another example is the median. Since half the data points fall below the median, it is the 50th percentile (or fifth decile), and can be denoted by $x_{0.50}$.

Percentiles The 25th percentile is often thought of as the median of the scores below the median, and the 75th percentile is often thought of as the median of the scores above the median. The 25th percentile is called the first quartile, while the 75th percentile is called the third quartile. As you can imagine, the median is also known as the second quartile.

Interquartile Range Another measure of dispersion is the interquartile range. The interquartile range is defined to be the first quartile subtracted from the third quartile. In other words, $x_{0.75} - x_{0.25}$.

Example 2.6. Find the interquartile range from the following set of golf scores: S = {67, 69, 70, 71, 74, 77, 78, 82, 89}. Since we have nine data points, and the set is ordered, the median is located in position $\frac{9+1}{2}$, or the 5th position. That means that the median for this set is 74. The first quartile, $x_{0.25}$, is the median of the scores below the fifth position. Since we have four scores, the median is the average of the second and third scores, which leads us to $x_{0.25} = \frac{69 + 70}{2} = 69.5$. The third quartile, $x_{0.75}$, is the median of the scores above the fifth position. Since we have four scores, the median is the average of the seventh and eighth scores, which leads us to $x_{0.75} = \frac{78 + 82}{2} = 80$. Finally, the interquartile range is $x_{0.75} - x_{0.25} = 80 - 69.5 = 10.5$.

One final measure of dispersion that is worth mentioning is the semiinterquartile range. As the name suggests, this is simply half of the interquartile range.

Example 2.7. Find the semiinterquartile range for the previous data set. Since the interquartile range is 10.5, the semiinterquartile range is $\frac{10.5}{2} = 5.25$.
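The quartile calculation for the golf scores can be sketched in Python using the "median of each half" approach described above. The helper names are illustrative. How to split the data when n is even is not stated in the text, so the choice made here (lower half is the first n/2 values, upper half is the rest) is an assumption. Other quartile conventions exist and can give slightly different values.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """First quartile, median, and third quartile via the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = ordered[:half]  # scores below the median position
    # For odd n the median itself is excluded from both halves (assumed convention).
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)


# Golf scores from Example 2.6.
S = [67, 69, 70, 71, 74, 77, 78, 82, 89]
q1, q2, q3 = quartiles(S)
print(q1, q2, q3)     # 69.5 74 80.0
print(q3 - q1)        # interquartile range: 10.5
print((q3 - q1) / 2)  # semiinterquartile range: 5.25
```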

Skewness The final descriptive statistic we will address in this section deals with the distribution of scores in your data set. For instance, you might have a symmetrical data set, or a data set that is evenly distributed, or a data set with more high values than low values. Often a distribution is not symmetric about any value, but instead has a few more higher values, or a few more lower values. If the data set has a few more higher values, then it is said to be skewed to the right.

Skewness If the data set has a few more lower values, then it is said to be skewed to the left.