The Central Tendency is the center of the distribution of a data set. You can think of this value as where the middle of a distribution lies.

Presentation transcript:

Central tendency is the center of the distribution of a data set. You can think of it as where the middle of the distribution lies. Measures of central tendency are numbers that describe what is average or typical of the distribution: the mean, the median, and the mode.

Mean: The sum of all the data values divided by the number of values.
Median: The middle number when the data are arranged in order (with an even number of values, the average of the two middle numbers).
Mode: The value that occurs most frequently in the data.
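As a minimal sketch, the three definitions can be checked with Python's standard `statistics` module. The data are the twelve values used in the quartile example in this transcript; the variable names are only illustrative.

```python
from statistics import mean, median, multimode

data = [4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11]

data_mean = mean(data)        # sum of the values divided by how many there are
data_median = median(data)    # middle value; average of the two middle values here
data_modes = multimode(data)  # list of every value tied for most frequent

print(data_mean, data_median, data_modes)
```

`multimode` is used rather than `mode` because a data set can have more than one mode, and it returns all of them.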

Data: 4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11

1. Order your data (put the values in numerical order): 3, 4, 4, 4, 7, 10, 11, 12, 14, 16, 17, 18
2. Find the median of your data. The median divides the data into two halves. Median: (10 + 11)/2 = 10.5
3. To divide the data into quarters, find the medians of these two halves.
   Lower half: 3, 4, 4, 4, 7, 10. Median: 4
   Upper half: 11, 12, 14, 16, 17, 18. Median: 15
4. Now you have three points. These three points divide the entire data set into quarters, called "quartiles":
   ◦ Quartile 1 (Q1) = (4 + 4)/2 = 4
   ◦ Quartile 2 (Q2) = (10 + 11)/2 = 10.5
   ◦ Quartile 3 (Q3) = (14 + 16)/2 = 15

Once you have these three points, Q1, Q2, and Q3, along with the minimum and maximum, you have all you need in order to draw a simple box-and-whisker plot.
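The median-of-halves procedure above can be sketched in a few lines of Python. The function name `quartiles` is hypothetical, and the handling of an odd number of values (dropping the middle value from both halves) is an assumption, since the example only covers an even count. Other quartile conventions exist and can give slightly different results.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11])
print(q1, q2, q3)  # 4.0 10.5 15.0
```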

Percentile rank is calculated by taking the number of data points with values less than the value we want, and dividing that count by the total number of data points. The result is usually expressed as a percentage.
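A minimal sketch of this calculation, assuming the "strictly less than" definition given above (some textbooks also count half of the tied values). The function name `percentile_rank` is illustrative, and the data are the twelve values from the quartile example.

```python
def percentile_rank(values, target):
    """Percentage of data points with values strictly less than target."""
    below = sum(1 for v in values if v < target)
    return 100 * below / len(values)

data = [3, 4, 4, 4, 7, 10, 11, 12, 14, 16, 17, 18]
rank_of_14 = percentile_rank(data, 14)  # 8 of the 12 values are below 14
print(round(rank_of_14, 1))
```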

Notice that all the data values in the bins up to 60 are less than the value of interest. Adding the frequencies up to 60 gives 37 out of 40 (total), which is 92.5%. So the value is approximately at the 93rd percentile.

Deviations measure the signed differences between the data values and the mean.
The variance is another measure of variability. It is equal to the sum of the squares of the deviations divided by one less than the number of values.

Example: Semester assignment scores
Connie’s mean: 84
Oscar’s mean: 84

These are Connie’s and Oscar’s scores and their deviations from the mean score for each student. How can we combine the deviations into a single value that reflects the spread in a data set? Should we find the sum of the deviations? Let’s try that… Of course, they cancel out!! So we need to eliminate the effect of the different signs! Any ideas?

When you sum the squares of the deviations, the sum is no longer zero!! The sum of the squares of the deviations, divided by one less than the number of values, is called the variance of the data. The square root of the variance is called the standard deviation of the data. The standard deviation provides one way to judge the “average difference” between data values and the mean. It is a measure of how the data are spread around the mean.
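The steps above can be sketched in Python. Because the students' scores are not reproduced in this transcript, the sketch reuses the twelve values from the quartile example. The variable names are illustrative.

```python
import math

data = [4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11]
mean = sum(data) / len(data)

# Signed deviations from the mean: positives and negatives cancel.
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

# Squaring removes the signs, so the sum is no longer zero.
sum_of_squares = sum(d ** 2 for d in deviations)

# Sample variance divides by one less than the number of values.
variance = sum_of_squares / (len(data) - 1)
std_dev = math.sqrt(variance)

print(sum_of_deviations, sum_of_squares, round(variance, 2), round(std_dev, 2))
```

The same results should come from `statistics.variance(data)` and `statistics.stdev(data)`, which also divide by one less than the number of values.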

A histogram is a graphical representation of a data set, with columns to show how the data are distributed across different intervals of values. The columns of a histogram are called bins and should not be confused with the bars of a bar graph. The bars of a bar graph indicate categories: how many data items either have the same value or share a characteristic (such as eye color). The bins of a histogram indicate how many numerical data values fall within a certain interval.
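To make the idea of a bin concrete, here is a small sketch that counts how many values fall in each interval. The function name `histogram_counts`, the bin width of 5, and the convention that each bin includes its left edge but not its right edge are all assumptions for illustration. The data are the twelve values from the quartile example.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Map each bin's left edge to the count of values in [edge, edge + bin_width)."""
    counts = Counter(
        start + ((v - start) // bin_width) * bin_width for v in values
    )
    return dict(sorted(counts.items()))

data = [4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11]
bins = histogram_counts(data, bin_width=5)
print(bins)  # {0: 4, 5: 1, 10: 4, 15: 3}
```

Bins that contain no values are simply absent from the result, and the counts always add up to the number of data values.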

Symmetric: The median (Q2) lies midway between the first and third quartiles. The minimum and maximum do not have to be equally far away from the median.
Skewed right: The median (Q2) is closer to the first quartile. The mean is typically greater than the median.
Skewed left: The median (Q2) is closer to the third quartile. The mean is typically less than the median.

Shatevia took a random sample of 50 students who own MP3 players at her high school and asked how many songs they have stored. The two graphs were constructed from the data in the table.

a. What is the range of the data?
The number of songs goes from a low of 765 songs to a high of 1013 songs. The range is 1013 − 765 = 248 songs.
b. What is the bin width of each graph?
The bin width of Graph A is 50 songs, and the bin width of Graph B is 10 songs.
c. How can you know if the graph accounts for all 50 values?
The sum of all the bin frequencies is 50 for each of the graphs.
d. Why are the columns shorter in Graph B?
The bins in Graph A hold the values of up to five bins from Graph B. With smaller bin widths you will usually have shorter bins.
e. Which graph is better at showing the overall shape of the distribution? What is that shape?
Graph A shows that the distribution is skewed left. This fact is harder to see with all the ups and downs in Graph B.

f. Which graph is better at showing the gaps and clusters in the data?
With more bins you can see gaps and clusters in the data. A dot plot is like a histogram with a very small bin width. Graph B is the better graph for seeing gaps and clusters.
g. What percentage of the players have fewer than 850 songs stored?
Add the bin frequencies for the bins below (to the left of) 850 songs. There are 10 data values, so 10 out of 50, or 20% of the sample, had fewer than 850 songs.
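The "add the bin frequencies" step can be sketched as follows. The table and graphs are not reproduced in this transcript, so the frequencies below are hypothetical, chosen only so that they total 50 with 10 values below 850, as stated in the answers. The function name `percent_below` is also illustrative, and the threshold is assumed to fall on a bin edge.

```python
# Hypothetical bin frequencies for a graph with bin width 50
# (left edge of bin -> count). Not the actual data from the slides.
graph_a = {750: 3, 800: 7, 850: 9, 900: 12, 950: 15, 1000: 4}

def percent_below(bins, threshold):
    """Percentage of values in bins lying entirely to the left of threshold."""
    total = sum(bins.values())
    below = sum(count for edge, count in bins.items() if edge < threshold)
    return 100 * below / total

share = percent_below(graph_a, 850)
print(share)  # 20.0
```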