4. Interpreting sets of data

4. Interpreting sets of data. Study guide, Chapter 2. Cambridge University Press, G K Powers 2013

Grouped frequency tables
Classes or groups are listed in the first column in ascending order. The tally column shows the number of times a score occurs in a class. The frequency column shows the total count of the scores in each class.
HSC Hint – The class centre is the middle of the class. It is calculated by adding the two extremes of the class and dividing by 2.
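A minimal sketch of building a grouped frequency table with class centres. The function name `grouped_frequency_table`, the class width and the marks are assumptions made for illustration; the sketch also assumes whole-number scores that are all at least `start`.

```python
from collections import Counter

def grouped_frequency_table(scores, start, width):
    """Return rows of (class_low, class_high, class_centre, frequency).

    Classes are start..start+width-1, start+width..start+2*width-1, and so on.
    Assumes whole-number scores, none of them below `start`.
    """
    # Count how many scores fall into each class index.
    counts = Counter((score - start) // width for score in scores)
    rows = []
    for index in range(max(counts) + 1):
        low = start + index * width
        high = low + width - 1
        centre = (low + high) / 2  # add the two extremes and divide by 2
        rows.append((low, high, centre, counts.get(index, 0)))
    return rows

# Hypothetical test marks, invented for this example.
marks = [52, 67, 71, 58, 63, 75, 79, 61, 66, 54]
for low, high, centre, frequency in grouped_frequency_table(marks, start=50, width=10):
    print(f"{low}-{high}  centre {centre}  frequency {frequency}")
```

The rows come out in ascending class order, matching the layout described for the first column of the table.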

Cumulative frequency
Cumulative frequency is the frequency of the score plus the frequency of all the scores less than that score. It is the progressive total of the frequencies.

Score   Frequency   Cumulative frequency
18      1           1
19      5           6
20      3           9
21      7           16

HSC Hint – The last number in the cumulative frequency column equals the total number of scores.
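The progressive total can be sketched in a few lines of Python using the scores and frequencies from the table above. The variable names are assumptions chosen for this example.

```python
from itertools import accumulate

scores = [18, 19, 20, 21]
frequencies = [1, 5, 3, 7]

# accumulate() yields the running (progressive) total of the frequencies.
cumulative = list(accumulate(frequencies))

for score, frequency, running_total in zip(scores, frequencies, cumulative):
    print(score, frequency, running_total)

# The last cumulative frequency equals the total number of scores.
total_scores = cumulative[-1]
```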

Cumulative frequency graphs
Figures: cumulative frequency histogram; cumulative frequency polygon.
HSC Hint – A cumulative frequency polygon joins the top right corners of the rectangles in a cumulative frequency histogram.

Mean
Mean is a measure of the centre. It is calculated by summing all the scores and dividing by the number of scores: x̄ = Σx / n. For data in a frequency table, x̄ = Σfx / Σf.
Σ: 'Sum of' (Greek capital letter sigma)
x: A score or data value
x̄: Mean of a set of scores
n: Total number of scores
f: Frequency
HSC Hint – Make sure all data has been cleared before using the calculator for statistics.
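As a worked illustration, the mean of data in a frequency table can be computed as the sum of frequency times score divided by the sum of the frequencies. The function name is an assumption; the data reuses the scores and frequencies from the cumulative frequency slide.

```python
def mean_from_frequency_table(scores, frequencies):
    """Mean = (sum of f * x) / (sum of f)."""
    total = sum(f * x for x, f in zip(scores, frequencies))
    count = sum(frequencies)
    return total / count

scores = [18, 19, 20, 21]
frequencies = [1, 5, 3, 7]

# (18*1 + 19*5 + 20*3 + 21*7) / 16 = 320 / 16
table_mean = mean_from_frequency_table(scores, frequencies)
print(table_mean)
```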

Mode
Mode is the score that occurs the most number of times, that is, the score with the highest frequency.
To find the mode:
Determine the number of times each score occurs.
The mode is the score that occurs the most number of times.
If two or more scores share the highest frequency, each of them is regarded as a mode.
HSC Hint – Data is called bimodal if it contains two modes.
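The steps for finding the mode can be sketched as follows. The function name `modes` and the sample scores are assumptions for illustration; the function returns every score that shares the highest frequency.

```python
from collections import Counter

def modes(scores):
    """Return all scores that occur the most number of times, in ascending order."""
    counts = Counter(scores)          # number of times each score occurs
    highest = max(counts.values())    # the highest frequency
    return sorted(score for score, count in counts.items() if count == highest)

# Hypothetical scores: 3 and 5 both occur twice, so the data is bimodal.
print(modes([2, 3, 3, 5, 5, 7]))
```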

Median
The median is the middle score or value. A cumulative frequency polygon can be used to estimate the median: find half of the total number of scores on the vertical axis, read across to the polygon, then read down to the score axis.
HSC Hint – Total number of scores is the value of the cumulative frequency for the last score or class.
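A numerical counterpart to reading the median from a cumulative frequency graph is sketched below. It assumes an ungrouped frequency table with scores in ascending order; the helper names are hypothetical.

```python
from itertools import accumulate

def score_at_position(scores, frequencies, position):
    """Return the score sitting at the given 1-based position in the ordered data."""
    for score, running_total in zip(scores, accumulate(frequencies)):
        if position <= running_total:
            return score
    raise ValueError("position is larger than the total number of scores")

def median_from_frequency_table(scores, frequencies):
    n = sum(frequencies)  # total number of scores (the last cumulative frequency)
    if n % 2 == 1:
        return score_at_position(scores, frequencies, (n + 1) // 2)
    # For an even number of scores, average the two middle scores.
    lower_middle = score_at_position(scores, frequencies, n // 2)
    upper_middle = score_at_position(scores, frequencies, n // 2 + 1)
    return (lower_middle + upper_middle) / 2

table_median = median_from_frequency_table([18, 19, 20, 21], [1, 5, 3, 7])
print(table_median)
```

With 16 scores, the 8th and 9th scores both fall in the row for 20, since the cumulative frequency first reaches 9 there.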

Range and interquartile range
Range = Highest score – Lowest score
Interquartile range is the difference between the third quartile and the first quartile.
To calculate the interquartile range (IQR):
Arrange the data in increasing order.
Divide the data into two equal-sized groups. If n is odd, omit the median.
Find Q1, the median of the first group.
Find Q3, the median of the second group.
Calculate the interquartile range: IQR = Q3 – Q1.
HSC Hint – Unlike the range, the interquartile range does not depend on the extreme values.
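The steps above can be sketched in Python as follows. The function names and the data are assumptions for illustration; the quartiles follow the method described here (split into two halves, omitting the median when n is odd), which may differ from the defaults in statistical software.

```python
from statistics import median

def data_range(data):
    return max(data) - min(data)

def interquartile_range(data):
    ordered = sorted(data)                 # arrange in increasing order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                 # first group
    # If n is odd, skip the middle value (the median) for the second group.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1 = median(lower)
    q3 = median(upper)
    return q3 - q1

# Hypothetical data set with an odd number of scores.
data = [7, 1, 9, 3, 12, 4, 6]
print(data_range(data), interquartile_range(data))
```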

Standard deviation
The standard deviation is a measure of the spread of data about the mean. Two calculations are used for standard deviation.
Population standard deviation (σₙ) is the better measure when we have all of the data or the entire population.
Sample standard deviation (σₙ₋₁) is the better measure when a sample is taken from a large population.
HSC Hint – Either the population standard deviation or the sample standard deviation can be used if it is not specified.
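Both calculations are available in Python's standard `statistics` module: `pstdev` divides by n (population) and `stdev` divides by n − 1 (sample). The data below is invented for illustration.

```python
from statistics import pstdev, stdev

# Hypothetical data set; its mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(data)  # treats the data as the entire population
sample_sd = stdev(data)       # treats the data as a sample from a larger population

print(population_sd, sample_sd)
```

The sample value is always slightly larger than the population value for the same data, because the sum of squared deviations is divided by a smaller number.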

Investigating sets of data
An outlier is a score that is separated from the majority of the data. Outliers have little effect on the mean, median and mode for large sets of data. However, in small data sets, the presence of an outlier will have a large effect on the mean, a smaller effect on the median and usually no effect on the mode.
The shape of the graph is described in terms of smoothness, symmetry and the number of modes.
HSC Hint – An outlier is a score that is not close to any other scores. It is not typical.
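A small numerical check of the effect of an outlier on a small data set. The scores and the helper name `summarise` are made up for this sketch.

```python
from statistics import mean, median, mode

def summarise(scores):
    """Return (mean, median, mode) for a list of scores."""
    return mean(scores), median(scores), mode(scores)

small = [4, 5, 5, 6, 7]
with_outlier = small + [40]   # 40 is far from every other score

print(summarise(small))
print(summarise(with_outlier))
```

Here the mean roughly doubles, the median moves by half a score and the mode does not change.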

Symmetry and skewness
No skew (symmetric): Data is symmetrical and balanced about a vertical line.
Positively skewed: Data is more on the left side. The long tail is on the right side.
Negatively skewed: Data is more on the right side. The long tail is on the left side.
HSC Hint – Mean, mode and median are equal when the data is symmetrical with a single peak.

Number of modes
Unimodal: Data has only 1 mode or peak.
Bimodal: Data has 2 modes or peaks.
Multimodal: Data has many modes or peaks.
HSC Hint – List all the modes if the data is multimodal.

Double stem-and-leaf plots
A stem-and-leaf plot has the tens digit of the data written in numerical order down the page. The ‘units’ digit becomes the ‘leaves’ and is written in numerical order across the page.
HSC Hint – The numbers in the ‘leaves’ of a stem-and-leaf plot must be written in increasing order.
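A minimal sketch of how the stems and leaves can be produced for a single data set, assuming non-negative whole numbers. The function name and the data are hypothetical; a double plot would repeat this for a second data set against the same stems.

```python
def stem_and_leaf(data):
    """Map each stem (tens digit) to its leaves (units digits) in increasing order."""
    plot = {}
    for value in sorted(data):        # sorting keeps stems and leaves in order
        plot.setdefault(value // 10, []).append(value % 10)
    return plot

# Hypothetical scores.
scores = [23, 17, 31, 25, 12, 38, 29]
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```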

Double box-and-whisker plots
A box-and-whisker plot is a graph that uses the five-number summary: lower extreme, lower quartile, median, upper quartile and upper extreme. A double box-and-whisker plot shows two sets of data.
HSC Hint – To draw a box plot, arrange the data in order before calculating the five-number summary.
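The five-number summary for each of two data sets can be sketched as below. The function name and both data sets are assumptions for illustration, and the quartiles use the split-halves method (omitting the median when n is odd), so other software may report slightly different quartiles.

```python
from statistics import median

def five_number_summary(data):
    """Return (lower extreme, lower quartile, median, upper quartile, upper extreme)."""
    ordered = sorted(data)            # arrange the data in order first
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Two hypothetical classes to compare, as in a double box-and-whisker plot.
class_a = [55, 62, 64, 70, 71, 78, 90]
class_b = [48, 60, 65, 66, 72, 80, 85, 95]

print(five_number_summary(class_a))
print(five_number_summary(class_b))
```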

Radar charts
A radar chart looks like a spider web and is used to compare the performance of one or more entities.
HSC Hint – Line segments in a radar chart must be constructed accurately to ensure the information is valid.

Area chart
A graph consisting of different ‘areas’, each representing a data set over a period of time. The thickness of the area indicates the size of the data.
HSC Hint – To read data from an area chart, draw a vertical line and estimate the difference between the heights.

Comparison – Measures of location
Mean
Advantages: Easy to understand and calculate. Depends on every score. Varies least from sample to sample.
Disadvantages: Distorted by outliers. Not suitable for categorical data.
Median
Advantages: Easy to understand. Not affected by outliers.
Disadvantages: May not be central. Varies more than the mean in a sample.
Mode
Advantages: Easy to determine. Not affected by outliers. Suitable for categorical data.
Disadvantages: May be no mode or more than one mode. May not be central.

Comparison – Measures of spread
Range
Advantages: Easy to understand. Easy to calculate.
Disadvantages: Dependent on the smallest and largest values. May be distorted by outliers.
Interquartile range
Advantages: Easy to determine for small data sets. Not affected by outliers.
Disadvantages: Difficult to calculate for large data sets. Dependent on lower and upper quartiles. Data needs to be sorted.
Standard deviation
Advantages: Depends on every score.
Disadvantages: Difficult to determine without a calculator. Difficult to understand.

Two-way tables
A two-way table presents data using rows and columns. Data in a cell is interpreted by reading the headings for the row and the column.
HSC Hint – Calculate the totals across each row and down each column. Then add the row totals and add the column totals. The two results should be equal.
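The check described in the hint can be sketched with a small two-way table. The row headings, column headings and counts below are invented for illustration.

```python
# Hypothetical two-way table: rows are year groups, columns are travel methods.
table = {
    "Year 11": {"Walk": 12, "Bus": 18},
    "Year 12": {"Walk": 9, "Bus": 21},
}

# Totals across each row.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Totals down each column.
column_totals = {}
for cells in table.values():
    for column, count in cells.items():
        column_totals[column] = column_totals.get(column, 0) + count

# Adding the row totals and adding the column totals should give the same result.
grand_from_rows = sum(row_totals.values())
grand_from_columns = sum(column_totals.values())
print(row_totals, column_totals, grand_from_rows, grand_from_columns)
```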