Describing and Displaying Quantitative data. Summarizing continuous data Displaying continuous data Within-subject variability Presentation.

Presentation transcript:

Describing and Displaying Quantitative data

Outline: summarizing continuous data; displaying continuous data; within-subject variability; presentation.

Summarizing continuous data
A quantitative measurement contains more information than a categorical one. The two most important pieces of information about a quantitative measurement are: where is it, and how variable is it? These are answered by a measure of central tendency and a measure of spread (variability).

Measure of central tendency

Example: students' record [table: one row per student; columns subj1, subj2, subj3, subj4, Mean]

The mean (average) uses all the data values and is, in a statistical sense, efficient. Its weakness is that it is vulnerable to outliers: single observations that have a noticeable influence on the results. An outlier that is clearly an error, such as a mistyped value, should be excluded from the final data summary.

Baby    Weight (kg)
B1      1.2
B2      1.3
B3      1.4
B4      1.5
B5      2.1
Mean    1.5

Baby    Weight (kg)
B1      1.2
B2      1.3
B3      1.4
B4      1.5
B5      21
Mean    5.28
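The effect of a single mistyped weight on the mean can be checked with a few lines of Python. This is a minimal sketch using the five baby weights listed above; the variable names are illustrative only.

```python
from statistics import mean

# Baby weights in kg, as listed in the first table.
weights = [1.2, 1.3, 1.4, 1.5, 2.1]

# Same data, but the last weight is entered as 21 instead of 2.1.
weights_typo = [1.2, 1.3, 1.4, 1.5, 21]

mean_ok = round(mean(weights), 2)
mean_typo = round(mean(weights_typo), 2)

print("mean without the typo:", mean_ok)
print("mean with the typo:   ", mean_typo)
```

With these five values, one wrong entry moves the mean from 1.5 kg to well above every correctly recorded weight.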

The median is found by first ordering the data from smallest to largest and then counting upwards through half of the observations. It is the center observation when the sample size is odd, or the average of the two middle observations when the sample size is even.

Example: students' record [table: one row per student; columns subj1, subj2, subj3, subj4, Mean, Median; final row gives the average and the median]

The median, unlike the mean, is not unduly affected by outliers.
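The ordering-and-counting rule for the median can be sketched as follows. The function is a plain illustration of the rule, not a library routine, and the even-sized sample is made up for the example.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Baby weights (kg) with and without the mistyped value.
weights = [1.2, 1.3, 1.4, 1.5, 2.1]
weights_typo = [1.2, 1.3, 1.4, 1.5, 21]

# Hypothetical even-sized sample, to show the averaging step.
even_sample = [2, 4, 6, 8]

median_ok = median(weights)
median_typo = median(weights_typo)
median_even = median(even_sample)

print(median_ok, median_typo, median_even)
```

The median of the baby weights is the same whether the last value is 2.1 or 21, which is the robustness the slide describes.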

Mode
The mode is the value that occurs most frequently; if the data are grouped, it is the group with the highest frequency. It is useful for categorical data, to report the most frequent category.

Example
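As a small illustration of the mode for categorical data, the sketch below counts categories and reports the most frequent one. The blood-group values are hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical categorical observations (not from the slides).
blood_groups = ["O", "A", "O", "B", "A", "O", "AB"]

counts = Counter(blood_groups)

# most_common(1) returns a list with the single (category, frequency) pair
# that has the highest count.
mode_value, mode_count = counts.most_common(1)[0]

print("mode:", mode_value, "frequency:", mode_count)
```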

Measures of dispersion or variability
Range and interquartile range
The range is given by the smallest and largest observations and is the simplest measure of variability. Example: for an age variable, we would like to know the ages of the youngest and oldest participants. The presence of outliers will give a distorted impression of the variability.

The quartiles, namely the lower quartile, the median, and the upper quartile, divide the data into four equal parts. First order the data, then count the appropriate number of observations from the bottom. The interquartile range is a useful measure of variability and is given by the difference between the upper and lower quartiles.

Example: quartiles of the Mean column
Lower quartile (25th percentile)     51
Median (50th percentile)
Upper quartile (75th percentile)     67
Interquartile range                  from 51 to 67

The interquartile range is not vulnerable to outliers, and we know that 50% of the data lie within it.
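Quartiles can be computed with Python's standard library. The sketch below uses made-up marks, chosen only so that the lower and upper quartiles come out at 51 and 67 like the example above; they are not the slide's data. Note that several quartile conventions exist; `statistics.quantiles` defaults to the "exclusive" method, and other software may give slightly different values on small samples.

```python
from statistics import quantiles

# Hypothetical marks (assumption: NOT the data behind the slide's example).
marks = [40, 45, 51, 55, 58, 60, 62, 65, 67, 70, 75]

# n=4 asks for the three cut points that split the data into four parts.
lower_q, median_q, upper_q = quantiles(marks, n=4)

# Width of the interquartile range: upper quartile minus lower quartile.
iqr_width = upper_q - lower_q

print("lower quartile:", lower_q)
print("median:        ", median_q)
print("upper quartile:", upper_q)
print("IQR width:     ", iqr_width)
```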

Standard Deviation and Variance
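The transcript does not preserve the formulas for this slide. For reference, the usual sample definitions are sketched below, assuming the common $n-1$ divisor; $x_1, \dots, x_n$ are the observations and $\bar{x}$ is their mean.

```latex
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
\]
```

Here $s^2$ is the variance and $s$ is the standard deviation, which is expressed in the same units as the data.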

Example: students' record [table: one row per student; columns subj1, subj2, subj3, subj4, Standard Deviation]. Total standard deviation: 31.7

Why is the standard deviation useful?
[Figure: normal distribution curve shaded in bands of one standard deviation.] Dark blue is less than one standard deviation from the mean. For the normal distribution, this accounts for 68.27% of the set; two standard deviations from the mean (medium and dark blue) account for 95.45%; three standard deviations (light, medium, and dark blue) account for 99.73%; and four standard deviations account for 99.994%.
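The coverage percentages for the normal distribution can be reproduced from the error function: the fraction within k standard deviations of the mean is erf(k / √2). A minimal sketch (the function name `coverage` is illustrative):

```python
import math

def coverage(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3, 4):
    print(f"within {k} SD: {coverage(k) * 100:.3f}%")
```

This applies only to data that are approximately normally distributed; for skewed data the percentages can differ considerably.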

Example: the median age at menopause for the cases was 50.1 years and the interquartile range was 48.6 to 52.5 years. Thus we know that 50% of the women experienced the menopause within an age range of about 4 years (52.5 - 48.6 = 3.9).

Displaying Continuous Data
A picture is worth a thousand words, or numbers, so there is no better way to present data than with figures or graphs. A graph or figure should convey as much information as possible, with one constraint: the reader must not be overwhelmed by too much data.

Dot plot Example

Histogram: used with large sets of numerical data. The data are divided into non-overlapping intervals, and the number of observations in each interval is counted. Example
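The counting step behind a histogram can be sketched without any plotting library. The ages, the bin start, and the bin width below are hypothetical choices made for illustration.

```python
def histogram_counts(values, start, width, n_bins):
    """Count observations in n_bins non-overlapping intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts

# Hypothetical ages in years (not from the slides).
ages = [15, 18, 22, 24, 24, 31, 38, 45, 52, 71]

# Seven 10-year classes: 10-19, 20-29, ..., 70-79.
counts = histogram_counts(ages, start=10, width=10, n_bins=7)

for i, c in enumerate(counts):
    low = 10 + 10 * i
    print(f"{low}-{low + 9}: {'*' * c}")
```

Because the intervals do not overlap, every observation is counted exactly once, and the counts add up to the sample size.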

Box-and-whisker plot
A box-and-whisker plot presents the information more compactly. The whiskers in the diagram indicate the minimum and maximum values of the variable under consideration. The median is indicated by the central horizontal line, and the lower and upper quartiles by the corresponding horizontal ends of the box. The shaded box itself represents the interquartile range.

The box-and-whisker plot is used to display the median and two measures of spread, namely the range and the interquartile range.
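The five values that such a plot draws (minimum, lower quartile, median, upper quartile, maximum) can be collected in one helper. This is a sketch on hypothetical ages; `five_number_summary` is an illustrative name, and the quartiles follow the default "exclusive" convention of `statistics.quantiles`, which may differ slightly from other software.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, med, q3 = quantiles(values, n=4)
    return min(values), q1, med, q3, max(values)

# Hypothetical ages in years (not from the slides).
ages = [15, 20, 22, 24, 24, 30, 35, 40, 52, 60, 71]

summary = five_number_summary(ages)
print("min, Q1, median, Q3, max:", summary)
```

The first and last entries give the whisker ends, the second and fourth the ends of the box, and the third the central line.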

Scatter plot
It is used to illustrate the relationship between two continuous variables.

Measures of Symmetry
Dot plots and histograms give an idea of the shape of the distribution of the data. A distribution is symmetric if, when the shape is folded over its central point, the two halves agree; otherwise it is skewed, either to the left or to the right. If the distribution is symmetric, the mean and the median will be close to each other.

If the distribution is skewed, the median and interquartile range are more appropriate summary measures than the mean and standard deviation, because the mean and standard deviation are sensitive to skewness. Example: with mean = 1.31 and median = 1.34, we can conclude that the data are reasonably symmetric.

Example: a median of 50.1 that is not exactly midway between the first and third quartiles of 48.6 and 52.5 indicates skewness in the data distribution.
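The asymmetry in this example can be made concrete from the three quoted values alone, by comparing the distance from the median to each quartile:

```latex
\[
50.1 - 48.6 = 1.5,
\qquad
52.5 - 50.1 = 2.4
\]
```

The upper half of the box is wider than the lower half, which suggests a longer tail towards higher ages. This is a rough reading of the quartiles, not a formal test of skewness.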

Within-subject variability
When a measurement is taken once per subject (for example, the weight of a baby), the variability, expressed by the standard deviation, is called between-subject variability (the measurement does not change frequently within a subject). When measurements are taken repeatedly on one subject, we are assessing within-subject variability (the measurement changes frequently within a subject).

Within-subject values are unlikely to be independent: consecutive values will depend on the values preceding them. In the investigation of total variability it is very important to distinguish within-subject from between-subject variability. The experimenter must be aware of the possible sources that contribute to the variation, decide which are of importance in the intended study, and design the study appropriately.
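One simple way to see the two kinds of variability side by side is to compute a standard deviation per subject (within-subject) and a standard deviation of the subject means (between-subject). The sketch below uses hypothetical repeated measurements; it is a descriptive illustration only and ignores the dependence between consecutive values noted above.

```python
from statistics import mean, stdev

# Hypothetical repeated measurements on three subjects (not from the slides).
measurements = {
    "subject_a": [120, 124, 118, 122],
    "subject_b": [135, 131, 137, 133],
    "subject_c": [110, 112, 108, 110],
}

# Within-subject variability: spread of each subject's own repeated values.
within_sd = {name: stdev(values) for name, values in measurements.items()}

# Between-subject variability: spread of the subjects' mean values.
subject_means = [mean(values) for values in measurements.values()]
between_sd = stdev(subject_means)

for name, sd in within_sd.items():
    print(f"{name}: within-subject SD = {sd:.2f}")
print(f"between-subject SD of means = {between_sd:.2f}")
```

In this made-up data the subjects differ from one another much more than any one subject varies from occasion to occasion.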

Exercise
The ages (in years) of a sample of 20 motorcyclists killed in road traffic accidents are given below:
Calculate the mean, median, and mode.
Calculate the range, interquartile range, and standard deviation. Which of these best describes the variability of these data?
Draw a dot plot and a histogram. Is the distribution symmetric or skewed?

Mean = 30.9, Median = 24, Mode = 24, SD =

Age classes    Frequency
More           2

Min                    15
Max                    71
Range                  56
Quartile 1             20
Quartile 3             35
Interquartile range    15
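The range and interquartile range in the answer follow directly from the other summary values, and comparing the mean with the median gives a quick hint about the shape. A small check using only the figures quoted in the answer (variable names are illustrative):

```python
# Summary values quoted in the exercise answer.
minimum, maximum = 15, 71
quartile_1, quartile_3 = 20, 35
mean_age, median_age = 30.9, 24

age_range = maximum - minimum
iqr = quartile_3 - quartile_1

# A mean well above the median suggests a longer tail towards older ages.
shape_hint = "right-skewed" if mean_age > median_age else "left-skewed or symmetric"

print("range:", age_range)
print("interquartile range:", iqr)
print("shape hint:", shape_hint)
```

Since the mean is pulled upwards by a few older riders, the median and interquartile range are the more appropriate summaries here, in line with the guidance on skewed data.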