
Presentation transcript:

1 Chapter 7 Looking at Distributions

2 Modeling by a Distribution For a given data set, we want to know which distribution fits each variable. This is a modeling problem. When prior knowledge suggests a specific type of distribution (normal, exponential, Poisson), a goodness-of-fit test is useful for checking that choice. Q-Q plots are also very useful for finding a suitable distribution to fit the data.
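As a rough illustration of what a normal Q-Q plot compares, the sketch below pairs each sorted observation with the quantile a fitted normal distribution would predict for it. The function name `normal_qq_points`, the plotting positions (i - 0.5)/n, and the sample values are assumptions made for this example; they are not taken from the slides or from SPSS.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    xs = sorted(values)
    n = len(xs)
    # Fit a normal distribution using the sample mean and standard deviation.
    fitted = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Hypothetical completion times in hours, for illustration only.
times = [3.1, 3.4, 3.6, 3.8, 4.0, 4.1, 4.3, 4.6, 5.0, 5.9]

for theoretical, observed in normal_qq_points(times):
    print(f"{theoretical:5.2f}  {observed:5.2f}")
```

If the normal model fits, the pairs lie close to the line observed = theoretical; systematic curvature suggests trying another distribution.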

3 Two data sets The contents of this chapter are from Chapter 7 of the textbook. The textbook uses the data set marathon.sav to show how to use SPSS to look at distributions. The Chicago Marathon has been run yearly since [...]. Because the student version of SPSS limits the number of rows and columns, we use a similar data set, mar1500.sav, instead.

4 Data set “mar1500.sav” The data set contains the following variables: “age”, “sex”, “hours”, “agecat8”, and “agecat6”. hours: completion time in hours. agecat8: 1 = 24 or less, 2 = 25-39, 3 = 40-44, 4 = 45-49, 5 = 50-54, 6 = 55-59, 7 = 60-64, 8 = 65+. agecat6: 1 = 44 or less, 2 = 45-49, 3 = 50-54, 4 = 55-59, 5 = 60-64, 6 = 65+.
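The two age codings listed on this slide can be expressed as a lookup on category boundaries. The sketch below is only an illustration: the names `AGECAT8_UPPER`, `AGECAT6_UPPER`, and `age_category` are hypothetical, and ages are assumed to be whole years.

```python
from bisect import bisect_left

# Inclusive upper bound of every category except the last, open-ended one.
AGECAT8_UPPER = [24, 39, 44, 49, 54, 59, 64]  # codes 1..8 (8 = 65+)
AGECAT6_UPPER = [44, 49, 54, 59, 64]          # codes 1..6 (6 = 65+)

def age_category(age, upper_bounds):
    """Return the 1-based category code for an age in whole years."""
    # bisect_left counts the bounds that lie strictly below the age.
    return bisect_left(upper_bounds, age) + 1

for age in (24, 25, 44, 45, 64, 65):
    print(age, age_category(age, AGECAT8_UPPER), age_category(age, AGECAT6_UPPER))
```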

5 Histogram

6 Impressions of the histogram The mean falls in [...]. The distribution is not symmetric about the mean: it has a tail toward larger times. Low marathon times are difficult to achieve, since it is hard to break the world record. Because the distribution has a tail toward larger values, the median should be somewhat less than the mean.
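The relationship between a right tail and the mean and median can be checked on a small example. The times below are hypothetical and chosen only to have a tail toward larger values; `bin_counts` is a helper name invented for this sketch, and the half-hour bin width is an arbitrary choice.

```python
from collections import Counter
from statistics import mean, median

def bin_counts(values, start, width):
    """Count how many values fall in each bin [start + k*width, start + (k+1)*width)."""
    return Counter(int((v - start) // width) for v in values)

# Hypothetical completion times in hours with a tail toward larger values.
times = [3.2, 3.5, 3.7, 3.9, 4.0, 4.2, 4.5, 5.1, 5.8, 6.6]

counts = bin_counts(times, start=3.0, width=0.5)
for k in range(max(counts) + 1):
    # One '*' per case gives a crude text histogram.
    print(f"{3.0 + 0.5 * k:4.1f} | {'*' * counts[k]}")

print("mean:", mean(times), "median:", median(times))
```

The few large times pull the mean upward while the median stays near the bulk of the data, which is the pattern the slide describes.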

7 Basic statistics

8 The 5% trimmed mean excludes the 5% largest and the 5% smallest values. It is based on the middle 90% of cases. The trimmed mean provides an alternative to the median when you have some outliers. In these data the 5% trimmed mean doesn’t differ much from the usual mean, because the distribution is not too far from being symmetric.
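A minimal sketch of a trimmed mean follows. It simply drops a whole number of cases from each end; SPSS may handle a fractional number of trimmed cases differently, so this is a simplification rather than a reproduction of the SPSS calculation. The function name and the data are illustrative only.

```python
from statistics import mean

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the given proportion of cases from each end."""
    xs = sorted(values)
    k = int(len(xs) * proportion)  # whole cases dropped per end (simplification)
    return mean(xs[k:len(xs) - k])

# Hypothetical data: 19 ordinary values and one large outlier.
data = list(range(1, 20)) + [100]

print("mean:", mean(data))
print("5% trimmed mean:", trimmed_mean(data))
```

With 20 cases, one case is removed from each end, so the outlier no longer affects the result.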

9 Comparisons of completion time by gender

10 Comparisons of completion time by gender

11 Comparisons of completion time by gender The difference between men and women in all of the percentile values of completion time is about [...] hour. Weighted percentiles and Tukey’s hinges are two different ways of calculating sample percentiles. For more details, see p. 120.
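To see how two percentile definitions can disagree on the same data, the sketch below computes quartiles in two ways. The weighted version interpolates at position (n + 1)p in the sorted data, which is one common weighted-average definition; whether it matches the SPSS output exactly is an assumption. The hinges are taken as the medians of the lower and upper halves, with the overall median included in both halves when n is odd. Function names and data are illustrative.

```python
import math
from statistics import median

def weighted_percentile(values, p):
    """Percentile by linear interpolation at position (n + 1) * p (1-based)."""
    xs = sorted(values)
    n = len(xs)
    pos = (n + 1) * p
    k = math.floor(pos)
    if k < 1:
        return xs[0]
    if k >= n:
        return xs[-1]
    return xs[k - 1] + (pos - k) * (xs[k] - xs[k - 1])

def tukey_hinges(values):
    """Lower and upper hinges: medians of the lower and upper halves."""
    xs = sorted(values)
    half = (len(xs) + 1) // 2  # includes the median itself when n is odd
    return median(xs[:half]), median(xs[len(xs) - half:])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical values
print(weighted_percentile(data, 0.25), weighted_percentile(data, 0.75))
print(tukey_hinges(data))
```

For small samples the two methods can give visibly different quartiles; for large samples such as the marathon data the difference is usually small.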

12 Histogram of completion times for women

13 Histogram of completion times for men

14 Age and Gender

15 Age and Gender

16 Boxplots of completion times by age and gender

17 Remarks Average completion times for men and women of different ages are shown. In every age group, the average time for men is less than the average time for women. For men and women younger than 45, age does not seem to matter very much. For both men and women, the variability of completion times is stable across age groups, except in the oldest age group.

18 Detecting outliers Cases with values between 1.5 and 3 box lengths from the upper or lower edge of the box are called outliers and are designated with an “o”. Cases with values of more than 3 box lengths from the upper or lower edge of the box are called extreme values. They are designated with “*”.
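The rule on this slide can be sketched in a few lines. The box edges are taken here to be Tukey's hinges and the box length is the distance between them; that choice, the function name `classify_cases`, and the data are assumptions made for illustration.

```python
from statistics import median

def classify_cases(values):
    """Label each value as 'ordinary', 'outlier' (o), or 'extreme' (*)."""
    xs = sorted(values)
    half = (len(xs) + 1) // 2
    lower = median(xs[:half])            # lower edge of the box
    upper = median(xs[len(xs) - half:])  # upper edge of the box
    box = upper - lower                  # box length
    labelled = []
    for x in values:
        # Distance beyond the nearer box edge; 0 for values inside the box.
        dist = max(lower - x, x - upper, 0)
        if dist > 3 * box:
            labelled.append((x, "extreme"))
        elif dist > 1.5 * box:
            labelled.append((x, "outlier"))
        else:
            labelled.append((x, "ordinary"))
    return labelled

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 30]  # hypothetical values
for value, label in classify_cases(data):
    print(value, label)
```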

19

20 Stem-and-leaf plots A stem-and-leaf plot is a display very much like a histogram, but it retains more information about the data. In a stem-and-leaf plot, each row corresponds to a stem and each case is represented by a leaf.

21 Stem-and-leaf plots The following are the prices paid by 15 students eating lunch at a fast-food restaurant: 5.35, 4.75, 4.30, 5.47, 4.85, 6.62, 3.54, 4.87, 6.26, 5.48, 7.27, 8.45, 6.05, 4.76, [...]. The first value, 5.35, is rounded to 5.4. The second value, 4.75, is rounded to 4.8. Their stems are 5 and 4, respectively. Their leaves are 4 and 8, respectively.
Frequency  Stem | Leaf
    1       3   | 5
    5       4   | 83998
    4       5   | 4559
    3       6   | 631
    1       7   | 3
    1       8   | 5
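The construction on this slide can be sketched as follows: round each value to one decimal place, then use the integer part as the stem and the first decimal as the leaf. The code uses only the 14 prices that are legible in the slide text, so the stem-5 row has one leaf fewer than a plot of all 15 values would. The function name is hypothetical, and rounding half up is assumed because the slide gives leaves 4 and 8 for 5.35 and 4.75.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# The 14 prices legible in the slide text, kept as strings for exact rounding.
PRICES = ["5.35", "4.75", "4.30", "5.47", "4.85", "6.62", "3.54",
          "4.87", "6.26", "5.48", "7.27", "8.45", "6.05", "4.76"]

def stem_and_leaf(values):
    """Map each stem to its leaves, in the order the values were given."""
    rows = defaultdict(list)
    for v in values:
        # Round half up to one decimal place (assumed rounding rule).
        rounded = Decimal(v).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
        stem, leaf = divmod(int(rounded * 10), 10)
        rows[stem].append(leaf)
    return dict(sorted(rows.items()))

for stem, leaves in stem_and_leaf(PRICES).items():
    # Frequency, stem, then one digit per case.
    print(f"{len(leaves):2d}  {stem} | {''.join(map(str, leaves))}")
```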

22 Stem-and-leaf plots SPSS output: stem-and-leaf plot of completion time in hours for agecat6 = [...]. Columns: Frequency, Stem & Leaf. Extremes: >= 6.2. Stem width: 1.00. Each leaf: 1 case.