Laugh, and the world laughs with you. Weep, and you weep alone. ~Ella Wheeler Wilcox~

Presentation transcript:

1 Laugh, and the world laughs with you. Weep, and you weep alone. ~Ella Wheeler Wilcox~

2 Chapter 3: Data Description Types of data Graphical/Numerical summaries

3 What are Data? Any set of data contains information about some group of individuals. The information is organized in variables.

4 Terms A population is a collection of all individuals about which information is desired. A sample is a subset of a population. A variable is a characteristic of an individual. The distribution of a variable tells us what values/categories it takes and how often it takes those values/categories in the population.

5 Data Analysis Goal: to study how variables relate to one another in a population Method: estimating the distributions of variables (in the whole population) by summarizing the distributions of data on those variables

6 Example: A College’s Student Dataset The data set includes data about all currently enrolled students such as their ages, genders, heights, grades, and choices of major. Who? What individuals do the data describe? Population/sample of study? What? How many variables do the data describe? Give an example of variables.

7 Types of Variables A categorical variable places an individual into one of several groups or categories. A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense. Q. Which variables are categorical? Which are quantitative?

8 A variable is either Categorical/Qualitative or Numerical/Quantitative. Q: Does “average” make sense?
– Categorical/Qualitative (“average” does not make sense). Q: Is there any natural ordering among categories?
  – Nominal variable: no
  – Ordinal variable: yes
– Numerical/Quantitative (“average” makes sense). Q: Can all possible values be listed?
  – Discrete variable: yes
  – Continuous variable: no

9 Two Basic Strategies to Explore Data Begin by examining each variable by itself. Then move on to study the relationship among the variables. Begin with a graph or graphs. Then add numerical summaries of specific aspects of the data.

10 A Dataset of CSUEB Students
Gender | Height (inches) | Weight (pounds) | College
M | | | Bsns
F | 61.2 | 99 | Sci
F | | | Bsns
M | | | Sci
M | | | Bsns
F | | | Arts
M | | | Arts
M | -- | 188 | Sci

11 Summarizing Data We will move from summarizing data on one variable to summarizing data on several variables, by: Displaying the distribution of data with graphs Describing the distribution of data with numbers

12 Terms Frequency = the # of individuals in a category or at a value. Relative frequency = the % of individuals in a category or at a value. They both can be used to display the distribution of data.
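As a rough illustration of these two terms, the sketch below tabulates frequency and relative frequency for the College labels as they appear in the transcript of the CSUEB table. The function name `frequency_table` is made up for this example.

```python
from collections import Counter

# College labels as they appear in the transcript of the CSUEB table.
colleges = ["Bsns", "Sci", "Bsns", "Sci", "Bsns", "Arts", "Arts", "Sci"]

def frequency_table(values):
    """Return {category: (frequency, relative frequency in %)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, 100 * k / n) for cat, k in counts.items()}

table = frequency_table(colleges)
for cat, (freq, pct) in sorted(table.items()):
    print(f"{cat:5s} frequency={freq}  relative frequency={pct:.1f}%")
```

The relative frequencies are also the percentages needed to slice a pie chart, and the frequencies are the bar heights of a bar graph.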

13 Graphical Tools for One Variable For a categorical variable: – Pie charts – Bar graphs For a quantitative variable: – Histograms – Stem-and-leaf plots (read on your own) – Boxplots

14 How to Make a Pie Chart 1. Calculate the % for each category 2. Draw a pie and slice it accordingly.

15 Pie Chart Class Make-up on First Day

16 How to Make a Bar Chart 1. Label frequencies on one axis and categories of the variable on the other axis. 2. Construct a rectangle at each category of the variable with a height equal to the frequency in the category. 3. Leave a space between categories

17 Class Make-up on First Day Bar Graph

18 Displaying Distributions of Quantitative Variables Stem-and-leaf plots: good for small to medium datasets Histograms: Similar to bar charts; good for medium to large datasets

19 How to Make a Histogram 1. Divide the range of data by the approximate # of intervals desired (usually 5-20). Round the resulting number to a convenient number (the common width for the intervals). 2. Construct intervals with the common width so that the first interval contains the smallest data value and the last interval contains the largest data value. 3. Draw the histogram: the variable on the horizontal axis and the count (or %) on the vertical axis.

20 Histograms: Class Intervals How many intervals? – One rule is to calculate the square root of the sample size, and round up. Size of intervals? – Divide the range of the data (max − min) by the number of intervals desired, and round to a convenient number. Pick intervals so that each observation falls in exactly one interval (no overlap). (BPS - 5th Ed., Chapter 1)
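The interval-building steps can be sketched in a few lines of Python. This is only one possible reading of the procedure. It assumes that a "convenient" width means rounding up to a whole number, that the first interval starts at the smallest value rounded down, and that intervals are half-open so no observation can fall in two of them. The weights are hypothetical, not the class data.

```python
import math

def class_intervals(data, n_bins=None):
    """Return (edges, counts) for equal-width, non-overlapping intervals."""
    if n_bins is None:
        # Square-root rule: sqrt of the sample size, rounded up.
        n_bins = math.ceil(math.sqrt(len(data)))
    lo, hi = min(data), max(data)
    # Assumption: "convenient" width = range / bins, rounded up to an integer.
    width = max(1, math.ceil((hi - lo) / n_bins))
    start = math.floor(lo)
    # Enough intervals that the last one contains the largest value.
    n_actual = int((hi - start) // width) + 1
    counts = [0] * n_actual
    for x in data:
        # Half-open intervals [a, a + width): each value lands in exactly one.
        counts[int((x - start) // width)] += 1
    edges = [start + i * width for i in range(n_actual + 1)]
    return edges, counts

# Hypothetical weights in pounds (illustrative only).
weights = [100, 112, 125, 130, 142, 150, 155, 160,
           165, 170, 172, 180, 185, 190, 203, 215]
edges, counts = class_intervals(weights)
print(edges)   # interval boundaries
print(counts)  # number of observations in each interval
```

With 16 observations the square-root rule suggests 4 intervals. The counts are the bar heights of the histogram.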

21 What do We See from Histograms? Important features we should look for: Overall pattern – Shape – Center – Spread Outliers, the values that fall far outside the overall pattern

22 How to Make a Stemplot 1. Separate each observation into a stem consisting of all but the final (rightmost) digit and a leaf, the final digit. Stems may have as many digits as needed, but each leaf contains only a single digit. Example: for a height of 68.5, the leaf is “5” and the remaining digits “68” form the stem.

23 How to Make a Stemplot 2. Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column. 3. Write each leaf in the row to the right of its stem, in increasing order out from the stem.
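The three stemplot steps can be tried out with a short Python sketch. It assumes whole-number observations, so that the leaf is the ones digit and the stem is everything before it. Printing stems that have no leaves is a choice made here, not something the slides specify, and the weights are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stemplot for non-negative whole numbers."""
    rows = defaultdict(list)
    for v in sorted(values):           # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)     # stem = all but the final digit
        rows[stem].append(leaf)
    lines = []
    # Step 2: stems in a column, smallest at the top (empty stems included).
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(d) for d in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical weights in pounds (illustrative only).
weights = [100, 112, 125, 130, 142, 150, 155, 203]
for line in stemplot(weights):
    print(line)
```

Here the row `20 | 3` stands for 203 pounds, and `15 | 05` holds the two observations 150 and 155.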

24 Weight Data: Stemplot (Stem & Leaf Plot) Key: 20 | 3 means 203 pounds. Stems = 10’s, Leaves = 1’s.

25 Overall Pattern—Shape How many peaks, called modes? A distribution with one peak is called unimodal. Symmetric or skewed? – Symmetric if the large values are mirror images of small values – Skewed to the right if the right tail (large values) is much longer than the left tail (small values) – Skewed to the left if the left tail (small values) is much longer than the right tail (large values)

26 Describing Data on a Quantitative Variable
(Sec 3.4) To measure center: Mode, Mean and Median
(Sec 3.5) To measure variability: Range, Interquartile Range (IQR) and Standard Deviation (SD)
Outliers
(Sec 3.6) Five-number summary and boxplot

27 Quartiles Three numbers that divide the ordered data into four equal-sized groups. Q1 has 25% of the data below it. Q2 has 50% of the data below it (the median). Q3 has 75% of the data below it. (BPS - 5th Ed., Chapter 2)

28 Obtaining the Quartiles Order the data. For Q2, just find the median. For Q1, look at the lower half of the data values, those to the left of the median location; find the median of this lower half. For Q3, look at the upper half of the data values, those to the right of the median location; find the median of this upper half.
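A minimal Python sketch of this median-of-halves procedure follows. When the number of observations is odd, the median itself is left out of both halves, which matches "to the left of" and "to the right of" the median location. Statistical software often uses other quartile conventions, so results may differ slightly from this hand method. The data are hypothetical.

```python
def median(sorted_vals):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2:                      # odd count: the single middle value
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    xs = sorted(data)              # step 1: order the data
    n = len(xs)
    half = n // 2
    lower = xs[:half]              # values left of the median location
    upper = xs[half + (n % 2):]    # values right of the median location
    return median(lower), median(xs), median(upper)

# Hypothetical weights in pounds (illustrative only).
weights = [100, 112, 125, 130, 142, 150, 155, 160, 165]
q1, q2, q3 = quartiles(weights)
print(q1, q2, q3)
```

With nine values the median is the 5th ordered value, and each half holds four values, so Q1 and Q3 are each the average of two observations.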

29 Weight Data: Sorted
Location of the median: L(M) = (53 + 1)/2 = 27
Location of the first quartile: L(Q1) = (26 + 1)/2 = 13.5

30 Weight Data: Quartiles Q1 = 127.5 Q2 = 165 (Median) Q3 = 185

31 Five-Number Summary
minimum = 100
Q1 = 127.5
M = 165
Q3 = 185
maximum =
Interquartile Range (IQR) = Q3 − Q1 = 57.5
IQR gives the spread of the middle 50% of the data.

32 Weight Data: Boxplot (figure: boxplot of Weight with min, Q1, M, Q3, and max marked)

33 Identifying Outliers The central box of a boxplot spans Q1 and Q3; recall that this distance is the Interquartile Range (IQR). We call an observation a mild (or extreme) outlier if it falls more than 1.5 (or 3.0) × IQR above the third quartile or below the first quartile.
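The rule can be written as a small Python function. The sketch uses Q3 = 185 and IQR = 57.5 from the weight data, which implies Q1 = 185 − 57.5 = 127.5. The observations being classified (203, 300, 400, 40) are hypothetical, and the function name is made up for this example.

```python
def classify_outlier(x, q1, q3):
    """Label x as 'extreme', 'mild', or 'none' using the 3.0 and 1.5 x IQR rules."""
    iqr = q3 - q1
    if x < q1 - 3.0 * iqr or x > q3 + 3.0 * iqr:
        return "extreme"
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild"
    return "none"

q1, q3 = 127.5, 185.0   # Q1 derived as Q3 - IQR = 185 - 57.5
# Mild-outlier cutoffs: 127.5 - 86.25 = 41.25 and 185 + 86.25 = 271.25
# Extreme-outlier cutoffs: 127.5 - 172.5 = -45 and 185 + 172.5 = 357.5
for x in (203, 300, 400, 40):
    print(x, classify_outlier(x, q1, q3))
```

Under these cutoffs a weight of 203 pounds is not an outlier, while a weight above 271.25 pounds would be flagged as at least a mild one.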

34 Summarizing Data from 2 Variables
2 categorical variables: contingency table; (clustered or stacked) bar chart
2 quantitative variables: regression equation; scatterplot
1 categorical + 1 quantitative variable: side-by-side boxplot

35 Time Plots A time plot shows behavior over time. Time is always on the horizontal axis, and the variable being measured is on the vertical axis. Look for an overall pattern (trend), and deviations from this trend. Connecting the data points by lines may emphasize this trend. Look for patterns that repeat at known regular intervals (seasonal variations).

36 Average Tuition (Public vs. Private)

37 Empirical Rule (68-95-99.7 rule) If a variable X follows a normal distribution, that is, the X values of the whole population form a bell shape, then:
Mean(X) ± 1 × SD(X) covers about 68% of possible X values
Mean(X) ± 2 × SD(X) covers about 95% of possible X values
Mean(X) ± 3 × SD(X) covers about 99.7% of possible X values
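The three percentages can be checked numerically with the `NormalDist` class from Python's standard `statistics` module. The helper name `coverage_within` is made up for this sketch. The mean and SD do not matter, because the coverage depends only on the number of SDs.

```python
from statistics import NormalDist

def coverage_within(k, mean=0.0, sd=1.0):
    """Share of a normal population lying within k SDs of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * coverage_within(k):.2f}%")
```

The exact values round to 68.27%, 95.45%, and 99.73%, which the rule shortens to 68%, 95%, and 99.7%.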

38 z-Scores & The Empirical Rule Since the z-score is the number of standard deviations from the mean, we can easily interpret the z-score for bell-shaped populations using the Empirical Rule. When a population has a histogram that is approximately bell-shaped, then:
Approximately 68% of the data will have z-scores between –1 and 1.
Approximately 95% of the data will have z-scores between –2 and 2.
All, or almost all, of the data will have z-scores between –3 and 3.
Copyright ©2014 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
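A short Python sketch of the z-score calculation, z = (x − mean) / SD, is given below. It assumes the data are treated as a whole population, so it uses the population standard deviation (`pstdev`). With a sample one would typically use `stdev` instead. The data values are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(data):
    """Number of standard deviations each value lies from the mean."""
    m = mean(data)
    s = pstdev(data)   # assumption: data are the whole population
    return [(x - m) / s for x in data]

# Hypothetical data with mean 5 and population SD 2 (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]
zs = z_scores(data)
share_within_1 = sum(1 for z in zs if -1 <= z <= 1) / len(zs)
print(zs)
print(share_within_1)
```

Such a small data set is not bell-shaped, so the share of z-scores between –1 and 1 need not be close to 68%. The rule applies to approximately bell-shaped populations.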

39 Minitab Exercise Use the CSUEB dataset. 1. Key in the data in Minitab. 2. Draw all plots and calculate all statistics in Minitab.