Describing Data Week 1

The W’s (Where do the numbers come from?)
- Who: Who was measured?
- By whom: Who did the measuring?
- What: What was measured?
- Where: Where was the data measured?
- When: When was the measurement done?
- HoW: How was the data measured?
- Why: Why was the measurement done?

Always Check the W’s
Anytime you see data, always check the W’s. This will help you spot questionable statistics. ALWAYS QUESTION DATA.

Variables (The What)
- Variables are characteristics that are recorded about each individual.
- Categorical variables are non-numeric in nature.
- Quantitative variables are measurements and have units.

Displaying and Describing Categorical Data

Terms
- Frequency table: categories and their counts
- Distribution: lists the frequencies, or the relative frequencies, of each category
- Contingency table: the frequencies or relative frequencies for each combination of categories of two variables

Terms
- Marginal distribution: the totals found in the margins of the table; the distribution of one of the two variables on its own
- Conditional distribution: the distribution of one row or column of a contingency table
- Independence: two variables are independent if the conditional distribution of one variable is the same for every category of the other, and so matches the marginal distribution of that variable. (Huh!)
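The terms above can be made concrete with a small sketch. The survey records and the names `records`, `table` and `conditional_snack` are hypothetical, chosen only for illustration; the data are built so that the two variables come out independent.

```python
from collections import Counter

# Hypothetical survey: (class year, preferred snack) for each individual.
records = [
    ("junior", "fruit"), ("junior", "chips"), ("junior", "chips"),
    ("senior", "fruit"), ("senior", "fruit"), ("senior", "chips"),
    ("senior", "chips"), ("junior", "fruit"),
]

# Contingency table: one count for each (year, snack) combination.
table = Counter(records)

# Marginal totals: counts for one variable, ignoring the other.
row_totals = Counter(year for year, _ in records)
col_totals = Counter(snack for _, snack in records)
n = len(records)

# Marginal distribution of snack, as relative frequencies.
marginal_snack = {s: c / n for s, c in col_totals.items()}

def conditional_snack(year):
    """Distribution of snack among the individuals in one class year."""
    return {s: table[(year, s)] / row_totals[year] for s in col_totals}

for year in row_totals:
    print(year, conditional_snack(year))
print("marginal", marginal_snack)
```

In this made-up data every conditional distribution of snack matches the marginal distribution, which is what independence means in the definition above.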

Three Rules of Data Analysis First, make a picture!

Or you could

Why?
- Pictures reveal things charts don’t.
- Patterns can be revealed that are not readily apparent from the numbers.
- Pictures are the easiest way to explain the data to others.

To Make a Graph
- Make piles: organize the data into like groups
- Make a frequency table
- Make a relative frequency table by finding the percentages
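A minimal sketch of these steps, assuming a hypothetical list of letter grades (the name `grades` and its values are made up for illustration):

```python
from collections import Counter

# Hypothetical categorical data: one letter grade per student.
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A"]

# "Make piles" and count them: the frequency table.
frequency = Counter(grades)

# Relative frequency table: each count as a percent of the total.
total = sum(frequency.values())
relative = {cat: 100 * count / total for cat, count in frequency.items()}

for cat in sorted(frequency):
    print(f"{cat}: {frequency[cat]:2d}  {relative[cat]:5.1f}%")
```

The frequencies are what a bar chart would display; the percentages are what a pie chart would display.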

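Make a Graph
- Probably a bar chart to graph the frequencies, or...
- A pie chart to graph the relative frequencies
- Respect the area principle: the area taken up by each part of the graph should correspond to the size of the value it represents. Stay 2-D.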

To Make a Graph of Categorical Data: Think
- Check the W’s
- Identify the variables
- Check to see if categories overlap
- Data are counts

To Make a Graph of Categorical Data: Show
- Select the appropriate graph to compare categories
- Bar graph for frequencies
- Pie chart for relative frequencies (percents)
- A stacked bar graph can be used instead of a pie chart

To Make a Graph of Categorical Data: Tell
- Interpret the results
- Describe the results in the context of the problem
- Answers are sentences, not numbers

Displaying Quantitative Data More Graphs

Histograms: Think
- Must be quantitative data
- Want to see the distribution
- Could be counts or percents

Stem and Leaf Plots: Think
- Must be quantitative data
- Want to see the distribution
- Usually counts
- Relatively small sample size

Stem and Leaf Plot: Show
- Scale is usually vertical
- Put the ‘stems’ on the vertical scale
- Stems are usually the data without the last digit
- Data might be rounded
- If there are a lot of leaves on one stem, make dual stems and put leaves 0-4 on one and 5-9 on the other
- Plot the ‘leaves’
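As a rough sketch of the construction above, the following builds a text stem-and-leaf plot for non-negative whole numbers. The function name `stem_and_leaf` and the test scores are hypothetical, and dual stems are not handled.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot for non-negative integers.

    The stem is the value without its last digit; the leaf is the last digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    # Print empty stems too, so gaps in the distribution stay visible.
    lines = []
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

scores = [62, 75, 78, 81, 81, 84, 89, 90, 97, 100]  # hypothetical test scores
print("\n".join(stem_and_leaf(scores)))
```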

Dot Plot: Think
- Must be quantitative data
- Want to see the distribution
- Usually counts
- Relatively small sample size

Dot Plot: Show
- Scale can be vertical or horizontal
- Place a dot at the appropriate location

Describing the Distribution: Tell
- Shape
  - How many humps? Unimodal; bimodal (maybe more than one group thrown together); multimodal
  - Uniform
  - Symmetric
  - Skewed
  - Gaps
  - Clusters

Describing the Distribution: Tell (continued)
- Center
  - What is the middle value?
  - What is the middle range?

Describing the Distribution: Tell (continued)
- Spread
  - Range = maximum value - minimum value
  - Variation: how much does the data jump around?

Outliers
Discuss any data points that do not seem to fit the overall pattern. Is there a logical explanation for them to be that different?

Comparing Two Distributions
- Compare the centers of the two distributions
- Compare the shapes of the two distributions
- Compare the spreads of the two distributions
- Compare any extreme values (outliers) of the two distributions

Time Plot
Think:
- Quantitative data
- Looking for trends
Show:
- Time is the horizontal scale
- Plot the data
- Connect the dots
- Can use a calculator

Describing Distributions with Numbers

Measurements of the Center
- Mean: the ‘average’
  - µ: mean of a population
  - x̄: mean of a sample
  - Unique
- Median: the middle score
  - Sort the data
  - Middle score, or the average of the middle two scores
  - Unique

More Measures of Center
- Mode: the most common score
  - Not necessarily unique
  - Does not necessarily exist
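The three measures of center can be checked with Python's standard `statistics` module. The sample `data` is hypothetical.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical sample

mean = statistics.mean(data)        # sum of the values / number of values
median = statistics.median(data)    # even count: average of the two middle values
modes = statistics.multimode(data)  # a list, because the mode need not be unique

print(mean, median, modes)
```

Note that `multimode` returns every value tied for most common, so when no value repeats it returns all of them; that is the case the slide describes as the mode not existing.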

Finding Quartiles
- Sort the data
- Find the median
- The 1st quartile (25% mark) is the median of the smaller half of the data
- The 3rd quartile (75% mark) is the median of the larger half of the data

The Five Number Summary
- The minimum data point
- The 1st quartile
- The median
- The 3rd quartile
- The maximum data point
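A sketch of the five number summary using the median-of-halves rule for quartiles. One assumption is made here that the slides do not settle: with an odd number of values, the median itself is left out of both halves (some textbooks include it instead). The function name and data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumption: for an odd count, the median is excluded from both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

print(five_number_summary([1, 3, 4, 6, 7, 8, 10, 12, 40]))
```

Software packages often use other quartile conventions, so their results can differ slightly from this by-hand rule.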

Interquartile Range and Outliers
- Outliers are data points that do not fit the pattern of the distribution.
- The interquartile range (IQR) is the 3rd quartile minus the 1st quartile.
- An outlier is a point more than one and a half times the IQR below the 1st quartile, or more than one and a half times the IQR above the 3rd quartile.

Checking for Outliers
- Find the 5 number summary
- Calculate the interquartile range: IQR = 3rd quartile - 1st quartile
- Lower cutoff point = 1st quartile - 1.5(IQR)
- Upper cutoff point = 3rd quartile + 1.5(IQR)
- Check for data outside the cutoff points
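The outlier check can be sketched as follows. The function name `iqr_outliers` and the data are hypothetical, and the quartiles again assume the median-of-halves rule with the median excluded for an odd count.

```python
import statistics

def iqr_outliers(values):
    """Return (lower cutoff, upper cutoff, outliers) using the 1.5 * IQR rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    # Quartiles as medians of the lower and upper halves (an assumed convention).
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[half + 1:] if n % 2 else data[half:])
    iqr = q3 - q1
    lower_cutoff = q1 - 1.5 * iqr
    upper_cutoff = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_cutoff or x > upper_cutoff]
    return lower_cutoff, upper_cutoff, outliers

print(iqr_outliers([1, 3, 4, 6, 7, 8, 10, 12, 40]))
```

For this sample the quartiles are 3.5 and 11, so the IQR is 7.5 and only the value 40 falls outside the cutoff points.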

The Normal Model Density Curves and Normal Distributions

A Density Curve:
- Is always on or above the x axis
- Has an area of exactly 1 between the curve and the x axis
- Describes the overall pattern of a distribution
- The area under the curve above any range of values is the proportion of all the observations that fall in that range

Mean vs Median
- The median of a density curve is the equal-area point: the point that divides the area under the curve in half
- The mean of a density curve is the center of mass: the point where the curve would balance if it were made of solid material

Normal Curves
- Bell-shaped, symmetric, single-peaked
- Mean = µ
- Standard deviation = σ
- Notation: N(µ, σ)
- The inflection points of the curve lie one standard deviation on either side of µ

68-95-99.7 Rule
- About 68% of the data in a normal curve is within one standard deviation of the mean
- About 95% of the data in a normal curve is within two standard deviations of the mean
- About 99.7% of the data in a normal curve is within three standard deviations of the mean
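These percentages can be checked numerically with `statistics.NormalDist` from the Python standard library. The dictionary name `within` is just an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The computed areas round to 68%, 95% and 99.7%, which is why the rule is stated as an approximation.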

Why Are Normal Distributions Important?
- Good descriptions for many distributions of real data
- Good approximations to the results of many chance outcomes
- Many statistical inference procedures based on normal distributions work well for other roughly symmetric distributions

Standard Normal Curve

Standardizing (z-score)
- If x is from a normal population with mean µ and standard deviation σ, then the standardized value z is the number of standard deviations x is from the mean
- z = (x - µ)/σ
- The unit of z is standard deviations

Standard Normal Distribution
- A normal distribution with µ = 0 and σ = 1, N(0,1), is called a standard normal distribution
- Z-scores from a normal population are standard normal, where z = (x - µ)/σ
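A one-line sketch of standardizing. The function name `z_score` and the numbers in the example (mean 100, standard deviation 15) are hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical example: a value of 130 in a population with mean 100 and SD 15.
z = z_score(130, 100, 15)
print(z)
```

A positive z means the value is above the mean and a negative z means it is below.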

Standard Normal Tables
- Table B (pg 552) in your book gives the percent of the data to the left of the z value.
- Or, in your Standard Normal table, find the first 2 digits of the z value in the left column, move over to the column of the third digit, and read off the area.
- To find the cutoff point given the area, find the closest value to the area ‘inside’ the chart. The row gives the first 2 digits and the column gives the last digit.

Solving a Normal Proportion
- State the problem in terms of a variable (say x) in the context of the problem
- Draw a picture and locate the required area
- Standardize the variable using z = (x - µ)/σ
- Use the calculator/table and the fact that the total area under the curve = 1 to find the desired area
- Answer the question
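The steps above can be sketched in code, with `statistics.NormalDist` standing in for the calculator or table. The mean, standard deviation and value of x are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 64.0, 2.5  # hypothetical population mean and standard deviation
x = 68.0               # hypothetical value of interest

# Standardize, then look up the area to the left of z.
z = (x - mu) / sigma
below = NormalDist().cdf(z)

# The total area is 1, so the area to the right is the complement.
above = 1 - below

print(f"z = {z:.2f}, below = {below:.4f}, above = {above:.4f}")
```

Here z works out to 1.6, and the area to the left matches the standard normal table entry for 1.60.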

Finding a Cutoff Given the Area
- State the problem in terms of a variable (say x) and area
- Draw a picture and shade the area
- Use the table to find the z value with the desired area
- Go z standard deviations from the mean in the correct direction
- Answer the question
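Going the other way can be sketched with `NormalDist.inv_cdf`, which plays the role of reading the table ‘inside out’. The mean, standard deviation and target area are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 64.0, 2.5  # hypothetical population mean and standard deviation
area_to_left = 0.90    # hypothetical: find the 90th percentile

# Find the z value with the desired area to its left.
z = NormalDist().inv_cdf(area_to_left)

# Go z standard deviations from the mean: x = mu + z * sigma.
cutoff = mu + z * sigma

print(f"z = {z:.4f}, cutoff = {cutoff:.2f}")
```

If the given area is to the right of the cutoff, use 1 minus that area as `area_to_left` first, which is the "correct direction" step in the list above.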

Assessing Normality
- In order to use the previous techniques, the population must be normal
- To assess normality:
  - Construct a stem plot or histogram and see if the distribution is unimodal and roughly symmetric around the mean