Exploring Data Chapter 1 Displaying distributions with graphs Describing distributions with Numbers.


Different types of graphs
Categorical Data:
◦ Use a bar graph
Quantitative Data:
◦ Dot plots
◦ Stem plots
◦ Histograms

Things to remember!!
Always, always, always plot your data!!
Don’t forget your SOCS:
◦ S – Shape
◦ O – Outliers
◦ C – Center
◦ S – Spread

Bar Graphs: used to plot categorical data. The distribution of a categorical variable lists the categories and gives either the count or the percent of individuals who fall in each category.
Example 1 The radio audience rating service Arbitron places the country’s 13,838 radio stations into categories that describe the kind of programs they broadcast. Here is the distribution of station formats.

Format                    Count of stations    Percent of stations
Adult Contemporary
Adult Standards
Contemporary hit
Country
News/Talk/Information
Oldies
Religious
Rock
Spanish language
Other format
Total

Dot Plots
◦ Use for quantitative data
◦ Use for small amounts of data

Example 2 The accompanying data on gender and birth weight (kg) of foals born to 15 thoroughbred mares appeared in the article “Suckling Behavior Does Not Measure Milk Intake in Horses” (Animal Behaviour (1999): ). Construct a dot plot of the birth weights by gender.
Gender: F M M F F M F F M F M F M F F
Weight:

Stemplots:
◦ Use with quantitative data
◦ Give a quick picture of the shape of a distribution
◦ Show symmetry, gaps, clusters, outliers
◦ Use for small data sets

The accompanying observations are maximum flow rates for 34 different shower heads evaluated in a Consumer Reports article (July 1990). Construct two stem plots (one without splitting and one with split stems) and describe the most prominent features of the displays.

Back-to-Back Stem Plot
Literacy rates in Islamic nations
Country         Female percent    Male percent
Algeria         60                78
Bangladesh      31                50
Egypt           46                68
Iran            71                85
Jordan          86                96
Kazakhstan      99                100
Lebanon         82                95
Libya           71                92
Malaysia        85                92
Morocco         38                68
Saudi Arabia    70                84
Syria           63                89
Tajikistan      99                100
Tunisia         63                83
Turkey          78                94
Uzbekistan      99                100
Yemen           29                70
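A back-to-back stem plot of these literacy rates could be sketched in Python roughly as follows. This is only an illustration, not part of the original slides: the function names `leaves_by_stem` and `back_to_back_stemplot` are hypothetical, and the layout (tens digits as stems, ones digits as leaves, female leaves on the left reading outward from the stem) is an assumed convention.

```python
from collections import defaultdict

# Literacy rates (percent) from the table above, in the same country order.
female = [60, 31, 46, 71, 86, 99, 82, 71, 85, 38, 70, 63, 99, 63, 78, 99, 29]
male = [78, 50, 68, 85, 96, 100, 95, 92, 92, 68, 84, 89, 100, 83, 94, 100, 70]


def leaves_by_stem(values):
    """Group the ones digits (leaves) of each value under its tens digit (stem)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    return groups


def back_to_back_stemplot(left, right):
    """Return the lines of a back-to-back stem plot with shared stems in the middle."""
    left_groups, right_groups = leaves_by_stem(left), leaves_by_stem(right)
    stems = range(min(left + right) // 10, max(left + right) // 10 + 1)
    width = max(len(left_groups[s]) for s in stems)
    lines = []
    for s in stems:
        # Left leaves are reversed so they increase moving away from the stem.
        left_leaves = "".join(str(d) for d in reversed(left_groups[s]))
        right_leaves = "".join(str(d) for d in right_groups[s])
        lines.append(f"{left_leaves:>{width}} | {s:>2} | {right_leaves}")
    return lines


plot_lines = back_to_back_stemplot(female, male)
print("\n".join(plot_lines))
```

Sharing one column of stems makes it easy to compare the two distributions: the male leaves pile up on the higher stems, while the female leaves spread further down.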

Virginia Colleges               Tuition and fees ($)
Averett                         18430
Bluefield                       10615
Christendom                     14420
Christopher Newport             12626
DeVry                           12710
Eastern Mennonite               18220
Emory and Henry                 16690
Ferrum                          16870
George Mason                    15816
Hampton                         14996
Hampden-Sydney                  22944
Hollins                         21675
Liberty                         13150
Longwood                        12901
Lynchburg                       22885
Mary Baldwin                    19991
Marymount                       17090
Norfolk State                   14837
Old Dominion                    14688
Patrick Henry                   14645
Randolph-Macon                  22625
Randolph-Macon Women’s
Richmond                        34850
Roanoke                         22109
Saint Paul’s                    9420
Shenandoah                      19240
Sweet Briar                     21080
University of Virginia          22831
University of Virginia-Wise     14152
Virginia Commonwealth           17262
Virginia Intermont              15200
Virginia Military Institute     19991
Virginia State                  11462
Virginia Tech                   16530
Virginia Union                  12260
Washington and Lee              25760
William and Mary                21796

Histograms
◦ Used for large sets of data
◦ Break the range of values of a variable into classes and display only the count or percent of the observations that fall into each class
To make a histogram:
1. Divide the range of data into equal-width classes
2. Count the observations in each class (the ’frequency’)
3. Draw bars to represent the classes; height = frequency
◦ Bars should touch (unlike bar graphs)
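The class-counting steps can be sketched in Python as below, using the Virginia tuition figures that appear earlier in the slides (the college with no listed value is left out). The function name `histogram_counts` and the choice of classes starting at $5,000 with a width of $5,000 are assumptions made for illustration, not taken from the slides.

```python
def histogram_counts(values, start, width):
    """Count how many values fall in each equal-width class [lower, lower + width)."""
    if min(values) < start:
        raise ValueError("start must not exceed the smallest value")
    n_classes = (max(values) - start) // width + 1
    counts = [0] * n_classes
    for v in values:
        counts[(v - start) // width] += 1
    return counts


# Tuition and fees ($) for the 36 Virginia colleges with a listed value.
tuition = [
    18430, 10615, 14420, 12626, 12710, 18220, 16690, 16870, 15816, 14996,
    22944, 21675, 13150, 12901, 22885, 19991, 17090, 14837, 14688, 14645,
    22625, 34850, 22109, 9420, 19240, 21080, 22831, 14152, 17262, 15200,
    19991, 11462, 16530, 12260, 25760, 21796,
]

start, width = 5000, 5000  # assumed class boundaries, chosen for illustration
counts = histogram_counts(tuition, start, width)

# Print a text histogram: one row per class, one star per observation.
for i, count in enumerate(counts):
    lower = start + i * width
    print(f"{lower:>6}-{lower + width - 1:<6} {'*' * count} ({count})")
```

Changing `width` changes the appearance of the histogram, so it is worth trying more than one class width before describing the shape.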

You have probably heard that the distribution of scores on IQ tests follows a bell-shaped pattern. Let’s look at some actual IQ scores. Here are the scores of 60 fifth-grade students chosen at random from one school.

Distributions
◦ Look for the overall pattern and for striking deviations from that pattern.
◦ Describe the overall pattern by its shape, center, spread, and outliers.
◦ Outlier: an individual value that falls outside the overall pattern.

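SHAPE
◦ Does the distribution have one major peak (unimodal) or more than one?
◦ Is the distribution approximately symmetric, or is it skewed in one direction?
◦ Symmetric: the left and right sides are approximately mirror images
◦ Skewed right: the right tail (larger values) is much longer than the left
◦ Skewed left: the left tail (smaller values) is much longer than the right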

Outliers
Look for points that are clearly apart from the body of the data, not just the most extreme observations in a distribution. We will discuss a test used to identify outliers in the next section. You should look for an explanation for any outlier; sometimes it is an error in recording the data. It is not a good idea to just delete or ignore outliers.

Relative Frequency
Histograms do a good job of displaying the distribution of values of a quantitative variable, but they tell us little about where an individual observation stands within that distribution. To get that information, construct a relative cumulative frequency graph. Let’s look at the U.S. presidents example…

President        Age    President       Age    President         Age
Washington       57     Lincoln         52     Hoover            54
J. Adams         61     A. Johnson      56     F.D. Roosevelt    51
Jefferson        57     Grant           46     Truman            60
Madison          57     Hayes           54     Eisenhower        61
Monroe           58     Garfield        49     Kennedy           43
J.Q. Adams       57     Arthur          51     L.B. Johnson      55
Jackson          61     Cleveland       47     Nixon             56
Van Buren        54     B. Harrison     55     Ford              61
W.H. Harrison    68     Cleveland       55     Carter            52
Tyler            51     McKinley        54     Reagan            69
Polk             49     T. Roosevelt    42     G.H.W. Bush       64
Taylor           64     Taft            51     Clinton           46
Fillmore         50     Wilson          56     G.W. Bush         54
Pierce           48     Harding         55
Buchanan         65     Coolidge        51

1. Decide on class intervals and make a frequency table. Then add three columns: relative frequency, cumulative frequency, and relative cumulative frequency.
Class    Frequency    Relative frequency    Cumulative frequency    Relative cumulative frequency
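As a rough sketch of this step, the table for the presidents' ages could be built in Python as follows. The class intervals 40-44, 45-49, ..., 65-69 and the function name `frequency_table` are assumptions for illustration; the slides do not say which intervals were used.

```python
# Ages of the 43 presidents listed in the table above.
ages = [
    57, 61, 57, 57, 58, 57, 61, 54, 68, 51, 49, 64, 50, 48, 65,
    52, 56, 46, 54, 49, 51, 47, 55, 55, 54, 42, 51, 56, 55, 51,
    54, 51, 60, 61, 43, 55, 56, 61, 52, 69, 64, 46, 54,
]


def frequency_table(values, start, width):
    """Build rows of class, frequency, relative, cumulative, relative cumulative."""
    n_classes = (max(values) - start) // width + 1
    freq = [0] * n_classes
    for v in values:
        freq[(v - start) // width] += 1

    total = len(values)
    rows, running = [], 0
    for i, f in enumerate(freq):
        running += f  # cumulative frequency up to and including this class
        lower = start + i * width
        rows.append({
            "class": f"{lower}-{lower + width - 1}",
            "frequency": f,
            "relative": f / total,
            "cumulative": running,
            "relative_cumulative": running / total,
        })
    return rows


# Assumed class intervals: 40-44, 45-49, ..., 65-69.
table = frequency_table(ages, start=40, width=5)
for row in table:
    print(f"{row['class']:<7} {row['frequency']:>3} {row['relative']:>7.1%} "
          f"{row['cumulative']:>3} {row['relative_cumulative']:>7.1%}")
```

Plotting the relative cumulative frequency against the upper end of each class gives the relative cumulative frequency graph, from which the standing of an individual age can be read off.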

Describing Distributions with Numbers

Two-seater Cars
Model                     City    Highway
Acura NSX                 17      24
Audi TT Roadster          20      28
BMW Z4 Roadster           20      28
Cadillac XLR              17      25
Chevrolet Corvette        18      25
Dodge Viper               12      20
Ferrari 360 Modena        11      16
Ferrari Maranello         10      16
Ford Thunderbird          17      23
Honda Insight             60      66
Lamborghini Gallardo      9       15
Lamborghini Murcielago    9       13
Lotus Esprit              15      22
Maserati Spyder           12      17
Mazda Miata               22      28
Mercedes-Benz SL
Mercedes-Benz SL
Nissan 350Z               20      26
Porsche Boxster           20      29
Porsche Carrera
Toyota MR2                26      32
Minicompact Cars
Model                     City    Highway
Aston Martin Vanquish     12      19
Audi TT Coupe             21      29
BMW 325 CI                19      27
BMW 330 CI                19      28
BMW M3                    16      23
Jaguar XK8                18      26
Jaguar XKR                16      23
Lexus SC
Mini Cooper               25      32
Mitsubishi Eclipse        23      31
Mitsubishi Spyder         20      29
Porsche Cabriolet         18      26
Porsche Turbo

Construct a stem plot: this will help you describe the shape! To interpret measures of center and spread, you need to think about the shape of the distribution.

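Mean and Median
◦ Mean: the average value (the sum of the observations divided by the number of observations)
◦ Median: the middle value when the observations are arranged in order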

Looking at the data, are there any outliers? What happens to the mean if we remove the outlier? One weakness of the mean as a measure of center is that it is not resistant to outliers. The median is resistant to outliers.
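To see the effect of an outlier on each measure, here is a small Python sketch using the highway mileage of the two-seater cars. Only the models with a listed value are included, and treating the Honda Insight as the outlier is an assumption based on how far its value sits from the rest. The helper name `center_summary` is hypothetical.

```python
from statistics import mean, median

# Highway mileage for the two-seater cars that have a listed value.
highway = {
    "Acura NSX": 24, "Audi TT Roadster": 28, "BMW Z4 Roadster": 28,
    "Cadillac XLR": 25, "Chevrolet Corvette": 25, "Dodge Viper": 20,
    "Ferrari 360 Modena": 16, "Ferrari Maranello": 16, "Ford Thunderbird": 23,
    "Honda Insight": 66, "Lamborghini Gallardo": 15, "Lamborghini Murcielago": 13,
    "Lotus Esprit": 22, "Maserati Spyder": 17, "Mazda Miata": 28,
    "Nissan 350Z": 26, "Porsche Boxster": 29, "Toyota MR2": 32,
}


def center_summary(values):
    """Return the mean and median of a collection of numbers."""
    values = list(values)
    return {"mean": mean(values), "median": median(values)}


with_outlier = center_summary(highway.values())
# Assumed outlier: the Honda Insight, whose mileage is far above the others.
without_outlier = center_summary(
    v for model, v in highway.items() if model != "Honda Insight"
)

print(f"With outlier:    mean {with_outlier['mean']:.1f}, median {with_outlier['median']}")
print(f"Without outlier: mean {without_outlier['mean']:.1f}, median {without_outlier['median']}")
```

With these 18 values, removing the Insight lowers the mean from about 25.2 to about 22.8, while the median only moves from 24.5 to 24.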

Mean versus Median
The mean and the median are the two most common measures of center. The mean and median of a symmetric distribution are close together. In a skewed distribution, the mean is farther out in the ‘tail’ than the median is.