Analyzing Data (C2-5 BVD) C2-4: Categorical and Quantitative Data.


* Categorical variables place an individual into a group or category.
* Organize the data into a frequency table or a relative frequency (percent) table.
* Graph the data in a bar graph or pie chart.
* To use a segmented bar graph or a pie chart, the data must add to 100% of a total: the categories must not overlap, and together they must make up a single whole.
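The frequency and relative frequency tables described above can be sketched in a few lines of Python. The eye-color data and the helper name `frequency_table` are made up for illustration; they are not part of the original slides.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, percent)} for a list of categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

# Hypothetical data: eye colors recorded for 10 students (invented for this sketch).
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

table = frequency_table(eye_colors)
for category, (count, percent) in sorted(table.items()):
    # One row per category: name, count, and percent of the total.
    print(f"{category:<6} {count:>3} {percent:>6.1f}%")
```

Because every student falls into exactly one category, the percents add to 100%, which is the condition needed before drawing a pie chart or segmented bar graph.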

* Two-way tables or contingency tables may be used to compare two categorical variables.
* A marginal distribution of a categorical variable is the distribution for the totals of that variable (in the margins of the table).
* A conditional distribution of a variable is the distribution for that variable for a specific value of the other variable.
* Side-by-side segmented bar graphs showing the conditional distributions of a variable can be used to look for an association between the variables. If there is no association (i.e., the variables are independent of each other), the segmented bar graphs or corresponding relative frequency distributions will be very similar.

* Titanic Data:
* 1st class: 197 survived, 122 died
* 2nd class: 94 survived, 167 died
* 3rd class: 151 survived, 476 died

* Variables: Ticket class, Survival
* Marginal distribution for survival: 442 survived, 765 died
* Conditional distribution for 1st class survival: 197 survived, 122 died, 319 total
* Possible graph: Three segmented bars, one for each class, divided into two colors showing relative survival/death rates
* Conclusion: The three bars do not look nearly identical and are not all like the marginal distribution for survival. The relative frequencies of survival and death are different at a level we believe to be statistically significant. There was an association between survival rate and ticket class.
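The marginal and conditional distributions for the Titanic counts can be checked with a short Python sketch. The counts come from the table above; the dictionary layout and the function names `marginal_survival` and `conditional_survival` are assumptions made for illustration.

```python
# Counts from the Titanic data above: ticket class -> (survived, died).
titanic = {
    "1st": (197, 122),
    "2nd": (94, 167),
    "3rd": (151, 476),
}

def marginal_survival(table):
    """Total survived and died across all ticket classes (the table margins)."""
    survived = sum(s for s, d in table.values())
    died = sum(d for s, d in table.values())
    return survived, died

def conditional_survival(table, ticket_class):
    """Proportions (survived, died) within one ticket class."""
    survived, died = table[ticket_class]
    total = survived + died
    return survived / total, died / total

survived, died = marginal_survival(titanic)
overall_rate = survived / (survived + died)
print(f"Overall: {survived} survived, {died} died ({overall_rate:.1%} survived)")

for ticket_class in titanic:
    p_survived, p_died = conditional_survival(titanic, ticket_class)
    print(f"{ticket_class} class: {p_survived:.1%} survived, {p_died:.1%} died")
```

The conditional survival rates come out to roughly 61.8% (1st class), 36.0% (2nd class), and 24.1% (3rd class), against about 36.6% overall. These are the proportions the three segmented bars would display.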

* Quantitative variables take numerical values for which taking an average would make sense. Most quantitative variables have units of measurement.
* Organize the data into a list.
* Graph the data using a dot plot, stem and leaf plot, or histogram.
* Describe the distribution using SOCS (Shape, Outliers/Unusual features, Center, Spread).

* Dot plot: Use a number line, label the axis, and give the graph a title.
* Stem and leaf plot: Stems are usually all but the final digit. The leaf is usually the final digit. You must include a key that shows what the numbers mean. Arrange leaves from least to greatest out from the stem. Do NOT leave out duplicates. You may make back-to-back plots for comparisons. If there are many data points in each stem, you can split the stems. Don't forget the title.
* Histogram: Divide the data into bins of equal width (like 0 to <5, 5 to <10, etc.). Draw a number line with the bin boundaries. Draw bars to the appropriate height to show the count in each bin. Label the axes. Leave no gaps between bars unless there is an empty bin. Choose the bin width to give a reasonable number of bins (around 5 or so).
* Review: How to make a histogram on the calculator
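As a rough illustration of the stem and leaf plot and histogram rules above, here is a minimal Python sketch. The test scores are hypothetical, the function names are invented for this example, and the stem and leaf helper assumes non-negative whole numbers.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all but the final digit) to its sorted leaves (final digits).

    Assumes non-negative whole numbers. Duplicates are kept, and empty stems
    between the smallest and largest stem are included so gaps stay visible.
    """
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    lo, hi = min(plot), max(plot)
    return {stem: plot.get(stem, []) for stem in range(lo, hi + 1)}

def histogram_counts(values, start, width, n_bins):
    """Count values in equal-width bins: start to <start+width, and so on."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical test scores (invented for this sketch).
scores = [48, 62, 67, 71, 71, 74, 78, 80, 83, 85, 85, 89]

print("Test scores")
for stem, leaves in stem_and_leaf(scores).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 7 | 1 means 71")

# Five bins of width 10: 40 to <50, 50 to <60, ..., 80 to <90.
print(histogram_counts(scores, start=40, width=10, n_bins=5))
```

With these scores the 50s stem and the 50 to <60 bin are both empty, which is the kind of gap worth mentioning when describing the distribution.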

* Describing Shape
* Is the graph roughly symmetric or is it skewed? Skewed left: long tail to the left. Skewed right: long tail to the right.
* Is the graph unimodal, bimodal, or multimodal? (Don't call a graph multimodal unless you really believe the multiple peaks are meaningful and not just random variation.) A constant (uniform) graph is flat: the bars are all the same height.
* Most graphs that are unimodal and symmetric are NOT normal. Don't say a distribution is normal unless it really is!

* Describing outliers/unusual points
* Are there gaps (empty bins)?
* Are there any values that are unusually far from the rest?

* Describing Center
* Which bin would contain the midpoint (median)?
* If the data are skewed right, the mean would be above the median. If skewed left, the mean would be below the median. If symmetric, the mean would be in about the same place as the median.
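The relationship between skew and the mean can be seen with small data sets. The three lists below are hypothetical, chosen only so the skew is obvious; the sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, one is much larger.
skewed_right = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
# Mirror the same values around 21 to get a long tail on the left instead.
skewed_left = [21 - x for x in skewed_right]
# Hypothetical symmetric data.
symmetric = [1, 2, 3, 4, 5, 6, 7]

for name, data in [("skewed right", skewed_right),
                   ("skewed left", skewed_left),
                   ("symmetric", symmetric)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

In the right-skewed list the single large value pulls the mean (4.7) above the median (3). In the mirrored list the mean (16.3) falls below the median (18), and in the symmetric list the two agree.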

* Describing Spread
* Range
* More descriptive measures come in C5.

* If asked to compare/contrast two variables, do NOT just state SOCS for both displays.
* You must COMPARE (tell how they are alike) and CONTRAST (tell how they are different) for each part of SOCS.
* Don't leave one out. If there's nothing interesting to say, then say that.

* Examine the 5 W's (and How) to see what you really know about the data.
* Who: NOT who gathered the data, but who the subjects are that the data are ABOUT. These might not be people or living things.
* What: What data were gathered, i.e., your variables. Categorical? Quantitative? Units?
* When: NOT when the study was published, but when the data occurred.
* Where: NOT where the study was published, but where the data occurred.
* Why: What is the question the research is trying to answer?
* How: How were the data found? Observational study? Experiment? Sampling? Simulation? What is the scope of inference? Are there concerns about the design?