MDS & Cluster Analysis Project Jason Niemann Taylor Evansen.

Slides:



Advertisements
Similar presentations
Very simple to create with each dot representing a data value. Best for non continuous data but can be made for and quantitative data 2004 US Womens Soccer.
Advertisements

Chapter 3 Examining Relationships
Chapter 3: Describing Relationships
ADULT MALE % BODY FAT. Background  This data was taken to see if there are any variables that impact the % Body Fat in males  Height (inches)  Waist.
Scatterplots, Association, and Correlation 60 min.
The Normal Distribution. Distribution – any collection of scores, from either a sample or population Can be displayed in any form, but is usually represented.
The measure that one trait (or behavior) is related to another
Drunk Driving I'm against drunk driving because there are many accidents each year that r caused by drinking and driving.
AP Statistics Chapters 0 & 1 Review. Variables fall into two main categories: A categorical, or qualitative, variable places an individual into one of.
What is statistics? STATISTICS BOOT CAMP Study of the collection, organization, analysis, and interpretation of data Help us see what the unaided eye misses.
State mean SAT mathematics scores. took the SAT Reasoning test.
3.3 Density Curves and Normal Distributions
Correlation Value, 4 Section 4.1. The Correlation Value, r The correlation value is a mathematical way of measuring how good a fit is. 1) Find the average.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Looking at data: relationships Scatterplots IPS chapter 2.1 © 2006 W. H. Freeman and Company.
CHAPTER 38 Scatter Graphs. Correlation To see if there is a relationship between two sets of data we plot a SCATTER GRAPH. If there is some sort of relationship.
Show correlation with a line of best fit Example: Tracking patterns in human diseases The US Center for Disease Control (CDC) keeps track of reported cases.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 3 Describing Relationships 3.1 Scatterplots.
Section 9.3 ~ Hypothesis Tests for Population Proportions Introduction to Probability and Statistics Ms. Young.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Objectives 2.1Scatterplots  Scatterplots  Explanatory and response variables  Interpreting scatterplots  Outliers Adapted from authors’ slides © 2012.
11/23/2015Slide 1 Using a combination of tables and plots from SPSS plus spreadsheets from Excel, we will show the linkage between correlation and linear.
2. PENCIL 3. GRAPHING CALCULATOR 1.NOTEBOOK AP Stats ‘Must Haves’
3.2 - Least- Squares Regression. Where else have we seen “residuals?” Sx = data point - mean (observed - predicted) z-scores = observed - expected * note.
April 1 st, Bellringer-April 1 st, 2015 Video Link Worksheet Link
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 3 Section 4 – Slide 1 of 23 Chapter 3 Section 4 Measures of Position.
Multivariate Analysis and Data Reduction. Multivariate Analysis Multivariate analysis tries to find patterns and relationships among multiple dependent.
Relationships Scatterplots and Correlation.  Explanatory and response variables  Displaying relationships: scatterplots  Interpreting scatterplots.
Scatter Plots. Definitions  Scatter Plot: A graph in which the values of two variables are plotted along two axes, the pattern of the resulting points.
Andrew Lundborg Allison Chock Using the 2012 County Health Ranking Minnesota Data.
MDS and Cluster Project By: Elizabeth Pappenfus & Madison Anderson.
Z-scores, normal distribution, and more.  The bell curve is a symmetric curve, with the center of the graph being the high point, and the two sides on.
Chapter 12 Section 2 Why are consumer services distributed in a regular pattern?
Describing Data Week 1 The W’s (Where do the Numbers come from?) Who: Who was measured? By Whom: Who did the measuring What: What was measured? Where:
Stem-and-Leaf Plots …are a quick way to arrange a set of data and view its shape or distribution A key in the top corner shows how the numbers are split.
Central Tendency  Key Learnings: Statistics is a branch of mathematics that involves collecting, organizing, interpreting, and making predictions from.
Who Are We? {Coalition Blub Slide}. Who Are We? {Coalition Blub Slide}
Chapter 3: Describing Relationships
2013 Wisconsin Health Trends: Progress Report
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Warmup In a study to determine whether surgery or chemotherapy results in higher survival rates for a certain type of cancer, whether or not the patient.
Chapter 3: Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Chapter 3 Scatterplots and Correlation.
Chapter 3: Describing Relationships
CHAPTER 3 Describing Relationships
Chapter 3: Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
CHAPTER 3 Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
CHAPTER 3 Describing Relationships
Chapter 3: Describing Relationships
AP Stats Agenda Text book swap 2nd edition to 3rd Frappy – YAY
CHAPTER 3 Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
Chapter 3: Describing Relationships
CHAPTER 3 Describing Relationships
Obesity in Today’s Society
Chapter 3: Describing Relationships
Presentation transcript:

MDS & Cluster Analysis Project Jason Niemann Taylor Evansen

Data Variables (all z-scores): – Excessive Drinking – Fatal Car Crashes – Uninsured – Highschool Graduation – Preventable Hospital Stays – Single Parent Households – Violent Crime Rates – Percent Under 18

MDS graph analysis Dimension 1 seems to indicate wealth of different counties, and we can see this because the wealthier counties are going to have a lower uninsured rate (pushing these counties to the left of the graph) whereas the less wealthy counties would have a higher uninsured rate (pushing them to the right). Dimension 2 correlates strongly with our “preventable hospital stays” variable, where the z-score of Lac Qui Parle county and Stevens county were very high, with Ramsey and Hennepin counties having a very negative z-score. This can indicate the level more reckless/dangerous behaviors.

In these two dimensions, we can see a few distinct clusters forming. The top left corner is the suburban counties that are more wealthy and populated. The large center cluster is the “norm” of the state, basically showing the average health behaviors. Ramsey and Hennepin counties behave very similarly almost in a cluster of their own, maybe because of the diverse nature of their population (including both metropolitan and suburban areas).

Cluster Analysis We used the “complete” method to form our dendrogram. From our k-means analysis, 3 clusters was the ideal amount for our data so this is what we’ll use to interpret the dendrogram. The smallest cluster (above the red section) includes Ramsey, Hennepin, Beltrami, Mille Lacs, and Clearwater counties. This cluster seems to be the one where the diverse/unrelated counties get put, simply because they don’t fit anywhere else. Ramsey and Hennepin counties both have high violent crime rates, which makes sense due to their metropolitan areas.

The cluster to the very right (above the green section) seems to correspond with our MDS top right cluster. These counties are more rural, and are associated with more uninsured people and reckless behavior. If we look at the data we can see that these counties have more preventable hospital stays, more fatal car crashes, and things like this. The cluster to the very left (above the blue section) has many suburban counties, like our MDS top left cluster. These counties cluster together probably due to their low uninsured rates, low single parent z-scores, high high school graduation, low violent crime rates, etc. These counties have good health overall and demonstrate that through their clustering patterns.

These MDS and cluster analysis graphs showed us that we can divide the counties up into three basic clusters based on these health variables, and that there are observable patterns throughout the state.