Jessica M. Orth Department of Statistics and Actuarial Science University of Iowa Dynamic Graphics: An Interactive Analysis Of What Attaches People To.

Slides:



Advertisements
Similar presentations
Computing in Archaeology Session 12. Multivariate statistics © Richard Haddlesey
Advertisements

The Robert Gordon University School of Engineering Dr. Mohamed Amish
Lesson 10: Linear Regression and Correlation
Clustering: Introduction Adriano Joaquim de O Cruz ©2002 NCE/UFRJ
1 Incorporating Statistical Process Control and Statistical Quality Control Techniques into a Quality Assurance Program Robyn Sirkis U.S. Census Bureau.
Copyright © 2014 by The University of Kansas Collecting and Analyzing Data.
Mapping Nominal Values to Numbers for Effective Visualization Presented by Matthew O. Ward Geraldine Rosario, Elke Rundensteiner, David Brown, Matthew.
Chapter 17 Overview of Multivariate Analysis Methods
Chapter 9 Business Intelligence Systems
LISA Short Course Series Multivariate Analysis in R Liang (Sally) Shan March 3, 2015 LISA: Multivariate Analysis in RMar. 3, 2015.
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 14 Using Multivariate Design and Analysis.
Data and Interpretation What have you learnt?. The delver into nature’s aims Seeks freedom and perfection; Let calculation sift his claims With faith.
COMP 578 Data Warehousing And OLAP Technology Keith C.C. Chan Department of Computing The Hong Kong Polytechnic University.
Using School Climate Surveys to Categorize Schools and Examine Relationships with School Achievement Christine DiStefano, Diane M. Monrad, R.J. May, Patricia.
1 The New York State Education Department New York State’s Student Reporting and Accountability System.
Chapter 2 Data Patterns and Choice of Forecasting Techniques
Microarray Gene Expression Data Analysis A.Venkatesh CBBL Functional Genomics Chapter: 07.
Introduction to GREAT for ELs Office of Student Assessment Wisconsin Department of Public Instruction (608)
Factor Analysis Psy 524 Ainsworth.
Data Mining Techniques
The Tutorial of Principal Component Analysis, Hierarchical Clustering, and Multidimensional Scaling Wenshan Wang.
Census A survey to collect data on the entire population.   Data The facts and figures collected, analyzed, and summarized for presentation and.
Learning Objectives Copyright © 2002 South-Western/Thomson Learning Multivariate Data Analysis CHAPTER seventeen.
Introduction Multilevel Analysis
1 Multivariate Analysis (Source: W.G Zikmund, B.J Babin, J.C Carr and M. Griffin, Business Research Methods, 8th Edition, U.S, South-Western Cengage Learning,
CHAPTER 14 MULTIPLE REGRESSION
Chapter Fourteen Statistical Analysis Procedures Statistical procedures that simultaneously analyze multiple measurements on each individual or.
Analyzing and Interpreting Quantitative Data
1 LECTURE 6 Process Measurement Business Process Improvement 2010.
Determinants of Credit Default Swap Spread: Evidence from the Japanese Credit Derivative Market.
Clustering II. 2 Finite Mixtures Model data using a mixture of distributions –Each distribution represents one cluster –Each distribution gives probabilities.
Introduction to GREAT for ELs Office of Student Assessment Wisconsin Department of Public Instruction (608)
Indian and Northern Affaires indiennes Affairs Canada et du Nord Canada First Nation and Inuit Community Well-Being : Describing Historical Trends ( )
Available at Chapter 13 Multivariate Analysis BCB 702: Biostatistics
The Statistical Analysis of Data. Outline I. Types of Data A. Qualitative B. Quantitative C. Independent vs Dependent variables II. Descriptive Statistics.
Chapter 14 – Cluster Analysis © Galit Shmueli and Peter Bruce 2010 Data Mining for Business Intelligence Shmueli, Patel & Bruce.
American Community Survey (ACS) Product Types: Tables and Maps Samples Revised
Dimension Reduction in Workers Compensation CAS predictive Modeling Seminar Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc.
Principal Components Analysis. Principal Components Analysis (PCA) A multivariate technique with the central aim of reducing the dimensionality of a multivariate.
Lecture 07: Dealing with Big Data
THE MEASUREMENT OF URBAN LAND CONSUMPTION AS A SOURCE OF INDICATORS OF ECONOMIC PERFORMANCE AND SUSTAINABILITY Rodrigo Bastías Castillo
Chapter 6: Analyzing and Interpreting Quantitative Data
Climate Change in the Mind of a College Student A Cross-Sectional Study on Climate Change Perceptions at the University of Oklahoma Benjamin Ignac, Aparna.
Multivariate Analysis and Data Reduction. Multivariate Analysis Multivariate analysis tries to find patterns and relationships among multiple dependent.
Module III Multivariate Analysis Techniques- Framework, Factor Analysis, Cluster Analysis and Conjoint Analysis Research Report.
1-1 Copyright © 2014, 2011, and 2008 Pearson Education, Inc.
Linear Regression and Correlation Chapter GOALS 1. Understand and interpret the terms dependent and independent variable. 2. Calculate and interpret.
Effectiveness of Selected Supplemental Reading Comprehension Interventions: Impacts on a First Cohort of Fifth-Grade Students June 8, 2009 IES Annual Research.
Principal Component Analysis
How to Construct a Seasonal Index. Methods of Constructing a Seasonal Index  There are several ways to construct a seasonal index. The simplest is to.
3/13/2016 Data Mining 1 Lecture 2-1 Data Exploration: Understanding Data Phayung Meesad, Ph.D. King Mongkut’s University of Technology North Bangkok (KMUTNB)
©The McGraw-Hill Companies, Inc. 2008McGraw-Hill/Irwin Linear Regression and Correlation Chapter 13.
Chapter Seventeen Copyright © 2004 John Wiley & Sons, Inc. Multivariate Data Analysis.
Multivariate statistical methods. Multivariate methods multivariate dataset – group of n objects, m variables (as a rule n>m, if possible). confirmation.
Methods of multivariate analysis Ing. Jozef Palkovič, PhD.
UMass Dartmouth College of Arts & Sciences umassd.edu/cas Introduction and Purpose Methods Methods - ContinuedFindings - Continued Conclusions Acknowledgement.
Unsupervised Learning
Unsupervised Learning
Chapter 15 – Cluster Analysis
Analyzing data: First steps
Quality Control at a Local Brewery
Clustering and Multidimensional Scaling
State Superintendent’s Advisory Committee on 4-Year-Old Kindergarten May 15, 2017 FIND PICTURE!
Principal Components Analysis
Multidimensional Space,
Results for Research Question 1 Results for Research Question 2
Linear Regression and Correlation
Unsupervised Learning
Unsupervised Learning
LAUNCHING THE 2019 REGIONAL COMPETITIVENESS INDEX RCI 2019
Presentation transcript:

Jessica M. Orth Department of Statistics and Actuarial Science University of Iowa Dynamic Graphics: An Interactive Analysis Of What Attaches People To Their Communities JSM 2013 Montréal, Canada Data Expo: 2013 ‘Soul of the Community’ Knight Foundation and Gallop Three years: ,000 people surveyed in 26 communities across America The Data Attachment, Loyalty, Passion, Basic Services, Leadership, Education, Safety, Aesthetics, Economy, Social Offerings, Community Offerings, Civic Involvement, Openness, Social Capital, Community Domains Index Variables Means, standard deviations, proportion of high index variables, z- scores Summary Statistics Principal Component Analysis Multidimensional Scaling Data Reduction and Data Mining Interactive Motion Charts Average Hierarchal Cluster Analysis Graphical Displays Displaying multivariate data can be achieved in many ways through a variety of tools. Here we aim to emphasize the use of motion charts for displaying the trend analysis of time-dependent Principal Component Analysis and Multidimensional Scaling. It is well known that these methods are used as data reduction and data mining techniques in the analysis of multivariate data, but what happens when we introduce a time variable to these results? As will be seen, motion charts provide the tool to seamlessly merge these results throughout time and allow for dynamic and interactive interpretations of what attaches people to their communities. We analyze the index variables from the ‘Soul of the Community’ survey conducted by the Knight Foundation and Gallop by looking at four different summary statistics: means, standard deviations, the proportion of high index variables, and z-scores. Means, standard deviations, and proportions are calculated for cities based on the index variables. The z-scores serve as an index themselves, providing information on each city’s score for the original index variables: negative z-scores imply a lower score for the index variable and positive z-scores indicate a higher score for that city, relative to the overall score of the original index variable. I. Approach II. Key Drivers and Relationships Between them (PCA) Figure 1: Means Figure 2: Standard Deviations Figure 3: Proportions It is often said that ‘Beauty is in the eye of the beholder’, so why not put the analysis in the hands of the user? One of the many beauties of motion charts is the capability to do just this. Why limit the results to a single graphical display? Motion charts allow for customizable analysis to suit the interests of multiple users. While social offerings, openness, and aesthetics are found to be the leading drivers of community attachment by the Knight Foundation, we look at the relationship between these and the other index variables using Principal Component Analysis. MeansStandard DeviationsProportions Dimension 1Overall drivers for attachment Personal Assurance vs. Overall drivers for attachment Percentage of Variation Explained 2008: : : : : : : : : 39 Dimension 2Economic Growth vs. Emotional Bond Personal Assurance and Pride vs. Economic Growth Emotional Bond vs. Economic Growth Percentage of Variation Explained 2008: : : : : : : : : 20 Dynamic DriversInvolvement, Economy, Domains Safety, Social Capital, Education, Basic Services Safety, Aesthetics, Social Capital, Leadership, Social Offering, Openness, Economy

III. Differences Between Communities JSM 2013 Montréal, Canada Data Expo: 2013 III.A. Multidimensional Scaling IV. Conclusions and Future ResearchReferences Markus Gesmann and Diego de Castillo. Using the Google Visualisation API with R. The R Journal, 3(2):40-44, December H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, Francois Husson, Julie Josse, Sebastien Le and Jeremy Mazet (2013). FactoMineR: Multivariate Exploratory Data Analysis and Data Mining with R. R package version "What Attaches People to Their Communities? | Knight Soul of the Community.” 03 June R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL The goal of Multidimensional Scaling is to provide a visual representation of the pattern of similarities and differences among the cities. We use the index variables to determine the relationships between the cities. Cities estimated to be very similar to each other in these characteristics are placed close to each other on the map, and those estimated to be very different from each other are placed far away from each other on the map. These motion charts provide many different ways one can interpret the clusters and dimensions of the Multidimensional Scaling. In each figure, we can see distinct clusters of cities. We can group them by region or urbanicity to search for patterns in the clusters. Dynamic cites, which are those cities that move from cluster to cluster throughout the years, are marked on the charts. Higher mean scores and proportion scores imply that the city scored higher across all index variables. A higher score in standard deviations implies that the responses for that city had more variation across the index variables, and higher z-scores indicate a higher city score relative to the original index variables. III.B. Hierarchical Cluster Analysis Figure 8 Another way we can observe the differences between the communities is to look at the results of average hierarchical cluster analysis. Figure 8 shows the dendrograms for each year, and the clusters of cities obtained by this method. Cutting each tree at 0.8, we can observe different numbers of clusters for each year, as well as different groupings of the cities throughout time. We have demonstrated the use of motion charts in displaying the results of time-dependent multivariate analysis. Dynamic and interactive interpretations can be achieved and customized based on the interest of the user. Future research in this area will be to repeat the analyses based on subsets of the data by the suggested clusters to further understand the relationships between the index variables and cities, and to better characterize what attaches people to their communities. Figure 4: Means Figure 7: Z-Scores Figure 5: Standard Deviations Figure 6: Proportions