Techniques for Decision-Making: Data Visualization Sam Affolter.


Basics of Visualization; Analytical Navigation; Analytical Interaction; Techniques & Practices; Analytical Patterns

We attend to contrasts from the norm.

Visualizations should display patterns that are easy to spot.

Memory plays an important role in cognition, but is very limited.

Pre-attentive Visual Attributes Stephen Few, Now You See It, pages

Working in groups to help one another out: Find two datasets for your project. Open Tableau and import the datasets. Join the two datasets as appropriate.
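For readers who want to see what the join step does outside Tableau, here is a minimal pure-Python sketch of an inner join on a shared key. The field names (`store_id`, `revenue`, `region`) and the rows are made up for illustration; they are not from the course datasets.

```python
# Hypothetical rows standing in for the two imported datasets.
sales = [
    {"store_id": 1, "revenue": 120.0},
    {"store_id": 2, "revenue": 80.0},
    {"store_id": 3, "revenue": 45.0},
]
stores = [
    {"store_id": 1, "region": "East"},
    {"store_id": 2, "region": "West"},
]


def inner_join(left, right, key):
    """Merge every pair of left/right rows that share the same value for `key`."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


# Store 3 has no match in `stores`, so an inner join drops it.
joined = inner_join(sales, stores, "store_id")
```

Whether unmatched rows should be dropped (inner join) or kept (left join) depends on the question being asked, which is why the exercise says to join "as appropriate."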

Directed analysis begins with a specific question that we hope to answer, searches for an answer to that question, then produces an answer. In cases of directed navigation, data visualization typically will not be used until we attempt to communicate the answer.

Exploratory analysis begins by looking without knowing what we'll find. Then we find something that seems interesting and ask a question. We then proceed in the directed fashion to find an answer. Data visualization, particularly in powerful DV tools like Tableau, is often the best way to work through exploratory navigation.

“Overview first, zoom and filter, then details-on-demand.” An overview reduces search and allows detection of patterns. Zooming and filtering cut away the inconsequential and focus on the relevant data space. Having detailed data at one's fingertips is essential to developing a full understanding of what is happening.
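The three stages of that mantra can be sketched as three small functions over the same rows. This is only an illustration in plain Python; the order records, field names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical order records (illustrative only).
orders = [
    {"id": 1, "category": "Furniture", "amount": 250.0},
    {"id": 2, "category": "Furniture", "amount": 90.0},
    {"id": 3, "category": "Office", "amount": 15.0},
    {"id": 4, "category": "Technology", "amount": 400.0},
    {"id": 5, "category": "Technology", "amount": 120.0},
]


def overview(rows):
    """Overview first: one total per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    return dict(totals)


def zoom_and_filter(rows, category):
    """Zoom and filter: keep only the rows for the category of interest."""
    return [row for row in rows if row["category"] == category]


def details(rows, order_id):
    """Details on demand: the full record behind a single mark."""
    return next(row for row in rows if row["id"] == order_id)


summary = overview(orders)
tech = zoom_and_filter(orders, "Technology")
detail = details(tech, 4)
```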

Hierarchical navigation is a useful method of navigation: start from high-level views of the data set, then progress to ever finer grains. Node-link diagrams (tree diagrams from the diagramming class) can be used to visualize the structure of hierarchical navigation. Tree maps are also often used to give the high-level view.

Using your new datasets in Tableau, create a Tree Map. Spend some time determining the quantitative measures that are appropriate for high-level study. Get into your groups and discuss.

“The beating heart of analysis” - Stephen Few. Specifically, we are looking for similarities or differences within data. We can compare magnitudes or patterns within larger data sets. Two problems are inherent in visual comparison: hidden data and obscured patterns.

Nominal: Comparing values that have no particular order.
Ranking: Comparing values that are arranged by magnitude.
Part-to-whole: Comparing values that make up parts of a whole.

Deviation: Comparing the differences between two sets of values.
Time-series: Comparing measures that were recorded at different points in time to see how they change.
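Several of these comparison types reduce to simple arithmetic on the underlying values. The sketch below computes a part-to-whole share, a deviation from a second set of values, and a period-over-period change; all numbers and names are invented for illustration.

```python
# Hypothetical values: actuals and targets per category, plus a monthly series.
actual = {"A": 50.0, "B": 30.0, "C": 20.0}
target = {"A": 40.0, "B": 35.0, "C": 25.0}
monthly = [100.0, 110.0, 105.0, 120.0]

# Part-to-whole: each category's share of the total.
total = sum(actual.values())
part_to_whole = {name: value / total for name, value in actual.items()}

# Deviation: the difference between two sets of values (actual minus target).
deviation = {name: actual[name] - target[name] for name in actual}

# Time-series: how the measure changes from one period to the next.
change = [later - earlier for earlier, later in zip(monthly, monthly[1:])]
```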

Sorting can quickly point to meaningful relationships in the underlying data. Sorting can be done on the quantitative data (to determine which category is the largest or smallest driver). It can also be done on the categories, which allows the user to easily find specific categorical elements.
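Both sort orders can be shown on the same small table. This is a sketch with made-up revenue figures, not data from the presentation.

```python
# Hypothetical revenue per country (illustrative numbers).
revenue = {"Canada": 40.0, "Austria": 15.0, "US": 95.0, "Brazil": 60.0}

# Sort on the quantitative value: the largest driver comes first, the smallest last.
by_value = sorted(revenue.items(), key=lambda item: item[1], reverse=True)

# Sort on the category: alphabetical order makes a specific country easy to find.
by_category = sorted(revenue.items())
```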

When exploring a data set, adding variables allows us to segment the data, which in turn can pinpoint interesting anomalies.

Viewing Revenue by Country may show us that the US generates the most revenue for our company. However, by pulling in Region, we find that one of our US Regions is the smallest revenue generator in the entire company.
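The same effect can be reproduced with a few rows of invented data: totals by country alone put the US on top, while adding region as a second grouping variable exposes a very small US region. The figures and region names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical revenue rows (illustrative numbers only).
rows = [
    {"country": "US", "region": "US-East", "revenue": 500.0},
    {"country": "US", "region": "US-West", "revenue": 450.0},
    {"country": "US", "region": "US-Central", "revenue": 20.0},
    {"country": "Canada", "region": "Canada-East", "revenue": 200.0},
    {"country": "Canada", "region": "Canada-West", "revenue": 150.0},
]


def total_by(data, *fields):
    """Sum revenue for each distinct combination of the given fields."""
    totals = defaultdict(float)
    for row in data:
        totals[tuple(row[f] for f in fields)] += row["revenue"]
    return dict(totals)


by_country = total_by(rows, "country")
by_region = total_by(rows, "country", "region")

top_country = max(by_country, key=by_country.get)
smallest_region = min(by_region, key=by_region.get)
```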

In contrast to adding variables, filtering allows the user to remove superfluous data. It is used as one proceeds deeper into the data set. The filtering process typically removes extra data elements, but can also be used to eliminate outliers to further understand the underlying trends.

At first glance, one might see positive trends across the board in the dataset to the left. Removing some of the additional lines quickly shows that Item G's revenue stream has been dropping steadily. Furthermore, Item F has made a “stair step” increase in revenue at approximately the same time. Are these related?
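A rough sketch of both uses of filtering follows: keeping only the series of interest, and dropping values outside a chosen range. The series values are invented (only the item names echo the slide), and the cutoffs passed to `without_outliers` are an analyst's choice rather than a rule from the presentation.

```python
# Hypothetical revenue series per item (illustrative numbers only).
series = {
    "Item E": [10, 11, 12, 13],
    "Item F": [10, 10, 15, 15],
    "Item G": [14, 12, 10, 8],
}


def keep_items(data, names):
    """Filter out every series except the named ones."""
    return {name: data[name] for name in names}


def without_outliers(values, low, high):
    """Drop values outside the inclusive range [low, high]."""
    return [v for v in values if low <= v <= high]


focus = keep_items(series, ["Item F", "Item G"])

# With the other lines removed, a steady decline is easy to check for.
item_g = focus["Item G"]
steadily_dropping = all(b < a for a, b in zip(item_g, item_g[1:]))

trimmed = without_outliers([5, 6, 7, 100], low=0, high=50)
```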

Most data analysis is done on data sets with elements that are grouped. Grouping helps to reduce volatility in thin data. Binning sales data by price bucket is an example of this. Minimally, we aggregate by time; however, aggregations across other categorical elements can be extremely powerful. All hierarchical data can be considered pre-aggregated.
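Binning by price bucket might look like the following sketch. The sales records and the bucket width of 10 are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical sales records as (unit_price, units_sold) pairs.
sales = [(4.99, 3), (7.50, 1), (12.00, 2), (18.25, 1), (23.00, 4), (9.99, 2)]


def price_bucket(price, width=10):
    """Label the fixed-width bucket a price falls into, e.g. 12.00 -> '10-20'."""
    low = int(price // width) * width
    return f"{low}-{low + width}"


# Aggregate units sold per price bucket.
totals = defaultdict(int)
for price, units in sales:
    totals[price_bucket(price)] += units
units_by_bucket = dict(totals)
```

Six thin price points collapse into three buckets, which is the volatility-reducing effect the slide describes.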

Re-expression is one of the most commonly used analytical techniques. It can substantially improve your data-ink ratio. Re-expression can be seen in changing numbers to percentages, looking at deltas between data sets, or building rates.
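Each of the three re-expressions named above is a one-line calculation. The sketch below uses invented figures and variable names.

```python
# Hypothetical inputs (illustrative numbers only).
units = {"North": 30, "South": 50, "West": 20}
revenue_last_year = 200.0
revenue_this_year = 250.0
customers_this_year = 50

# Numbers to percentages: each region's share of total units.
total_units = sum(units.values())
pct_of_total = {region: 100 * n / total_units for region, n in units.items()}

# Deltas between data sets: absolute and relative change between two years.
delta = revenue_this_year - revenue_last_year
pct_change = 100 * delta / revenue_last_year

# Rates: revenue per customer.
revenue_per_customer = revenue_this_year / customers_this_year
```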

Re-expression Example

From the top-level view that you have built in your Tree Map, use the analytical interaction techniques described to begin to dive into your data. Does comparing magnitudes bring questions to light? Sorting? Adding variables? Are there categorical or quantitative elements that you should be aggregating or re-expressing?

When using a bar graph, begin the scale at zero and end the scale above the highest value. With every type of graph other than a bar graph, begin the scale a little below the lowest value and end it a little above the highest value. Begin and end the scale at round numbers, and make the intervals round numbers as well.
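These scaling rules can be turned into a small helper. The sketch below is one possible reading, not a standard algorithm: it assumes "round" means a step of 1, 2, or 5 times a power of ten, aims for roughly five intervals, and expects positive values that are not all equal. The function names and sample values are hypothetical.

```python
import math


def round_step(span):
    """Pick a round step (1, 2, 5, or 10 times a power of ten) for about five intervals."""
    if span <= 0:
        raise ValueError("span must be positive")
    raw = span / 5
    magnitude = 10 ** math.floor(math.log10(raw))
    for multiple in (1, 2, 5, 10):
        if raw <= multiple * magnitude:
            return multiple * magnitude


def axis_limits(values, bar_graph):
    """Return (lower, upper, step) for a quantitative axis following the rules above."""
    lo, hi = min(values), max(values)
    step = round_step(hi if bar_graph else hi - lo)
    # Bar graphs start at zero; other graphs start at a round number below the minimum.
    lower = 0 if bar_graph else math.floor(lo / step) * step
    upper = math.ceil(hi / step) * step
    if upper == hi:
        upper += step  # end above the highest value, not on it
    if not bar_graph and lower == lo:
        lower -= step  # begin below the lowest value, not on it
    return lower, upper, step


# Hypothetical revenue values between $5 million and $6 million.
revenue = [5_200_000, 5_450_000, 5_800_000]
bar_axis = axis_limits(revenue, bar_graph=True)
line_axis = axis_limits(revenue, bar_graph=False)
```

With these inputs the bar axis runs from zero, while the line axis is free to zoom in on the range the data actually occupy.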

By scaling between $5 million and $6 million, the chart to the left is presenting what Few terms a “visual lie.” Compared with the chart below, it makes the variance look much larger than it really is. This is problematic in programs such as MS Excel, which auto-format the axis.

In contrast with bar charts, other charts are meant to compare data relative to itself. Line charts are a perfect example. The line chart to the left begins at 0, and we are unable to see the trend; the chart below displays the trend well. Moving from a bar chart to a line or other chart in Excel can be problematic when it comes to axes. If we forced the low value to be 0 to fit the bar chart, we may lose visual understanding as we switch.

Reference lines are used to compare the data set with another metric. These help make outliers obvious. Reference lines can be developed from the data themselves (averages, standard deviations, etc.). They can also be driven externally (year-end growth goals, call time goals, etc.).

Take the above chart of monthly changes in revenue. One thing of possible interest is when we are above or below average. We may want to add warnings if the changes become too great. In this case, I added lower and upper control limits at 1.5 times the standard deviation.
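A sketch of how such reference lines could be computed follows. It assumes the limits sit at the mean plus or minus 1.5 standard deviations and uses the sample standard deviation; the slide does not say which variant was used, and the monthly changes here are invented.

```python
from statistics import mean, stdev

# Hypothetical month-over-month revenue changes (illustrative numbers only).
changes = [2.0, -1.0, 3.0, 0.0, 9.0, -4.0, 1.0, 2.0]

average = mean(changes)
spread = stdev(changes)  # sample standard deviation (an assumption)

# Control limits at 1.5 standard deviations either side of the average.
upper_limit = average + 1.5 * spread
lower_limit = average - 1.5 * spread

# Months whose change falls outside the limits, as (month_index, change) pairs.
flagged = [
    (month, change)
    for month, change in enumerate(changes)
    if change > upper_limit or change < lower_limit
]
```

In this made-up series only the 9.0 jump is flagged; the -4.0 drop stays just inside the lower limit.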

Trellis displays should have the following characteristics:
– Graphs differ only in terms of the data displayed. Each graph displays a subset of a single larger set, divided according to some categorical variable.
– Every graph is the same type, shape, and size, and shares the same categorical and quantitative scales.
– Graphs can be arranged horizontally, vertically, or both.
– Graphs are sequenced in a meaningful order, usually based on the values that are featured.
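The data preparation behind a trellis display can be sketched without any plotting library: split the rows by one categorical variable, compute scales once so every panel shares them, and order the panels by a featured value. The rows, names, and the choice to order panels by total are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical (region, quarter, revenue) rows.
rows = [
    ("East", "Q1", 10.0), ("East", "Q2", 14.0),
    ("West", "Q1", 30.0), ("West", "Q2", 26.0),
    ("South", "Q1", 5.0), ("South", "Q2", 9.0),
]

# One panel per region: each panel shows a subset of the single larger set.
panels = defaultdict(list)
for region, quarter, value in rows:
    panels[region].append((quarter, value))

# Shared scales, computed once from all rows and reused by every panel.
shared_categories = sorted({quarter for _, quarter, _ in rows})
shared_value_scale = (0, max(value for _, _, value in rows))

# Meaningful sequence: panels ordered by total revenue, largest first.
panel_order = sorted(
    panels,
    key=lambda region: sum(value for _, value in panels[region]),
    reverse=True,
)
```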

Trellis Chart Example

Exercise 4: Once again using your current data sets in Tableau, build a trellis chart to analyze additional variables. Get into your groups to discuss findings.