MIS2502: Data Analytics Principles of Data Visualization


MIS2502: Data Analytics Principles of Data Visualization Acknowledgement: David Schuff

Data visualization can:
- provide a clear understanding of patterns in data
- detect hidden structures in data
- condense information

What makes a good chart? Wikipedia: Patriotic War of 1812 http://en.wikipedia.org/wiki/File:Patriotic_War_of_1812_ENG_map1.svg Video: Napoleonic Wars in 8 Minutes Another video

What makes a good chart?
Napoleon’s 1812 March by Charles Joseph Minard: perhaps the most famous data presentation…
Reprinted in Tufte (2009), p. 41.

What can you learn from this map? http://www.popvssoda.com/countystats/total-county.html

What makes a good chart?
This is from an academic conference paper. What are the problems with this chart?
The legend indicates how often they find relevant information (given their “following” behavior).
Source: Zhang et al. (2010), “A case study of micro-blogging in the enterprise: use, value, and related issues,” Proceedings of the 28th International Conference on Human Factors in Computing Systems.

Some basic principles (adapted from Tufte 2009)
1. The chart should tell a story
2. The chart should have graphical integrity
3. The chart should minimize graphical complexity
Tufte’s fundamental principle: Above all else show the data

Principle 1: The chart should tell a story
- Graphics should be clear on their own
- The depictions should enable meaningful comparison
- The chart should yield insight beyond the text
“If the statistics are boring, then you’ve got the wrong numbers.” (Tufte 2009)

Do these tell a story? http://www.evl.uic.edu/aej/491/week03.html http://flowingdata.com/2009/11/26/fox-news-makes-the-best-pie-chart-ever/

Telling a Story http://fivethirtyeight.com/features/the-three-types-of-dwayne-the-rock-johnson-movies/ http://economix.blogs.nytimes.com/2009/05/05/obesity-and-the-fastness-of-food/

Does it tell a good story? http://gizmodo.com/8-horrible-data-visualizations-that-make-no-sense-1228022038

Principle 2: The chart should have graphical integrity
Basically, it shouldn’t “lie” (mislead the reader).
Tufte’s “Lie Factor”:
$$\text{Lie Factor} = \frac{\text{size of effect shown in graphic}}{\text{size of effect in data}}$$
- Should be ~ 1
- > 1 = exaggerated effect
- < 1 = understated effect

Examples of the “lie factor”
$$LF = \frac{5.3/0.6}{27.5/18} \approx \frac{8.83}{1.53} \approx 5.77$$
$$LF = \frac{4280\%\ (\text{change in volume})}{454\%\ (\text{change in price})} \approx 9.4$$
Reprinted from Tufte (2009), p. 57 & p. 62
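
The two calculations above can be reproduced with a few lines of Python. This is a minimal sketch: the function names `lie_factor` and `percent_change` are illustrative and do not come from the slides. The slide measures the first example as a ratio of ratios and the second as a ratio of percent changes, so the sketch also shows what the first example gives when measured as percent change.

```python
def lie_factor(effect_in_graphic: float, effect_in_data: float) -> float:
    """Size of effect shown in the graphic divided by size of effect in the data."""
    if effect_in_data == 0:
        raise ValueError("effect in data must be non-zero")
    return effect_in_graphic / effect_in_data


def percent_change(first: float, second: float) -> float:
    """Relative change from first to second, in percent."""
    return (second - first) / first * 100


# Example 1, as on the slide: graphic 5.3 vs 0.6, data 27.5 vs 18 (plain ratios).
lf_ratio = lie_factor(5.3 / 0.6, 27.5 / 18)

# Example 2, as on the slide: 4280% change in volume vs 454% change in price.
lf_percent = lie_factor(4280, 454)

# Example 1 again, but with both effects measured as percent change.
lf_alt = lie_factor(percent_change(0.6, 5.3), percent_change(18, 27.5))

print(round(lf_ratio, 2), round(lf_percent, 1), round(lf_alt, 1))
```

Without rounding the intermediate values, the first example comes out at about 5.78 rather than 5.77. Measured as percent change it is about 14.8, so the way "size of effect" is defined matters and should be stated alongside the number.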

How about this bar chart? The original graphic from President Trump’s tweet. (Look at the y-axis) https://www.washingtonpost.com/graphics/politics/2016-election/trump-charts/

How is this deceptive? The original graphic from President Trump’s tweet. (Look at the y-axis) Does the scale match the numbers? https://www.washingtonpost.com/graphics/politics/2016-election/trump-charts/

Where would the real baseline end up? [Chart annotations: 43, 45] https://www.washingtonpost.com/graphics/politics/2016-election/trump-charts/
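
To see why a truncated baseline matters, the lie factor can be computed for two bars drawn from an axis that does not start at zero. The sketch below is hypothetical throughout: the values 43 and 45 and the axis start of 42 are illustrative assumptions, not figures read from the original chart, and `bar_lie_factor` is an invented helper name.

```python
def bar_lie_factor(first: float, second: float, axis_start: float = 0.0) -> float:
    """Lie factor for two bars whose heights are drawn from axis_start.

    Effect sizes are measured as relative change from the first bar to the second.
    """
    drawn_first = first - axis_start
    drawn_second = second - axis_start
    if drawn_first <= 0 or first == 0 or first == second:
        raise ValueError("need a visible, non-zero first bar and two different values")
    shown_in_graphic = (drawn_second - drawn_first) / drawn_first
    shown_in_data = (second - first) / first
    return shown_in_graphic / shown_in_data


# Hypothetical values: bars of 43 and 45 on an axis that starts at 42.
truncated = bar_lie_factor(43, 45, axis_start=42)

# The same bars on an axis that starts at zero.
honest = bar_lie_factor(43, 45, axis_start=0)

print(round(truncated, 1), round(honest, 1))
```

Under these assumptions the second bar is drawn three times as tall as the first although the values differ by less than 5%, which gives a lie factor of about 43. With a zero baseline the lie factor is 1.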

Exaggerated Effect (LF > 1) vs. Understated Effect (LF < 1) [Figure: paired bars labeled A and B] https://www.washingtonpost.com/graphics/politics/2016-election/trump-charts/

3D Pie Chart: which supplier is the largest? Source: Knaflic (2015). Storytelling with Data: A Data Visualization Guide for Business Professionals. Chapter 2.

3D Pie Chart: which supplier is the largest? Supplier B—which looks largest, at 31%—is actually smaller than Supplier A, at 34%! Source: Knaflic (2015). Storytelling with Data: A Data Visualization Guide for Business Professionals. Chapter 2.

What can be used instead?

Other tips to avoid “lying”
- Adjust for inflation
- Make sure the context is presented
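
Adjusting for inflation usually means restating each amount in the dollars of one base year with a price index. A minimal sketch follows, assuming a made-up index and made-up prices; `to_real` is an illustrative name, and none of the numbers come from the slides.

```python
def to_real(nominal: float, index_then: float, index_base: float) -> float:
    """Restate a nominal amount in base-period dollars using a price index."""
    if index_then <= 0:
        raise ValueError("price index must be positive")
    return nominal * index_base / index_then


# Hypothetical data: the index rises from 100 to 125 while a price goes from $2.00 to $2.40.
old_price, new_price = 2.00, 2.40
old_index, new_index = 100.0, 125.0

nominal_change = (new_price - old_price) / old_price * 100

old_price_in_new_dollars = to_real(old_price, old_index, new_index)
real_change = (new_price - old_price_in_new_dollars) / old_price_in_new_dollars * 100

print(round(nominal_change, 1), round(real_change, 1))
```

In this made-up case a chart of nominal prices would show a 20% rise, while the inflation-adjusted price actually fell by 4%.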

Present data in context
The original graphic from Fox News, Feb 2012. In reality…
Fox Chart Showed Gas Prices Were Consistently Rising. On February 20, Fox News displayed a graphic that used three random data points to purportedly show the national average cost of gasoline over a year: One was the national average gas price from the day the graphic aired, the other two were chosen from the previous week and the previous year. [From Fox News' America's Newsroom]
In Reality, Fox Cherry Picked Data To Hide Fact That Fluctuating Gas Prices Had Fallen From High Points. An accurate representation of gas prices over the 12-month period starting in February 2011 showed that gas prices in February 2012 -- the highest point on Fox's graphic -- were actually down from their high in April-May of 2011. [From AAA]
Source: http://mediamatters.org/research/2012/10/01/a-history-of-dishonest-fox-charts/190225

Principle 3: The chart should minimize graphical complexity
Generally, the simpler the better…
Key concepts:
- Sometimes a table is better
- Data-ink
- Chartjunk

“You know you’ve achieved perfection, not when you have nothing more to add, but when you have nothing to take away.” -- Antoine de Saint-Exupéry

When a table is better than a chart
For a few data points, a table can do just as well…
Salesperson   Total Sales
Peacock       $225,763.68
Leverling     $201,196.27
Davolio       $182,500.09
Fuller        $162,503.78
Callahan      $123,032.67
King          $116,962.99
Dodsworth      $75,048.04
Suyama         $72,527.63
Buchanan       $68,792.25
The table carries more information in less space and is more precise.
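
A small table like this one is easy to generate directly from the data. The sketch below uses the figures from the slide; the helper name `format_table` and the layout choices (left-aligned names, right-aligned amounts, sorted from largest to smallest) are assumptions made for illustration.

```python
sales = [
    ("Peacock", 225763.68), ("Leverling", 201196.27), ("Davolio", 182500.09),
    ("Fuller", 162503.78), ("Callahan", 123032.67), ("King", 116962.99),
    ("Dodsworth", 75048.04), ("Suyama", 72527.63), ("Buchanan", 68792.25),
]


def format_table(rows, headers=("Salesperson", "Total Sales")):
    """Return a plain-text table with names left-aligned and amounts right-aligned."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    amounts = [f"${value:,.2f}" for _, value in ordered]
    name_width = max(len(headers[0]), *(len(name) for name, _ in ordered))
    amount_width = max(len(headers[1]), *(len(text) for text in amounts))
    lines = [f"{headers[0]:<{name_width}}  {headers[1]:>{amount_width}}"]
    for (name, _), text in zip(ordered, amounts):
        lines.append(f"{name:<{name_width}}  {text:>{amount_width}}")
    return "\n".join(lines)


table = format_table(sales)
print(table)
```

Right-aligning the amounts lines up the decimal points, which makes the values easier to compare down the column.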

The Ultimate Table: The Box Score
- Large amount of information in a very small space
- So why does this work? It depends on the reader’s knowledge of the data
Examples: Baseball; NBA: http://www.nba.com/games/20160306/PHIMIA/gameinfo.html?ls=eref:google:1b:post

Data Ink
The amount of “ink” devoted to data in a chart.
Tufte’s Data-Ink ratio:
$$\text{Data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used in graphic}}$$
- Should be ~ 1
- < 1 = more non-data related ink in graphic
- = 1 implies all ink devoted to data
Tufte’s principle: Erase non-data ink whenever possible
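
The ratio can be illustrated with a rough tally of how much "ink" each chart element uses. This is only a sketch: the element names and ink amounts below are hypothetical (imagine pixel counts), and deciding which elements count as data ink is a judgment call that the code simply takes as input.

```python
def data_ink_ratio(ink_by_element: dict, data_elements: set) -> float:
    """Share of total ink spent on elements that encode data."""
    total_ink = sum(ink_by_element.values())
    if total_ink <= 0:
        raise ValueError("chart must use some ink")
    data_ink = sum(ink for name, ink in ink_by_element.items() if name in data_elements)
    return data_ink / total_ink


# Hypothetical ink amounts for a cluttered bar chart.
cluttered = {"bars": 1200, "gridlines": 600, "border": 200, "background shading": 1000}
data_elements = {"bars"}

before = data_ink_ratio(cluttered, data_elements)

# Erase the non-data ink and measure again.
cleaned = {name: ink for name, ink in cluttered.items() if name in data_elements}
after = data_ink_ratio(cleaned, data_elements)

print(round(before, 2), round(after, 2))
```

With these made-up amounts the ratio rises from 0.4 to 1.0 once the gridlines, border, and shading are removed.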

Being conscious of data ink Lower data-ink ratio (worse) Higher data-ink ratio (better)

What makes a good chart? Sometimes it’s really a matter of preference. These both minimize non-data ink. Why isn’t a table better here?

3-D Charts Evaluate this from a data-ink perspective. How does it affect the clarity of the chart?

One of the golden rules of data visualization is… never use 3D!
- Data integrity / lie factor: 3D skews numbers, making them difficult to interpret or compare
- Graphical complexity: adding 3D to graphs introduces unnecessary chart elements like side and floor panels
Source: Knaflic (2015). Storytelling with Data: A Data Visualization Guide for Business Professionals. Chapter 2.

Chartjunk: Data Ink “gone wild”
- Unnecessary visual clutter that doesn’t provide additional insight
- Distraction from the story the chart is supposed to convey
- When the data-ink ratio is low, chartjunk is likely to be high

Example: Moiré effects (Tufte 2009)
- Creates illusion of movement
- Stands out, in a bad way

Example: The Grid Why are these examples of chartjunk? What could you do to remedy it?

Data Ink Working For Us Evaluate this chart in terms of Data Ink. Imagine this as a bar chart. As a table!!

Review: Data principles (adapted from Tufte 2009)
1. The chart should tell a story
2. The chart should have graphical integrity
3. The chart should minimize graphical complexity
Tufte’s fundamental principle: Above all else show the data

Review: What do you think of these? http://www.economist.com/node/21537909 http://images.macworld.com/images/howto/graphics/134708-create-charts-good_376.jpg

Common Chart Types
- Bars (for comparison)
- Pie (for composition)
- Map (for spatial comparison)
- Line (for evolution)
- Scatterplot (for relationship)

Infographics
- i.e., information graphics
- Visualization of information, data, or knowledge intended to present information quickly and clearly
- We will have an ICA to create infographics using Piktochart.
http://the-digital-reader.com/2015/04/13/infographic-ebooks-on-track-to-double-dutch-ebook-market-in-2014/

Some Visualization Tools
- Excel (as always)
- R, Stata, Tableau, SAS (useful for statistical plots)
- Google Charts, FusionCharts (simple graphs as well as maps)
- Piktochart (infographics)
- Adobe Photoshop, Illustrator, etc. (for graphical design)

Summary
- Use data visualization principles to assess a visualization:
  - Tell a story
  - Graphical integrity (lie factor)
  - Minimize graphical complexity (data ink, chartjunk)
- Explain how a visualization can be improved based on those principles
- Types of visualization