Data Visualization.


Data Visualization

Lies, Damn Lies, and Bad Graphs http://www.tnr.com/blog/jonathan-cohn/77893/lying-graphs-republican-style Notice the bars don’t start at zero, giving you a false perception that the differences are large.

Lies, Damn Lies, and Bad Graphs http://www.tnr.com/blog/jonathan-cohn/77893/lying-graphs-republican-style Here the bars are adjusted and start at zero, and the graphs go to 100 percent, but this crushes the vertical distance, maybe leading you to believe there is NO difference.
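The effect of the baseline described in the two slides above can be checked with a little arithmetic. The following is a minimal Python sketch, not taken from the slides: the helper name `visual_ratio` and the values 35 and 40 are hypothetical, chosen only for illustration.

```python
def visual_ratio(small, large, baseline=0.0):
    """Return how many times taller the larger bar is drawn than the
    smaller one when the value axis starts at `baseline`.
    Hypothetical helper, for illustration only."""
    if small <= baseline:
        raise ValueError("baseline must be below the smaller value")
    return (large - baseline) / (small - baseline)


# Hypothetical values: the true ratio is 40 / 35, roughly 1.14.
honest = visual_ratio(35, 40, baseline=0)      # bars start at zero
truncated = visual_ratio(35, 40, baseline=34)  # axis starts at 34

print(f"zero baseline: {honest:.2f}x taller")
print(f"truncated baseline: {truncated:.1f}x taller")
```

With these assumed numbers, a difference of about 14 percent is drawn as a sixfold difference in bar height once the axis starts at 34, which is the false perception the first slide warns about.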

Lies, Damn Lies, and Bad Graphs http://motherjones.com/kevin-drum/2010/09/big-spenders This does a better job of putting spending in perspective. Things don’t get lost in averages, trends are easy to see, and the fact that they use a line graph allows them to start at something besides zero.

Visual Medium Reports Paper (static, with time) Web (dynamic and interactive) Presentations (static and dynamic) The different media require you to spend time thinking about the audience, the message, and the time they have for digesting the data.

In an article headlined “Hey, Green Spender,” Aldhous and colleague Phil McKenna examined the gap between consumer perception and environmental realities across multiple industries such as retail, media, travel and leisure, food and beverages, technology, construction and chemicals. When the data was plotted, the differences between the perceptions and the realities were immediately visible, and the reporters knew they were on the right track. “It’s not just about producing graphics for publication,” Aldhous explains. “It’s about playing around and making a bunch of graphics that help you explore your data. This kind of graphical analysis is a really useful way to help you understand what you’re dealing with, because if you can’t see it, you can’t really understand it. But when you start graphing it out, you can really see what you’ve got.” http://www.r-bloggers.com/r-is-hot-part-4/

http://www.excelcharts.com/blog/data-visualization-continuum/

Data vis is sometimes about simple error checking.

Four sets of data with the same correlation of 0.816
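The slide does not name its data, but four datasets sharing a correlation of 0.816 matches the well-known Anscombe's quartet, so the sketch below assumes that is what is shown. It computes the Pearson correlation for each of the four published sets using only the standard library; the helper name `pearson_r` is an illustrative choice.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Anscombe's quartet (assumed to be the data behind the slide).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

correlations = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in correlations.items():
    print(f"set {name}: r = {r:.3f}")
```

The summary statistic is nearly identical across the four sets, yet their scatter plots look completely different, which is why plotting is needed in addition to computing.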

Percent Blue relative to Red?

Percent Blue relative to Red?

You want to make it as easy as possible to make visual interpretations. Positions along a common scale are the easiest. Never require more difficult means when easier ones suffice. Don’t use 3D bar charts when all you need is the height encoded.

More Bad versus good charts http://www.perceptualedge.com/examples.php http://lilt.ilstu.edu/jpda/charts/bad_charts1.htm Some bad 3D graphs http://www.slideshare.net/gschmitt/meet-the-connected-consumer

Bad

Better From http://hobershort.wordpress.com/2008/11/06/do-the-numbers/

Even Better* The remaining problem is that starting the Y axis at 0 compresses the differences. This is good and bad. It’s bad because there is too much useless whitespace. It’s good because it doesn’t distort the data. The other problem is that it connects data points across time when in fact there are 4 years intervening and the composition of the groups differs across those time periods as some people move between groups, but this is minor.

Introduction; History of Plots; The Explanatory Power of Graphics; Basic Philosophy of Approach; Graphical Integrity; Data Densities; Data Compression; Multifunctioning Graphical Elements; Maximize data-ink, minimize non-data ink; Small Multiples; Chartjunk; Colors; General Philosophy for Increasing Data Comprehension; Techniques for Increasing Data Comprehension; When NOT to Use Graphics; Aesthetics

http://chartchooser.juiceanalytics.com/

Chartjunk and Graphics Integrity

Types of chartjunk Chartjunk is non-data-ink or redundant data-ink decoration. Types: Unintended Optical Art (Moiré vibration); The Grid; The Duck: Self-promoting Graphics

Unintended Optical Art Mainly relies on moiré effects. Distracting appearance of vibration and movement. The most common form of graphical clutter.

Moiré Vibrations

The Grid Dark grid lines are chartjunk. The grid should usually be muted or completely suppressed.

The Grid (cont’d) Marey’s train schedule

The Duck Self-promoting graphics: when the data measures become design elements

Duck Examples

"In our excitement to produce what we could only make before with great effort, many of us have lost sight of the real purpose of quantitative displays — to provide the reader with important, meaningful, and useful insight." — Stephen Few

Graphical Integrity Graphical excellence begins with telling the truth about the data. Some examples of lies

Two Principles 1. The representation of numbers, as physically measured on the surface of the graphic, should be directly proportional to the numerical quantities represented. 2. Clear, detailed and thorough labeling should be used to defeat distortion.

Violating rule 1 18 miles/gallon: 0.6 inches; 27.5 miles/gallon: 5.3 inches

Lie Factor Rule 1 can be measured by the Lie Factor. Lie Factor = (size of effect shown in graphic) / (size of effect in data). A Lie Factor equal to one is ideal. The previous slide has a lie factor of 14.8.
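The figure of 14.8 can be reproduced from the measurements on the fuel-economy slide (18 and 27.5 miles per gallon drawn as 0.6 and 5.3 inches). The sketch below assumes Tufte's definition of "size of effect" as the relative change from the first value to the second; the function names are illustrative.

```python
def effect_size(first, second):
    """Relative change from `first` to `second`: |second - first| / first."""
    return abs(second - first) / first


def lie_factor(data_first, data_second, drawn_first, drawn_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(drawn_first, drawn_second) / effect_size(data_first, data_second)


# Measurements quoted on the fuel-economy slide.
data_effect = effect_size(18.0, 27.5)   # change in miles per gallon
drawn_effect = effect_size(0.6, 5.3)    # change in line length, in inches
factor = lie_factor(18.0, 27.5, 0.6, 5.3)

print(f"effect in data:    {data_effect:.1%}")
print(f"effect in graphic: {drawn_effect:.1%}")
print(f"lie factor:        {factor:.1f}")
```

The data change by about 53 percent while the drawn lines change by about 783 percent, and the ratio of the two gives the lie factor of 14.8.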

Design and Data Variation Show data variation, not design variation. 1973-1978: one vertical inch equals $8.00; in 1979, one vertical inch equals $3-4. 1973-1978: one horizontal inch equals 3.7 years, while in 1979 one horizontal inch equals 0.57 year.

Example Lie factor: 9.5 The price of oil is inflated, so the graphic needs to be repaired.

Government Spending Tricks to exaggerate the growth of spending

Real Government Spending Tricks to exaggerate the growth of spending

Visual Area and Numerical Measure Another way to trick the viewer with design variation is to use areas to show one-dimensional data. Lie factor: 2.8
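Why area encodings exaggerate one-dimensional data can be shown with the lie factor arithmetic. This is a sketch under a stated assumption: the designer scales both the width and the height of a shape by the data ratio, so the drawn area grows with the square of that ratio. The function name and the sample ratio of 2 are hypothetical and are not the figures behind the slide's 2.8.

```python
def area_lie_factor(data_ratio):
    """Lie factor when a value that grows by `data_ratio` is drawn as a
    shape whose width AND height are both scaled by `data_ratio`.
    Assumes the viewer reads the area as the encoded quantity."""
    if data_ratio <= 1:
        raise ValueError("data_ratio must be greater than 1")
    effect_in_data = data_ratio - 1            # relative change in the value
    effect_in_graphic = data_ratio ** 2 - 1    # relative change in drawn area
    return effect_in_graphic / effect_in_data


# Hypothetical example: the value doubles, the drawn area quadruples.
doubling = area_lie_factor(2.0)
print(f"lie factor when the data double: {doubling:.1f}")
```

Under this assumption the lie factor simplifies to the data ratio plus one, so the distortion grows as the underlying change grows.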

Context is Essential Graphics must not quote data out of context

Context is Essential Graphics must not quote data out of context

On Using Color… The gray squares in the center are all the same color, but notice the apparent differences.

Rule #3: Use color only when needed to serve a particular communication goal.

Picking Color Schemes http://colorbrewer2.org/ http://kuler.adobe.com Colorbrewer

Stop Visually Assaulting Me http://fosslien.com/rules/

The principles 1. The representation of numbers, as physically measured on the surface of the graphic, should be directly proportional to the numerical quantities represented. 2. Use clear and detailed labeling. 3. Show data variation, not design variation. 4. The number of information-carrying dimensions depicted should not exceed the number of dimensions in the data (for example, two dimensions of data call for a 2D graphic, not a 3D one). 5. Graphics should not quote data out of context.

Why do graphics lie? Lack of quantitative skills of professional artists The doctrine that statistical data are boring The doctrine that graphics are only for the unsophisticated readers

Design is choice. The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper. — Edward Tufte, The Visual Display of Quantitative Information

Word Cloud This is similar to the unordered bar charts. But in this case the ordering is sacrificed for some aesthetic value. The hope is that you spend time with the data.

Wordle.net In this case good~times and bad~times appear equally. Notice that phrases are joined by ~

Spine Plot / Matrix Chart This encodes two types of data. http://pubs.logicalexpressions.com/Pub0009/LPMArticle.asp?ID=508

Bullet Graph Data dense. Each bar communicates a piece of data. Overlapping bars. Another data dense visual. http://www.perceptualedge.com/articles/misc/Bullet_Graph_Design_Spec.pdf http://peltiertech.com/WordPress/marimekko-replacement-overlapping-bars-easy/

Bullet Graph

Bullet Graph

Choropleth “Heat Map” The problem with this type of chart is it makes the entire county appear equally at risk. Tufte doesn’t like these for cancer maps. It might work in this case since it represents the proportion of houses in the county in foreclosure. Since the data is legitimately geographically bounded it is useful.

RED STATE BLUE STATE PURPLE STATE http://www-personal.umich.edu/~mejn/election/2012/

http://www.smartmoney.com/map-of-the-market/

Dynamic Charts

Encoding three variables and plotting them over time. Dynamic. http://taggertbrooks.com/wp-content/uploads/2009/10/foreclosure.htm http://taggertbrooks.com/wp-content/uploads/2009/10/foreclosure-state.htm http://www.uwlax.edu/faculty/brooks/prof/charts/foreclosure.htm http://www.uwlax.edu/faculty/brooks/prof/charts/foreclosure-state.htm

http://www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html?scp=1&sq=inflation%20chart&st=cse

http://www.nytimes.com/interactive/2008/09/15/business/20080916-treemap-graphic.html

Problem with this representation? 0 should mean absence of bar.

Avoid defaults in Excel

Show the data Data dense

Maximize Data Ink Ratio Minimize Non-Data Ink

Eliminate Chart Junk

Streamline Placement

http://blog.bissantz.com/napoleon Napoleon’s Russian offensive was a catastrophe. He started with 422,000 men in June 1812 and returned with fewer than 10,000 soldiers on October 7, 1813. His army had already lost 135,000 men two weeks into the campaign, although there were no major battles at this point. Napoleon wanted his troops to feed off the land during their advance but the enemy left nothing but ‘scorched earth’ during its retreat. Since they had no alcohol to sanitize the water, there was a rapid outbreak of dysentery. Before the battle of Smolensk on August 17th, disease, weakness and desertion had already reduced the troops to 175,000 men. Napoleon arrived in Moscow with 100,000 soldiers. He had already lost two-thirds of his main army, not to mention many horses. Napoleon’s army faced many battles during its retreat. It lacked horses to pull the loads. The soldiers torched their wagons and left their dismantled cannons behind. When winter arrived, they had no warm clothing. Since the horses had the wrong shoes, the number of accidents rose on the slick paths. They even burned the pontoons that they carried to build bridges just a few days before they reached the Beresina River. Lice thrived in the appalling hygienic conditions and transmitted typhus fever. Napoleon returned with fewer than 10,000 men. Undoubtedly a graphical milestone: This 1869 visualization drawn by the engineer Charles Joseph Minard shows the data from Napoleon’s disastrous Russian campaign from 1812–1813. Without further explanation, however, it is more of an appeal than an analysis.

http://www.forbes.com/sites/bruceupbin/2012/05/17/the-real-reason-that-ted-talk-was-censored-its-shoddy-and-dumb/