Data Visualization
"The commonality between science and art is in trying to see profoundly - to develop strategies of seeing and showing." Edward Tufte

Visualization skills
Humans are particularly skilled at processing visual information.
This is a natural capability, whereas reading is a learned skill.
Our ancestors were efficient visual processors who quickly detected threats and used this information to make effective decisions.

A graphical representation of Napoleon Bonaparte's invasion of, and subsequent retreat from, Russia in 1812. The graph shows the size of the army, its location, and the direction of its movement. The temperature during the retreat is drawn at the bottom of the figure. Drawn by Charles Joseph Minard in 1869, it is generally considered to be one of the finest graphs ever produced.

Wilkinson’s grammar of graphics
Data: a set of data operations that create variables from datasets, such as spreadsheets and databases (e.g., Classic Models)
Trans: variable transformations (converting data into a format suitable for the intended visualization)
Scale: scale transformations (useful for controlling how the data are visualized)

Wilkinson’s grammar of graphics
Coord: a coordinate system describing where things are located (e.g., longitude and latitude for maps, and x-axis and y-axis for graphs)
Element: a graph and its aesthetic attributes (e.g., a scatterplot of year against CO2 emissions)
Guide: one or more guides (e.g., axes and legends) that help the reader interpret what is plotted in a graph

ggvis
An implementation of the grammar of graphics in R
The grammar describes the structure of a graphic
A graphic is a mapping of data to a visual representation
ggvis: http://had.co.nz/ggplot/

Data
Spreadsheet approach
  Use an existing spreadsheet or create a new one
  Export as CSV file
Database
  Execute SQL query
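As a minimal sketch of the spreadsheet approach, the R code below writes a small data frame to a temporary CSV file and reads it back with read.csv. The values, the column names (year, sales), and the temporary file path are hypothetical stand-ins for a real exported spreadsheet.

```r
# Hypothetical data standing in for a spreadsheet (not from the slides)
sheet <- data.frame(year = c(2009, 2010, 2011), sales = c(120, 135, 150))

# "Export as CSV file": write the sheet to a temporary location
csv_path <- tempfile(fileext = ".csv")
write.csv(sheet, csv_path, row.names = FALSE)

# Read the exported file back into R as a data frame
imported <- read.csv(csv_path, header = TRUE)
str(imported)
```

The database approach takes the other route and returns a data frame directly from an SQL query, as the bar graph slide does with dbGetQuery.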

Transformation
A transformation converts data into a format suitable for the intended visualization

    # TRANSFORMATION:
    url <- 'http://people.terry.uga.edu/rwatson/data/carbon1959-2011.txt'
    carbon <- read.table(url, header=TRUE, sep=',')
    head(carbon)
    # compute a new column in carbon containing the relative change in CO2 since
    # pre-industrial periods, when the value was 280 ppm.
    carbon$relCO2 <- (carbon$CO2-280)/280
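To make the arithmetic of this transformation concrete, here is a small worked sketch on made-up CO2 values. The numbers are illustrative only and are not taken from the carbon file; only the formula and the 280 ppm baseline come from the slide.

```r
# Illustrative CO2 readings in ppm (hypothetical, not from the carbon data set)
example <- data.frame(CO2 = c(280, 315, 350, 392))

# Relative change from the pre-industrial baseline of 280 ppm
example$relCO2 <- (example$CO2 - 280) / 280
example
```

A reading of 350 ppm gives (350 - 280) / 280 = 0.25, that is, 25% above the baseline, and a reading equal to the baseline gives 0.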

Coord
A coordinate system describes where things are located
Most graphs are plotted on a two-dimensional (2D) grid with x (horizontal) and y (vertical) coordinates
The default coordinate system is Cartesian (e.g., for a histogram)

Element
An element is a graph and its aesthetic attributes
Build a graph by adding layers

    library(ggvis)
    library(readr)
    # ELEMENT: CO2 EMISSION BY YEAR
    carbon %>%
      ggvis(~year,~CO2) %>%
      layer_points(fill:='red')
    # use the pipe operator (%>%) to create a pipeline of commands
    # the code above reads like a recipe. It says:
    # 1. take the carbon data, then
    # 2. use the package ggvis to plot year by CO2, and
    # 3. specify the plot to contain red points.

Scale

    # SCALE: GOOD IDEA TO HAVE A ZERO POINT FOR THE Y-AXIS (DON'T DISTORT THE SLOPE!)
    carbon %>%
      ggvis(~year,~CO2) %>%
      layer_points(fill:='red') %>%
      scale_numeric('y',zero=TRUE)
    # perform steps 1-3 of the ELEMENT code, and then,
    # 4. set the scale for the y-axis so that it includes zero.

Axes

    # AXES: HELP THE READER UNDERSTAND THE GRAPH
    carbon %>%
      ggvis(~year,~relCO2) %>%
      layer_lines(stroke:='blue') %>%
      scale_numeric('y',zero=TRUE) %>%
      add_axis('y', title = "Relative change in atmospheric CO2 since 280 ppm", title_offset=50) %>%
      add_axis('x', title ='Year', format = '####')
    # the code above says:
    # 1. take the carbon data, then
    # 2. use the package ggvis to plot year by relCO2, then
    # 3. specify the plot to contain a continuous blue line, then
    # 4. set the scale for the y-axis so that it includes zero, then
    # 5. add a title for the y-axis that is moved a bit to the left to improve readability, and
    # 6. add a title for the x-axis, specifying a format of 4 consecutive digits for displaying year on the x-axis

Guides
Axes and legends are both forms of guides
They help the viewer understand a graphic

Exercise
Create a point plot using the data in the following table. Add a title for both the x- and y-axes.

Year                   1804  1927  1960  1974  1987  1999  2012  2027  2046
Population (billions)     1     2     3     4     5     6     7     8     9
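One possible starting point for this exercise, offered as a sketch and not as the intended answer: the table is entered as a data frame and passed through the same ggvis pipeline used on the earlier slides. The data frame name population and the column names year and billions are choices made here, and the requireNamespace guard is added only so the script still runs where ggvis is not installed.

```r
# Data from the exercise table
population <- data.frame(
  year = c(1804, 1927, 1960, 1974, 1987, 1999, 2012, 2027, 2046),
  billions = 1:9
)

# Build the plot only if ggvis is available (guard added for this sketch)
if (requireNamespace("ggvis", quietly = TRUE)) {
  library(ggvis)
  p <- population %>%
    ggvis(~year, ~billions) %>%
    layer_points(fill := 'red') %>%
    add_axis('x', title = 'Year', format = '####') %>%
    add_axis('y', title = 'Population (billions)')
  # print(p) renders the plot in the viewer or browser
}
```

The x-axis format string is copied from the Axes slide so that years are shown as four digits.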

Histogram

    # HISTOGRAM: USEFUL FOR SHOWING THE DISTRIBUTION OF VALUES IN A SINGLE COLUMN
    url <- 'http://people.terry.uga.edu/rwatson/data/centralparktemps.txt'
    t <- read.table(url, header=TRUE, sep=',')
    t$C <- round((t$temperature - 32)*5/9,1)
    t %>%
      ggvis(~C) %>%
      layer_histograms(width = 2, fill:='cornflowerblue') %>%
      add_axis('x',title='Celsius') %>%
      add_axis('y',title='Frequency')
    # width refers to the size of the bin.
    # this means that the bin above the tick mark 10 contains all values in the range 9 to 11.
    # The code above says:
    # 1. store the url, then
    # 2. read the url content as table t, then
    # 3. create a new column in t that converts the Fahrenheit temperature to Celsius and rounds it to one decimal place, then
    # 4. take the t data, then
    # 5. use the package ggvis to plot Celsius temperature, then
    # 6. specify the plot to be a histogram with width 2 and color cornflowerblue, then
    # 7. add a title for the x-axis, and
    # 8. add a title for the y-axis.
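The conversion and the meaning of a bin width of 2 can be checked by hand on a few values. The sketch below uses base R's hist with plot = FALSE, not ggvis, and the Fahrenheit readings are hypothetical, not taken from the Central Park file. The bin edges at 7, 9, 11, and 13 are chosen here so that the bins are centered on 8, 10, and 12, matching the slide's description of the bin above tick mark 10. How ggvis treats values that fall exactly on a bin edge is not covered by this sketch, so the sample values avoid the edges.

```r
# Hypothetical Fahrenheit readings (not from the Central Park data)
fahrenheit <- c(46.6, 49.1, 50.0, 50.9, 52.7, 53.6)

# Same conversion as on the slide: Fahrenheit to Celsius, one decimal place
celsius <- round((fahrenheit - 32) * 5 / 9, 1)

# Bins of width 2 with edges at 7, 9, 11, 13, so the midpoints are 8, 10, 12
h <- hist(celsius, breaks = seq(7, 13, by = 2), plot = FALSE)
data.frame(midpoint = h$mids, count = h$counts)
```

In this sample the bin centered on 10 holds the three readings that fall between 9 and 11 degrees Celsius.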

Exercise
Create a histogram of CO2 using the carbon 1959-2011 data. Add a title for both the x- and y-axes.

    url <- 'http://people.terry.uga.edu/rwatson/data/carbon1959-2011.txt'
    carbon <- read.table(url, header=TRUE, sep=',')

Bar graph

    # BAR GRAPH: USEFUL FOR GRAPHING CATEGORICAL DATA
    library(DBI)
    library(RMySQL)
    # set a driver
    m <- dbDriver("MySQL")
    # connect to the database
    conn <- dbConnect(m,user='student',password='student',host='wallaby.terry.uga.edu',dbname='ClassicModels')
    # if the error "in .local(drv, ...): cannot allocate a new connection: 16 connections already opened"
    # appears, run the next two lines to close the open connections, then connect again.
    # If there is no problem, skip them and move on to query the database.
    # cons <- dbListConnections(MySQL())
    # for(con in cons) dbDisconnect(con)
    # query the database and create a data frame for use with R
    d <- dbGetQuery(conn,"select productLine from Products;")
    # plot the number of products in each product line by specifying the appropriate column name
    d %>%
      ggvis(~productLine) %>%
      layer_bars(fill:='chocolate') %>%
      add_axis('x',title='Product line') %>%
      add_axis('y',title='Count')
    # The code immediately above says:
    # 1. take the d data, then
    # 2. use the package ggvis to plot productLine, then
    # 3. specify the plot to be a bar graph with color chocolate, then
    # 4. add a title for the x-axis, and
    # 5. add a title for the y-axis.
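The bar heights in this plot are counts of rows per category, assuming ggvis tallies the rows for each x value when no y variable is supplied, which is how the y-axis title 'Count' reads. The same tally can be reproduced without a database connection using base R's table. The rows below are made up for illustration and are not the actual contents of the Products table.

```r
# Hypothetical query result: one row per product (values invented for this sketch)
d <- data.frame(
  productLine = c("Classic Cars", "Motorcycles", "Classic Cars",
                  "Planes", "Motorcycles", "Classic Cars")
)

# Count how many rows fall into each product line
counts <- table(d$productLine)
as.data.frame(counts, responseName = "count")
```

Each row of the resulting table corresponds to one bar: the category on the x-axis and its count as the bar height.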

Exercise Using Classic Models, create a bar graph to show how many offices each country has.