Communicating Data
B. Ramamurthy
Partially based on Ben Jones's book [1]
11/19/2018
Overview
- In this session we will learn how to communicate data with tools such as Tableau.
Tableau
- Based on the book by B. Jones [1]
Outline
- There is a huge opportunity to find and share the insights contained in data: “data-driven” applications.
- Communication involves numbers, words, images and videos.
- There are challenges: is the result meaningful? faithful to the data? appealing? engaging? useful? breathtaking?
- Tableau Software has developed a visualization querying engine and user interface that make it easier to discover and communicate with data.
- It frees data from tables and spreadsheets, which were originally meant as an input medium.
- Tableau is for everyone; there is no need to know a programming language.
- Tableau Desktop can connect to a wide variety of data sources: relational databases, cloud sources, Hadoop technologies, etc.
- Tableau Desktop is available for Windows and, since version 8.2, for macOS.
Data
- Data refers to any kind of factual information that can be stored and digitally transmitted.
- It can be news articles, financial information in tables, databases and so on.
- Communicating data is an important step in the data discovery process, as shown in the next slide.
The discovery process
1. Question
2. Gathering data
3. Structuring data
4. Exploring data
5. Communicating data
Discovery process (contd.)
- This is a highly iterative process that begins with a question; questions are domain-specific.
  - Specific question, such as “which combination of products occurs most often?”
  - General question, such as “what can we learn about historical sales of our products?”
- Gathering data: internal or external sources
  - Buy data, or gather it yourself through feeds and APIs or from free data available online (R data, Amazon data).
  - The Data Science book we used for earlier sessions lists quite a few sources for gathering data.
  - Verify the sources for reliability and fidelity.
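A specific question such as “which combination of products occurs most often?” can be answered with a short counting pass once the data is gathered. The following is a minimal Python sketch, assuming each order is simply a list of product names; the sample orders and the function name `most_common_pairs` are invented for illustration and do not come from the book or the exercises.

```python
from collections import Counter
from itertools import combinations


def most_common_pairs(orders, top_n=3):
    """Count how often each unordered pair of products appears in the same order."""
    pair_counts = Counter()
    for order in orders:
        # sorted(set(...)) drops duplicates within an order and gives each
        # pair one canonical form, so (bread, milk) and (milk, bread) match.
        for pair in combinations(sorted(set(order)), 2):
            pair_counts[pair] += 1
    return pair_counts.most_common(top_n)


# Hypothetical orders, invented for this example.
orders = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "eggs"],
    ["milk", "bread", "butter"],
]

top_pairs = most_common_pairs(orders, top_n=1)
print(top_pairs)
```

In this made-up data, bread and milk appear together in three of the four orders, so that pair is reported first.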
Discovery Process (contd.)
- Data structuring: this is an arduous process, often referred to as “data wrangling” or “data munging”.
  - Cleaning up tags and fillers, and filtering out unwanted data
  - Data is formatted, shaped, merged, converted and made ready for the data exploration step.
  - We looked at this with an R exercise in Session 3.
  - Our Data Science book has many examples: see the example using data extracted via the NYTimes API in Chapter 5.
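The clean, filter and convert steps of data structuring can be sketched in a few lines. This is a minimal Python illustration, not the R exercise from Session 3: the field names `product` and `sales`, the `N/A` filler value, and the sample records are all assumptions made for the example.

```python
import re

# Matches simple markup tags such as <b> or </i>.
TAG_PATTERN = re.compile(r"<[^>]+>")


def clean_record(raw):
    """Strip markup tags and surrounding whitespace from every field."""
    return {key: TAG_PATTERN.sub("", value).strip() for key, value in raw.items()}


def structure(raw_records):
    """Clean, filter and convert raw string records into typed rows."""
    rows = []
    for raw in raw_records:
        record = clean_record(raw)
        # Filter: drop records whose sales value is missing or a filler.
        if record["sales"] in ("", "N/A"):
            continue
        # Convert: turn the sales string into a number.
        rows.append({"product": record["product"], "sales": float(record["sales"])})
    return rows


# Hypothetical raw records, invented for this example.
raw_records = [
    {"product": "<b>Widget</b>", "sales": " 120.5 "},
    {"product": "Gadget", "sales": "N/A"},
    {"product": " <i>Gizmo</i> ", "sales": "80"},
]

rows = structure(raw_records)
print(rows)
```

The Gadget record is filtered out because its sales value is a filler, and the two remaining records come out with tags removed and sales stored as numbers.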
Discovery Process (contd.)
- Exploring data: data is viewed and analyzed from various points of view until one or more insights are gleaned.
  - This exploration provides the insights, discoveries, knowledge and quantitative results.
- Communicating data involves representing the discoveries in a form that decision makers can easily understand.
Six principles of communicating data [1]
1. Know your goal
   - Who? Target audience
   - What? Intended meaning
   - Why? Desired effect
2. Use the right data
   - It does not have to be big data, but it must be the right data. Example: the story of a single data point 14.
   - Right amount of data: big or small
   - Ethically and legally collected
3. Select suitable visualizations
   - Quantitative, ordinal and nominal data types each demand different types of visualization.
   - Choices: position, length, angle, area, grey ramp, color ramp, color hue, shape, maps
Six Principles (contd.) [1]
4. Design for aesthetics (of course)
5. Choose an effective medium and channel
   - Medium: the form the message takes
   - Channel: how it gets delivered
6. Check the results
   - Check the reach, understanding and impact
Tableau
- Tableau is drag-and-drop analysis and visualization software.
- It sits a level of abstraction above d3.js, three.js and R in that it requires no programming.
- The learning curve for Tableau is gentle; one can quickly ramp up and create useful and impressive visuals and analytics.
Main Components of Tableau
- Workbook
  - Worksheet: data sources, plots, charts
  - Dashboard(s): a single interactive visual with one or more worksheets
  - Story: a sequence of interactive visuals, made of one or more dashboards and worksheets, with navigation to facilitate presentation
Dimensions and Measures
- When a user connects to a data source, Tableau automatically classifies each field as either a Dimension or a Measure.
- Dimensions are fields that are used to group or categorize the data.
  - Examples: Country, State, Measure Names
- Measures are fields that can be used in computations such as summing and averaging.
  - Examples: Area, Population, Latitude, Longitude, Measure Values
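The dimension/measure split amounts to "group by the dimension, then aggregate the measure". The Python sketch below illustrates that idea only; it is not how Tableau is implemented, and the helper `aggregate`, the place names and the population figures are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(rows, dimension, measure, func=sum):
    """Group rows by a dimension field and aggregate a measure field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {key: func(values) for key, values in groups.items()}


# Hypothetical rows: Country and State act as dimensions, Population as a measure.
rows = [
    {"Country": "Northland", "State": "East", "Population": 10},
    {"Country": "Northland", "State": "West", "Population": 20},
    {"Country": "Southland", "State": "Central", "Population": 5},
]

totals = aggregate(rows, "Country", "Population")          # sum per country
averages = aggregate(rows, "Country", "Population", mean)  # average per country
print(totals)
print(averages)
```

Swapping the dimension from `Country` to `State`, or the function from `sum` to `mean`, changes the view without touching the underlying rows, which is roughly what dragging a different field onto a Tableau shelf does.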
Usage of Tableau
- Excellent tool for team interaction: it encourages discussion during team meetings to explore “what if” questions. No prepared dashboard or story is needed, just data exploration.
- Dashboards enable you to communicate facts to your management team, or to your customers via your web page.
  - Example: create a dashboard and display it on your web page, let your audience interact with it, and monitor their interest.
- Story: lets you communicate results to any audience, specifically clients, decision makers, the sales force and upper management.
Tableau Exercises (See Ubbox for Instructions)
- Exercise 1 introduces the main features, basic plots and the “worksheet” of Tableau using world data about GDP and population.
- Exercise 2 is a comprehensive example covering most features of Tableau, using an interesting real data set of the NHL top 100 point scorers.
- Exercise 3 continues with the same NHL data, with a focus on preparing a Tableau “Dashboard”.
- The final exercise is on designing a Tableau “Story” using the world data on GDP and population.
Summary
- We studied principles and methods for communicating data.
- More specifically, we looked at Tableau for drag-and-drop data analytics and visualization.
- We also worked on complete examples illustrating its features.
References
[1] B. Jones. Communicating Data with Tableau: Designing, Developing, and Delivering Data Visualizations. O’Reilly, 2014. http://dataremixed.com/books/cdwt/