B.Ramamurthy Partially Based on Ben Jones Book [1]

Presentation transcript:

Communicating Data B. Ramamurthy Partially Based on Ben Jones Book [1] Rich's Big Data Training 11/7/2018

Review We spent most of Sessions 1, 2 and 3 with the R data analysis software and the RStudio integrated development environment: packages, libraries, plots, charts, maps and external data access, along with many exploratory data analysis exercises. In Session 4 we looked at Amazon cloud services; we will revisit this in our next session. In Session 5 we looked at JavaScript and the expressive JS libraries d3.js, three.js and jQuery. Today in Session 6, we move to a level of abstraction above all these, with two diverse approaches to data analytics and visualization: Gephi and Tableau.

Overview In this session we will learn how to communicate data with tools such as Gephi and Tableau. We will begin with Gephi, which is focused on networks and graphs, and then work on Tableau, which is broad in its application area. Gephi and Tableau are somewhat complementary to each other.

Gephi Gephi is free, open source software for analyzing and visualizing networks and graphs, detecting communities and discovering relationships. Gephi was originally a contribution of an open source community in France. It is especially useful for social network data analysis. Download Gephi 0.8.2 for Windows or the latest version.

Gephi Applications: analyzing social networks; detecting communities; dynamic networks; Twitter data analysis; text network analysis (“mine the talk”).

Gephi Data and Algorithms Gephi data is very simple, with three types of data: nodes, edges and attributes. Edges are always between two nodes, and attributes are data associated with nodes or edges, such as a string or an integer. The structure of nodes and edges is called the network topology. Attributes are called network data. Gephi uses well-known algorithms for creating the graph and for graph operations. Information about the algorithm used is given at every step via the information icon.

Creating Gephi Dataset At the fundamental level Gephi needs two sets of data: nodes and edges. While Gephi accepts a variety of input formats, we will look at graph data represented by (i) an XML file with node and edge tags, and (ii) CSV files, one each for nodes and edges.
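
The CSV option can be sketched in a few lines of Python. This is a minimal illustration, not part of the exercises: the three-person network is invented, and the column names (Id and Label for nodes; Source, Target and Weight for edges) are the headers Gephi's spreadsheet import is generally expected to recognize, so treat them as an assumption to check against your Gephi version.

```python
import csv
import io

# Hypothetical toy network; names and weights are invented for illustration.
nodes = [("1", "Alice"), ("2", "Bob"), ("3", "Carol")]
edges = [("1", "2", 3), ("2", "3", 1)]


def to_csv(header, rows):
    """Return CSV text with the given header row followed by the data rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# One table for nodes and one for edges (column names are an assumption).
nodes_text = to_csv(["Id", "Label"], nodes)
edges_text = to_csv(["Source", "Target", "Weight"], edges)

print(nodes_text)
print(edges_text)
```

Writing `nodes_text` and `edges_text` to two files (for example nodes.csv and edges.csv) gives one file per table, which can then be loaded through the Data Laboratory.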

Gephi Tool Layout: Overview, Data Laboratory, Preview.

Gephi Workflow Data Laboratory facilitates importing data into the Gephi workspace. Overview provides various options for creating and manipulating the basic graph: clustering, ranking, labeling and filtering. The Preview panel provides aesthetics and export features for capturing the graph created.

Force Atlas Analysis Graphs are drawn based on similarities and differences (no similarities) in the data. Settings can be customized to place more emphasis on the independence of individual nodes from one another or on their relative (semantic) proximity to one another. For example, you can specify Attraction Strength and Repulsion Strength: the former brings out similarities in the creation of the graph and the latter updates the graph for dissimilarities. Attraction pulls related nodes together toward the center, while repulsion drives dissimilar nodes to the perimeter. Clusters/communities can be identified and actions can be taken to target specific communities for a sales campaign or similar activities. See http://webatlas.fr/tempshare/ForceAtlas2_Paper.pdf. For those with statistical backgrounds, the tool offers many models along with detailed analysis. Other algorithms: Fruchterman-Reingold, Yifan Hu.
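
The interplay of attraction and repulsion can be illustrated with a toy spring-style layout in Python. This is a generic force-directed sketch under simple assumptions (linear attraction along edges, inverse-distance repulsion between all pairs, a capped step size), not the ForceAtlas2 algorithm that Gephi implements; the graph, the function names and the parameter values are all invented for illustration.

```python
import math
from itertools import combinations


def force_layout(nodes, edges, attraction=0.05, repulsion=0.5, steps=500):
    """Toy force-directed layout: edges attract, every node pair repels."""
    n = len(nodes)
    # Start on a circle so the run is deterministic.
    pos = {v: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i, v in enumerate(nodes)}
    for _ in range(steps):
        move = {v: [0.0, 0.0] for v in nodes}
        # Repulsion: every pair pushes apart, more strongly when close.
        for a, b in combinations(nodes, 2):
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d2 = dx * dx + dy * dy + 1e-9  # avoid division by zero
            move[a][0] += repulsion * dx / d2
            move[a][1] += repulsion * dy / d2
            move[b][0] -= repulsion * dx / d2
            move[b][1] -= repulsion * dy / d2
        # Attraction: linked nodes pull together, proportional to distance.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            move[a][0] -= attraction * dx
            move[a][1] -= attraction * dy
            move[b][0] += attraction * dx
            move[b][1] += attraction * dy
        # Apply the moves, capping each step to keep the layout stable.
        for v in nodes:
            for k in (0, 1):
                pos[v][k] += max(-0.1, min(0.1, move[v][k]))
    return pos


def mean_distance(pos, pairs):
    """Average Euclidean distance over the given node pairs."""
    pairs = list(pairs)
    return sum(math.dist(pos[a], pos[b]) for a, b in pairs) / len(pairs)


# Two invented communities (a, b, c) and (x, y, z) joined by one bridge edge.
nodes = ["a", "b", "c", "x", "y", "z"]
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("x", "y"), ("y", "z"), ("x", "z"),
         ("c", "x")]

pos = force_layout(nodes, edges)
intra = mean_distance(pos, list(combinations("abc", 2)) + list(combinations("xyz", 2)))
inter = mean_distance(pos, [(p, q) for p in "abc" for q in "xyz"])
print("mean distance within communities:", round(intra, 2))
print("mean distance between communities:", round(inter, 2))
```

Nodes in the same community end up closer to each other than to nodes in the other community, which is the visual cue used to spot clusters. Raising `attraction` tightens the clusters, and raising `repulsion` spreads the whole graph out.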

Application of Gephi Gephi graph analysis is appropriate for applications that need to discover clusters of people: customers, employees, candidates, salespeople, etc. Examples: customer relationship management, customer segmentation for targeted sales campaigns, employee/people resource clusters for various business activities, election campaigns and fund-raising campaigns (development).

Gephi Exercises We introduce the basic features of Gephi using a data set from a digital humanities project. This is a partial project that simply introduces the main features. Exercise 2 is a complete network example with a known data set from the Les Miserables Broadway show.

Tableau Based on the book by B. Jones [1]

Outline There is a huge opportunity to find and share insights contained in data: “data-driven” applications. Communication involves numbers, words, images and videos. There are challenges: meaningful? fidelity? appeal? engaging? useful? breathtaking? Tableau Software has developed a visualization querying engine and user interface that make it easier to discover and communicate with data. It frees the data from tables and spreadsheets, which were originally meant to be an input medium. Tableau is for everyone; there is no need to know a programming language. Tableau Desktop can connect to a wide variety of data sources: relational databases, cloud sources, Hadoop technologies, etc. Tableau Desktop is available for Windows and macOS.

Data Data refers to any kind of factual information that can be stored and digitally transmitted: news articles, financial information in tables, databases and so on. Communicating data is an important step in the data discovery process, as shown in the next slide.

The discovery process: question, gathering data, structuring data, exploring data, communicating data.

Discovery process (contd.) This is a highly iterative process that begins with a question, which is domain-specific. It can be a specific question such as “which combination of products occurs most often?” or a general question such as “what can we learn about historical sales of our products?” Gathering data: sources can be internal or external. You can buy data, gather it yourself through feeds and APIs, or use free data available online (R data, Amazon data). The Data Science book we used for earlier sessions gives quite a few sources for gathering data. Verify the sources for reliability and fidelity.
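
A specific question like the product-combination one maps directly onto a small computation. The Python sketch below counts how often each pair of products appears in the same order; the order data and the function name `most_common_pairs` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders; each inner list is the set of products in one purchase.
orders = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "jam"],
]


def most_common_pairs(orders, top=3):
    """Count product pairs that occur together in the same order."""
    counts = Counter()
    for order in orders:
        # sorted(set(...)) counts each pair once per order, in a fixed
        # alphabetical orientation so (a, b) and (b, a) are the same pair.
        for pair in combinations(sorted(set(order)), 2):
            counts[pair] += 1
    return counts.most_common(top)


top_pairs = most_common_pairs(orders)
print(top_pairs[0])  # the most frequent pair and its count
```

On this toy data the most frequent combination is bread and milk, which appears in three of the four orders.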

Discovery Process (contd.) Data structuring: this is an arduous process often referred to as “data wrangling” or “data munging”. It involves cleaning up tags and fillers and filtering out unwanted data. Data is formatted, shaped, merged, converted and made ready for the data exploration step. We looked at this with an R exercise in Session 3. Our Data Science book has many examples: see the example using data extracted via the NYTimes API in Chapter 5.
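
As a small illustration of what wrangling involves, the Python sketch below strips markup tags, converts a text field to a number and filters out unusable rows. The records, field names and cleaning rules are invented for this example; real data will need its own rules.

```python
import re
from html import unescape

# Hypothetical raw records, as they might arrive from a feed or an API.
raw_records = [
    {"headline": "<b>Sales rise</b> in Q3", "views": "1,204"},
    {"headline": "   ", "views": "17"},
    {"headline": "New store &amp; cafe opens", "views": "n/a"},
]

TAG = re.compile(r"<[^>]+>")  # matches simple markup tags such as <b> or </b>


def clean(record):
    """Strip tags and whitespace from the text; convert views to an int."""
    text = unescape(TAG.sub("", record["headline"])).strip()
    digits = record["views"].replace(",", "")
    views = int(digits) if digits.isdigit() else None
    return {"headline": text, "views": views}


cleaned = [clean(r) for r in raw_records]

# Filter out rows with an empty headline or a missing view count.
usable = [r for r in cleaned if r["headline"] and r["views"] is not None]
print(usable)
```

Only the first record survives the filter: the second has an empty headline and the third has no usable view count.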

Discovery Process (contd.) Exploring data: data is viewed and analyzed from various points of view until one or more insights are gleaned. This exploration provides the insights/discoveries/knowledge/quantitative results. Communicating data involves representing the discoveries in a form that can be easily understood by decision makers.

Six principles of communicating data [1] (1) Know your goal. Who? Target audience. What? Intended meaning. Why? Desired effect. (2) Use the right data. It does not have to be big data, but the right data. Example: the story of a single data point 14. Use the right amount of data, big or small, ethically and legally collected. (3) Select suitable visualizations. Quantitative, ordinal and nominal data types each demand different types of visualization. Choices: position, length, angle, area, grey ramp, color ramp, color hue, shape, maps.

Six Principles (contd.) [1] (4) Design for aesthetics (of course). (5) Choose an effective medium and channel. Medium: the form the message takes. Channel: how it gets delivered. (6) Check the results: check the reach, understanding and impact.

Tableau Tableau is drag-and-drop analysis and visualization software. It is a level of abstraction above d3.js, three.js and R in that it requires no programming. The learning curve for Tableau is gentle; one can quickly ramp up and create useful and impressive visuals and analytics.

Main Components of Tableau Workbook. Worksheet: data sources, plots, charts. Dashboard(s): a single interactive visual with one or more worksheets. Story: a sequence of interactive visuals with one or more dashboards and worksheets, with navigation facilitating presentation.

Dimensions and Measures When a user connects to a data source, Tableau automatically classifies each field as either a Dimension or a Measure. Dimensions are fields that are used to group or categorize the data. Examples: Country, State, Measure Names. Measures are fields that can be used to compute, for example by summing and averaging. Examples: Area, Population, Latitude, Longitude, Measure Values.
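
The distinction can be made concrete outside Tableau with a small Python sketch: group rows by a dimension and sum a measure within each group. The classification rule here (numeric fields are measures, everything else is a dimension) is a simplifying assumption for illustration, not Tableau's actual logic, and the rows and figures are made up.

```python
from collections import defaultdict

# Hypothetical rows; the numbers are invented and carry no real-world meaning.
rows = [
    {"Country": "USA", "State": "New York", "Population": 100, "Area": 10},
    {"Country": "USA", "State": "Ohio", "Population": 60, "Area": 8},
    {"Country": "Canada", "State": "Ontario", "Population": 70, "Area": 20},
]


def classify(rows):
    """Label each field: numeric values -> Measure, otherwise Dimension."""
    first = rows[0]
    return {
        field: "Measure" if isinstance(value, (int, float)) else "Dimension"
        for field, value in first.items()
    }


def sum_by(rows, dimension, measure):
    """Group rows by a dimension and sum a measure within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


roles = classify(rows)
population_by_country = sum_by(rows, "Country", "Population")
print(roles)
print(population_by_country)
```

Dragging a dimension and a measure onto a worksheet produces, in effect, this kind of grouped aggregate, with sum as a typical default aggregation.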

Usage of Tableau An excellent tool for team interaction: it encourages discussions during team meetings to explore “what if” questions. There is no need for a prepared dashboard or story, just data exploration. Dashboards enable you to communicate facts to your management team, or to your customers via your web page. Example: create a dashboard and display it on your web page, let your audience interact with it, and watch and monitor their interest. Story: lets you communicate results to any audience, specifically clients, decision makers, sales force and upper management.

Tableau Exercises Exercise 3 introduces the main features, basic plots and the “worksheet” of Tableau using world data about GDP and population. Exercise 4 is a comprehensive example covering most features of Tableau, using an interesting real data set of the NHL top 100 point scorers. Exercise 5 continues with the same NHL data, with a focus on preparing a Tableau “Dashboard”. The final exercise is on designing a Tableau “Story” using the world data on GDP and population.

Summary We studied principles and methods for communicating data. More specifically, we looked at Gephi for network/graph analysis and Tableau for drag-and-drop data analytics and visualization. We also worked on complete examples illustrating the features of the two tools.

References [1] B. Jones. Communicating Data with Tableau: Designing, Developing, and Delivering Data Visualizations. O’Reilly, 2014. http://dataremixed.com/books/cdwt/