CPT-S 415 Big Data Yinghui Wu EME B45.


CPT-S 415 Big Data: Data Visualization and Navigation (Information Visualization). Outline: Information Visualization; Graph visualization (graph drawing and graph visualization, graph layout). Courtesy of Ivan Herman.

“Close the loop” Data quality -> knowledge quality (potential new area) Data interpretation Visual analytics Graph visualization

http://www.matthiasdittrich.com/projekte/narratives/visualisation/

Information Visualization

Linking data to humans. Collecting information is no longer a problem, but extracting value from information collections has become progressively more difficult. Visualization links the human eye and the computer, helping to identify patterns and to extract insights from large amounts of information. Visualization technology shows considerable promise for increasing the value of large-scale collections of information. Visualization can be classified as scientific visualization, software visualization, and information visualization.

Visualization Classification: Scientific Visualization. Scientific visualization helps in understanding physical phenomena in data (Nielson, 1991). A mathematical model plays an essential role. Isosurfaces, volume rendering, and glyphs are commonly used techniques. Isosurfaces depict the distribution of certain attributes. Volume rendering allows viewers to see the entire volume of 3-D data in a single image (Nielson, 1991). Glyphs provide a way to display multiple attributes through combinations of various visual cues (Chernoff, 1973).

Visualization Classification: Software Visualization. Software visualization helps people understand and use computer software effectively (Stasko et al., 1998). Program visualization helps programmers manage complex software (Baecker & Price, 1998) by visualizing the source code (Baecker & Marcus, 1990), data structures, and the changes made to the software (Eick et al., 1992). Algorithm animation is used to motivate and support the learning of computational algorithms. http://www.algomation.com/ http://www.algomation.com/algorithm/quick-sort-visualization

What is Information Visualization? Information visualization helps users identify patterns, correlations, or clusters. Structured information: graphical representation to reveal patterns; integration with various data mining techniques (Thearling et al., 2002; Johnston, 2002). Unstructured information: need to identify variables and construct visualizable structures. Definition: the depiction of information using spatial/graphical representations, to facilitate comparison, pattern recognition, change detection, and other cognitive skills by making use of the visual system.

Information Visualization. Problem: HUGE datasets; how to understand them? Solution: take better advantage of the human perceptual system; convert information into a graphical representation. Issues: How to convert abstract information into graphical form? Do visualizations do a better job than other methods?

Goals of Information Visualization: make large datasets coherent (present huge amounts of information compactly); present information from various viewpoints; present information at several levels of detail (from overviews to fine structure); support visual comparisons; tell stories about the data.

“Sci vis” versus “Info vis”. Scientific visualization: specifically concerned with data that has a well-defined representation in 2D or 3D space (e.g., from simulation mesh or scanner). *Adapted from The ParaView Tutorial, Moreland

Information visualization: concerned with data that does not have a well-defined representation in 2D or 3D space (i.e., “abstract data”).

Data Attributes. Infovis has more data types than numerical values.
Data Type    Attribute Domain    Operations            Examples
nominal      Unordered set       Comparison (=)        Text, references, syntax elements
ordinal      Ordered set         Ordering (=, <, >)    Ratings (e.g., bad, average, good)
discrete     Integer             Integer arithmetic    Lines of code
continuous   Real                Real arithmetic       Code metrics

Info Viz vs Sci Viz
                 Scivis                     Infovis
Data Domain      spatial, compact           non-spatial, abstract
Attribute Type   numerical                  any data type
Data Points      Samples over the domain    Tuples of attributes
Cells            Support interpolation      Describe relations
Interpolation    Piecewise continuous       Can be nonexistent

Information representation in InfoViz

Information Representation. Shneiderman (1996) proposed seven types of representation methods: 1-D, 2-D, 3-D, multidimensional, tree, network, and temporal approaches.

1-D TileBars (Hearst, 1995)

2-D & 3-D. 2-D: represents information as two-dimensional visual objects. Visualization systems based on the self-organizing map (SOM) (Kohonen, 1995) help users deal with the large number of categories created for mass textual data. 3-D: represents information as three-dimensional visual objects. The WebBook system folds web pages into three-dimensional books (Card et al., 1996). 3-D versions of a tree or network: the 3-D hyperbolic tree visualizes large-scale hierarchical relationships (Munzner, 2000). http://www.start.umd.edu/gtd/globe/index.html

Multidimensional: represents information as multidimensional objects and projects them into a three-dimensional or a two-dimensional space. A dimensionality reduction algorithm is used: multidimensional scaling (MDS), hierarchical clustering, k-means algorithms, principal component analysis. Examples: SPIRE system (Wise et al., 1995); VxInsight system (Boyack et al., 2002). Glyph representation has been used in various social visualization techniques (Donath, 2002) to describe human behavior during computer-mediated communication (CMC).
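As an illustration of the projection step, here is a minimal sketch of principal component analysis in plain Python. It is not taken from the slides or from SPIRE/VxInsight; the function names (`mean_center`, `covariance`, `top_eigenvector`, `pca_project`) are hypothetical, and power iteration with deflation is just one simple way to find the leading components, assuming a small dense dataset.

```python
import math
import random

def mean_center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenvector(matrix, iterations=200, seed=0):
    """Approximate the dominant eigenvector by power iteration."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance left in any direction
        v = [x / norm for x in w]
    return v

def pca_project(rows, k=2):
    """Project d-dimensional rows onto their first k principal components."""
    centered = mean_center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v = top_eigenvector(cov)
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        # Deflation: remove the found component so the next one can surface.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in centered]
```

Calling `pca_project(rows, k=2)` on a list of equal-length numeric rows returns one 2-D coordinate pair per row, which could then be drawn as a scatter plot.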

Table Visualization. Simple list; does not support analysis or insight.

Table Visualization. Aided by Sorting, Bar Graph, Evolution Icons. Dense Pixel Display: Bar Graph, Table Lens.

Tree: represents hierarchical relationships. Challenge: the number of nodes grows exponentially. Different layout algorithms have been applied. Examples: Tree-Map allocates space according to attributes of nodes (Johnson & Shneiderman, 1991). Cone Tree system uses a 3-D visual structure to pack more nodes on the screen (Robertson et al., 1991). Hyperbolic Tree projects subtrees on a hyperbolic plane and maps that plane onto the display (Lamping et al., 1995).

Tree Visualization: the TreeMap (Johnson & Shneiderman '91). Idea: show a hierarchy as a 2D layout. Fill up the space with rectangles representing objects; size on screen indicates the relative size of the underlying objects. Treemap method: visualize the tree structure using virtually every pixel of the display space to convey information. Every subtree is represented by a rectangle that is partitioned into smaller rectangles, which correspond to its children. The position of the slicing lines determines the relative sizes of the child rectangles. For every child, repeat the slicing recursively, swapping the slicing direction from vertical to horizontal or conversely.
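The recursive slicing described above can be sketched as follows. This is a minimal illustration rather than the original Treemap implementation; the dictionary-based tree format (`name`, `size`, `children`) and the function names are assumptions made for the example.

```python
def tree_size(node):
    """Size of a node: its own 'size' if it is a leaf, else the sum over its children."""
    children = node.get("children")
    if not children:
        return node["size"]
    return sum(tree_size(child) for child in children)

def slice_and_dice(node, x, y, w, h, vertical=True, out=None):
    """Return a list of (name, x, y, width, height) rectangles for the whole subtree.

    The node fills the rectangle (x, y, w, h). Its children partition that
    rectangle in proportion to their sizes, and the slicing direction
    alternates at every level of the tree.
    """
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = tree_size(node)
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in children:
        frac = tree_size(child) / total if total else 0.0
        if vertical:
            # Vertical slicing lines: children sit side by side along x.
            slice_and_dice(child, x + offset * w, y, frac * w, h, not vertical, out)
        else:
            # Horizontal slicing lines: children are stacked along y.
            slice_and_dice(child, x, y + offset * h, w, frac * h, not vertical, out)
        offset += frac
    return out
```

For example, `slice_and_dice(tree, 0, 0, 800, 600)` would return rectangles that tile an 800 by 600 canvas, with each leaf's area proportional to its `size`.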

Tree Visualization: Examples. Treemap.

Tree Visualization. Ball-and-stick visualization: uses the position and appearance of the glyphs. Examples: Rooted-Tree Layout, Radial-Tree Layout, 3D Cone-Tree Layout.

Graphs and Networks: represent complex relationships that a simple tree structure is insufficient to represent. Examples: citations among academic papers (C. Chen & Paul, 2001; Mackinlay et al., 1995); documents linked by the internet (Andrews, 1995). The spring-embedder model (Eades, 1984) along with its variants (Davidson & Harel, 1996; Fruchterman & Reingold, 1991) have become the most popular drawing algorithms.

Examples: Network visualization (vizster)

Temporal: represents information based on temporal order. Location and animation are commonly used visual variables to reveal the temporal aspect of information. Examples: Perspective Wall lists objects along the x-axis based on time sequence and presents attributes along the y-axis (Robertson et al., 1993). In the VxInsight system (Boyack et al., 2002), the landscape changes as the time changes.

Examples: Geo data mapping. Demo: Cyber Attacks http://map.norsecorp.com/#/

Additional Examples: http://map.norsecorp.com/v1/; NY Times (words, words, numbers); Visual Complexity (from book by Manuel Lima); 50 examples (from June 2009, somewhat dated); D3 Gallery.

Visualization components. User-Interface Interaction: color, size, texture, proximity, annotation, interactivity. Immediate interaction not only allows direct manipulation of the visual objects displayed but also allows users to select what is displayed (Card et al., 1999). Shneiderman (1996) summarizes six types of interface functionality: overview, zoom, filtering, details on demand, relate, and history. Information Analytics: indexing (extract the semantics of information) and analysis (clustering, classification).

Visualization pipeline: Acquire -> Parse -> Filter -> Mine -> Represent -> Refine -> Interact

Visualization software. Host language (C/C++/Java/Python) plus OpenGL. Stat/math package with graphics: R, MATLAB. Special-purpose info viz software: earth mapping, biological network visualization, etc. Browser-enabled graphics/info viz packages: Google Charts, Processing / Processing.js, D3, Java + Flash (becoming rarer).

Graph Drawing and Graph Visualization

Information Visualization vs. Graph Drawing. Graph Drawing: old topic, many books, etc.; may have other goals than visualization, e.g., VLSI design. Graph Visualization: size is the key issue; usability requires nodes to be discernible; navigation is considered.

Graph Visualization: hierarchical graph of the evolution of the UNIX operating system

Graph Visualization: The Call Graph. Three concentric rings show containment: files, classes, methods. The curved lines indicate function calls.

Graph visualization: Circle chart

When is Graph Visualization Applicable? Ask the question: is there an inherent relation among the data elements to be visualized? If YES, then the data can be represented by nodes of a graph, with edges representing the relations. If NO, then the data elements are “unstructured” and the goal is to use visualization to analyze and discover relationships among data. Source: Herman, Graph Visualization and Navigation in Information Visualization: a Survey

Traditional Graph Drawing. Optimization based on a set of criteria (mathematical aesthetics): minimize edge crossings, minimize area, maximize smallest angle, maximize symmetry. Doing all at once is hard. Often unsuitable for interactive visualization: many optimizations are NP-hard, and approximation algorithms are very complex. Precompute the layout, or compute it once at the beginning of an application, then support interaction. Slide adapted from Jeff Heer.

Traditional Graph Drawing: poly-line graphs (includes bends); orthogonal drawing; planar, straight-line drawing; upward drawing of DAGs.

Layout Approaches. Tree-ify the graph, then use a tree layout. Hierarchical graph layout. Radial graph layout. Optimization-based techniques (includes spring-embedding / force-directed layout). Adjacency matrices. Structurally-independent layout. On-demand revealing of subgraphs. Distortion-based views (hyperbolic browser). (This list is not meant to be exhaustive.)

Tree-based graph layout. Select a tree structure out of the graph: breadth-first-search tree, minimum spanning tree, or other domain-specific structures. Use a tree layout algorithm. Benefits: fast; supports interaction and refinement. Drawbacks: limited range of layouts.
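A minimal sketch of the first option, extracting a breadth-first-search tree from a graph, might look like this. The adjacency-dictionary input and the function names `bfs_tree` and `tree_depths` are assumptions for illustration; the resulting parent map and depths are what a tree layout algorithm would then consume.

```python
from collections import deque

def bfs_tree(adjacency, root):
    """Return a parent map describing the BFS spanning tree rooted at `root`.

    `adjacency` maps each node to a list of its neighbors. Edges of the graph
    that are not tree edges are simply left out of the result.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in parent:  # first visit decides the tree parent
                parent[neighbor] = node
                queue.append(neighbor)
    return parent

def tree_depths(parent):
    """Depth of every node in the tree, e.g. for use as a layout level."""
    depths = {}

    def depth(node):
        if node not in depths:
            up = parent[node]
            depths[node] = 0 if up is None else depth(up) + 1
        return depths[node]

    for node in parent:
        depth(node)
    return depths
```

Non-tree edges can still be drawn on top of the finished tree layout; they just do not influence node placement.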

Tree-ify the graph

Traditional Tree Layouts. H-tree layout: best for balanced trees. Radial view. Balloon view: related to the 3-D cone tree.

Hierarchical graph layout. Use the directed structure of the graph to inform the layout. Order the graph into distinct levels; this determines one dimension. Then optimize within levels (minimize edge crossings, etc.); this determines the second dimension. This is the method used in graphviz's “dot” algorithm. Great for directed acyclic graphs, but often misleading in the case of cycles.
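The level-assignment step can be sketched with longest-path layering, shown below. This is a simple stand-in chosen for illustration and is not claimed to be the ranking procedure that graphviz's dot actually uses; the function name `assign_layers` and the edge-list input are assumptions.

```python
def assign_layers(edges):
    """Assign each node of a DAG to a level using longest-path layering.

    `edges` is a list of (source, target) pairs. A node's level is the length
    of the longest path reaching it from any node without incoming edges.
    Raises ValueError if the graph contains a cycle.
    """
    nodes = set()
    successors = {}
    indegree = {}
    for source, target in edges:
        nodes.update((source, target))
        successors.setdefault(source, []).append(target)
        indegree[target] = indegree.get(target, 0) + 1

    layer = {node: 0 for node in nodes}
    ready = sorted(node for node in nodes if indegree.get(node, 0) == 0)
    processed = 0
    while ready:
        node = ready.pop()
        processed += 1
        for nxt in successors.get(node, []):
            # A node must sit below every one of its predecessors.
            layer[nxt] = max(layer[nxt], layer[node] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle; layering is undefined")
    return layer
```

The cycle check reflects the caveat on the slide: with cycles there is no consistent top-to-bottom ordering, so a real layout tool has to break or reverse some edges first.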

Hierarchical Graph Layout Evolution of the UNIX operating system Hierarchical layering based on descent

Hierarchical graph layout Gnutella network

Radial Layout Animated Exploration of Graphs with Radial Layout, Yee et al., 2001 Gnutella network

Optimization-based layout. Specify constraints for the layout as a series of mathematical equations, and hand them to a “solver” which tries to optimize the constraints. Examples: minimize edge crossings, line bends, etc.; multi-dimensional scaling (preserve multi-dimensional distance); force-directed placement (use a physics metaphor). Benefits: general applicability; often customizable by adding new constraints. Drawbacks: approximate constraint satisfaction; running time; “organic” look not always desired.

Example: Force-Directed Layout. Uses a physics model to lay out the graph: nodes repel each other, edges act as springs, and some amount of friction or drag force is used. Special techniques dampen “jitter”. Demos: http://getspringy.com/demo.html; visual wordnet http://www.kylescholz.com/projects/wordnet; visuwords http://www.visuwords.com/
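A minimal sketch of such a physics model is given below, loosely following the Fruchterman & Reingold formulation cited earlier (repulsion proportional to k²/d, attraction proportional to d²/k). The function name, the parameter defaults, and the simple geometric cooling schedule used to dampen jitter are assumptions for illustration, not the code behind any of the linked demos.

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=200, width=1.0, height=1.0, seed=0):
    """Return {node: [x, y]} after a fixed number of force-directed iterations."""
    rng = random.Random(seed)
    pos = {n: [rng.random() * width, rng.random() * height] for n in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal distance between nodes
    temperature = width / 10                    # caps how far a node may move per step

    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes repels.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force

        # Edges act as springs pulling their endpoints together.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[u][0] -= dx / dist * force
            disp[u][1] -= dy / dist * force
            disp[v][0] += dx / dist * force
            disp[v][1] += dy / dist * force

        # Move each node along its net force, limited by the temperature.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy) or 1e-9
            step = min(length, temperature)
            pos[n][0] += dx / length * step
            pos[n][1] += dy / length * step

        temperature *= 0.95  # cooling: shrinking steps dampen oscillation ("jitter")

    return pos
```

The all-pairs repulsion loop makes each iteration quadratic in the number of nodes, which is one concrete source of the running-time drawback noted for optimization-based layouts.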

Hyperbolic Browser: Inspiration

Using Distortion and Focus + Context: The Hyperbolic Tree Browser. The Hyperbolic Browser: A Focus + Context Technique for Visualizing Large Hierarchies, Lamping & Rao, CHI 1996. http://www.inxight.com/products/sdks/st/ Uses non-Euclidean geometry as the basis of a focus + context technique. The hyperbolic browser is a projection into a Euclidean space: a circle. The circumference of a Euclidean circle increases linearly with radius (2πr). The circumference of a circle in hyperbolic space increases exponentially with radius. Exponential growth in available space with linear growth of radius makes tree layout easy. Size of objects decreases with growth of radius, which reduces the expense of drawing trees when drawing is cut off at one pixel.
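To make the growth claim concrete, the short sketch below compares the two circumferences. It assumes the standard hyperbolic plane with curvature -1, where a circle of radius r has circumference 2π·sinh(r); the function names are hypothetical.

```python
import math

def euclidean_circumference(r):
    """Circumference of a circle of radius r in the Euclidean plane."""
    return 2 * math.pi * r

def hyperbolic_circumference(r):
    """Circumference of a circle of radius r in the hyperbolic plane (curvature -1)."""
    return 2 * math.pi * math.sinh(r)

if __name__ == "__main__":
    for r in (1, 2, 4, 8):
        print(r, euclidean_circumference(r), hyperbolic_circumference(r))
```

Since sinh(r) approaches e^r / 2 for large r, each additional unit of radius multiplies the available circumference by roughly e. This matches the way the number of nodes at each depth of a tree grows by a roughly constant factor, which is why every level can be given enough room.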

Appearance of Initial Layout. Root mapped at center. Multiple generations of children mapped out towards the edge of the circle. Drawing of nodes cuts off when less than one pixel. Examples: 364 nodes, uniform distribution; 1004 nodes, Poisson distribution (i.e., more realistic).

Structurally-Independent Layout. Ignore the graph structure. Base the layout on other attributes of the data. Examples: geography, time. Benefits: often very quick layout; optimizes communication of particular features. Drawbacks: may or may not present structure well.

Structurally Independent Layout: the “Skitter” layout of Internet connectivity. Angle = longitude (geography). Radius = degree (number of connections). http://www.caida.org/research/topology/as_core_network/2007/images/ascore-simple.2007_big.png Skitter, www.caida.org
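A polar layout of this kind can be sketched in a few lines. The slide only states that angle encodes longitude and radius encodes degree; the specific radius formula below (a normalized logarithm that places high-degree nodes near the center) and the function name `polar_layout` are assumptions for illustration, not CAIDA's published method.

```python
import math

def polar_layout(nodes):
    """Place nodes in the unit disk from attributes alone, ignoring edges.

    `nodes` maps a node name to (longitude_in_degrees, degree). The angle is
    the longitude; the radius shrinks toward 0 as the degree approaches the
    maximum degree in the dataset (assumed mapping).
    """
    max_degree = max(degree for _, degree in nodes.values())
    positions = {}
    for name, (longitude, degree) in nodes.items():
        angle = math.radians(longitude)
        if max_degree == 0:
            radius = 1.0  # no connections anywhere: everything on the rim
        else:
            radius = 1.0 - math.log(degree + 1) / math.log(max_degree + 1)
        positions[name] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions
```

Because no edge is consulted, the layout costs time linear in the number of nodes, which illustrates the "often very quick layout" benefit of structurally-independent approaches.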
