1
CPT-S 415 Big Data Yinghui Wu EME B45
2
CPT-S 415 Big Data: Data Visualization and Navigation
Information Visualization
Graph visualization
  Graph drawing and graph visualization
  Graph layout
(Courtesy of Ivan Herman)
3
“Close the loop” Data quality -> knowledge quality
(potential new area) Data interpretation Visual analytics Graph visualization
5
Information Visualization
6
Linking data to humans
Collecting information is no longer a problem, but extracting value from information collections has become progressively more difficult.
Visualization links the human eye and the computer, helping to identify patterns and to extract insights from large amounts of information.
Visualization technology shows considerable promise for increasing the value of large-scale collections of information.
Visualization can be classified as scientific visualization, software visualization, and information visualization.
7
Visualization Classification
Scientific visualization helps in understanding physical phenomena in data (Nielson, 1991)
  A mathematical model plays an essential role
  Isosurfaces, volume rendering, and glyphs are commonly used techniques
    Isosurfaces depict the distribution of certain attributes
    Volume rendering allows viewers to see the entire volume of 3-D data in a single image (Nielson, 1991)
    Glyphs provide a way to display multiple attributes through combinations of various visual cues (Chernoff, 1973)
8
Visualization Classification
Software visualization helps people understand and use computer software effectively (Stasko et al., 1998)
  Program visualization helps programmers manage complex software (Baecker & Price, 1998)
    Visualizing the source code (Baecker & Marcus, 1990), data structures, and the changes made to the software (Eick et al., 1992)
  Algorithm animation is used to motivate and support the learning of computational algorithms
9
What is Information Visualization?
Information visualization helps users identify patterns, correlations, or clusters
  Structured information
    Graphical representation to reveal patterns
    Integration with various data mining techniques (Thearling et al., 2002; Johnston, 2002)
  Unstructured information
    Need to identify variables and construct visualizable structures
Definition: the depiction of information using spatial/graphical representations, to facilitate comparison, pattern recognition, change detection, and other cognitive skills by making use of the visual system.
10
Information Visualization
Problem: HUGE datasets. How to understand them?
Solution
  Take better advantage of the human perceptual system
  Convert information into a graphical representation
Issues
  How to convert abstract information into graphical form?
  Do visualizations do a better job than other methods?
11
Goals of Information Visualization
Make large datasets coherent (present huge amounts of information compactly)
Present information from various viewpoints
Present information at several levels of detail (from overviews to fine structure)
Support visual comparisons
Tell stories about the data
12
“Sci vis” versus “Info vis”
Scientific visualization: specifically concerned with data that has a well-defined representation in 2D or 3D space (e.g., from simulation mesh or scanner).
*Adapted from The ParaView Tutorial, Moreland
13
Information visualization
Information visualization: concerned with data that does not have a well-defined representation in 2D or 3D space (i.e., “abstract data”).
14
Data Attributes
Data attributes: Infovis has more data types than numerical values

Data Type    Attribute Domain   Operations           Examples
nominal      Unordered set      Comparison (=)       Text, references, syntax elements
ordinal      Ordered set        Ordering (=, <, >)   Ratings (e.g., bad, average, good)
discrete     Integer            Integer arithmetic   Lines of code
continuous   Real               Real arithmetic      Code metrics
15
Info Viz vs Sci Viz

                 Scivis                    Infovis
Data Domain      spatial, compact          non-spatial, abstract
Attribute Type   numerical                 any data type
Data Points      samples over the domain   tuples of attributes
Cells            support interpolation     describe relations
Interpolation    piecewise continuous      may not exist
16
Information representation in InfoViz
17
Information Representation
Shneiderman (1996) proposed seven types of representation methods:
  1-D
  2-D
  3-D
  Multidimensional
  Tree
  Network
  Temporal approaches
18
1-D TileBars (Hearst, 1995)
19
2-D & 3-D
2D: to represent information as two-dimensional visual objects
  Visualization systems based on the self-organizing map (SOM) (Kohonen, 1995)
  To help users deal with the large number of categories created for mass textual data
3D: to represent information as three-dimensional visual objects
  WebBook system folds web pages into three-dimensional books (Card et al., 1996)
  3-D version of a tree or network
  3-D hyperbolic tree to visualize large-scale hierarchical relationships (Munzner, 2000)
20
Multidimensional
To represent information as multidimensional objects and project them into a three-dimensional or a two-dimensional space
  A dimensionality reduction algorithm is used
    Multidimensional scaling (MDS)
    Hierarchical clustering
    K-means algorithms
    Principal component analysis
Examples
  SPIRE system (Wise et al., 1995)
  VxInsight system (Boyack et al., 2002)
Glyph representation has been used in various social visualization techniques (Donath, 2002) to describe human behavior during computer-mediated communication (CMC)
21
Table Visualization Simple list; does not support analysis, or insight
22
Table Visualization Aided by Sorting, Bar Graph, Evolution Icons
Dense Pixel Display: Bar Graph, Table Lens
23
Tree
To represent hierarchical relationships
Challenge: the number of nodes grows exponentially with depth
Different layout algorithms have been applied
Examples
  Tree-Map allocates space according to attributes of nodes (Johnson & Shneiderman, 1991)
  Cone Tree system uses a 3-D visual structure to pack more nodes on the screen (Robertson et al., 1991)
  Hyperbolic Tree projects subtrees onto a hyperbolic plane and maps the plane to the display (Lamping et al., 1995)
24
Tree Visualization
The TreeMap (Johnson & Shneiderman '91)
Idea: show a hierarchy as a 2D layout
  Fill up the space with rectangles representing objects
  Size on screen indicates relative size of underlying objects
Treemap method: visualize the tree structure so that virtually every pixel of the display space conveys information
  Every subtree is represented by a rectangle that is partitioned into smaller rectangles, which correspond to its children.
  The position of the slicing lines determines the relative sizes of the child rectangles.
  For every child, repeat the slicing recursively, swapping the slicing direction from vertical to horizontal or vice versa.
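The recursive slicing described on this slide can be sketched in a few lines. This is a minimal illustration, not the original Johnson & Shneiderman code: the dictionary-based tree format (`name`, `size`, `children`) and the function names are assumptions made for the example.

```python
def tree_size(node):
    """Size of a node: its own 'size' if it is a leaf, else the sum over its children."""
    children = node.get("children", [])
    if not children:
        return node["size"]
    return sum(tree_size(child) for child in children)


def slice_and_dice(node, x, y, w, h, vertical=True, out=None):
    """Return (name, x, y, w, h) rectangles for every node of the tree.

    Each node's rectangle is split among its children in proportion to their
    sizes; the slicing direction alternates at every level of the tree.
    """
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    total = tree_size(node)
    offset = 0.0
    for child in node.get("children", []):
        frac = tree_size(child) / total if total else 0.0
        if vertical:
            # Vertical slicing lines: children sit side by side along x.
            slice_and_dice(child, x + offset, y, w * frac, h, False, out)
            offset += w * frac
        else:
            # Horizontal slicing lines: children are stacked along y.
            slice_and_dice(child, x, y + offset, w, h * frac, True, out)
            offset += h * frac
    return out


if __name__ == "__main__":
    # Hypothetical hierarchy used only for illustration.
    tree = {"name": "root", "children": [
        {"name": "a", "size": 2},
        {"name": "b", "children": [{"name": "c", "size": 1},
                                   {"name": "d", "size": 1}]},
    ]}
    for rect in slice_and_dice(tree, 0.0, 0.0, 4.0, 2.0):
        print(rect)
```

The sketch recomputes subtree sizes on every call, which keeps it short; a real implementation would cache them.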
25
Tree Visualization: Examples
Treemap
26
Tree Visualization
Ball-and-stick visualization: use the position and appearance of the glyphs
  Rooted-Tree Layout
  Radial-Tree Layout
  3D Cone-Tree Layout
27
Graphs and Networks
To represent complex relationships that a simple tree structure is insufficient to represent
  Citations among academic papers (C. Chen & Paul, 2001; Mackinlay et al., 1995)
  Documents linked by the internet (Andrews, 1995)
The spring-embedder model (Eades, 1984), along with its variants (Davidson & Harel, 1996; Fruchterman & Reingold, 1991), has become the most popular family of drawing algorithms.
28
Examples
Network visualization (vizster)
29
To represent information based on temporal order
Location and animation are commonly used visual variables to reveal the temporal aspect of information
Examples
  Perspective Wall lists objects along the x-axis based on time sequence and presents attributes along the y-axis (Robertson et al., 1993)
  In the VxInsight system (Boyack et al., 2002), the landscape changes as time changes.
30
Examples Geo data mapping Demo Cyber Attacks
31
31
Additional Examples http://map.norsecorp.com/v1/
NY Times: words, words, numbers
Visual Complexity (from book by Manuel Lima)
50 examples (from June 2009, somewhat dated)
D3 Gallery
32
Visualization components
User-Interface Interaction: Color, Size, Texture, Proximity, Annotation, Interactivity
  Immediate interaction not only allows direct manipulation of the visual objects displayed but also allows users to select what is displayed (Card et al., 1999)
  Shneiderman (1996) summarizes six types of interface functionality: overview, zoom, filtering, details on demand, relate, history
Information Analytics
  Indexing: extract the semantics of information
  Analysis: clustering, classification
33
Visualization pipeline
Acquire -> Parse -> Filter -> Mine -> Represent -> Refine -> Interact
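As a rough illustration of how the first stages of this pipeline chain together, the sketch below runs a toy dataset through acquire, parse, filter, mine, and represent. Everything in it is an assumption made for the example: the inline CSV data, the function names, and the text bar chart standing in for a real graphical representation. The refine and interact stages are left out because they depend on the display toolkit.

```python
import csv
import io
from collections import Counter

# Hypothetical raw data; a real pipeline would read a file or query a service.
RAW = "city,category\nPullman,A\nSeattle,B\nPullman,A\nSpokane,\nSeattle,A\n"


def acquire():
    """Acquire: obtain the raw data."""
    return RAW


def parse(text):
    """Parse: give the raw text structure (one dict per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows):
    """Filter: drop rows with a missing category."""
    return [row for row in rows if row["category"]]


def mine(rows):
    """Mine: compute a simple statistic, here the number of rows per city."""
    return Counter(row["city"] for row in rows)


def represent(counts):
    """Represent: choose a basic visual model, here a text bar chart."""
    return [f"{city:<8} {'#' * n}" for city, n in sorted(counts.items())]


if __name__ == "__main__":
    for line in represent(mine(filter_rows(parse(acquire())))):
        print(line)
```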
34
Visualization software
Host language (C/C++/Java/Python) plus OpenGL
Stat/math package with graphics
  R
  MATLAB
Special-purpose info viz software
  Earth mapping, biological network visualization, etc.
Browser-enabled graphics/info viz packages
  Google Charts
  Processing / Processing.js
  D3
  Java + Flash (becoming rarer)
35
Graph Drawing and Graph Visualization
36
36
Information Visualization vs. Graph Drawing
Graph Drawing
  Old topic, many books, etc.
  May have other goals than visualization, e.g., VLSI design
Graph Visualization
  Size is a key issue
  Usability requires nodes to be discernible
  Navigation is considered
37
Graph Visualization
Hierarchical graph of the evolution of the UNIX operating system
38
Graph Visualization The Call Graph
Three concentric rings show containment
  Files
  Classes
  Methods
The curved lines indicate function calls
39
Graph visualization
Circle chart
40
When is Graph Visualization Applicable?
Ask the question: is there an inherent relation among the data elements to be visualized?
  If YES, then the data can be represented by the nodes of a graph, with edges representing the relations.
  If NO, then the data elements are “unstructured” and the goal is to use visualization to analyze and discover relationships among the data.
Source: Herman, Graph Visualization and Navigation in Information Visualization: a Survey
41
Traditional Graph Drawing
Optimization based on a set of criteria (mathematical aesthetics)
  Minimize edge crossings
  Minimize area
  Maximize smallest angle
  Maximize symmetry
Doing all of these at once is hard.
Often unsuitable for interactive visualization
  Many of the optimizations are NP-hard
  Approximation algorithms are very complex
  Precompute the layout, or compute it once at the beginning of an application, then support interaction
Slide adapted from Jeff Heer
42
Traditional Graph Drawing
poly-line graphs (includes bends) orthogonal drawing planar, straight-line drawing upward drawing of DAGs
43
Layout Approaches
Tree-ify the graph, then use tree layout
Hierarchical graph layout
Radial graph layout
Optimization-based techniques
  Includes spring-embedding / force-directed layout
Adjacency matrices
Structurally-independent layout
On-demand revealing of subgraphs
Distortion-based views
  Hyperbolic browser
(this list is not meant to be exhaustive)
44
Tree-based graph layout
Select a tree structure out of the graph
  Breadth-first-search tree
  Minimum spanning tree
  Other domain-specific structures
Use a tree layout algorithm
Benefits
  Fast, supports interaction and refinement
Drawbacks
  Limited range of layouts
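A breadth-first-search tree, the first option listed on this slide, can be extracted as follows. This is a minimal sketch: the adjacency-list dictionary and the function name `bfs_tree` are assumptions made for the example. The returned parent map can then be handed to any tree layout algorithm, and the edges left out of the tree can be drawn on top afterwards.

```python
from collections import deque


def bfs_tree(adjacency, root):
    """Return a parent map describing the BFS tree of the graph rooted at `root`.

    `adjacency` maps each node to a list of its neighbors. Nodes that cannot
    be reached from `root` do not appear in the result.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in parent:  # first visit fixes the tree edge
                parent[neighbor] = node
                queue.append(neighbor)
    return parent


if __name__ == "__main__":
    # Hypothetical undirected graph that contains cycles.
    graph = {
        "a": ["b", "c"],
        "b": ["a", "c", "d"],
        "c": ["a", "b", "d"],
        "d": ["b", "c"],
    }
    print(bfs_tree(graph, "a"))
```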
45
Tree-ify the graph
46
Traditional Tree Layouts
H-tree layout: best for balanced trees Radial view Balloon view: related to 3-d cone tree
47
Hierarchical graph layout
Use the directed structure of the graph to inform layout
Order the graph into distinct levels
  This determines one dimension
Now optimize within levels
  This determines the second dimension
  Minimize edge crossings, etc.
The method used in graphviz’s “dot” algorithm
Great for directed acyclic graphs, but often misleading in the case of cycles
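The first step, ordering the graph into levels, can be illustrated with longest-path layering over a topological order. This is a simple stand-in chosen for the example and not necessarily the ranking procedure that dot itself uses; the function name and the edge-list input format are likewise assumptions. The sketch rejects cyclic input, which reflects the caveat about cycles on this slide.

```python
from collections import deque


def assign_layers(nodes, edges):
    """Assign each node of a DAG a level: the longest path length from any source.

    `edges` is a list of (source, target) pairs. Raises ValueError on a cycle.
    """
    successors = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    layer = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:  # Kahn's algorithm: process nodes in topological order
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    if visited != len(nodes):
        raise ValueError("graph has a cycle; break cycles before layering")
    return layer


if __name__ == "__main__":
    nodes = ["a", "b", "c", "d"]
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")]
    print(assign_layers(nodes, edges))
```

Ordering the nodes within each level to reduce edge crossings is a separate optimization step that is not shown here.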
48
Hierarchical Graph Layout
Evolution of the UNIX operating system Hierarchical layering based on descent
49
Hierarchical graph layout
Gnutella network
50
Radial Layout Animated Exploration of Graphs with Radial Layout, Yee et al., 2001 Gnutella network
51
Optimization-based layout
Specify constraints for layout
  Series of mathematical equations
  Hand to a “solver”, which tries to optimize the constraints
Examples
  Minimize edge crossings, line bends, etc.
  Multi-dimensional scaling (preserve multi-dimensional distance)
  Force-directed placement (use physics metaphor)
Benefits
  General applicability
  Often customizable by adding new constraints
Drawbacks
  Approximate constraint satisfaction
  Running time; “organic” look not always desired
52
Example: Force-Directed Layout
Uses a physics model to lay out the graph: nodes repel each other, edges act as springs, and some amount of friction or drag force is applied.
Special techniques are used to dampen “jitter”.
Examples: visual wordnet, visuwords
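The physics metaphor can be made concrete with a small simulation. The sketch below follows the general shape of the Fruchterman & Reingold (1991) scheme (repulsion proportional to k²/d between all node pairs, attraction proportional to d²/k along edges, and a maximum step that shrinks every iteration to damp jitter), but the parameter values, the function name, and the cooling schedule are assumptions chosen for illustration rather than a reference implementation.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, k=1.0, step=0.1, damping=0.95, seed=0):
    """Return {node: [x, y]} positions from a simple force-directed simulation."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                force = k * k / dist
                disp[a][0] += dx / dist * force
                disp[a][1] += dy / dist * force
                disp[b][0] -= dx / dist * force
                disp[b][1] -= dy / dist * force
        # Spring attraction along every edge.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            force = dist * dist / k
            disp[a][0] -= dx / dist * force
            disp[a][1] -= dy / dist * force
            disp[b][0] += dx / dist * force
            disp[b][1] += dy / dist * force
        # Move each node along its net force, capped by the current step size.
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy)
            if length > 0:
                scale = min(length, step) / length
                pos[n][0] += dx * scale
                pos[n][1] += dy * scale
        step *= damping  # cooling: smaller moves over time dampen jitter
    return pos


if __name__ == "__main__":
    layout = force_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])
    for name, (x, y) in layout.items():
        print(name, round(x, 3), round(y, 3))
```

The all-pairs repulsion loop is quadratic in the number of nodes, which is one reason running time appears among the drawbacks of optimization-based layout.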
53
Hyperbolic Browser: Inspiration
54
Using Distortion and Focus + Context
The Hyperbolic Tree Browser
The Hyperbolic Browser: A Focus + Context Technique for Visualizing Large Hierarchies, Lamping & Rao, CHI 1996.
Uses non-Euclidean geometry as the basis of a focus + context technique
  The hyperbolic browser is a projection into a Euclidean space: a circle
  In Euclidean space, the circumference of a circle increases linearly with radius (2πr)
  In hyperbolic space, the circumference of a circle increases exponentially with radius
Exponential growth in the space available with linear growth of radius
  Makes tree layout easy
Size of objects decreases with growth of radius
  Reduces the expense of drawing trees when cut off at one pixel
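The growth rates behind this argument can be written out explicitly. The formulas below assume the standard hyperbolic plane with constant curvature -1, a normalization the slides do not state; b denotes a hypothetical branching factor.

```latex
% Circumference of a circle of radius r
C_{\text{Euclidean}}(r) = 2\pi r
\qquad
C_{\text{hyperbolic}}(r) = 2\pi \sinh r = \pi\left(e^{r} - e^{-r}\right) \approx \pi e^{r}
\quad \text{for large } r

% A tree with branching factor b has b^{d} nodes at depth d.
% Placing depth d at radius r_d = c\,d gives room proportional to e^{c d},
% which keeps pace with b^{d} whenever c \ge \ln b.
```

This is why a tree whose node count grows exponentially with depth can be given roughly constant room per node in hyperbolic space, whereas in the Euclidean plane the room per node shrinks at every level.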
55
Appearance of Initial Layout
Root mapped at center
Multiple generations of children mapped out towards the edge of the circle
Drawing of nodes cuts off when they are smaller than one pixel
Examples
  364 nodes: uniform distribution
  1004 nodes: Poisson distribution (i.e., more realistic)
56
Structurally-Independent Layout
Ignore the graph structure.
Base the layout on other attributes of the data
Examples:
  Geography
  Time
Benefits
  Often very quick layout
  Optimizes communication of particular features
Drawbacks
  May or may not present structure well
57
Structurally Independent Layout
The “Skitter” Layout
Internet connectivity
Angle = longitude (geography)
Radius = degree (number of connections)
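A polar placement of this kind takes only a few lines, since it never looks at the edges. The sketch below is an illustration under stated assumptions: the slide only says that radius encodes degree, so the specific mapping used here (logarithmic, with the best-connected node at the center and degree-0 nodes on the unit circle) is a guess made for the example, as are the function name and the input format.

```python
import math


def skitter_positions(nodes):
    """Map {name: (longitude_degrees, degree)} to {name: (x, y)} in the unit disk.

    Angle comes from longitude. Radius shrinks as degree grows (assumed
    mapping): the highest-degree node lands at the center, degree 0 on the rim.
    """
    max_degree = max(degree for _, degree in nodes.values())
    positions = {}
    for name, (longitude, degree) in nodes.items():
        theta = math.radians(longitude)
        if max_degree == 0:
            radius = 1.0  # no connections anywhere: everything sits on the rim
        else:
            radius = 1.0 - math.log(degree + 1) / math.log(max_degree + 1)
        positions[name] = (radius * math.cos(theta), radius * math.sin(theta))
    return positions


if __name__ == "__main__":
    # Hypothetical nodes: (longitude in degrees, number of connections).
    sample = {"hub": (-77.0, 1000), "mid": (2.3, 50), "leaf": (139.7, 1)}
    for name, (x, y) in skitter_positions(sample).items():
        print(name, round(x, 3), round(y, 3))
```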