Interactive Visualization of Large Graphs and Networks

Interactive Visualization of Large Graphs and Networks Tamara Munzner Stanford University Computer Science Department

Contributions analysis of three software systems relating intended tasks to spatial layout, visual encoding choices two novel layout/drawing algorithms scalable targeted This thesis has contributions in both analysis and algorithms. We present three case studies, and analyze them by relating the intended tasks to our choices for spatial layout and visual encoding. We also present two novel algorithms for layout and drawing. One is aimed at scalability, the other is highly targeted to be effective for a very specific task.

Three Visualization Systems general domain specific graph drawing infovis H3 PM Const H3 web hyperlinks quasi-hierarchical Planet Multicast MBone tunnels find poorly placed Constellation parsed dictionaries refine algorithms The three case studies of interactive graph visualization systems fall onto a range of domain specificity, but all three are more targeted than most traditional graph drawing systems. The H3 system is the most general. It began as an effort to show the hyperlink structure of a web site, and it is well-suited for an entire class of graphs that we call “quasi-hierarchical”, which I’ll define later in the talk. The Planet Multicast system shows the tunnel structure of the Internet’s multicast backbone, or MBone, and was intended to help maintainers of the MBone find badly placed tunnels that were potentially wasting scarce bandwidth resources. It is more targeted than H3, but geographic network display is useful in many more contexts than this one. Constellation is the most targeted system of the three. The datasets that it displays are paths through a large semantic network created by parsing entire online dictionaries. The goal of the project was to help a small group of computational linguists refine their algorithms for creating and querying this large network by creating a specialized visualization system that incorporated a great deal of domain-specific information into its design. I present a detailed analysis of interactive graph viz systems aimed at three different domains.

Talk Outline graph drawing, information visualization background software systems goal previous work video discussion evaluation general discussion conclusion I’ll begin this talk with some background on graph drawing. I’ll then present each case study in turn, starting with the goal, showing the previous work for context, playing a video of the system, discussing some of the design issues, and evaluating its success. I’ll then discuss some general design issues that pertain to all three systems, and end with concluding remarks.

Graph Drawing automatic layout and drawing of node-link graphs Hofstadter. Godel, Escher, Bach. Gansner and North. Improved force-directed layouts. The field of graph drawing is about the automatic layout and drawing of node-link graphs. Drawing graphs by hand on paper is difficult and time-consuming. The figure on the left from Godel, Escher, Bach is a heroic effort which we can regard as an upper bound on what’s possible to do by hand. The picture on the right was generated totally automatically by North and Gansner, using the program neato followed by a Voronoi-diagram based refinement pass. GEB: 150? nodes, 500+ edges?

Goal: help humans understand aesthetic criteria minimize crossings expose structure: hierarchy, symmetry, circular The purpose of drawing a graph in this context is to help humans understand it, as opposed to some other goal like VLSI layout. In order to do this, researchers have proposed aesthetic criteria like minimizing the number of edge-edge and edge-node crossings, and built programs that attempt to expose underlying structure. The figures below show the Tom Sawyer Toolkits in action for hierarchical, symmetric, and circular layouts. Tom Sawyer Software. Hierarchical Toolkit Tom Sawyer Software. Symmetric Toolkit Tom Sawyer Software. Circular Toolkit

System Scalability, Data Set Size previous systems H3 data sets my systems Planet Multicast mid-size web sites Constellation Web (pages) exceptional GD systems (dot, Gem3D) MBone (tunnels) Stanford graphics site most GD systems my site Net (routers) dictionary Net (hosts) manual GEB figure The Achilles heel of all of these systems is severely limited scalability. Very few of the graph drawing systems in the literature can handle more than one hundred nodes. A few exceptions like dot and Gem3D scale to several hundred nodes. But most real-world datasets are much larger than this, some by several orders of magnitude. The horizontal axis of node count on this figure is log scale. The hand-drawn GEB figure shown earlier is a few hundred nodes, my own personal web site has a few thousand individual URLs, and the mid-range Stanford graphics web site has well over 100,000 documents. The size of network examples ranges from several thousand MBone tunnels in 1996, to somewhere around 90,000 core Internet routers, to estimates of 70 million hosts on the whole Net. A natural language dictionary can contain millions of words, all defined in terms of each other. Finally, recent estimates of the Web put its size at over a billion pages. My three software systems are all attempts to handle large real-world datasets. Planet Multicast at several hundred, Constellation up to a thousand or so, and H3 is specifically designed for scalability and can handle over 100,000 nodes. [Figure axis: node count, log scale, 10 to 1B]

Fundamental Idea extend reach of graph drawing with information visualization approach techniques interactivity incorporate domain-specific information The premise underlying my work is that we can extend the capabilities of graph drawing by taking an information visualization approach to the problem. I exploit sophisticated interaction techniques and incorporate domain-specific information.

Information Visualization external visual representation of data, exploits perceptual system to reduce human cognitive load find appropriate visual metaphor for data that is not implicitly spatial The field of information visualization is built around the idea that an external visual representation of data can exploit the perceptual system to reduce the cognitive load on a human. The key problem is usually posed as finding an appropriate visual metaphor for data that is not inherently spatial.

Interactivity mimic reality beyond 2D paper: pan, zoom 3D object: rotate, translate, scale beyond semantics impossible in real world distortion, multi-scale Interactivity is the great challenge & opportunity of computer-based viz. There’s a long and successful historical tradition of exposition using static paper and physical objects. There have been more recent efforts with the dynamic media of film and video. But the advent of computers brings interactivity to the table, and that offers unprecedented power and flexibility. A very basic way to interact with a computer display is to mimic reality: we can program virtual paper that we can zoom and pan like real paper, and virtual 3D objects that can rigidly rotate, translate, and scale. We can go beyond the simple imitations of reality, and tie user input to the visual display to get semantics impossible in the real world. For instance, distortion-based methods for seeing a large context around an area of focus, or multiscale methods where the visual appearance of an object changes radically depending on distance.

Domain/Task Focus user-centered design, ethnography understand high level goals maintain web site break down into lower level tasks minimize user navigation to important pages find and fix broken links design visual encoding evaluate effectiveness A hallmark of many infovis systems is a focus on the domains and tasks of a group of intended users. This involves methods from user centered design and ethnography, working with people to understand their high level goals. For example, the goal of webmasters would be to create and maintain a web site. But these goals are too high level to address with software. We have to break them down into tasks at a lower level, like minimizing the number of hops users have to make to reach important pages, or finding and fixing broken links. That’s specific enough that we can design a visual encoding to help support it, and gives us a handle on evaluating the effectiveness of the resulting system.
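
The two low-level webmaster tasks named on this slide (hop counts to important pages, broken links) can be made concrete with a small sketch. This is only an illustration: the `site` dictionary, the function names, and the definition of "broken" as "target is not a known page" are assumptions, not part of the talk.

```python
from collections import deque

def hops_from_home(links, home):
    """Breadth-first search: minimum number of link clicks from `home` to each reachable page."""
    hops = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in hops:
                hops[target] = hops[page] + 1
                queue.append(target)
    return hops

def broken_links(links):
    """Links whose target is not a known page (an assumed definition of 'broken')."""
    return [(src, dst)
            for src, targets in links.items()
            for dst in targets
            if dst not in links]

# Hypothetical toy site: page -> pages it links to.
site = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/widget", "/missing"],
    "/products/widget": [],
}

hops = hops_from_home(site, "/")
broken = broken_links(site)
print(hops)
print(broken)
```

A page with a large hop count is a candidate for a shortcut link, and each pair in `broken` is a link to repair.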

Evaluating Visualization Systems quantitative algorithmic improvements conceptual framework analysis impact/adoption user studies anecdotal evidence Evaluating a visualization system is much more difficult than evaluating most graphics systems, because it’s hard to judge whether some piece of software really helped somebody get something done better. Something like a rendering system can be quantitatively evaluated based on whether it’s faster or more photorealistic than previous work. The goals are clear because low-level psychophysics are reasonably well understood. Part of visualization evaluation is similarly quantitative: algorithmic improvements to show that some technique is faster or scales to larger datasets. But that’s only a small part of the picture. Conceptual frameworks provide a very powerful way to analyze the design choices in a system. The impact that a system has in terms of the number of people that choose to use it is another way to judge its worth. User studies can be more rigorous, since they can test not only for whether people liked it, but whether it actually improved their performance. Finally, anecdotal evidence of discoveries that the user attributes to insights from the system is important in cases where user studies are infeasible because the target audience is small and/or the task is something like scientific discovery.

System 1: H3 time: 1996-8 data: web hyperlinks goal: scalability quasi-hierarchical graphs: can find reasonable spanning tree using domain-specific information goal: scalability method: 3D hyperbolic I created the H3 system between 1996 and 1998. The initial dataset was web hyperlinks. It’s intended for what I call quasi-hierarchical graphs: graphs that are considerably more dense than trees, but where we can use domain-specific information to find a reasonable spanning tree that’s close to the user’s mental model of the structure. My fundamental choice was to trade off generality for scalability. Both layout and drawing occur in 3D hyperbolic space.
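
To illustrate what "find a reasonable spanning tree using domain-specific information" can mean for web hyperlinks, here is a minimal sketch. It grows a tree greedily from the root and always takes the best-scoring link into a page that has no parent yet. The scoring rule (shared URL path prefix) and all names are assumptions chosen for illustration; the talk does not specify the rule H3 uses.

```python
import heapq

def shared_prefix_len(parent_url, child_url):
    """Assumed domain heuristic: number of leading URL path segments two pages share."""
    n = 0
    for a, b in zip(parent_url.split("/"), child_url.split("/")):
        if a != b:
            break
        n += 1
    return n

def spanning_tree(links, root, score):
    """Greedy tree growth from `root`; returns (parent map, non-tree links)."""
    parent = {root: None}
    heap = []
    counter = 0  # tie-breaker so equal scores pop in insertion order

    def push_edges(src):
        nonlocal counter
        for dst in links.get(src, []):
            if dst not in parent:
                heapq.heappush(heap, (-score(src, dst), counter, src, dst))
                counter += 1

    push_edges(root)
    while heap:
        _, _, src, dst = heapq.heappop(heap)
        if dst in parent:
            continue  # already attached through a better-scoring link
        parent[dst] = src
        push_edges(dst)

    # Every link that is not a tree edge is kept separately as a non-tree link.
    non_tree = [(s, d) for s, ds in links.items() for d in ds if parent.get(d) != s]
    return parent, non_tree

# Hypothetical toy site.
links = {
    "/": ["/docs/", "/docs/api/"],
    "/docs/": ["/docs/api/", "/"],
    "/docs/api/": ["/"],
}
parent, non_tree = spanning_tree(links, "/", shared_prefix_len)
print(parent)
print(non_tree)
```

In this toy input `/docs/api/` is attached under `/docs/` rather than directly under the home page, which matches the directory structure a user would probably expect; the remaining links stay available as non-tree links.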

Background: Hyperbolic Space Focus+Context distortion project from infinite hyperbolic to finite euclidean pick best model for useful distortion conformal: geodesics warped projective: angles warped I’ll begin with a little bit of background about the two important properties of hyperbolic geometry that I exploit. There are known methods for projecting infinite hyperbolic space to a finite portion of Euclidean space, which provides a view with a large amount of context around a particular focus point. In any projection from one metric space to a different one, distortion is inevitable. The distortion of a 2D map created from a 3D globe is a familiar example. There are multiple cartographic projections: those suited for ocean navigation distort areas. If you have a projection that minimizes area distortion, you can’t draw straight lines on it that work for navigation. The same holds for hyperbolic projections: we want to choose the distortions that are the best for information visualization. The projection that we want (they’re called models by mathematicians) is one where 4x4 matrix

Background: Hyperbolic Space exponential room in space exponential number of tree nodes 2D hyperbolic plane hyperbolic hemisphere area exponential: 2π sinh² r euclidean hemisphere area geometric: 2πr² Thurston and Weeks, The Mathematics of Three Dimensional Manifolds, Scientific American
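
A quick numeric comparison shows how fast the room grows. The sketch below reads the slide's two area formulas as 2π sinh² r for the hyperbolic hemisphere and 2πr² for the euclidean one; the function names and the sample radii are illustrative choices.

```python
import math

def hyperbolic_hemisphere_area(r):
    """Hemisphere area in hyperbolic space: 2*pi*sinh(r)^2."""
    return 2 * math.pi * math.sinh(r) ** 2

def euclidean_hemisphere_area(r):
    """Hemisphere area in euclidean space: 2*pi*r^2."""
    return 2 * math.pi * r ** 2

# The ratio simplifies to (sinh(r) / r)^2, which grows exponentially with r.
ratios = []
for r in (1, 2, 4, 8):
    ratio = hyperbolic_hemisphere_area(r) / euclidean_hemisphere_area(r)
    ratios.append(ratio)
    print(f"r={r}: hyperbolic/euclidean area ratio = {ratio:.3g}")
```

Because a tree's node count also grows exponentially with depth, the hyperbolic hemisphere can offer enough surface for each new level of children where the euclidean one cannot.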

Previous Work: Hierarchies Cone Trees [Robertson, Mackinlay, Card 91] Tree Maps [Johnson, Shneiderman 91] There’s been a fair amount of work in visualizing hierarchies. The most influential paper was on the Cone Tree System from Robertson, Mackinlay, and Card at Xerox PARC. My H3 layout algorithm is one of many extensions. Treemaps are a very different approach to showing hierarchies that are much better suited for spotting outliers than understanding structure. Area-based methods like the treemap-based one on the right are not suitable for scaling to really big datasets. Robertson, Mackinlay and Card. Cone Trees: Animated 3D visualizations of hierarchical information. Johnson and Shneiderman. Treemaps: A Space-filling Approach to the Visualization of Hierarchical Information distortion: [Furnas, Brown, Carpendale, Keahey]

Previous Work: Distortion & Hierarchy 2D Hyperbolic Tree [Lamping, Rao, Pirolli 94,95] scalability analysis later Fractal [Koike, Yoshihara 93] SHriMP [Storey, Muller 95] don’t scale taxonomy [Noik 94] The systems most relevant for H3 are those that combine distortion views with hierarchy or graph drawing. I’ll compare the 2D hyperbolic tree browser and the 3D hyperbolic webviz system to H3 in detail later in the talk. The paper on fractal approaches for visualizing huge hierarchies has no explicit scaling claims, but no figure has more than 1000 nodes. The SHriMP multiscale viewer had only extremely small datasets in the example figures. Noik’s taxonomy covers work before 1994. Lamping, Rao, and Pirolli. A Focus+Context Technique Based on Hyperbolic Geometry for Visualizing Large Hierarchies.

Concurrent Work: Nicheworks Nicheworks [Wills 97] layout scales to 1M nodes linked views multiple layout approaches very different visual metaphor There are only two recent systems that have a similar approach to merging graph drawing and information visualization, both of which were published during the time range of my H3 papers. The Nicheworks system scales to huge graphs, one order of magnitude larger than what H3 claims. The user can choose one of multiple layout algorithms, and the system supports linked views. The visual metaphor is quite different from H3. Wills et al. Nicheworks.

Concurrent Work: Skeletonization Skeletonization [Herman 98] abstractions for tree structure Herman’s skeletonization work is also a mixture of infovis and graph drawing, and addresses a complementary problem to H3. Skeletonization provides an abstracted global overview, while H3 strives to create the largest possible local view. Herman et al. Skeletonization

H3 Layout novel layout algorithm detailed in thesis hemisphere surface instead of linear circumference bottom-up pass: compute hemisphere sizes top-down pass: place child on parent surface
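
The two passes named on this slide can be sketched as a pair of tree traversals. This is a heavily simplified illustration, not the algorithm detailed in the thesis: the leaf radius, the rule that a parent hemisphere needs at least the summed disc area of its children, and the evenly spaced azimuth placement are all assumptions. Only the two hyperbolic area formulas (disc area 2π(cosh r − 1), hemisphere area 2π sinh² r) are standard facts.

```python
import math

LEAF_RADIUS = 0.1  # hypothetical constant for childless nodes

def disc_area(r):
    """Area of a hyperbolic disc of radius r."""
    return 2 * math.pi * (math.cosh(r) - 1)

def hemisphere_radius(area):
    """Radius of the hyperbolic hemisphere whose surface area is `area`."""
    return math.asinh(math.sqrt(area / (2 * math.pi)))

def compute_radii(tree, node, radii):
    """Bottom-up pass: size each parent hemisphere from its children's footprints."""
    children = tree.get(node, [])
    if not children:
        radii[node] = LEAF_RADIUS
    else:
        needed = sum(disc_area(compute_radii(tree, c, radii)) for c in children)
        radii[node] = hemisphere_radius(needed)
    return radii[node]

def place_children(tree, node, placement):
    """Top-down pass: attach each child to its parent's hemisphere surface.

    Placeholder rule: evenly spaced azimuth angles around the parent.
    """
    children = tree.get(node, [])
    for i, child in enumerate(children):
        azimuth = 2 * math.pi * i / len(children)
        placement[child] = (node, azimuth)
        place_children(tree, child, placement)

# Hypothetical toy tree: node -> list of children.
tree = {"root": ["a", "b"], "a": ["a1", "a2", "a3", "a4"], "b": ["b1", "b2"]}
radii, placement = {}, {}
compute_radii(tree, "root", radii)
place_children(tree, "root", placement)
print(radii)
print(placement)
```

Sizes have to be known before anything is placed, which is why the bottom-up pass runs first. A real layout would also need slack for packing discs on a curved surface.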

Information Density: Scale Lamping, Rao, and Pirolli. A Focus+Context Technique Based on Hyperbolic Geometry for Visualizing Large Hierarchies.

Information Density: Codimension want balance between clutter and void topological approach to describing density difference between structure and surrounding space sparse dense Carpendale, Cowperthwaite, and Fracchia. Extending Distortion Viewing from 2D to 3D.

Evaluation: Scalability drawing: constant incremental exception: precision layout: linear in |E| 110,000 edges in 12 seconds given DFS input limits: computational: global layout in main memory cognitive: disorientation past ~100K nodes large neighborhood not global overview future: landmarks, LOD, abstraction

Evaluation: Impact product from SGI research use of library viewer use Site Manager aimed at web content creators bundled starting with Irix 6.3 research use of library interface for Skitter Internet tomography data analysis of Autonomous System data viewer use 6 researchers converted data to use viewer image use 6 reprint requests 6 or so: function call graphs, co-citation graphs, biodiversity taxonomies, medical informatics knowledge base, ASes

Evaluation: User Study [Risden, Czerwinski, Munzner, Cook 00] compared 3 browsers for adding content to collection of web pages snap portal (Yahoo style) XML3D: H3 + lists collapsible tree

User Study Results reliably faster for existing category task no decline in quality for new category task differences statistically significant differences statistically insignificant could use more studies to tease apart influence of h3, how and why it’s effective, which view components can be usefully linked. nevertheless it is gratifying to have some real data about tasks for which it’s effective. augment other views? more info than other views: snap: siblings not displayed collapsible shows parent/child/sibling, but only for one parent

System 2: Planet Multicast time: 1996 joint work: Hoffman, Claffy, Fenner data: MBone tunnels task: find badly placed tunnels goal: simple baseline method: 3D geographic We built the Planet Multicast system in 1995 and 1996 to help the maintainers of the Internet’s multicast backbone find badly placed tunnels that were potentially wasting scarce network resources. We used a known 3D geographic metaphor of arcs on a globe. We wanted to try the obvious thing and see how well it worked as a simple baseline. This project was joint work with Eric Hoffman, K. Claffy, and Bill Fenner.

Previous Work: Geographic Network SeeNet3D [Cox, Eick 95] arcs on globe layout Cox and Eick. 3D Displays of Network Traffic. SeeNet [Becker, Eick, Wilks 95] The most relevant previous work was the 1995 SeeNet3D paper by Kenneth Cox and Stephen Eick, which introduced the arcs on globe visual metaphor that we used. The previous SeeNet system was a 2D geographic approach. The NSFNet visualization by Donna Cox and Robert Patterson was highly visible at Siggraph 92. NSFNet [Cox, Patterson 92] Becker, Eick, and Wilks. Visualizing Network Data Cox and Patterson. Visualization Study of the NSFNet.

Geographic Layout distance as stand-in for resource usage partially correlated geographical determination arduous major scalability problem immediate comprehension evocative, many image reprints Wired, National Geographic still picture captures much of function The idea behind the geographic layout is that badly placed long distance tunnels are more likely than short ones to waste bandwidth. This is partially true: distance is partially correlated with resource usage, but the match is imperfect. We used distance as a stand-in because the true resource usage data was not available. It’s very hard to gather data about traffic and congestion for the unicast hops underneath a tunnel, or even to know the route taken through the underlying unicast topology. We might have chosen a different visual metaphor if that data was readily available. One implication of the arcs-on-globe visual metaphor is that tunnels that begin and end in the same city are not visible. The scale of the globe acts as a hardcoded filter, so we see only 700 of the 4400 tunnels. The imperfect correlation between geographic distance and resource bottlenecks is a disadvantage. The main roadblock for the system was a data issue, not a visualization issue: the geographical determination is both arduous and imperfect. Gathering this sort of data for the public Internet with ad-hoc methods simply wouldn’t scale. Claffy’s group at CAIDA is working on the problem.
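
The idea of distance as a stand-in, and of the globe's scale acting as a filter on same-city tunnels, can be sketched with a great-circle distance computation. The haversine formula is standard, but the tunnel records, the coordinates, and the 1 km visibility threshold below are made-up illustrations, not data or parameters from Planet Multicast.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(a, b):
    """Haversine distance between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def visible_tunnels(tunnels, min_km=1.0):
    """Keep tunnels long enough to show as an arc, longest first (threshold is an assumption)."""
    measured = [(great_circle_km(src, dst), name) for name, src, dst in tunnels]
    return sorted((d, name) for d, name in measured if d >= min_km)[::-1]

# Hypothetical tunnel endpoints with approximate city coordinates.
palo_alto = (37.44, -122.14)
washington = (38.90, -77.04)
tunnels = [
    ("cross-country", palo_alto, washington),
    ("same-city", palo_alto, palo_alto),
]
visible = visible_tunnels(tunnels)
print(visible)
```

Sorting by length puts the longest arcs, the likeliest candidates for a badly placed tunnel under this proxy, at the front, while the same-city tunnel drops out entirely.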

Evaluation: Anecdotal Insights
…
> pen-mbone-1.sprintlink.net(204.213.238.11) dc-mbone-1.sprintlink.net(206.229.87.99) [1/64/tunnel]
> elm.can.net(199.246.170.7) dc-mbone-1.sprintlink.net(206.229.87.99) [1/64/tunnel]
> boston.terra.net(199.103.128.254) dc-mbone-1.sprintlink.net(206.229.87.99) [1/0/tunnel/querier]
> NS.FLSIG.ORG(192.153.117.162) dc-mbone-1.sprintlink.net(206.229.87.99) [1/64/tunnel]
> ace.mid.net(198.247.225.251) dc-mbone-1.sprintlink.net(206.229.87.99) [1/64/tunnel]
> fw-mbone-1.sprintlink.net(206.61.106.99) dc-mbone-1.sprintlink.net(206.229.87.99) [1/16/tunnel]
> gateway10.crawford.com(198.69.210.2) dc-mbone-1.sprintlink.net(206.229.87.99) [1/32/tunnel]
> csce-2--rngm-nb-f-1.net.tamu.edu(128.194.1.11) dc-mbone-1.sprintlink.net(206.229.87.99) [1/64/tunnel]
...

System 3: Constellation time: 1998-9 joint work: Guimbretière data: MindNet query results task: plausibility checking for linguists method: 2D custom goal: targeted The Constellation system was created between 1998 and 1999, and was aimed at the task of plausibility checking, done by a group of computational linguists who wanted to refine the algorithms used to create and query MindNet, a very large semantic network. Our custom spatial layout was two dimensional, and the goal in this project was to create a highly targeted visualization system that was maximally effective for the task. The second phase of the Constellation system was joint work with François Guimbretière.

Definition Graph dictionary entry sentence nodes: word senses links: relation types

Semantic Network definition graphs as building blocks unify shared words large network millions of nodes grammar checking now, translation future global structure known: dense probes return local info

Path Query best N paths between two words words on path itself definition graphs used in computation
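
As a rough picture of what a "best N paths between two words" query returns, the sketch below runs a best-first search over a tiny weighted graph. Everything here is an assumption for illustration: the toy graph, the edge weights in (0, 1], and scoring a path by the product of its weights. The talk does not describe how MindNet actually computes plausibility.

```python
import heapq

def best_paths(graph, source, target, n):
    """Return up to n simple paths from source to target, highest score first.

    A path's score is the product of its edge weights. Since every weight is
    at most 1, extending a path never raises its score, so a best-first
    search yields complete paths in descending score order.
    """
    heap = [(-1.0, [source])]  # (negated score, path so far)
    results = []
    while heap and len(results) < n:
        neg_score, path = heapq.heappop(heap)
        node = path[-1]
        if node == target:
            results.append((-neg_score, path))
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:  # keep paths simple (no repeated words)
                heapq.heappush(heap, (neg_score * weight, path + [nxt]))
    return results

# Hypothetical toy semantic network: word -> {related word: link weight}.
graph = {
    "kangaroo": {"animal": 0.9, "marsupial": 0.8},
    "marsupial": {"animal": 0.9},
    "animal": {"tail": 0.5, "body": 0.7},
    "body": {"tail": 0.6},
}
top = best_paths(graph, "kangaroo", "tail", 2)
for score, path in top:
    print(f"{score:.3f}", " -> ".join(path))
```

This enumeration can take exponential time on a dense network, so it is only meant to show the shape of the query result: a ranked list of word paths that a researcher can then hand-check.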

Task: Plausibility Checking paths ordered by computed plausibility researcher hand-checks results high-ranking paths believable? believable paths high-ranked? stop words

Top 10 Paths: kangaroo - tail

Goal create unified view of relationships between paths and definition graphs shared words are key thousands of words (not millions) special-purpose algorithm debugging tool not understand the structure of English

Previous Work: Semantic Networks SemNet [Fairchild, Poltrock, Furnas 88] multiple 3D layouts Visual Thesaurus [Thinkmap applet] casual browsing, constant motion < 20 nodes Fairchild, Poltrock, and Furnas. SemNet: Three-Dimensional Graphic Representations of Large Knowledge Bases. There has not been as much work on visualizing semantic networks as some of the other case study domains. The SemNet system from Fairchild, Poltrock, and Furnas used a few different 3D layouts, and had algorithms that tried to avoid edge crossings. It’s relatively old, dating back to 1988, so it has fairly limited scalability. The Visual Thesaurus is one of several lightweight applets for casual browsing that have appeared on the web lately. The layout algorithm is quite simple and only scales to a few dozen nodes at best, and the constant motion makes it unsuitable for any real analytical work. Thinkmap applet. www.thinkmap.com cited 3/09/00.

Traditional Layout avoid crossings reason: avoid false attachments artifact salience ambiguity

Information Visualization Approach spatial position is strongest perceptual cue encode domain specific attribute plausibility gradient

Constellation Semantic Layout novel layout algorithm detailed in thesis paths as backbone, definition graphs attached curvilinear grid iterative design for maximum semantics with reasonable information density allow crossings for long-distance proxy links

Selective Emphasis highlight sets of boxes and edges interaction additional perceptual channels avoid perception of false attachments

Evaluation: Layout Effectiveness

Evaluation: Layout Comparison dot H3

Talk Outline graph drawing background software systems goal previous work video discussion evaluation general discussion conclusion I’ll begin this talk with some background and previous work in the areas of graph drawing, information visualization, and the domains of each case study. I will summarize my research contributions, and move on to the case studies for the main body of the talk. I’ll start each case study by showing a video, and then discussing some of the interesting design decisions. I’ll then make comparisons across all three systems, and end with concluding remarks.

Visual Salience Planet Multicast H3 Constellation long-distance tunnels H3 distant points of possible interest fringe: aggregate information Constellation selective emphasis word size tied to importance

Canonical Word Size

Hidden State Constellation avoids hidden state closed world assumption change salience instead of toggle drawing closed world assumption if not visible, doesn’t exist easy to forget previous actions false negative conclusions H3, PM do have hidden state non-tree links sometimes drawn intra-city tunnels never drawn

Graph Functions structure discovery contextual backdrop linked view pure spatial layout implicit in traditional graph drawing contextual backdrop linked view

Graph Functions structure discovery contextual backdrop linked view additional visual encoding color, linewidth, shape, enclosure combination more than sum of parts linked view

Contextual Backdrop

Graph Functions structure discovery contextual backdrop linked view brushing [Becker and Cleveland 88] invoke other software components

Linked View

Contributions detailed analysis of three software systems interactive, range of domain specificity relate intended tasks to spatial layout, visual encoding two novel layout/drawing algorithms Constellation targeted design H3: scales 100x beyond previous work product, user study interactive systems for graph drawing using infovis techniques along spectrum. all incorporate domain-specific info, some more targeted than others. h3 least, const most performance advantage for certain tasks