Interactive Information Visualization of One Million Items
Jean-Daniel Fekete
University of Maryland
Scaling issues in Information Visualization
- Seeing more data items or more dimensions
- No aggregation, no sampling
- What are the limits?
  - Technical: screen resolution / dimension, 10 ms redisplay speed
  - Perceptual: visual system accuracy, perception-action loop speed
  - Cognitive: how much can we understand, and how long does it take?
Visualizing one million items
- Treemap of a Unix file system containing 1 million files
- Rectangle sizes related to file sizes
- Color coded by type: red = executable, blue = text, green = image, yellow = program, gray = unknown
- What can we see?
- Blue and green patterns are web pages (www site)
- Image repository for PhotoMesa
- Gray rectangle is a bug: temporary files taking 10% of the www space
- Two similar patterns = two versions of the MATLAB system
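The treemap described above maps file sizes to rectangle areas and file types to colors. The talk does not say which treemap layout was used, so the sketch below assumes the classic slice-and-dice layout as a stand-in. The tree format (dictionaries with "name", "type", "size", "children") and the function names are hypothetical, chosen only for illustration.

```python
# Slice-and-dice treemap sketch (an assumed layout, not necessarily the one
# used in the talk). Tree format and names are hypothetical.

# Color code taken from the slide; anything else falls back to gray.
COLORS = {"executable": "red", "text": "blue", "image": "green", "program": "yellow"}


def size(node):
    """Size of a file, or the summed size of everything below a directory."""
    if "children" in node:
        return sum(size(child) for child in node["children"])
    return node["size"]


def layout(node, x, y, w, h, depth=0, out=None):
    """Give every file a rectangle (x, y, w, h) with area proportional to its size."""
    if out is None:
        out = []
    if "children" not in node:
        out.append((node["name"], node["type"], (x, y, w, h)))
        return out
    total = size(node)
    offset = 0.0
    for child in node["children"]:
        frac = size(child) / total if total else 0.0
        if depth % 2 == 0:
            # Even depth: cut the parent rectangle along the x axis.
            layout(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:
            # Odd depth: cut the parent rectangle along the y axis.
            layout(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out


tree = {"name": "/", "children": [
    {"name": "a.txt", "type": "text", "size": 30},
    {"name": "img", "children": [
        {"name": "b.png", "type": "image", "size": 50},
        {"name": "c.bin", "type": "other", "size": 20},
    ]},
]}

rects = layout(tree, 0.0, 0.0, 100.0, 100.0)
for name, ftype, rect in rects:
    print(name, COLORS.get(ftype, "gray"), rect)
```

In this toy tree the three files receive areas in the ratio 30 : 50 : 20, and the file of unknown type is drawn gray.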
Techniques
- Use accelerated graphics with OpenGL
  - 2 GHz Pentium 4
  - 1600x1200 pixels resolution
  - Now off-the-shelf!
- Push existing visualization techniques to their limits
  - Space filling (treemaps)
  - Overlapping (scatter plots)
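A quick back-of-the-envelope calculation shows how tight the budget is at this resolution. The numbers come from the slides (1600x1200 pixels, 1 million items, 10 ms redisplay). Treating 10 ms as a per-frame budget is an assumption made here for illustration.

```python
# Pixel and time budget for one million items on a 1600x1200 screen.
width, height = 1600, 1200
items = 1_000_000
redisplay_ms = 10  # assumed here to be the time budget for one full redraw

pixels = width * height
pixels_per_item = pixels / items
items_per_second = items * 1000 // redisplay_ms

print(pixels)            # 1920000
print(pixels_per_item)   # 1.92
print(items_per_second)  # 100000000
```

With fewer than two pixels available per item on average, space-filling and overlapping techniques are the natural candidates, since neither spends pixels on gaps between items.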
Relying on Accelerated Graphics
- Balance the CPU/GPU work
- GPU can perform many operations “for free”
  - Geometric transformations
  - Color transformations
  - Color interpolation
  - Translucency
  - Counting overlaps
- CPU prepares data and sends it to GPU
- Bottleneck is communication
- Pipeline: CPU -> GPU -> Screen
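To make "counting overlaps" concrete, here is a small CPU-side simulation of what the result looks like for a scatter plot. The slides do not say how the GPU does the counting. One common approach, assumed here, is additive blending, where every item drawn on a pixel adds a constant to it. The function names are hypothetical.

```python
# CPU simulation of per-pixel overlap counting for a scatter plot.
# On a GPU, additive blending (an assumption, not stated in the talk)
# would produce the same per-pixel totals without any CPU loop.


def count_overlaps(points, width, height):
    """Return a height x width grid holding the number of points on each pixel."""
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if 0 <= x < width and 0 <= y < height:
            grid[y][x] += 1
    return grid


def to_intensity(grid):
    """Scale counts to [0, 1] so the densest pixel is drawn at full intensity."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0] * len(row) for row in grid]
    return [[count / peak for count in row] for row in grid]


points = [(1, 1), (1, 1), (1, 1), (2, 0), (3, 2), (3, 2)]
grid = count_overlaps(points, width=4, height=3)
intensity = to_intensity(grid)

for row in grid:
    print(row)
```

Pixels hit by several items end up brighter than pixels hit once, so dense regions stay readable even when many items share a pixel.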
Relying on Accelerated Graphics
- Breaks the 10^6 barrier
  - 1 million items at interactive speed
- Permits use of animation
  - E.g. for understanding view transitions
- But requires:
  - optimizing algorithms
  - using unusual programming techniques
  - adapting visualization techniques
Example of Adapted Visualization Techniques
- No rectangle outlines
  - Spares pixels
  - Avoids sending the geometry twice
- Color shading
  - Separate similar items
  - “Free” with accelerated graphics cards
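The shading idea above can be sketched as follows: instead of drawing an outline, each rectangle gets a slightly lighter color at one corner and a slightly darker one at the opposite corner, and the graphics card interpolates in between. The exact shading scheme of the talk is not given, so the corner assignment, the `contrast` parameter, and the function names are assumptions. The `shade` function only approximates on the CPU what hardware color interpolation does.

```python
# Sketch of outline-free rectangle shading (assumed scheme, hypothetical names).


def corner_colors(base, contrast=0.3):
    """Return RGB colors for (top-left, top-right, bottom-right, bottom-left)."""
    def scale(color, factor):
        return tuple(min(1.0, max(0.0, channel * factor)) for channel in color)

    light = scale(base, 1 + contrast)
    dark = scale(base, 1 - contrast)
    return (light, base, dark, base)


def shade(corners, u, v):
    """Bilinearly interpolate the corner colors at (u, v), both in [0, 1]."""
    tl, tr, br, bl = corners
    top = tuple(a + (b - a) * u for a, b in zip(tl, tr))
    bottom = tuple(a + (b - a) * u for a, b in zip(bl, br))
    return tuple(a + (b - a) * v for a, b in zip(top, bottom))


blue = (0.0, 0.0, 0.5)
corners = corner_colors(blue)
print(shade(corners, 0.0, 0.0))  # lightest corner
print(shade(corners, 1.0, 1.0))  # darkest corner
```

Two adjacent rectangles with the same base color then meet at a dark edge next to a light edge, which separates them visually without spending any pixels on an outline.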
Animated Transitions
Dynamic Labeling
Conclusion
- You can now break the 10^6 barrier!
  - Was limited to 10^4
  - E.g. can visualize the phylogenetic tree of species
- Still technically limited by graphics hardware, but close to the perceptual limits
  - New IBM screen with 10 million pixels
- Need more work to understand how humans can make sense of this amount of data
- Send your 10^6 data sets!
Credits
- Thanks to HCIL for inviting me and providing the rich environment for this work
- Thanks to Catherine Plaisant, Ben Shneiderman and Ben Bederson for their help and advice