“Exploring High-D Spaces with Multiform Matrices and Small Multiples”
Mudit Agrawal, Nathaniel Ayewah
MacEachren, A., Dai, X., Hardisty, F., Guo, D., and Lengerich, G. Proc. IEEE Symposium on Information Visualization (2003), 31–38.
The Plan: Motivation, Contribution, Analysis Methods, GeoVISTA Studio, Conclusions
Motivation: discover multivariate relationships; examine data from multiple perspectives. [Figure labels: DATA, INFORMATION]
Contribution: visual analysis of multivariate data; combinations of scatterplots, bivariate maps, and space-filling displays; conditional entropy to identify interesting variables in a data set and to order the variables so that the display shows more information; dynamic query/filtering, called conditioning.
Back-end: Design Box, for building applications with visual programming tools. Front-end: GUI Box, for visualizing data with the developed designs. Source: GeoVISTA Studio
Analysis Methods
Analysis Methods: Sorting. Nested sorting: sort a table on selected attributes, to understand the relationships between the sorted variables and the rest. Permutation matrix: cell values are replaced by a graphical depiction of the value; rows and columns can be sorted to search for related entities.
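Nested sorting can be illustrated with a minimal sketch. The table, its column names (`county`, `region`, `income`), and the helper `nested_sort` are hypothetical and are not taken from the paper or from GeoVISTA Studio; the sketch only shows the idea of sorting on several attributes in priority order.

```python
def nested_sort(rows, keys):
    """Sort a list of dict rows on several attributes.

    Earlier entries in `keys` take priority; later entries break ties.
    """
    return sorted(rows, key=lambda row: tuple(row[k] for k in keys))


# Hypothetical table: one row per county.
rows = [
    {"county": "a", "region": 2, "income": 30},
    {"county": "b", "region": 1, "income": 50},
    {"county": "c", "region": 2, "income": 10},
    {"county": "d", "region": 1, "income": 20},
]

# Sort by region first, then by income within each region.
by_region_then_income = nested_sort(rows, ["region", "income"])
county_order = [row["county"] for row in by_region_then_income]
```

After the sort, the unsorted columns can be scanned for patterns that follow the sorted ones, which is the purpose the slide gives for nested sorting.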
Analysis Methods: Sorting. Augmented seriation: organizing a set of objects along a single dimension using multimodal multimedia. Correlation matrices. Reorderable matrices: a simple interactive visualization artifact for tabular data. Source: (Siirtola, 1999)
Analysis Methods: Space-filling visualization. Sunburst methods; mosaic plot; pixel-oriented methods. Sources: (Keim, 1996), (Schedl, 2006), (Young, 1999)
Analysis Methods: Small Multiples. A set of juxtaposed data representations that together support understanding of multivariate information. [Figure: Multiform Bivariate Small Multiple] Source: (MacEachren, 2003)
Analysis Methods: Multiform Bivariate Matrix. Source: (MacEachren, 2003)
GeoVISTA Studio
Demonstration (Basic Demo): application construction; scatterplot, geomap; dynamic linking, eccentric labeling, etc.
Dealing with High Dimensionality
High Dimensionality: Interactive Feature Selection. Guo, D., Coordinating Computational and Visualization Approaches for Interactive Feature Selection and Multivariate Clustering. Information Visualization 2(4):
High Dimensionality: “Goodness of Clustering”: high coverage, high density, high dependence. Measures, e.g.: correlation, chi-squared, conditional entropy. [Figure labels: HIGH, LOW]
Conditional Entropy: discretize the two dimensions into intervals using nested means (repeated splits at the mean). Source: (Guo, 2003)
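The nested-means discretization named on this slide can be sketched as follows. This is an illustration under stated assumptions, not the implementation from (Guo, 2003): the function names, the `depth` parameter (which yields up to 2^depth intervals), and the rule that a value equal to a mean falls into the upper interval are all choices made here.

```python
from bisect import bisect_right


def nested_means_breaks(values, depth):
    """Return sorted break points found by recursively splitting at the mean.

    Assumption: values equal to the mean go to the upper half.
    """
    if depth == 0 or len(values) < 2:
        return []
    mean = sum(values) / len(values)
    low = [v for v in values if v < mean]
    high = [v for v in values if v >= mean]
    if not low or not high:  # all values identical: nothing to split
        return []
    return (nested_means_breaks(low, depth - 1)
            + [mean]
            + nested_means_breaks(high, depth - 1))


def discretize(values, breaks):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(breaks, v) for v in values]


# Hypothetical skewed data; two levels of splitting give up to four intervals.
values = [1, 2, 3, 4, 10, 20, 30, 40]
breaks = nested_means_breaks(values, depth=2)
bins = discretize(values, breaks)
```

Because the break points follow the data's own means, skewed values are split more finely where the data are concentrated than equal-width intervals would allow.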
Conditional Entropy Source: (Guo, 2003)
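As a rough illustration of the measure, the sketch below computes the textbook conditional entropy H(Y|X) = -Σ p(x, y) log2 p(y|x) from two already-discretized dimensions. The function name and the toy bin labels are hypothetical, and the exact variant or normalization used in (Guo, 2003) may differ from this standard definition.

```python
import math
from collections import Counter


def conditional_entropy(x_bins, y_bins):
    """Standard conditional entropy H(Y|X) in bits.

    x_bins and y_bins are equal-length sequences of interval labels.
    """
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))  # counts for each (x, y) cell
    x_counts = Counter(x_bins)            # marginal counts for x
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) * log2 p(y | x), with p(y | x) = count / x_counts[x]
        h -= (count / n) * math.log2(count / x_counts[x])
    return h


# Y fully determined by X: no remaining uncertainty.
dependent = conditional_entropy([0, 0, 1, 1], [0, 0, 1, 1])

# Y unrelated to X: one full bit of uncertainty remains.
unrelated = conditional_entropy([0, 0, 1, 1], [0, 1, 0, 1])
```

Under this definition a lower value indicates stronger dependence between the two dimensions, since knowing X leaves less uncertainty about Y.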
Ordering Dimensions: related dimensions should be close together. Sort by: conditional entropy. Sort method: minimum spanning tree.

      A    B    C    D
A     -    5   16    9
B     5    -   15   21
C    16   15    -    4
D     9   21    4    -

Unsorted: A B C D
Ordering: B A D C
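The ordering on this slide can be reproduced with a small sketch that uses the slide's pairwise values (A-B = 5, A-C = 16, A-D = 9, B-C = 15, B-D = 21, C-D = 4). It builds a minimum spanning tree with Kruskal's algorithm and then walks the tree from a leaf. The function names, the choice of Kruskal's algorithm, and the depth-first walk starting at the alphabetically first leaf are assumptions made for illustration; the slide does not say how (Guo, 2003) turns the tree into a linear order.

```python
def mst_edges(labels, dist):
    """Kruskal's algorithm; `dist` maps (a, b) pairs to a dissimilarity."""
    parent = {v: v for v in labels}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for d, a, b in sorted((d, a, b) for (a, b), d in dist.items()):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b))
    return tree


def order_dimensions(labels, dist):
    """Order dimensions by a depth-first walk of the MST from a leaf."""
    adj = {v: [] for v in labels}
    for a, b in mst_edges(labels, dist):
        adj[a].append(b)
        adj[b].append(a)
    # Assumption: start at the alphabetically first leaf of the tree.
    start = min(v for v in labels if len(adj[v]) <= 1)
    order, stack, seen = [], [start], set()
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        stack.extend(sorted(adj[v], reverse=True))
    return order


# Pairwise values from the slide; smaller means more closely related.
dist = {
    ("A", "B"): 5, ("A", "C"): 16, ("A", "D"): 9,
    ("B", "C"): 15, ("B", "D"): 21, ("C", "D"): 4,
}
ordering = order_dimensions(["A", "B", "C", "D"], dist)
```

For these values the tree keeps the edges C-D, A-B, and A-D, which form the single path B-A-D-C, so the walk yields the same ordering as the slide. When the tree branches, a depth-first walk is only one of several reasonable ways to linearize it.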
Demonstration (Advanced Demo): interactive feature selection; PCP, SOM, matrix; conditioning.
Conclusions. Strengths: dynamic linking of different representations; visualizing clusters of dimensions; rich and extensible toolbox. Weaknesses: usability; arrangement of windows.
References
Guo, D. (2003). Coordinating Computational and Visualization Approaches for Interactive Feature Selection and Multivariate Clustering. Information Visualization 2(4):
Keim, D. (1996). Pixel-oriented Visualization Techniques for Exploring Very Large Databases. Journal of Computational and Graphical Statistics.
Schedl, M. (2006). CoMIRVA: Collection of Music Information Retrieval and Visualization Applications. Website.
Siirtola, H. (1999). Interaction with the Reorderable Matrix. In E. Banissi, F. Khosrowshahi, M. Sarfraz, E. Tatham, and A. Ursyn, editors, Information Visualization IV '99, pages Proceedings International Conference on Information Visualization.
Young, F. (1999). Frequency Distribution Graphs (Visualizations) for Category Variables. Unpublished.