1
A Rank-by-Feature Framework for Interactive Multi-dimensional Data Exploration
Jinwook Seo and Ben Shneiderman
Human-Computer Interaction Lab. & Department of Computer Science
University of Maryland, College Park
2
Hierarchical Clustering Explorer (HCE)
3
“HCE enabled us to find important clusters that we don’t know about yet.”
4
Goal: Find Interesting Features in Multidimensional Data
Finding correlations, clusters, outliers, gaps, … is difficult in multidimensional data
–Cognitive difficulties in >3D
Therefore utilize low-dimensional projections
–Perceptual efficiency in 1D and 2D
–Use Rank-by-Feature Framework to guide discovery
5
Do you see anything interesting?
6
Do you see any interesting feature?
7
Correlation … What else?
8
Outliers: He, Rn
9
Demonstration
Breakfast Cereals
–77 cereals
–8 dimensions (or variables): sugar, potassium, fiber, protein, etc.
US counties census data
–3138 counties
–14 dimensions: population density, poverty level, unemployment, etc.
10
Low-dimensional Projections
Techniques
–General: combination of variables for an axis
–Axis-parallel: a variable for an axis
Number of projections
Interface for Exploration
[Figure: a general projection with axes X1 + 2X2 and -2X1 + X2; an axis-parallel projection with axes X1 and X3]
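The "Number of projections" point can be made concrete for the axis-parallel case: with n dimensions there are n one-dimensional projections and n(n-1)/2 two-dimensional ones. The sketch below is only an illustration; the function name and the placeholder dimension names X1 … X14 are assumptions, with 14 chosen to match the dimension count quoted for the census data.

```python
from itertools import combinations
from math import comb


def axis_parallel_projections(dimensions):
    """Enumerate the 1D and 2D axis-parallel projections of the named dimensions."""
    one_d = list(dimensions)                   # one histogram per dimension
    two_d = list(combinations(dimensions, 2))  # one scatterplot per unordered pair
    return one_d, two_d


# Hypothetical placeholder names; 14 matches the census example's dimension count.
dims = [f"X{i}" for i in range(1, 15)]
one_d, two_d = axis_parallel_projections(dims)

print(len(one_d), "1D projections")
print(len(two_d), "2D projections")  # equals comb(14, 2)
```

Even at 14 dimensions the number of scatterplots is already too large to browse one by one without some ordering, which is the motivation for ranking them.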
11
Exploration by Projections
XGobi, GGobi – Scatterplot Browsing
www.ggobi.org
www.research.att.com/areas/stat/xgobi/
12
Exploration by Projections
Spotfire DecisionSite – Scatterplots
www.spotfire.com
13
Exploration by Projections
XGobi, GGobi – Grand Tour
14
Exploration by Projections
XmdvTool – Scatterplot Matrix
Worcester Polytechnic Institute
15
Dimension selection tool
Corrgram by Michael Friendly
Square Matrix Display in GeoVISTA studio by Alan M. MacEachren
16
Exploration by Projections
Spotfire DecisionSite – View Tip orders scatterplots
17
Design Considerations
Hard to interpret arbitrary linear projections
Axis-parallel projections
Interestingness depends on applications
Incorporate users’ interest
Overview of all possible projections
Rapid change of axis
18
Demonstration
Breakfast Cereals
–77 cereals
–11 dimensions (or variables): sugar, potassium, fiber, protein, etc.
US counties census data
–3138 counties
–14 dimensions: population density, poverty level, unemployment, etc.
19
Rank-by-Feature Framework: 1D
Ranking Criterion
Rank-by-Feature Prism
Score List
Manual Projection Browser
20
Rank-by-Feature Framework: 2D
Ranking Criterion
Rank-by-Feature Prism
Score List
Manual Projection Browser
21
A Ranking Example
3138 U.S. counties with 17 attributes
Ranking Criterion: Pearson correlation (0.996, 0.31, 0.01, -0.69)
Ranking Criterion: Uniformity (entropy) (6.7, 6.1, 4.5, 1.5)
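The two ranking criteria named on this slide can be sketched roughly as follows: score every 2D axis-parallel projection by Pearson correlation, score every 1D projection by the entropy of its histogram, and sort by score. This is a minimal illustration, not HCE's implementation. The function names, the equal-width binning with 8 bins, and the toy table are all assumptions made for the example, and the data is invented rather than taken from the census set.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a constant column is scored as uncorrelated
    return cov / (norm_x * norm_y)


def histogram_entropy(xs, bins=8):
    """Shannon entropy (bits) of an equal-width histogram; higher means more uniform."""
    lo, hi = min(xs), max(xs)
    if lo == hi:
        return 0.0
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in xs:
        counts[min(int((x - lo) / width), bins - 1)] += 1  # max value goes in last bin
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def rank_2d(table, score=pearson):
    """Score every pair of columns and return (pair, score) sorted high to low."""
    scored = [((a, b), score(table[a], table[b])) for a, b in combinations(table, 2)]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def rank_1d(table, score=histogram_entropy):
    """Score every column and return (name, score) sorted high to low."""
    return sorted(((name, score(col)) for name, col in table.items()),
                  key=lambda item: item[1], reverse=True)


# Invented toy data: "b" tracks "a", "c" runs opposite to "a", "d" has one extreme value.
table = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [2, 4, 6, 8, 10, 12, 14, 16],
    "c": [8, 7, 6, 5, 4, 3, 2, 1],
    "d": [1, 1, 1, 1, 1, 1, 1, 50],
}

for pair, value in rank_2d(table):
    print(pair, round(value, 3))
for name, value in rank_1d(table):
    print(name, round(value, 3))
```

Passing the scoring function as a parameter mirrors the idea that the ranking criterion is a user choice: a different criterion only needs a different `score` callable.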
22
Ongoing and Future Work
Identify & implement more ranking criteria
–Gaps, outliers, etc.
Ranking based on users’ selection of items
–Separability of the selected items
–Ranking by using only the selected items
Scalability Issue
–How to handle a large number of dimensions
–Grouping by clustering dimensions
–Filtering uninteresting entries in the prism
23
More about HCE
In collaboration with and sponsored by Eric Hoffman, Children’s National Medical Center
Freely downloadable at www.cs.umd.edu/hcil/hce
Version 3.0 beta, May 2004
About 2,000 downloads since April 2002
Licensing to ViaLactia Biosciences (NZ) Ltd.
24
More Applications?
Try HCE and the Rank-by-Feature Framework with your problems and data
Join the case studies on the use of HCE and the Rank-by-Feature Framework
Suggestions and comments are welcome
25
Thank you!