FODAVA-Lead Education, Community Building, and Research: Dimension Reduction and Data Reduction: Foundations for Interactive Visualization Haesun Park.


1 FODAVA-Lead Education, Community Building, and Research: Dimension Reduction and Data Reduction: Foundations for Interactive Visualization
Haesun Park
School of Computational Science and Engineering, Georgia Institute of Technology
FODAVA Review Meeting, Dec. 9, 2010

2 Challenges in Analyzing High Dimensional Massive Data on Visual Analytics System
Screen space and visual perception: low dimension and the number of available pixels are fundamentally limiting constraints
High dimensional data: effective dimension reduction
Large data sets: informative representation of data
Speed: necessary for real-time, interactive use; scalable algorithms; adaptive algorithms
Development of fundamental theory and algorithms in data representations and transformations to enable visual understanding

3 FODAVA-Lead Research Topics
Dimension Reduction: dimension reduction with prior info/interpretability constraints; manifold learning
Informative Presentation of Large Scale Data: sparse recovery by L1 penalty; clustering, semi-supervised clustering; multi-resolution data approximation
Fast Algorithms: large-scale optimization/matrix decompositions; adaptive updating algorithms for dynamic and time-varying data, and interactive vis.
Data Fusion: fusion of different types of data from various sources; fusion of different uncertainty levels
Integration with DAVA systems: Testbed, Jigsaw, iVisClassifier, iVisClustering, ...

4 FODAVA-Lead Research Presentation
H. Park: Overview of the FODAVA-lead research; FODAVA Test-bed; two-stage method for 2D/3D representation of clustered data; InteractiveVisualClassifier; InteractiveVisualClustering; info space alignments for information fusion (multi-language document analysis)
A. Gray: Nonlinear dimension reduction (manifold learning); fast computation of neighborhood graphs; fast optimizations for SVMs
V. Koltchinskii: Low rank matrix estimation and kernel learning on graphs; sparse recovery; multiple kernel learning and fusion of data with heterogeneous types (multi-language document analysis)
J. Stasko: Improved analytical capabilities in JIGSAW; interplay between math/comp and interactive visualization
R. Monteiro: Sparse Principal Component Analysis and feature selection based on L1 regularized optimization (POSTER)

5 FODAVA Research Test Bed for High Dimensional Massive Data
Open source software
Integrates foundational results from FODAVA teams as well as other widely utilized methods (e.g. PCA)
Easily accessible to a wide community of researchers
Makes methods/algorithms readily available to the VA research community and relevant to applications
Identifies effective methods for specific problems (evaluation)
A base for specialized VA systems (e.g. iVisClassifier, iVisClustering)
Diagram labels: FODAVA Fundamental Research; Applications; Test Bed

6 Modules in FODAVA Test Bed
Vector Rep. of Raw Data: Text, Image, Audio, …
Informative Representation and Transformation; Visual Representation
Dimension Reduction (2D/3D), Temporal Trend, Uncertainty, Anomaly/Outlier, Causal relationship, Zoom in/out by dynamic updating, …
Clustering, Summarization, Regression, Multi-Resolution Data Reduction, Multiple Kernel Learning, …
Label, Similarity, Density, Missing value, …
Interactive Analysis

7 iVisClassifier [VAST10] (J. Choo, H. Lee, J. Kim, HP)
Interactive visual classification system using supervised dimension reduction
– Biometric recognition
– Text classification
– Search space reduction
iVisClustering (H. Lee, J. Kihm, J. Choo, J. Stasko, HP)
Interactive visual clustering system using topic modeling (LDA) for text clustering

8 Two-stage Linear Discriminant Analysis for 2D/3D Representation of Clustered Data and Computational Zooming in/out [VAST09, J. Choo, S. Bohn, HP]
$\max\,(G^T S_b G)$, $\min\,(G^T S_w G)$ & $\max \operatorname{trace}\bigl((G^T S_w G)^{-1}(G^T S_b G)\bigr)$
Figure: Regularization in LDA (panels: small regularization, large regularization)
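As a rough illustration of the trace criterion on this slide, the Python sketch below builds the within-cluster and between-cluster scatter matrices (S_w, S_b) and computes a regularized rank-2 LDA projection G with NumPy. It is a single-stage projection, not the two-stage method of [VAST09]. The function names, the synthetic data, and the regularization form S_w + gamma*I are assumptions made for illustration only.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Return (S_w, S_b): within- and between-cluster scatter of the rows of X."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)
    return S_w, S_b

def regularized_lda(X, labels, dim=2, gamma=1e-3):
    """Columns of G maximize trace((G^T (S_w + gamma I) G)^{-1} (G^T S_b G))."""
    S_w, S_b = scatter_matrices(X, labels)
    d = X.shape[1]
    # Assumed regularization: add gamma * I so that S_w is positive definite.
    L = np.linalg.cholesky(S_w + gamma * np.eye(d))
    L_inv = np.linalg.inv(L)
    # Whitened symmetric problem: L^{-1} S_b L^{-T} v = lambda v.
    M = L_inv @ S_b @ L_inv.T
    M = (M + M.T) / 2.0
    evals, evecs = np.linalg.eigh(M)
    top = np.argsort(evals)[::-1][:dim]
    return L_inv.T @ evecs[:, top]

def trace_criterion(G, S_w, S_b, gamma=0.0):
    """Evaluate trace((G^T (S_w + gamma I) G)^{-1} (G^T S_b G)) for a given G."""
    A = G.T @ (S_w + gamma * np.eye(S_w.shape[0])) @ G
    B = G.T @ S_b @ G
    return np.trace(np.linalg.solve(A, B))

# Synthetic example (hypothetical data): three clusters in five dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0, 0], [3, 0, 0, 0, 0], [0, 3, 0, 0, 0]], dtype=float)
X = np.vstack([c + rng.normal(size=(30, 5)) for c in centers])
labels = np.repeat(np.arange(3), 30)

G = regularized_lda(X, labels, dim=2, gamma=1e-3)
Y = X @ G  # 2D coordinates that could be drawn as a scatter plot
```

Increasing `gamma` in this sketch corresponds to moving from the "small regularization" to the "large regularization" setting named on the slide, under the assumed form of the regularizer.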

9 2D Visualization of Clustered Image and Audio Data
Figure panels: Spoken Letters (Audio) and Handwritten Digits (Image), each shown with PCA and with Rank-2 LDA

10 iVisClassifier: Computational Zoom-in
Views: LDA scatter plot, Cluster level PC, Bases view and Heat Map
Applying LDA recursively on the selected subset of data

11 iVisClassifier: Cooperative Filtering (Poster and Demo) Utilizing brushing-and-linking

12 Fusion based on Information Space Alignment (J. Choo, S. Bohn, G. Nakamura, A. White, HP)
Want: unified vector representations of heterogeneous data sets
Utilize: reference correspondence information between data pairs, cluster correspondence, etc.
Multi-lingual iVisClassifier
Two conflicting criteria: maximize alignment and minimize deformation
Existing methods: Constrained Laplacian Eigenmap, Parafac2, Procrustes analysis, …
Diagram labels: Data set A (English); Data set B (Spanish); Fused data sets

13 Graph Embedding Approach (POSTER)
1. Represent each data matrix as a graph
2. Add zero-length edges between reference point pairs
3. Apply graph embedding algorithm, e.g., nonmetric multidimensional scaling (preserving rank order of distances)
$$\min \sum \bigl(d^f_A(i,j) - \dot{d}_A(i,j)\bigr)^2 + \sum \bigl(d^f_B(i,j) - \dot{d}_B(i,j)\bigr)^2 + \mu \sum \bigl(d^f_{AB}(r,r) - \dot{d}_{AB}(r,r)\bigr)^2$$
subject to $\dot{d}_{AB}(r,r) < \dot{d}_A(i,j)$ and $\dot{d}_{AB}(r,r) < \dot{d}_B(i,j)$ for $1 \le r \le R$ and $i \ne j$, where $\dot{d}$ denotes rank orders
Diagram labels: Data sets; Similarity graph; Matrix representation of graphs; Fused data
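The three steps on this slide can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: each data set is represented as a complete graph weighted by Euclidean distance, graph distances are shortest paths, and classical (metric) MDS is swapped in for the nonmetric MDS named on the slide, because it needs only an eigendecomposition. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def pairwise_dist(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def fused_graph_distances(XA, XB, ref_pairs):
    """Steps 1 and 2: one graph per data set, joined by zero-length reference edges."""
    nA, nB = len(XA), len(XB)
    n = nA + nB
    W = np.full((n, n), np.inf)          # no direct edge between A and B by default
    W[:nA, :nA] = pairwise_dist(XA)      # graph of data set A
    W[nA:, nA:] = pairwise_dist(XB)      # graph of data set B
    for a, b in ref_pairs:               # zero-length edge for each reference pair
        W[a, nA + b] = W[nA + b, a] = 0.0
    D = W.copy()
    for k in range(n):                   # Floyd-Warshall shortest paths
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def classical_mds(D, dim=2):
    """Step 3 (assumed variant): classical MDS on the fused distance matrix."""
    n = len(D)
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J          # double-centered squared distances
    evals, evecs = np.linalg.eigh(B)
    top = np.argsort(evals)[::-1][:dim]
    lam = np.clip(evals[top], 0.0, None)
    return evecs[:, top] * np.sqrt(lam)

# Hypothetical data: two data sets with different dimensionality.
rng = np.random.default_rng(1)
XA = rng.normal(size=(12, 3))
XB = 2.0 * rng.normal(size=(10, 4))
ref_pairs = [(0, 0), (1, 1), (2, 2)]     # row a of XA corresponds to row b of XB

D = fused_graph_distances(XA, XB, ref_pairs)
Z = classical_mds(D, dim=2)              # fused 2D coordinates: A rows first, then B
```

Because a zero-length edge gives both endpoints identical shortest-path distances to every other node, this variant places each reference pair at the same fused coordinates. That is the limiting case of the slide's constraint that reference-pair distances rank below all other distances.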

14 Evaluation: Cross-domain Retrieval
Figure panels: English-Spanish Documents; Document (Eng)-Phoneme Data
Labels: Deformation; Alignment
Methods compared: Parafac2, Nonmetric MDS, Metric MDS, Laplacian Eig., Procrustes
Axis: K in K-NN in fused space
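One plausible way to score cross-domain retrieval with K-NN in the fused space is sketched below in Python. The slide does not define the measure, so the scoring rule here is an assumption: a query from one domain counts as a hit when its known counterpart in the other domain is among its K nearest neighbors. The function name and the synthetic coordinates are hypothetical.

```python
import numpy as np

def cross_domain_recall_at_k(ZA, ZB, k):
    """Fraction of rows i of ZA whose counterpart ZB[i] is among the k nearest rows of ZB."""
    diff = ZA[:, None, :] - ZB[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))       # distances from A queries to B items
    nearest = np.argsort(dist, axis=1)[:, :k]      # indices of the k nearest B items
    hits = (nearest == np.arange(len(ZA))[:, None]).any(axis=1)
    return hits.mean()

# Hypothetical fused coordinates: ZB is a noisy copy of ZA, row i matching row i.
rng = np.random.default_rng(2)
ZA = rng.normal(size=(50, 2))
ZB = ZA + 0.1 * rng.normal(size=(50, 2))

recalls = {k: cross_domain_recall_at_k(ZA, ZB, k) for k in (1, 5, 10)}
for k, value in recalls.items():
    print(f"K={k}: recall={value:.2f}")
```

Plotting such a score against K for each fusion method would give curves of the kind the slide's axis label suggests.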

15 Summary / Future Research
Informative 2D/3D Representation of Data
– Clustered data: two-stage dimension reduction methods effective for a wide range of problems
– Interpretable dimension reduction for nonnegative data: NMF
– Customized fast algorithms for 2D/3D reduction needed
– Dynamic updating methods for efficient and interactive visualization
Visual Analytic Methods for Foundational Problems
– Classification
– Clustering
– Information fusion via space alignment
FODAVA Research Test bed and VA System Development
Sparse methods with L1 regularization
– Sparse solution for regression
– Sparse PCA (with Renato Monteiro)
