IntroDefinitionSizeComplexityWrap-up 1/54 Individual Big Data Visual Analytics: Challenges and Opportunities Remco Chang and Eli Brown Tufts University.

Presentation transcript:

Talk Outline
Visual Analytics + Big Data:
1. What is Big Data Visual Analytics? Definition and Problem Statement
2. How to Visualize Large Amounts of Data?
3. Tufts Research on Individual Differences
4. How to Visualize High Dimensional Data?

1. What is Big Data Visual Analytics?
A Definition and Problem Statement

Defining Big Data for Visual Analytics
Let’s say that I have a billion data items. Is that Big Data? What if:
– These data items have only two attributes (e.g., latitude, longitude)?
– I transpose this dataset so that I have two rows of data, but with a billion attributes?

Defining Big Data for Visual Analytics
Big Data is NOT just about the size of your data.
For the purpose of this talk, let’s talk about Big Data in the following way:
– Size: the number of rows (n)
    Assume the data cannot fit into a desktop computer’s memory
– Complexity: the number of attributes (k)
    Assume k > 2

Problem Statements
Considering the two together is too difficult, so we’ll tackle the two issues independently for now.
Our goal is to visualize (large | complex) data sets while:
– Maintaining interactivity: rendering at 10 fps
– Allowing for operations on the data (zoom, pivot, etc.)
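
The 10 fps target can be read as a time budget per frame. The sketch below works out that budget and checks whether a frame fits in it. The function names and the assumption that data access and rendering run one after the other within a frame are illustrative and do not come from the talk.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Milliseconds available per frame at the target frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def is_interactive(query_ms: float, render_ms: float, target_fps: float = 10.0) -> bool:
    """True if data access plus rendering fit in one frame.

    Assumption (not from the talk): the two steps run serially.
    """
    return query_ms + render_ms <= frame_budget_ms(target_fps)
```

At 10 fps the budget is 100 ms, so any operation on the data (zoom, pivot) has to return within roughly that time, minus whatever rendering takes.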

2. How to Visualize Large Amounts of Data?

Problem Statement
[Figure: Visualization on Commodity Hardware; Large Data in a Data Warehouse]

Problem Statement
Constraint: the data is too big to fit into the memory or hard drive of a personal computer
– Note: ignoring various database technologies (OLAP, column-store, NoSQL, array-based, etc.)
This is a classic computer science problem. What are some previous techniques?
– Truncate (sample, filter)
– Resolution reduction (“blurring”, image zooming)
– Stream (think Netflix, Hulu)
– Pre-fetch (think open-world 3D video games)

Pros and Cons: Truncate
Truncate (sample, filter)
– Pros: easy to implement; efficient; scalable
– Cons: sampling is often data- or task-dependent
[Figure: Sampling Algorithm]
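
One way to truncate a data set that does not fit in memory is uniform reservoir sampling, sketched below. The slide does not say which sampling algorithm is meant, so this is an illustrative choice, and `reservoir_sample` is a hypothetical name.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Keep a uniform random sample of k rows from a stream of unknown length.

    Only k rows are held in memory at any time (Algorithm R).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir
```

A uniform sample like this illustrates the "cons" on the slide: rare but important rows, such as outliers, can be dropped, which is why the right sampling scheme depends on the data and the task.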

Pros and Cons: Resolution Reduction
Resolution reduction (“blurring”)
– Pros: allows hierarchical navigation
– Cons: fine details are often lost; not all data types can be easily blurred (order-invariant data)
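
A minimal sketch of resolution reduction for 2D point data, assuming the "blurring" is done by counting points per grid cell and then merging cells to move up one level of a hierarchy. The function names and the grid scheme are assumptions for illustration and are not taken from the talk.

```python
from typing import Iterable, List, Tuple


def bin_points(points: Iterable[Tuple[float, float]],
               x_range: Tuple[float, float],
               y_range: Tuple[float, float],
               cols: int, rows: int) -> List[List[int]]:
    """Count points per cell of a rows x cols grid; points outside the ranges are skipped."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    if x_max <= x_min or y_max <= y_min or cols < 1 or rows < 1:
        raise ValueError("ranges must be non-empty and grid size positive")
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue
        # Clamp so that points exactly on the upper edge fall in the last cell.
        c = min(int((x - x_min) / (x_max - x_min) * cols), cols - 1)
        r = min(int((y - y_min) / (y_max - y_min) * rows), rows - 1)
        grid[r][c] += 1
    return grid


def coarsen(grid: List[List[int]]) -> List[List[int]]:
    """Halve the resolution by summing 2x2 blocks of cells."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * ((cols + 1) // 2) for _ in range((rows + 1) // 2)]
    for r in range(rows):
        for c in range(cols):
            out[r // 2][c // 2] += grid[r][c]
    return out
```

The size of the grid depends on the screen resolution rather than on the number of rows, which is what makes this approach scale. The individual points inside a cell are no longer recoverable, which is the loss of fine detail noted on the slide.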

Pros and Cons: Streaming
Stream [Fisher et al., CHI 2012]
– Pros: the query can be terminated at any time
– Cons: it is inefficient on the database end
[Figure: incremental results at t = 1 second and t = 5 minutes]
Fisher et al., Trust Me, I'm Partially Right: Incremental Visualization Lets Analysts Explore Large Datasets Faster. CHI 2012.
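
The idea of a query that can be stopped at any time can be sketched as a running estimate that is reported while rows are still arriving. This is not the implementation from Fisher et al. It is a generic running mean with an approximate 95% interval, and it assumes rows arrive in random order so that each partial result is a random sample. The name `incremental_mean` is hypothetical.

```python
import math
from typing import Iterable, Iterator, Tuple


def _half_width(n: int, m2: float, z: float) -> float:
    """Half-width of the normal-approximation confidence interval."""
    if n < 2:
        return math.inf
    variance = m2 / (n - 1)
    return z * math.sqrt(variance / n)


def incremental_mean(values: Iterable[float], report_every: int = 1000,
                     z: float = 1.96) -> Iterator[Tuple[int, float, float]]:
    """Yield (rows_seen, mean_so_far, half_width) every `report_every` rows.

    Uses Welford's online update, so no rows need to be stored.
    """
    n, mean, m2 = 0, 0.0, 0.0
    for v in values:
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
        if n % report_every == 0:
            yield n, mean, _half_width(n, m2, z)
    if n and n % report_every != 0:
        # Report the final estimate if the stream ended between reports.
        yield n, mean, _half_width(n, m2, z)
```

A caller can draw each yielded estimate with its error bar and stop consuming the generator as soon as the interval is narrow enough for the decision at hand.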

Pros and Cons: Pre-Fetch
Pre-fetch
– Pros: seamless to the user
– Cons: predicting the future is kind of hard
Possible in 3D games because of limited degrees of freedom

Pros and Cons: Pre-Fetch
Pre-fetch in Visual Analytics [Chan, Hanrahan, VAST 2008]
– Limits the types of operations a user can do
– Allows interactive analysis of over a billion data points
Chan et al., Maintaining Interactivity While Exploring Massive Time Series. IEEE VAST 2008.
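
When the allowed operations are limited, the set of views a user can reach next is small enough to load ahead of time. The sketch below assumes a tiled time series where the only operations are panning left or right and zooming in or out. It is not the system described by Chan et al. `PrefetchCache`, `neighbors`, and the `fetch_tile` callback are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable, List, Tuple

Tile = Tuple[int, int]  # (zoom level, tile index along the time axis)


def neighbors(tile: Tile) -> List[Tile]:
    """Tiles reachable in one step: pan right/left, zoom in (two children), zoom out."""
    level, idx = tile
    result = [(level, idx + 1), (level + 1, 2 * idx), (level + 1, 2 * idx + 1)]
    if idx > 0:
        result.append((level, idx - 1))
    if level > 0:
        result.append((level - 1, idx // 2))
    return result


class PrefetchCache:
    """LRU cache that also loads every tile one step away from the requested one."""

    def __init__(self, fetch_tile: Callable[[Tile], object], capacity: int = 64):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch_tile
        self._capacity = capacity
        self._cache: "OrderedDict[Tile, object]" = OrderedDict()
        self.misses = 0  # requests the user had to wait for

    def _load(self, tile: Tile) -> object:
        if tile in self._cache:
            self._cache.move_to_end(tile)
            return self._cache[tile]
        data = self._fetch(tile)
        self._cache[tile] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used tile
        return data

    def get(self, tile: Tile) -> object:
        if tile not in self._cache:
            self.misses += 1
        data = self._load(tile)
        for n in neighbors(tile):
            self._load(n)  # a real system would do this in the background
        return data
```

With only five reachable tiles per view, each request triggers a bounded amount of extra work. Adding more operations grows that set quickly, which is the trade-off the slide describes.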

Research at Tufts: User-Centric Pre-Fetching
Joint work with Caroline Ziemkiewicz and Alvitta Ottley

Motivation

Individual Differences and Interaction Pattern
Existing research shows that all of the following factors affect how someone uses a visualization:
– Spatial Ability
– Cognitive Workload/Mental Demand
– Personality
– Experience (novice vs. expert)
– Emotional State
– Perceptual Speed
– … and more

Preliminary Study – Novice vs. Expert
Compared how novice and expert financial users of the WireVis system search for fraud
– Novices exhibited “breadth-first-search” behaviors
– Experts exhibited “depth-first-search” behaviors
Our next step is to use machine learning methods to distinguish users by analyzing their interactions in real time

Preliminary Study – Locus of Control
Identified the personality factor Locus of Control (LOC) as a predictor of how a user interacts with the following visualizations:

Results
With the list view compared to the containment view, internal-LOC users are:
– faster (by 70%)
– more accurate (by 34%)
This holds only for complex (inferential) tasks.
The speed improvement is about 2 minutes (116 seconds).
R. Chang et al., How Locus of Control Influences Compatibility with Visualization Style, IEEE VAST.
R. Chang et al., How Visualization Layout Relates to Locus of Control and Other Personality Factors. TVCG, to appear.

Cognitive / Affective Priming

LOC Priming
[Figure: performance (poor to good) by visual form (list view, containment) for internal-LOC, external-LOC, average-LOC, and average -> internal users]
R. Chang et al., Poster: Priming locus of control to affect performance. VAST Poster 2012.

Affective Priming on Visual Judgment
R. Chang et al., Influencing Visual Judgment Through Affective Priming, CHI, to appear.

Preliminary Study – Using Brain Sensing (fNIRS)
Functional Near-Infrared Spectroscopy
– A lightweight brain sensing technique
– Measures mental demand (working memory)
R. Chang et al., Using fNIRS Brain Sensing to Evaluate Information Visualization Interfaces. CHI, to appear.

This is Your Brain on Bar Graphs and Pie Charts
[Figure: 3-back test]

Quick Summary
Pre-fetching is a promising approach for supporting interactive visual analysis of large amounts of data.
Our “User-Centric” approach is three-pronged:
– Understand the user’s cognitive “traits” (e.g., LOC, Numeracy, Spatial Ability, etc.)
– Understand the user’s cognitive “states” (Cognitive Load, Affect, etc.)
– Alter the user’s behavior by influencing cognitive traits and states through priming

3. How to Visualize Complex (High-Dimensional) Data?

Why is This Problem Hard?
You can only see 2D because your monitor is 2D.
In other words, you can show at most 2-dimensional data. Everything else is a hack.

Ways to Visualize k-Dimensional Data
Two primary ways to do this “hack”:
– Divide up the 2D screen into multiple 2D regions
    Showing no correlation between dimensions
    Showing k-1 correlations
    Showing all pair-wise correlations
– Project k-dimensional data into 2D
    3D to 2D
    k-D projection

Ways to Visualize k-Dimensional Data
Divide up the 2D screen into multiple 2D regions
[Figure: Parallel Coordinates]

Ways to Visualize k-Dimensional Data
Divide up the 2D screen into multiple 2D regions
[Figure: Scatterplot Matrix]
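
The difference between "k-1 correlations" and "all pair-wise correlations" can be made concrete by listing which pairs of dimensions each layout puts side by side. The sketch assumes that parallel coordinates with a fixed axis order relate only adjacent axes and that a scatterplot matrix has one panel per unordered pair. This is one reading of the slides, and the function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def parallel_coordinate_pairs(dims: Sequence[str]) -> List[Tuple[str, str]]:
    """Adjacent axis pairs for a fixed axis order: k - 1 pairs."""
    return list(zip(dims, dims[1:]))


def scatterplot_matrix_pairs(dims: Sequence[str]) -> List[Tuple[str, str]]:
    """Every unordered pair of dimensions: k * (k - 1) / 2 pairs."""
    return list(combinations(dims, 2))
```

For k = 4 this gives 3 adjacent pairs versus 6 panels. The number of panels grows quadratically with k, which limits how far a divided screen can scale.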

Ways to Visualize k-Dimensional Data
Project k-dimensional data into 2D
Example projection methods (dimension reduction):
– PCA
– MDS
– LDA
– LLE
– Many others!
Usually, these try to preserve distances in 2D as they exist in k-D.
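
As an illustration of projecting k-D data into 2D, here is a small PCA sketch using only the standard library. It finds the top two principal directions by power iteration with deflation, which is a simplification chosen for readability and is not how iPCA or any production library does it. All names are hypothetical, and at least two rows are assumed.

```python
import math
import random
from typing import List, Sequence

Matrix = List[List[float]]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def _top_eigenvector(m: Matrix, iterations: int = 500, seed: int = 0) -> List[float]:
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in m]
    norm = math.sqrt(_dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [_dot(row, v) for row in m]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    return v


def pca_2d(data: Sequence[Sequence[float]]) -> List[List[float]]:
    """Project n rows of k-D data onto their first two principal components."""
    n, k = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(k)]
    centered = [[row[j] - means[j] for j in range(k)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    pc1 = _top_eigenvector(cov)
    # Deflate: remove the variance explained by pc1, then repeat.
    lam = _dot(pc1, [_dot(row, pc1) for row in cov])
    deflated = [[cov[i][j] - lam * pc1[i] * pc1[j] for j in range(k)]
                for i in range(k)]
    pc2 = _top_eigenvector(deflated, seed=1)
    return [[_dot(row, pc1), _dot(row, pc2)] for row in centered]
```

When the data really lies in a 2D plane inside k-D space, this projection preserves all pairwise distances. In general it keeps only the two directions of greatest variance, so distances along the remaining directions are lost.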

What We Have Done (at Tufts)
We like projection methods because they are more scalable than the “divide the screen” methods.
– iPCA: does interaction help in understanding high-dimensional data?
    Demo
– Dis-Function: are interactions in 2D meaningful (recoverable) in k-D?
    Switch to Eli

Summary
Visual Analytics + Big Data:
1. Definition of Big Data Visual Analytics
    (Large | Complex) Data Analysis
2. How to Visualize Large Amounts of Data?
    Pre-fetching using individual differences and priming
3. How to Visualize High Dimensional Data?
    nD to 2D projection
    Translating interactions from 2D to nD
