Dist FuncIntroVAAppsATGWrap-up 1/25 Visual Analytics Research at Tufts Remco Chang Assistant Professor Tufts University.



2/25 Problem Statement
The growth of data is exceeding our ability to analyze it.
The amount of digital information generated in the years 2002, 2006, and 2010:
– 2002: 22 EB (exabytes, 10^18 bytes)
– 2006: 161 EB
– 2010: 988 EB (almost 1 ZB)
1: Data courtesy of Dr. Joseph Kielman, DHS
2: Image courtesy of Dr. Maria Zemankova, NSF
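The three figures above imply a steep growth rate. A minimal sketch of the arithmetic, assuming constant compound growth between the quoted years (the slide states only the three totals, so the annual rate is an illustration, not a figure from the talk):

```python
# Totals quoted on the slide, in exabytes (EB).
data_eb = {2002: 22, 2006: 161, 2010: 988}


def growth_factor(start_year, end_year):
    """How many times larger the end-year total is than the start-year total."""
    return data_eb[end_year] / data_eb[start_year]


def annual_growth_rate(start_year, end_year):
    """Compound annual growth rate, ASSUMING smooth growth between the years."""
    years = end_year - start_year
    return growth_factor(start_year, end_year) ** (1 / years) - 1


overall = growth_factor(2002, 2010)      # roughly 45x over eight years
yearly = annual_growth_rate(2002, 2010)  # roughly 0.6, i.e. about 60% per year
print(f"{overall:.1f}x overall, {yearly:.0%} per year")
```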

3/25 Problem Statement
The data is often complex, ambiguous, and noisy, so analyzing it requires human understanding.
– About 2 GB of digital information is being produced per person per year
– 95% of the Digital Universe’s information is unstructured
1: Data courtesy of Dr. Joseph Kielman, DHS
2: Image courtesy of Dr. Maria Zemankova, NSF

4/25 Example: What Does Fraud Look Like?
Financial institutions like Bank of America have legal responsibilities to report all suspicious activities.
Data size: approximately 200,000 transactions per day (73 million transactions per year)
Problems:
– Automated approaches can only detect known patterns
– Bad guys are smart: patterns are constantly changing
– No single transaction appears fraudulent
– Few experts: fraud detection is considered an “art”
– Data is messy: the lack of international standards results in ambiguous data
Current methods:
– 10 analysts monitoring and analyzing all transactions
– Using SQL queries and spreadsheet-like interfaces
– Limited to a two-week time scale
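The workload figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 8-hour working day is an assumption added for illustration and does not come from the slide.

```python
TRANSACTIONS_PER_DAY = 200_000  # from the slide
ANALYSTS = 10                   # from the slide
WORKDAY_SECONDS = 8 * 60 * 60   # ASSUMPTION: an 8-hour working day

per_year = TRANSACTIONS_PER_DAY * 365                    # 73,000,000 per year
per_analyst_per_day = TRANSACTIONS_PER_DAY // ANALYSTS   # 20,000 per analyst per day
seconds_per_transaction = WORKDAY_SECONDS / per_analyst_per_day

print(per_year, per_analyst_per_day, seconds_per_transaction)
```

Under that assumption each analyst would have under two seconds per transaction, which is consistent with the slide's point that manual review alone does not scale.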

5/25 WireVis: Financial Fraud Analysis
In collaboration with Bank of America:
– Looks for suspicious wire transactions
– Currently beta-deployed at WireWatch
– Visualizes 7 million transactions over 1 year
Uses interaction to coordinate four perspectives:
– Keywords to Accounts
– Keywords to Keywords
– Keywords/Accounts over Time
– Account similarities (search by example)
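The slide does not say how WireVis computes account similarity. As a hedged illustration of the general idea behind "search by example", the sketch below represents each account as a keyword-count vector and ranks other accounts by cosine similarity. The account names, keywords, and the choice of cosine similarity are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse keyword-count dicts."""
    dot = sum(count * b.get(keyword, 0) for keyword, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_by_example(example_id, accounts, top_k=3):
    """Return the ids of the top_k accounts most similar to the example account."""
    example = accounts[example_id]
    scored = [
        (cosine_similarity(example, vector), account_id)
        for account_id, vector in accounts.items()
        if account_id != example_id
    ]
    # Highest similarity first; ties broken by account id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [account_id for _, account_id in scored[:top_k]]


# Hypothetical accounts with keyword counts taken from their transactions.
accounts = {
    "acct_A": {"import": 5, "electronics": 3},
    "acct_B": {"import": 4, "electronics": 2, "textiles": 1},
    "acct_C": {"charity": 6, "donation": 2},
    "acct_D": {"import": 1, "charity": 1},
}
similar = search_by_example("acct_A", accounts, top_k=2)
print(similar)
```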

6/25 WireVis: Financial Fraud Analysis
– Heatmap View (Accounts to Keywords Relationship)
– Strings and Beads (Relationships over Time)
– Search by Example (Find Similar Accounts)
– Keyword Network (Keyword Relationships)
R. Chang et al., Scalable and interactive visual analysis of financial wire transactions for fraud detection. Information Visualization, 2008.
R. Chang et al., WireVis: Visualization of categorical, time-varying data from financial transactions. IEEE VAST, 2007.

7/25 What is Visual Analytics?
Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces [Thomas & Cook 2005].
Since 2004, the field has grown significantly. Aside from tens to hundreds of domestic and international partners, it now has an IEEE conference (IEEE VAST), an NSF program (FODAVA), and a forthcoming IEEE Transactions journal.

8/25 Individually Not Unique
– Analytical Reasoning and Interaction: Interaction Design, Cognitive Psychology, Intelligence Analysis, etc.
– Visual Representation: InfoVis, SciVis, Graphics, etc.
– Production, Presentation, Dissemination: Tech Transfer, Report Generation, etc.
– Data Representation, Transformation: Data Mining, Machine Learning, Databases, Information Retrieval, etc.
– Validation and Evaluation: Quality Assurance, User studies (HCI), etc.

9/25 In Combinations of 2 or 3…
[Diagram: Analytical Reasoning and Interaction; Visual Representation; Production, Presentation, Dissemination; Data Representation, Transformation; Validation and Evaluation]
– Data Mining, Machine Learning, Databases, Information Retrieval, etc.
– InfoVis, SciVis, Graphics, etc.

10/25 In Combinations of 2 or 3…
[Diagram: Analytical Reasoning and Interaction; Visual Representation; Production, Presentation, Dissemination; Data Representation, Transformation; Validation and Evaluation]
– Interaction Design, Cognitive Psychology, Intelligence Analysis, etc.
– Tech Transfer, Report Generation, etc.

11/25 Extending Visual Analytics Principles
Global Terrorism Database
– Application of the investigative 5 W’s
Bridge Maintenance
– Exploring subjective inspection reports
Biomechanical Motion
– Interactive motion comparison methods
[Figure panels: Where, When, Who, What, Original Data, Evidence Box]
R. Chang et al., Investigative Visual Analysis of Global Terrorism, Journal of Computer Graphics Forum, 2008.

12/25 Extending Visual Analytics Principles
Global Terrorism Database
– Application of the investigative 5 W’s
Bridge Maintenance
– Exploring subjective inspection reports
Biomechanical Motion
– Interactive motion comparison methods
R. Chang et al., An Interactive Visual Analytics System for Bridge Management, Journal of Computer Graphics Forum, To Appear.

13/25 Extending Visual Analytics Principles
Global Terrorism Database
– Application of the investigative 5 W’s
Bridge Maintenance
– Exploring subjective inspection reports
Biomechanical Motion
– Interactive motion comparison methods
R. Chang et al., Interactive Coordinated Multiple-View Visualization of Biomechanical Motion Data, IEEE Vis (TVCG) 2009.

14/25 Human + Computer: A Mixed-Initiative Perspective
So far, our approach is mostly user-driven.
Human vs. Artificial Intelligence: Garry Kasparov vs. Deep Blue (1997)
– Computer takes a “brute force” approach without analysis
– “As for how many moves ahead a grandmaster sees,” Kasparov concludes: “Just one, the best one”
Artificial Intelligence vs. Augmented Intelligence: Hydra vs. Cyborgs (1998)
– Grandmaster + 1 computer > Hydra (equiv. of Deep Blue)
– Amateur + 3 computers > Grandmaster + 1 computer [1]
How to systematically repeat the success?
– Unsupervised machine learning + User
– User’s interactions with the computer
1. ComputerTranslationHuman

15/25 Examples of Human + Computer Computing
CAPTCHA
– reCAPTCHA
– General crowd-sourcing
Adaptive / Intelligent User Interfaces (IUI)
User-assisted clustering / searching

16/25 Simple Example: Distance Function

17/25 Application 1: Find Important Features
– Data set: X, 178x13, 3 classes
– Add 10 random-number columns as extra features

18/25 1st Step: Success
Trying to separate the circled green dots from all blue dots

19/25 Result
Recall the structure of the data set:
– Original Wine dataset: each instance has 13 feature values
– 10 randomly generated feature values for every instance
Weight vector:
– Randomly generated features get low weights
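The slides do not spell out how the weight vector is learned. The sketch below is a stand-in that shows the general idea: a weighted Euclidean distance, plus a simple per-feature score (squared difference of group means over summed group variances) that gives low weight to features that do not help separate two user-selected groups. The scoring heuristic, the synthetic data, and all function names are assumptions for illustration; the talk uses the Wine dataset, which is not loaded here.

```python
import math
import random


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (x_j - y_j)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def feature_weights(group_a, group_b):
    """ASSUMED heuristic: score each feature by between-group separation
    relative to within-group spread, then normalize the scores to sum to 1."""
    scores = []
    for j in range(len(group_a[0])):
        col_a = [row[j] for row in group_a]
        col_b = [row[j] for row in group_b]
        between = (mean(col_a) - mean(col_b)) ** 2
        within = variance(col_a) + variance(col_b) + 1e-12  # avoid division by zero
        scores.append(between / within)
    total = sum(scores)
    return [s / total for s in scores]


random.seed(0)


def make_point(center):
    """Two informative features around `center`, then three pure-noise features."""
    informative = [random.gauss(center, 1.0) for _ in range(2)]
    noise = [random.random() for _ in range(3)]
    return informative + noise


group_a = [make_point(0.0) for _ in range(50)]  # e.g. the "green" points
group_b = [make_point(5.0) for _ in range(50)]  # e.g. the "blue" points
weights = feature_weights(group_a, group_b)
print([round(w, 3) for w in weights])
```

In this synthetic setup the three noise columns end up with much smaller weights than the two informative ones, mirroring the result reported on the slide.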

20/25 Visual Analytics for Political Science

21/25 Aggregate Temporal Graph
– 1000 simulations
– 60 time steps in each simulation (time step == a node, edge == transition)
– Time steps are merged if two states are the same
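A minimal sketch of the construction described above, assuming that each simulation is a sequence of hashable states and that identical states are merged into one node regardless of which run or time step they come from. The transition counts on the edges and the toy state names are additions for illustration, not details from the slide.

```python
from collections import defaultdict


def build_aggregate_temporal_graph(simulations):
    """Merge identical states across runs into nodes; count transitions as edges."""
    nodes = set()
    edges = defaultdict(int)
    for run in simulations:
        nodes.update(run)
        # Each consecutive pair of time steps contributes one transition.
        for current, following in zip(run, run[1:]):
            edges[(current, following)] += 1
    return nodes, dict(edges)


# Three tiny hypothetical runs (the talk uses 1000 runs of 60 time steps each).
simulations = [
    ["s0", "s1", "s2", "s4"],
    ["s0", "s1", "s3", "s4"],
    ["s0", "s2", "s4"],
]
nodes, edges = build_aggregate_temporal_graph(simulations)
print(len(nodes), "nodes;", len(edges), "edges")
```

Here eleven time steps collapse into five nodes, which is the point of the aggregation: shared states appear once, and the edge counts show how often each transition occurred.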

22/25 Aggregate Temporal Graph

23/25 Gateways and Terminals
Each of the yellow vertices is a Gateway to the vertex set of {A}. That is, every maximal path leaving a yellow vertex eventually passes through A.
Vertex G is a Gateway to each of the yellow vertices, or Terminals. That is, every maximal path leaving G passes eventually through each of the yellow vertices.
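The gateway definition can be checked mechanically. The sketch below assumes the graph is acyclic, so every maximal path ends at a vertex with no outgoing edges, and it tests the first form of the definition: every maximal path leaving the start vertex passes through at least one vertex of the target set. The function name and the small example graph are hypothetical.

```python
def is_gateway(graph, start, targets):
    """True if every maximal path from `start` hits a vertex in `targets`.

    ASSUMES `graph` (a dict of vertex -> list of successors) is acyclic.
    A start vertex that is itself a target counts trivially as a gateway.
    """
    targets = set(targets)
    stack = [start]
    seen = set()
    while stack:
        vertex = stack.pop()
        if vertex in targets or vertex in seen:
            continue  # this branch is already covered by a target, or explored
        seen.add(vertex)
        successors = graph.get(vertex, [])
        if not successors:
            return False  # reached the end of a path without meeting a target
        stack.extend(successors)
    return True


# Hypothetical graph: G -> Y1, Y2 -> A -> T, and B can bypass A.
graph = {
    "G": ["Y1", "Y2"],
    "Y1": ["A"],
    "Y2": ["A"],
    "A": ["T"],
    "B": ["A", "T"],
    "T": [],
}
print(is_gateway(graph, "Y1", {"A"}), is_gateway(graph, "B", {"A"}))
```

With cycles, an infinite path could avoid the targets forever, so a general version would also need to detect cycles reachable without passing through a target.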

24/25 Applications of Aggregate Temporal Graphs
A generalizable representation of problems involving parameter spaces that are too large to explore as a whole, but which are composed of related individual parts that can be examined independently.
Collaborative Analysis
– Each analyst’s trail is a simulation
– Each configuration state is a node
Web Analytics
– Each visit is a simulation
– Each configuration of a page is a node

25/25 Conclusion
[Diagram: Analytical Reasoning and Interaction; Visual Representation; Production, Presentation, Dissemination; Data Representation, Transformation; Validation and Evaluation]
Visual Analytics is a growing new area that is looking to address some pressing needs:
– Too much (messy) data, too little time
By combining strengths and findings in existing disciplines, we have demonstrated that:
– There are some great benefits
– But there are also some difficult challenges

26/25 Questions? Thank you!

27/25 Backup Slides

28/25 (2) Investigative GTD
[Figure panels: Where, When, Who, What, Original Data, Evidence Box]
R. Chang et al., Investigative Visual Analysis of Global Terrorism, Journal of Computer Graphics Forum (Eurovis), 2008.

29/25 (2) Investigative GTD: Revealing Global Strategy
WHY?
This group’s attacks are not bounded by geo-locations but instead by religious beliefs. Its attack patterns changed with its developments.

30/25 (2) Investigative GTD: Discovering Unexpected Temporal Pattern
Domestic Group
A geographically bounded entity in the Philippines. The ThemeRiver shows its rise and fall as an entity and its modus operandi.

31/25 What is in a User’s Interactions?
Types of Human-Visualization Interactions:
– Word editing (input heavy, little output)
– Browsing, watching a movie (output heavy, little input)
– Visual Analysis (closer to 50-50)
[Diagram: Visualization to Human: output, images (monitor); Human to Visualization: input, keyboard, mouse, etc.]

32/25 Discussion
What interactivity is not good for:
– Presentation
– YMMV = “your mileage may vary”
  Reproducibility: users behave differently each time.
  Evaluation is difficult due to opportunistic discoveries.
– Often sacrifices accuracy
  iPCA: SVD takes time on large datasets; use iterative approximation algorithms such as onlineSVD.
  WireVis: clustering of large datasets is slow; either pre-compute or use simpler “binning” methods.
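To illustrate the accuracy-for-speed trade-off mentioned for WireVis, here is a minimal sketch of equal-width binning as a cheap substitute for clustering. It makes a single pass over the data and needs no pairwise distances, at the cost of ignoring the data's actual shape. The slide does not describe the binning WireVis uses, so the function, the bin count, and the sample amounts are all assumptions.

```python
from collections import defaultdict


def bin_values(values, n_bins):
    """Group values into n_bins equal-width bins between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all values identical: everything falls into bin 0
    bins = defaultdict(list)
    for value in values:
        # Clamp so that the maximum value lands in the last bin.
        index = min(int((value - lo) / width), n_bins - 1)
        bins[index].append(value)
    return dict(bins)


# Hypothetical transaction amounts.
amounts = [10, 12, 15, 480, 500, 990, 1000]
groups = bin_values(amounts, n_bins=4)
print(groups)
```

Empty bins are simply absent from the result, which is one visible way binning differs from clustering: group boundaries are fixed in advance rather than fitted to the data.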

33/25 Discussion
Interestingly:
– It doesn’t save you time…
– And it doesn’t make a user more accurate in performing a task.
However, there is empirical evidence that with interactivity:
– Users are more engaged (don’t give up)
– Users prefer these systems over static (query-based) systems
– Users have a faster learning curve
We need better measurements to determine the “benefits of interactivity”.