WireVis Visualization of Categorical, Time-Varying Data From Financial Transactions Remco Chang, Mohammad Ghoniem, Robert Kosara, Bill Ribarsky, Jing Yang,

Slides:



Advertisements
Similar presentations
Developing a Visual Analytics Approach to Analytic Problem- Solving William Ribarsky UNC Charlotte.
Advertisements

OptiShip ® Multi-carrier Shipping System. OptiShip ® customers save on average 13.6% of parcel shipping costs… OptiShip ® is a comprehensive system that.
Chapter 10: Designing Databases
Data Mining Methodology 1. Why have a Methodology  Don’t want to learn things that aren’t true May not represent any underlying reality ○ Spurious correlation.
How-To & Search Strategies April 16 th, Contents Using Grant Forward Search Results Filtering your Search – Keywords – Categories – Sponsors – Deadlines.
VALTChessVA IntroAppsWrap-up 1/25 User-Centric Visual Analytics Remco Chang Tufts University Department of Computer Science.
Dist FuncIntroVAAppsATGWrap-up 1/25 Visual Analytics Research at Tufts Remco Chang Assistant Professor Tufts University.
Online Banking Fraud Prevention Recommendations and Best Practices This document provides you with fraud prevention best practices that every employee.
6 th Annual Focus Users’ Conference 6 th Annual Focus Users’ Conference Accounts Receivable Presented by: Robert Myers Presented by: Robert Myers.
Data Analysis in Excel. Importance of Data Analysis Tracking and analyzing data are increasingly important in business, medicine, sports, politics, and.
1.7.6.G1 © Family Economics & Financial Education –March 2008 – Financial Institutions – Online Banking Funded by a grant from Take Charge America, Inc.
Search Engines and Information Retrieval
Research to Reality William Ribarsky Remco Chang University of North Carolina at Charlotte.
Chapter 12 - Forecasting Forecasting is important in the business decision-making process in which a current choice or decision has future implications:
1 / 19 Perspective on Visualizing Social Sciences Remco Chang Charlotte Visualization Center UNC Charlotte.
LSP 121 Access Forms, Reports, and Switchboard. Overview You’ve learned several of the basic nuts-and-bolts of creating a database, entering data, and.
Mgt 240 Lecture MS Excel and Access: Introduction to Databases September 23, 2004.
Data Mining.
Chapter 4: Database Management. Databases Before the Use of Computers Data kept in books, ledgers, card files, folders, and file cabinets Long response.
Project Update: Law Enforcement Resource Allocation (LERA) Visualization System Michael Welsman-Dinelle April Webster.
TimeCleanser: A Visual Analytics Approach for Data Cleansing of Time-Oriented Data Theresia Gschwandtner, Wolfgang Aigner, Silvia Miksch, Johannes Gärtner,
BUSINESS DRIVEN TECHNOLOGY
Text Mining: Finding Nuggets in Mountains of Textual Data Jochen Dijrre, Peter Gerstl, Roland Seiffert Presented by Huimin Ye.
Text Mining: Finding Nuggets in Mountains of Textual Data Jochen Dijrre, Peter Gerstl, Roland Seiffert Presented by Drew DeHaas.
Access 2007 ® Use Databases How can Access help you to find and use information?
1 Raymond Doray Conflicts between the new Canadian Money Laundering Act and the rules of professional conduct and ethics September 13, 2002.
Page Up or Down to navigate through the program.
Chapter Eight Database Applications and Implications.
Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures Recurrent Patterns.
Chapter 11 Databases. 11 Chapter 11: Databases2 Chapter Contents  Section A: File and Database Concepts  Section B: Data Management Tools  Section.
SizeIntroDefinitionComplexityTuftsWrap-up 1/54 Big Data Visual Analytics: Challenges and Opportunities Remco Chang Tufts University.
Avalanche Internet Data Management System. Presentation plan 1. The problem to be solved 2. Description of the software needed 3. The solution 4. Avalanche.
L.E.A. Data Technologies L.E.A. Data Technologies Introduction.
CS 474 Database Design and Application Terminology Jan 11, 2000.
1 Xiaoyu Wang UNC Charlotte Erin Miller START Center, U. Maryland Kathleen Smarick START Center, U Maryland William Ribarsky UNC Charlotte Remco Chang.
VISUAL ANALYTICS: VISUAL EXPLORATION, ANALYSIS, AND PRESENTATION OF LARGE COMPLEX DATA Remco Chang, PhD (Charlotte Visualization Center) (Tufts University)
Exploring Text: Zipf’s Law and Heaps’ Law. (a) (b) (a) Distribution of sorted word frequencies (Zipf’s law) (b) Distribution of size of the vocabulary.
VALTVA IntroAppsWrap-up 1/34 User-Centric Visual Analytics Remco Chang Tufts University Department of Computer Science.
Database What is a database? A database is a collection of information that is typically organized so that it can easily be storing, managing and retrieving.
1 / 17 Visualization of GTD and Multimedia Remco Chang Charlotte Visualization Center UNC Charlotte.
Instant Information Access With Magnify Search Dr. Rado Kotorov Technical Director Strategic Product Mgt.
Robin Mullinix Systems Analyst GeorgiaFIRST Financials PeopleSoft Query: The Next Step.
Managing the Impacts of Change on Archiving Research Data A Presentation for “International Workshop on Strategies for Preservation of and Open Access.
INFO1408 Database Design Concepts Week 15: Introduction to Database Management Systems.
Capabilities of Software. Object Linking & Embedding (OLE) OLE allows information to be shared between different programs For example, a spreadsheet created.
WIRED Week 3 Syllabus Update (next week) Readings Overview - Quick Review of Last Week’s IR Models (if time) - Evaluating IR Systems - Understanding Queries.
CSC Intro. to Computing Lecture 10: Databases.
WebFOCUS Magnify: Search Based Applications Dr. Rado Kotorov Technical Director of Strategic Product Management.
Data Structures and Algorithms Dr. Tehseen Zia Assistant Professor Dept. Computer Science and IT University of Sargodha Lecture 1.
ITAAT DHE Solutions Ltd UK Moodle databses David Evans. 31 st March 2011.
Gold – Crystal Reports Introductory Course Cortex User Group Meeting New Orleans – 2011.
1 Visual Analytics Techniques that Enable Knowledge Discovery: Detect the Expected and Discover the Unexpected Jim J. Thomas Director, National Visualization.
CISC 849 : Applications in Fintech Financial Visualization Leonardo De La Rosa Institute for Financial Services Analytics University of Delaware.
Evaluating the Relationships between User Interaction and Financial Visual Analysis Dong Hyun Jeong, Wenwen Dou, Felesia Stukes, William Ribarsky, Heather.
Jianping Fan Department of Computer Science University of North Carolina at Charlotte Charlotte, NC Relevance Feedback for Image Retrieval.
IntroGoalCrowdPredictionWrap-up 1/26 Learning Debugging and Hacking the User Remco Chang Assistant Professor Tufts University.
21 Copyright © 2008, Oracle. All rights reserved. Enabling Usage Tracking.
The Big Picture Things to think about What different ways are there to collect information automatically? What are the advantages and disadvantages of.
If you have a transaction processing system, John Meisenbacher
What is ClassiFire ® ? ClassiFire is a patented Artificial Intelligence system used in the Stratos-HSSD ® range of Aspirating Smoke Detectors. ClassiFire.
Using Visual Basic.NET Programming Tools in the AIS Course Training Session Brian R. Kovar Kansas State University 7 th AIS Educator Annual Meeting June.
Opinion spam and Analysis 소프트웨어공학 연구실 G 최효린 1 / 35.
Thank you/Appreciate time Intro me- Manage channel last 2 years
Advantages of ICT over Manual Methods of Processing Data
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Big Data Visual Analytics: Challenges and Opportunities
Welcome.
Take notes on your own paper
PolyAnalyst™ text mining tool Allstate Insurance example
Data Analytics Case Study
Presentation transcript:

WireVis Visualization of Categorical, Time-Varying Data From Financial Transactions Remco Chang, Mohammad Ghoniem, Robert Kosara, Bill Ribarsky, Jing Yang, Evan Suma, Caroline Ziemkiewicz UNC Charlotte Daniel Kern, Agus Sudjianto Bank of America

2/20 WireVis: Multi-National Collaboration Canada Caroline Ziemkiewicz USA Bill Ribarsky Evan Suma Daniel Kern (BofA) Egypt Mohammad Ghoniem Austria Robert Kosara China Jing Yang Taiwan Remco Chang Indonesia Agus Sudjianto (BofA)

3/20 WireVis: Disclaimer  Highly sensitive data Involving individuals’ financial records  All names and specific strategies used by Bank of America have been removed from this presentation  Informative relating to Bank of America have been obscured For example, instead of saying there are 215 transactions, I might say there are between transactions.

4/20 WireVis: Why Fraud Detection?  Financial Institutions like Bank of America have legal responsibilities to the federal government to report all suspicious activities (money laundering, terrorist support, etc) Monetary and operational penalties including the possibility of being shut down  Advantages? Other than consumer trust, there is little to gain from fraud detection Great for us! Because there is no competitive advantage, the institutions are willing to work together Everyone wants to do “best practice” Viscenter Symposium

5/20 WireVis: Challenges to Financial Fraud Detection  Bad guys are smart Automatic detection (black box) approach is reactive to already known patterns Usually, bad guys are one step ahead  Evaluation is difficult Financial Institutions do not perform law enforcement Suspicious reports are filed Turn around time on accuracy of reports could be long Difficult to obtain “Ground Truth” What is the percentage of fraudulent activities that are actually found and reported?

6/20 WireVis: Challenges with Wire Fraud Detection  Size More than 200,000 transactions per day  “No a transaction by itself is suspicious”  Lack of International Wire Standard Loosely structured data with inherent ambiguity Indonesia Charlotte, NC Singapore London

7/20 WireVis: Challenges with Wire Fraud Detection  No Standard Form… When a wire leaves Bank of America in Charlotte… The recipient can appear as if receiving at London, Indonesia or Singapore  Vice versa, if receiving from Indonesia to Charlotte The sender can appear as if originating from London, Singapore, or Indonesia Indonesia Charlotte, NC Singapore London

8/20 WireVis: Using Keywords  Keywords… Words that are used to filter all transactions Only transactions containing keywords are flagged Highly secretive Typically include Geographical information (country, city names) Business types Specific goods and services Etc Updated based on intelligence reports Ranges from words Could reduce the number of transactions by up to 90% Most importantly, give quantifiable meanings (labels) to each transaction

9/20 WireVis: Current Practice at Bank of America  Database Querying Experts filter the transactions by keywords, amounts, date, etc. Results are displayed in a spreadsheet.  Problems Cannot see more than a week or two of transactions Difficult to see temporal patterns It is difficult to be exploratory using a querying system

10/20 WireVis: System Overview Heatmap View (Accounts to Keywords Relationship) Strings and Beads (Relationships over Time) Search by Example (Find Similar Accounts) Keyword Network (Keyword Relationships)

11/20 WireVis: Heatmap View  List of Keywords  Sorted by frequency from high to low (left to right)  Hierarchical Clusters of Accounts  Sorted by activities from big companies to individuals (top to bottom)  Fast “binning” that takes O(3n)  Number of occurrences of keywords  Light color indicates few occurrences

12/20 WireVis: Strings and Beads  Time  Y-axis can be amounts, number of transactions, etc.  Fixed or logarithmic scale  Each string corresponds to a cluster of accounts in the Heatmap view  Each bead represents a day

13/20 WireVis: Keyword Network  Each dot is a keyword  Position of the keyword is based on their relationships Keywords close to each other appear together more frequently Using a spring network, keywords in the center are the most frequently occurring keyword  Link between keywords denote co-occurrence

14/20 WireVis: Search By Example  Target Account  Histogram depicts the occurrences of keywords  User interactive selects features within the histogram used in comparison  Similarity threshold slider  Accounts that are within the similarity threshold appear ranked (most similar on top)

15/20 WireVis: Case Study  Evaluation performed with James Price, lead analyst of WireWatch of Bank of America  Dataset has been sanitized and down sampled  Demo  This system is generalizable to visual analysis of transactional data

16/20 WireVis: Since March 31 st (Vis Deadline)…  Scalability We’re now connected to the database at Bank of America with millions of records over the course of a rolling year (13 months) Connecting to a database makes interactive visualization tricky  Unexpected Results “go to where the data is” – operations relating to the data are pushed onto the database (e.g, clustering) Database Raw Data Stored Procedure Temp Tables SQL JDBC WireVis Client

17/20 WireVis: Since March 31 st …  Performance Measurements Data-driven operations such as re-clustering, drilldown, transaction search by keywords require worst case of 1-2 minutes. All other interactions remain real time No pre-computation / caching Single CPU desktop computer  WireVis is in deployment on James Price’s computer at WireWatch for testing and evaluation

18/20 WireVis: Future Work  Combine Visualization with Querying  Use text analysis (like IN-SPIRE) to automatically identify keywords  Relationships between Accounts Seeing who send money to whom (over time) is important  Evaluation Working with analysts, try to understand how they use the system and how to better their workflow  Tracking and Reporting With tracking, we can make the analysis results “repeatable”, “sharable”, and “accountable”

19/20 WireVis: Lessons Learned  Financial Visual Analysis is Necessary! Financial institutions have more data than they can comprehend. Using visualization to organize the data is a promising future direction.  Working with Financial Institutions Takes Patience Dealing with sensitive data means more precautions are needed. For good reasons, financial institutions are slow to change. Gaining trust and credibility takes time Lawyers, lawyers, lawyers This paper has been nearly 2 years in the making…  Collaborate with the Financial Institution Working with a data and systems expert at the institution makes development much more simple.

20/20 Questions and Comments? 20 Thank you!

21/20 On a more personal note…  Just found out before the session that my brother and his wife just had their second daughter named Nola. Both mother and daughter are well!

22/20 WireVis: Backup Slides

23/20 WireVis: Design Principles  Interactivity Visual analysis requires interacting with the data to see patterns and trends. WireVis is built using OpenGL to maximize interaction.  Filtering With millions of transactions, the ability to filter out unwanted information is crucial.  Overview and Detail Following Schneiderman’s mantra, the user needs to see overview and be able to drill down into detailed information.  Multiple Coordinated Views No single information visualization tool can depict all aspects of a complex dataset, using correlated, coordinated views can piece together the big picture.

24/20 WireVis: System Demo  Interactivity  Filtering  Overview and Detail  Multiple Coordinated Views  Sample Analysis In real-life scenarios, often the strongest clues are based on keyword relationships – the semantic understanding of keywords’ co-occurrences. E.g. why does a company supposed dealing in goods ‘A’ sending money to a company that has to do with goods ‘B’?