1/41 Visualization and Analysis of Text. Remco Chang, PhD, Assistant Professor, Department of Computer Science, Tufts University. December 17, 2010, Cologne, Germany
2/41 Introduction Information Visualization – Novel visual representations – Storytelling – User-Driven Visual Analysis – Data exploration – Hypothesis generation – Interactive visualization + Computation
3/41 Visualization: Pre-attentive Processing (examples courtesy of Chris Healey)
4/41 Visualization: This is helpful because: – It allows us to process more information quickly – We can see trends and patterns
5/41 Storytelling: US Budget from
6/41 Storytelling: Minard’s Map: Napoleon’s March to Moscow
7/41 Visualization Influences the thought… (images courtesy of Barbara Tversky)
8/41 Visual Encoding affects: – The types of possible operations – The user’s thinking process. Source: Zhang and Norman, “The Representation of Numbers”, Cognition (1995)
9/41 Classifying Numeric Systems
10/41 Example: Arithmetic (slide courtesy of Pat Hanrahan)
11/41 Example: Arithmetic
12/41 Example: Arithmetic
13/41 Example: Arithmetic
14/41 Examples of Text Visualization: Wordle (images courtesy of Many Eyes)
15/41 Examples of Text Visualization: WordTree
16/41 Examples of Text Visualization: WordTree
17/41 Examples of Text Visualization: Phrase Net
18/41 Examples of Text Visualization: Google Autocomplete
19/41 Examples of Text Visualization: Visualizing changes in Wikipedia (images courtesy of Info.fm)
20/41 Examples of Text Visualization: ThemeRiver
21/41 Visual Exploration: Coordinated Multi-Views (CMV). View labels: Where, When, Who, What, Original Data, Evidence Box
22/41 Coordinated Multi-Views: WHY? This group’s attacks are not bounded by geo-location but by religious beliefs. Its attack patterns changed as the group developed.
23/41 LIDAR Linked Feature Space
24/41 LIDAR Change Detection
25/41 Urban Model
26/41 Urban Visualization
27/41 Coordinated Multi-Views: Financial Wire Fraud – With Bank of America – Discover suspicious international wire transactions. Bridge Maintenance – With US DOT – Exploring subjective inspection reports. Biomechanical Motion – With U. Minnesota and Brown – Interactive motion comparison methods
30/41 CMV + Text Analysis
31/41 Parallel Topics. Task: Given the proposals submitted to the National Science Foundation (NSF), identify: – Proposals that are interdisciplinary – Proposals that are potentially transformative – Proposals that are focused
32/41 Parallel Topics. Approach: – Apply topic modeling algorithms to identify latent topics (David Blei et al., “Latent Dirichlet Allocation”, 2003) – Visualize the distribution of proposals across the topics
33/41 Topic Modeling: Given a set of $k$ documents, find $n$ topics – Each document is then described as a weighted mixture $(W_1 \cdot \text{Topic}_1,\ W_2 \cdot \text{Topic}_2,\ W_3 \cdot \text{Topic}_3,\ \dots,\ W_n \cdot \text{Topic}_n)$ with $W_1 + W_2 + W_3 + \dots + W_n = 1$. [Table: rows Document 1 … Document $k$, columns Topic 1 … Topic $n$; surviving example weights include 0.005 and 0.01; a row-sum column (∑) is shown]
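The mixture description on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the talk's implementation: the raw scores are made up, and `normalize_weights` is a hypothetical helper that rescales non-negative per-topic scores so that each document's weights sum to 1.

```python
def normalize_weights(raw_scores):
    """Scale non-negative topic scores so that they sum to 1."""
    total = sum(raw_scores)
    if total <= 0:
        raise ValueError("at least one topic score must be positive")
    return [score / total for score in raw_scores]


# Hypothetical raw topic scores for k = 2 documents over n = 4 topics.
raw = [
    [8.0, 1.0, 0.5, 0.5],
    [2.0, 2.0, 3.0, 3.0],
]

# One row of weights (W_1 ... W_n) per document.
doc_topic = [normalize_weights(row) for row in raw]

# Each row should now sum to 1, up to floating-point rounding.
row_sums = [sum(row) for row in doc_topic]
```

A real topic model such as LDA produces these rows directly; the normalization here only demonstrates the sum-to-one constraint.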
34/41 Topic Modeling: A topic is a combination of keywords
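One way to read "a topic is a combination of keywords" is as a set of keyword weights, from which the heaviest keywords serve as a label. The sketch below assumes that reading; the keyword weights and the helper `top_keywords` are hypothetical and chosen purely for illustration.

```python
def top_keywords(topic_word_weights, count=3):
    """Return the `count` highest-weighted keywords of a topic."""
    ranked = sorted(
        topic_word_weights.items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return [word for word, _ in ranked[:count]]


# Hypothetical topic: keyword -> weight (weights sum to 1).
topic = {
    "education": 0.30,
    "technology": 0.25,
    "interactive": 0.20,
    "environment": 0.15,
    "bridge": 0.10,
}

# Use the three heaviest keywords as a short label for the topic.
label = top_keywords(topic)
```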
35/41 Parallel Topics: Based on “Parallel Coordinates” – Each vertical axis is a topic – Each polyline (a set of connected line segments running horizontally across the axes) is a document
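The parallel-coordinates layout described on this slide can be sketched as a mapping from a document's topic weights to polyline vertices. The function name `polyline`, the axis spacing, and the axis height are assumptions made for this example; the y value is measured upward from the bottom of each axis.

```python
def polyline(weights, axis_spacing=100.0, axis_height=200.0):
    """Return one (x, y) vertex per topic axis for a single document.

    Axis i sits at x = i * axis_spacing; a weight in [0, 1] is mapped
    linearly onto the axis, so y = weight * axis_height.
    """
    return [
        (index * axis_spacing, weight * axis_height)
        for index, weight in enumerate(weights)
    ]


# Hypothetical document with weights over three topics.
doc = [0.5, 0.25, 0.25]

# Connecting these vertices left to right draws the document's line.
points = polyline(doc)
```

Drawing one such polyline per document on shared axes gives the overall view; a spike on a single axis then stands out visually against flat lines.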
36/41 Visual Signatures (panels: Single topic, Bi-topic, No salient topic). We identify different signatures for proposals: – Single Topic: focused research – Bi-Topic: interdisciplinary research – No-Topic: potentially transformative research
37/41 Selecting Single Topic Proposals. [Table: rows Document 1 … Document K, columns Topic 1 … Topic N; surviving example weights include 0.005 and 0.01] Annotations: SD = 0.14, SD = 0.06, Max SD
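The SD annotations on this slide suggest ranking documents by the standard deviation of their topic weights, with the largest SD marking the most focused proposal. The sketch below assumes that interpretation and uses the population standard deviation; the weights and the helper `rank_by_sd` are hypothetical.

```python
from statistics import pstdev


def rank_by_sd(doc_topic):
    """Return (document index, SD) pairs, highest SD first."""
    scored = [
        (index, pstdev(weights))
        for index, weights in enumerate(doc_topic)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic weights; each row sums to 1.
doc_topic = [
    [0.85, 0.05, 0.05, 0.05],  # one dominant topic
    [0.25, 0.25, 0.25, 0.25],  # evenly spread
    [0.45, 0.45, 0.05, 0.05],  # two dominant topics
]

ranking = rank_by_sd(doc_topic)

# The document with the maximum SD is the single-topic candidate.
most_focused = ranking[0][0]
```

A weight vector concentrated on one topic deviates most from its mean, which is why a high SD is a plausible proxy for "focused" here.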
38/41 Selecting Multi-Topic Proposals. Figure labels: education technology; Interactive environment
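The slide does not state how multi-topic proposals are selected. One simple heuristic, offered here only as an assumption, is to call a proposal bi-topic when its second-largest topic weight is still substantial. The helper `dominant_pair` and the `min_share` threshold are hypothetical.

```python
def dominant_pair(weights, min_share=0.3):
    """Return the indices of the two strongest topics, or None.

    A document counts as bi-topic when its second-largest weight
    is at least `min_share`.
    """
    if len(weights) < 2:
        return None
    ranked = sorted(
        range(len(weights)),
        key=lambda index: weights[index],
        reverse=True,
    )
    first, second = ranked[0], ranked[1]
    if weights[second] >= min_share:
        return tuple(sorted((first, second)))
    return None


# Hypothetical topic weights for three proposals.
bi_topic = dominant_pair([0.05, 0.45, 0.05, 0.45])
single_topic = dominant_pair([0.85, 0.05, 0.05, 0.05])
flat = dominant_pair([0.25, 0.25, 0.25, 0.25])
```

Returning the pair of topic indices, rather than a plain boolean, makes it possible to say which two fields a proposal bridges.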
39/41 Selecting No-Topic Proposals
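How no-topic proposals are selected is not spelled out on the slide. As a hedged sketch, a proposal can be treated as having no salient topic when even its largest weight stays below a threshold. This max-weight test is a stand-in heuristic, not necessarily the method used in the talk; `no_salient_topic` and `max_share` are hypothetical names.

```python
def no_salient_topic(weights, max_share=0.4):
    """True when no single topic weight reaches `max_share`."""
    return max(weights) < max_share


# Hypothetical topic weights; each row sums to 1.
proposals = {
    "A": [0.85, 0.05, 0.05, 0.05],
    "B": [0.25, 0.25, 0.25, 0.25],
    "C": [0.30, 0.20, 0.30, 0.20],
}

# Keep only the proposals whose weights are spread across topics.
selected = [
    name
    for name, weights in proposals.items()
    if no_salient_topic(weights)
]
```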
40/41 Recap. Objective: To discover interdisciplinary and potentially innovative research proposals. Parallel Topics: a data-centric approach. Approach: To support interactive selection of proposals based on their number of topics.
41/41 Questions and Comments? Thank you!!