1 Dimensions / Depth James Slack CPSC 533C February 10, 2003.


2 Overview
Linear data sources
Information processing
Aggregate visualization methods
Embedding semantics of information
Repetition and other patterns
Examples in InfoVis

3 Linear Data Sources
Univariate data arranged spatially or temporally
Complexity issues:
–Patterns in text are cognitively hard to find
–Text input could be viewed spatially
–Cognition from visual abstractions of text is becoming more relevant

4 Information Processing
Why do we need information?
Technical aspects
Characterizing text by language semantics
Browsing versus querying
Interfacing with text visualization

5 Considering Visualization?
The technical considerations:
1. Define what needs to be visualized
2. Transform input; must be possible!
3. Analyze to suit the input
4. Technique & derivative data storage

6 Text Features
3 general types of features
1. Frequency based
2. Statistics on words or other tokens
3. Semantic features

7 Text Features
Frequency based text features:
–Statistics on presence and count of unique words
–Feature sets are word statistics
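A minimal sketch of a frequency-based feature set, assuming a simple regular-expression tokenizer and lowercasing (neither is specified on the slide). The function name `word_frequencies` is hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter mapping each lowercased word to its count.

    Assumption: a word is a run of letters or apostrophes; the slide
    does not say how the text is tokenized.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


features = word_frequencies("The cat sat on the mat. The mat was flat.")
# Presence is `word in features`; count is `features[word]`.
print(features.most_common(2))
```

The resulting Counter is the "word statistics" feature set: one entry per unique word, with its count.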

8 Text Features
Statistics on words or other tokens
–Occurrence, frequency, and context of individual tokens define feature set
–Sets can be explicitly specified or deterministically partitioned

9 Text Features
Semantic features
–Natural groups of similar topics
–Knowledge of language
–Words have semantic meaning

10 Characterizing Text
Feature sets of text
–A shorthand description of the original
–Reduction in length, not in meaning
–Semantics are often important, although not always necessary
–Represented for efficient computation

11 Browsing vs. Querying
Querying is more precise
–Specific results discarded or retained
–The most specific features are important
–Popularity of query is relative, closeness ratio compares potential matches
–Similarity of results appears
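A hedged sketch of query matching: documents are scored against the query and either retained or discarded. Cosine similarity over word counts is used here as an assumed stand-in for the "closeness ratio", which the slide does not define; `cosine` and `rank` are hypothetical names.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return document names ordered by similarity; zero scores are discarded."""
    q = Counter(query.lower().split())
    scored = [
        (cosine(q, Counter(text.lower().split())), name)
        for name, text in documents.items()
    ]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]


documents = {
    "d1": "arc diagrams show repetition",
    "d2": "calendar view of time series",
    "d3": "repetition in music and text",
}
results = rank("repetition arc", documents)
print(results)
```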

12 Browsing vs. Querying
Browsing is more general
–Choose similarity over exactness
–The most common features are important
–Clustering is a natural partition
–Similarity of clusters appears
–Analytical information processing

13 Interfacing With Visualizations
Spatial representations enhance cognition
Clusters can be viewed with browsing
A global overview of data is important
Techniques to visit clusters
Too many data points?
–Display cluster centroids instead
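Displaying centroids instead of every data point can be sketched as follows, assuming the points are already assigned to clusters and that a centroid is the coordinate-wise mean. The cluster names and coordinates are made up for illustration.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


# Hypothetical 2D layout positions, grouped by cluster.
clusters = {
    "a": [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)],
    "b": [(10.0, 10.0), (12.0, 14.0)],
}

# One display point per cluster rather than one per document.
centroids = {name: centroid(pts) for name, pts in clusters.items()}
print(centroids)
```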

14 Assisting Perception
Interface should provide:
1. Preconscious visual form for information
2. Interactions to sustain, enrich process of knowledge building
3. Fluid environment for reflective cognition
4. Framework for temporal knowledge building

15 Aggregate Visualization
Information overloads cognitive abilities
Understanding global, not local contexts
Visualize abstract representations of complex underlying structure
What can we gain from global context?

16 Embedding Semantics
Are some visualizations without meaning?
Galaxies, ThemeScapes highlight semantic meaning with relevant labels
Cluster viewer uses calendar to highlight temporal univariate patterns
Dot plots, arc diagrams use connectivity of similar input strings independent of semantics

17 Repetition and Patterns
How can you show something is repeated?
–Place two occurrences close together
–Colour two occurrences similarly
–Connect two occurrences with a line
Each method has merits
–No method works in all cases
–We want to keep spatial/temporal information

18 InfoVis Examples
SPIRE
–Galaxies and ThemeScapes
Calendar Based Visualization
Dot Plots
Arc Diagrams

19 From SPIRE
Spatial Paradigm for Information Retrieval and Exploration
Galaxies cluster docupoints
ThemeScapes model landscape

20 Galaxies
Projection of clustering algorithms into 2D
Galaxies are clusters of related data
Proximity of galaxies is relevant
Designed to add temporal patterns to clustering

21 Galaxies

22 ThemeScape
Abstract 3D landscape of information
Reduce cognitive load using terrain
Elevation, colour encode theme strength redundantly
Landscape metaphor translates well
–Peaks are easy to recognize
–Interesting characteristics include ridges and valleys

23 ThemeScape

24 ThemeScape

25 Calendar Based Visualization
Time is linear, monotonic, scalar
Prediction is a useful side effect of visualizing the past
Time series data is often univariate
Periodic patterns emerge in time series data

26 Calendar Based Visualization
How about using 3 dimensions?
–X-axis: Time of day
–Y-axis: Days of data period
–Z-axis: Univariate data samples
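The three axes above amount to reshaping one long series into a day-by-time grid. A minimal sketch, assuming a fixed number of evenly spaced samples per day; `to_day_grid` is a hypothetical helper.

```python
def to_day_grid(samples, samples_per_day):
    """Split a flat time series into rows of one day each.

    grid[day][slot] is the Z value at Y = day and X = time-of-day slot.
    Assumption: the series holds a whole number of days.
    """
    if samples_per_day <= 0 or len(samples) % samples_per_day != 0:
        raise ValueError("series must contain a whole number of days")
    return [
        samples[start:start + samples_per_day]
        for start in range(0, len(samples), samples_per_day)
    ]


# Two days with three samples per day (made-up values).
grid = to_day_grid([1, 2, 3, 4, 5, 6], samples_per_day=3)
print(grid)
```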

27 Calendar Based Visualization

28 Calendar Based Visualization
Weekly variation obscured by pretty graphics
Where are the trends?
Is colour necessary for this?
Is colour sufficient for this?
Can everything be shown without overload?

29 Calendar Based Visualization
A more natural way: use a calendar
Cluster data into meaningful groups
–Decide what the groups mean later?
1. Simple formulae are sufficient for clustering
2. Use robust statistical techniques
3. Generate binary clustering trees
4. Select desired clusters to visualize
5. Show clusters on calendar layout, simple graphs coloured appropriately
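Step 3 can be illustrated with a small bottom-up clustering of daily patterns into a binary tree. This is a sketch under stated assumptions, not the method from the cited paper: Euclidean distance between mean patterns and a size-weighted mean on merge are choices made here, and the day labels and values are invented.

```python
from itertools import combinations


def distance(a, b):
    """Euclidean distance between two equal-length daily patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def build_tree(days):
    """Merge the two closest clusters until one binary tree remains.

    days maps a label to its list of samples. Leaves are labels,
    internal nodes are (left, right) tuples.
    """
    # Each cluster is (tree, mean pattern, number of days it contains).
    clusters = [(label, list(values), 1) for label, values in days.items()]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: distance(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (tree_a, mean_a, n_a), (tree_b, mean_b, n_b) = clusters[i], clusters[j]
        merged_mean = [
            (x * n_a + y * n_b) / (n_a + n_b) for x, y in zip(mean_a, mean_b)
        ]
        merged = ((tree_a, tree_b), merged_mean, n_a + n_b)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


days = {
    "Mon": [1, 5, 1],
    "Tue": [1, 6, 1],
    "Sat": [0, 1, 0],
    "Sun": [0, 1, 2],
}
tree = build_tree(days)
print(tree)
```

Cutting the tree at a chosen depth yields the clusters to select in step 4; here the top split separates the weekdays from the weekend days.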

30 Calendar Based Visualization

31 Visualizing Structure in Strings
M. Wattenberg: Arc diagrams
Summarize long strings, indicate repetition

32 Dot Plots
Finds structure in string data
Correlation matrix
Diagonal symmetry
Redundant information
Interesting repetitions can be confusing
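A dot plot can be sketched as a square matrix with a mark wherever two positions in the sequence hold the same token. The function names and the text rendering are assumptions for illustration.

```python
def dot_plot(tokens):
    """Return matrix[i][j] == True where tokens[i] equals tokens[j]."""
    return [[a == b for b in tokens] for a in tokens]


def render(matrix):
    """Draw the matrix as text: '#' for a match, '.' otherwise."""
    return "\n".join(
        "".join("#" if cell else "." for cell in row) for row in matrix
    )


matrix = dot_plot("abcab")
print(render(matrix))
```

The main diagonal is always filled and the matrix is symmetric about it, which is the redundancy noted on the slide: each repetition is drawn twice.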

33 Dot Plots

34 Arc Diagrams
Finds structure in string data
Cognitive improvement over dot plots
Adaptable to reduce noise in data
Applications are varied:
–Music
–Text
–Compiled code
–Nucleotide sequences
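A simplified sketch of the idea behind arc diagrams: find repeated substrings and connect each occurrence to the next one. This is not Wattenberg's full algorithm, which selects a reduced set of matching pairs; here the substring length is fixed, overlapping pairs are simply skipped, and `arcs` is a hypothetical name.

```python
from collections import defaultdict


def arcs(sequence, length):
    """Return sorted (start_a, start_b, substring) arcs for a string.

    Each arc joins consecutive, non-overlapping occurrences of the
    same substring of the given length.
    """
    positions = defaultdict(list)
    for i in range(len(sequence) - length + 1):
        positions[sequence[i:i + length]].append(i)

    result = []
    for substring, starts in positions.items():
        for a, b in zip(starts, starts[1:]):
            if b >= a + length:  # skip pairs that overlap each other
                result.append((a, b, substring))
    return sorted(result)


found = arcs("abcXabcYabc", 3)
print(found)
```

Unlike the dot plot, each repetition appears once, as a single arc drawn over the linear sequence, so the original spatial order is preserved.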

35 Arc Diagrams Interactive demonstration: –

36 Alternate Ending
Something went wrong with the demo, so here is a synopsis of arc diagrams

37 Arc Diagrams

38 Arc Diagrams

39 Arc Diagrams

40 Arc Diagrams

41 Paper References
–Wise, J.A.; Thomas, J.J.; Pennock, K.; Lantrip, D.; Pottier, M.; Schur, A.; Crow, V. Visualizing the non-visual: spatial analysis and interaction with information from text documents. Proc InfoVis 1995.
–van Wijk, Jarke J.; van Selow, Edward R. Cluster and Calendar based Visualization of Time Series Data. Proc InfoVis 99.
–Wattenberg, Martin. Arc Diagrams: Visualizing Structure in Strings. Proc InfoVis 2002.