Proceedings of Infoviz’95


Document Visualization
“Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents”
J. A. Wise, J. J. Thomas, K. Pennock, D. Lantrip, M. Pottier, A. Schur, and V. Crow
Proceedings of Infoviz’95
Reviewed by Nada Golmie for CMSC 838S, Fall 1999

Outline
- Document visualization: what, why, how?
- Examples of 1D and 2D visualizations:
  - vector space analysis (Salton, 1995)
  - reduced text + interaction (Eick, 1992)
  - 2D maps of document collections (Lin, 1992)
- 3D visualization: SPIRE (Wise et al., 1995)
- 3D + time: interactive landscapes (Rennison, 1994)
9/19/2018 Document Visualization

Document Visualization
- Document visualization is an important information visualization (IV) application, driven by emerging technology trends:
  - World Wide Web
  - digital libraries
  - communication advances
- Mapping a single text document: understand the content of the document.
- Mapping a collection of documents: discover relationships among documents.

Vector Space Analysis (Salton et al.)
- Supports free-form text queries in information retrieval (IR).
- Text passages are mapped into vectors of terms in a high-dimensional space: $D_i = (d_{i1}, d_{i2}, \ldots, d_{it})$, where $d_{ik}$ is the weight assigned to term $k$ in document $D_i$.
- Given a document $D_i$ and a query $Q_j = (q_{j1}, q_{j2}, \ldots, q_{jt})$, a similarity is computed as: $\mathrm{sim}(D_i, Q_j) = \sum_{k=1}^{t} d_{ik}\, q_{jk}$
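
The similarity computation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Salton's implementation: it assumes documents and queries are sparse term-weight dictionaries, and it assumes the similarity is the inner product of the two vectors divided by their lengths (the cosine form). The names `inner_product`, `cosine_similarity`, and the sample weights are hypothetical.

```python
import math

def inner_product(a, b):
    """Sum of weight products over the terms that both vectors share."""
    return sum(weight * b[term] for term, weight in a.items() if term in b)

def cosine_similarity(doc, query):
    """Inner product of two term-weight vectors, normalized by their lengths."""
    norm = math.sqrt(inner_product(doc, doc) * inner_product(query, query))
    return inner_product(doc, query) / norm if norm else 0.0

# Hypothetical term weights, chosen only for illustration.
documents = {
    "d1": {"text": 0.8, "visualization": 0.6},
    "d2": {"neural": 0.7, "network": 0.7, "map": 0.2},
}
query = {"text": 1.0, "visualization": 1.0}

# Rank documents by decreasing similarity to the query.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(documents[name], query),
    reverse=True,
)
print(ranked)
```

Sparse dictionaries keep the sketch short; a real collection would more likely use a fixed vocabulary and dense or sparse matrix rows.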

Reduced Text + Interaction: SeeSoft (Eick, 1992)
- Reduced representation:
  - lines are displayed as rows, files as columns (max 900 rows per column)
- Colors are used to display statistics:
  - statistics include age, programmer, feature, type of line, and number of times the line was executed
- Direct manipulation techniques:
  - find interesting patterns
  - read the actual code using magnification

[Figure: SeeSoft (Eick, 1992)]

2D Maps (Lin, 1992)
- Framework for information retrieval:
  - maps a high-dimensional document space onto a 2D map.
  - document relationships are explored using visual cues such as dots, links, clusters, and areas.
- Neural network self-organizing learning algorithm based on Kohonen’s feature map:
  - preserves distance relationships between input data.
  - allocates different numbers of nodes to inputs based on their occurrence frequencies.
[Figure: Sitemap]
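
To make the self-organizing step concrete, here is a minimal sketch of a Kohonen-style feature map in plain Python. It is not Lin's implementation: the grid size, the linearly decaying learning rate, the Gaussian neighborhood, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid node whose weight vector is closest to the input x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )

def train_som(inputs, rows, cols, epochs=50, seed=0):
    """Train a rows x cols map on a non-empty list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(inputs[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(inputs)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(inputs, len(inputs)):  # shuffled pass over the data
            frac = step / total_steps
            rate = 0.5 * (1.0 - frac)                       # decaying learning rate
            radius = 0.5 + (max(rows, cols) / 2.0) * (1.0 - frac)
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                influence = math.exp(-grid_d2 / (2.0 * radius * radius))
                for k in range(dim):
                    # Pull the node toward the input, more strongly near the BMU.
                    w[k] += rate * influence * (x[k] - w[k])
            step += 1
    return weights

# Hypothetical document vectors: two about one topic, two about another.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(docs, rows=4, cols=4)
for d in docs:
    print(d, "->", best_matching_unit(som, d))
```

Because neighboring nodes are updated together, inputs that are close in the document space tend to land on nearby grid cells, which is the distance-preserving property the slide refers to.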

Visual Text Analysis: SPIRE
- SPIRE (Spatial Paradigm for Information Retrieval and Exploration) is software that allows users:
  - to explore complex relationships between text documents.
  - to rapidly discover known and hidden information relationships by reading only the pertinent documents rather than wading through large volumes of text.

Applications
- SPIRE was originally developed for the U.S. intelligence community.
- Other potential applications include:
  - environmental assessment
  - market analysis
  - corporations researching competitive products
  - health care providers searching patient records
  - attorneys reading through previous cases

2D Scatterplot: Galaxies
- Galaxies computes word similarities and patterns in documents and then displays the documents on screen as a universe of “docustars”:
  - closely related documents cluster together in a tight group.
  - unrelated documents are separated by large spaces.

[Figure: Galaxies]

3D Landscapes: Themescapes
- Themes within the document space appear on screen as a relief map of natural terrain:
  - mountains indicate where themes are dominant; valleys indicate weak themes.
  - shapes reflect how the thematic information is distributed and related across documents.
- Themes close in content appear close visually, based on the many relationships within the text space.
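
One way to picture how a relief map could arise from document positions is to sum a smooth bump for every document, scaled by how strongly it expresses a theme. The sketch below does this with Gaussian kernels on a regular grid. It is an assumption made for illustration, not the Themescapes algorithm: the function `theme_terrain`, the kernel, the grid size, and the bandwidth are all hypothetical.

```python
import math

def theme_terrain(points, strengths, size=21, bandwidth=0.1):
    """Height grid over the unit square; grid[row][col] is the height at
    x = col / (size - 1), y = row / (size - 1)."""
    grid = [[0.0] * size for _ in range(size)]
    for (px, py), strength in zip(points, strengths):
        for row in range(size):
            for col in range(size):
                gx, gy = col / (size - 1), row / (size - 1)
                d2 = (gx - px) ** 2 + (gy - py) ** 2
                # Each document adds a Gaussian bump scaled by its theme strength.
                grid[row][col] += strength * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return grid

# Hypothetical 2D document positions: three clustered documents and one outlier.
positions = [(0.25, 0.25), (0.25, 0.30), (0.30, 0.25), (0.80, 0.80)]
theme_strength = [1.0, 1.0, 1.0, 1.0]
terrain = theme_terrain(positions, theme_strength)

print(round(terrain[5][5], 2))    # height over the cluster (a "mountain")
print(round(terrain[16][16], 2))  # height over the lone document
```

Where several documents on the same theme sit close together the bumps add up into a mountain, while sparsely populated regions stay near zero and read as valleys.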

[Figure: Themescapes]

Visualization Transformations
- Definition of text: the written form of natural language.
- Text conversion to spatial form: algorithms and processes.
- Meaningful visualizations: mathematical procedures and analytical measures.
- Database management: store and manage text and its derivative forms.

Processing Text Requirements
- Identification and extraction of text features:
  - frequency-based measures of words
  - higher-order statistics taken on words: the occurrence, frequency, and context of individual words are used to characterize defined word classes
  - semantic approaches using natural language understanding
- Efficient and flexible representation of documents in terms of text features.
- Support for information retrieval and visualization.
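
As an example of a frequency-based word measure, the sketch below turns raw text into sparse term-weight vectors. The slide does not say which measure SPIRE uses; TF-IDF is swapped in here as a common choice, and the names `tokenize` and `tf_idf_vectors` as well as the sample sentences are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as words."""
    return re.findall(r"[a-z]+", text.lower())

def tf_idf_vectors(documents):
    """One {term: weight} dictionary per document.

    weight = (term count / document length) * log(N / document frequency)
    """
    term_counts = [Counter(tokenize(doc)) for doc in documents]
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for counts in term_counts for term in counts)
    n_docs = len(documents)
    vectors = []
    for counts in term_counts:
        length = sum(counts.values())
        vectors.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

corpus = ["The cat sat", "The dog sat", "The bird flew"]
for vector in tf_idf_vectors(corpus):
    print(vector)
```

Words that occur in every document, such as "the" in this corpus, receive a weight of zero, so the resulting vectors emphasize the words that distinguish one document from another.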

Visual Output of Text Processing
- Vector representation of each document in a high-dimensional feature space:
  - comparisons, filters, and transformations can be applied
- Projection onto a 2D or 3D visualization:
  - dimensionality reduction
  - scaling
  - documents are clustered in the high-dimensional feature space, and the cluster centroids are fed into layout algorithms (principal component analysis or multidimensional scaling)
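
The projection pipeline on this slide (cluster, take centroids, lay them out) can be sketched as follows. This is an illustration under stated assumptions, not the SPIRE code: cluster labels are taken as given, principal component analysis is the layout algorithm chosen here, and it is computed by power iteration with deflation to avoid external libraries. The names `cluster_centroids`, `dominant_axis`, and `project_2d` are hypothetical.

```python
import math

def cluster_centroids(vectors, labels):
    """Mean vector of each cluster, in order of first appearance of its label."""
    sums, counts = {}, {}
    for vector, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vector))
        for k, value in enumerate(vector):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return [[v / counts[label] for v in acc] for label, acc in sums.items()]

def dominant_axis(matrix, iterations=200):
    """Leading eigenvector of a symmetric matrix, found by power iteration."""
    dim = len(matrix)
    axis = [1.0 / (k + 1) for k in range(dim)]  # arbitrary non-uniform start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * axis[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            return [0.0] * dim  # no variance left along any direction
        axis = [v / norm for v in nxt]
    return axis

def project_2d(points):
    """Coordinates of each point along the first two principal components."""
    dim, n = len(points[0]), len(points)
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - mean[k] for k in range(dim)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
           for i in range(dim)]
    axes = []
    for _ in range(2):
        axis = dominant_axis(cov)
        eig = sum(axis[i] * cov[i][j] * axis[j]
                  for i in range(dim) for j in range(dim))
        axes.append(axis)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - eig * axis[i] * axis[j] for j in range(dim)]
               for i in range(dim)]
    return [[sum(c[k] * a[k] for k in range(dim)) for a in axes] for c in centered]

# Hypothetical 3D feature vectors with precomputed cluster labels.
vectors = [[-0.5, 0, 0], [0.5, 0, 0], [3.5, 0, 0], [4.5, 0, 0], [2, 0.5, 0], [2, 1.5, 0]]
labels = ["a", "a", "b", "b", "c", "c"]
centres = cluster_centroids(vectors, labels)
layout = project_2d(centres)
print(layout)
```

Laying out only the centroids keeps the expensive projection small; individual documents can then be placed relative to their cluster's position.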

Interface Design
- Three display types:
  - Backdrop: central display resource.
  - Workshop: grid with resizable windows to hold multiple views.
  - Chronicle: space where views are placed and linked to form a visual story.
- Tools provided to allow more in-depth analysis: point and click, grouping, annotation, query, subset, temporal slicing.

[Figure: Screenshot]

Favorite Sentences
- “The bottleneck in the human processing and understanding of information in large amounts of text can be overcome if the text is spatialized in a manner that takes advantage of common powers of perception.”
- “So much has already been written about everything that you can’t find out anything about it.” James Thurber (1961)

Contributions
- Effective use of physical metaphors, such as the night sky and a landscape, to provide an overview visualization of the document collection:
  - helps answer simple questions about the database
- Discussion of processing text for visualization.
- Platform includes integrated tools and techniques for text manipulation and analysis.

Critique
- How can the effectiveness of the visualization in discovering relationships and answering detailed questions about the documents be measured?
  - may depend on the ease of interaction
  - need to verify the claim: “discovering in 35 minutes what would have taken two weeks otherwise”
- Clutter and occlusion could result from the layout algorithms (complex for large collections of documents).
- Clustering may reduce sensitivity to the features of individual documents.

Other Comments
- Agree with the need to create visual tools to aid cognitive skills, but skeptical about the statement: “And the limitations of Information Age will not be set by the speed with which human mind can read”
- The paper contains too many sound-bite sentences and buzzwords, which could be distracting: “fluid environment for reflective cognition and higher-order thought”

Galaxy of News: Interactive Landscapes
- Parse content to extract key information
- Build an associative relation network
- Classify elements into hierarchies
- Sort peer elements spatially and temporally
- Construct a visual information space based on the classified elements
- Respond dynamically to visual interaction
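
The slide does not say how the associative relation network is built. As one plausible reading, the sketch below links two keywords whenever they appear in the same article and counts how often that happens. This co-occurrence rule, the functions `build_relation_network` and `strongest_associations`, and the sample keywords are assumptions for illustration, not Rennison's method.

```python
from collections import defaultdict
from itertools import combinations

def build_relation_network(article_keywords):
    """Weighted graph: network[a][b] counts the articles mentioning both a and b."""
    network = defaultdict(lambda: defaultdict(int))
    for keywords in article_keywords:
        for a, b in combinations(sorted(set(keywords)), 2):
            network[a][b] += 1
            network[b][a] += 1
    return network

def strongest_associations(network, term, limit=3):
    """Keywords most often seen with `term`; ties are broken alphabetically."""
    neighbours = network.get(term, {})
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))[:limit]

# Hypothetical keywords already extracted from three news articles.
articles = [
    ["election", "senate", "vote"],
    ["election", "vote"],
    ["senate", "budget"],
]
network = build_relation_network(articles)
print(strongest_associations(network, "election"))
```

A structure like this can be rebuilt as new articles arrive, which fits the idea of a network that is constructed dynamically rather than fixed in advance.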

Galaxy of News: Summary
- Use of motion to visualize relationships among documents.
- Documents have no fixed position in space:
  - associative relation network built dynamically
  - fixed positioning of categories
- The constructed space is based on conceptual abstract metaphors (galaxies) and could have any number of dimensions.