Visualization of AAG Paper Abstracts
André Skupin
Dept. of Geography, University of New Orleans
AAG Pittsburgh, April 5, 2000
AAG Conference Abstracts
Web Search Engine Interface
Research Motivation I: Methodology
Geography’s role in information visualization
– geographic concepts
    – regions
    – scale
– cartographic techniques
    – generalization
    – labeling
– GIS technology
    – data integration
Research Motivation II: Application
Developments in Academic Geography
– based on geography’s written output
– generalizable for any corpus of documents
Data Capture & Pre-Processing
Source Data:
– abstracts submitted to AAG 1999 Hawaii
– complete abstracts as text file
– 2220 abstracts
Pre-Processing:
– separation into three parts:
    – author information
    – abstract text
    – keywords chosen by authors
Keyword Component Indexing
(1) extract keywords chosen by authors
(2) break keywords into components
(3) match components against content of all abstracts
Result:
– all abstracts indexed
– overall richer than author-chosen keywords alone
– vector-space model with 2220 docs & 741 terms
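The three indexing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used for the AAG data: the slides do not say how keywords were split, whether stemming or stop-word removal was applied, or whether matrix cells hold counts or binary flags. The sketch assumes whitespace/punctuation splitting and raw term counts, and the function names `tokenize`, `build_term_list`, and `index_abstracts` are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_term_list(author_keywords):
    """Steps 1-2: collect author keywords and break them into unique components."""
    terms = set()
    for phrase in author_keywords:
        terms.update(tokenize(phrase))
    return sorted(terms)

def index_abstracts(abstracts, terms):
    """Step 3: count how often each component occurs in each abstract text.

    Assumption: cells are raw counts; the original weighting is not stated.
    """
    matrix = []
    for text in abstracts:
        tokens = tokenize(text)
        matrix.append([tokens.count(term) for term in terms])
    return matrix

# Toy data (invented for illustration, not taken from the AAG corpus).
keywords = ["urban geography", "GIS", "information visualization"]
abstracts = [
    "We use GIS to study urban growth and urban sprawl.",
    "Visualization of information spaces borrows from cartography.",
]

terms = build_term_list(keywords)
matrix = index_abstracts(abstracts, terms)  # one row per abstract, one column per term
```

Applied to the full corpus, the same idea would yield a matrix of 2220 rows and 741 columns. Note that the first toy abstract is indexed under "urban" and "gis" whether or not its author chose those keywords, which is why the result is richer than author keywords alone.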
Spatialization
Projection of elements of a high-dimensional information space into a low-dimensional representation (Skupin & Buttenfield 1997)
-> project document/keyword matrix into 2D
Technique: Self-Organizing Map (SOM)
– input: raw document/keyword matrix
– output: two-dimensional grid of neurons with weight for each keyword
Base Map Creation
Implementation: SOM_PAK & C++
1. Choose SOM Dimensions
    – e.g. 85 x 115 neurons
2. Train Grid of Neurons
    – each neuron gets weight for each keyword
    – preservation of high-dim. document topology
3. Apply SOM to Data Set
    – documents assigned to single neurons
4. Assign unique locations to documents
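The four steps above can be illustrated with a small, self-contained Python sketch. It is not SOM_PAK and does not reproduce its parameters: the linear decay of learning rate and radius, the Gaussian neighborhood, and the default values are assumptions chosen for brevity. Step 4 is implemented here as a random offset inside the winning neuron's grid cell, which is only one plausible way of giving documents unique locations; the slides do not state the method used.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the (row, col) of the neuron whose weight vector is closest to x."""
    return min(weights,
               key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)))

def train_som(data, rows, cols, steps=500, lr0=0.5, seed=0):
    """Steps 1-2: create a rows x cols grid and train it on the document vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Each neuron holds one weight per keyword (term).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for t in range(steps):
        frac = t / steps
        lr = lr0 * (1.0 - frac)                    # assumed linear decay
        radius = max(radius0 * (1.0 - frac), 0.5)  # assumed linear shrink
        x = rng.choice(data)
        br, bc = best_matching_unit(weights, x)
        for (r, c), w in weights.items():
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights

def place_documents(weights, data, seed=0):
    """Steps 3-4: assign each document to one neuron, then jitter within its cell."""
    rng = random.Random(seed)
    locations = []
    for x in data:
        r, c = best_matching_unit(weights, x)
        # Assumption: unique locations come from a random offset in the unit cell.
        locations.append((c + rng.random(), r + rng.random()))
    return locations

# Toy document/keyword matrix (invented); the real one is 2220 x 741.
docs = [[1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
weights = train_som(docs, rows=4, cols=5)
locations = place_documents(weights, docs)  # one (x, y) point per document
```

Because neighboring neurons are pulled toward the same inputs during training, documents with similar keyword vectors tend to land in nearby cells, which is the topology preservation named in step 2.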
Base Map of AAG Abstracts
Complexity
    -> Generalization?
    -> Scale?
Labeling
    -> Weighted Index?
Visualization
    -> GIS Software?
High-Dimensional Clusters Projected onto Map
Hierarchical | Coarse SOM | K-Means
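As a rough illustration of the K-Means panel, the sketch below clusters documents in the high-dimensional keyword space and then attaches each cluster label to the document's 2D map position. It is a plain k-means with the first k documents as starting centroids, an assumption made to keep the example deterministic; the slides do not describe the initialization or software used. The vectors and the `locations` list are invented toy values.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups; returns one cluster label per vector."""
    # Assumption: seed centroids with the first k vectors (deterministic).
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j]))
                  for v in vectors]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy keyword vectors and hypothetical 2D map positions for four documents.
vectors = [[2, 0, 0], [0, 0, 3], [1, 0, 0], [0, 1, 2]]
locations = [(0.5, 0.5), (3.2, 2.1), (0.8, 0.4), (3.6, 2.7)]

labels = kmeans(vectors, k=2)
mapped = list(zip(locations, labels))  # cluster membership projected onto the map
```

Clustering happens entirely in keyword space; the map is used only to display the result, so clusters are not guaranteed to form contiguous regions in 2D.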
Multi-Scale Spatialization w/ Labels
Map Design for 2D Spatialization
Visual Hierarchies
Geographic Space | Information Space
Research Directions I: Applications
Visualize trends in geography
– author trajectories through time
– subject emergence
– geography of geography
Papers by ZIP Code
Research Directions II: Techniques
Cluster Solutions
– U-matrix (-> contiguous clusters in 2D)
– AutoClass (-> with optimized cluster numbers)
– quantify performance of cluster solutions
Visualization
– multi-band thematic visualization
SOM Plane “GIS”
SOM Plane “visualization”
SOM Plane “urban”
Color Composite “GIS” “urban” “visualization”: Full Extent
Color Composite “GIS” “urban” “visualization”: Zoom-In
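A color composite of this kind can be sketched by treating three SOM component planes (the per-neuron weights for one keyword each) as the red, green, and blue bands of an image. The sketch below is a minimal illustration under assumptions: the linear stretch to 0..255 and the assignment of "GIS" to red, "urban" to green, and "visualization" to blue are chosen here for the example and are not stated in the slides. The plane values are invented.

```python
def rescale(plane):
    """Linearly stretch a 2D grid of weights to integers in 0..255."""
    values = [v for row in plane for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # a constant plane maps to all zeros
    return [[round(255 * (v - lo) / span) for v in row] for row in plane]

def color_composite(red_plane, green_plane, blue_plane):
    """Combine three equally sized planes into a grid of (R, G, B) tuples."""
    r, g, b = rescale(red_plane), rescale(green_plane), rescale(blue_plane)
    return [[(r[i][j], g[i][j], b[i][j]) for j in range(len(r[0]))]
            for i in range(len(r))]

# Toy 2 x 2 component planes (invented weights for three keywords).
gis_plane = [[0.0, 0.2], [1.0, 0.0]]
urban_plane = [[0.3, 0.3], [0.3, 0.3]]
visualization_plane = [[1.0, 0.0], [0.0, 0.2]]

# Assumed band assignment: GIS -> red, urban -> green, visualization -> blue.
composite = color_composite(gis_plane, urban_plane, visualization_plane)
```

In the resulting grid, a neuron with a strong "GIS" weight and weak weights for the other two terms appears red, while neurons where several terms are strong take on mixed colors, which is what makes overlaps between topics visible.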