Published by Donna Goodman. Modified over 9 years ago.
1
Visualizing textual data CPSC 601.28 A. Butt / Feb. 26 '09
2
Overview
Project implications
Summarize "TileBars"
–Hearst / PARC (Xerox)
Summarize "Visualizing the Non-Visual"
–Wise et al. / Pacific Northwest Lab (Battelle)
Key issues
Summary
References
3
Project Implications
Research area is partly based on text-based environmental reports
–textual reporting feeds into textual (quasi-judicial) regulatory framework
–rooms of binders (e.g. >20,000 pages for Mackenzie Pipeline Project)
Vocabulary specialized / semantically complete
–"no significant adverse environmental impacts"
4
TileBars
Goals are to simultaneously view:
–length of a document
–relative frequency of specific words
–distribution of words with respect to each other
Benefits include:
–enhanced relevancy of search response
–patterns of frequency by document / author
–compactness of information
5
TileBars
Visual representation via
–rectangular block: size equates to document length
–three bars within the block: each corresponds to a query
–in each bar, tiles indicate location; saturation of a tile indicates frequency
Example: 5 articles, 3 search queries
–1st, 2nd and 5th appear compact / relevant
–1st and 2nd appear to have better concurrency
–3rd and 4th potentially less relevant, greater time investment to read
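The tile-and-bar encoding described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Hearst's implementation: the names `tilebar` and `render`, the `tile_size` parameter and the fixed-size word windows used as tiles are all assumptions made for the example. Each query gets one row, each tile one cell, and a denser character stands in for a more saturated tile.

```python
import re


def tilebar(text, term_sets, tile_size=20):
    """Return (n_tiles, rows): one row of per-tile hit counts per query.

    Assumption: tiles are fixed-size windows of `tile_size` words, a
    stand-in for whatever segmentation a real system would use.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    tiles = [words[i:i + tile_size] for i in range(0, len(words), tile_size)]
    rows = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        # Count how many words in each tile match this query's terms.
        rows.append([sum(1 for w in tile if w in wanted) for tile in tiles])
    return len(tiles), rows


def render(rows, shades=" .:#"):
    """Map counts to characters; a denser character means more hits."""
    top = len(shades) - 1
    return ["".join(shades[min(count, top)] for count in row) for row in rows]


doc = ("pipeline impact pipeline impact pipeline impact "
       "filler filler filler filler impact")
n_tiles, rows = tilebar(doc, [["pipeline"], ["impact"]], tile_size=4)
for line in render(rows):
    print("|" + line + "|")
# prints:
# |:. |
# |:..|
```

The number of tiles (the width of the block) reflects document length, and reading down a column shows whether the queries hit the same part of the document.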
6
Visualizing the Non-Visual
Goals are to:
–overcome time constraints in processing textual information
–overcome attention constraints; avoid becoming overwhelmed by volume of textual information
Benefits include:
–escape limitations of traditional text
–increase throughput and comprehension of information processing
–feedback on text structure to enhance visualization
7
Visualizing the Non-Visual
Employ a "natural landscape" metaphor
–leverage evolutionary psychological adaptations via natural landscapes for representation
–galaxy or star-fields ("night sky")
–themescapes ("cartographic" or "landscape")
–although statistical measures are used for clustering, they are not used as directly as in TileBars
–self-organizing maps
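To make "statistical measures used for clustering" concrete, here is a small Python sketch of one common measure: cosine similarity between bag-of-words vectors. It is offered as an assumption about the kind of statistic involved, not as the method used by Wise et al.; the function names `term_vector` and `cosine` are invented for the example. Documents that score as similar would be placed near each other in a galaxy or landscape view.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words counts for a document (lowercased alphabetic tokens)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity of two term-count vectors, in the range [0, 1]."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


docs = {
    "a": "pipeline environmental impact report",
    "b": "environmental impact of the pipeline",
    "c": "cancer literature review",
}
vectors = {name: term_vector(text) for name, text in docs.items()}
print(cosine(vectors["a"], vectors["b"]) > cosine(vectors["a"], vectors["c"]))
# prints: True
```

Unlike TileBars, the viewer never sees these numbers directly; they only determine where a document lands in the display.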
8
Galaxies
PNL software development (DOE)
Display is a review of cancer literature
Branched to SPIRE / In-SPIRE for government documents
9
Themescapes
PNL software development (DOE)
Branched to SPIRE / In-SPIRE for government documents (renamed "Themeview")
Branched into NVAC (National Visualization and Analytics Center), part of the Homeland Security infrastructure
10
Themescapes (2.0?)
Branched progeny of themescapes
Used in searching IP / patents
Subscription service
Failed metaphors??
11
Key Issues
Vocabulary / semantics: how do you interpret meaning from text statistics?
–earlier failures of natural language processing
–contingent semantics
Employing metaphors (Zhang 2008)
–rely on unusual linkages (versus analogy) to highlight
–degree of "unusual-ness" is critical: too much or too little leads to confusion
12
Summary www.wordle.net
13
References
Marti A. Hearst. TileBars: Visualization of Term Distribution Information in Full Text Information Access. CHI 1995, pp. 59-66.
James A. Wise, James J. Thomas, Kelly Pennock, David Lantrip, Marc Pottier, Anne Schur, and Vern Crow. Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents. Proc. IEEE Symp. Information Visualization (InfoVis), pp. 51-58, IEEE Computer Soc. Press, 30-31 October 1995. (in text pages 442-450)
Jin Zhang. The Implication of Metaphors in Information Retrieval. Visualization in Information Retrieval, Elsevier, 2008, pp. 215-237.