Published by Clifford Pierce. Modified over 9 years ago.
1
Document Collections
cs5984: Information Visualization
Chris North
2
Where are we?
Multi-D, 1D, 2D, Hierarchies/Trees, Networks/Graphs, Document collections, 3D
Design Principles, Empirical Evaluation, Java Development
Visual Overviews, Multiple Views, Peripheral Views
3
Structured Document Collections
Multi-dimensional: author, title, date, journal, …
Trees: Dewey Decimal
Networks: web, citations
4
Envision
Ed Fox, et al.
Multi-D, similar to Spotfire
5
Unstructured Document Collections
Focus on full text
Examples: digital libraries, encyclopedia, Web, homepages, photo collections
Tasks:
  Search, keyword
  Browse
    Themes, subjects, topics, library coverage
    Size, distributions
6
Visualization Strategies
Cluster Maps (today)
Keyword Query (today)
Relationships
Reduced representation
User controlled layout
7
Cluster Map
Create a “map” of the document collection
Similar documents near
Dissimilar documents far
“Grocery store” concept
8
Document Vectors
              Doc1  Doc2  Doc3  …
“aardvark”       1     2     0
“banana”         2     1     0
“chris”          0     0     3
…
Similarity between pair of docs = [formula not preserved in transcript]
Layout documents in 2-D map by similarity
Similar to spring model for graph layout
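The similarity formula on this slide did not survive, so the sketch below substitutes cosine similarity, a common choice for term-count vectors; that choice is an assumption, not necessarily what the slide showed. It uses the three term counts from the table and then places the documents in 2-D with a very simple spring-style relaxation. The names `cosine_similarity` and `spring_layout`, the rest length `1 - similarity`, and the step parameters are all illustrative.

```python
import math

# Term counts from the slide: rows are terms, columns are Doc1..Doc3.
term_counts = {
    "aardvark": [1, 2, 0],
    "banana":   [2, 1, 0],
    "chris":    [0, 0, 3],
}

def document_vector(doc_index):
    """Return the column of term counts for one document."""
    return [counts[doc_index] for counts in term_counts.values()]

def cosine_similarity(u, v):
    """Assumed similarity measure: cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def spring_layout(similarity, steps=200, rate=0.05):
    """Relax 2-D positions so each pair sits near distance 1 - similarity."""
    n = len(similarity)
    # Deterministic start: documents evenly spaced on a unit circle.
    pos = [[math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
           for i in range(n)]
    for _ in range(steps):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy) or 1e-9
                rest = 1.0 - similarity[i][j]  # similar docs want to be close
                # Move i toward j if the spring is stretched, away if compressed.
                shift = rate * (dist - rest) / dist
                pos[i][0] += shift * dx
                pos[i][1] += shift * dy
    return pos

docs = [document_vector(i) for i in range(3)]
sim_12 = cosine_similarity(docs[0], docs[1])  # about 0.8: shared terms
sim_13 = cosine_similarity(docs[0], docs[2])  # 0.0: no shared terms
similarity = [[cosine_similarity(a, b) for b in docs] for a in docs]
positions = spring_layout(similarity)
```

In the resulting layout Doc1 and Doc2 end up close together and Doc3 sits apart, which is the "similar documents near, dissimilar documents far" idea in miniature.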
9
Cluster Algorithms
Partition clustering:
  Partition into k subsets
  Pick k seeds
  Iteratively attract nearest neighbors
Hierarchical clustering:
  Dendrogram
  Group nearest-neighbor pair
  Iterate
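The two families above can be sketched in a few lines. The slide does not name specific algorithms, so this example assumes a k-means-style loop for partition clustering and single-linkage merging for hierarchical clustering; the function names and the sample points are hypothetical.

```python
import math

def centroid(points):
    """Mean position of a non-empty list of points."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))

def partition_clustering(points, seeds, rounds=10):
    """k-means-style: each seed attracts its nearest points, then re-centers."""
    centers = list(seeds)
    groups = [[] for _ in centers]
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # A seed that attracted nothing keeps its previous position.
        centers = [centroid(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return groups

def hierarchical_clustering(points):
    """Single linkage: repeatedly merge the two closest clusters.

    Returns the merge history, which is the information a dendrogram draws.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = min(math.dist(p, q)
                          for p in clusters[i] for q in clusters[j])
                if best is None or gap < best[0]:
                    best = (gap, i, j)
        gap, i, j = best
        merges.append((clusters[i], clusters[j], gap))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

sample_points = [(0, 0), (0, 1), (5, 5), (5, 6)]
groups = partition_clustering(sample_points, seeds=[(0, 0), (5, 5)])
merges = hierarchical_clustering(sample_points)
```

With these four points, the partition pass splits them into the two obvious pairs, and the merge history joins each pair first before the final merge across the large gap.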
10
Kohonen Maps
Xia Lin, “Document Space”
samal, ying
http://faculty.cis.drexel.edu/sitemap/index.html
12
Themescapes, Cartia
PNL
Mountain height = cluster size
13
WebSOM http://websom.hut.fi/websom/
14
Map.net http://maps.map.net/start
15
Cluster Map
Good:
  Map of collection
  Major themes and sizes
  Relationships between themes
  Scales up
Bad:
  Where to locate documents with multiple themes?
    » Both mountains, between mountains, …?
  Relationships between documents, within documents?
  Algorithm becomes (too) critical
16
Keyword Query
Keyword query, search engine
Rank-ordered list
“Information Retrieval”
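A rank-ordered result list can be illustrated with a very small scorer. This is only a sketch: it assumes raw term-frequency scoring on whitespace-split words, which is far simpler than a real search engine, and the sample documents and the name `rank_documents` are made up for the example.

```python
from collections import Counter

# Hypothetical mini collection, invented for illustration.
documents = {
    "doc_a": "spring model layout for a document graph",
    "doc_b": "document clustering and cluster maps for document collections",
    "doc_c": "banana bread recipe",
}

def rank_documents(query, docs):
    """Return names of matching documents, best score first."""
    query_terms = query.lower().split()
    scored = []
    for name, text in docs.items():
        counts = Counter(text.lower().split())
        # Score = how often the query terms occur in the document.
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]

results = rank_documents("document cluster", documents)
no_results = rank_documents("aardvark", documents)
```

Here `results` lists `doc_b` ahead of `doc_a`, while the query "aardvark" returns an empty list because no document contains the word.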
17
Tilebars
Hearst, “Tilebars”
reenal, xueqi
http://elib.cs.berkeley.edu/tilebars/
18
VIBE
Korfhage, http://www.pitt.edu/~korfhage/interfaces.html
Documents located between query keywords using spring model
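One simple reading of "located between query keywords using spring model" is that each keyword is a fixed anchor on the screen and each document is pulled toward every anchor with a strength proportional to how relevant that keyword is to it. For springs of zero rest length, the resting point is the weighted average of the anchor positions. The sketch below assumes exactly that; it is not taken from the VIBE system itself, and the anchor coordinates, weights, and the name `place_document` are hypothetical.

```python
# Hypothetical screen positions for three query keywords.
keyword_positions = {
    "visualization": (0.0, 0.0),
    "document": (10.0, 0.0),
    "cluster": (5.0, 10.0),
}

def place_document(weights, anchors):
    """Weighted average of anchor positions.

    weights maps keyword -> relevance of that keyword to the document.
    Returns None when the document matches no keyword at all.
    """
    total = sum(weights.values())
    if total == 0:
        return None
    x = sum(anchors[k][0] * w for k, w in weights.items()) / total
    y = sum(anchors[k][1] * w for k, w in weights.items()) / total
    return (x, y)

# Equal pull from two keywords: the document sits halfway between them.
balanced = place_document({"visualization": 1, "document": 1}, keyword_positions)
# Three times more relevant to "document": the document sits closer to it.
skewed = place_document({"visualization": 1, "document": 3}, keyword_positions)
```

A document's position then says which keywords it relates to and roughly in what proportion, which is the point of the display.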
19
VR-VIBE
20
Keyword Query
Good:
  Reduces the browsing space
  Map according to user’s interests
Bad:
  What keywords do I use?
  What about other related documents that don’t use these keywords?
  No initial overview
  Mega-hit, zero-hit problem
21
Assignment
Thurs: Document Collections
  Bederson, “Image Browsing” » Rui, anusha
  Card, “Web Book and Web Forager” » mrinmayee, ming
Demo your hw3: Tues or Thurs
22
Next Week
Tues: 3-D data
  Kniss, “Interactive Volume Rendering with Direct Manip” » xueqi, mahesh
Thurs: Workspaces
  Robertson, “Task Gallery” » supriya, varun
  Upson, “AVS” » christa, jun
Thanksgiving break
Tues 27: Debates
  Kobsa, “Empirical comparison of comm infovis systems” » kunal, zhiping
23
Upcoming Sched
Tues: 3-D data
Thurs: Workspaces
Thanksgiving break
Tues 27: Debates
Thurs 29: How (not) to lie with visualization
Dec: project presentations
Dec 7: CHI 2-pagers due, student posters due