Document Collections cs5984: Information Visualization Chris North.



Where are we? Multi-D, 1D, 2D, Hierarchies/Trees, Networks/Graphs, Document collections, 3D, Design Principles, Empirical Evaluation, Java Development, Visual Overviews, Multiple Views, Peripheral Views

Structured Document Collections
  Multi-dimensional: author, title, date, journal, …
  Trees: Dewey Decimal
  Networks: web, citations

Envision (Ed Fox et al.): Multi-D, similar to Spotfire

Unstructured Document Collections
  Focus on full text
  Examples: digital libraries, encyclopedia, Web, homepages, photo collections
  Tasks:
    Search: keyword
    Browse: themes, subjects, topics, library coverage; size, distributions

Visualization Strategies
  Cluster Maps (today)
  Keyword Query (today)
  Relationships
  Reduced representation
  User-controlled layout

Cluster Map
  Create a “map” of the document collection
  Similar documents near
  Dissimilar documents far
  “Grocery store” concept

Document Vectors
                Doc1  Doc2  Doc3  …
  “aardvark”      1     2     0
  “banana”        2     1     0
  “chris”         0     0     3
  …
  Similarity between pair of docs =
  Lay out documents in 2-D map by similarity, similar to spring model for graph layout
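The transcript does not preserve the slide's similarity formula. As a minimal sketch, assuming cosine similarity (one common choice for document vectors, not necessarily the one used in the lecture), the term counts from the table can be compared like this. The names `term_counts`, `doc_vector`, and `cosine_similarity` are illustrative only.

```python
import math

# Term counts from the slide's table: rows are terms, columns are Doc1..Doc3.
term_counts = {
    "aardvark": [1, 2, 0],
    "banana":   [2, 1, 0],
    "chris":    [0, 0, 3],
}

def doc_vector(doc_index):
    """Return the term-count vector for one document (one column of the table)."""
    return [counts[doc_index] for counts in term_counts.values()]

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_u * norm_v)

doc1, doc2, doc3 = (doc_vector(i) for i in range(3))
sim_12 = cosine_similarity(doc1, doc2)  # the two docs share "aardvark" and "banana"
sim_13 = cosine_similarity(doc1, doc3)  # the two docs share no terms
```

Under this assumption Doc1 and Doc2 score about 0.8 and Doc1 and Doc3 score 0, so a similarity-based layout would place Doc1 near Doc2 and far from Doc3.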

Cluster Algorithms
  Partition clustering:
    Partition into k subsets
    Pick k seeds
    Iteratively attract nearest neighbors
  Hierarchical clustering:
    Dendrogram
    Group nearest-neighbor pair
    Iterate
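The two families above can be sketched as follows. This is an illustration under stated assumptions, not the lecture's own code: the partition version is a k-means-style loop that uses the first k points as seeds, and the hierarchical version uses single-link distance between clusters. The function names and the sample `points` are hypothetical.

```python
import math

def partition_cluster(points, k, iterations=10):
    """Partition clustering: pick k seeds, then repeatedly assign each point
    to its nearest center and move each center to the mean of its members."""
    centers = [points[i] for i in range(k)]  # assumption: first k points are the seeds
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment

def hierarchical_cluster(points):
    """Agglomerative clustering: repeatedly merge the nearest pair of clusters.
    The recorded merges, in order, describe a dendrogram."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single link (assumed): distance between the closest members.
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

points = [(0, 0), (0, 1), (5, 5), (5, 7)]
assignment = partition_cluster(points, k=2)
merges = hierarchical_cluster(points)
```

On these four points both approaches group the first two points together and the last two together; the hierarchical version additionally records the order and distance of each merge.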

Kohonen Maps: Xia Lin, “Document Space” » samal, ying

Themescapes, Cartia, PNL: Mountain height = Cluster size

WebSOM

Map.net

Cluster Map
  Good:
    Map of collection
    Major themes and sizes
    Relationships between themes
    Scales up
  Bad:
    Where to locate documents with multiple themes?
      » Both mountains, between mountains, …?
    Relationships between documents, within documents?
    Algorithm becomes (too) critical

Keyword Query
  Keyword query, search engine
  Rank-ordered list
  “Information Retrieval”
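A rank-ordered result list can be sketched with a very simple scoring rule. The example below assumes raw term-frequency scoring (the count of query keywords in each document), which is only one of many possible ranking functions; the function `rank_documents` and the sample documents are hypothetical.

```python
from collections import Counter

def rank_documents(documents, query):
    """Return document names ordered by how often they contain the query keywords.
    Documents that match no keyword are left out of the list."""
    query_terms = query.lower().split()
    scored = []
    for name, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]

docs = {
    "doc_a": "information retrieval systems rank documents by information need",
    "doc_b": "visualization of information",
    "doc_c": "cluster maps show themes",
}
ranking = rank_documents(docs, "information retrieval")
```

Here `doc_a` matches three keyword occurrences, `doc_b` matches one, and `doc_c` matches none, so it does not appear in the list at all.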

Tilebars: Hearst, “Tilebars” » reenal, xueqi

VIBE (Korfhage): Documents are located between query keywords using a spring model
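One way to read "located between query keywords using a spring model" is sketched below. It assumes each keyword is pinned at a fixed 2-D anchor and pulls a document with a zero-rest-length spring whose stiffness equals the document's weight for that keyword; the resting position is then the weighted average of the anchors. The anchors, weights, and `place_document` are illustrative assumptions, not details taken from VIBE itself.

```python
def place_document(anchors, weights):
    """Return the (x, y) position where the keyword springs balance.
    anchors: keyword -> (x, y) position of that keyword on the display.
    weights: keyword -> strength of the document's match to that keyword."""
    total = sum(weights.values())
    if total == 0:
        return None  # the document matches no keyword, so it has no position
    x = sum(anchors[k][0] * w for k, w in weights.items()) / total
    y = sum(anchors[k][1] * w for k, w in weights.items()) / total
    return (x, y)

anchors = {
    "information": (0.0, 0.0),
    "retrieval": (10.0, 0.0),
    "visualization": (5.0, 10.0),
}

# Equal pull from two keywords: the document sits halfway between them.
balanced = place_document(anchors, {"information": 1, "retrieval": 1, "visualization": 0})

# Stronger pull from "information": the document moves toward that anchor.
pulled = place_document(anchors, {"information": 3, "retrieval": 1, "visualization": 0})
```

A document that matches only one keyword lands exactly on that keyword's anchor, which is why position in such a display can be read as a summary of which keywords a document matches.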

VR-VIBE

Keyword Query
  Good:
    Reduces the browsing space
    Map according to user’s interests
  Bad:
    What keywords do I use?
    What about other related documents that don’t use these keywords?
    No initial overview
    Mega-hit, zero-hit problem

Assignment
  Thurs: Document Collections
    Bederson, “Image Browsing” » Rui, anusha
    Card, “Web Book and Web Forager” » mrinmayee, ming
  Demo your hw3: Tues or Thurs

Next Week
  Tues: 3-D data
    Kniss, “Interactive Volume Rendering with Direct Manip” » xueqi, mahesh
  Thurs: Workspaces
    Robertson, “Task Gallery” » supriya, varun
    Upson, “AVS” » christa, jun
  Thanksgiving break
  Tues 27: Debates
    Kobsa, “Empirical comparison of comm infovis systems” » kunal, zhiping

Upcoming Sched
  Tues: 3-D data
  Thurs: Workspaces
  Thanksgiving break
  Tues 27: Debates
  Thurs 29: How (not) to lie with visualization
  Dec: project presentations
  Dec 7: CHI 2-pagers due, student posters due