Visualizing Documents and Search

Slides:



Advertisements
Similar presentations
Chapter 5: Introduction to Information Retrieval
Advertisements

MICROSOFT OFFICE ACCESS 2007.
Introduction to Knowledge Representation Marti Hearst SIMS 202: Information Organization and Retrieval Lecture 6, Sept 10, 1998.
ISP 433/533 Week 2 IR Models.
© Anselm Spoerri Lecture 10 Visual Tools for Text Retrieval (cont.)
© Prentice Hall1 DATA MINING TECHNIQUES Introductory and Advanced Topics Eamonn Keogh (some slides adapted from) Margaret Dunham Dr. M.H.Dunham, Data Mining,
Database Management: Getting Data Together Chapter 14.
SIMS 247 Information Visualization and Presentation Prof. Marti Hearst October 5, 2000.
Vocabulary Spectral Analysis as an Exploratory Tool for Scientific Web Intelligence Mike Thelwall Professor of Information Science University of Wolverhampton.
Symbols and Language Lexical Relations SIMS 202 Profs. Hearst & Larson UC Berkeley SIMS Fall 2000.
Marti Hearst SIMS 247 SIMS 247 Lecture 19 Visualizing Text and Text Collections March 31, 1998.
Modeling (Chap. 2) Modern Information Retrieval Spring 2000.
Fundamentals of Information Systems, Fifth Edition
JASS 2005 Next-Generation User-Centered Information Management Information visualization Alexander S. Babaev Faculty of Applied Mathematics.
Basic Web Applications 2. Search Engine Why we need search ensigns? Why we need search ensigns? –because there are hundreds of millions of pages available.
Organizing Data and Information AD660 – Databases, Security, and Web Technologies Marcus Goncalves Spring 2013.
Chapter 2 Architecture of a Search Engine. Search Engine Architecture n A software architecture consists of software components, the interfaces provided.
UOS 1 Ontology Based Personalized Search Zhang Tao The University of Seoul.
Thanks to Bill Arms, Marti Hearst Documents. Last time Size of information –Continues to grow IR an old field, goes back to the ‘40s IR iterative process.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
eCommerce Technologies Design & Development for the Web MIS Spring 2004 Instructors: Kelly Fish, Ph.D. John Seydel, Ph.D.
NOSQL DATABASES Please remember to read the NOSQL Distilled book and the Seven Databases book.
MI.GOV Site Design Evaluation October MI.GOV Usability Review  MSU Usability and Accessibility Center (UAC)  Reviewed rankings by studies like.
Professor Michael J. Losacco CIS 1110 – Using Computers Database Management Chapter 9.
Search Result Interface Hongning Wang Abstraction of search engine architecture User Ranker Indexer Doc Analyzer Index results Crawler Doc Representation.
Information Visualization: Ten Years in Review Xia Lin Drexel University.
Mining Topic-Specific Concepts and Definitions on the Web Bing Liu, etc KDD03 CS591CXZ CS591CXZ Web mining: Lexical relationship mining.
Comparing and Ranking Documents Once our search engine has retrieved a set of documents, we may want to Rank them by relevance –Which are the best fit.
WEB MINING. In recent years the growth of the World Wide Web exceeded all expectations. Today there are several billions of HTML documents, pictures and.
Digital Libraries Lillian N. Cassel Spring A digital library An informal definition of a digital library is a managed collection of information,
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
Search Result Interface Hongning Wang Abstraction of search engine architecture User Ranker Indexer Doc Analyzer Index results Crawler Doc Representation.
Web Information Retrieval Prof. Alessandro Agostini 1 Context in Web Search Steve Lawrence Speaker: Antonella Delmestri IEEE Data Engineering Bulletin.
Visualization in Text Information Retrieval Ben Houston Exocortex Technologies Zack Jacobson CAC.
Chapter 5 System Modeling. What is System modeling? System modeling is the process of developing abstract models of a system, with each model presenting.
26/01/20161Gianluca Demartini Ranking Categories for Faceted Search Gianluca Demartini L3S Research Seminars Hannover, 09 June 2006.
The Development of a search engine & Comparison according to algorithms Sung-soo Kim The final report.
Knowledge and Information Retrieval Dr Nicholas Gibbins 32/4037.
Object Oriented Systems Design
Neo4j: GRAPH DATABASE 27 March, 2017
Queensland University of Technology
Fundamentals of Information Systems, Sixth Edition
Education 499-R01 Search Basics.
Visual Information Retrieval
Designing Cross-Language Information Retrieval System using various Techniques of Query Expansion and Indexing for Improved Performance  Hello everyone,
Guangbing Yang Presentation for Xerox Docushare Symposium in 2011
Fundamentals & Ethics of Information Systems IS 201
Chapter Ten Managing a Database.
WordPress “WordPress is a free and open source blog publishing application.” Christina Vasileiou Database management system.
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Federated & Meta Search
Professor John Canny Fall 2001 Nov 29, 2001
Professor John Canny Spring 2003
Text Visualization Lecture 11
Visualization of Web Search Results in 3D
What is a Database and Why Use One?
Eric Sieverts University Library Utrecht Institute for Media &
MANAGING DATA RESOURCES
Data Warehousing and Data Mining
An Introduction to Data Warehousing
MIS2502: Data Analytics The Information Architecture of an Organization Acknowledgement: David Schuff.
Packages.
MIS2502: Data Analytics The Information Architecture of an Organization Aaron Zhi Cheng Acknowledgement:
What is a CMS. CMS is content management system CMS is a software that stores content.
CSE 635 Multimedia Information Retrieval
Magnet & /facet Zheng Liang
Chapter 5: Information Retrieval and Web Search
Information Retrieval and Web Design
Information Retrieval and Web Design
Information Visualization
Presentation transcript:

Visualizing Documents and Search

Agenda Introduction Visual Principles What Works? Visualization in Analysis & Problem Solving Visualizing Documents & Search Comparing Visualization Techniques Design Exercise Wrap-Up

Documents and Search Why Visualize Text? Why Text is Tough Visualizing Concept Spaces Clusters Category Hierarchies Visualizing Retrieval Results Usability Study Meta-Analysis

Why Visualize Text? To help with Information Retrieval give an overview of a collection show user what aspects of their interests are present in a collection help user understand why documents retrieved as a result of a query Text Data Mining Mainly clustering & nodes-and-links Software Engineering not really text, but has some similar properties

Why Text is Tough Text is not pre-attentive Text consists of abstract concepts which are difficult to visualize Text represents similar concepts in many different ways space ship, flying saucer, UFO, figment of imagination Text has very high dimensionality Tens or hundreds of thousands of features Many subsets can be combined together

Why Text is Tough As the man walks the cavorting dog, thoughts arrive unbidden of the previous spring, so unlike this one, in which walking was marching and dogs were baleful sentinals outside unjust halls. How do we visualize this?

Why Text is Tough Abstract concepts are difficult to visualize Combinations of abstract concepts are even more difficult to visualize time shades of meaning social and psychological concepts causal relationships

Why Text is Tough Language only hints at meaning Most meaning of text lies within our minds and common understanding “How much is that doggy in the window?” how much: social system of barter and trade (not the size of the dog) “doggy” implies childlike, plaintive, probably cannot do the purchasing on their own “in the window” implies behind a store window, not really inside a window, requires notion of window shopping

Why Text is Easy Text is highly redundant Instant summary: When you have lots of it Pretty much any simple technique can pull out phrases that seem to characterize a document Instant summary: Extract the most frequent words from a text Remove the most common English words People are very good at attributing meaning to lists of otherwise unrelated words

Guess the Text: 6 JOHN 10 PEOPLE 10 ALL 9 STATES 9 LAWS 8 NEW 7 RIGHT 7 GEORGE 6 WILLIAM 6 THOMAS 6 JOHN 6 GOVERNMENT 5 TIME 5 POWERS 5 COLONIES 4 LARGE 4 INDEPENDENT 4 FREE 4 DECLARATION 4 ASSENT 3 WORLD 3 WAR 3 USURPATIONS 3 UNITED 3 SEAS 3 RIGHTS

Visualization of Text Collections How to summarize the contents of hundreds, thousands, tens of thousands of texts? Many have proposed clustering the words and showing points of light in a 2D or 3D space. Examples Showing docs/collections as a word space Showing retrieval results as points in word space Advent of word clouds Most popular on web