UCB CS Research Fair
Search, Text Mining, Web Site Usability
Marti Hearst, SIMS



BAILANDO Projects
Better Access to Information using Language Analysis and Novel Dynamic Organizations

Current BAILANDO Projects
CHA-CHA & FLAMENCO: Better Search Interfaces
LINDI: UI support for Search Text Data Mining
TANGO: Automated Web Site Usability

Search UIs
Combine Browsing & Search
Place Search Results in Context
Large Category Hierarchies

Cha-Cha
Students: Mike Chen, Jamie Laflen, Jason Hong, Jimmy Lin, Shiang Chen

Medical Category Hierarchy

DynaCat (Pratt, Hearst, & Fagan 99)

DynaCat Study
Design
  Three queries
  24 cancer patients
  Compared three interfaces: ranked list, clusters, categories
Results
  Participants strongly preferred categories
  Participants found more answers using categories
  Participants took the same amount of time with all three interfaces
  A separate study by Chen and Dumais (CHI 2000) found similar results

Cat-a-Cone Interface (Hearst & Karadi 97)

FLAMENCO: Improving Search via Large Category Hierarchies
How to show intersections across category types?
How to preview related categories in a user-tailored, dynamic manner?

Text Data Mining
Relationships between information in documents can create new facts, not previously known.

Imagine
You are a medical researcher
Your patient has:
  spinal inflammation
  numbness in fingers
  low TC levels
  negative results for all tests
How can you help her?

Idea
A new way of searching text.
Link pieces of information together to formulate hypotheses …

LINDI: Linking Information for New DIscoveries
Three main parts:
  Search UI for building and reusing hypothesis-seeking strategies
  Statistical language analysis techniques for interpreting the text
  Backend for interfacing with various databases and translating different formats

Gathering Evidence
  Spinal Inflammation
  Numbness in fingers
  Low TC Levels

Gathering Evidence
  Spinal Inflammation
  Numbness in fingers
  Low TC Levels
Find diseases associated with each

Gathering Evidence
  Spinal Inflammation
  Numbness in fingers
  Low TC Levels
Find unanticipated commonalities
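The evidence-gathering steps above (find the diseases associated with each finding, then look for commonalities) can be sketched as a small set-counting exercise. This is only an illustration, not LINDI's implementation: the `ASSOCIATIONS` table, the `condition_*` names, and the `shared_conditions` helper are hypothetical placeholders, and a real system would draw its associations from literature databases.

```python
from collections import Counter

# Hypothetical lookup from a clinical finding to conditions associated with it.
# The condition names are placeholders, not medical facts.
ASSOCIATIONS = {
    "spinal inflammation": {"condition_a", "condition_b", "condition_c"},
    "numbness in fingers": {"condition_b", "condition_c", "condition_d"},
    "low TC levels": {"condition_c", "condition_e"},
}


def shared_conditions(findings, associations, min_support=2):
    """Return conditions linked to at least `min_support` of the findings,
    mapped to the number of findings that point at them."""
    counts = Counter()
    for finding in findings:
        counts.update(associations.get(finding, set()))
    return {cond: n for cond, n in counts.items() if n >= min_support}


findings = ["spinal inflammation", "numbness in fingers", "low TC levels"]
common = shared_conditions(findings, ASSOCIATIONS)
print(common)
```

Conditions supported by several independent findings are the candidate "commonalities"; raising `min_support` narrows the list to those shared by every finding.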

Supporting Cascaded Search Operations
  Spinal Inflammation
  Numbness in fingers
  Low TC Levels


New Language Analysis
First use category labels to retrieve candidate documents
Then use language analysis to detect causal relationships between concepts
  Title: Magnesium deficiency implicated in increased stress levels.
  Interpretation: related-to
Use these to find relationships and formulate hypotheses

Statistical Semantic Parsing
Modern statistical techniques
  Mainly applied to syntactic structure
Probabilistic knowledge representation
  Represent hypotheses with different degrees of certainty

Automating Assessment of Web Site Usability

Why Worry?
Problem: IBM's extranet
  Heavy use of help and search
  Unhappy users
Solution
  Massive web site redesign
  Focus on info-organization, not the purchasing process
  Cost: "in the millions"
Results
  Not announced or trumped up
  Use of "help" decreased 84%
  Sales increased 400%

Web TANGO: Tool for Assessing NaviGation & Organization
Goal: automated support for comparing design alternatives
How:
  Assess usability of the information architecture
  Approximate people’s information-seeking behavior (Monte Carlo simulation)
  Output quantitative usability metrics
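A minimal sketch of the Monte Carlo idea, assuming a deliberately crude user model: a simulated visitor follows uniformly random links until reaching a target page, hitting a dead end, or running out of clicks. The `SITE` graph, the function names, and the two metrics are assumptions made for illustration; the slides do not describe the actual Web TANGO simulator at this level of detail.

```python
import random

# Hypothetical site structure: each page maps to the pages it links to.
SITE = {
    "home": ["products", "support", "about"],
    "products": ["home", "pricing"],
    "support": ["home", "faq"],
    "about": ["home"],
    "pricing": [],
    "faq": [],
}


def simulate_visit(site, start, target, max_clicks, rng):
    """Follow random links from `start`; return the number of clicks needed
    to reach `target`, or None if the visit fails."""
    page = start
    for clicks in range(max_clicks + 1):
        if page == target:
            return clicks
        links = site.get(page, [])
        if not links:
            return None  # dead end (this toy model has no back button)
        page = rng.choice(links)
    return None  # click budget exhausted


def estimate_metrics(site, start, target, trials=10_000, max_clicks=6, seed=0):
    """Estimate the success rate and mean clicks-to-target over many visits."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    results = [simulate_visit(site, start, target, max_clicks, rng)
               for _ in range(trials)]
    successes = [r for r in results if r is not None]
    success_rate = len(successes) / trials
    mean_clicks = sum(successes) / len(successes) if successes else None
    return success_rate, mean_clicks


rate, mean_clicks = estimate_metrics(SITE, "home", "faq")
print(rate, mean_clicks)
```

Running the same estimate on two candidate link structures gives comparable numbers for each, which is the kind of design comparison the goal statement describes. A more realistic user model would weight link choices by how well each link label matches the information need.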

Guidelines
There are many usability guidelines
  A survey of 21 sets of web guidelines found little overlap (Ratner et al. 96)
  Why? Our hypothesis: not empirically validated
So … let’s figure out what works!

An Empirical Study: Which features distinguish well-designed web pages?

Methodology
Data collection
  1108 pages
  163 sites
  3 levels per site
14 metrics
  About 85% accurate
  Text cluster and text positioning counts less accurate

Metrics

Preliminary Results
Linear regression to predict Webby judges' ratings
  Top 30% vs bottom 30%
Prediction accuracy:
  72% if categories not taken into account
  83% if categories assessed separately
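To make the prediction setup concrete, here is a toy sketch of using linear regression to separate top-rated from bottom-rated pages. It assumes a single page metric and a 0.5 decision threshold, neither of which comes from the study (which used 14 metrics). The `pages` values are invented for illustration and are not the study's data, so the accuracy it computes says nothing about the 72% and 83% figures above.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_top(x, slope, intercept, threshold=0.5):
    """Classify a page as top-rated if the fitted value reaches the threshold."""
    return slope * x + intercept >= threshold


# Invented toy data: (value of one hypothetical page metric, label),
# where label 1 means top-rated and 0 means bottom-rated.
pages = [(10, 0), (12, 0), (15, 0), (30, 0), (28, 1), (35, 1), (40, 1), (45, 1)]

xs = [x for x, _ in pages]
ys = [y for _, y in pages]
slope, intercept = fit_line(xs, ys)

correct = sum(predict_top(x, slope, intercept) == bool(y) for x, y in pages)
accuracy = correct / len(pages)
print(accuracy)
```

This sketch scores the model on the same pages it was fitted to, which overstates accuracy; a real evaluation would hold out test pages or cross-validate.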

Goals
Create empirical foundations for what is still guesswork
Next step: A free online tool
Long-term goal: A Monte Carlo simulator for comparing potential designs

For More Information