Martin Rajman, EPFL Switzerland & Martin Vesely, CERN Switzerland


Making Statistical Data More Easily Accessible on the Web: Results of the StatSearch Case Study
Martin Rajman, EPFL, Switzerland & Martin Vesely, CERN, Switzerland
Ing-Mari Boynton, Bert Fridlund, Alf Fyhrlund, Peter Lundquist, Bo Sundgren, Helge Thelander, Martin Wänerskär, Statistics Sweden (SCB), Stockholm

Objectives
- The main goal of this contribution is to present the StatSearch prototype and its evaluation;
- StatSearch provides enhanced access to statistical data available on the Web;
- A hybrid search interface is proposed, combining natural language query-based search with semi-automated navigation through a tree-like hierarchical structure over the data to be accessed.

Outline
- The StatSearch prototype
- The graphical interface
- Semi-automated navigation
- The algorithm
- The required hierarchical structure
- Internal evaluation
- Conclusions and future work

The StatSearch prototype
The StatSearch prototype aims to improve access to the statistical data available on the Statistics Sweden (SCB) Web site. It combines semi-automated navigation techniques with query-based information retrieval techniques. The prototype has been implemented and tested on a real sample of over 5000 English-language statistical documents extracted from the SCB Web site.

StatSearch main characteristics
- Graphical User Interface
- Textual similarity computation with respect to queries
- Semi-automated Navigation
- Natural Language Pre-Processing

Graphical User Interface

Semi-automated navigation
Example query: “Gross Domestic Product”
1. Similarity computation (e.g. Cosine, Okapi)
2. Propagation of similarity scores (max. rule)
3. Elimination of irrelevant nodes (score = 0)
4. [Definition of the automated navigation rules (e.g. diff > 0.4)]
5. Application of the automated navigation rules
6. Automated navigation
[Figure: example hierarchy with the node labels National statistics, National accounts, Market, Domestic, Labour, GDP, GNP, Export market and Average salary cost; node scores 0.00, 0.25, 0.67, 1.00; score differences d=0.75, d=0.33, d=0.00]
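
The steps on this slide can be illustrated with a small Python sketch. It is not the StatSearch implementation: the `Node` class, the function names, the bag-of-words cosine measure, the tree shape and the node texts are all assumptions made for illustration. In particular, the rule "diff > 0.4" is read here as "descend automatically into the best-scoring child when its score exceeds the runner-up's score by more than 0.4", which is one plausible interpretation of the d values shown on the slide.

```python
from __future__ import annotations

import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node of the hierarchy; `text` is the content scored against the query."""
    label: str
    text: str = ""
    children: list[Node] = field(default_factory=list)
    score: float = 0.0


def cosine(query: str, text: str) -> float:
    """Step 1: bag-of-words cosine similarity (Okapi would be an alternative)."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    dot = sum(q[w] * t[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in t.values()))
    return dot / norm if norm else 0.0


def propagate(node: Node, query: str) -> float:
    """Step 2: each node takes the maximum of its own score and its children's scores."""
    child_scores = [propagate(child, query) for child in node.children]
    node.score = max([cosine(query, node.text)] + child_scores)
    return node.score


def prune(node: Node) -> None:
    """Step 3: remove subtrees whose propagated score is zero."""
    node.children = [child for child in node.children if child.score > 0]
    for child in node.children:
        prune(child)


def auto_navigate(node: Node, min_diff: float = 0.4) -> list[str]:
    """Steps 4-6: descend while the best child beats the runner-up by more than min_diff."""
    path = [node.label]
    while node.children:
        ranked = sorted(node.children, key=lambda c: c.score, reverse=True)
        runner_up = ranked[1].score if len(ranked) > 1 else 0.0
        if ranked[0].score - runner_up <= min_diff:
            break  # ambiguous choice: hand control back to the user
        node = ranked[0]
        path.append(node.label)
    return path


# Assumed toy hierarchy reusing labels from the slide.
root = Node("National statistics", children=[
    Node("National accounts", children=[
        Node("GDP", "gross domestic product"),
        Node("GNP", "gross national product"),
    ]),
    Node("Labour market", children=[
        Node("Average salary cost", "average salary cost"),
    ]),
])

query = "Gross Domestic Product"
propagate(root, query)
prune(root)
path = auto_navigate(root)
print(path)  # ['National statistics', 'National accounts']
```

In this toy run the "Labour market" branch is pruned because its score is zero, the system descends automatically into "National accounts", and it then stops because GDP (about 1.00) and GNP (about 0.67) differ by less than the threshold, leaving the final choice to the user.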

Evaluation objectives
The main purpose of the evaluation was to carry out an on-site, formative, user-based evaluation of the potential of the StatSearch prototype and to quantify its added value for interactive access to statistical information on the Web.

The internal evaluation (1)
General characteristics:
- On-site testing
- 5 evaluation sessions (1 user per session)
- At least two scenarios per user
- 60 minutes max. per evaluation (incl. interview)
- Distributed over 3 days

The internal evaluation (2)
Structure of an evaluation session (based on experience gained in previous projects):
- Introduction (3-5 min)
  To explain the context and purpose of the evaluation to the evaluators
- Interaction with the prototype based on predefined search scenarios (3-5 min + 3-40 min)
  Search scenarios based on questions most frequently asked by users accessing the SCB Web site
- In-depth interview (10 min)
  To acquire the subjective criteria and general feedback from the evaluators

The internal evaluation (3)
- Combination of quantitative and qualitative approaches
- Objective (observable) criteria are measured
  - Duration of the interaction
  - Number of turns taken
- Subjective (non-observable) criteria are acquired from the users
  - Subjective success rate
  - User-friendliness
- Elements of the evaluation framework and of the system were iteratively modified during the evaluation
- Not focussed on comparative evaluation

Main results
- The evaluators' opinion of the prototype and the evaluation setup was generally positive;
- Navigation was used more often than search (52% vs. 19% of the interaction time);
- The usefulness of visualizing the hierarchical structure of the document collection was emphasized;
- The advanced interface (combining navigation and search) was used more often than the simple (search-only) one (79% vs. 21%); however, the simple interface is still subjectively preferred to the advanced one:
  - The evaluators are used to keyword search techniques
  - The new features need to be better explained (demo, on-line help, …);
- The system is still perceived as being targeted at specialists rather than general users of statistics

Conclusions and future work
- The feedback from the evaluators was positive enough to encourage the members of the StatSearch consortium (EPFL, CERN, SCB) to continue their collaboration and look for new members;
- The novel features in the advanced interface should be explained better (an intuitive metaphor is required);
- A reimplementation of the prototype with optimized relevance computation and automated clustering functionalities is planned for the end of 2005;
- A larger-scale evaluation using the full SCB document database is planned for late 2005 to early 2006.

Thank you for your attention!
Questions?
If you are interested in the StatSearch prototype, contact us!
Martin.Rajman@epfl.ch
http://cdslabs.cern.ch/