Recognizing Document Value from Reading and Organizing Activities in Document Triage Rajiv Badi, Soonil Bae, J. Michael Moore, Konstantinos Meintanis,


Recognizing Document Value from Reading and Organizing Activities in Document Triage
Rajiv Badi, Soonil Bae, J. Michael Moore, Konstantinos Meintanis, Anna Zacchi, Haowei Hsieh, Frank Shipman
Center for the Study of Digital Libraries & Department of Computer Science, Texas A&M University
Catherine C. Marshall
Microsoft Corporation

Document Triage
Document triage is the rapid evaluation of a set of documents for later use.
Document triage places different demands on attention than single-document reading activities.
Continuum of types of reading:
–working in overview (metadata),
–reading at various levels of depth (skimming),
–reading intensively

Visual Knowledge Builder (VKB)

Search in VKB

Supporting Document Triage
Central problem in document triage is limited time.
VKB enables rapid expression of human assessment using visual cues.
Goal: have system aid in selecting documents
How: observe user’s triage activities to provide cues that will aid in the selection of further documents

Process for Providing Support
1. Recognize user interest in and interpretation of documents
2. Generate an abstract representation of user interests
3. Identify documents that match these interests
4. Provide visual cues to indicate the potential value of documents

Acquiring User Interest Model
Explicit Methods
–users tend not to provide explicit feedback
Implicit Methods
–Reading time has been used in many cases
–Scrolling and mouse events have been shown somewhat predictive
–Annotations have been used to identify passages of interest
Problem: Individuals vary greatly and have idiosyncratic work practices

Data from an Earlier Study
Task: subjects placed in role of a reference librarian, selecting and organizing information on ethnomathematics for a teacher
Setting: top 20 search results from NSDL & top 20 search results from Google
Subjects given as much time as they deemed necessary (after training)
After completing task, the 24 subjects were asked to identify:
–5 documents they found most valuable
–5 documents they found least valuable

[Screenshot slide: VKB with Internet Explorer (IE)]

What Actions Were Correlated with Document Preferences?
Lots (ordered from most to least correlated)
–Number of object moves
–Scroll offset
–Number of scrolls
–Number of border color changes
–Number of object resizes
–Total number of scroll groups
–Number of scrolling direction changes
–Number of background color changes
–Time spent in document
–Number of border width changes
–Number of object deletions
–Number of document accesses
–Length of document in characters
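A ranking like the one above can be produced by correlating each logged activity measure with the subjects' document preferences. The slides do not say which correlation statistic was used, so the sketch below assumes Pearson correlation against a binary label (1 for "most valuable", 0 for "least valuable"). The feature names and all numbers are hypothetical toy data, not values from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(features, preference):
    """Return (feature name, correlation) pairs, strongest correlation first."""
    scored = [(name, pearson(values, preference)) for name, values in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical per-document activity counts (one entry per document).
features = {
    "object_moves": [5, 4, 0, 1],
    "time_in_document": [30, 10, 25, 12],
    "document_length": [100, 100, 100, 100],
}
# Assumed labels: 1 = picked as most valuable, 0 = picked as least valuable.
preference = [1, 1, 0, 0]

ranking = rank_features(features, preference)
for name, r in ranking:
    print(f"{name}: {r:.3f}")
```

Sorting by absolute value treats a strongly negative correlation as just as informative as a strongly positive one; whether the original ordering did the same is not stated.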

Modeling based on Reading and Interpretation
Document triage combines multiple forms of reading and interpretation
Infrastructure for applications to construct and share interest models
[Diagram components: Location/Overview Application, Organizing Application, Reading Applications, User Interest Estimation Engine, Interest Profile Manager, Interest Profile]
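To make the idea of a shared interest model concrete, here is a minimal sketch in which several applications report activity to one manager, and an estimation function turns the pooled activity into per-document interest scores. The class and method names, the event names, and the weights are all hypothetical; the slides name the components but do not describe their interfaces.

```python
from collections import defaultdict


class InterestProfileManager:
    """Pools activity reported by several applications (hypothetical interface)."""

    def __init__(self, estimator):
        # estimator: function mapping {event name: amount} to an interest score.
        self._estimator = estimator
        self._activity = defaultdict(lambda: defaultdict(float))

    def report(self, document_id, event, amount=1.0):
        """Called by a reading or organizing application when the user acts."""
        self._activity[document_id][event] += amount

    def interest_profile(self):
        """Return the current {document id: estimated interest} mapping."""
        return {
            doc: self._estimator(dict(events))
            for doc, events in self._activity.items()
        }


def toy_estimator(events):
    """Stand-in for an estimation engine; the weights are made up."""
    return 2.0 * events.get("object_move", 0.0) + 0.1 * events.get("seconds_reading", 0.0)


manager = InterestProfileManager(toy_estimator)
manager.report("doc-1", "object_move")            # e.g. from the organizing application
manager.report("doc-1", "seconds_reading", 40.0)  # e.g. from a reading application
manager.report("doc-2", "seconds_reading", 5.0)
profile = manager.interest_profile()
print(profile)
```

Keeping the estimator separate from the manager means the reading-activity, organizing-activity, and combined models could be swapped in without changing the applications that report events.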

Interest Models
Based on data from an earlier study, we developed four interest models
–Three were mathematically derived: Reading-Activity Model, Organizing-Activity Model, Combined Model
–One hand-tuned model included human assessment based on observations of user activity and interviews with users.

Quick Comparison of Models
How much of difference in original data was modeled?
–Reading-activity model: 47.7%
–Organizing-activity model: 63.6%
–Combined model: 70.8%
How well would models do for new data?
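Percentages like these are commonly reported as the coefficient of determination (R²), the share of variation in the observed values that a model's predictions account for. The slides do not state which statistic or fitting procedure was used, so treating the figures as R² of a weighted linear model is an assumption. The weights, features, and ratings below are hypothetical toy values.

```python
def predict(weights, bias, features):
    """Linear interest estimate: bias plus a weighted sum of activity features."""
    return bias + sum(w * x for w, x in zip(weights, features))


def r_squared(actual, predicted):
    """Fraction of the variation in `actual` that `predicted` accounts for."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_residual / ss_total


# Hypothetical model: two activity features per document, made-up weights.
weights = [0.5, 1.0]
bias = 1.0
rows = [[1, 0], [1, 0], [3, 1], [3, 1]]
actual = [1, 2, 3, 4]  # hypothetical document values

predicted = [predict(weights, bias, row) for row in rows]
explained = r_squared(actual, predicted)
print(f"{explained:.1%} of the variation is modeled")
```

A value measured on the same data used to fit the weights tends to be optimistic, which is why the follow-up question about new data matters.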

Evaluation of Models
16 subjects with the same
–Task (collecting information on ethnomathematics for a teacher) and
–Setting (20 NSDL and 20 Google results)
Different display configuration
–A single display was used in this study, whereas two displays were used in the earlier one
Different rating of documents
–Subjects rated all documents on a 5-point Likert scale (with 1 meaning “not useful” and 5 meaning “very useful”)

Predictive Power of Models
Models were conservative due to data from original study.
Used aggregated user activity and user evaluations to evaluate models
Table columns: Model, Avg. Residue, Std. Dev. (numeric values not preserved in this transcript)
–Reading-activity model
–Organizing-activity model
–Combined model
–Hand-tuned model
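The two table columns can be computed from paired predictions and ratings. This sketch assumes "residue" means the absolute difference between a model's predicted value and the aggregated Likert rating for a document, and that the standard deviation is the sample standard deviation of those differences; the slides do not define either term. The numbers are hypothetical.

```python
from statistics import mean, stdev


def residue_summary(predicted, rated):
    """Return (average residue, sample std. dev.) over paired values."""
    residues = [abs(p - r) for p, r in zip(predicted, rated)]
    return mean(residues), stdev(residues)


# Hypothetical per-document values: model output vs. mean Likert rating (1 to 5).
predicted = [3.0, 2.5, 4.0, 3.5]
rated = [4.0, 2.0, 5.0, 3.0]

avg_residue, residue_sd = residue_summary(predicted, rated)
print(f"avg residue = {avg_residue:.2f}, std dev = {residue_sd:.3f}")
```

Using absolute differences keeps over- and under-predictions from cancelling out; a signed residue would instead show whether a model is systematically conservative.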

Size of Errors

Next Steps
Update models
–Revise weights based on Likert-scale data
–Incorporate additional features of user activity
Run another set of subjects with same form of document evaluation
Evaluate predictive power for individuals
Evaluation with other domains/tasks
–Effect of document set
–Effect of domain/subject matter expertise

Summary
Our goal is to support document triage by inferring user interest
Developed infrastructure for applications to share interest model
Compared reading-activity, organizing-activity, and combined models
Combined model better than reading-activity model (p=0.02) and organizing-activity model (p=0.07).
Lots of work left to do …

Contact Information Frank Shipman Download VKB 2 from: