Human Memory Model Predicting Document Access in Large Multimedia Repositories (1996) JAMES E. PITKOW, MARGARET M. RECKER Sam Boham, Asif Hussaini, Christian.

Presentation transcript:

Human Memory Model Predicting Document Access in Large Multimedia Repositories (1996) JAMES E. PITKOW, MARGARET M. RECKER Sam Boham, Asif Hussaini, Christian Lorenz, Ed Watson

Content 1. Introduction 2. Paper 2.1. Model 2.2. Repository 2.3. Analysis Frequency Analysis/Results Recency Analysis/Results Combining Frequency and Recency 2.4. Applications 2.5. Conclusions 3. Future Work 3.1 By Authors 3.2 By Other People 4. Summary

1. Introduction “We may look into that window on the mind as through a glass darkly, but what we are beginning to discern there looks very much like a reflection of the world” [Shepard 1990, p. 213]

The Model The study of human memory has a long tradition in the psychological literature. Review of the memory literature by Anderson and Schooler. Relationship R1: practice trials → subsequent performance during test. Known as the power law of practice. E.g.: 10 light bulbs, push the button with the corresponding finger. Robust relationship (in motor-perceptual and cognitive tasks). Relationship R2 (from related memory research): time delay in representation → subsequent performance on recall.

The Model (2) R1 and R2: approximated as a power function. Form of a power function: P = A * T^(-b), where P is the measure of performance, T represents time, and A and b are parameters of the model. Obtaining a linear relationship: log P = log A - b * log T. Figure on next slide. [Schooler and Anderson 1991, p. 397]
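The linearisation above means A and b can be estimated with an ordinary least-squares line through the points (log T, log P). The following is a minimal sketch of that idea, not the procedure used in the paper; the function name `fit_power_law` and the synthetic data are assumptions made for illustration.

```python
import math

def fit_power_law(times, performance):
    """Fit P = A * T^(-b) by least squares on log P = log A - b * log T.

    Returns the pair (A, b). All inputs must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(p) for p in performance]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of the log-log line equals -b
    intercept = mean_y - slope * mean_x    # intercept equals log A
    return math.exp(intercept), -slope

# Synthetic, noise-free data generated from P = 2.0 * T^(-0.5) (illustrative only)
times = [1, 2, 4, 8, 16]
performance = [2.0 * t ** -0.5 for t in times]
A, b = fit_power_law(times, performance)
# For this noise-free data A is close to 2.0 and b is close to 0.5
```

With real data the points scatter around the line, and the quality of the fit indicates how well a power function describes the relationship.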

The Model (3) [Anderson 1981, p. 10]

The Model (4) New approach (Anderson and Schooler): environmental explanation for these relationships (R1/R2). The memory system has adapted to the structure of the environment. Memory system: makes available the memories that are most likely to be needed. Need probability (p): the probability that a particular item will be needed at the present moment. Most items are not needed; only a few are needed frequently. Need odds: new distribution ==> "need odds" = p/(1-p)

The Model (5) p ∈ [0, 1]; p/(1-p) ∈ [0, +∞]; log(p/(1-p)) ∈ [-∞, +∞]. Algorithm that analyses the occurrence of items in large repositories –Predicts future access –Applied to the analysis of repositories of information in terms of: frequency, recency, spacing rates (not observed). Repositories used: newspaper headlines of the New York Times; utterances made to children; addresses from e-mails sent to one person.
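The need-odds transformation and a frequency analysis of an access log can be sketched as follows. This is an illustrative reading of the approach under stated assumptions, not the authors' code: the 7-day window, the single-day pane, the document names, and the function names are all hypothetical.

```python
import math
from collections import defaultdict

def need_odds(p):
    """Need odds p / (1 - p) for a need probability 0 <= p < 1."""
    return p / (1.0 - p)

def log_need_odds(p):
    """Log of the need odds; defined for 0 < p < 1."""
    return math.log(need_odds(p))

def need_probability_by_frequency(access_days, window_days, pane_day):
    """Estimate need probability as a function of access frequency.

    access_days maps each document to the set of days on which it was accessed.
    Frequency is the number of window days with at least one access; the need
    probability for a frequency is the share of documents with that frequency
    that were accessed again on the pane day.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for days in access_days.values():
        freq = sum(1 for d in window_days if d in days)
        totals[freq] += 1
        if pane_day in days:
            hits[freq] += 1
    return {freq: hits[freq] / totals[freq] for freq in totals}

# Hypothetical log: days 0-6 form the window, day 7 is the pane
access_days = {
    "a.html": {0, 1, 2, 3, 7},
    "b.html": {0, 1, 2, 3},
    "c.html": {5},
    "d.html": {6},
    "e.html": {2, 7},
    "f.html": {4},
}
probs = need_probability_by_frequency(access_days, range(7), 7)
# probs[4] is 0.5 (one of two documents), probs[1] is 0.25 (one of four)
odds = {freq: need_odds(p) for freq, p in probs.items()}
```

Plotting log frequency against log need odds is what turns a power-function relationship into a straight line.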

The Repository The Georgia Tech WWW repository is a dynamic information ecology: over 2000 multimedia documents; fluctuations in document access; monthly updated data; document deletions, insertions, renamings. However, these are fundamental characteristics of a dynamic information ecology → need to develop methods for prediction and information-seeking patterns

Frequency Analysis/Results [Recker and Pitkow 1996, p. 358]

Frequency Analysis/Results (2) 72% [Recker and Pitkow 1996, p. 361]

Recency Analysis/Results [Recker and Pitkow 1996, p. 358]

Recency Analysis/Results (2) 92% [Recker and Pitkow 1996, p. 363]

Combining Recency and Frequency 97% [Recker and Pitkow 1996, p. 366]

Applications Use on non-text information to increase relevance. Design of information systems –When dynamic pages are involved. Navigation strategies –Design of websites. Visualization of access

Applications (2) Caching algorithms –Removes the need for heuristics [Recker and Pitkow 1996, p. 371]
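To make the caching application concrete, here is a minimal sketch of a cache that evicts the document with the lowest estimated need score instead of following a fixed rule such as least-recently-used. The scoring formula (log frequency minus log age), both weights, and the class name `NeedOddsCache` are assumptions for illustration; the slides do not give the fitted parameters.

```python
import math

class NeedOddsCache:
    """Toy cache that evicts the entry with the lowest need score.

    The score is a hypothetical stand-in for log need odds:
    freq_weight * log(access count) - recency_weight * log(time since last access).
    """

    def __init__(self, capacity, freq_weight=1.0, recency_weight=1.0):
        self.capacity = capacity
        self.freq_weight = freq_weight
        self.recency_weight = recency_weight
        self.entries = {}  # document -> [access_count, last_access_time]

    def _score(self, doc, now):
        count, last = self.entries[doc]
        age = max(now - last, 1)  # avoid log(0) for same-tick accesses
        return self.freq_weight * math.log(count) - self.recency_weight * math.log(age)

    def access(self, doc, now):
        """Record an access at time `now`; return the evicted document, if any."""
        if doc in self.entries:
            self.entries[doc][0] += 1
            self.entries[doc][1] = now
            return None
        evicted = None
        if len(self.entries) >= self.capacity:
            evicted = min(self.entries, key=lambda d: self._score(d, now))
            del self.entries[evicted]
        self.entries[doc] = [1, now]
        return evicted

cache = NeedOddsCache(capacity=2)
cache.access("a.html", now=1)
cache.access("a.html", now=2)
cache.access("b.html", now=3)
evicted = cache.access("c.html", now=10)
# evicted is "b.html": it is more recent than a.html but was accessed only once
```

A plain least-recently-used cache would have evicted a.html here; the sketch shows how combining frequency and recency can change the decision.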

Conclusion Good for caching in systems where there are a lot of users with few requests each. The model seems to have reached its limits in terms of progress; it does not seem to have been expanded on. For documents with more than 100 accesses there is increased variability in the predicted probability of access. These effects have to be dismissed, and so the model loses strength. The window and the pane were chosen non-empirically

James Pitkow Used this model to help cluster web pages –Life, Death, and Lawfulness on the Electronic Frontier (1997). Included it as part of the overall picture of relating documents together. Explains how the desirability of information changes with time. Not much further work after 1999 [Pitkow and Pirolli 1997, p. 387]

James Pitkow (2) Looked at the problem using different models. Strong Regularities in World Wide Web Surfing (1998) –Looking at page hits –Using real data: Xerox, AOL, etc. –Modelling using an inverse Gaussian distribution. Visualisations to view the problem: Visualizing the Evolution of Web Ecologies (1998); Emerging Trends in the WWW User Population (1996) [Chi, Pitkow, Mackinlay, Pirolli, Gossweiler and Card 1998]

Future Work WebViz: A Tool for WWW Access Log Analysis –A tool for database designers and maintainers, giving them a graphical display of the data. –The tool establishes an access pattern. –The tool helps guide structural and contextual changes, resulting in more efficient use of the document space.

Future Work (2) “Stuff I’ve Seen” - A System for Personal Information Retrieval and Re-use –Assumes: Most knowledge work involves finding and re-using previously used information –The system provides a unified index of information that a person has seen before –Uses rich contextual clues –Users found information more easily when using “Stuff I’ve Seen”

Future Work (3) Characterizing Reference Locality in the WWW (1996) Presents a new model for characterizing web access patterns for engineering web caching systems. Based on work by Pitkow and Recker. Combines both spatial and temporal information: –Spatial locality: data stored close together –Temporal locality: the property that data is likely to be accessed again soon after being accessed
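One common way to quantify temporal locality in a request stream is the LRU stack distance: for each repeated request, how many distinct documents have been touched since that document was last requested. The sketch below illustrates that general measure; the slides do not say exactly how the cited paper measures locality, and the function name and sample requests are hypothetical.

```python
def stack_distances(requests):
    """Return the LRU stack distance of every request in order.

    A distance of 1 means the document was the most recently used one;
    None marks the first request for a document.
    """
    stack = []      # least recently used first, most recently used last
    distances = []
    for doc in requests:
        if doc in stack:
            pos = stack.index(doc)
            distances.append(len(stack) - pos)
            stack.pop(pos)
        else:
            distances.append(None)
        stack.append(doc)  # the requested document becomes most recently used
    return distances

requests = ["a", "b", "a", "c", "b", "b"]
distances = stack_distances(requests)
# distances is [None, None, 2, None, 3, 1]
```

A stream with strong temporal locality produces mostly small distances, which is the situation in which a small cache already captures many repeat requests.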

Summary Well-tested model. Accurately predicts future use. Wide range of applications. Has been taken further, but only in a limited way.

References Shepard, R. N. (1990). Mind Sights. New York: Freeman. Schooler, L. J. and Anderson, J. R. (1991). Reflections of the Environment in Memory. Anderson, J. R. (1981). Cognitive Skills and Their Acquisition. Recker, M. M. and Pitkow, J. E. (1996). Predicting Document Access in Large Multimedia Repositories. Pitkow, J. and Pirolli, P. (1997). Life, Death, and Lawfulness on the Electronic Frontier. Conference on Human Factors in Computing Systems, CHI '97, Atlanta. Chi, E. H., Pitkow, J., Mackinlay, J., Pirolli, P., Gossweiler, R. and Card, S. K. (1998). Visualizing the Evolution of Web Ecologies. ACM Conference on Human Factors in Computing Systems (CHI '98), Los Angeles.