Secondary Evidence for User Satisfaction With Community Information Systems
Gregory B. Newby, University of North Carolina at Chapel Hill
ASIS 1999 Midyear Meeting


What do we want to know?
- Who are the information seekers/users?
- What are their needs?
- Are their needs being met?
- Context: the goals and missions of the community net

What else do we want to know?
- Are people viewing sponsorship information?
- Reading policy documents?
- Displaying images?
- Using search engines or indexes?
- Local or remote?
- Browsing or reading?

Possible sources of evidence
- Content analysis: what’s available on the system(s)? Questions asked.
- Sociological research: talk to people, look at what they use the net for, etc.
- Psychological research: evaluate cognitive change in user knowledge, etc.
- Market research: broad data collection from multiple potential audiences

More possible sources of evidence
- Secondary data: artifacts generated by information system use
- Today’s focus: analysis of log file entries
  – Web usage statistics
  – Instrumenting online menu systems
  – Login or call history
  – Other system logs (e-mail, FTP)

What questions may be asked of secondary data?
- What content is accessed, with what frequency?
- What paths are followed to content?
- Are entry points, policy documents, or other front-end material bypassed?
- Is content read, skimmed, or skipped through?
- What subsets of content are viewed by individuals (patterns of use)?

What’s wrong with Web server logs?
- Aggregate-level access to content: not the whole story!
- What are SESSIONS like (a sequence of accesses by a single person)?
- What are the paths from item to item (transcends a single “referrer” log)?
- Are data used linearly (following hyperlinks)?
- How long is spent on a document?

More analysis is feasible. Sample: Web server logs
- Single-line entries for each “hit” (HTTP “GET” or similar request)
- Separate file for errors, referrers
- Sample entry:
  56kdial52.absi.net - - [22/May/1999:20:12: ] "GET /index.html HTTP/1.0"
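A log line like the sample entry can be pulled apart with a regular expression. This is a minimal Python sketch assuming lines follow the Common Log Format; the full timestamp in the example below is invented for illustration, since the slide's sample entry is truncated.

```python
import re
from datetime import datetime

# Regex for the leading fields of a Common Log Format entry:
# host, two ignored fields, bracketed timestamp, quoted request.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)"'
)

def parse_hit(line):
    """Parse one access-log line into a (host, timestamp, path) tuple,
    or return None for lines that do not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    # Drop a trailing timezone offset, if present, before parsing.
    stamp = m.group('time').split()[0]
    when = datetime.strptime(stamp, '%d/%b/%Y:%H:%M:%S')
    return m.group('host'), when, m.group('path')

# Seconds and timezone are hypothetical; the slide's entry omits them.
hit = parse_hit('56kdial52.absi.net - - [22/May/1999:20:12:34 -0400] '
                '"GET /index.html HTTP/1.0"')
```

Parsing into tuples first makes the later session and path questions a matter of grouping and counting.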

Sources of complexity:
- Multiple types of servers might be on a single system (e.g., RealServer, database server, search engine)
- A Web page visit might involve many files
- Frames and other authoring techniques can confuse
- More than one person might use the same remote computer

Question: Can we get the “story” of a session?
- Yes! Just track through all the “hits” from the same host within a narrow time period
  – Challenge: how narrow a time period?
  – Challenge: some hosts support multiple simultaneous users (but not many)
  – Challenge: lots of files per page might confuse things (but narrow, +/- a few seconds, time frames can help)
  – Challenge: what is the structure of the site?
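The tracking idea above can be sketched as a simple sessionizer, assuming hits have been parsed into (host, time, path) tuples. The 30-minute gap is an assumed cutoff for illustration; how narrow the window should be is exactly the open challenge noted above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed cutoff; tune per site

def sessionize(hits):
    """Group (host, time, path) hits into per-host sessions, starting a
    new session whenever the gap between consecutive hits from the same
    host exceeds SESSION_GAP.  Returns (host, [(time, path), ...]) pairs."""
    by_host = defaultdict(list)
    for host, when, path in hits:
        by_host[host].append((when, path))
    sessions = []
    for host, visits in by_host.items():
        visits.sort()
        current = [visits[0]]
        for prev, nxt in zip(visits, visits[1:]):
            if nxt[0] - prev[0] > SESSION_GAP:
                sessions.append((host, current))
                current = []
            current.append(nxt)
        sessions.append((host, current))
    return sessions
```

Note that this cannot separate two simultaneous users behind one host; as the slide says, that case is rare but real.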

Sample “GET” might include multiple files:
  [20/May/1999:18:44: ] "GET /~gbnewby/inls80/explore2.html HTTP/1.1"
  [20/May/1999:18:44: ] "GET /~gbnewby/inls80/octo.gif HTTP/1.1"
  [20/May/1999:18:44: ] "GET /~gbnewby/inls80/pmail.gif HTTP/1.1"

Here’s a “story” (gbn’s pages):
  [08/May/1999:09:30: ] "GET /~gbnewby/index_top.html HTTP/1.0"
  [09/May/1999:00:44: ] "GET /~gbnewby/index_top.html HTTP/1.0"
  [09/May/1999:11:43: ] "GET /gbnewby/forms HTTP/1.0"
  [09/May/1999:12:06: ] "GET /gbnewby/forms/ HTTP/1.0"
  [09/May/1999:16:36: ] "GET /~gbnewby HTTP/1.0"
  [09/May/1999:17:44: ] "GET /~gbnewby/ HTTP/1.0"
  [10/May/1999:06:20: ] "GET /gbnewby/review2.html HTTP/1.0"
  [10/May/1999:09:33: ] "GET /gbnewby/vita.html HTTP/1.0"
  [10/May/1999:13:33: ] "GET /gbnewby/inls80/explore1.html HTTP/1.0"
  [11/May/1999:02:43: ] "GET /gbnewby/inls80/explore2.html HTTP/1.0"
  [11/May/1999:09:21: ] "GET /~gbnewby/vita.html HTTP/1.0"
  [11/May/1999:10:05: ] "GET /gbnewby/presentations/security.html HTTP/1.0"
  [11/May/1999:13:35: ] "GET /gbnewby/index_top.html HTTP/1.0"

Question: What are entry points for particular documents?
- You’re on easy street with httpd “referrer” logs, but these are often not kept (for efficiency)
- Otherwise, you don’t know where someone came from unless it was from YOUR site
- By looking through a session “story” you can see the path people take to particular pages. Analyze finding aids!

Here’s a path, including searching and reading:
  [20/May/1999:11:08: ] "GET /docsouth HTTP/1.0"
  [20/May/1999:11:08: ] "GET /docsouth/dasmain.html HTTP/1.0"
  [20/May/1999:11:08: ] "GET /docsouth/dasnav.html HTTP/1.0"
  [20/May/1999:11:08: ] "GET /docsouth/images/greensquare.gif HTTP/1.0"
  [20/May/1999:11:08: ] "GET /docsouth/search.html HTTP/1.0"

(Part II. This is via metalab.unc.edu)
  [20/May/1999:11:08: ] "GET /docsouth/images/greenarrow.gif HTTP/1.0"
  [20/May/1999:11:19: ] "GET /docsouth/southlit/southlit.html HTTP/1.0"
  [20/May/1999:11:20: ] "GET /docsouth/southlit/southlitmain.html HTTP/1.0"
  [20/May/1999:11:20: ] "GET /docsouth/southlit/southlitnav.html HTTP/1.0"

(Part III.)
  [20/May/1999:11:38: ] "GET /docsouth/neh/neh.html HTTP/1.0"
  [20/May/1999:11:38: ] "GET /docsouth/neh/nehmain.html HTTP/1.0"
  [20/May/1999:11:38: ] "GET /docsouth/neh/nehnav.html HTTP/1.0"
  [20/May/1999:11:39: ] "GET /docsouth/neh/specialneh.html HTTP/1.0"
  [20/May/1999:11:39: ] "GET /docsouth/neh/texts.html HTTP/1.0"
  [20/May/1999:11:40: ] "GET /docsouth/harriet/menu.html HTTP/1.0"
  [20/May/1999:11:40: ] "GET /docsouth/harriet/small.gif HTTP/1.0"
  [20/May/1999:11:41: ] "GET /docsouth/harriet/harriet.html HTTP/1.0"
  [20/May/1999:11:41: ] "GET /docsouth/harriet/harrietcva.gif HTTP/1.0"
  [20/May/1999:11:41: ] "GET /docsouth/harriet/harriettpa.gif HTTP/1.0"
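Without referrer logs, the "analyze finding aids" idea amounts to counting which page immediately preceded a target page within each session. A minimal sketch, assuming sessions have already been grouped as (host, [(time, path), ...]) pairs:

```python
from collections import Counter

def entry_paths(sessions, target):
    """Count which page immediately preceded `target` in each session.
    `sessions` is assumed to be a list of (host, [(time, path), ...])
    pairs, i.e. hits already grouped into per-visitor sessions."""
    preceding = Counter()
    for _host, visits in sessions:
        paths = [path for _time, path in visits]
        for prev, nxt in zip(paths, paths[1:]):
            if nxt == target:
                preceding[prev] += 1
    return preceding
```

Run over many sessions, the most common predecessors of a document show which indexes and search pages actually lead readers to it.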

Question: Where do people go from a particular location?
- Again, your “story” logs can track this
- Again, caching is a particular challenge. For example, a user might follow hyperlinks, but the logs show discontinuities (because they went via a cached document)

Sample: going from specifics, to index, to sub-index
  4blah18.blahinc.com - - [22/May/1999:00:21: ] "GET /mrm/father.html HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:21: ] "GET /mrm/bluegrass.gif HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:27: ] "GET /index.html HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:27: ] "GET /beige_pale.gif HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:27: ] "GET /pnetlogo.gif HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:28: ] "GET /directory.html HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:28: ] "GET /directory/culture.html HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:28: ] "GET /directory/buggy.jpg HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:28: ] "GET /prairienations/index.htm HTTP/1.0"
  4blah18.blahinc.com - - [22/May/1999:00:30: ] "GET /directory/nature.html HTTP/1.0"

Question: How long is spent on a document?
- Easy: inter-click time from a session
- You could even make an “average time per document” for some gateway documents (such as user agreements). Or, infer AT/D by tracking those sessions that “seem” to be contiguous. This is challenging: what if someone goes to another site, or takes a nap?
- Caching is still a problem
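The inter-click estimate above can be sketched directly: the time "spent" on a page is the gap until the next request in the same session. A minimal sketch over one session's visit list; the caveats about naps, other sites, and caching apply to its output.

```python
from datetime import datetime, timedelta

def dwell_times(visits):
    """Estimate seconds spent on each page in one session, where
    `visits` is an ordered [(time, path), ...] list.  The dwell time
    for a page is the gap until the next request; the final page's
    time is unknowable from the log alone (the visitor may simply
    leave, or take a nap)."""
    times = {}
    for (t1, page), (t2, _nxt) in zip(visits, visits[1:]):
        times.setdefault(page, []).append((t2 - t1).total_seconds())
    return times
```

Averaging these per-page lists across many sessions gives the rough "average time per document" figure for gateway documents mentioned above.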

Analysis of other secondary sources of data
- See Newby & Bishop 1997 for instrumentation of menu systems
  – Log choices of menu options
  – Correlate with basic user demographics (collected online)
  – Problem: most modern systems are not login-based, they’re Web-based
- Access logs: are people coming in from dial-up lines, academic locations, etc.? Dial-up = watch graphics!
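The dial-up vs. academic distinction can be approximated from the remote hostname in the access log. A heuristic sketch; the suffix and substring lists are illustrative guesses, not an exhaustive classification.

```python
def origin(host):
    """Roughly classify a remote hostname's likely origin.  The
    suffixes and substrings below are illustrative heuristics only."""
    h = host.lower()
    if h.endswith('.edu') or h.endswith('.ac.uk'):
        return 'academic'
    if any(token in h for token in ('dial', 'ppp', 'modem')):
        return 'dial-up'
    return 'other'
```

Tallying `origin()` over all hits shows whether graphics-heavy pages are being served mostly to slow dial-up lines.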

Conclusions
- The “easy” automated tools for Web log analysis are insufficient
- They could be extended with some programming effort or utilities
- “Eyeballing” the logs is still useful
- Be cautious about privacy: both your own site’s policy, and the problems of posting some log data