
1 Analyzing Web Logs
Sarah Waterson
18 April 2002
SIMS 213, Group for User Interface Research

2 Talk Outline
- What is a web log?
- Where do they come from?
- Why are they relevant?
- How can we analyze them?
- Study
- Discussion

3 What is a web log?
A record of a visit to a web page:
- Visitor (IP address)
- URL
- Time of visit
- Time spent on a page
- Browser used
- Referring URL
- Type of request
- Reply code
- Number of bytes in the reply
- etc…

4 What is a clickstream?
A record of a path through web pages (see the sketch below):
- Visitor (IP address)
- URL
- Time of visit
- Time spent on a page
- Browser used
- Referring URL
- Type of request
- Reply code
- Number of bytes in the reply
- Next URL
- etc…
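
As a rough illustration of the fields above, one clickstream record could be modeled like this. This is a minimal sketch in Python; the class and field names are my own, not a standard schema from the talk.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ClickstreamRecord:
    """One entry in a visitor's path through a site (illustrative field names)."""
    visitor_ip: str                 # visitor (IP address)
    url: str                        # page requested
    visit_time: datetime            # time of visit
    time_on_page: Optional[float]   # seconds spent on the page, if it can be inferred
    browser: str                    # user agent string
    referrer: Optional[str]         # referring URL
    method: str                     # type of request (GET, POST, ...)
    status: int                     # reply code
    bytes_sent: int                 # number of bytes in the reply
    next_url: Optional[str]         # next page in the path, if known
```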

5 What is a Web Log?
Apache web log:

205.188.209.10 - - [29/Mar/2002:03:58:06 -0800] "GET /~sophal/whole5.gif HTTP/1.0" 200 9609 "http://www.csua.berkeley.edu/~sophal/whole.html" "Mozilla/4.0 (compatible; MSIE 5.0; AOL 6.0; Windows 98; DigExt)"
216.35.116.26 - - [29/Mar/2002:03:59:40 -0800] "GET /~alexlam/resume.html HTTP/1.0" 200 2674 "-" "Mozilla/5.0 (Slurp/cat; slurp@inktomi.com; http://www.inktomi.com/slurp.html)"
202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/indextop.html HTTP/1.1" 200 3510 "http://www.csua.berkeley.edu/~tahir/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
202.155.20.142 - - [29/Mar/2002:03:00:14 -0800] "GET /~tahir/animate.js HTTP/1.1" 200 14261 "http://www.csua.berkeley.edu/~tahir/indextop.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
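
These lines are in Apache's "combined" log format. A minimal sketch of how one such line could be parsed in Python follows; the regex and field names are my own illustration, not part of the original slides.

```python
import re
from datetime import datetime
from typing import Optional

# Apache combined format: host ident user [time] "request" status bytes "referrer" "user-agent"
COMBINED_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line: str) -> Optional[dict]:
    """Parse one combined-format log line into a dict, or return None if it does not match."""
    m = COMBINED_RE.match(line)
    if not m:
        return None
    entry = m.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry

line = ('205.188.209.10 - - [29/Mar/2002:03:58:06 -0800] "GET /~sophal/whole5.gif HTTP/1.0" '
        '200 9609 "http://www.csua.berkeley.edu/~sophal/whole.html" "Mozilla/4.0 (compatible; MSIE 5.0)"')
print(parse_line(line)["url"])   # /~sophal/whole5.gif
```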

6 Where do they come from?
Servers
- Done on most web servers
- Standard formats
Clients
- Browsers, loggers on the client machine
- Must send data back
Proxies (proxy logs)
- Similar to servers
- Hang out in between client and server

7 Why are web logs relevant?
- Lots of data
  - Quantitative analysis is much more fun!
- User behavior, patterns
- Real users, tasks
  - Or at least more realistic users and tasks
- Leaving the usability lab
  - Testing effect
- Fast, easy, cheap
  - Automatic or almost-automatic

8 Ed Chi asks… Usage:
- How has information been accessed?
- How frequently?
- What’s popular? What’s not?
- How do people enter the site? Exit?
- Where do people spend time?
- How long do they spend there?
- How do people travel within the site?
- Who are the people visiting?

9 Ed Chi asks… Structural:
- What information has been added, deleted, modified, moved?
Usage + Structural:
- What happens when the site changes? (Google)
- Does navigation change?
- Does popularity change?
- What about missing data?

10 How do you analyze web logs?
1. Data Mining: task or intent unknown
   - "Automated extraction of hidden predictive information from (large) databases" – Kurt Thearling
   - Server log analysis
2. Remote Usability Testing: task or intent known
   - Similar to traditional lab usability testing
   - Clickstream analysis
What are people doing? How well does the site support what people are doing?

11 How? Data Mining
Statistics and numbers galore!
- Gazillions of tools for server log analysis (Computers > Software > Internet > Site Management > Log Analysis)
- Usually charts, graphs, numbers galore (see the sketch below)
- Analog & NetTracker typical statistics
- In 3D too (eBizinsights)
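
To make "charts, graphs, numbers galore" concrete, here is a minimal sketch of the kind of counts a server log analyzer produces. It is my own illustration, assuming entries parsed into dicts as in the earlier parse_line() sketch, and is not how Analog or NetTracker are actually implemented.

```python
from collections import Counter

def summarize(entries):
    """Basic server-log statistics: hits per URL, unique visitors, status codes, top referrers."""
    hits_per_url = Counter(e["url"] for e in entries)
    unique_visitors = len({e["host"] for e in entries})
    status_codes = Counter(e["status"] for e in entries)
    referrers = Counter(e["referrer"] for e in entries if e["referrer"] not in ("", "-"))
    return {
        "total_hits": len(entries),
        "unique_visitors": unique_visitors,
        "top_pages": hits_per_url.most_common(10),
        "status_codes": dict(status_codes),
        "top_referrers": referrers.most_common(10),
    }

# Usage (hypothetical file name): entries = [e for e in map(parse_line, open("access.log")) if e]
```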

12 How? Data Mining cont’d
Other interesting work:
- Web Ecologies (Chi)
  - Development over time
- Information scent (Chi)
  - Behavior patterns
  - Understand how to organize info
  - "Information scent is made of cues that people use to decide whether a path is interesting."
- Useful for web designers?

13 Web Ecologies (Chi 1998)

14 How? Remote Usability Testing
- Analyze clickstream in the context of the task and user intentions (sessionizing sketch below)
- Can be gathered on client, server, and via proxy
- Varied granularities of interaction: mouse movements → page access
- Varied levels of user awareness: interactive → invisible
- Varied levels of access: site only → entire web
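
Before a clickstream can be analyzed against a task, the raw log entries have to be grouped into per-visitor sessions. The following is a minimal sketch under common but simplifying assumptions (visitors identified only by IP, a session split after 30 minutes of inactivity), again reusing the parse_line() entry format from the earlier sketch.

```python
from datetime import timedelta

SESSION_TIMEOUT = timedelta(minutes=30)   # common but arbitrary cutoff

def sessionize(entries):
    """Group parsed log entries into per-visitor sessions (each session is a list of URLs)."""
    sessions = []
    last_seen = {}   # host -> (time of last hit, index of that visitor's open session)
    for e in sorted(entries, key=lambda e: e["time"]):
        host = e["host"]
        prev = last_seen.get(host)
        if prev is None or e["time"] - prev[0] > SESSION_TIMEOUT:
            sessions.append([])                       # start a new session for this visitor
            last_seen[host] = (e["time"], len(sessions) - 1)
        else:
            last_seen[host] = (e["time"], prev[1])    # continue the open session
        sessions[last_seen[host][1]].append(e["url"])
    return sessions
```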

15 How? Remote Usability Testing
WebVip and VisVip (NIST)
- Server-side logging
- Javascript instrumentation
- Individual paths within context of site
- Animation/replay of sessions
Questions:
- What part of the site was used for a task? Not used?
- How long to finish the task? Per page?
- What sorts of behavior for the task?

16 How? Remote Usability Testing
ClickViz (Blue Martini)
- Server-side logging
- Custom instrumentation
- Aggregate paths based on the file system
- Include demographics, purchase history
- Filtering
Questions:
- How does a visitor of type X compare to type Y?
- Success vs. “failure”

17 How? Remote Usability Testing
NetRaker Clickstream, Vividence ClickStreams
- Not restricted to servers
- Testing suites
- Interesting aggregation methods

18 How? Remote Usability Testing
WebQuilt (GUIR) Logging
Design Goals:
- Extensible, scalable
- Allow for unobtrusive, "naturalistic" user interaction
- Multi-platform, multi-device compatibility
- Fast and easy to deploy on any website
Solution:
- Proxy-based logger rewrites links (see the sketch below)
- Nearly invisible to user
- Independent of client browser
- Infer actions (e.g. back button clicks)
- Stand alone or use with other tools
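
The slide describes the link-rewriting approach only at a high level, so the following is my own minimal sketch of the general idea, not WebQuilt's actual implementation. Every href is rewritten to point back at a logging proxy, so the next click can be recorded before the proxy fetches the real target; the proxy URL and parameter names are made up for illustration.

```python
import re
from urllib.parse import quote

PROXY_BASE = "http://proxy.example.com/log"   # hypothetical logging-proxy endpoint

def rewrite_links(html: str, session_id: str) -> str:
    """Rewrite href attributes so every click routes back through the logging proxy."""
    def repl(match: re.Match) -> str:
        target = match.group(2)
        proxied = f"{PROXY_BASE}?session={session_id}&url={quote(target, safe='')}"
        return f'{match.group(1)}"{proxied}"'
    # Naive pattern for illustration only; a real proxy would use an HTML parser.
    return re.sub(r'(href=)"([^"]+)"', repl, html)

page = '<a href="http://example.com/page2.html">Next</a>'
print(rewrite_links(page, "abc123"))
# <a href="http://proxy.example.com/log?session=abc123&url=http%3A%2F%2Fexample.com%2Fpage2.html">Next</a>
```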

19 How? Remote Usability Testing
WebQuilt (GUIR) Visual Analysis Tool:
- Put data within the context of the design
- Show deviations from expected paths
- Interactive graph (see the aggregation sketch below)
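
One way such a graph can be backed by data, sketched here under my own assumptions rather than as WebQuilt's code, is to aggregate the logged sessions into transition counts between pages and then compare them against the designer's expected path. The example URLs are illustrative.

```python
from collections import Counter

def path_graph(sessions):
    """Count transitions between consecutive pages across all logged sessions."""
    edges = Counter()
    for path in sessions:                      # each session is an ordered list of URLs
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges

def deviations(edges, expected_path):
    """Edges that users actually took but that are not on the expected path."""
    expected = set(zip(expected_path, expected_path[1:]))
    return {edge: n for edge, n in edges.items() if edge not in expected}

sessions = [
    ["/home", "/models", "/sentra", "/sentra/safety"],
    ["/home", "/search", "/sentra", "/home"],          # deviation: detour via search, then back home
]
edges = path_graph(sessions)
print(deviations(edges, ["/home", "/models", "/sentra", "/sentra/safety"]))
```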

20 Study: Purpose
- Exploratory comparison of lab and remote usability testing with mobile devices
- What types of usability issues can we:
  - find with either method?
  - find with one that we can’t find with the other?
- Design implications
  - testing tools
  - testing strategies

21 Study: The Mobile Web
- Limited and/or new interaction methods
  - Small screens
  - Graffiti, keypads, thumb-pads
- Beyond the desktop
  - Driving, traveling, walking
  - Noisy, public
Gathering good usability data is vital to making these interfaces, and subsequently these devices, successful.

22 Study: Design
- 10 users asked to find:
  - Anti-lock brake information on the latest Nissan Sentra
  - The closest Nissan dealer
- http://pda.edmunds.com
- Handspring Visor Edge with OmniSky wireless modem
- 5 users in the lab, 5 users in the wild
- Web-based questionnaires

23 Study: Identifying Usability Issues
Lab Data
- Tester observations
- Participant comments
- Questionnaire
Remote Data
- Clickstream analysis
- Questionnaire
Severity Levels
- 0 indicates a comment
- 1 → 5 (minor → critical)
Four Categories
- Device
- Browser
- Site Design
- Test Design

24 Study: Caveats
- Analysis and observation for both tests done by the same person
- Issues identified from the remote tests first
  - Avoids biasing the remote analysis tools
- Looking for potential problem areas

25 Study: Results
Totals:
- 18 unique issues
- 7 found remotely

               Lab   Remote
Device          4      1
Browser         2      0
Test Design     6      2
Site Design     9      5

Site Design
- 5 of the 9 issues
- 3 of the 4 with severity level > 3
- 1/3 device or browser related
Test Design
- 2 of the 6 issues
- 2 of the 4 with severity level > 3

26 Study: Process Observations
Remote usability testing can capture some usability issues that lab testing already discovers.
Lab testing gets me:
- Qualitative observations
- Thinking-aloud comments
- Non-content usability issues

27 Study: Process Observations
What can remote testing get us that labs can’t?
- Avoids the lab effect
  - Quitting a task is easier when not in the lab
  - Network problems are more realistic
- With more users
  - Patterns emerge
  - Can reduce uncertainty
- Faster

28 Study: Conclusions
Remote usability testing is a promising technique for capturing realistic usage data for mobile web site design.
Main concerns:
- Gathering user feedback on mobile devices is even more difficult because of limited input
- Understanding users can be ambiguous
  - Potentially alleviated by the ability to test a larger number of users

29 Discussion
- Comments
- Questions
- Where does web log analysis fit into a design cycle?
  - Understanding what methods to use when and where
- Experiences?
  - These or other tools?
[Diagram: Design / Prototype / Evaluate cycle]

