WIRED Week 7 Quick review of Information Seeking Readings Review - Questions & Comment - How does this affect IR system use? - How would this change evaluating.


1 WIRED Week 7 Quick review of Information Seeking Readings Review - Questions & Comment - How does this affect IR system use? - How would this change evaluating IR systems? Topic Discussions Web search lab game!

2 What Is Information Seeking? “a process in which humans purposefully engage in order to change their state of knowledge.” p. 5 “a process driven by human’s need for information so that they can interact with the environment.” p. 28 “begins with recognition and acceptance of the problem and continues until the problem is resolved or abandoned” p. 49 Marchionini more than just representation, storage and systematic retrieval

3 Information Seeking in Context Learning Information Seeking Information Retrieval Analytical Strategy Browsing Strategy

4 How do we search? Analytical careful planning recall of query terms iterative query reformulations examination of results batched Browsing heuristic opportunistic recognizing relevant information interactive (as can be)

5 Iseek - WebTracker study Corporate IT and knowledge workers - In work environment - Own browser and network connection Long-term study (weeks) Overall Web use analyzed Bookmarks, printed pages How sites/pages found Frequency of page visits

6 Web Study Methodology Surveys Interviews Web Use Data* - History Files - WebTracker - Server Logs Bookmarks* Printouts

7 Study Elements - Research Design Field Work Field Workers - Data Collection 1. Questionnaire survey 2. WebTracker application (and Proxy Server) 3. Personal interviews

8 Collecting Web Client Data Modified client - Pitkow and Catledge 1995 Bookmarks Chosen Web sites are personal information space Most valuable data file on user’s system Automatically organizing bookmarks History logs The history mechanism Most promising source for usage data

9 WebTracker Expanded Window

10 WebTracker Log

11 Data Analysis Log files tabulated into spreadsheets Examined for clusters or patterns of behavior Selection of episodes of Information Seeking behavior - a highlighting of the episode by the participant during the personal interview; - evidence of the episode having consumed a relatively substantial amount of time and effort; - evidence that the episode was a recurrent activity. Determined the modes of scanning & moves exercised by the participants

12 Behavioral Model Recurring Web behavioral patterns that relate people’s browser actions (Web moves) to their browsing/searching context (Web modes) Modes of scanning: Aguilar (1967) & Weick & Daft (1983, 1984) Moves in information seeking behavior: Ellis (1989) & Ellis et al. (1993, 1997)

13 Modes of Scanning

14 Modes of Scanning for Information

15 ISeek Behaviors & Web Moves

16 Modes & Moves Model

17 Behavioral Model Verification 61 identifiable episodes

18 Behavioral Model Results People who use the Web engage in 4 complementary modes of information seeking Certain browser based actions & events indicate a particular mode of information seeking Surprises - No Explicit Instances of Monitoring to Support Formal Searching - Very Few Instances of “Push” Monitoring - Extracting Involved Basic Search Strategies Only

19 Interview Highlights Most useful work-related sites: 1. Resource sites by associations & user groups 2. News sites 3. Company sites 4. Search engines Most people do not avidly search for new Web sites Bookmarking criteria are largely based on a site providing relevant & up-to-date information Learning about new Web sites: 1. Search engines 2. Magazines & newsletters 3. Other people/colleagues

20 Survey Highlights The Web was the 3rd most frequently used source Participants spent about 20% of their work hours using the Web Majority looked for technical information on the Web Quality of Web information was perceived to be “very high” (reliable) Web was perceived as accessible as other “internal” sources however less accessible than mass media sources Few participants deliberately set out to search for new sites

21 Study 1 Summary Behavioral model of information seeking on the Web People who use the Web engage in complementary modes of information seeking Certain browser based actions & events indicate particular moves in information seeking The study suggests: - that a behavioral framework that relates user motivations and Web moves may be helpful in analyzing Web-based Information Seeking - that multiple, complementary methods of collecting qualitative and quantitative data may help compose a richer portrayal of how individuals use Web-based information in their natural work settings

22 Study Recommendations

23 Iseek Expanded Study (2) Larger Dataset One Organization Longer Duration Open-ended Interviews IT Survey More Quantitative Modeling - Glassman (1994); - Catledge & Pitkow (1995); - Tauscher & Greenberg (1997a, 1997b); - Huberman, Pirolli, Pitkow, & Lukose (1998)

24 New Types Data Collection Sources - Modified Logs - Interviews (More Focused) - Survey (Broader Focus) - Field Observation (Cube Work) Volume - Over 1400 Consistent Users - Over a Month of Web Use - 8+ GB of data

25 Collecting Web Server Data - Web Server Log Accuracy Hit - a single file is requested from the Web server View - all of the information contained on a single Web page Visit - one series of views at a particular Web site. - Proxy Server Logs Day sampling - stop caching and analyzing data. IP sampling - cancel caching of particular Web users and measuring these results only Continuous sampling - use cookie files to track a particular user(s) - KDD
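The hit / view / visit distinction on the slide above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout `(client_id, timestamp_seconds, path)`, the list of asset extensions, and the 30-minute idle timeout that ends a visit are all hypothetical choices, not details taken from the study.

```python
# Assumption: requests for these file types are embedded assets, not pages.
ASSET_EXTENSIONS = (".gif", ".jpg", ".png", ".css", ".js")
# Assumption: 30 idle minutes end a visit (a common convention, not from the slides).
VISIT_TIMEOUT_SECONDS = 30 * 60


def summarize_log(records):
    """Count hits, views, and visits in a single site's server log.

    records: iterable of (client_id, timestamp_seconds, path) tuples.
    """
    # Order by client, then time, so each client's requests are contiguous.
    records = sorted(records, key=lambda r: (r[0], r[1]))
    hits = len(records)          # every requested file is a hit
    views = 0                    # page requests only
    visits = 0                   # runs of views by one client
    last_view_time = {}
    for client, ts, path in records:
        if path.lower().endswith(ASSET_EXTENSIONS):
            continue             # an asset counts as a hit, not a view
        views += 1
        previous = last_view_time.get(client)
        if previous is None or ts - previous > VISIT_TIMEOUT_SECONDS:
            visits += 1          # first view, or first view after a long gap
        last_view_time[client] = ts
    return {"hits": hits, "views": views, "visits": visits}
```

The gap between hits and views is one reason raw server-log counts overstate page use; proxy caching, which this sketch ignores, pushes the error in the other direction.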

26 Survey Highlights Users not motivated to change/update browser versions or startup page IT made no modifications of browser until recently, primarily for system access testing Most of the most frequent users come from technical departments All IT system work now Web-specific

27 Interview Highlights Corporate adoption of Internet access driven by Intranet development Local portrayals of successful Web work drove rapid adoption Use of Intranet viewed as both resource conservation and expanded work Logging of Web use data not a high concern Open to recommendations to improve Web use “Webify”ing Everything seen as good

28 KDD Highlights Extremely High Data Collection Reliability Tightly-focused Web Use (business sites) Very Small (Determinable) Inappropriate Use ( >.001%) Lower than Expected Search Engine Use - Influenced by Startup Page - Internal Search Results Pages Used Higher than Expected (Average) Use of Intranet

29 KDD Use Highlights 40,000+ episodes 11:15 average episode length Search term mode of 1 - Not dominantly work-related terms - Use of intranet search results influential
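The slides do not say how episodes were delimited in the log data. One common approach, shown here purely as an assumption, is to start a new episode whenever the gap between consecutive requests exceeds an idle threshold, then average the episode durations. The function names and the 10-minute threshold are hypothetical.

```python
def split_episodes(timestamps, max_gap_seconds=600):
    """Group request timestamps (in seconds) into episodes separated by idle gaps."""
    episodes = []
    for ts in sorted(timestamps):
        # Extend the current episode if this request follows closely enough.
        if episodes and ts - episodes[-1][-1] <= max_gap_seconds:
            episodes[-1].append(ts)
        else:
            episodes.append([ts])
    return episodes


def average_length(episodes):
    """Mean episode duration (last request minus first), formatted as M:SS."""
    if not episodes:
        return "0:00"
    total = sum(ep[-1] - ep[0] for ep in episodes)
    mean = round(total / len(episodes))
    return f"{mean // 60}:{mean % 60:02d}"
```

With this definition a single-request episode has a duration of zero, so the reported average depends heavily on how such episodes are treated.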

30 Updated Behavioral Model 32,512 identifiable episodes

31 Behaviors Breakdown

32 Other Studies Tend to focus on server logs, a broad range of Web users, general Web seeking activity, quantitative methods - Glassman (1994): Proxy Study - Catledge & Pitkow (1995): Surveys and Client tool; - Tauscher & Greenberg (1997a, 1997b): The Back button; - Ingwersen (1995 & 1997): Informetrics - Huberman, Pirolli, Pitkow, & Lukose (1998): Information Foraging, “Law of Surfing” - Huberman “Laws of the Web” (2001)

33 Study 2 Summary Behavioral Model Scales Up Server Logs Provide Significant Gains in Quantity Server Logs Provide Challenges in Deriving Quality Organizations Provide Focused View of Overall Web Use Knowledge Workers Collaborate (But Not Enough)

34 Summary (New) Methodology Provide new ideas for data collection & cleaning tools Verify models of Information Seeking and Web Use Discover models of Web usage Find different types of Web users Gain rich descriptions of perception of Web & Web use Evoke new system & interface designs

35 Other Tools for Web Studies Pete Pirolli, Rob Reeder, Ed Chi, et al. (UIR Group Xerox PARC) Web Logger Eytan Adar, Bernardo Huberman (Web Ecology Group @ PARC, now HP) Andy Edmonds – Uzilla.net Vividence Web Evaluation Tool (WET) Eye Tracking (*)

36 Improving Web Use Expert Systems - SNLP Multimedia Databases & Metadata Display Technology Better GUIs Better, More Available Search Engines/query Syntax - Desktop Search - Ranking - Relevance Help expert users get more expert

37 Web Activities Taxonomies What types of activities on the Web have impact? What we do vs. what seems significant Purpose of people’s search - Find Get a fact or document Download information Find out about a product - Compare/Choose: 51% Methods used to find information - Explore, Monitor, Find, Collect: 71% Content for which they are searching - Medical: 18%, People: 13%, …

38 Berrypicking & IR Flexibility IR systems are rational, users aren’t (always) We don’t search in a linear model - Single query, one good result We gradually build on what we know, how we find it - Footnote chasing (backward chaining) - Citation searching (forward chaining) - Journal run (favorite sites) - Area scanning (browsing) - Subject searches in bibliographies, abstracts & indices - Author searching We combine all of these when searching Interface support for each & combinations

39 Berrypicking Paths

40 Web Search Studies Framework Web IR is still relatively new - Differences in users & information - Changes in IR systems are rapid Who doesn’t search now? “A Web searching study focuses on isolating searching characteristics of searchers using a Web IR system via analysis of data, typically gathered from transaction logs.” p 3 Studying Search Engine use - AltaVista, Excite Web Searching Studies - Single & Multiple Web sites

41 Characterizing Browsing Modified XMosaic to learn Web browser behavior Path lengths key (but changed) Types of users: - Serendipitous browsers – little repetition, short sequences - General purpose browsers – average, repeated actions - Searchers – long navigational sequences

42 Cognitive Strategies in Web Search Systems help with: re-representation - different external representations that have the same abstract structure make problem-solving easier or more difficult. It also refers to how different strategies and representations vary in their efficiency for solving a problem. graphical constraining - constrain the kinds of inferences that can be made about the underlying represented concept. temporal and spatial constraining - different representations make relevant aspects of processes and events more salient when distributed over time and space.

43 Cognitive Strategies Searching Conditions - Dispersed or Category Structures Fact finding Exploratory searching Novice & Experienced users Top-down, bottom-up & mixed

44 Reading Time, Scrolling & Interaction Can implicit feedback improve relevancy? - 561 documents, 6 subjects - Read documents & score them Better than reading, saving & printing? - Measure use now vs. later - Focused on document, not activity How do you know the user is reading? Is saving a relevance measure? No differences noted in scrolling (4.28) What about following links? Finding, highlighting, copying?

45 How do we really use the Web? People don’t read, they scan Web pages We move quickly, we know we can go back Quick experimentation & short memory Behaviors that work are reinforced & continued Satisficing makes measures of quality difficult Web pages as Billboards? What’s billboard information for IR systems?

46 Revisitation Patterns on WWW Mostly Re-Visits (58%) Continually Visit New Pages Access Only A Few Pages Frequently Clusters (Sets) & Short Paths of URLs - Frequency - Recency - “Distance” Types of Navigation - Hub and Spoke - Depth Searching (lots of links before returning, if at all) - Guided Tour (Tasks)
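A revisit share such as the 58% above can be computed from a browsing history as the fraction of page visits that return to a URL already seen. The sketch below assumes the history is a plain list of URLs in visit order; the function names are hypothetical, and the slide does not spell out the exact formula the study used.

```python
from collections import Counter


def recurrence_rate(urls):
    """Fraction of visits in the sequence that go to a previously seen URL."""
    seen = set()
    revisits = 0
    for url in urls:
        if url in seen:
            revisits += 1
        else:
            seen.add(url)
    return revisits / len(urls) if urls else 0.0


def most_frequent(urls, n=3):
    """The n most visited URLs with their visit counts."""
    return Counter(urls).most_common(n)
```

For example, the history `["a", "b", "a", "c", "a", "b"]` has three revisits out of six visits, a recurrence rate of 0.5, with `a` the most frequently accessed page.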

47 Revisitation Patterns 2 Back Button Use Affects Everything (Even More Since Study) Navigation Methods Differ Reasons for Revisiting - Explore Further - Use Feature (Search or Home Page) - “On the Way” to another Page (IA Problem) Users Don’t Understand Browser History Very Well or Do They Misunderstand Page/Site Navigation? Provide Navigation Support Work with the Back Button – Don’t Break its Functionality

48 Web search lab game Break into groups Answer a set of questions Different rules for each search 1. Search as you would 2. Talk & decide before each move 3. No typing this time! 4. Search as you would again 5. Fast as possible

