
1 Jaime Teevan (@jteevan) Microsoft Research ECIR 2017
One of the things that's fun about the KSJ Award is the opportunity to reflect on the research I've been doing into web search, and in particular personalized web search, for almost 20 years. I thought it would be fun for us today to take a trip down memory lane and remember a little about what the internet and web search used to look like and how they evolved. Doing so will help us understand some of the complexities of web search that make it different from other types of search, and understand its unique challenges. It took us a surprisingly long time to figure out how to turn some of those challenges around and use them to our advantage, but now, after 20 years, we're starting to figure it out.

The thing I want to highlight over the course of this talk is how important it is to remember where we're coming from: both in terms of how the web and web search have evolved and what that means for how we as researchers understand those areas, but also in terms of a person's own particular interactions with information, because we can use a person's experience to benefit them as long as, in the process of personalizing, we don't break the patterns that we observe.

But before we go too far into the past, let's start by thinking about the web right now. Everyone freeze. Now be honest: how many of you are looking at the internet this second? I'll give you the benefit of the doubt and assume you are tweeting about the conference or looking up the program. If you're not, get off the web and start paying attention. :) Now leave your hands up. (Extend to 5 minutes, an hour, a day.) The rest of you without your hands up, what have you been doing? Your phone must be broken, or you can't get online. Study: most people notice their phone is missing before they notice their kid is missing. Because this talk is focusing on web search, another show of hands if you have run a search today.

Search, Re-Search. Jaime Teevan, Microsoft Research, ECIR 2017.

2 Remember the first time you encountered the web?
If you're under 30, you probably don't. I remember being in the dungeon of a computer lab where I did my undergraduate work, and some guy showing me a webpage through Lynx for the first time. It was entirely text-based. You had to tab around to follow links. It was a pain in the butt and seemed really dumb. Shows what I know: I thought the same thing when I saw Twitter for the first time. Pretty much, if I think something is dumb, it is guaranteed to be a success. Fortunately, I wasn't so dumb about recognizing that guy was something special, and I ended up marrying him. Still, we were so little and cute then. The internet was little and cute then, too!

3 The World Wide Web in 1994
Content: 2,700 websites (roughly 14% .com). Under 16 million users (0.4% of the world, less than twice the current population of London).
Tools: Mosaic only 1 year old; pre-Netscape, IE, and Chrome; 4 years pre-Google.
Search engines: 54,000 pages indexed (Lycos); 15,000 queries/day (WebCrawler).

Yahoo: 1994. Netscape: 1994. Mosaic was 1 year old, and was licensed to Microsoft in 1995 as the precursor to IE. No Chrome, and four years pre-Google! By way of comparison, Google processes 40k queries/second in 2012 and 63k queries/second in 2017 (2 trillion queries/year in 2016, 1 billion/year in 1999).

And, actually, that's what really changed my perception of the web: the advent of web search engines. The ability to find and access information entirely changed the value of the content. I went to work at Infoseek in the late 1990s, in Sunnyvale during the big dot-com bubble. Search engines were very different in 1994, too. The Lycos index contained 54k webpages! Behavioral logs of the most popular search engine showed 15k queries a day. It was the era of pre-link analysis, Boolean search, charging for search, then ads sold by impression, blinking ads (and "under construction" webpages). Yahoo was the main way for people to find things, and most of Yahoo was entirely static: you entered a query and got a hand-curated webpage for it. Some of my early research used link structure to try to automatically generate those hand-curated webpages.

4 The World Wide Web Today
Of course, that has all changed today. The web doubles in size every year. Trillions of webpages are indexed. Billions of web searches and clicks happen per day. Search is a core part of the fabric of everyday life; we've already seen that you have run several searches today. The World Wide Web today: trillions of pages indexed, billions of queries per day.

5 1996 We assume information is static. But web content changes!
The web is changing not just by growing and adding new content, but also by changing the existing content. 50% of existing content on the web changes every year. For example, this is Microsoft Research's webpage from 1996 (I can't get all the way back to 1994); it's crazy to think it once looked like this.

Change on the web isn't always this obvious, though. It is when we visit Facebook, Twitter, or our favorite news site; we go to those sites looking for change. But most of the time when we encounter change, it isn't the focus of what we are doing. Example: academic webpages change. As academics, we rarely bother to change the formatting or appearance of our web content. I've been using the same general format on my webpage since graduate school. But we do keep our content up to date. Example: Susan Dumais adds "machine learning" and changes "text retrieval" to "information retrieval."

The fact that web content changes is really different from the way we are used to interacting with information. When you read a book last night and put it down, you didn't expect it to be different from what you read. Because of this, there are all sorts of interesting new opportunities and challenges with our web-based interactions. (I'm going to skip some of the ways that search engines or web browsers can take advantage of content change, but it is an area where we have done a lot of research, and I'd be happy to tell you more if you're interested.)

1996: We assume information is static. But web content changes!

6 Search Results Change
New, relevant content. Improved ranking. Personalization. General instability. Can change during a query!

One type of web content that people often aren't aware of changing is the search engine result page: the results for a particular query also change regularly. Search result ordering changes a lot. A 2000 study by Selberg and Etzioni found that, over the course of a month, most results change by over 50%. Some of this change is unintentional: you may be part of one experiment or another, a server may go down, and so on. But some is intentional, intended to help you better find what you are looking for. Just imagine how frustrating it would be if a query returned the same results forever: new content is found, there is new ability to tell what is relevant, and content may be personalized or localized.

[Selberg & Etzioni. On the instability of web search engines. RIAO 2000.]

7 Search Results Change Search result change can even happen within a single query Bing example: When you first search, speed is very important. Even 100 ms delay can cause problems: We click less, we abandon more, we think the results are worse. As a result, search engines make all sorts of compromises to get you results as fast as they possibly can. For example, Bing now gives 8 results instead of 10 by default when you first issue a query. However, if you click on a result, and then go back to the search result page, you will get 12 results. We know this is an involved query, worth taking the time and devoting the extra resources.

8 Biggest Change on the Web
But despite all of these interesting changes on the web, the BIGGEST change is not related to the size or scope of web content. It is the fact that we can observe this change, and observe how people interact with it, because it takes place in the cloud. You can see traces of behavioral data in many of the ways you interact with the web. For example, when you type a query like [ecir], you get query suggestions based on all of the queries other people have entered, and personalized suggestions based on things that you have searched for in the past.

Biggest change on the web: behavioral data.

9 Behavioral Data Many Years Ago
"It is impossible to separate a cube into two cubes, or a fourth power into two fourth powers, or in general, any power higher than the second, into two like powers. I have discovered a truly marvellous proof of this, which this margin is too narrow to contain."

Marginalia adds value to books. Students prefer annotated texts. Do we lose marginalia when we move to digital documents? No! Scale makes it possible to look at experiences in the aggregate, and to tailor and personalize.

Of course, there is nothing inherently new about looking at usage data to understand content. Researchers have been interested in how people use information for many years. Rare and old books are often valuable not just for their content, but for their context, because you can tell a lot about a book and the people who have read it by looking at it. Poll: Do you dog-ear your books? Underline? Highlight? Write notes in the margins? Even without deliberate markings, books show signs of use. My favorite books have broken bindings and fall open to my favorite pages. We can study these physical signs of how physical objects are used. For example, the University of Texas has a collection of books from the late David Foster Wallace (the author) with a lot of marginalia; they are considered a real treasure trove for researchers. There is a note from Mark Twain about Huckleberry Finn in the margin of a copy of The Pen and the Book by Walter Besant. Probably the most famous marginalia (from 1637) is Fermat's Last Theorem, quoted above.

Students prefer used textbooks that have annotations; they are taking advantage of knowledge of how others used the information to use the information better themselves. [Marshall, Catherine C. (1998). The Future of Annotation in a Digital (Paper) World. Presented at the 35th Annual GSLIS Clinic: Successes and Failures of Digital Libraries, University of Illinois at Urbana-Champaign, March 24, 1998.]

There was concern we'd lose marginalia with digital documents, and effort to recreate it (example: IBM's dog-earing of PDF documents). But actually we gain so much: links between things, and interactions with content at a huge scale. With marginalia, historically we were interested in the individual. Scale makes it possible to look at experiences in the aggregate, to tailor, and to personalize. Mark Twain's comments are interesting because he is Mark Twain. If it were a 6th grader, not so interesting. But if it is a million 6th graders, that's interesting. Or if it is the way YOU read the book, also interesting.

10 Past Surprises About Web Search
Early log analysis: Excite logs from 1997 and 1999 (Silverstein et al. 1999; Jansen et al. 2000; Broder 2002). Queries are not 7 or 8 words long. Advanced operators not used, or "misused." Nobody used relevance feedback. Lots of people search for sex. Navigational behavior common. Prior experience was with library search.

Since 1994, we've developed a pretty good picture of what web search looks like. But when we first started looking at web search, we didn't really know much about what to expect. It's interesting to go back and read early web search papers and see what surprised people that we now take for granted, if for nothing else then for the scale: early log analysis was done on just thousands of queries, and a term was considered common if it occurred more than 100 times! Actually, this gives hope for what you can learn with limited ability to collect logs. Queries were short, people searched for sex, people didn't use advanced operators, and there was a clear split between head and tail queries. This early work also described navigational queries. Most popular queries (Excite 1999, Table 3): sex; yahoo; chat; horoscope; pokemon; hotmail; games; mp3; weather; ebay.

[Broder. A taxonomy of web search. SIGIR Forum, 2002.]
[Jansen, Spink & Saracevic. Real life, real users, and real needs: A study and analysis of user queries on the web. IP&M, 2000.]
[Silverstein, Marais, Henzinger & Moricz. Analysis of a very large web search engine query log. SIGIR Forum, 1999.]

11 Search Is a Complex, Multi-Step Process
The typical query involves more than one click: 59% of people return to the search page after their first click; clicked results are often not the endpoint; people orienteer from results using context as a guide; not all information needs can be expressed with current tools; recognition is easier than recall.
The typical search session involves more than one query: 40% of sessions contain multiple queries, and half of all search time is spent in sessions of 30+ minutes.
Search tasks often involve more than one session: 25% of queries are from multi-session tasks.

But now, after years of studying large-scale query logs, we have a pretty clear sense of what web search looks like: it is a complex, multi-step process. Although the model for web search is really straightforward ("enter a query, click a result"), people use that simple model as part of a much more complex process. The typical query involves more than one click. The typical search session involves more than one query. And the typical task involves more than one session. Long sessions, even if there are fewer of them, account for most of the time that we spend searching. It is ironic that even though people spend a lot of time searching, a 100 ms delay makes us abandon more. Side challenge: try to figure out how to create more time for search engines to do interesting stuff! (Slow Search.) So all of this multi-click, multi-query, multi-session behavior is complex: it is exploratory, topics evolve, and we learn as we search.

[Donato, Bonchi, Chi & Maarek. Do you want to take notes? Identifying research missions in Yahoo! Search Pad. WWW 2010.]
[Dumais. Task-based search: A search engine perspective. NSF Task-Based Information Search Systems Workshop, 2013.]
[Lee, Teevan & de la Chica. Characterizing multi-click search behavior and the risks and opportunities of changing results during use. SIGIR 2014.]
[Teevan, Alvarado, Ackerman & Karger. The perfect search engine is not enough: A study of orienteering behavior in directed search. CHI 2004.]
[Teevan, Collins-Thompson, White & Dumais. Viewpoint: Slow search. CACM, 2014.]
[Teevan, Collins-Thompson, White, Dumais & Kim. Slow search: Information retrieval without time constraints. HCIR 2013.]

12 People Use the Same Queries Differently
All of this means understanding what a person wants is hard. And what makes it even harder is that we are all different. When we compare in situ feedback with assessed judge ratings, we find considerable mismatch. [Kim, Teevan & Craswell. Explicit in situ user feedback for web search results. SIGIR 2016.]

13 People Use the Same Queries Differently
Describe the potential for personalization gap: if we have a perfect search engine and can give everyone what they want, we can do perfectly (the blue line). But if we have to account for different people meaning different things with the same query, we can't do as well, even with a perfect search engine. Somebody has to lose out, and the green line represents what we observed based on click patterns. We call the gap between the two curves the potential for personalization. Ratings are expensive, so we have tried to use behavioral data to characterize individual variation in search. (A small sketch of how such a gap can be estimated from clicks follows below.)

[Teevan, Dumais & Horvitz. Potential for personalization. TOCHI, 2010.]
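The TOCHI paper defines the measure with explicit relevance judgments and nDCG; purely as a hedged illustration, here is a minimal Python sketch that approximates each user's relevant set with their clicked URLs and compares the best single group ranking against per-user ideal rankings. The function names, the binary-relevance DCG, and the example URLs are assumptions for illustration, not the paper's exact method.

import math
from collections import Counter

def dcg(ranking, relevant, k=10):
    # Binary-relevance DCG: each relevant URL in the top k contributes 1/log2(rank + 1).
    return sum(1.0 / math.log2(i + 2)
               for i, url in enumerate(ranking[:k]) if url in relevant)

def potential_for_personalization(user_relevant, k=10):
    # user_relevant: {user_id: set of URLs that user wants for one fixed query},
    # here approximated from clicks. Returns (individual ceiling, group ceiling, gap).
    users = user_relevant.values()
    ideal = sum(dcg(sorted(rel), rel, k) for rel in users) / len(user_relevant)
    counts = Counter(url for rel in users for url in rel)
    group_ranking = [url for url, _ in counts.most_common()]  # one ranking for everyone
    group = sum(dcg(group_ranking, rel, k) for rel in users) / len(user_relevant)
    return ideal, group, ideal - group

# Hypothetical clicks for one query from three users.
clicks = {"u1": {"a.com", "b.com"}, "u2": {"b.com"}, "u3": {"c.com"}}
print(potential_for_personalization(clicks))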

14 Which Query has Less Variation?
campbells soup recipes v. vegetable soup recipe; tiffany's v. tiffany; nytimes v. connecticut newspapers v. federal government jobs; singaporepools.com v. singapore pools.

Essentially, this is capturing how much variation there is in what people click. A related measure is click entropy (a small sketch of the computation follows below). Click entropy does a pretty good job of identifying queries where you would think personalization might be more or less useful. For example: which of these two queries do you think has less variation in what people click (i.e., lower click entropy)? The recipe and Tiffany queries match expectations. But there are some places where behavioral measures break down and don't show us what to expect. NYTimes: surprise! (Click entropy: 2.5 v. 1.0.) Click entropy doesn't always work. What is happening is that the target result is not ranked highly, and people have a huge positional bias in what they click (click position: 2.6 v. 1.6). In general, we see a correlation of .73 between entropy and click position (high!). Singapore Pools: surprise! (Click entropy: 2.0 v. 1.5.) What is happening here is that the results for singaporepools.com change a lot, and you can't click on the same thing if it isn't there! (Result entropy: 10.7 v. 5.7.) (N.B.: the query singaporepools.com is lottery related, while Singapore Pools is a company.)

[Teevan, Dumais & Liebling. To personalize or not to personalize: Modeling queries with variation in user intent. SIGIR 2008.]
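The talk does not spell out the formula, but click entropy is conventionally the Shannon entropy of the distribution of results clicked for a query: low entropy means most people click the same result, high entropy means clicks are spread out. A minimal sketch, with hypothetical example data:

import math
from collections import Counter

def click_entropy(clicked_urls):
    # Entropy (in bits) of the click distribution for one query.
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical click logs for two queries.
print(click_entropy(["nytimes.com"] * 90 + ["nyt.com"] * 10))            # ~0.47 bits (concentrated)
print(click_entropy(["a.com"] * 30 + ["b.com"] * 40 + ["c.com"] * 30))   # ~1.57 bits (spread out)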

15 Navigational Queries with Low Variation
Use everyone's clicks to identify queries with low click entropy. This identifies 12% of the query volume (the target is 25%) and only works for popular queries. Clicks are predicted only 72% of the time, double the accuracy for the average query, but what is going on the other 28% of the time? Many typical navigational queries are not identified. People visit interior pages (craigslist: 3% visit) and related pages (weather.com: 17% visit).

So there's a lot of variation for some queries, and we can't always use clicks to successfully identify queries where people have the same intent. But maybe we can at least use consistencies in people's behavior to identify navigational queries. Recall that navigational queries are ones that are intended to get to a particular page. Andrei Broder used a pop-up survey and found that about 25% of all queries are navigational in intent. People have tried to use the query string (e.g., contains ".org" or ".com"), which identifies about 10% of queries as navigational. We've also tried to do it using consistency in what people click (a small sketch follows below). But navigational targets are actually surprisingly hard to identify from the logs. We look for queries where people click on the same thing and go to the same place, i.e., low click entropy. This requires a lot of data; you can't tell whether one or two clicks are consistent or not. But even when we restrict to queries where click entropy is very low and we have a lot of data, we can only accurately predict what someone is going to click on 72% of the time. That's good, better than we can do normally, but not good enough to really do too much with, especially because the predicted URL tends to be the first URL anyway. Additionally, it misses a lot of instances that we think are navigational but that don't actually look navigational in the logs.

[Teevan, Liebling & Geetha. Understanding and predicting personal navigation. WSDM 2011.]
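A hedged sketch of the kind of pipeline described here: flag popular, low-entropy queries, predict their most-clicked URL, and then measure how often held-out clicks match. The thresholds, function names, and data layout are illustrative assumptions, not the values used in the WSDM 2011 study.

import math
from collections import Counter, defaultdict

def navigational_candidates(log, max_entropy=0.5, min_clicks=100):
    # log: iterable of (query, clicked_url) pairs aggregated over many users.
    # Returns {query: predicted_url} for popular queries whose clicks are concentrated.
    by_query = defaultdict(Counter)
    for query, url in log:
        by_query[query][url] += 1

    candidates = {}
    for query, urls in by_query.items():
        total = sum(urls.values())
        if total < min_clicks:
            continue  # too little data to trust the click distribution
        entropy = -sum((c / total) * math.log2(c / total) for c in urls.values())
        if entropy <= max_entropy:
            candidates[query] = urls.most_common(1)[0][0]
    return candidates

def prediction_accuracy(candidates, held_out_log):
    # Fraction of held-out clicks on candidate queries that hit the predicted URL.
    hits = total = 0
    for query, url in held_out_log:
        if query in candidates:
            total += 1
            hits += (url == candidates[query])
    return hits / total if total else 0.0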

16 Individuals Follow Patterns
We would really like to do a better job than this! Fortunately, while different people are very different, individuals are often very predictable. We follow patterns. Poll: What’s the first thing you do in the morning? How many of you brush your teeth first thing? Take a shower? Have a cup of coffee? Lots of variation! But – how many do the same first thing every day? We all do! Individuals Follow Patterns Getting ready in the morning. Getting to a webpage.

17 The ECIR query is a good example of this
How many people have searched for ECIR at least once before? How many of you have done it more than once? ECIR can actually mean a lot of things. But when any of us do the search, we know what it means! And search engines have no excuse for not knowing what I want, because I do the same query over and over again. Despite how complex search can be, a good chunk of it we can understand fairly clearly: It involves re-finding

18 Finding often involves refinding
Repeat query (33%): "information search and retrieval conference in europe". Repeat click (39%): query "ecir". Lots of repeats (43%).

Repeat query (33% of queries): 29% with a repeat click, 4% with a new click.
New query (67% of queries): 10% with a repeat click, 57% with a new click.
Overall: 39% repeat clicks, 61% new clicks.

We went to the query logs to look for individual patterns in search. We looked at repeat queries: 1 in 3 queries are repeats! We looked at repeat clicks: even more searches involve repeat clicks. In all, 43% of search behavior involves a repeat click or a repeat query. The magnitude of repeat behavior was a surprise to me when we found it, even though we were actively looking for it and expected it. (Although perhaps it is not that surprising given how much people re-use on the web: Cockburn et al. found 80% of web page visits were re-visits.)

Of course, while all of this behavior is some sort of refinding, each quadrant probably means something a little different. For example, you see people sometimes use the same query to click on different results, and actually a significant fraction of this repeat-click behavior also involves clicking on both new and same results. These are often instances where a person is doing an informational search, learning about a process. For example, when a person starts a session they often start with repeat clicks and then move on to new content, whereas when a person ends a session they are often likely to return to that query. I'll talk more about this later. But a lot of refinding is to return to a particular URL and seems navigational in nature. For example, in the 10% of queries where the query changes but the person clicks on a repeat result, the person is learning how to search for their target better: "free music" becomes Pandora; "jobs" becomes "monster jobs". Queries in chains converge, become longer and more common, and rank the target URL higher. (A small sketch of how this repeat-query/repeat-click breakdown can be computed from a log follows below.)

[Teevan, Adar, Jones & Potts. Information re-retrieval: Repeat queries in Yahoo's logs. SIGIR 2007.]
[Tyler & Teevan. Large scale query log analysis of re-finding. WSDM 2010.]
[Tyler, Teevan, Bailey, de la Chica & Dandekar. Large scale log analysis of individuals' domain preferences in web search. MSR-TR, 2015.]
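A simplified sketch of that breakdown in Python. It uses exact string matching per user with no query normalization or session segmentation, unlike the cited studies, so treat it as an approximation rather than the published methodology.

from collections import defaultdict

def refinding_breakdown(log):
    # log: list of (user, query, clicked_url) events in time order.
    # Classifies each event by whether the user has issued the query / clicked
    # the URL before, and returns the fraction of events in each quadrant.
    seen_queries = defaultdict(set)   # user -> queries issued so far
    seen_clicks = defaultdict(set)    # user -> URLs clicked so far
    counts = defaultdict(int)

    for user, query, url in log:
        q = "repeat query" if query in seen_queries[user] else "new query"
        c = "repeat click" if url in seen_clicks[user] else "new click"
        counts[(q, c)] += 1
        seen_queries[user].add(query)
        seen_clicks[user].add(url)

    total = sum(counts.values())
    return {quadrant: n / total for quadrant, n in counts.items()}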

19 Identifying Personal Navigation
Use an individual's clicks to identify repeat (query, click) pairs. These cover 15% of the query volume, and most occur fewer than 25 times in the logs. The queries are more ambiguous, rarely contain a URL fragment, and have the same click entropy as general web queries. They can have multiple meanings (enquirer), represent found navigation (bed bugs), or come from serendipitous encounters (etsy). The predicted result is correct 95% of the time.

We can use this refinding behavior to identify navigation queries that aren't necessarily navigational for everyone, but are for the individual. Use an individual's clicks to identify repeat query, single click behavior. This identifies 15% of query volume and is correct at guessing the result 95% of the time, much better than the 72% we saw when using general behavior. There is only 5% overlap with general navigation; there are lots of unique queries here, most occurring fewer than 25 times in total, and often the target result is ranked low. Examples: "enquirer" can mean the National Enquirer or the Cincinnati Enquirer; "bed bugs" is informational for most people but becomes found navigation for an individual; "etsy" can take one person to Etsy.com and another to Regretsy.com (a parody site). (A small per-user sketch follows below.)

[Teevan, Liebling & Geetha. Understanding and predicting personal navigation. WSDM 2011.]
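A minimal per-user sketch of this idea, under the simplifying assumption that "personal navigation" means a user has repeated a query and always clicked exactly one URL. The min_repeats threshold and the example in the comment are hypothetical, not the paper's definition.

from collections import defaultdict, Counter

def personal_navigation(user_log, min_repeats=2):
    # user_log: list of (user, query, clicked_url) events in time order.
    # Flags (user, query) pairs where the query was issued at least min_repeats
    # times and every click went to the same single URL.
    history = defaultdict(Counter)  # (user, query) -> Counter of clicked URLs
    for user, query, url in user_log:
        history[(user, query)][url] += 1

    targets = {}
    for (user, query), urls in history.items():
        if len(urls) == 1 and sum(urls.values()) >= min_repeats:
            targets[(user, query)] = next(iter(urls))
    return targets  # e.g. {("u7", "enquirer"): "cincinnati-enquirer-homepage"} (hypothetical)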

20 Supporting Personal Navigation
This is an opportunity for safe personalization that works, and there are a lot of ways it can be used.

[Svore, Teevan, Dumais & Kulkarni. Creating temporally dynamic web search snippets. SIGIR 2012.]
[Teevan, Cutrell, Fisher, Drucker, Ramos, Andrés & Hu. Visual snippets: Summarizing web pages for search and revisitation. CHI 2009.]
[Teevan, Liebling & Geetha. Understanding and predicting personal navigation. WSDM 2011.]

Beyond web page preference, domain preference is also highly individualized. Most domains people prefer are only preferred by a few other people. Interestingly, a preference for a domain's topic means you'll prefer other domains in that topic area more, not less. For example, for domains such as Fox News and MSNBC, where individuals are thought to prefer one and not the other, there is nonetheless a positive correlation in preference. There are also patterns in how people visit web pages that indicate what they are using the webpage for.

[Tyler, Teevan, Bailey, de la Chica & Dandekar. Large scale log analysis of individuals' domain preferences in web search. MSR-TR, 2015.]

Example (two snippets for the same page, en.wikipedia.org/wiki/tom_bosley):
Tom Bosley - Wikipedia, the free encyclopedia: Thomas Edward "Tom" Bosley (October 1, 1927 – October 19, 2010) was an American actor, best known for portraying Howard Cunningham on the long-running ABC sitcom Happy Days. Bosley was born in Chicago, the son of Dora and Benjamin Bosley.
Tom Bosley - Wikipedia, the free encyclopedia: Bosley died at 4:00 a.m. of heart failure on October 19, 2010, at a hospital near his home in Palm Springs, California. ... His agent, Sheryl Abrams, said Bosley had been battling lung cancer.

21 Browsing Often Involves Revisiting
50% to 80% of browsing is revisiting. Pages are revisited at different rates. Page elements change at different rates. Resonance can serve as a filter for important content.

Even more than in search, people revisit pages a lot when browsing! Different people revisit different pages at different rates: they may revisit quickly, slowly, or somewhere in between. Page elements also change at different rates. Some aspects of a page change very quickly, such as the blog section; other aspects change slowly, such as the navigational sections; others, like the middle content, change somewhere in between. By matching the revisitation rate with the rate of change, we can tell which parts are most important. Resonance enables us to filter for interest on the page. (A small sketch of this matching follows below.)

[Adar, Teevan & Dumais. Resonance on the web: Web dynamics and revisitation patterns. CHI 2009.]
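The CHI 2009 paper models revisitation and change behavior in much more detail; purely as an illustration of the matching idea, here is a toy sketch that keeps page elements whose typical change interval roughly matches how often the page is revisited. The element names, intervals, and tolerance are made-up assumptions.

def resonant_elements(element_change_intervals, revisit_interval, tolerance=2.0):
    # element_change_intervals: {element_id: typical seconds between changes}
    # revisit_interval: typical seconds between a user's revisits to the page.
    # Keeps elements whose change rate is within a factor of `tolerance` of the
    # revisitation rate, i.e. the content that "resonates" with how the page is used.
    resonant = []
    for element, change_interval in element_change_intervals.items():
        ratio = change_interval / revisit_interval
        if 1 / tolerance <= ratio <= tolerance:
            resonant.append(element)
    return resonant

# Hypothetical: a news homepage revisited roughly hourly.
elements = {"masthead_nav": 7 * 24 * 3600,   # changes weekly
            "headlines": 45 * 60,            # changes every ~45 minutes
            "comments_ticker": 30}           # changes every ~30 seconds
print(resonant_elements(elements, revisit_interval=3600))  # -> ['headlines']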

22 New York Times example:
People tend to revisit the New York Times homepage very quickly. They read articles in a hub-and-spoke approach, and check in for new articles during the day. When you look at the parts of the page that change at the same rate as people revisit, you see that it is the news articles – the important content on the site.

23 Woot.com example: The site posts a bargain deal every night at midnight and offers the deal until the item sells out. People who revisit Woot.com tend to check in when a new deal is posted so they can decide if they want it. When you filter the content by what changes at the same rate, you see it is the product information. You drop the navigational structure, and also the fast-changing blog- and comment-style content.

24 Costco example: People revisit the Costco homepage after long periods of time, when they next need to shop. The homepage still has plenty of content (all of these deals, for example). But what shows up when you look for resonance between the content change and revisitation patterns is the navigational structure, so shoppers can get to the appropriate sub-content to get their shopping done. (Side note: it wasn't until I did this work that I saw that Costco has a top-level section titled "Funeral.") We can use the ability to identify important page segments in many ways, such as creating mobile or smaller interfaces, building better retrieval algorithms based on important content, and notifying people of changes to things they find interesting.

25 Patterns A Double Edged Sword
However, while the fact that people follow predictable patterns is awesome for allowing us to guess what they'll do next, it is actually a double-edged sword: if we try to change anything to actually help them move through their pattern better, we risk breaking that pattern. Example: you go to brush your teeth, but a different brush is where your toothbrush should be. Another example of how hard it is to get out of a pattern: you drive somewhere and accidentally end up at work, because that's where you always go.

Patterns are a double-edged sword: patterns are predictable, but changing a pattern is confusing.

26 Change Interrupts Patterns
Example: dynamic menus that put commonly used items at the top actually slow menu item access. Does search result change interfere with refinding?

The fact that change interrupts patterns is well known in the HCI community (the dynamic menus example). The question is: do we face the same risk when we try to help people search better by personalizing refinding queries?

[Mitchell & Shneiderman. Dynamic versus static menus: An exploratory comparison. SIGCHI Bulletin, 1989.]
[Somberg. A comparison of rule-based and positionally constant arrangements of computer menu items. CHI 1986.]

27 Change Interrupts Refinding
When search result ordering changes, people are less likely to click on a repeat result, slower to click on a repeat result when they do, and more likely to abandon their search. This happens within a query and across sessions, and it even happens when the repeat result moves up! How do we reconcile the benefits of change with the interruption?

The answer to the previous slide's question is: yes! Across sessions: 26% of the time, repeat searches see results at a different rank, and re-clicking takes 94 seconds v. 192 seconds (over twice as long!). Within a query: fewer repeat clicks within a session (53% v. 88%), and slower time to click when returning to a search result list (26 seconds v. 6 seconds). But, of course, the reason we want to introduce change is to help people. We saw that we could identify personally relevant content. And we've seen that when search results change, even though it slows things down, when people actually DO click on a new result they tend to be very satisfied with it; the change helps them find what they are looking for. How can we reconcile the benefits of introducing change with the fact that it gets in the way of our expectations? It seems like we need magic! Fortunately, we can do magic.

[Lee, Teevan & de la Chica. Characterizing multi-click search behavior and the risks and opportunities of changing results during use. SIGIR 2014.]
[Teevan, Adar, Jones & Potts. Information re-retrieval: Repeat queries in Yahoo's logs. SIGIR 2007.]

28 Use Magic to Minimize Interruption
Pick a card, any card.

29 Abracadabra Magic happens.

30 Your Card Is Gone!

31 Consistency Only Matters Sometimes
What's going on here? People forget a lot! We can take advantage of the fact that people don't attend to very much to introduce changes right before their very eyes. We only need to keep constant the content that a person is focused on.

32 Bias Personalization by Experience
This can be used in a search result list to provide new content while maintaining a consistent feel. The trick is to bias personalization based on the user's previous experience with the information. Example: search for "Eytan Adar". If I click on the result about Eytan's Twitter account, it suggests that I want to see information related to his social media presence, so perhaps his LinkedIn and Facebook pages are of interest to me. Naively, we'd want to rank them at the top of the list: they're what I want most, so let's put them first! But I won't see them there, and they will confuse me, because studies show that I internalize the first few results I see. A better place to put them, so that I will encounter them and find them useful, is immediately after the result I just clicked. We have found that if you are thoughtful about how you add new content, you can create search result lists that appear the same and don't disorient people, but contain new content that is usable! When we've studied this sort of re-ranking with actual users, we find that it operates like a new search result list when finding new content, and like a previously viewed list when finding old content. (A small re-ranking sketch follows below.)

[Kim, Cramer, Teevan & Lagun. Understanding how people interact with web search results that change in real-time using implicit feedback. CIKM 2013.]
[Teevan. How people recall, recognize and reuse search results. TOIS, 2008.]
[Teevan. The Re:Search Engine: Simultaneous support for finding and re-finding. UIST 2007.]
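This is not the actual Re:Search Engine algorithm, just a hedged sketch of the placement idea described above: keep the remembered list intact and splice new personalized results in right after the last-clicked result instead of at the top. The URLs and function name are hypothetical.

def merge_personalized(original_results, last_clicked, new_results):
    # Keep the result list the user remembers intact, but insert freshly
    # personalized results immediately after the result they last clicked,
    # rather than at the top where they would disrupt the remembered ordering.
    merged = []
    for url in original_results:
        merged.append(url)
        if url == last_clicked:
            merged.extend(u for u in new_results if u not in original_results)
    return merged

# Hypothetical example for the query "eytan adar".
original = ["eytan-homepage", "eytan-twitter", "eytan-university-page"]
print(merge_personalized(original, "eytan-twitter",
                         ["eytan-linkedin", "eytan-facebook"]))
# -> ['eytan-homepage', 'eytan-twitter', 'eytan-linkedin',
#     'eytan-facebook', 'eytan-university-page']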

33 Create Change Blind Web Experiences
This is like playing with change blindness. [Crosswalk appears and disappears.] [Can skip if needed for time.]

34 We might apply the same principle to Woot.com
Keep the stuff I pay attention to on the site the same, and change the rest

35 The Complex Task of Making Search Simple
Challenge: the web is complex. Tools change, content changes, and different people use the web differently. Fortunately, individuals are simple: we are predictable and follow patterns, and predictability enables personalization. But beware of breaking expectations! Bias personalization by expectations, and create magic personal experiences.

We've looked at why search can be particularly challenging on the web. However, people follow patterns, we're predictable, and there's a lot we can do to take advantage of this. But we can't disrupt the patterns in the process; we need to account for the fact that people's prior experiences set up expectations that we should meet. And the cool thing is that this is possible with a little magic.

36 References
Adar, Teevan & Dumais. Large scale analysis of web revisitation patterns. CHI 2008.
Adar, Teevan & Dumais. Resonance on the web: Web dynamics and revisitation patterns. CHI 2009.
Broder. A taxonomy of web search. SIGIR Forum, 2002.
Donato, Bonchi, Chi & Maarek. Do you want to take notes? Identifying research missions in Yahoo! Search Pad. WWW 2010.
Dumais. Task-based search: A search engine perspective. NSF Task-Based Information Search Systems Workshop, 2013.
Jansen, Spink & Saracevic. Real life, real users, and real needs: A study and analysis of user queries on the web. IP&M, 2000.
Kim, Cramer, Teevan & Lagun. Understanding how people interact with web search results that change in real-time using implicit feedback. CIKM 2013.
Kim, Teevan & Craswell. Explicit in situ user feedback for web search results. SIGIR 2016.
Lee, Teevan & de la Chica. Characterizing multi-click search behavior and the risks and opportunities of changing results during use. SIGIR 2014.
Mitchell & Shneiderman. Dynamic versus static menus: An exploratory comparison. SIGCHI Bulletin, 1989.
Selberg & Etzioni. On the instability of web search engines. RIAO 2000.
Silverstein, Marais, Henzinger & Moricz. Analysis of a very large web search engine query log. SIGIR Forum, 1999.
Somberg. A comparison of rule-based and positionally constant arrangements of computer menu items. CHI 1986.
Svore, Teevan, Dumais & Kulkarni. Creating temporally dynamic web search snippets. SIGIR 2012.
Teevan. How people recall, recognize and reuse search results. TOIS, 2008.
Teevan. The Re:Search Engine: Simultaneous support for finding and re-finding. UIST 2007.
Teevan, Alvarado, Ackerman & Karger. The perfect search engine is not enough: A study of orienteering behavior in directed search. CHI 2004.
Teevan, Collins-Thompson, White & Dumais. Viewpoint: Slow search. CACM, 2014.
Teevan, Collins-Thompson, White, Dumais & Kim. Slow search: Information retrieval without time constraints. HCIR 2013.
Teevan, Cutrell, Fisher, Drucker, Ramos, Andrés & Hu. Visual snippets: Summarizing web pages for search and revisitation. CHI 2009.
Teevan, Dumais & Horvitz. Potential for personalization. TOCHI, 2010.
Teevan, Dumais & Liebling. To personalize or not to personalize: Modeling queries with variation in user intent. SIGIR 2008.
Teevan, Liebling & Geetha. Understanding and predicting personal navigation. WSDM 2011.
Tyler & Teevan. Large scale query log analysis of re-finding. WSDM 2010.
Tyler, Teevan, Bailey, de la Chica & Dandekar. Large scale log analysis of individuals' domain preferences in web search. MSR-TR, 2015.

37 Jaime Teevan (@jteevan) teevan@microsoft.com
Thank You! Jaime Teevan

38 Extra Slides How search engines can make use of change to improve search.

39 Change Can Identify Important Terms
Divergence from the norm (example terms: cookbooks, frightfully, merrymaking, ingredient, latkes) versus staying power in the page (term presence over time, September through December).

There are all sorts of ways that search engines can use change to help identify relevant information. For example, change tells us something about the meaning of the page: what the page is consistently about, and what it is about just for a second. We can also match the query with the need: a trending query is best matched to dynamic content, a steady query to stable content. (A small sketch of these two signals follows below.)

[Adar, Teevan, Dumais & Elsas. The web changes everything: Understanding the dynamics of web content. WSDM 2009.]
[Elsas & Dumais. Leveraging temporal dynamics of document content in relevance ranking. WSDM 2010.]
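As a rough illustration only (not the models in the cited WSDM papers), a toy Python sketch of the two signals: a term's staying power across crawls of a page, and its divergence from a hypothetical background corpus frequency.

from collections import Counter

def term_signals(snapshots, background_freq):
    # snapshots: list of token lists, one per crawl of the same page over time.
    # background_freq: {term: probability of the term in a general web corpus}.
    # Returns, per term, (staying power, divergence-from-norm score).
    n = len(snapshots)
    presence = Counter()   # number of snapshots each term appears in
    frequency = Counter()  # total occurrences across snapshots
    total_tokens = 0
    for tokens in snapshots:
        presence.update(set(tokens))
        frequency.update(tokens)
        total_tokens += len(tokens)

    signals = {}
    for term, count in frequency.items():
        staying_power = presence[term] / n
        divergence = (count / total_tokens) / background_freq.get(term, 1e-6)
        signals[term] = (staying_power, divergence)
    return signals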

40 Change to Results Can Help with Finding!
Change to the clicked result:
Unsatisfied initially: Gone > Down > Stay > Up.
Satisfied initially: Stay > Down > Up > Gone.
Changes around the click always benefit NSAT users and are best below the click for satisfied users.

Outcome by how the previously clicked result moved (higher number = better):
Up: NSAT 2.00, SAT 4.65. Stay: NSAT 2.08, SAT 4.78. Down: NSAT 2.20, SAT 4.75. Gone: NSAT 2.31, SAT 4.61.

Outcome by whether surrounding results changed:
NSAT users: above the click, changes 2.30 v. static 2.21; below the click, changes 2.09 v. static 1.99.
SAT users: above the click, 4.93; below the click, changes 4.79 v. static 4.61.

[Lee, Teevan & de la Chica. Characterizing multi-click search behavior and the risks and opportunities of changing results during use. SIGIR 2014.]

41 The World Wide Web 20 Years Ago
Web usage: 36 million users (1% of the world); today half the world is online. Hotmail was founded in 1996.
Tools: Netscape was the most popular browser. No social networks, no smartphones, and Google not yet on the scene.
Search engines: 50 million pages indexed (Infoseek); 19 million queries a day (AltaVista).

Search engines started really emerging just around 20 years ago. And, actually, that's what really changed my perception of the web: the advent of web search engines. The ability to find and access information entirely changed the value of the content. I went to work at Infoseek in the late 1990s, in Sunnyvale during the big dot-com bubble. Search engines were very different 20 years ago. Infoseek had the largest index, at 50 million pages. AltaVista got the most queries, at 19 million per day (220 queries/second). By way of comparison, Google processes 40k queries a second in 2012, or 19 million queries in 5 minutes. (Go back even further, to 1994: the Lycos index contained 54k webpages, and behavioral logs showed 1.5k queries a day.) It was the era of pre-link analysis, Boolean search, charging for search, blinking ads, and "under construction" pages. Yahoo was the main way for people to find things, and most of Yahoo was entirely static: you entered a query and got a hand-curated webpage for it. Some of my early research used link structure to try to automatically generate those hand-curated webpages. Query volume over time: 2 trillion/year in 2016, 1 billion/year in 1999, 19 million/day in 1996.

42 Extra Slides Privacy issues and behavioral logs.

43 Public Sources of Behavioral Logs
Public web service content: Twitter, Facebook, Digg, Wikipedia. Research efforts to create logs: the Lemur Community Query Log Project (1 year of data collection = 6 seconds of Google logs). Publicly released private logs: DonorsChoose.org, the Enron corpus, the AOL search logs, the Netflix ratings. The Enron corpus was purchased by Andrew McCallum at UMass Amherst for $10k. There are not many private logs made available by companies, because of the privacy implications.

44 Example: AOL Search Dataset
August 4, 2006: Logs released to the academic community. 3 months, 650 thousand users, 20 million queries. The logs contain anonymized user IDs.
August 7, 2006: AOL pulled the files, but they had already been mirrored.
August 9, 2006: The New York Times identified Thelma Arnold ("A Face Is Exposed for AOL Searcher No."). Her log contained queries for businesses and services in Lilburn, GA (pop. 11k) and queries for Jarrett Arnold (and others of the Arnold clan). The NYT contacted all 14 people in Lilburn with the Arnold surname, and when contacted, Thelma Arnold acknowledged her queries.
August 21, 2006: 2 AOL employees fired, CTO resigned.
September 2006: Class action lawsuit filed against AOL.

Sample log rows (columns: AnonID, Query, QueryTime, ItemRank, ClickURL): jitp (click at rank 1); jipt submission process (click at rank 3); computational social scinece; computational social science (click at rank 2); seattle restaurants (click at rank 2); perlman montreal (click at rank 4); jitp 2006 notification.

45 Example: AOL Search Dataset
Other well-known AOL users: User 927 ("how to kill your wife") and the user behind "i love alaska". Anonymized IDs do not make logs anonymous. The logs contain directly identifiable information: names, phone numbers, credit cards, social security numbers. They also contain indirectly identifiable information, as Thelma's queries show: birthdate, gender, and zip code identify 87% of Americans.

46 Example: Netflix Challenge
October 2, 2006: Netflix announces a contest to predict people's ratings, with a $1 million prize. 100 million ratings, 480k users, 17k movies. Very careful with anonymity post-AOL.
May 18, 2008: Data de-anonymized in a paper published by Narayanan & Shmatikov, using background knowledge from IMDB and robust to perturbations in the data.
December 17, 2009: Doe v. Netflix.
March 12, 2010: Netflix cancels the second competition.

Netflix's position: "All customer identifying information has been removed; all that remains are ratings and dates. This follows our privacy policy. . . Even if, for example, you knew all your own ratings and their dates you probably couldn't identify them reliably in the data because only a small sample was included (less than one tenth of our complete dataset) and that data was subject to perturbation."

Data format: a ratings file per movie ("1:" for movie 1, followed by CustomerID, Rating, Date rows such as "12, 3", "1234, 5", "2468, 1") and a movie titles file of MovieID, Year, Title rows (e.g., 10120, 1982, "Bladerunner"; 17690, 2007, "The Queen").

