Why a Decision Engine? Bing Demos · Search Interaction Model · Data-Driven Research Problems · Q&A

Presentation transcript:

Agenda:
- Why a decision engine?
- Bing demos
- Search interaction model
- Data-driven research problems
- Q&A

- Imprecise results
- Research sessions
- Inform decisions

Length of sessions by type
Q: "In the past six months, have you used a search engine to help inform your decisions for the following tasks?"
66% of people are using search more frequently to make decisions.
Sources: Microsoft internal research conducted by Ipsos, 2009; Live Search session analysis.

Search engine
- Objective: get the user relevant information (a website)
- Model: leave the search results page with a single click
- Challenge: query–URL matching

Decision engine
- Objective: complete the task by fulfilling the user's intent
- Model: explore search results by clicking and browsing
- Challenge: whole-page relevance

Explore pane (or left rail)
- Table of contents: verify user intent
- Related searches: expand user intent
- Search history: remind the user of their intent

Task completion
- D-cards: show structured information for the site
- Hover: show more information for the site
- Simple tasks: e.g., a FedEx tracking number

A lot of data…
- Billions of query-log entries, documents, pictures, clicks, etc.
- Processing them is costly and takes time
- Statistical learning + distributed computing: can we train on 1 billion samples (query–URL pairs) within a few hours, without over-fitting?

Two examples of Bing-MSR collaboration:
- "Bing-It-On" N-gram
- Statistical models for log mining
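The slide's "statistical learning + distributed computing" point rests on a standard trick: compute sufficient statistics per shard of the log, then merge them with an associative operation, so billions of samples can be processed in parallel. A minimal sketch of that shard-and-merge pattern, here estimating click-through rates for query–URL pairs (the function names `shard_stats`, `merge`, and `ctr` are illustrative, not from the talk; a real system would run the map step on many machines):

```python
from collections import Counter
from functools import reduce

def shard_stats(samples):
    """Per-shard sufficient statistics: impressions and clicks per (query, url)."""
    imp, clk = Counter(), Counter()
    for query, url, clicked in samples:
        imp[(query, url)] += 1
        if clicked:
            clk[(query, url)] += 1
    return imp, clk

def merge(a, b):
    """Combine two shards' statistics; associative, so it parallelizes freely."""
    return a[0] + b[0], a[1] + b[1]

def ctr(shards):
    """Map shard_stats over the shards, reduce with merge, then normalize."""
    imp, clk = reduce(merge, map(shard_stats, shards))
    return {pair: clk[pair] / imp[pair] for pair in imp}
```

Because only small count tables cross machine boundaries, the wall-clock time scales with shard size rather than total log size.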

LM applications in search
- Query processing: alterations, expansions, suggestions
- Document processing: classification, clustering
- Matching and ranking
- Powerset.com

Leading technology: N-gram
- P(next word | N−1 preceding words)
- Better "understanding" = the model predicts better
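The formula P(next word | N−1 preceding words) is estimated from counts: the probability of a word given its history is the count of the full n-gram divided by the count of the history. A minimal maximum-likelihood sketch (function names are illustrative; production models add smoothing, covered on the next slide):

```python
from collections import Counter

def train_ngram(corpus, n=3):
    """Count n-grams and their (n-1)-word histories from tokenized sentences."""
    ngrams, histories = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] * (n - 1) + sent + ["</s>"]  # pad sentence boundaries
        for i in range(len(tokens) - n + 1):
            hist = tuple(tokens[i:i + n - 1])
            ngrams[hist + (tokens[i + n - 1],)] += 1
            histories[hist] += 1
    return ngrams, histories

def prob(word, history, ngrams, histories):
    """Maximum-likelihood estimate of P(word | history)."""
    hist = tuple(history)
    if histories[hist] == 0:
        return 0.0
    return ngrams[hist + (word,)] / histories[hist]
```

For example, with a bigram model (n=2) trained on "web search ranking" and "web search engine", P(search | web) = 2/2 = 1.0 and P(ranking | search) = 1/2.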

A high-quality model needs lots of data at web scale:
- Billions of documents, trillions of words, petabytes of storage
- Smoothing: how to deal with a very long tail
- Freshness: web content is added and revised rapidly

Bing-MSR innovation: Constantly Adapting LM (CALM)
- Highly parallel algorithm designed for cloud computing
- Refines the LM as soon as documents are crawled
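The two bullets above — smoothing for the long tail and refining the model as documents arrive — can be illustrated together. This is not the CALM algorithm itself (which is a parallel cloud-scale system), only a toy single-machine sketch of the idea: fold each newly crawled document into running counts, and answer queries with interpolation smoothing so unseen bigrams still get nonzero probability. The class name and the interpolation weight `lam` are assumptions for illustration:

```python
from collections import Counter

class AdaptiveBigramLM:
    """Bigram LM whose counts grow as each new document arrives,
    with linear-interpolation smoothing for the long tail."""

    def __init__(self, lam=0.7):
        self.lam = lam            # weight on the bigram estimate
        self.uni = Counter()      # unigram counts
        self.bi = Counter()       # bigram counts
        self.total = 0            # total tokens seen

    def update(self, tokens):
        """Fold a freshly crawled document into the model (freshness)."""
        for i, w in enumerate(tokens):
            self.uni[w] += 1
            self.total += 1
            if i > 0:
                self.bi[(tokens[i - 1], w)] += 1

    def prob(self, word, prev):
        """Interpolated P(word | prev): lam * bigram MLE + (1 - lam) * unigram MLE."""
        p_uni = self.uni[word] / self.total if self.total else 0.0
        p_bi = self.bi[(prev, word)] / self.uni[prev] if self.uni[prev] else p_uni
        return self.lam * p_bi + (1 - self.lam) * p_uni
```

Because `update` only increments counters, many crawlers could apply it concurrently to sharded counters and merge them, which is the parallelism the slide alludes to.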

We are sharing our resources. For details, follow "Bing-It-On Ngram". Comparing Bing-It-On with Google's release:

                         Google                 Bing-It-On N-gram
  Content types          Document body only     Body, title, anchor texts
  Model types            Raw counts only        Counts and smoothed models
  Highest order N        5                      5
  Training size (body)   ~1.0 trillion words    >1.3 trillion words
  # of 1-grams (body)    13 million             1 billion
  # of 5-grams (body)    1 billion              237 billion
  Availability           DVDs from LDC          On-demand web services hosted by MS
  Update                 September 2006         Monthly

Data-driven decisions are first-class citizens in search.
- Toolbar and search logs: ~10 TB/day each
- Bing uses log mining to understand what users want, assess how we are doing, and quickly improve Bing (query processing, ranking, user experience)

Examples:
- Relevance inference using the Bayesian Browsing Model (BBM)
- User-behavior understanding with Hidden Markov Models (HMM)

The search log records only clicked results; skipped results are not explicitly recorded.
Hidden-data mining:
- Model viewed results as a Markov chain
- Skipped results = hidden data
How well does it work? Very close to eye-tracking data.
[Figure: inferred examination and click percentages by result position, closely matching eye-tracking measurements]
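The "skipped results = hidden data" idea can be made concrete with a much simpler model than BBM: under a cascade assumption, the user scans results top-down and stops at the click, so every result ranked above a click was examined and skipped even though the log never says so. A minimal sketch (this is a cascade-style illustration, not the Bayesian Browsing Model, which performs full Bayesian inference over these hidden examinations; the function name and single-click sessions are simplifying assumptions):

```python
def cascade_relevance(sessions, n_positions):
    """Estimate per-position relevance under a cascade assumption:
    the user scans top-down and stops at the first clicked result,
    so everything above the click was examined but skipped."""
    examined = [0] * n_positions
    clicked = [0] * n_positions
    for click_pos in sessions:              # 0-based rank of the session's click
        for pos in range(click_pos + 1):
            examined[pos] += 1              # recover the hidden "viewed" events
        clicked[click_pos] += 1
    return [c / e if e else 0.0 for c, e in zip(clicked, examined)]
```

For example, in four sessions with clicks at ranks [0, 2, 2, 0], rank 1 was examined twice but never clicked, so its estimated relevance is 0.0 even though the raw log contains no explicit "skipped" record for it.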