Published by Egbert Fletcher. Modified over 9 years ago.
1
Challenges in Commerce Search
Hugh E. Williams
Vice President, Experience, Search, and Platforms
@hughewilliams, hugh@hughwilliams.com
2
eBay Today
3
50+ petabytes of data in our Hadoop and Teradata clusters
2+ billion page views each day
75+ billion database calls each day
250 million queries per day
4
The opportunity ahead is huge
Online Commerce: $1 trillion
Commerce: $10 trillion
Source: Economist Intelligence Unit, Morgan Stanley
Note: Market sizes as of 2012, Compounded Annual Growth Rates from 2012 to 2015
6
Today’s Search
– Turnaround contributor
– Series of improvements
– Ten year old technology
7
Conversion up 13%
[Chart, 2010 to 2012. Contributors: Better Search, Simple Flows, Better Images, Merch’ing, Other]
8
Improving Search from 2009 to 2012
– User experience changes
   Imagery
   Reorganization
   Optimization
   Major page refresh
   Speed
– Search science
   Query understanding and rewriting
   Understanding user intent
   Behavioral measurement
   Substantial ranking improvements (particularly to Fixed Price ranking)
– And all on a 10+ year old platform named Voyager
9
Query Understanding and Rewriting
Our search engine was literal
We’re on a journey to make it more intuitive
Idea: Mine our query-session data, look for patterns, and use these to map words in user queries to synonyms and structured data
[Diagram labels: User Query, Query Rewrite, Search Query, eBay, Search, Results]
10
PATTERNS: QUERY REWRITES …
11
How do buyers purchase the pilzlampe?
It turns out, they do one of a few things:
– Type pilzlampe, and purchase
– Type pilzlampe, …, pilz lampe, and purchase
– Type pilzlampe, …, pilzlampen, and purchase
– Type pilz lampen, …, pilzlampe, and purchase
– …
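One way to picture mining these patterns is the rough sketch below. It is not eBay's actual pipeline. It assumes each session is available as a list of typed queries plus a purchase flag, and it counts query pairs that co-occur in purchasing sessions. The function name, the `min_support` threshold, and the sample sessions are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rewrite_candidates(sessions, min_support=2):
    """Count unordered query pairs that co-occur in sessions ending in a purchase.

    `sessions` is a list of (queries, purchased) tuples. This input shape is an
    assumption for the sketch, not a documented eBay format.
    """
    pair_counts = Counter()
    for queries, purchased in sessions:
        if not purchased:
            continue  # only sessions that end in a purchase count as evidence
        for a, b in combinations(sorted(set(queries)), 2):
            pair_counts[(a, b)] += 1
    # Keep only pairs seen often enough to be treated as a candidate rewrite.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Made-up sessions that mirror the patterns on the slide.
sessions = [
    (["pilzlampe"], True),
    (["pilzlampe", "pilz lampe"], True),
    (["pilzlampe", "pilz lampe"], True),
    (["pilzlampe", "pilzlampen"], True),
    (["pilz lampen", "pilzlampe"], True),
    (["pilzlampe", "lavalampe"], False),
]
candidates = mine_rewrite_candidates(sessions)
```

With this toy data only the pair (pilz lampe, pilzlampe) reaches the support threshold. At real query volumes the threshold and the definition of a session would need far more care.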
12
How do buyers purchase the pilzlampe?
From our data mining:
– We automatically discover that pilz lampe and pilzlampe are the same
– We also discover that pilz and pilze are the same, and lampe and lampen are the same
From these patterns, we rewrite the user’s query pilzlampe as:
pilzlampe OR “pilz lampe” OR “pilz lampen” OR pilzlampen OR “pilze lampe” OR pilzelampe OR “pilze lampen” OR pilzelampen
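The expansion above can be reproduced with a small sketch. It assumes the compound has already been split into its parts (pilz, lampe) and that the mined equivalences are available as a plain dictionary. `expand_query` and `variants` are hypothetical names, not part of any eBay system.

```python
from itertools import product


def expand_query(parts, variants):
    """Build an OR query from every combination of part variants.

    Each combination is emitted twice: joined as a compound word and as a
    quoted phrase with spaces. `variants` maps a part to its equivalents.
    """
    options = [variants.get(part, [part]) for part in parts]
    terms = []
    for combo in product(*options):
        joined = "".join(combo)
        phrase = '"' + " ".join(combo) + '"'
        for term in (joined, phrase):
            if term not in terms:  # keep first occurrence, preserve order
                terms.append(term)
    return " OR ".join(terms)


# Equivalences taken from the slide: pilz = pilze, lampe = lampen.
variants = {"pilz": ["pilz", "pilze"], "lampe": ["lampe", "lampen"]}
rewritten = expand_query(["pilz", "lampe"], variants)
```

This yields the same eight alternatives as the slide, though in a different order and with straight rather than typographic quotes. The number of alternatives grows multiplicatively with the variants per part, which is one reason rewrites get harder at scale.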
13
Are Query Rewrites easy?
Nothing is easy at scale
– Incorrect strong signals:
   CMU is not Central Michigan University
   Mariners is not the same as Marines
– Context matters
   Correcting Seattle Marines to Seattle Mariners is (generally) right
   Denver Nuggets is not Denver in the Jewelry & Watches category
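A minimal sketch of context-dependent rewriting follows, assuming rules can be guarded by other tokens in the query and by the category being searched. The rule structure, the category names other than Jewelry & Watches, and the "nuggets nba" target are illustrative assumptions. The Denver Nuggets rule is only one possible reading of the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewriteRule:
    source: str                               # token to replace
    target: str                               # replacement text
    requires: frozenset = frozenset()         # other tokens that must be in the query
    blocked_categories: frozenset = frozenset()  # categories where the rule is off


def apply_rules(query, category, rules):
    """Rewrite tokens of `query` using the first applicable rule per token."""
    tokens = query.lower().split()
    present = set(tokens)
    out = []
    for tok in tokens:
        replacement = tok
        for rule in rules:
            if rule.source != tok:
                continue
            if category in rule.blocked_categories:
                continue
            if not rule.requires <= present:
                continue
            replacement = rule.target
            break
        out.append(replacement)
    return " ".join(out)


# Hypothetical rules for illustration only.
RULES = [
    # "marines" becomes "mariners" only when "seattle" is also in the query.
    RewriteRule("marines", "mariners", requires=frozenset({"seattle"})),
    # A team-oriented expansion that is switched off in Jewelry & Watches.
    RewriteRule("nuggets", "nuggets nba", requires=frozenset({"denver"}),
                blocked_categories=frozenset({"Jewelry & Watches"})),
]

fixed = apply_rules("Seattle Marines", "Sports", RULES)
untouched = apply_rules("US Marines", "Collectibles", RULES)
```

Guarding each rule explicitly keeps a strong but wrong signal, such as Marines versus Mariners, from firing outside the context where it was observed.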
14
An even bigger opportunity
Next Gen Search
15
Cassini: Reengineering eBay Search
16
Top-to-Bottom View
19
How hard is it to ship a new search engine?
Voyager is used for much more than the obvious. It’s multi-tenant:
– “Default Search” search (already migrated to Cassini in the US)
– Completed, null and low (already migrated to Cassini worldwide)
– Description search
– Deterministic sorts
– Query rewrite
– Merchandizing
– The Feed
– Selling (for example, allowing sellers to create listings from similar items)
– Category browsing
– Motors and other verticals
– Many fast “item lookup” scenarios for other teams
– Many scenarios we don’t even know about…
20
What else is hard about eBay search?
eBay has over 400 million items listed in multiple languages
Our collection of items changes fast
You can find just about anything on eBay. We have to optimize for every type of item
Not everybody follows the same listing practices, or uses the same keywords or units
– Examples include:
   Units of measure: centimeter versus cm, gigabytes versus gb
   Colors: Blue versus Aqua, Rojo is the same as Red
   Synonyms: laptop and notebook, mobile phone and cell phone
   Abbreviations: SGA means Stadium Giveaway
   Spelling errors
Our goal is to help both buyers and sellers find items even when they use different ways of expressing the same things
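One simple way to make different expressions meet is to map both listing titles and queries to a shared canonical form. The sketch below assumes a hand-written mapping built from the examples on this slide. The choice of which form is canonical, and the names `CANONICAL` and `normalize`, are arbitrary and hypothetical.

```python
# Canonical forms for the slide's examples. Which side is "canonical" is an
# arbitrary choice made for this sketch.
CANONICAL = {
    "centimeter": "cm",
    "gigabytes": "gb",
    "rojo": "red",
    "notebook": "laptop",
    "cell phone": "mobile phone",
    "sga": "stadium giveaway",
}


def normalize(text, canonical=CANONICAL, max_phrase_len=2):
    """Lowercase `text` and replace known words or phrases, longest match first."""
    tokens = text.lower().split()
    out = []
    i = 0
    while i < len(tokens):
        for n in range(max_phrase_len, 0, -1):
            window = tokens[i:i + n]
            phrase = " ".join(window)
            if len(window) == n and phrase in canonical:
                out.append(canonical[phrase])
                i += n
                break
        else:
            # No mapping starts at this position, so keep the token as typed.
            out.append(tokens[i])
            i += 1
    return " ".join(out)


listing_title = normalize("Rojo cell phone 64 gigabytes")
buyer_query = normalize("red mobile phone 64 gb")
```

Here the seller's title and the buyer's query normalize to the same string. A static dictionary like this would not cover spelling errors or the pace at which listings change, which is presumably why the patterns are mined from data instead.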
21
Technology Deep dive: Infrastructure
What’s hard at eBay?
– Multi-tenant system
– Document additions and deletions
– Document modifications
– Index updates
– Result caching
– Data center automation
– …
22
Technology Deep dive: Ranking
What’s hard at eBay?
– Mix of items:
   good ’til canceled
   multi quantity vs. single quantity
– Gaps in catalog data
– A very different problem: different ranking signals from Web search
– The deterministic sort:
   Recall versus precision
   Consistency with best match
– Spam
– Result blending
23
But What Comes Next?
24
21% of eBay multiscreen users
44% of GMV share
26
Q&A?