Published by Jason Lyon. Modified over 11 years ago.
1
Solr @Comcast Entertainment Search
Comcast Confidential 1/27/2014
2
Our Search Problem
Entertainment domain-specific search
Not very clean data
250K shows
500 million+ showings
14 days of TV schedules
OnDemand data mixed with TV shows
Programs, Persons, Genres, Networks, ...
<10 ms search results for prefix queries (CS*)
Across devices – STB, iPad, Web
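As a rough illustration of the prefix-query requirement (CS*), the sketch below builds a request URL for Solr's standard `/select` handler using the common `q`, `rows`, and `wt` parameters. The slides do not show the actual queries, so the host, the `title` field name, and the lowercasing step are assumptions made for the example only.

```python
import re
from urllib.parse import urlencode

# Characters with special meaning in the Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/\s])')


def escape_term(text):
    """Backslash-escape query-syntax characters in user input."""
    return _SPECIAL.sub(r"\\\1", text)


def prefix_query_url(base_url, field, prefix, rows=10):
    """Build a /select URL for a prefix query such as title:cs*."""
    # Assumption: the field is lowercased at index time, so the prefix is
    # lowercased here as well. Any trailing '*' typed by the user is dropped
    # and re-added after escaping.
    term = escape_term(prefix.rstrip("*").lower())
    params = {"q": f"{field}:{term}*", "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field name, for illustration only.
url = prefix_query_url("http://localhost:8983/solr", "title", "CS*")
print(url)
```

Keeping `rows` small is one simple way to limit response size for type-ahead style lookups; how the team actually reached the <10 ms target is not described in the slides.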
3
How do we massage data?
Automated cleaning
Automated disambiguation
Editorial cleaning
Custom ranking
4
Search History @Comcast
StreamSage?
Search
– Custom (2000-2005)
– Berkeley DB based (2005-Aug 2009)
– Solr based, Take 1 (Sept 2009-Sept 2010)
– Solr based, Take 2 (Oct 2010-..)
5
Berkeley DB based search
Stored terms per document
30 ms response (avg)
Scalability
Custom replication
Hard to add fields
4-5 hrs for indexing on good days
6
Solr – Take 1
Composite large document
Extra presentation/filtering logic in APIs
Ranking skew due to mixed queries
Unnecessary documents
7
Solr – Take 1
[Architecture diagram. Labels: Indexer (Solr Master); Solr Replication; Solr Slave (x2); QE (x2); Client Apps]
8
Solr – Take 2
Typed documents
All processing/presentation logic inside Solr
– Custom handlers
– Custom scoring
Custom caches
Incremental indexing
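One way to picture the move from a composite document to typed documents is sketched below: each record carries a type tag, and a query is restricted to one type with Solr's standard `fq` (filter query) parameter so that programs, persons, and networks are not ranked against each other. The slides do not describe the real schema, so the `type` and `name` field names and the sample records are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical typed documents: one small record per entity, tagged by type,
# instead of one large composite document.
docs = [
    {"id": "program-1", "type": "program", "name": "Example Show"},
    {"id": "person-1", "type": "person", "name": "Example Actor"},
    {"id": "network-1", "type": "network", "name": "Example Network"},
]


def typed_select_params(query, doc_type, rows=10):
    """Query parameters that restrict a search to a single document type."""
    return [
        ("q", query),
        ("fq", f"type:{doc_type}"),  # filter only; does not affect scoring
        ("rows", rows),
        ("wt", "json"),
    ]


def filter_by_type(records, doc_type):
    """Local stand-in for what the fq restriction selects."""
    return [r for r in records if r["type"] == doc_type]


query_string = urlencode(typed_select_params("name:ex*", "program"))
print(query_string)
print([d["id"] for d in filter_by_type(docs, "program")])
```

Because a filter query narrows the result set without contributing to the score, it is a plausible fit for the "ranking skew due to mixed queries" problem listed under Take 1, though the deck attributes the fix to custom handlers and scoring rather than to any specific parameter.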
9
Solr – Take 2
[Architecture diagram. Labels: Indexer (Solr Master); Solr Replication; Solr Slave (x2); Client Apps]
10
Demo
Set Top Box