Published by Patrick Lawrence. Modified over 9 years ago.
1
COMAD 2008, Chakrabarti
Bridging the Structured-Unstructured Gap
Example text: "Born in New York in 1934, Sagan was a noted astronomer whose lifelong passion was searching for intelligent life in the cosmos."
Figure (type hierarchy, is-a edges): entity, person, scientist, physicist, astronomer; region, city, district, state; abstraction, time, year; token patterns hasDigit, isDDDD
Where was Sagan born? type=region NEAR “Sagan”
Name a physicist who searched for intelligent life in the cosmos: type=physicist NEAR “cosmos” …
When was Sagan born? type=time pattern=isDDDD NEAR “Sagan” “born”
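The last query on this slide, type=time pattern=isDDDD NEAR “Sagan” “born”, can be illustrated with a minimal Python sketch. It is not the system described in the talk: the function names (`is_dddd`, `proximity_score`, `answer`) and the scoring rule (sum of 1/(1 + token distance) to each keyword) are assumptions made only for illustration.

```python
import re

# Example sentence taken from the slide.
SENTENCE = ("Born in New York in 1934, Sagan was a noted astronomer whose "
            "lifelong passion was searching for intelligent life in the cosmos.")

def tokenize(text):
    # Split into word tokens, dropping punctuation.
    return re.findall(r"\w+", text)

def is_dddd(token):
    # Hypothetical stand-in for the slide's isDDDD pattern: exactly four digits.
    return re.fullmatch(r"\d{4}", token) is not None

def proximity_score(tokens, position, keywords):
    # Assumed scoring rule: each keyword contributes 1 / (1 + distance)
    # from the candidate to the keyword's nearest occurrence.
    lowered = [t.lower() for t in tokens]
    score = 0.0
    for kw in keywords:
        hits = [i for i, t in enumerate(lowered) if t == kw.lower()]
        if hits:
            score += 1.0 / (1 + min(abs(position - i) for i in hits))
    return score

def answer(text, candidate_test, keywords):
    # Keep tokens that satisfy the type/pattern test, ranked by proximity.
    tokens = tokenize(text)
    scored = [(proximity_score(tokens, i, keywords), tok)
              for i, tok in enumerate(tokens) if candidate_test(tok)]
    scored.sort(reverse=True)
    return scored

result = answer(SENTENCE, is_dddd, ["Sagan", "born"])
print(result[0][1])
```

A query such as type=region NEAR “Sagan” would need a type test backed by an entity catalog instead of a regular expression; that part is outside this sketch.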
2
Graph Proximity Search
Graphs with typed nodes and edges are ubiquitous
Score candidates by proximity to match nodes
  Short path from match nodes
  Many parallel paths from match nodes
  PageRank, commute time, escape probability, …
Figure (example typed graph): labels XML, index, holistic, hasWord, cites, worksFor, wrote, sent, received, company, isA, P, P′, R, J
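One of the proximity measures listed on this slide, PageRank, can be personalized so that the random walk restarts at the match nodes. The sketch below is a minimal illustration under stated assumptions: the toy graph, the node names, the per-edge-type weights in `EDGE_WEIGHT`, treating edges as undirected, and the restart probability of 0.5 are all hypothetical and not taken from the talk.

```python
from collections import defaultdict

# Hypothetical weights per edge type (assumption, not from the slides).
EDGE_WEIGHT = {"hasWord": 1.0, "wrote": 1.0, "cites": 0.5}

# Hypothetical toy graph: (source, edge type, destination).
EDGES = [
    ("word:xml", "hasWord", "paper:A"),
    ("word:index", "hasWord", "paper:A"),
    ("word:index", "hasWord", "paper:B"),
    ("paper:A", "wrote", "author:X"),
    ("paper:B", "wrote", "author:Y"),
    ("paper:B", "cites", "paper:A"),
]

def build_adjacency(edges, type_weight):
    # Treat every typed edge as undirected, weighted by its type.
    adj = defaultdict(dict)
    for src, etype, dst in edges:
        w = type_weight[etype]
        adj[src][dst] = adj[src].get(dst, 0.0) + w
        adj[dst][src] = adj[dst].get(src, 0.0) + w
    return adj

def personalized_pagerank(adj, match_nodes, restart=0.5, iterations=100):
    # Power iteration for a random walk that restarts at the match nodes.
    nodes = list(adj)
    teleport = {n: (1.0 / len(match_nodes) if n in match_nodes else 0.0)
                for n in nodes}
    score = dict(teleport)
    for _ in range(iterations):
        nxt = {n: restart * teleport[n] for n in nodes}
        for n in nodes:
            total = sum(adj[n].values())
            for m, w in adj[n].items():
                nxt[m] += (1.0 - restart) * score[n] * w / total
        score = nxt
    return score

adj = build_adjacency(EDGES, EDGE_WEIGHT)
scores = personalized_pagerank(adj, {"word:xml", "word:index"})
ranked = sorted((n for n in scores if n.startswith("author:")),
                key=scores.get, reverse=True)
print(ranked)
```

In this toy graph, author:X is reached through a paper that contains both query words, so it collects more walk probability than author:Y, which matches the "many parallel paths" intuition on the slide.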
3
Problems and Some Solutions
Annotation and disambiguation
  Add links from token segments to entity catalog
Learning to rank (KDD 2006, ICML 2007)
  Learn relative importance of edge types from relevance feedback
Entity search (WWW 2006)
  “Typical battery life of Lenovo X300 laptop”
  Collective ranking of snippets and entities
Indexing for proximity search (WWW 2007)
  Constant query time independent of graph size
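The idea of learning the relative importance of edge types from relevance feedback can be sketched with a simple pairwise perceptron. This is a swapped-in technique chosen for brevity, not the method of the cited KDD 2006 or ICML 2007 work. The feature values (how many edges of each type connect a candidate to the match nodes), the candidate names, and the preference pairs are all made up for illustration.

```python
EDGE_TYPES = ["wrote", "cites"]

# Hypothetical features: edge-type counts linking each candidate to match nodes.
FEATURES = {
    "a": {"wrote": 2, "cites": 0},
    "b": {"wrote": 0, "cites": 3},
    "c": {"wrote": 1, "cites": 1},
}

# Hypothetical relevance feedback: (preferred candidate, less preferred one).
PREFERENCES = [("a", "b"), ("a", "c"), ("c", "b")]

def score(weights, feats):
    # Linear score: weighted sum of edge-type counts.
    return sum(weights[t] * feats[t] for t in EDGE_TYPES)

def train(features, preferences, epochs=20, rate=1.0):
    # Pairwise perceptron: whenever a preferred candidate does not outscore
    # the other one, move the weights toward the feature difference.
    weights = {t: 0.0 for t in EDGE_TYPES}
    for _ in range(epochs):
        mistakes = 0
        for better, worse in preferences:
            margin = (score(weights, features[better])
                      - score(weights, features[worse]))
            if margin <= 0:
                for t in EDGE_TYPES:
                    weights[t] += rate * (features[better][t]
                                          - features[worse][t])
                mistakes += 1
        if mistakes == 0:
            break  # every preference pair is ordered correctly
    return weights

weights = train(FEATURES, PREFERENCES)
ranking = sorted(FEATURES, key=lambda c: score(weights, FEATURES[c]),
                 reverse=True)
print(weights, ranking)
```

The learned weights are only meaningful relative to each other; a walk-based ranker would additionally need them constrained to be non-negative, which this sketch does not enforce.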
4
Scaling Up
Aggressive open-domain Web annotation
  Entity catalog from WordNet, Wikipedia, …
Search API 2.0: text + structure, indexing
Mining semistructured fact/relation views
Cloud in our basement!
  320 cores, 320 GB RAM, 120 TB disk
Terabytes of crawled Web data
Tens of millions of queries from Y and M
Click trails on URLs and ads
Supported by Y, M, HP