COMAD 2008, Chakrabarti: Bridging the Structured-Unstructured Gap


1 Bridging the Structured-Unstructured Gap

Born in New York in 1934, Sagan was a noted astronomer whose lifelong passion was searching for intelligent life in the cosmos.

Questions and their typed proximity queries:
- Where was Sagan born? -> type=region NEAR “Sagan”
- Name a physicist who searched for intelligent life in the cosmos -> type=physicist NEAR “cosmos” …
- When was Sagan born? -> type=time pattern=isDDDD NEAR “Sagan” “born”

[Figure: is-a type hierarchy. Node labels: entity, person, scientist, physicist, astronomer, region, city, district, state, abstraction, time, year. Token patterns: hasDigit, isDDDD.]
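The third query on this slide can be illustrated with a small sketch. It is not the system described in the talk: the function names (`tokenize`, `is_dddd`, `near_query`) and the inverse-distance scoring are assumptions chosen only to show how a pattern filter (isDDDD) combines with proximity to the match words “Sagan” and “born”.

```python
import re

SENTENCE = ("Born in New York in 1934, Sagan was a noted astronomer whose "
            "lifelong passion was searching for intelligent life in the cosmos.")

def tokenize(text):
    """Split text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)

def is_dddd(token):
    """The isDDDD pattern from the slide: exactly four digits."""
    return re.fullmatch(r"\d{4}", token) is not None

def near_query(text, candidate_pred, keywords):
    """Rank candidate tokens by proximity to keyword tokens.

    Scoring is an assumed toy formula: the sum of 1/distance
    (in tokens) to every keyword occurrence.
    """
    keys = {k.lower() for k in keywords}
    tokens = tokenize(text)
    lowered = [t.lower() for t in tokens]
    keyword_positions = [i for i, t in enumerate(lowered) if t in keys]
    results = []
    for i, tok in enumerate(tokens):
        # A keyword itself is never returned as an answer candidate.
        if candidate_pred(tok) and lowered[i] not in keys:
            score = sum(1.0 / abs(i - j) for j in keyword_positions)
            results.append((tok, score))
    return sorted(results, key=lambda pair: -pair[1])

# "When was Sagan born?" as pattern=isDDDD NEAR "Sagan" "born"
answers = near_query(SENTENCE, is_dddd, {"Sagan", "born"})
print(answers[0][0])  # top-ranked candidate: 1934
```

The type=region and type=physicist queries would additionally need an entity catalog with is-a links, which this sketch leaves out.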

2 Graph Proximity Search

- Graphs with typed nodes and edges are ubiquitous
- Score candidates by proximity to match nodes
  - Short path from match nodes
  - Many parallel paths from match nodes
- PageRank, commute time, escape probability, …

[Figure: example typed graph. Labels: XML, index, holistic; edge types hasWord, cites, worksFor, wrote, sent, received, isA; nodes company, P, P′, R, J.]
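One of the proximity measures listed above, PageRank, can be personalized to the match nodes so that candidates with short paths and many parallel paths score higher. The following is a minimal sketch under stated assumptions, not the talk's implementation: the toy graph, the node names (`paper_a`, `alice`, and so on), the edge-type weights, and the restart probability `alpha` are all hypothetical.

```python
def personalized_pagerank(edges, type_weight, match_nodes, alpha=0.15, iters=100):
    """Random walk with restart to match_nodes on a typed graph.

    edges: list of (node_u, edge_type, node_v), treated as undirected.
    type_weight: relative weight per edge type (assumed values).
    """
    nodes = sorted({n for u, _, v in edges for n in (u, v)})
    out = {n: [] for n in nodes}
    for u, etype, v in edges:
        out[u].append((v, type_weight[etype]))
        out[v].append((u, type_weight[etype]))
    restart = {n: (1.0 / len(match_nodes) if n in match_nodes else 0.0)
               for n in nodes}
    score = dict(restart)
    for _ in range(iters):
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            total = sum(w for _, w in out[n])
            if total == 0:
                # Isolated node: send its mass back to the match nodes.
                for m in nodes:
                    nxt[m] += (1 - alpha) * score[n] * restart[m]
            else:
                # Spread mass over neighbors in proportion to edge weight.
                for v, w in out[n]:
                    nxt[v] += (1 - alpha) * score[n] * w / total
        score = nxt
    return score

# Hypothetical graph: two query words, two papers, two authors.
EDGES = [
    ("xml", "hasWord", "paper_a"),
    ("index", "hasWord", "paper_a"),
    ("xml", "hasWord", "paper_b"),
    ("paper_a", "wrote", "alice"),
    ("paper_b", "wrote", "alice"),
    ("paper_b", "wrote", "bob"),
]
WEIGHTS = {"hasWord": 1.0, "wrote": 2.0}

scores = personalized_pagerank(EDGES, WEIGHTS, match_nodes={"xml", "index"})
# alice is reached through both papers, bob through only one.
print(scores["alice"] > scores["bob"])
```

In this toy graph `alice` outranks `bob` because she is connected to the match words through two parallel paths, which is the behavior the slide asks of a proximity score.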

3 Problems and Some Solutions

- Annotation and disambiguation
  - Add links from token segments to entity catalog
- Learning to rank (KDD 2006, ICML 2007)
  - Learn relative importance of edge types from relevance feedback
- Entity search (WWW 2006)
  - “Typical battery life of Lenovo X300 laptop”
  - Collective ranking of snippets and entities
- Indexing for proximity search (WWW 2007)
  - Constant query time independent of graph size
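The annotation step above, linking token segments to an entity catalog, can be sketched with a greedy longest-match lookup. This is only an illustration: the catalog entries, the identifier format, and the `annotate` function are hypothetical, and the sketch does no disambiguation, which the slide names as part of the problem.

```python
def annotate(tokens, catalog, max_len=3):
    """Link token segments to catalog entries, longest match first.

    Returns (start, end, surface_text, entity_id) tuples, where
    tokens[start:end] is the linked segment.
    """
    links = []
    i = 0
    while i < len(tokens):
        # Try the longest segment starting at i, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            segment = " ".join(tokens[i:i + n])
            if segment in catalog:
                links.append((i, i + n, segment, catalog[segment]))
                i += n
                break
        else:
            # No catalog entry starts at this token.
            i += 1
    return links

# Hypothetical catalog: surface form -> entity identifier.
CATALOG = {
    "New York": "region:New_York",
    "York": "region:York",
    "Sagan": "person:Sagan",
}

tokens = "Born in New York in 1934 Sagan was a noted astronomer".split()
for link in annotate(tokens, CATALOG):
    print(link)
```

Longest-match-first means “New York” is linked as one segment rather than as the shorter entry “York”. A real annotator would also have to choose among several catalog entries that share a surface form.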

4 Scaling Up

- Aggressive open-domain Web annotation
- Entity catalog from WordNet, Wikipedia, …
- Search API 2.0: text + structure, indexing
- Mining semistructured fact/relation views
- Cloud in our basement!
  - 320 cores, 320 GB RAM, 120 TB disk
  - Terabytes of crawled Web data
  - Tens of millions of queries from Y and M
  - Click trails on URLs and ads
- Supported by Y, M, HP

