Related Work Report: 施林锋, 丁文韬, 于佳婕


1 Related Work Report
施林锋, 丁文韬, 于佳婕

2 Expressive and Flexible Access to Web-Extracted Data: A Keyword-based Structured Query Language
Jeffrey Pound, Ihab F. Ilyas, Grant Weddell

3 Outline
Introduction; System Overview; Structured keyword queries; Disambiguating query intent; Experiment

4 Introduction
E.g. find “all people of German nationality who have won a Nobel award”.
Structured query: q(x) :- GERMAN_PEOPLE(x), hasWonPrize(x, y), NOBEL_PRIZE(y). Expressive, but the user needs to know the schema of the KB, and the schema may be massive (thousands of attributes, relations, …).
Keyword query: “german has won nobel award”. Flexible, but loses structural information.
Keyword-based structured query: “german, has won(nobel award)”. Target: GERMAN, hasWonPrize(Nobel_Prize).
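To make the structured query on this slide concrete, here is a minimal sketch that evaluates q(x) over a toy knowledge base. The entity names and the dictionary layout are illustrative assumptions, not the representation used in the paper.

```python
# Toy knowledge base; every entity listed here is an illustrative assumption.
concepts = {
    "GERMAN_PEOPLE": {"Albert_Einstein", "Guenter_Grass", "Angela_Merkel"},
    "NOBEL_PRIZE": {"Nobel_Prize_in_Physics", "Nobel_Prize_in_Literature"},
}
relations = {
    "hasWonPrize": {
        ("Albert_Einstein", "Nobel_Prize_in_Physics"),
        ("Guenter_Grass", "Nobel_Prize_in_Literature"),
        ("Marie_Curie", "Nobel_Prize_in_Physics"),
    },
}


def german_nobel_winners(concepts, relations):
    """Evaluate q(x) :- GERMAN_PEOPLE(x), hasWonPrize(x, y), NOBEL_PRIZE(y)."""
    return sorted({
        x
        for (x, y) in relations["hasWonPrize"]
        if x in concepts["GERMAN_PEOPLE"] and y in concepts["NOBEL_PRIZE"]
    })


result = german_nobel_winners(concepts, relations)
print(result)
```

Writing this query requires knowing the exact names GERMAN_PEOPLE, hasWonPrize and NOBEL_PRIZE, which is the schema burden the slide points out.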

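5 System Overview
Solid boxes: online processing.
Dashed boxes: preprocessing, i.e. building the KB and document indexes and extracting entities from the documents to supplement the KB.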

6 Structured keyword queries
k: one or more keywords, e.g. “nobel prize”
k(Q): relation, e.g. “born in(Germany)”
Q1, Q2, …: conjunction, e.g. “harmonica player, songwriter”
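The three query forms above can be read as a small grammar. The following is a rough sketch of a recursive-descent parser for it; the tuple tags "kw", "rel" and "and" and the function name parse are assumptions made for illustration, not part of the paper.

```python
def parse(query):
    """Parse a structured keyword query into a nested tuple tree."""
    pos = 0

    def parse_conj():
        # Q1, Q2, ...: one or more items separated by commas.
        nonlocal pos
        parts = [parse_item()]
        while pos < len(query) and query[pos] == ",":
            pos += 1
            parts.append(parse_item())
        return parts[0] if len(parts) == 1 else ("and", parts)

    def parse_item():
        # Either plain keywords k, or a relation k(Q).
        nonlocal pos
        start = pos
        while pos < len(query) and query[pos] not in "(),":
            pos += 1
        keywords = query[start:pos].strip()
        if not keywords:
            raise ValueError(f"expected keywords at position {start}")
        if pos < len(query) and query[pos] == "(":
            pos += 1
            inner = parse_conj()
            if pos >= len(query) or query[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            while pos < len(query) and query[pos] == " ":
                pos += 1  # skip spaces after ')'
            return ("rel", keywords, inner)
        return ("kw", keywords)

    tree = parse_conj()
    if pos != len(query):
        raise ValueError(f"unexpected character at position {pos}")
    return tree


tree = parse("german, has won(nobel award)")
print(tree)
```

The parser only recovers the shape of the query. Deciding which KB item each keyword group refers to is left to the disambiguation step.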

7 Disambiguating query intent
The Disambiguation Model
Disambiguation Graph: Vertices, Partitions, Edges

8 Disambiguating query intent
The Scoring Model
Semantic Similarity; Knowledge Base Support; Syntactic Similarity (keyword occurrence, edit distance, …); Score Aggregation
semanticSim(A, B) = |A ∩ B| / |A ∪ B|, i.e. the number of times A and B occur together divided by the number of times at least one of them occurs.
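The semanticSim formula is the Jaccard coefficient. A minimal sketch follows, assuming A and B are given as sets of identifiers (for example, the entities or documents associated with each KB item); the function name and the sample sets are hypothetical.

```python
def semantic_sim(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical identifier sets for two KB items.
items_a = {1, 2, 3}
items_b = {2, 3, 4}
print(semantic_sim(items_a, items_b))
```

Note that the denominator counts each shared element once, so it equals |A| + |B| - |A ∩ B| rather than |A| + |B|.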

9 Experiment
Dataset
Knowledge base: YAGO 2008. It contains over 2 million entities, about 250,000 primitive concepts, 100 relations, and over 20 million facts about these entities, concepts, and relations.
Documents: AQUAINT2 news collection, consisting of over 900,000 English news articles from various sources, collected between 2004 and 2006.
Structured keyword queries: 22 queries

10 Experiment
Disambiguation Task: Top-1 KB mapping

11 Experiment Entity Search Task |M(k)| = 30

12 Interpreting Keyword Queries over Web Knowledge Bases
Jeffrey Pound, Alexander K. Hudek, Ihab F. Ilyas, Grant Weddell

13 Outline
Introduction; Data and problem definitions; Structuring keyword queries; Experiment

14 Introduction
Query understanding: represent the underlying information need, e.g. “songs by jimi hendrix”.
Shallow understanding. Answer type: music. Term annotation: songs – class, jimi hendrix – entity.
Semantic query understanding: interpreting keyword queries over a KB, e.g. SONG ∧ ∃createdBy({Jimi_Hendrix})

15 Data and problem definitions
Concept; Knowledge base; Concept Search Query; Query Understanding Problem

16 Solution overview
Step 1: Queries are first annotated with the semantic constructs from a knowledge representation language (i.e., entity, type, attribute, value, relation).
Step 2: Annotated queries → structured query templates, learned from a hand-annotated query log.
Step 3: The work of the previous paper.
Step 4:

17 Structuring keyword queries
Query Segmentation & Annotation
E.g. “songs by jimi hendrix” → Annotated Query (AQ): “songs”:type “by”:rel “jimi hendrix”:ent
Baseline Naive Bayes, estimating Pr(C|P) Max CRF

18 Structuring Annotated Queries
Query annotation alone does not describe the underlying query intention: “songs”:type “by”:rel “jimi hendrix”:ent
Structured Query Template
Semantic Summary, e.g. S(e.g.) = <type, rel, ent>. We only know that there is a type, a relation and an entity in the query. We still do not know what is being asked, and several ambiguities remain.
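As a rough illustration of going from an annotated query to a template, the sketch below computes the semantic summary and then picks the template most often paired with that summary in a labeled log. The log entries, the template strings and both function names are hypothetical. The paper learns this mapping from a hand-annotated query log, and a plain frequency count is only a stand-in for that learning step.

```python
from collections import Counter


def semantic_summary(annotated_query):
    """Keep only the labels of an annotated query, in order."""
    return tuple(label for _, label in annotated_query)


def most_frequent_template(summary, training_log):
    """Return the template seen most often with this summary, or None."""
    counts = Counter(t for s, t in training_log if s == summary)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical hand-labeled log of (summary, template) pairs.
training_log = [
    (("type", "rel", "ent"), "type AND rel(ent)"),
    (("type", "rel", "ent"), "type AND rel(ent)"),
    (("type", "rel", "ent"), "rel(type AND ent)"),
    (("ent", "attr"), "attr(ent)"),
]

aq = [("songs", "type"), ("by", "rel"), ("jimi hendrix", "ent")]
summary = semantic_summary(aq)
template = most_frequent_template(summary, training_log)
print(summary, template)
```

The same summary can map to more than one template, which is the ambiguity this slide points out.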

19 Structures to KB Interpretations
After the first two steps, we get “songs” ⋂ “by”(“jimi hendrix”), which is a keyword-based structured query!
Keywords → candidate KB items; Disambiguation Graph; Top-k KB query
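A brute-force sketch of the last step: each keyword group has a partition of candidate KB items, one candidate is chosen per partition, and the k best combinations are kept. The aggregation used here (sum of per-candidate scores plus pairwise scores) and all numbers are assumptions for illustration; the slides do not give the actual score aggregation, and the paper's algorithm is not reproduced here.

```python
import heapq
from itertools import combinations, product


def top_k_interpretations(candidates, pair_score, k=3):
    """candidates maps each keyword group to [(kb_item, score), ...]."""
    keywords = list(candidates)
    scored = []
    # Enumerate every way of picking one candidate per keyword group.
    for choice in product(*(candidates[kw] for kw in keywords)):
        items = [item for item, _ in choice]
        score = sum(s for _, s in choice)
        score += sum(pair_score(a, b) for a, b in combinations(items, 2))
        scored.append((score, dict(zip(keywords, items))))
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


# Hypothetical candidates and pairwise support values.
candidates = {
    "songs": [("SONG", 0.9), ("SINGER", 0.4)],
    "by": [("createdBy", 0.8)],
    "jimi hendrix": [("Jimi_Hendrix", 1.0)],
}
support = {("SONG", "createdBy"): 0.5}


def pair_score(a, b):
    return support.get((a, b), 0.0)


best = top_k_interpretations(candidates, pair_score, k=2)
print(best[0][1])
```

Exhaustive enumeration grows exponentially with the number of keyword groups, so it is only practical for small candidate lists.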

20 Experiment
Training data: web query log
Analysis of a web query log: keyword queries from the Yahoo WebScope program. Only entity-based queries are considered: 156 queries, annotated with structured query templates.
Most frequent template

21 Experiment
Method
Step 1: CRF & NB, …
Other comparison methods: (1) match query terms to the graph, then try to connect them; (2) build a text representation of each node in the KB and run traditional keyword search over the text representations.
Data (test): hand-crafted, 96 queries; 48 can be formulated as a query over YAGO, 48 cannot.

22 Experiment Annotation Accuracy Query Interpretation

23 Experiment Keyword Query Answering

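24 Thoughts
Every word in the query is a keyword and needs to be mapped.
“Understanding” ultimately has to be put into practice: keyword query → execute the “understanding query”.
Re-examine predicate mapping.
How to demonstrate effectiveness (choice of scenario); lack of comparison with related work.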
