Download presentation
Presentation is loading. Please wait.
Published bySolomon Black Modified over 9 years ago
1
Attribute Extraction and Scoring: A Probabilistic Approach Taesung Lee, Zhongyuan Wang, Haixun Wang, Seung-won Hwang Microsoft Research Asia Speaker: Bo Xu
2
Background Goal of Creating Knowledge Base Knowledge base is used to provide this background knowledge to machines Enable machines to perform certain types of inferences as humans do Three backbone of the knowledge base IsA: a relationship between a sub-concept and a concept e.g., an IT company is a company isInstanceOf: a relationship between an entity and a concept e.g., Microsoft is a company isPropertyOf: a relationship between an attribute and a concept e.g., color is a property/attribute of wine
3
Attributes(isPropertyOf) play a critical role in knowledge inference Knowledge Inference Given some attributes, guess which concept they described E.g. “capital city, population”, people may infer that it relates to country Given “color, body, smell,” people may think of wine However, in many cases, the association may not be straightforward, and may require some guesswork even for humans
4
However, in order to perform inference, knowing that a concept can be described by a certain set of attributes is not enough. Instead, we must know how important, or how typical each attribute is for the concept Typically P(c|a) Denotes how typical concept c is, given attribute a P(a|c) Denotes how typical attribute a is, given concept c
5
Problem Definition Goal
6
Framework
7
Attribute extraction from the web corpus Concept-based extraction(CB) E.g., “... the acidity of a wine is an essential component of the wine...” Instance-based extraction(IB) E.g., “the taste of Bordeaux is...” (c, a, n(c, a)) tuples (i, a, n(i, a)) tuples
8
Attribute extraction from an external knowledgebase Extract instance-based attributes from DBpedia entities(KB) use n(i, a) = 1 to produce (i, a, n(i, a)) tuples
9
Attribute extraction from the query logs Extract instance-based attributes from query logs (QB) we cannot expect concept-based attributes from query logs as people usually have interest in a more specific instance rather than a general concept Two-pass approach First pass Extract candidate instance-attribute pairs (i, a) from the pattern “the a of (the/a/an) i” extend the candidate list IU to include the attributes obtained from IB and KB Second pass count the co-occurrence n(i, a) of i and a in the query log for each (I, a) ∈ IU to produce the (i, a, n(i, a)) list
10
Attribute Distribution e.g., the attribute name is frequently observed from the CB pattern, such as “the name of a state is...”, but people will not mention it with the specific state, as in “the name of Washington is...”.
11
Attribute Scoring Computing Typicality from a CB List Computing Typicality from an IB List
12
Computing Naive P(a|i, c) and P(i|c) using Probase A simplifying assumption one instance belongs to one concept P(c|i) = 1 if this concept-instance is observed in Probase and 0 otherwise
13
However, in reality, the same instance name appears with many concepts [C1] ambiguous instance related to dissimilar concepts ‘Washington’ can refer to a president and a state [C2] unambiguous instance related to similar concepts ‘George Washington’ is a president, patriot, and historical figure
14
Unbiasing P(a|i, c) JC(a, c) (join count) defined as the number of instances in c that has attribute join ratio (JR) representing how likely a is associated with c
15
Unbiasing P(c|i) Probase frequency counts n p (c, i) However, using this equation would not distinguish C2 from C1 sim(c, c′) comparing the instance sets of the two concepts using Jaccard Similarity
16
Unbiasing P(c|i) Probase frequency counts n p (c, i) However, using this equation would not distinguish C2 from C1 sim(c, c′) comparing the instance sets of the two concepts using Jaccard Similarity
17
Typicality Score Aggregation P(a|c) One CB list from web documents Three IB lists from web documents, query log, and knowledge base no single source can be most reliable for all concepts Formulate P(a|c) The goal is to learn weights for the given concept
18
Learn the weights Ranking SVM approach pairwise learning VS. regression Since training data does not have to quantify absolute typicality score
19
Pairwise Learning Given x1>x2>x3 Binary Classification Positive example: x1-x2, x1-x3, x2-x3 Negative example: x2-x1, x3-x1, x3-x2
20
Features linear combination of the five features
21
Attribute Synonym Set Leverage the evidence from Wikipedia Wikipedia Redirects Wikipedia Internal Links we take each connected component as a synonymous attribute cluster Within each cluster, we set the most frequent attribute as a representative attribute of the synonymous attributes
22
Experiment——Experimental Settings
23
Experiment——Evaluation Measures and Baseline The number of labeled attributes for evaluation is 4846 attributes from 12 concepts evaluation measures relevance score to the four groups as 1, 2/3, 1/3, and 0, respectively absolute value is hard to acquire, approximate this value consider a set of attributes labeled as the universe
24
Precision
25
Recall
26
Thank You!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.