Published by Byron Peters. Modified over 6 years ago.
1
An Interactive Approach to Collectively Resolving URI Coreference
Saisai Gong, Wei Hu, Gong Cheng, Yuzhong Qu
2
Contents
Background
Related Work
Overview of our Approach
Evolvement of Individual Partition
Computing Consensus Partition
Evaluation
Conclusion
3
Background
owl:sameAs, URI coreference ……
4
Related Work: fully automatic approaches
Based on OWL semantics, similarities between descriptions, self-training, …
Automatic approaches remain far from perfect (see Ferrara et al. 2013).
5
Related Work (Cont.): semi-automatic approaches
Active learning, micro-task crowdsourcing, …
Assumptions made by semi-automatic approaches: users act as an "oracle"; there is one single right answer.
These assumptions do not always hold: users may have different opinions, so disagreement among users happens.
What is needed: distinguish a user's individual URI coreference from that of the mass; resolve disagreement among users.
6
Our Approach: iReC
iReC: an interactive approach to collectively resolving URI coreference with user involvement
Basic idea: achieve a good partition of the URI universe
Maintain an individual partition for each user
Form a consensus partition aggregated from the individual ones
Evolve the partitions through user interaction
Two goals: reduce the user involvement required; reflect the collective power of the masses
7
Overview of our Approach
8
Candidate Selector: Generating Candidates
Find potential coreference from various sources:
owl:sameAs links
existing resolution services such as sameas.org
keyword-based entity search engines such as Falcons Object Search
the user's individual partition
the consensus partition
Merge URIs belonging to the same equivalence class into a candidate entity
9
Learning Binary Classifier
Purpose: reduce user involvement
Learning model: averaged perceptron (see Collins 2002), an online learning algorithm
Learn the individual classifier both online and offline; learn the global one offline
10
Learning Binary Classifier: training data
Online: the latest URI pairs from user feedback
Offline training examples:
Positive: URI pairs from equivalence classes
Negative: URI pairs from user feedback; URI pairs from different equivalence classes sharing types; URI pairs from Falcons search results
11
Learning Binary Classifier: training algorithm
Features: the Cartesian product of the two candidates' properties
Feature value: for each property pair, the maximum similarity (vsim) over the values of the two properties
URIs: vsim = 1 iff identical or in the same equivalence class
Numeric literals: vsim = 1 iff the difference is less than a threshold
Boolean literals: vsim = 1 iff the values are equal
Other literals: Jaccard similarity
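The value-similarity rules on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the `URI` wrapper, the `coreferent` callback, the default `threshold`, and the choice of whitespace tokens for the Jaccard similarity are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    """Hypothetical wrapper that distinguishes URIs from string literals."""
    value: str


def jaccard(a, b):
    # Assumption: Jaccard similarity over lowercase whitespace tokens.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def vsim(v1, v2, coreferent=None, threshold=1e-6):
    """Similarity of two property values, following the slide's four cases."""
    if isinstance(v1, URI) and isinstance(v2, URI):
        same = v1 == v2 or (coreferent is not None and coreferent(v1, v2))
        return 1.0 if same else 0.0
    # bool is a subclass of int in Python, so test booleans before numbers.
    if isinstance(v1, bool) and isinstance(v2, bool):
        return 1.0 if v1 == v2 else 0.0
    if isinstance(v1, bool) or isinstance(v2, bool):
        return 0.0
    if isinstance(v1, (int, float)) and isinstance(v2, (int, float)):
        return 1.0 if abs(v1 - v2) < threshold else 0.0
    if isinstance(v1, str) and isinstance(v2, str):
        return jaccard(v1, v2)
    return 0.0  # assumption: values of different kinds never match


def feature_value(values1, values2, **kwargs):
    """Feature value for one property pair: maximum vsim over all value pairs."""
    return max((vsim(a, b, **kwargs) for a in values1 for b in values2),
               default=0.0)
```

For example, `feature_value(["New York Times"], ["NYT", "The New York Times"])` compares both value pairs and keeps the larger Jaccard score.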
12
Learning Binary Classifier
Training algorithm
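The training algorithm itself is not reproduced in this transcript. As a reference point only, the textbook averaged perceptron for binary classification can be sketched as below; the function names, the `epochs` parameter, and the use of dense feature vectors are assumptions, and the authors' variant may differ.

```python
def margin(w, x):
    """Signed score of feature vector x under weight vector w."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_averaged_perceptron(examples, n_features, epochs=5):
    """examples: list of (x, y) with x a list of floats and y in {+1, -1}.

    Returns the average of the weight vector over all training steps,
    which is what distinguishes the averaged from the plain perceptron.
    """
    w = [0.0] * n_features
    total = [0.0] * n_features
    steps = 0
    for _ in range(epochs):
        for x, y in examples:
            # Standard mistake-driven update: change w only on an error.
            if y * margin(w, x) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
            total = [ti + wi for ti, wi in zip(total, w)]
            steps += 1
    return [ti / steps for ti in total] if steps else w
```

Because each update touches a single example, the same loop body can be applied online to the latest user feedback or offline over a stored training set.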
13
Selecting Most Beneficial Candidate
Combine the individual classifier and the global one by their weights α and β (α + β = 1)
Confidence of coreference is based on the margin: the larger the absolute value of the margin, the higher the confidence
Uncertainty is therefore highest where the absolute value of the margin is smallest
Select the candidate with the minimum absolute value of margin
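The selection rule on this slide can be illustrated with a short sketch. It assumes that α weights the individual classifier and β = 1 − α the global one, and that candidates are given as a dictionary from a candidate name to its feature vector; these names and the data layout are hypothetical.

```python
def combined_margin(x, w_individual, w_global, alpha=0.5):
    """Margin of the combined classifier: alpha * individual + beta * global."""
    beta = 1.0 - alpha  # enforces alpha + beta = 1

    def dot(w):
        return sum(wi * xi for wi, xi in zip(w, x))

    return alpha * dot(w_individual) + beta * dot(w_global)


def most_beneficial_candidate(candidates, w_individual, w_global, alpha=0.5):
    """Pick the candidate whose combined margin is closest to zero.

    A small |margin| means low confidence, so asking the user about this
    candidate is expected to be the most informative.
    """
    return min(
        candidates,
        key=lambda name: abs(
            combined_margin(candidates[name], w_individual, w_global, alpha)
        ),
    )
```

With weights `[1.0, -1.0]` for both classifiers, a candidate with features `[0.5, 0.4]` has a margin near 0.1 and would be chosen over candidates with margins of 1 or -1.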
14
Comparative Snippets
Purpose: facilitate user interaction
Coreferent (resp. non-coreferent): values of discriminative property pairs are significantly similar (resp. dissimilar)
Discriminability of a property pair: the absolute value of its weight in the combined classifier
15
Comparative Snippets
Compute a maximum weighted matching on the bipartite graph formed from property pairs
Get the top-k property value pairs based on the maximum similarity of property values
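The two steps on this slide can be sketched with a brute-force matching, which is adequate for the handful of properties shown in a snippet. The slide does not say how edge weights are defined, so this sketch assumes a given matrix of non-negative weights (`weights[i][j]` for property i of one candidate and property j of the other) and ranks the matched pairs by that same matrix.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Return the (i, j) pairs of a maximum weighted bipartite matching.

    Brute force over assignments; assumes non-negative weights so that
    matching every node on the smaller side is optimal.
    """
    if not weights:
        return []
    rows, cols = len(weights), len(weights[0])
    if rows > cols:
        # Transpose so the smaller side is always enumerated first.
        transposed = [list(col) for col in zip(*weights)]
        return sorted((i, j) for j, i in max_weight_matching(transposed))
    best = max(
        permutations(range(cols), rows),
        key=lambda perm: sum(weights[i][j] for i, j in enumerate(perm)),
    )
    return list(enumerate(best))


def top_k_pairs(weights, k):
    """Keep the k matched pairs with the largest weights."""
    matched = max_weight_matching(weights)
    matched.sort(key=lambda ij: weights[ij[0]][ij[1]], reverse=True)
    return matched[:k]
```

For larger property sets a polynomial-time assignment solver would be the usual replacement for the permutation search.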
16
Computing Consensus Partition
Goal: a consensus that minimizes disagreements between individual partitions
In our approach, disagreement is measured with the symmetric difference distance
Finding the optimal consensus is NP-complete
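The symmetric difference distance between two partitions counts the element pairs that one partition places in the same class and the other does not. A minimal sketch, assuming each partition is represented as a list of sets of URI strings:

```python
from itertools import combinations


def coclustered_pairs(partition):
    """All unordered pairs of elements that share a class in the partition."""
    pairs = set()
    for cls in partition:
        for a, b in combinations(sorted(cls), 2):
            pairs.add((a, b))
    return pairs


def symmetric_difference_distance(p1, p2):
    """Number of pairs grouped together by exactly one of the two partitions."""
    return len(coclustered_pairs(p1) ^ coclustered_pairs(p2))
```

For instance, `[{"a", "b"}, {"c"}]` and `[{"a", "b", "c"}]` differ on the pairs (a, c) and (b, c), so their distance is 2.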
17
Computing Consensus Partition
Approximation algorithm (clustering-based)
Compute a partition on the union of the individual partitions' domains
First, initialize a similarity matrix Mtrx over class pairs (i, j)
Begin with each URI forming an equivalence class by itself
For each class pair (i, j) whose entry in Mtrx is > 0, merge classes i and j, and update Mtrx
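The clustering-based approximation can be sketched as a simple agglomerative procedure. The slide does not define the entries of Mtrx, so this sketch makes an assumption: the similarity of two URIs is the number of users who place them together minus the number who separate them (counting only users whose partitions contain both), and the similarity of two classes is the sum over their cross pairs. For brevity it recomputes similarities on each pass rather than updating a stored matrix.

```python
from itertools import combinations


def consensus_partition(partitions):
    """partitions: list of individual partitions, each a list of sets of URIs."""
    # For each user, map every URI to the index of its class.
    lookups = [{u: k for k, cls in enumerate(p) for u in cls} for p in partitions]
    universe = sorted(set().union(*(lk.keys() for lk in lookups)))

    def uri_sim(a, b):
        # Assumed vote: +1 if a user groups a and b, -1 if the user separates them.
        score = 0
        for lk in lookups:
            if a in lk and b in lk:
                score += 1 if lk[a] == lk[b] else -1
        return score

    def class_sim(x, y):
        return sum(uri_sim(a, b) for a in x for b in y)

    # Begin with each URI forming an equivalence class by itself.
    classes = [{u} for u in universe]
    while True:
        best, best_pair = 0, None
        for i, j in combinations(range(len(classes)), 2):
            s = class_sim(classes[i], classes[j])
            if s > best:
                best, best_pair = s, (i, j)
        if best_pair is None:  # no class pair with similarity > 0 remains
            return classes
        i, j = best_pair
        classes[i] |= classes[j]
        del classes[j]
```

Merging the most similar pair first is one possible ordering; the slide only states that pairs with a positive entry are merged.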
18
Computing Consensus Partition
19
Evaluation
Task: build links between NYT and DBpedia from the OAEI benchmark
10-fold cross-validation
20
Evaluation F-Measure
21
Evaluation: examination
Choose 50 popular URIs from Falcons
Invite 10 people to resolve URI coreference on the 50 URIs using SView
On average: … verifications, of which 32.0 accepted as positive; 53.9 pairs of URIs in individual partitions
22
Evaluation: user study
SUS score, ours vs. sigma: 72 vs. 68
23
Conclusion
The averaged perceptron is feasible
User involvement is reduced
24
References
A. Ferrara, A. Nikolov, J. Noessner, and F. Scharffe. Evaluation of instance matching tools: the experience of OAEI. Journal of Web Semantics, 21:49-60, 2013.
M. Collins. Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms. In Proc. of EMNLP, pages 1-8, 2002.