An Interactive Approach to Collectively Resolving URI Coreference

Saisai Gong, Wei Hu, Gong Cheng, Yuzhong Qu

Contents: Background; Related Work; Overview of our Approach; Evolvement of Individual Partition; Computing Consensus Partition; Evaluation; Conclusion

Background: owl:sameAs, URI coreference ……

Related Work: Fully automatic approaches
Techniques: OWL semantics, similarities between descriptions, self-training, …
Automatic approaches remain far from perfect (see Ferrara et al.).

Related Work (Cont.): Semi-automatic approaches
Techniques: active learning, micro-task crowdsourcing, …
Assumptions made by semi-automatic approaches: users act as an "oracle"; there is one single right answer.
These assumptions do not always hold: users may have different opinions, and disagreement among users happens.
Hence the need to distinguish a user's individual URI coreference from that of the mass, and to resolve disagreement among users.

Our Approach: iReC
iReC: an interactive approach to collectively resolving URI coreference with user involvement.
Basic idea: achieve a good partition of the URI universe. Maintain an individual partition for each user; form a consensus partition aggregated from the individual ones; evolve the partitions through user interaction.
Two goals: alleviate user involvement; reflect the collective power of the masses.

Overview of our Approach

Candidate Selector: Generating Candidates
Find potential coreference from various sources: owl:sameAs links; existing resolution services such as sameas.org; keyword-based entity search engines such as Falcons Object Search; the user's individual partition; the consensus partition.
Merge URIs belonging to the same equivalence class into a candidate entity.
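The merging step can be illustrated with a union-find structure. This is a minimal sketch, not the authors' implementation: the function name `merge_into_candidates` and the representation of coreference evidence as plain URI pairs are assumptions made for illustration.

```python
from collections import defaultdict


def merge_into_candidates(uris, coreferent_pairs):
    """Group URIs into candidate entities (equivalence classes).

    `coreferent_pairs` stands for pairs gathered from any source
    (owl:sameAs links, resolution services, partitions); hypothetical input.
    """
    parent = {u: u for u in uris}

    def find(u):
        # Follow parent links to the class representative (path halving).
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in coreferent_pairs:
        # URIs seen only in a pair are added to the universe on the fly.
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    classes = defaultdict(set)
    for u in parent:
        classes[find(u)].add(u)
    return [frozenset(c) for c in classes.values()]
```

Each returned frozenset plays the role of one candidate entity; URIs with no coreference evidence end up as singleton candidates.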

Learning Binary Classifier
Purpose: to reduce user involvement.
Learning model: averaged perceptron (see Collins 02), an online learning algorithm.
The individual classifier is learned both online and offline; the global classifier is learned offline.
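A rough sketch of an averaged perceptron over sparse feature vectors follows. It assumes labels of +1 (coreferent) and -1 (non-coreferent) and feature vectors stored as dictionaries; the class and method names are hypothetical, and the averaging is done naively for clarity rather than efficiency.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Binary averaged perceptron; labels are +1 or -1 (an assumption)."""

    def __init__(self):
        self.w = defaultdict(float)      # current weight vector
        self.total = defaultdict(float)  # running sum of weight vectors
        self.steps = 0                   # number of examples seen

    def margin(self, features, weights=None):
        # Dot product of the weights with a sparse feature dict.
        w = self.w if weights is None else weights
        return sum(w.get(f, 0.0) * v for f, v in features.items())

    def update(self, features, label):
        # Standard perceptron rule: change weights only on a mistake.
        if label * self.margin(features) <= 0:
            for f, v in features.items():
                self.w[f] += label * v
        # Accumulate the weight vector after every example for averaging.
        for f, v in self.w.items():
            self.total[f] += v
        self.steps += 1

    def averaged_weights(self):
        if self.steps == 0:
            return {}
        return {f: t / self.steps for f, t in self.total.items()}
```

Because `update` consumes one example at a time, the same class can be fed the latest feedback pair online or a stored batch offline.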

Training data
Online: the latest URI pairs from user feedback.
Offline training examples. Positive: URI pairs from equivalence classes. Negative: URI pairs from user feedback; URI pairs from different equivalence classes sharing types; URI pairs from Falcons search results.

Training algorithm
Feature: the Cartesian product of the two candidates' properties.
Feature value: for each property pair, the maximum similarity (vsim) over the values of the two properties.
URIs: vsim = 1 iff identical or in the same equivalence class.
Numeric literals: vsim = 1 iff the difference is less than a threshold.
Boolean literals: vsim = 1 iff the values are equal.
Other literals: Jaccard similarity.
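The feature computation above can be sketched as follows. Several details are assumptions not stated on the slide: values are represented as tagged tuples such as `("uri", ...)`, `("num", ...)`, `("bool", ...)`, `("lit", ...)`; values of different kinds get similarity 0; Jaccard similarity is taken over lowercase whitespace tokens; and the default numeric threshold is arbitrary.

```python
def jaccard(a, b):
    # Token-level Jaccard similarity (tokenization is an assumption).
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def vsim(v1, v2, same_class=None, threshold=1e-6):
    """Similarity of two tagged values (kind, value)."""
    kind1, x1 = v1
    kind2, x2 = v2
    if kind1 != kind2:
        return 0.0  # assumption: mixed kinds are never similar
    if kind1 == "uri":
        same = x1 == x2 or (same_class is not None and same_class(x1, x2))
        return 1.0 if same else 0.0
    if kind1 == "num":
        return 1.0 if abs(x1 - x2) < threshold else 0.0
    if kind1 == "bool":
        return 1.0 if x1 == x2 else 0.0
    return jaccard(str(x1), str(x2))


def feature_vector(desc1, desc2, **kwargs):
    """Map every property pair (p1, p2) to the maximum value similarity.

    Each description is a dict: property -> list of tagged values.
    """
    feats = {}
    for p1, vals1 in desc1.items():
        for p2, vals2 in desc2.items():
            feats[(p1, p2)] = max(
                (vsim(a, b, **kwargs) for a in vals1 for b in vals2),
                default=0.0,
            )
    return feats
```

The optional `same_class` callback stands in for a lookup against the current equivalence classes.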

Training algorithm

Selecting Most Beneficial Candidate
Combine the individual classifier and the global one using weights α and β (α + β = 1).
Confidence of coreference is based on the margin: the larger the absolute value of the margin, the higher the confidence.
Uncertainty is measured by the absolute value of the margin (the smaller the value, the more uncertain), so select the candidate with the minimum absolute value of the margin.
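The selection rule can be illustrated with a small sketch. It assumes both classifiers are linear with weights stored as dictionaries, and that the combination is a weighted sum of their margins with β = 1 - α; the function names and the default α = 0.5 are hypothetical.

```python
def combined_margin(features, individual_w, global_w, alpha=0.5):
    """Margin of the combined classifier: alpha * individual + beta * global."""
    beta = 1.0 - alpha  # enforces alpha + beta = 1

    def dot(weights):
        return sum(weights.get(f, 0.0) * v for f, v in features.items())

    return alpha * dot(individual_w) + beta * dot(global_w)


def select_most_uncertain(candidates, individual_w, global_w, alpha=0.5):
    """Return the id of the candidate with the smallest absolute margin.

    `candidates` maps a candidate id to its sparse feature dict.
    """
    return min(
        candidates,
        key=lambda c: abs(
            combined_margin(candidates[c], individual_w, global_w, alpha)
        ),
    )
```

Asking the user about the candidate closest to the decision boundary is the usual uncertainty-sampling choice, which is how the minimum-margin rule is read here.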

Comparative Snippets: to facilitate user interaction
Coreferent (resp. non-coreferent): the values of discriminative property pairs are significantly similar (resp. dissimilar).
Discriminability of a property pair: the absolute value of its weight in the combined classifier.

Comparative Snippets
Compute a maximum weighted matching on the bipartite graph formed from property pairs.
Get the top-k property value pairs based on the maximum similarity of property values.

Computing Consensus Partition
Minimize disagreements between individual partitions.
In our approach, disagreement is measured with the symmetric difference distance.
This optimization problem is NP-complete.
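As an illustration of the distance, the sketch below counts the URI pairs that are placed in the same class by exactly one of two partitions. Partitions are assumed to be lists of sets, and a pair missing from one partition's domain is treated as not co-clustered there, which is an assumption of this sketch.

```python
from itertools import combinations


def co_clustered_pairs(partition):
    """All unordered URI pairs that share a class in the partition."""
    pairs = set()
    for cls in partition:
        for a, b in combinations(sorted(cls), 2):
            pairs.add((a, b))
    return pairs


def symmetric_difference_distance(p1, p2):
    """Number of pairs co-clustered in exactly one of the two partitions."""
    return len(co_clustered_pairs(p1) ^ co_clustered_pairs(p2))


def total_disagreement(consensus, individual_partitions):
    """Objective to minimize: summed distance to all individual partitions."""
    return sum(
        symmetric_difference_distance(consensus, p)
        for p in individual_partitions
    )
```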

Approximation algorithm: clustering-based
Compute a partition on the union of the individual partitions' domains:
first, initialize a similarity matrix Mtrx = (m_ij);
begin with each URI forming an equivalence class by itself;
for each class pair (i, j) where m_ij > 0, merge classes i and j together and update Mtrx.
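One way to read the clustering steps above is the greedy agglomerative sketch below. The slide does not define the entries of the similarity matrix, so the score used here is an assumption: for two URIs, the number of individual partitions that put them together minus the number that separate them, with class-level similarity taken as the sum over member pairs. Merging the highest-scoring pair first is likewise a choice made for this sketch.

```python
from itertools import combinations


def consensus_partition(individual_partitions):
    """Greedy clustering over the union of the partitions' domains."""
    domain = sorted({u for p in individual_partitions for cls in p for u in cls})
    # For each individual partition, map every URI to its class index.
    lookups = [
        {u: k for k, cls in enumerate(p) for u in cls}
        for p in individual_partitions
    ]

    def uri_sim(a, b):
        # Assumed score: +1 per partition joining a and b, -1 per partition
        # separating them; partitions missing either URI do not vote.
        score = 0
        for lk in lookups:
            if a in lk and b in lk:
                score += 1 if lk[a] == lk[b] else -1
        return score

    classes = [{u} for u in domain]  # start from singleton classes
    while True:
        best, best_pair = 0, None
        for i, j in combinations(range(len(classes)), 2):
            s = sum(uri_sim(a, b) for a in classes[i] for b in classes[j])
            if s > best:
                best, best_pair = s, (i, j)
        if best_pair is None:  # no class pair with positive similarity left
            return [frozenset(c) for c in classes]
        i, j = best_pair
        classes[i] |= classes[j]  # merge j into i (i < j, so i stays valid)
        del classes[j]
```

Recomputing class similarities on every pass keeps the sketch short; an implementation closer to the slide would update the matrix incrementally after each merge.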

Evaluation: build links between NYT and DBpedia from the OAEI benchmark
10-fold cross-validation

Evaluation F-Measure

Evaluation: Examination
Choose 50 popular URIs from Falcons.
Invite 10 people to resolve URI coreference on the 50 URIs using SView.
On average: … verifications, 32.0 accepted as positive, 53.9 pairs of URIs in individual partitions.

Evaluation: User study
SUS vs. sigma: 72 vs. 68

Conclusion: the averaged perceptron is feasible; user involvement is reduced.

Reference
A. Ferrara, A. Nikolov, J. Noessner, and F. Scharffe. Evaluation of instance matching tools: The experience of OAEI. Journal of Web Semantics, 21:49-60, 2013.
M. Collins. Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proc. of EMNLP, pages 1-8, 2002.
