Published by Handoko Irawan. Modified over 6 years ago.
1
An Empirical Study of Property Collocation on Large Scale of Knowledge Base
Saisai Gong (龚赛赛)
2
Contents
Introduction
Survey
Measure
3
Introduction
In the Semantic Web, properties are used to describe various entities.
Like word collocation in natural language, collocation also exists among these properties.
Property collocation: a combination of properties that occurs very often, and more frequently than would be expected by chance.
4
Example: Wikidata browser, Freebase browser
Property collocation is natural and common in real usage!
5
Introduction
Various applications:
  Entity browsing
  Vocabulary search and recommendation
  Ontology analysis and assessment
  …
Previous work lacks a comprehensive, dedicated investigation of property collocation. Existing studies instead cover:
  Equivalent properties
  General relatedness of concepts or vocabularies
In this paper, we present an empirical study of property collocation in a large-scale knowledge base.
6
Survey
Q1: Does property collocation exist in the ontology, and how common is it?
Q2: To what extent do people agree on property collocation?
7
Survey
31 users
Setup: select 10 popular classes each for DBpedia, Wikidata and Freebase.
Each user randomly browses 2 entities of each class and identifies collocated properties.
8
Survey: basic statistics
                                      DBpedia               Wikidata              Freebase
Entity num                            321                   316                   315
Covered property num                  18.2% (513 of 2819)   21.7% (461 of 2128)   12.1% (810 of 6679)
Covered property num with direction   832                   694                   1541
Group num                             713                   440                   644
9
Figure: Cumulative percentage of properties for Q1 (DBpedia, Wikidata, Freebase)
10
Figure: Cumulative number of groups for Q2
DBpedia: 11.7% for 3+; Wikidata: 8% for 3+; Freebase: 10% for 3+
11
Figure: Cumulative number of collocated pairs for Q2
DBpedia: 6265 pairs; Wikidata: 4758 pairs; Freebase: 5907 pairs
12
Measure 1: Statistical association
P(pi): the probability that a resource is described by property pi in some context
Phi coefficient
Symmetrical uncertainty coefficient
Jaccard coefficient
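The slide's formulas did not survive extraction, so the following is only a minimal sketch using the standard textbook definitions of the three coefficients, which may differ in detail from the ones in the paper. It assumes each resource is given as the set of properties that describe it and estimates the probabilities by counting; the function name `association` and the sample property names are hypothetical.

```python
import math

def association(resources, p, q):
    """Return (phi, symmetrical_uncertainty, jaccard) for properties p and q.

    resources: list of sets; each set holds the properties describing one resource.
    Probabilities are estimated by simple counting (an assumption of this sketch).
    """
    n = len(resources)
    # 2x2 contingency counts: both present, only p, only q, neither.
    n11 = sum(1 for r in resources if p in r and q in r)
    n10 = sum(1 for r in resources if p in r and q not in r)
    n01 = sum(1 for r in resources if p not in r and q in r)
    n00 = n - n11 - n10 - n01

    # Phi coefficient: correlation between the two binary indicators.
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    phi = (n11 * n00 - n10 * n01) / denom if denom else 0.0

    # Jaccard coefficient: co-occurrences over resources using either property.
    union = n11 + n10 + n01
    jaccard = n11 / union if union else 0.0

    # Symmetrical uncertainty: 2 * I(P;Q) / (H(P) + H(Q)).
    def entropy(counts):
        return -sum(c / n * math.log2(c / n) for c in counts if c)

    h_p = entropy([n11 + n10, n01 + n00])
    h_q = entropy([n11 + n01, n10 + n00])
    mutual_info = h_p + h_q - entropy([n11, n10, n01, n00])
    su = 2 * mutual_info / (h_p + h_q) if (h_p + h_q) else 0.0
    return phi, su, jaccard

# Hypothetical toy data: two person-like and two place-like resources.
resources = [
    {"birthDate", "birthPlace"},
    {"birthDate", "birthPlace"},
    {"population"},
    {"population"},
]
phi, su, jaccard = association(resources, "birthDate", "birthPlace")
```

In this toy data the two properties always appear together, so all three coefficients reach their maximum of 1.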
13
Measure 2: Semantic collocation
Based on domain, range and property hierarchy
dmin / rmin: minimal classes in the domain / range
SetSimd, SetSimr: similarity of the minimal class sets
HRel: relatedness based on the property hierarchy (shortest path)
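The slide does not say how the shortest path is turned into the HRel score, so the sketch below illustrates only the shortest-path part. It assumes the hierarchy is given as a child-to-parents mapping (as in `rdfs:subPropertyOf`) and is traversed in both directions. The names `shortest_path_length` and `hrel`, and the `1 / (1 + d)` conversion, are hypothetical choices for illustration, not the paper's definition.

```python
from collections import deque

def shortest_path_length(sub_property_of, a, b):
    """Breadth-first search over the property hierarchy, treated as undirected.

    sub_property_of: dict mapping a property to the list of its direct super-properties.
    Returns the number of edges between a and b, or None if they are not connected.
    """
    adjacency = {}
    for child, parents in sub_property_of.items():
        for parent in parents:
            adjacency.setdefault(child, set()).add(parent)
            adjacency.setdefault(parent, set()).add(child)
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == b:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None

def hrel(sub_property_of, a, b):
    """Hypothetical relatedness: closer in the hierarchy means more related."""
    d = shortest_path_length(sub_property_of, a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

# Hypothetical hierarchy fragment.
hierarchy = {
    "birthPlace": ["place"],
    "deathPlace": ["place"],
    "place": ["relatedTo"],
}
score = hrel(hierarchy, "birthPlace", "deathPlace")
```

Here `birthPlace` and `deathPlace` are siblings under `place`, so the path between them has two edges.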
14
Measure 3: Lexical similarity
I-Sub
Jaro-Winkler
Levenshtein similarity
WordNet
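As an illustration of one of the listed measures, here is a minimal sketch of Levenshtein similarity between two property names. The slide does not state how the edit distance is normalised. Dividing by the longer string's length, as done here, is a common convention and an assumption of this sketch. The function names are hypothetical.

```python
def levenshtein_distance(s, t):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(
                previous[j] + 1,         # delete cs
                current[j - 1] + 1,      # insert ct
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

def levenshtein_similarity(s, t):
    """Map the distance into [0, 1]; 1.0 means the strings are identical."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(s, t) / longest

# Hypothetical property labels.
similarity = levenshtein_similarity("birthPlace", "birthDate")
```

Property labels that share a long prefix, such as the two above, score fairly high even though they describe different things, which is one reason to combine lexical measures with the statistical and semantic ones.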
15
Measure evaluation
Sort collocated property pairs by the number of users who identified them.
Sort these property pairs by each measure's values.
Compute Spearman's rank correlation coefficient between the two rankings.
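The evaluation step can be sketched as follows. This is a plain-Python version of Spearman's rank correlation (Pearson correlation of the ranks, with tied values given their average rank), not the authors' evaluation code. The user counts and measure values in the example are made-up numbers for illustration only.

```python
import math

def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

# Hypothetical data: one entry per collocated property pair.
user_counts = [5, 3, 2, 1]             # users who marked the pair as collocated
measure_values = [0.9, 0.7, 0.4, 0.1]  # score of the same pair under one measure
rho = spearman(user_counts, measure_values)
```

A value of `rho` near 1 would indicate that the measure ranks property pairs in much the same order as the users did.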