An Empirical Study of Property Collocation on Large Scale of Knowledge Base 龚赛赛 2016-04-18.


1 An Empirical Study of Property Collocation on Large Scale of Knowledge Base
龚赛赛

2 Contents
Introduction
Survey
Measure

3 Introduction
In the Semantic Web, properties are used to describe various entities.
Like word collocation in natural language, collocation also exists among these properties.
Property collocation: a combination of properties that occurs very often, and more frequently than would be expected by chance.

4 Example
[Screenshots: Wikidata browser, Freebase browser]
Property collocation is natural and common in real usage!

5 Introduction
Various applications
  Entity browsing
  Vocabulary search and recommendation
  Ontology analysis and assessment
Previous work lacks a comprehensive and dedicated investigation of property collocation
  Equivalent properties
  General relatedness of concepts or vocabularies
In this paper, we present an empirical study of property collocation in large-scale knowledge bases.

6 Survey
Q1: Does property collocation exist in the ontology, and how common is it?
Q2: To what extent do people agree on property collocation?

7 Survey
31 users
Setup: select 10 popular classes each for DBpedia, Wikidata and Freebase.
Each user randomly browses 2 entities of each class and identifies collocated properties.

8 Survey
Basic statistics

                                      DBpedia               Wikidata              Freebase
Entity num                            321                   316                   315
Covered property num                  513 of 2819 (18.2%)   461 of 2128 (21.7%)   810 of 6679 (12.1%)
Covered property num with direction   832                   694                   1541
Group num                             713                   440                   644

9 Cumulative percentage of properties for Q1
[Plots: DBpedia, Wikidata, Freebase]

10 Cumulative number of groups for Q2
DBpedia: 11.7% for 3+
Wikidata: 8% for 3+
Freebase: 10% for 3+

11 Cumulative number of collocated pairs for Q2
DBpedia: 6265 pairs
Wikidata: 4758 pairs
Freebase: 5907 pairs

12 Measure 1: Statistical association
P(pi): the probability that a resource is described by property pi in some context
Phi coefficient
Symmetrical uncertainty coefficient
Jaccard coefficient
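The formulas on this slide did not survive in the transcript. As a minimal sketch, the three coefficients can be computed from a 2x2 co-occurrence table of two properties using their standard textbook definitions, which may differ in detail from the ones used in the talk. The function name, the argument names and the idea of counting resources within a single context (for example, one class) are assumptions made for illustration.

```python
import math


def association_measures(n11, n10, n01, n00):
    """Return (phi, symmetrical uncertainty, Jaccard) for two properties.

    Hypothetical inputs, counted over the resources of one context:
      n11: resources described by both p_i and p_j
      n10: resources described by p_i only
      n01: resources described by p_j only
      n00: resources described by neither
    """
    n = n11 + n10 + n01 + n00
    # Marginal counts for p_i (rows) and p_j (columns).
    r1, r0 = n11 + n10, n01 + n00
    c1, c0 = n11 + n01, n10 + n00

    # Phi coefficient: correlation between the two binary indicators.
    denom = math.sqrt(r1 * r0 * c1 * c0)
    phi = (n11 * n00 - n10 * n01) / denom if denom else 0.0

    def entropy(counts):
        # Shannon entropy in bits; zero counts contribute nothing.
        return -sum((c / n) * math.log2(c / n) for c in counts if c)

    # Symmetrical uncertainty: mutual information normalised by the
    # sum of the two marginal entropies, scaled to [0, 1].
    h_i, h_j = entropy([r1, r0]), entropy([c1, c0])
    mutual_info = h_i + h_j - entropy([n11, n10, n01, n00])
    su = 2 * mutual_info / (h_i + h_j) if (h_i + h_j) else 0.0

    # Jaccard coefficient: co-occurrences over resources with either property.
    union = n11 + n10 + n01
    jaccard = n11 / union if union else 0.0
    return phi, su, jaccard


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(association_measures(40, 10, 10, 40))
```

All three values grow with co-occurrence, but only phi can be negative, which is how it signals properties that tend to avoid each other.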

13 Measure 2: Semantic collocation
Based on domain, range and property hierarchy
dmin/rmin: minimal classes in the domain/range
SetSimd, SetSimr: similarity of the minimal class sets
HRel: relatedness based on the property hierarchy (shortest path)
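The slide names shortest path as the basis of HRel but gives no formula. The sketch below shows one way such a measure could look: it finds the shortest path between two properties in a hierarchy treated as an undirected graph, then maps the distance to a score. The mapping 1 / (1 + distance), the function names and the property names in the example are all assumptions, not the definition used in the talk.

```python
from collections import deque


def shortest_path_length(edges, source, target):
    """Breadth-first search over (sub_property, super_property) edges.

    The hierarchy is treated as undirected so that sibling properties
    are connected through their common super-property.
    Returns None when no path exists.
    """
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def hierarchy_relatedness(edges, p, q):
    """Assumed scoring: 1 / (1 + distance), and 0.0 when unconnected."""
    dist = shortest_path_length(edges, p, q)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


if __name__ == "__main__":
    # Hypothetical hierarchy, for illustration only.
    hierarchy = [
        ("birthPlace", "place"),
        ("deathPlace", "place"),
        ("place", "relatedTo"),
    ]
    print(hierarchy_relatedness(hierarchy, "birthPlace", "deathPlace"))
```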

14 Measure 3: Lexical similarity
I-Sub
Jaro-Winkler
Levenshtein similarity
WordNet
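To make one of the listed lexical measures concrete, here is a minimal sketch of Levenshtein similarity applied to property names. Normalising the edit distance by the length of the longer string is one common convention and is an assumption here, since the slide does not say which normalisation was used. The property names in the example are hypothetical.

```python
def levenshtein_distance(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Assumed normalisation: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein_distance(a, b) / longest


if __name__ == "__main__":
    print(levenshtein_similarity("birthPlace", "deathPlace"))
```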

15 Measure evaluation
Sort the collocated property pairs by the number of users who identified them.
Sort the same property pairs by the value of each measure.
Compute Spearman's rank correlation coefficient between the two rankings.
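The evaluation procedure above can be sketched as follows. Spearman's coefficient is computed here as the Pearson correlation of the two rank vectors, with tied values given their average rank. Tie handling is an assumption, as the slide does not mention it, and the sample numbers are invented for illustration.

```python
import math


def average_ranks(values):
    """Rank values in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one ranking is constant
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    # One entry per property pair (made-up numbers).
    user_counts = [5, 3, 3, 1]             # users who marked the pair
    measure_values = [0.9, 0.4, 0.5, 0.1]  # value of one measure for the pair
    print(spearman(user_counts, measure_values))
```

A coefficient close to 1 would mean the measure orders property pairs much as the users did.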

