1
Extracting Semantic Relationships between Wikipedia Categories
By Sergey Chernov, Tereza Iofciu, Wolfgang Nejdl, Xuan Zhou, Michal Kopycki, Przemyslaw Rys
01/06/15, Sergey Chernov
2
Preliminaries
Wikipedia: largest knowledge sharing system
Many pages assigned to categories
All links are navigational
Can we extract semantic links?
Section: Motivation
3
Wikipedia Categories Example
Section: Motivation
4
Possible benefits
Semi-structured queries: “find Countries which had Democratic Non-Violent Revolutions” rephrased as “find a page from category Countries which is connected to some page in Non-Violent Revolutions”
Hints for authors: “you are editing a page from category Countries; do you want to add a link to a page in category Capital?”
Raw data for manual semantic markup
Section: Motivation
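The rephrased query can be sketched as a lookup over category membership and page links. This is a minimal illustration, not the authors' implementation: the page names, the `categories` and `links` dictionaries, and the function `pages_connected_to_category` are all hypothetical, and a page counts as connected if a link exists in either direction.

```python
# Toy data; every page name and link here is a made-up placeholder.
categories = {
    "Countries": {"CountryA", "CountryB", "CountryC"},
    "Non-Violent Revolutions": {"RevolutionX", "RevolutionY"},
}
links = {  # page -> set of pages it links to
    "CountryA": {"RevolutionX"},
    "RevolutionY": {"CountryC"},
    "CountryB": set(),
}

def pages_connected_to_category(source_cat, target_cat, categories, links):
    """Return pages in source_cat linked (in either direction) to any page in target_cat."""
    targets = categories[target_cat]
    result = set()
    for page in categories[source_cat]:
        # Links going out from the source page into the target category.
        outgoing = links.get(page, set()) & targets
        # Links coming in from target-category pages to the source page.
        incoming = {t for t in targets if page in links.get(t, set())}
        if outgoing or incoming:
            result.add(page)
    return result

matches = pages_connected_to_category(
    "Countries", "Non-Violent Revolutions", categories, links
)
print(sorted(matches))  # ['CountryA', 'CountryC']
```

The same function could back the author hint: a page in Countries that is missing from the result is a candidate for a suggested link.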
5
Heuristics
[Figure: category Countries (Denmark, Austria, Germany, France) linked to category Capitals (Berlin, Stockholm, Vienna, Paris)]
Number of Links: NL = 3
Connectivity Ratio: CR = 3/4 = 0.75
Section: Experiments
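The two heuristics can be reproduced on the slide's example with a few lines of Python. This is a sketch under stated assumptions: the slide gives only the totals, so the three individual links below are assumed, and CR is taken to be NL divided by the number of pages in the source category, which matches 3/4 for the four pages in Countries. The function names are hypothetical.

```python
def number_of_links(source_pages, target_pages, links):
    """Count links that go from a page in source_pages to a page in target_pages."""
    return sum(len(links.get(page, set()) & target_pages) for page in source_pages)

def connectivity_ratio(source_pages, target_pages, links):
    """NL divided by the number of pages in the source category (assumed definition)."""
    return number_of_links(source_pages, target_pages, links) / len(source_pages)

countries = {"Denmark", "Austria", "Germany", "France"}
capitals = {"Berlin", "Stockholm", "Vienna", "Paris"}

# Assumed links; the arrows of the original figure are not available.
links = {
    "Germany": {"Berlin"},
    "Austria": {"Vienna"},
    "France": {"Paris"},
}

nl = number_of_links(countries, capitals, links)
cr = connectivity_ratio(countries, capitals, links)
print(nl, cr)  # 3 0.75
```

Normalizing by category size keeps a large category from ranking high merely because it contains many pages.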
6
Dataset
INEX 2006 collection
Sample category rankings
Section: Experiments
7
Manual assessment methodology
Semantic Connection Strength (SCS) measure:
  2 = strong semantic relationship
  1 = average semantic relationship
  0 = weak or no semantic relationship
Instruction for assessors:
  “category A is strongly related to category B (value 2) if you believe that every page in A should conceptually have at least one semantic link to B;”
  “A and B are averagely related (value 1), if you believe 50% of pages in A should have semantic links to B;”
  “otherwise, A and B are weakly related (value 0).”
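Scores on this three-point scale can be averaged over a set of assessed category pairs. The sketch below is illustrative only: the function `average_scs` and the score list are hypothetical, and the values are not results from the study.

```python
from statistics import mean

# The three SCS values defined for assessors.
SCS_LABELS = {2: "strong", 1: "average", 0: "weak or none"}

def average_scs(scores):
    """Mean SCS over assessor scores; rejects values outside the 0-2 scale."""
    for score in scores:
        if score not in SCS_LABELS:
            raise ValueError(f"invalid SCS value: {score!r}")
    return mean(scores)

# Hypothetical assessments for five category pairs.
hypothetical_scores = [2, 1, 0, 2, 1]
print(average_scs(hypothetical_scores))  # 1.2
```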
8
Experiments with Number of Links
[Figure: average semantic connection strength (SCS) for 100 sample categories, extracted using Number of Links]
Section: Experiments
9
Experiments with Connectivity Ratio
[Figure: average semantic connection strength (SCS) for 100 sample categories, extracted using Connectivity Ratio]
Section: Experiments
10
General Results and Conclusions
Results are skewed toward the Countries category
Connectivity Ratio is a better measure than Number of Links
Inlinks were observed to perform better than outlinks
Section: Summary
11
Future Steps
More manual exploration, look for additional heuristics
Consider more categories
SCS composed of:
  Is this a “part of” relation? W1
  Is this an “is a” relation? W2
  Is this a “synonym” relation? W3
  Is this an “antonym” relation? W4
  Is it related in a different way? Which one? W5
Section: Summary
12
Thank You!