Liang Zheng and Yuzhong Qu An EMD-based Similarity Measure for Multi-Type Entities by Using Type Hierarchy
Outline Introduction; Similarity Measures Based on EMD; Entity Type Weighting; Evaluation; Conclusion
Introduction Recommending entities with similar types is an important part of entity recommendation. In general, an entity is associated with a set of types in a knowledge base. Traditional similarity measures between two multi-type entities are determined by the intersection of their type collections. Several similarity measures between two types have been proposed that exploit the hierarchical structure of a domain. The weight of each type within an entity's collection of types represents that type's contribution to the similarity between the two entities.
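To make the intersection-based baseline concrete, here is a minimal sketch using the Jaccard coefficient. The slides do not say which intersection-based measure is meant, so the choice of Jaccard, the function name `jaccard`, and the type names are assumptions for illustration only.

```python
def jaccard(types_a, types_b):
    """Intersection-based similarity between two collections of types."""
    a, b = set(types_a), set(types_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty type sets
    return len(a & b) / len(a | b)


# Hypothetical type sets for two entities.
musician = {"Person", "Artist", "MusicalArtist"}
painter = {"Person", "Artist", "Painter"}
print(jaccard(musician, painter))  # 2 shared types out of 4 distinct types: 0.5
```

Note that such a measure treats every type as equally important and gives no credit to types that differ but are close in the hierarchy.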
Introduction In this study, we measure multi-type entity similarity with the earth mover's distance (EMD), which takes into account not only pairwise type similarity but also the weighting of entity types. The weighting of types is the key factor in the EMD. We also devise a PageRank-based weighting scheme that uses the type hierarchy.
Similarity Measures Based on EMD The Earth Mover's Distance (EMD) was proposed by Rubner et al. to measure the dissimilarity between two multi-dimensional distributions in a feature space. Computing the EMD amounts to solving a transportation problem.
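As a rough illustration of the transportation formulation, the sketch below computes the EMD between two weighted type sets in plain Python. It is not the authors' implementation: the function name `emd`, the solver (min-cost flow by successive shortest paths with Bellman-Ford), and the toy weights and ground distances are all assumptions. It follows the usual definition in which the minimum transport cost is divided by the total flow moved.

```python
from math import inf


def emd(weights_p, weights_q, dist, eps=1e-12):
    """EMD between two weighted signatures (illustrative sketch).

    weights_p[i] / weights_q[j] are the type weights of the two entities;
    dist[i][j] is the ground distance between type i of P and type j of Q.
    """
    m, n = len(weights_p), len(weights_q)
    source, sink = m + n, m + n + 1
    graph = [[] for _ in range(m + n + 2)]

    def add_edge(u, v, cap, cost):
        # Edge record: [target, residual capacity, cost, index of reverse edge].
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0.0, -cost, len(graph[u]) - 1])

    for i, w in enumerate(weights_p):
        add_edge(source, i, w, 0.0)
        for j in range(n):
            add_edge(i, m + j, inf, dist[i][j])
    for j, w in enumerate(weights_q):
        add_edge(m + j, sink, w, 0.0)

    remaining = min(sum(weights_p), sum(weights_q))  # total flow to move
    total_flow = total_cost = 0.0
    while remaining > eps:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        best = [inf] * len(graph)
        prev = [None] * len(graph)
        best[source] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if best[u] == inf:
                    continue
                for k, (v, cap, cost, _) in enumerate(graph[u]):
                    if cap > eps and best[u] + cost < best[v] - eps:
                        best[v] = best[u] + cost
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        if best[sink] == inf:
            break  # no augmenting path left
        # Find the bottleneck capacity along the path, then push flow.
        push, v = remaining, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total_flow += push
        total_cost += push * best[sink]
        remaining -= push
    return total_cost / total_flow if total_flow > 0 else 0.0


# Entity P has two equally weighted types; entity Q has a single type.
print(emd([0.5, 0.5], [1.0], [[0.0], [1.0]]))  # 0.5
```

For anything beyond toy inputs, a dedicated linear programming or optimal transport solver would be the more practical choice; the hand-written solver is used here only to keep the example dependency-free.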
Entity Type Weighting Statistics-based Scheme Hierarchy-based Scheme
Entity Type Weighting PageRank-based Scheme
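The slide gives no details of the scheme, so the following is only a generic sketch of how PageRank could turn a type hierarchy into type weights. Everything specific here is an assumption: the edge direction (from a general type to its more specific subtype), the damping factor, the helper names `pagerank` and `type_weights`, and the toy hierarchy.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank over a directed graph given as edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def type_weights(entity_types, rank):
    """Normalize the ranks of one entity's types so that they sum to 1."""
    total = sum(rank[t] for t in entity_types)
    return {t: rank[t] / total for t in entity_types}


# Hypothetical hierarchy; each edge points from a supertype to a subtype.
hierarchy = [
    ("Thing", "Agent"),
    ("Agent", "Person"),
    ("Person", "Artist"),
    ("Artist", "MusicalArtist"),
]
rank = pagerank(hierarchy)
print(type_weights(["Person", "Artist", "MusicalArtist"], rank))
```

With this edge direction, more specific types accumulate more rank and therefore receive larger weights; reversing the edges would favor general types instead, so the direction is a modeling decision rather than something fixed by PageRank itself.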
Evaluation We evaluate the performance of the PageRank (PR)-based type weighting scheme and the EMD-based similarity measure on real-world datasets (DBpedia and Last.fm). We use DBpedia as the knowledge base and select four entities from different popular types as our test cases. We build two "gold standards": one for the type weighting of each of the four entities, and one for the entities similar to each of the four entities. Next, we create a "ground truth" of similar-artist recommendations from Last.fm. We compare our EMD-based similarity measure with two traditional recommendation methods.
Evaluation Experimental Results for Type Weighting Schemes
Evaluation Experimental Results for Different Similarity Measures
Evaluation Experimental Results for Entity Recommendation
Conclusion We propose an EMD-based similarity measure for multi-type entities, which takes into account not only pairwise type similarity but also the weighting of types. We also devise a PageRank-based weighting scheme that uses the type hierarchy. The experimental results show that the PageRank-based weighting scheme outperforms the baseline weighting schemes and that our EMD-based similarity measure outperforms traditional similarity measures.