Ensemble Solutions for Link-Prediction in Knowledge Graphs Denis Krompaß1,2 and Volker Tresp1,2 1 Department of Computer Science, Ludwig Maximilian University, 2 Corporate Technology, Siemens AG 12.09.2015
Outline Knowledge Graphs, what are they and what are they good for? Representation Learning in Knowledge Graphs State of the Art Latent Variable Models Integrating Prior Knowledge about Relation-Types Analyzing the Complementary “Potential” of State of the Art Representation Learning Algorithms
Knowledge Graphs Store facts about the world as relations between entities. Entities are no longer just strings but real-world objects with attributes, taxonomic information, and relations to other objects. (AlbertEinstein, bornIn, Ulm) Providing a machine with semantic information: Search engines Information retrieval Word-sense disambiguation … Prominent examples: Google Knowledge Graph IBM Watson
Learning in Knowledge Graphs [Figure: a knowledge-graph triple such as (AlbertEinstein, bornIn, Ulm) is mapped by a latent variable model to low-dimensional embedding vectors for the entities and the relation-type.] Latent representations (or embeddings) for entities and relation-types disentangle the complex relationships observed in the data (semantics). Applications: Link-Prediction Link-based Clustering Disambiguation
State of the Art Latent Variable Models RESCAL: Third-Order Tensor Factorization Method, Least-Squares Cost Function TransE: Distance-Based Method, Ranking Cost Function Google Knowledge Vault: Multi-way Neural Network (mwNN), Logistic Cost Function Problem: Large knowledge graphs contain millions of entities and thousands of relation-types. Low-dimensional representations have to be learned. Goal: Find ways to increase prediction quality under this constraint.
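The scoring functions of two of the models above can be sketched in a few lines. This is a minimal illustration with randomly initialized toy embeddings (the names `e_s`, `e_o`, `r_vec`, `R_mat` and the dimensionality are assumptions for the sketch, not the trained models from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10  # low-dimensional embeddings, as emphasized in the talk

# Toy embeddings for one (subject, predicate, object) triple.
e_s, e_o = rng.normal(size=d), rng.normal(size=d)   # entity embeddings
r_vec = rng.normal(size=d)                          # TransE relation vector
R_mat = rng.normal(size=(d, d))                     # RESCAL relation matrix

def score_transe(e_s, r_vec, e_o):
    """TransE (distance-based): a small distance ||e_s + r - e_o||
    means the triple is likely, so we negate it to get a score."""
    return -np.linalg.norm(e_s + r_vec - e_o)

def score_rescal(e_s, R_mat, e_o):
    """RESCAL (tensor factorization): bilinear score e_s^T R e_o."""
    return float(e_s @ R_mat @ e_o)
```

A triple whose object embedding equals `e_s + r_vec` gets the maximal TransE score of zero; mwNN would instead feed the concatenated embeddings through a feed-forward network with a logistic output.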
Prior Knowledge about Relation-Types Type-Constraints (from the schema) Local closed-world assumption (from the data) Domain and range constraints for relation-types Integrated into the model training of RESCAL, TransE, and the Google Knowledge Vault Neural Network Link-prediction improvement with low-dimensional embeddings: +40% (YAGO2*), +77% (Freebase*), +54% (DBpedia-Music*) Denis Krompaß, Stephan Baier and Volker Tresp. Type-Constrained Representation Learning in Knowledge Graphs. 14th International Semantic Web Conference (ISWC), 2015 *Results on large samples from these knowledge graphs
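Domain and range constraints can be applied by restricting which entities are considered as subjects or objects of a relation-type. A minimal sketch, with a hypothetical schema and entity typing (the dictionaries and the helper `allowed_triple` are illustrative, not the paper's implementation):

```python
# Hypothetical schema: domain/range type-constraints per relation-type.
type_constraints = {
    "bornIn": {"domain": {"Person"}, "range": {"City"}},
}
# Hypothetical entity-to-type assignment.
entity_types = {
    "AlbertEinstein": "Person",
    "Ulm": "City",
    "Relativity": "Theory",
}

def allowed_triple(s, p, o):
    """A triple is admissible only if its subject type is in the
    relation's domain and its object type is in its range."""
    c = type_constraints[p]
    return entity_types[s] in c["domain"] and entity_types[o] in c["range"]
```

During training, such a filter restricts both the candidate triples scored at prediction time and the entities sampled as (corrupted) negatives, so the model's capacity is spent only on type-plausible triples.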
Complementary Prediction? State of the art models differ in many aspects Diverse predictors Analysis of the degree to which the models are complementary Combine through the arithmetic mean Use Platt scaling to map the different model outputs to probabilities Data split: 70% training set, 10% validation set (hyperparameter tuning + Platt scaling), 20% test set
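The combination step above can be sketched as: calibrate each model's raw scores on the validation set with Platt scaling (a one-dimensional logistic regression), then average the calibrated probabilities. This is a minimal self-contained sketch fitting the Platt parameters by plain gradient descent (function names and hyperparameters are assumptions of the sketch):

```python
import numpy as np

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit Platt scaling parameters (a, b) on validation data so that
    sigmoid(a * score + b) approximates P(triple is true), via gradient
    descent on the logistic loss."""
    s = np.asarray(scores, float)
    y = np.asarray(labels, float)
    a, b = 1.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(a * s + b)))
        g = p - y                      # gradient of logistic loss w.r.t. logit
        a -= lr * np.mean(g * s)
        b -= lr * np.mean(g)
    return a, b

def platt_prob(score, a, b):
    """Map raw scores to calibrated probabilities."""
    return 1.0 / (1.0 + np.exp(-(a * np.asarray(score, float) + b)))

def ensemble_prob(model_scores, params):
    """Arithmetic mean of the Platt-calibrated probabilities of all models."""
    return np.mean(
        [platt_prob(s, a, b) for s, (a, b) in zip(model_scores, params)],
        axis=0,
    )
```

Calibration matters here because RESCAL, TransE, and mwNN produce scores on incomparable scales (least-squares reconstructions, negative distances, logistic outputs); averaging them raw would let one model dominate.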
Results Ensemble has consistently much better link-prediction quality Best complement is between TransE and mwNN RESCAL contributes complementary predictions only on the Freebase dataset Very similar observations hold under the local closed-world assumption
Summary Models are complementary to each other This applies especially when low-dimensional embeddings are used (d=10) Ensemble with d=10 is comparable to the best single predictor with d=100 More than 10% additional improvement on top of the gains already achieved by exploiting Type-Constraints or the local closed-world assumption
Questions ? http://www.dbs.ifi.lmu.de/~krompass/ Denis.Krompass@siemens.com