Published by Branden Mathews. Modified over 9 years ago.
1
LDBC: Benchmarking Graph Data Management Systems www.cwi.nl/~boncz/graphta.ppt Peter Boncz
2
Why Benchmarking?
Make competing products comparable
Accelerate progress, make technology viable
© Jim Gray, 2005
3
What is the LDBC?
Linked Data Benchmark Council = LDBC
Industry entity similar to TPC (www.tpc.org)
Focusing on graph and RDF store benchmarking
Kick-started by an EU project
Runs from September 2012 to March 2015
9 project partners
4
SNB: Social Network Benchmark
5
Data Correlations between attributes
SELECT personID FROM person WHERE firstName = 'Joachim' AND addressCountry = 'Germany'
SELECT personID FROM person WHERE firstName = 'Cesare' AND addressCountry = 'Italy'
Query optimizers may underestimate or overestimate the result size of conjunctive predicates.
[Figure labels: Anti-Correlation; Loew, Prandelli; Joachim, Cesare]
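The estimation problem on this slide can be illustrated with a small sketch. It assumes an optimizer that multiplies per-column selectivities as if the columns were independent, which is a common simplification and not something the slides specify. The toy table ROWS and both helper functions are hypothetical and made up for illustration.

```python
# Hypothetical toy table of (firstName, addressCountry) rows; the counts are invented.
ROWS = (
    [("Joachim", "Germany")] * 40
    + [("Cesare", "Italy")] * 40
    + [("Joachim", "Italy")] * 1
    + [("Cesare", "Germany")] * 1
    + [("Anna", "Germany")] * 9
    + [("Anna", "Italy")] * 9
)


def independence_estimate(rows, name, country):
    """Estimated result size if firstName and addressCountry were independent."""
    n = len(rows)
    sel_name = sum(1 for first, _ in rows if first == name) / n
    sel_country = sum(1 for _, ctry in rows if ctry == country) / n
    return sel_name * sel_country * n


def actual_count(rows, name, country):
    """True result size of the conjunctive predicate."""
    return sum(1 for first, ctry in rows if first == name and ctry == country)


if __name__ == "__main__":
    for name, country in [("Joachim", "Germany"), ("Joachim", "Italy")]:
        print(name, country,
              "estimated:", independence_estimate(ROWS, name, country),
              "actual:", actual_count(ROWS, name, country))
```

On this invented data the correlated pair (Joachim, Germany) is underestimated and the anti-correlated pair (Joachim, Italy) is overestimated, which is the effect a benchmark with correlated data is meant to expose.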
6
Compact Correlated Property Value Generation Using geometric distribution for function F()
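A rough sketch of how a geometric F() could produce correlated property values follows. The slide only names the distribution; the idea of keeping one shared name dictionary with a different ranking per country is an assumption made here for illustration, and RANKED_NAMES, geometric_rank and generate_first_name are hypothetical names.

```python
import math
import random

# Hypothetical rankings: the same small dictionary of names, ordered differently
# per country so that each country favours different names (assumed design).
RANKED_NAMES = {
    "Germany": ["Joachim", "Anna", "Laura", "Cesare"],
    "Italy": ["Cesare", "Laura", "Anna", "Joachim"],
}


def geometric_rank(rng, p, size):
    """Draw a 0-based rank with P(rank = k) = p * (1 - p) ** k.

    Ranks beyond the dictionary are clamped to the last entry, so the tail
    probability is folded into the final rank rather than renormalised.
    """
    u = rng.random()  # uniform in [0, 1)
    k = int(math.log(1.0 - u) / math.log(1.0 - p))
    return min(k, size - 1)


def generate_first_name(rng, country, p=0.5):
    """Pick a first name whose likelihood depends on the person's country."""
    ranking = RANKED_NAMES[country]
    return ranking[geometric_rank(rng, p, len(ranking))]


if __name__ == "__main__":
    rng = random.Random(2014)
    print([generate_first_name(rng, "Germany") for _ in range(10)])
    print([generate_first_name(rng, "Italy") for _ in range(10)])
```

Under these assumptions only the rankings need to be stored per country, which keeps the generator compact while still skewing values toward the top-ranked names.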
7
Correlated Edge Generation
[Figure: person nodes P1 to P5 annotated with property values such as Student, "Anna", "Laura", "University of Leipzig", "University of Amsterdam", "Germany", "Netherlands", "1990".]
8
How to Generate a Correlated Graph?
[Figure: person nodes P1 to P5 annotated with property values such as Student, "Anna", "Laura", "University of Leipzig", "University of Amsterdam", "Germany", "Netherlands", "1990"; candidate edges are marked "?".]
Compute similarity of two nodes based on their (correlated) properties. Use a probability density function w.r.t. this similarity for connecting nodes.
[Plot axes: connection probability vs. similarity, from highly similar to less similar.]
Danger: this is very expensive to compute on a large graph! (quadratic, random access)
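The naive approach described on this slide might look like the sketch below. The property assignment in PERSONS is invented (the slide's figure does not survive extraction), and the similarity measure (number of matching properties) and the decay-based connection_probability are assumptions chosen only to make the quadratic cost visible.

```python
import random
from itertools import combinations

# Hypothetical persons: (university, country, year). Values are made up.
PERSONS = {
    "P1": ("University of Leipzig", "Germany", 1990),
    "P2": ("University of Amsterdam", "Netherlands", 1990),
    "P3": ("University of Leipzig", "Germany", 1990),
    "P4": ("University of Leipzig", "Germany", 1991),
    "P5": ("University of Amsterdam", "Netherlands", 1989),
}


def similarity(props_a, props_b):
    """Assumed similarity: number of properties on which the two nodes agree."""
    return sum(1 for x, y in zip(props_a, props_b) if x == y)


def connection_probability(sim, max_sim, p_max=0.9, decay=0.2):
    """Assumed probability function: highest for identical nodes, decaying per mismatch."""
    return p_max * decay ** (max_sim - sim)


def generate_edges_naive(persons, rng):
    """Compare every pair of nodes: n * (n - 1) / 2 similarity computations."""
    edges = []
    comparisons = 0
    for (id_a, props_a), (id_b, props_b) in combinations(persons.items(), 2):
        comparisons += 1
        p = connection_probability(similarity(props_a, props_b), len(props_a))
        if rng.random() < p:
            edges.append((id_a, id_b))
    return edges, comparisons


if __name__ == "__main__":
    edges, comparisons = generate_edges_naive(PERSONS, random.Random(0))
    print("comparisons:", comparisons, "edges:", edges)
```

Five nodes need 10 comparisons, and the count grows quadratically with the number of nodes, which is the cost the slide warns about.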
9
Window Optimization
[Figure: person nodes P1 to P5 annotated with property values such as Student, "Anna", "Laura", "University of Leipzig", "University of Amsterdam", "Germany", "Netherlands", "1990".]
Probability that two nodes are connected is skewed w.r.t. the similarity between the nodes (due to the probability distribution).
[Plot axes: connection probability vs. similarity, from highly similar to less similar.]
Window Trick: disregard nodes with too large similarity distance (only connect nodes in a similarity window).
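One way to realise the window trick is sketched here. It assumes that nodes are first sorted on a key built from their correlated properties, so that similar nodes end up adjacent, and that connection probability decays with distance in that sorted order. The sort key, make_persons, and the p_max and decay parameters are all hypothetical.

```python
import random


def make_persons(n, rng):
    """Hypothetical input: (person_id, (university, year)) with invented values."""
    universities = ["University of Leipzig", "University of Amsterdam"]
    years = [1989, 1990, 1991]
    return [(f"P{i}", (rng.choice(universities), rng.choice(years)))
            for i in range(1, n + 1)]


def generate_edges_windowed(persons, window, rng, p_max=0.9, decay=0.5):
    """Connect each node only to the next `window` nodes in similarity order."""
    # Sorting on the correlated properties places similar nodes next to each other.
    ordered = sorted(persons, key=lambda person: person[1])
    edges = []
    comparisons = 0
    for i, (id_a, _) in enumerate(ordered):
        neighbours = ordered[i + 1 : i + 1 + window]
        for distance, (id_b, _) in enumerate(neighbours, start=1):
            comparisons += 1
            # Assumed skew: nearby (more similar) nodes are more likely to connect.
            if rng.random() < p_max * decay ** (distance - 1):
                edges.append((id_a, id_b))
    return edges, comparisons


if __name__ == "__main__":
    rng = random.Random(1)
    persons = make_persons(1000, rng)
    edges, comparisons = generate_edges_windowed(persons, window=10, rng=rng)
    print("comparisons:", comparisons, "all pairs:", 1000 * 999 // 2)
```

With a fixed window the number of comparisons grows linearly in the number of nodes, and the sorted scan replaces random access with sequential access.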
10
Workloads by system
System | Interactive | Business Intelligence | Graph Analytics
Graph databases | Yes | Yes | Maybe
Graph programming frameworks | - | - | Yes
RDF databases | Yes | Yes | -
Relational databases | Yes | Yes | Maybe, by keeping state in temporary tables, and using the functional features of PL-SQL
NoSQL Key-value | Maybe | Maybe | -
NoSQL MapReduce | - | Maybe | Yes
11
Plans For 2014
Finishing Interactive workload
– updates (transactional)
– substitution parameters
New BI and Graph Analytical Workloads
Data Generator Improvements
– improve dictionaries and distributions for BI
– Scale factors and dataset (SN graph) validation
Query Drivers
– Parallel update generator
Auditing Rules for SNB
12
Pointers
Code & Queries: github.com/ldbc
– ldbc_socialnet_bm, ldbc_socialnet_dbgen, ldbc_socialnet_qgen
Wiki: ldbc.eu:8090/display/TUC
– Background & Discussions + Detailed report: ldbc.eu:8090/download/attachments/4325436/LDBC_SNB_Report_Nov2013.pdf
LDBC Technical User Community (TUC) meeting:
– Thursday April 3, CWI Amsterdam (see wiki – next week)