David Ostrovsky | Couchbase

1 Who’s afraid of graphs?
David Ostrovsky | Couchbase

2 The Seven Bridges of Königsberg Problem
Leonhard Euler. Devise a route through the city that crosses each bridge exactly once. Euler's paper, published in 1736, is regarded as the first paper on graph theory. Königsberg, Prussia is today Kaliningrad, Russia.
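
Euler's negative answer can be checked with a few lines of code. The sketch below models the city as a multigraph and applies the standard degree criterion: a connected multigraph has a walk that uses every edge exactly once only if it has zero or two nodes of odd degree. The land-mass labels A to D are illustrative names, not taken from the slides.

```python
from collections import Counter

# The seven bridges as edges between four land masses.
# Labels are assumptions for illustration: A = central island,
# B = north bank, C = south bank, D = eastern land mass.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def odd_degree_nodes(edges):
    """Return the sorted nodes touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def has_euler_walk(edges):
    """Degree test only; assumes the graph is connected."""
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(BRIDGES))  # ['A', 'B', 'C', 'D']
print(has_euler_walk(BRIDGES))    # False
```

All four land masses have odd degree, so no such route exists.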

3 Graph Databases
Use nodes, edges and properties to store data. Important to note that a graph database has: native graph storage (the engine is built to handle graph data), and native graph processing capability, including index-free adjacency to facilitate traversals.
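
To make the property graph model and index-free adjacency more concrete, here is a minimal in-memory sketch. It is not how any particular engine stores data; the class names, the `connect` and `neighbors` helpers, and the sample people are all hypothetical. The point it illustrates is that each node holds direct references to its own edges, so a traversal hops from node to node without consulting a global index.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    properties: dict
    # Direct references to outgoing edges: this is the "index-free" part.
    out_edges: list = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    label: str
    source: Node
    target: Node
    properties: dict = field(default_factory=dict)


def connect(source, label, target, **properties):
    """Create an edge and attach it to its source node."""
    edge = Edge(label, source, target, properties)
    source.out_edges.append(edge)
    return edge


def neighbors(node, label):
    """Follow outgoing edges with the given label; no global lookup."""
    return [e.target for e in node.out_edges if e.label == label]


alice = Node({"name": "Alice"})
bob = Node({"name": "Bob"})
carol = Node({"name": "Carol"})
connect(alice, "FRIEND", bob, since=2014)
connect(bob, "FRIEND", carol)

friends_of_friends = [
    fof.properties["name"]
    for friend in neighbors(alice, "FRIEND")
    for fof in neighbors(friend, "FRIEND")
]
print(friends_of_friends)  # ['Carol']
```

The cost of each hop depends on how many edges the current node has, not on the total size of the graph.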

4 Use Cases For Graph Databases
Social, of course. Recommendation systems (a logical extension of the social graph, or stand-alone: find all customers who bought a book that X customers liked, then find all books similar to that one, etc.). Managing interconnected datasets: networks, organization hierarchies, ACLs, in-game economies, etc. Geo-location and routing (think Waze or network routing). Use cases for migrating from an RDBMS: problems with JOIN performance; a continually evolving dataset or open-ended business requirements; a domain that is naturally suited to graph representation.
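
The recommendation use case is essentially a two-hop traversal: from a customer to the books they bought, on to other customers who bought the same books, and then to what those customers also bought. A rough sketch over made-up purchase data (the customers, titles and the `recommend` function are hypothetical):

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of purchased books.
PURCHASES = {
    "ann": {"Dune", "Neuromancer"},
    "ben": {"Dune", "Foundation"},
    "cat": {"Dune", "Foundation", "Hyperion"},
    "dan": {"Emma"},
}


def recommend(customer, purchases):
    """Rank unseen books by how many overlapping customers bought them."""
    own = purchases[customer]
    scores = Counter()
    for other, books in purchases.items():
        # Skip the customer themselves and anyone with no book in common.
        if other == customer or not (own & books):
            continue
        for book in books - own:
            scores[book] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return [b for b, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


print(recommend("ann", PURCHASES))  # ['Foundation', 'Hyperion']
```

In a relational schema the same question typically needs a self-join on the purchases table, which is the kind of JOIN-heavy query the migration use cases refer to.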

5 Meet the Players
For comparison – MongoDB has a score of , Cassandra

6 Databases vs Frameworks
Databases: real-time queries; smaller datasets; standard NoSQL features (scaling, HA, etc.).
Frameworks: offline/batch processing; larger datasets; rely on a big data platform (usually Hadoop).
Frameworks: Giraph, an Apache project used by Facebook to power its graph search and process trillions of connections. GraphX, integrated with Apache Spark, has a library of built-in algorithms and ETL functionality; doesn't perform as well as Giraph. Faunus (from the same team as Titan). GraphLab, an open source graph toolkit.

7 Querying and Traversal

8 Cypher (Neo4j)
Diagram: node a connected to node b by a FRIEND edge. In Cypher: (a)-[:FRIEND]->(b)

9 SQL-Derivatives (OrientDB)

10 g.v(1).outE('friend').inV.name
// Starting with vertex 1, find outgoing edges 'friend', follow to the next vertex, and return the property 'name'.
Gremlin is the graph traversal language of Apache TinkerPop, which in turn is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP).
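
To show what each step of that traversal does, the sketch below mimics the pipeline in plain Python over a tiny in-memory graph. It is not Gremlin and does not use TinkerPop; the function names `v`, `out_e`, `in_v` and `values` and the sample vertices are assumptions chosen to mirror the steps one by one.

```python
# Hypothetical in-memory data; ids, labels and names are illustrative.
VERTICES = {
    1: {"name": "marko"},
    2: {"name": "vadas"},
    3: {"name": "josh"},
}
EDGES = [
    {"out": 1, "label": "friend", "in": 2},
    {"out": 1, "label": "friend", "in": 3},
    {"out": 2, "label": "colleague", "in": 3},
]


def v(vertex_id):
    """Start the traversal at a single vertex."""
    return [vertex_id]


def out_e(vertex_ids, label):
    """Outgoing edges with the given label (a linear scan, for simplicity)."""
    return [
        e for vid in vertex_ids for e in EDGES
        if e["out"] == vid and e["label"] == label
    ]


def in_v(edges):
    """Move from each edge to the vertex it points at."""
    return [e["in"] for e in edges]


def values(vertex_ids, key):
    """Read one property from each vertex."""
    return [VERTICES[vid][key] for vid in vertex_ids]


names = values(in_v(out_e(v(1), "friend")), "name")
print(names)  # ['vadas', 'josh']
```

Each function consumes the output of the previous one, which is the same pipeline shape as the chained steps in the Gremlin expression.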

11 Scaling Graphs is Hard
Most graph partitioning problems fall into the NP-hard category, a set of problems that are at least as hard as the hardest problems in NP. Some specialized variants of graph partitioning are NP-complete. So unless P=NP, graph partitioning solutions will continue to rely on approximations and various statistical approaches.

12 Clustering Architecture
Neo4j Clustering Architecture

13 Polyglot Persistence To the Rescue

