Distributed Data Stores – Facebook
Presented by Ben Gooding, University of Arkansas – April 21, 2015
TAO - Facebook’s Distributed Data Store
- Geographically distributed data store
- Provides efficient and timely access to the social graph
- Read-optimized
Background
- A single Facebook page may aggregate and filter hundreds of items from the social graph
- Each user’s content is custom tailored, so it is infeasible to aggregate and filter at data-creation time; content is aggregated and filtered when it is viewed
- Originally used MySQL and PHP scripts, with results cached
- Problems with this approach:
  - Inefficient edge lists – a change to a single edge requires the entire list to be reloaded
  - Distributed control logic – control logic runs on clients that do not communicate with each other, which creates failure modes
  - Expensive read-after-write consistency
Goals
- Provide basic access to the nodes and edges of a constantly changing graph in data centers across multiple regions
- Optimize heavily for reads, explicitly favoring efficiency and availability over consistency
- Handle most application needs while allowing a scalable and efficient implementation
Data Model
- Objects are nodes in the graph
- Associations are directed edges between objects
- Actions can be represented as either nodes or edges
- Associations naturally model actions that can happen at most once, or recorded state transitions
- Bi-directional relationships are represented as two separate associations; association types that have inverses are configured with an inverse association type (see the sketch below)
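As a rough illustration of the model above, the sketch below defines object and association records and shows how an association type with a configured inverse produces two directed edges. The field names and the inverse-type table are assumptions for illustration, not TAO's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class TaoObject:
    id: int                       # node: a globally identified object
    otype: str                    # object type, e.g. "user" or "checkin"
    data: Dict[str, str] = field(default_factory=dict)

@dataclass
class Association:
    id1: int                      # source object id
    atype: str                    # association (edge) type, e.g. "likes"
    id2: int                      # destination object id
    time: int                     # timestamp, useful for ordering recent edges
    data: Dict[str, str] = field(default_factory=dict)

# Hypothetical inverse-type configuration: bi-directional relationships are
# stored as two separate associations, one per direction.
INVERSE_TYPE = {"likes": "liked_by", "friend_of": "friend_of"}

def edges_to_write(assoc: Association) -> List[Association]:
    """Return the association plus its inverse edge, if its type has one configured."""
    edges = [assoc]
    inverse = INVERSE_TYPE.get(assoc.atype)
    if inverse is not None:
        edges.append(Association(assoc.id2, inverse, assoc.id1, assoc.time, dict(assoc.data)))
    return edges
```

For example, writing Association(1, "likes", 2, 1000) would also produce Association(2, "liked_by", 1, 1000) under this configuration.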
Architecture – Storage Layer
- Handles a larger volume of data than can be stored on a single MySQL server
- Data is divided into logical shards; each shard is contained in a logical database
- Database servers handle multiple shards
- The number of shards exceeds the number of servers, which allows shards to be load-balanced across servers (see the sketch below)
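A minimal sketch of the shard mapping described above, using a simple modulo scheme; the constants are made up (in TAO the shard id is actually embedded in the object id, and an object's associations are stored on the shard of their source object).

```python
NUM_SHARDS = 4096    # logical shards: deliberately far more shards than servers
NUM_SERVERS = 64     # database servers, each hosting many shards

def shard_for_object(object_id: int) -> int:
    """Map an object to its logical shard."""
    return object_id % NUM_SHARDS

def shard_for_association(id1: int) -> int:
    """Associations are stored on the shard of their source object, id1."""
    return shard_for_object(id1)

# The shard-to-server assignment is a level of indirection: because shards
# outnumber servers, shards can be reassigned to balance load without
# resharding the underlying data.
shard_to_server = {shard: shard % NUM_SERVERS for shard in range(NUM_SHARDS)}

def server_for_shard(shard: int) -> int:
    return shard_to_server[shard]
```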
Architecture – Caching Layer
- Consists of multiple cache servers that together form a tier
- A tier is collectively capable of responding to any TAO request
- Each request maps to a single cache server using a sharding scheme
- Clients issue requests directly to a cache server
- Cache entries are filled on demand and evicted using a least-recently-used policy (sketched below)
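A sketch of a demand-filled cache server with least-recently-used eviction, plus the shard-based mapping of a request to one server in the tier. The class and function names are illustrative, not TAO's actual interfaces.

```python
from collections import OrderedDict
from typing import Callable, List

class CacheServer:
    def __init__(self, capacity: int, fill: Callable[[str], str]):
        self.capacity = capacity
        self.fill = fill                       # called on a miss: fill on demand
        self.entries = OrderedDict()

    def get(self, key: str) -> str:
        if key in self.entries:
            self.entries.move_to_end(key)      # mark as most recently used
            return self.entries[key]
        value = self.fill(key)                 # demand fill from the next layer down
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used entry
        return value

def server_for_request(shard: int, tier: List[CacheServer]) -> CacheServer:
    """Within a tier, each request's shard maps to exactly one cache server."""
    return tier[shard % len(tier)]
```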
Architecture – Leaders and Followers
- Large tiers are problematic: they are prone to hot spots
- Leaders read from and write to the storage layer
- Followers forward read misses and writes to a leader
- Clients communicate with a follower and never contact a leader directly
- Care must be taken to keep TAO caches consistent: each shard has one leader, and all writes to that shard go through the leader
- The leader is always consistent; followers are eventually consistent (see the sketch below)
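The sketch below traces the read-miss and write paths for a single shard: a follower forwards misses and writes to the shard's leader, the leader talks to storage, and invalidation messages bring the other followers back in line. It is a simplification under assumed names; TAO's actual consistency messages are richer than a plain invalidation.

```python
class Database:
    """Stand-in for the storage layer (one logical shard)."""
    def __init__(self):
        self.rows = {}
    def read(self, key):
        return self.rows.get(key)
    def write(self, key, value):
        self.rows[key] = value

class Leader:
    """One leader per shard; the only cache that reads from or writes to storage."""
    def __init__(self, db: Database):
        self.db = db
        self.cache = {}
        self.followers = []

    def read(self, key):
        if key not in self.cache:            # miss: go to the storage layer
            self.cache[key] = self.db.read(key)
        return self.cache[key]

    def write(self, key, value):
        self.db.write(key, value)            # the leader is always consistent
        self.cache[key] = value
        for follower in self.followers:
            follower.invalidate(key)         # followers converge eventually

class Follower:
    """Clients talk only to followers; misses and writes are forwarded to the leader."""
    def __init__(self, leader: Leader):
        self.leader = leader
        self.cache = {}
        leader.followers.append(self)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.leader.read(key)        # forward the read miss
        self.cache[key] = value
        return value

    def write(self, key, value):
        self.leader.write(key, value)        # forward the write to the leader
        self.cache[key] = value

    def invalidate(self, key):
        self.cache.pop(key, None)
```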
Architecture – Scaling Geographically
- Follower tiers can be thousands of miles apart, so network round-trip times can become a bottleneck
- Each TAO follower must be local to a tier of databases holding a complete copy of the social graph; a full copy at every data center would be expensive
- Data center locations are therefore clustered into only a few regions
- The local leader forwards writes to the master database for the shard
- Writes that fail while the master is being switched are reported to the client as failed and are not retried
- The master/slave design ensures that all reads can be satisfied within a single region, at the expense of returning potentially stale data to clients (see the routing sketch below)
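A sketch of region-aware routing under this master/slave design, assuming one master region per shard; the region names and the per-shard master table are hypothetical.

```python
NUM_SHARDS = 4096
MASTER_REGION = {shard: "region-a" for shard in range(NUM_SHARDS)}  # hypothetical

def region_for_read(shard: int, local_region: str) -> str:
    # Reads are always satisfied within the local region, even though the
    # local replica may be slightly stale.
    return local_region

def region_for_write(shard: int, local_region: str) -> str:
    # The local leader forwards the write to the region whose database is the
    # master for this shard; the change then replicates back asynchronously.
    return MASTER_REGION[shard]
```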
Architecture - Visualized
Consistency and Fault Tolerance
- Availability and performance take priority, so TAO is eventually consistent
- Database failure: a slave database is promoted to become the new master
- Leader failure: followers route read and write requests around it; read misses go directly to the database, and writes go to a random leader in the tier (sketched below)
- Follower failure: other followers pick up the load
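A sketch of the leader-failure handling listed above: read misses bypass the failed leader and go straight to the database, while writes are redirected to a random leader in the tier. The is_healthy/read/write methods are assumed for illustration.

```python
import random

def read_on_miss(key, leader, database):
    if leader.is_healthy():
        return leader.read(key)
    return database.read(key)             # route the read miss around the failed leader

def write(key, value, leader, leader_tier):
    if leader.is_healthy():
        return leader.write(key, value)
    stand_in = random.choice(leader_tier) # another leader in the tier handles the write
    return stand_in.write(key, value)
```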
Production Workload
- Random sample of 6.5 million requests over 40 days
- Reads are far more common than writes: only 0.2% of requests involved writes
- Most edge queries have empty results
- Query frequency, node connectivity, and data size have distributions with long tails
Performance
- Availability: over a 90-day period, only a fraction 4.9 x 10^-6 of queries failed
- Hit rates and latency: the read hit rate was 96.4%; average write latency was 12.1 msec within the same region and 74.4 msec for remote regions
- Replication lag: slave storage lags behind the master by less than 1 second 85% of the time, less than 3 seconds 99% of the time, and less than 10 seconds 99.8% of the time
Questions