Gorilla: A Fast, Scalable, In-Memory Time Series Database
Facebook, VLDB 2015
15210240024 王夏青
Abstract
- Gorilla: Facebook's in-memory time series database (TSDB)
- Key: strike the right balance between efficiency, scalability, and reliability
- Optimized to remain highly available for writes and reads, even in the face of failures
- At the expense of possibly dropping small amounts of data on the write path
- Abstract
- Introduction
- Background & Requirements
- Comparison with TSDB Systems
- Gorilla Architecture
- New Tools on Gorilla
- Experience & Future Work
- Conclusion
Introduction
- Motivation: large-scale internet services must be highly available and responsive for their users
- An important requirement: accurately monitor the health and performance of the underlying system, and quickly identify and diagnose problems
- Scale: thousands of individual systems running on many thousands of machines, often across multiple geo-replicated datacenters
Introduction
- Constraints:
  - Writes dominate
  - State transitions
  - High availability
  - Fault tolerance
- Gorilla: a new TSDB that satisfies these constraints
- Functions as a write-through cache of the most recent data entering the monitoring system
Introduction
- Insight:
  - Users of monitoring systems place little emphasis on individual data points and more on aggregate analysis
  - These systems do not store any user data, so traditional ACID guarantees are not a core requirement
  - Recent data points are of higher value than older points
Introduction
- Challenge:
  - High data insertion rate
  - Total data quantity
  - Real-time aggregation
  - Reliability requirements
Background & Requirements
- Operational Data Store (ODS)
- Monitoring system read performance issues
Background & Requirements
- 2 billion unique time series, each identified by a string key
- 700 million data points added per minute
- Store data for 26 hours
- More than 40,000 queries per second at peak
- Reads succeed in under one millisecond
- Support time series with 15-second granularity
- Two in-memory, not co-located replicas
- Always serve reads, even when a single server crashes
- Ability to quickly scan over all in-memory data
- Support at least 2x growth per year
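
The insertion rate and retention window above imply a rough in-memory footprint. The sketch below works through that arithmetic. The 16 bytes per point is an assumption (an 8-byte time stamp plus an 8-byte double, stored uncompressed), and the function names are illustrative only.

```python
# Back-of-envelope sizing from the requirements listed above.
POINTS_PER_MINUTE = 700_000_000   # from the requirements
RETENTION_HOURS = 26              # from the requirements
RAW_BYTES_PER_POINT = 16          # assumption: 8-byte time stamp + 8-byte double


def points_retained(points_per_minute=POINTS_PER_MINUTE, hours=RETENTION_HOURS):
    """Number of data points held for the whole retention window."""
    return points_per_minute * 60 * hours


def raw_bytes(points, bytes_per_point=RAW_BYTES_PER_POINT):
    """Uncompressed footprint of one replica, in bytes."""
    return points * bytes_per_point


if __name__ == "__main__":
    n = points_retained()
    print(f"{n:,} points, {raw_bytes(n) / 1e12:.1f} TB uncompressed per replica")
```

Under these assumptions one replica holds about 1.09 trillion points, or roughly 17.5 TB before compression, which is why compression matters for an in-memory design.
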
Comparison with TSDB Systems
- Existing solutions:
  - OpenTSDB
  - Whisper (Graphite)
  - InfluxDB
Gorilla Architecture
Gorilla Architecture
- Monitoring data: a 3-tuple of a string key, a 64-bit time stamp integer, and a double-precision floating point value
- A new time series compression algorithm
- In-memory data structures arranged to allow fast and efficient scans of all data while maintaining constant-time lookup of individual time series
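
As a minimal sketch of the 3-tuple described above, the following models one data point and packs its time stamp and value into 16 bytes. The names `DataPoint`, `encode_payload`, and `decode_payload` are hypothetical, and the little-endian layout is an assumption made only for illustration.

```python
import struct
from typing import NamedTuple


class DataPoint(NamedTuple):
    """One monitoring sample: string key, 64-bit time stamp, double value."""
    key: str
    timestamp: int   # fits in a signed 64-bit integer
    value: float     # IEEE 754 double


# Assumed layout: little-endian int64 followed by a float64 (16 bytes total).
_PAYLOAD = struct.Struct("<qd")


def encode_payload(point: DataPoint) -> bytes:
    """Pack the time stamp and value; the key is stored separately."""
    return _PAYLOAD.pack(point.timestamp, point.value)


def decode_payload(key: str, raw: bytes) -> DataPoint:
    """Rebuild a DataPoint from its key and 16-byte payload."""
    timestamp, value = _PAYLOAD.unpack(raw)
    return DataPoint(key, timestamp, value)
```
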
Gorilla Architecture
- Compressing time stamps
- Compressing values
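
The slide names the two compression steps without detail. The sketch below illustrates the ideas as described in the Gorilla paper: time stamps are stored as a delta of deltas, and each value is XORed with the previous value. The bucket boundaries in `timestamp_bits` follow the paper's description. The actual bit packing, block headers, and the leading/trailing-zero encoding for values are omitted, and all function names are hypothetical.

```python
import struct


def delta_of_deltas(timestamps):
    """Second-order differences; all zeros when points arrive at a fixed interval."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [b - a for a, b in zip(deltas, deltas[1:])]


def timestamp_bits(dod):
    """Bits needed for one delta of delta (control prefix + payload)."""
    if dod == 0:
        return 1            # single '0' bit
    if -63 <= dod <= 64:
        return 2 + 7        # '10' + 7-bit value
    if -255 <= dod <= 256:
        return 3 + 9        # '110' + 9-bit value
    if -2047 <= dod <= 2048:
        return 4 + 12       # '1110' + 12-bit value
    return 4 + 32           # '1111' + 32-bit value


def float_bits(value):
    """Reinterpret a double as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def xor_with_previous(values):
    """XOR of each value's bit pattern with its predecessor; 0 means 'unchanged'."""
    bits = [float_bits(v) for v in values]
    return [a ^ b for a, b in zip(bits, bits[1:])]
```

For example, time stamps `[1000, 1060, 1120, 1181, 1240]` give deltas of deltas `[0, 1, -2]`, so most points cost only a few bits. A repeated value XORs to zero and can be stored as a single bit.
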
Gorilla Architecture
- In-memory data structures:
  - Timeseries Map (TSmap)
  - Shared pointers
  - Read-write spin lock and 1-byte spin lock
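
One way to picture the TSmap is as a list that supports fast scans, paired with a dictionary that gives constant-time lookup by key. The sketch below assumes that layout. It is not Gorilla's implementation: Python object references stand in for shared pointers, `threading.Lock` stands in for both spin locks, and the class and method names are hypothetical.

```python
import threading


class TimeSeries:
    """One series and its points, guarded by a per-series lock."""

    def __init__(self, key):
        self.key = key
        self.points = []
        self._lock = threading.Lock()   # stand-in for the 1-byte spin lock

    def append(self, timestamp, value):
        with self._lock:
            self.points.append((timestamp, value))


class TSMap:
    """List for scanning all series plus a dict for O(1) lookup by key."""

    def __init__(self):
        self._lock = threading.Lock()   # stand-in for the read-write spin lock
        self._series = []               # scan order
        self._index = {}                # key -> position in _series

    def get_or_create(self, key):
        with self._lock:
            pos = self._index.get(key)
            if pos is None:
                pos = len(self._series)
                self._series.append(TimeSeries(key))
                self._index[key] = pos
            return self._series[pos]

    def scan(self):
        """Copy the list of references under the lock, then read without it."""
        with self._lock:
            return list(self._series)
```

Copying only the references during `scan` keeps the map lock held briefly, so a long scan does not block incoming writes to individual series.
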
Gorilla Architecture
- On-disk structures:
  - GlusterFS
  - A Gorilla host owns multiple shards
  - A single directory per shard
  - Each directory holds:
    - Key lists
    - Append-only logs
    - Complete block files
    - Checkpoint files
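
To make the append-only log idea concrete, here is a minimal sketch that appends fixed-size records to a log file in a shard directory and replays them on restart. The record layout, the file name `log`, and the function names are all hypothetical, and a local directory stands in for GlusterFS. Key lists, complete block files, and checkpoint files are not modeled.

```python
import os
import struct

# Hypothetical record: uint32 series id, int64 time stamp, float64 value.
RECORD = struct.Struct("<Iqd")
LOG_NAME = "log"  # hypothetical file name inside the shard directory


def append_point(shard_dir, series_id, timestamp, value):
    """Append one record; existing bytes in the log are never rewritten."""
    os.makedirs(shard_dir, exist_ok=True)
    with open(os.path.join(shard_dir, LOG_NAME), "ab") as f:
        f.write(RECORD.pack(series_id, timestamp, value))


def replay(shard_dir):
    """Read back all complete records, ignoring a truncated tail."""
    path = os.path.join(shard_dir, LOG_NAME)
    if not os.path.exists(path):
        return []
    with open(path, "rb") as f:
        data = f.read()
    usable = len(data) - len(data) % RECORD.size
    return [RECORD.unpack_from(data, off) for off in range(0, usable, RECORD.size)]
```

Ignoring a partial final record mirrors the trade-off stated in the abstract: a crash may drop a small amount of recent data, but the rest of the log stays readable.
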
Gorilla Architecture
- Tolerating single-node, temporary failures with zero observable downtime
- Tolerating large-scale, localized failures (such as a network cut to an entire region)
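
The sketch below illustrates one way two replicas can keep reads available when one of them is down: writes go to every reachable replica on a best-effort basis, and reads fall through to the next replica on failure. This is an illustration under those assumptions, not Gorilla's actual protocol. `Replica` and `ReplicatedStore` are hypothetical names.

```python
class Replica:
    """In-memory stand-in for one Gorilla instance in one region."""

    def __init__(self, region):
        self.region = region
        self.up = True
        self.data = {}

    def write(self, key, timestamp, value):
        if not self.up:
            raise ConnectionError(f"{self.region} unreachable")
        self.data.setdefault(key, []).append((timestamp, value))

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"{self.region} unreachable")
        return list(self.data.get(key, []))


class ReplicatedStore:
    """Best-effort writes to all replicas; reads fail over in order."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def write(self, key, timestamp, value):
        """Return how many replicas accepted the point (may be fewer than all)."""
        delivered = 0
        for replica in self.replicas:
            try:
                replica.write(key, timestamp, value)
                delivered += 1
            except ConnectionError:
                pass  # accept losing this point on the unreachable replica
        return delivered

    def read(self, key):
        for replica in self.replicas:
            try:
                return replica.read(key)
            except ConnectionError:
                continue
        raise ConnectionError("no replica available")
```
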
New Tools on Gorilla
- Correlation engine
- Charting
- Aggregations
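
A correlation engine of this kind can be sketched as ranking candidate series by how strongly they correlate with a target series. The slide does not say which measure is used, so the Pearson correlation coefficient here is an assumption, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(target, candidates):
    """Sort (name, score) pairs by absolute correlation, strongest first."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

Ranking by absolute value surfaces series that move opposite to the target as well as those that move with it, since both can help locate the cause of a problem.
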
Experience & Future Work
- Fault tolerance:
  - Network cuts
  - Disaster readiness
  - Configuration changes and code pushes
  - Bug
  - Single node failures
Experience & Future Work
- Site-wide error rate debugging
Experience & Future Work
- Lessons learned:
  - Prioritize recent data over historical data
  - Read latency matters
  - High availability trumps resource efficiency
Experience & Future Work
- Add a second, larger data store, based on flash storage, between in-memory Gorilla and HBase
- Rewrite the write path to wait longer before writing to HBase
Conclusion
- Gorilla: a new in-memory time series database deployed at Facebook
- Functions as a write-through cache for monitoring data
- Described a new compression scheme that stores monitoring data efficiently
- Reduces production query latency
- Enables new monitoring tools
- Verified Gorilla's fault tolerance capabilities
Q&A
Thanks