Download presentation
Presentation is loading. Please wait.
Published byAron Owen Modified over 9 years ago
1
Bigtable: A Distributed Storage System for Structured Data Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Reporter: yu-wen chang
2
Abstract Bigtable(Bt) is a distributed storage system for managing structured data that is designed to scale to a very large size. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.
3
Outline Introduction Data Model APIs Building Blocks Implementation Refinements Performance Real Applications Conclusion
4
Introduction Bigtable is designed to reliably scale to petabytes of data and thousands of machines. Bigtable has achieved several goals: Wide applicability. Scalability. High performance. High availability.
5
Data Model A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp.
6
Data Model-Row The row keys in a table are arbitrary strings. Bigtable maintains data in lexicographic order by row key. Each row range is called a tablet, which is the unit of distribution and load balancing.
7
Data Model-Column Column keys are grouped into sets called column families. A column key is named using the following syntax: family : qualier. Another useful column family for this table is anchor.
8
Data Model-Timestamp Each cell in a Bigtable can contain multiple versions of the same data; these versions are indexed by timestamp.
9
Data Model http://tw.yahoo.com/ com.yahoo.tw row keyrow key http://tw.bid.y ahoo.com http://buy.yahoo.com.tw TTT Anchor: 拍 賣 Anchor: 購物中心 ColumnColumn Family
10
APIs The Bigtable API provides functions : Creating and deleting tables and column families. Changing cluster, table and column family metadata.
11
Building Blocks Bigtable uses the distributed Google File System (GFS) to store log and data files. The Google SSTable file format is used internally to store Bigtable data. An SSTable provides a persistent, ordered immutable map from keys to values, where both keys and values are arbitrary byte strings.
12
Building Blocks(cont.) Bigtable relies on a highly-available and persistent distributed lock service called Chubby. Chubby provides a namespace that consists of directories and small files. Each directory or file can be used as a lock.
13
Implementation The Bigtable implementation has three major components: Library that linked into client. Master server. Table server.
14
Implementation
15
Each tablet server manages a set of tablets The tablet server handles read and write requests to the tablets that it has loaded, and also splits tablets that have grown too large.
16
Implementation-Tablet Location We use a three-level hierarchy analogous to that of a B+-tree to store tablet location information. Each METADATA row stores approximately 1KB of data in memory. With a modest limit of 128 MB METADATA tablets.
17
Implementation-Tablet Location
18
The client library caches tablet locations. Tablet locations are stored in memory, so no GFS accesses are required. We also store secondary information in the METADATA table, including a log of all events pertaining to each tablet
19
Implementation-Tablet Assignment Each tablet is assigned to one tablet server at a time. Bigtable uses Chubby to keep track of tablet servers. When a tablet server starts, it creates, and acquires an exclusive lock on, a uniquely- named file in a specific Chubby directory.
20
Implementation-Tablet Assignment When a master is started by the cluster management system, it needs to discover the current tablet assignments before it can change them.
21
Implementation-Tablet Assignment The master executes the following steps at startup: The master grabs a unique master lock in Chubby. The master scans the servers directory in Chubby to find the live servers. The master communicates with every live tablet server to discover what tablets are already assigned to each server. The master scans the METADATA table to learn the set of tablets.
22
Tablet Serving
23
Implementation-Compactions When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS. A merging compaction that rewrites all SSTables into exactly one SSTable is called a major compaction.
24
Renements Locality groups: Clients can group multiple column families together into a locality group. Compression: Uses Bentley and McIlroy's scheme and fast compression algorithm. Caching for read performance: Uses Scan Cache and Block Cache. Bloom filters: Reduce the number of accesses.
25
Performance Evaluation
26
Real Applications Google Analytics http://analytics.google.com Google Earth http://earth.google.com Personalized search www.google.com/psearch
27
Conclusion
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.