Hypertable: Doug Judd, Zvents, Inc. (hypertable.org)

1 Hypertable: Doug Judd, Zvents, Inc.

2 Background

3 Web 2.0 = Data Explosion
[Figure: Web 1.0 vs. Web 2.0]

4 Traditional Tools Don’t Scale Well
• Designed for a single machine
• Typical scaling solutions
    • ad-hoc
    • manual/static resource allocation

5 The Google Stack
• Google File System (GFS)
• MapReduce
• Bigtable

6 Architectural Overview

7 What is Hypertable?
• An open source, high-performance, scalable database, modelled after Google's Bigtable
• Not relational
• Does not support transactions

8 Hypertable Improvements Over Traditional RDBMS
• Scalable
• High random insert, update, and delete rate

9 Data Model
• Sparse, two-dimensional table with cell versions
• Cells are identified by a 4-part key
    • Row
    • Column Family
    • Column Qualifier
    • Timestamp
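As a rough illustration of the 4-part key, the sketch below models a sparse table as an ordered map from key to value. The names `CellKey`, `SparseTable`, and `latest`, and the choice to sort newer timestamps first, are assumptions made for this example and are not Hypertable's actual types.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <tuple>

// Hypothetical 4-part cell key: row, column family, column qualifier, timestamp.
struct CellKey {
    std::string row;
    std::string column_family;
    std::string column_qualifier;
    int64_t timestamp;
};

// Order by row, family, qualifier, then newest timestamp first (an assumption
// of this sketch), so the latest version of a cell is encountered first.
struct CellKeyLess {
    bool operator()(const CellKey &a, const CellKey &b) const {
        return std::tie(a.row, a.column_family, a.column_qualifier, b.timestamp) <
               std::tie(b.row, b.column_family, b.column_qualifier, a.timestamp);
    }
};

// A sparse table stores only the cells that exist; absent cells take no space.
typedef std::map<CellKey, std::string, CellKeyLess> SparseTable;

// Return the newest value for (row, family, qualifier), or nullptr if absent.
const std::string *latest(const SparseTable &t, const std::string &row,
                          const std::string &cf, const std::string &cq) {
    CellKey probe{row, cf, cq, std::numeric_limits<int64_t>::max()};
    SparseTable::const_iterator it = t.lower_bound(probe);
    if (it == t.end() || it->first.row != row ||
        it->first.column_family != cf || it->first.column_qualifier != cq)
        return nullptr;
    return &it->second;
}

// Usage: two versions of the same cell; the lookup sees the newer one.
void demo() {
    SparseTable t;
    t[CellKey{"com.example", "Title", "", 1}] = "old title";
    t[CellKey{"com.example", "Title", "", 2}] = "new title";
    assert(*latest(t, "com.example", "Title", "") == "new title");
}
```

Keeping the timestamp as the last key component means every version of a cell sits next to its siblings in sort order, which is what makes "cell versions" cheap to scan.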

10 Table: Visual Representation

11 Table: Actual Representation

12 Anatomy of a Key
• Row key is \0-terminated
• Column Family is represented with 1 byte
• Column qualifier is \0-terminated
• Timestamp is stored big-endian, ones' complement
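The layout listed on this slide can be sketched as a small encoder. The function name `encode_key` and the 8-byte timestamp width are assumptions of this example; the slide does not state the width, and this is not Hypertable's actual serialization code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Serialize a key in the order listed on the slide. Assumes row and qualifier
// contain no embedded '\0' bytes and that the timestamp occupies 8 bytes
// (the width is an assumption of this sketch).
std::string encode_key(const std::string &row, uint8_t column_family,
                       const std::string &qualifier, uint64_t timestamp) {
    std::string out;
    out.append(row);
    out.push_back('\0');                              // row terminator
    out.push_back(static_cast<char>(column_family));  // 1-byte family id
    out.append(qualifier);
    out.push_back('\0');                              // qualifier terminator
    uint64_t inverted = ~timestamp;                   // ones' complement
    for (int shift = 56; shift >= 0; shift -= 8)      // most significant byte first
        out.push_back(static_cast<char>((inverted >> shift) & 0xFF));
    return out;
}

// Usage: for the same row and column, the newer timestamp encodes to the
// bytewise-smaller key, so it sorts first.
void demo() {
    std::string older = encode_key("row1", 3, "q", 100);
    std::string newer = encode_key("row1", 3, "q", 200);
    assert(newer < older);
}
```

One consequence of inverting the timestamp and writing it big-endian, assuming keys are compared bytewise, is that the most recent version of a cell sorts ahead of older versions.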

13 Concurrency
• Bigtable uses copy-on-write
• Hypertable uses a form of MVCC (multi-version concurrency control)
• Deletes are carried out by inserting “delete” records
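To make the idea concrete, here is a generic multi-version sketch in which writers only ever append new versions and a delete is stored as one more version. The class `VersionedStore` and its revision counter are hypothetical; the slide says only that Hypertable uses "a form of" MVCC, so this shows the general technique rather than Hypertable's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical versioned store: existing versions are never modified, so a
// reader holding an older revision number keeps seeing a consistent snapshot.
class VersionedStore {
public:
    // Insert a new version of the key; returns the revision assigned to it.
    uint64_t put(const std::string &key, const std::string &value) {
        versions_[std::make_pair(key, ++revision_)] = Version{false, value};
        return revision_;
    }
    // A delete is itself an insert: it appends a "delete" record.
    uint64_t erase(const std::string &key) {
        versions_[std::make_pair(key, ++revision_)] = Version{true, ""};
        return revision_;
    }
    // Read the key as of a revision. Returns false if the key is absent or
    // the newest visible version is a delete record.
    bool get(const std::string &key, uint64_t as_of, std::string &value) const {
        auto it = versions_.upper_bound(std::make_pair(key, as_of));
        if (it == versions_.begin()) return false;
        --it;  // newest version with revision <= as_of, if the key matches
        if (it->first.first != key || it->second.deleted) return false;
        value = it->second.value;
        return true;
    }
private:
    struct Version { bool deleted; std::string value; };
    std::map<std::pair<std::string, uint64_t>, Version> versions_;
    uint64_t revision_ = 0;
};

// Usage: a snapshot taken before the delete still sees the old value.
void demo() {
    VersionedStore s;
    uint64_t before = s.put("k", "v1");
    uint64_t after = s.erase("k");
    std::string v;
    assert(s.get("k", before, v) && v == "v1");
    assert(!s.get("k", after, v));
}
```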

14 CellStore
• Sequence of 65K blocks of compressed key/value pairs
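A minimal sketch of how sorted key/value pairs might be grouped into fixed-size blocks follows. The names `KeyValue`, `Block`, and `build_blocks` are hypothetical, the block size is a parameter rather than the 65K figure from the slide, and compression is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> KeyValue;
typedef std::vector<KeyValue> Block;

// Pack already-sorted key/value pairs into blocks of roughly target_size bytes.
// A real store would compress each block before writing it; that step is
// omitted from this sketch.
std::vector<Block> build_blocks(const std::vector<KeyValue> &sorted_cells,
                                std::size_t target_size) {
    std::vector<Block> blocks;
    Block current;
    std::size_t bytes = 0;
    for (const KeyValue &kv : sorted_cells) {
        current.push_back(kv);
        bytes += kv.first.size() + kv.second.size();
        if (bytes >= target_size) {  // block is full: close it and start another
            blocks.push_back(current);
            current.clear();
            bytes = 0;
        }
    }
    if (!current.empty()) blocks.push_back(current);  // final partial block
    return blocks;
}

// Usage: three 5-byte cells with a 10-byte target give one full and one partial block.
void demo() {
    std::vector<KeyValue> cells;
    cells.push_back(KeyValue("a", "1111"));
    cells.push_back(KeyValue("b", "2222"));
    cells.push_back(KeyValue("c", "3333"));
    assert(build_blocks(cells, 10).size() == 2);
}
```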

15 System Overview

16 Range Server
• Manages ranges of table data
• Caches updates in memory (CellCache)
• Periodically spills (compacts) cached updates to disk (CellStore)
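The cache-then-spill behaviour can be sketched with an in-memory sorted map standing in for the CellCache and immutable sorted vectors standing in for on-disk CellStores. The class `RangeSketch`, the entry-count threshold, and the newest-to-oldest read order are assumptions of this example, not details given in the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an on-disk CellStore: an immutable sorted run.
typedef std::vector<std::pair<std::string, std::string>> SortedRun;

class RangeSketch {
public:
    explicit RangeSketch(std::size_t spill_threshold) : threshold_(spill_threshold) {}

    // Buffer the update in memory; spill once the cache holds enough entries.
    void put(const std::string &key, const std::string &value) {
        cache_[key] = value;
        if (cache_.size() >= threshold_) spill();
    }
    // Write the sorted cache out as a new run and start with an empty cache.
    void spill() {
        if (cache_.empty()) return;
        runs_.emplace_back(cache_.begin(), cache_.end());
        cache_.clear();
    }
    // Look in the cache first, then in the runs from newest to oldest.
    bool get(const std::string &key, std::string &value) const {
        auto c = cache_.find(key);
        if (c != cache_.end()) { value = c->second; return true; }
        for (auto run = runs_.rbegin(); run != runs_.rend(); ++run) {
            auto it = std::lower_bound(
                run->begin(), run->end(), key,
                [](const std::pair<std::string, std::string> &kv,
                   const std::string &k) { return kv.first < k; });
            if (it != run->end() && it->first == key) { value = it->second; return true; }
        }
        return false;
    }
    std::size_t run_count() const { return runs_.size(); }
private:
    std::size_t threshold_;
    std::map<std::string, std::string> cache_;
    std::vector<SortedRun> runs_;
};

// Usage: the second put triggers a spill; a later update shadows the spilled value.
void demo() {
    RangeSketch r(2);
    r.put("a", "1");
    r.put("b", "2");
    r.put("a", "3");
    std::string v;
    assert(r.run_count() == 1);
    assert(r.get("a", v) && v == "3");
}
```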

17 Client API

    class Client {
      void create_table(const String &name, const String &schema);
      Table *open_table(const String &name);
      String get_schema(const String &name);
      void get_tables(vector<String> &tables);
      void drop_table(const String &name, bool if_exists);
    };

18 Client API (cont.)

    class Table {
      TableMutator *create_mutator();
      TableScanner *create_scanner(ScanSpec &scan_spec);
    };

    class TableMutator {
      void set(KeySpec &key, const void *value, int value_len);
      void set_delete(KeySpec &key);
      void flush();
    };

    class TableScanner {
      bool next(CellT &cell);
    };

19 Language Bindings
• Currently C++ only
• Thrift Broker

20 Write-Ahead Commit Log
• Persists all modifications (inserts and deletes)
• Written into underlying DFS
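The general write-ahead pattern can be sketched as follows: every modification is appended to a log before it is applied, and the in-memory state can be rebuilt by replaying that log. The text record format and the function names `log_put`, `log_delete`, and `replay` are invented for this example, and an ordinary C++ stream stands in for a file in the distributed filesystem.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Append one modification to the log before it is applied in memory.
// Record format (an assumption of this sketch): "P key value\n" or "D key\n",
// with keys and values free of whitespace.
void log_put(std::ostream &log, const std::string &key, const std::string &value) {
    log << "P " << key << ' ' << value << '\n';
    log.flush();
}

void log_delete(std::ostream &log, const std::string &key) {
    log << "D " << key << '\n';
    log.flush();
}

// Rebuild the in-memory state by replaying the log from the beginning.
// For simplicity a delete removes the key here instead of storing a marker.
std::map<std::string, std::string> replay(std::istream &log) {
    std::map<std::string, std::string> state;
    std::string op, key, value;
    while (log >> op >> key) {
        if (op == "P" && (log >> value)) state[key] = value;
        else if (op == "D") state.erase(key);
    }
    return state;
}

// Usage: after a simulated restart, replaying the log recovers the state.
void demo() {
    std::stringstream log;
    log_put(log, "a", "1");
    log_put(log, "b", "2");
    log_delete(log, "a");
    std::map<std::string, std::string> state = replay(log);
    assert(state.size() == 1 && state["b"] == "2");
}
```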

21 Range Meta-Operation Log
• Facilitates Range meta-operations
    • Loads
    • Splits
    • Moves
• Part of Master and RangeServer
• Ensures Range state and location consistency

22 Compression
• Cell Stores store compressed blocks of key/value pairs
• Commit Log stores compressed blocks of updates
• Supported Compression Schemes
    • zlib (--best and --fast)
    • lzo
    • quicklz
    • bmz
    • none

23 Caching
• Block Cache
    • Caches CellStore blocks
    • Blocks are cached uncompressed
• Query Cache
    • Caches query results
    • TBD
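A block cache of this kind can be sketched as a bounded map from block identifier to uncompressed block contents. The slides do not say which eviction policy is used; least-recently-used eviction is an assumption of this example, as are the class name `BlockCache` and the string block identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of uncompressed blocks with LRU eviction (assumed policy).
class BlockCache {
public:
    explicit BlockCache(std::size_t capacity) : capacity_(capacity) {}

    // Insert or replace a block, evicting the least recently used one if full.
    void insert(const std::string &block_id, const std::string &data) {
        auto it = index_.find(block_id);
        if (it != index_.end()) { lru_.erase(it->second); index_.erase(it); }
        lru_.emplace_front(block_id, data);
        index_[block_id] = lru_.begin();
        if (lru_.size() > capacity_) {
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
    }
    // Returns nullptr on a miss; a hit moves the block to the front of the list.
    const std::string *lookup(const std::string &block_id) {
        auto it = index_.find(block_id);
        if (it == index_.end()) return nullptr;
        lru_.splice(lru_.begin(), lru_, it->second);
        return &it->second->second;
    }
private:
    typedef std::list<std::pair<std::string, std::string>> Entries;
    std::size_t capacity_;
    Entries lru_;                                               // front = most recent
    std::unordered_map<std::string, Entries::iterator> index_;  // id -> list node
};

// Usage: touching b1 keeps it resident, so inserting b3 evicts b2 instead.
void demo() {
    BlockCache cache(2);
    cache.insert("b1", "x");
    cache.insert("b2", "y");
    assert(cache.lookup("b1") != nullptr);
    cache.insert("b3", "z");
    assert(cache.lookup("b2") == nullptr);
}
```

Caching blocks uncompressed, as the slide notes, trades memory for not having to decompress a block again on every hit.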

24 Bloom Filter
• Negative Cache
• Probabilistic data structure
• Indicates if key is not present
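A minimal Bloom filter sketch shows why it works as a negative cache: a `false` answer is always correct, while a `true` answer only means "maybe". The sizing parameters and the double-hashing scheme built on `std::hash` are choices made for this example, not details taken from Hypertable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

class BloomFilter {
public:
    // bits must be greater than zero; hashes is the number of probes per key.
    BloomFilter(std::size_t bits, std::size_t hashes)
        : bits_(bits, false), hashes_(hashes) {}

    void add(const std::string &key) {
        for (std::size_t i = 0; i < hashes_; ++i) bits_[index(key, i)] = true;
    }
    // false: the key was definitely never added. true: it may have been added.
    bool may_contain(const std::string &key) const {
        for (std::size_t i = 0; i < hashes_; ++i)
            if (!bits_[index(key, i)]) return false;
        return true;
    }
private:
    // Derive the i-th probe position by double hashing two std::hash values.
    std::size_t index(const std::string &key, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>()(key);
        std::size_t h2 = std::hash<std::string>()(key + "#salt") | 1;
        return (h1 + i * h2) % bits_.size();
    }
    std::vector<bool> bits_;
    std::size_t hashes_;
};

// Usage: added keys are always reported as possibly present.
void demo() {
    BloomFilter filter(1024, 3);
    filter.add("row1");
    assert(filter.may_contain("row1"));
}
```

In a store made of many on-disk files, a check like this lets a lookup skip files that cannot contain the key without reading them.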

25 Scaling (part I)

26 Scaling (part II)

27 Scaling (part III)

28 Access Groups
• Provides control of physical data layout: hybrid row/column oriented
• Improves performance by minimizing I/O

    CREATE TABLE crawldb {
      Title MAX_VERSIONS=3,
      Content MAX_VERSIONS=3,
      PageRank MAX_VERSIONS=10,
      ClickRank MAX_VERSIONS=10,
      ACCESS GROUP default (Title, Content),
      ACCESS GROUP ranking (PageRank, ClickRank)
    };
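The I/O saving can be illustrated by working out which access groups a scan has to touch. The sketch below hard-codes the column-to-group mapping from the crawldb schema; the function names `crawldb_access_groups` and `groups_to_read` are hypothetical and only model the idea that columns in the same group are stored together.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Column family -> access group, mirroring the crawldb schema on this slide.
std::map<std::string, std::string> crawldb_access_groups() {
    std::map<std::string, std::string> groups;
    groups["Title"] = "default";
    groups["Content"] = "default";
    groups["PageRank"] = "ranking";
    groups["ClickRank"] = "ranking";
    return groups;
}

// Access groups that a scan over the requested columns would have to read.
// Unknown column names are ignored in this sketch.
std::set<std::string> groups_to_read(const std::map<std::string, std::string> &schema,
                                     const std::vector<std::string> &columns) {
    std::set<std::string> needed;
    for (const std::string &column : columns) {
        auto it = schema.find(column);
        if (it != schema.end()) needed.insert(it->second);
    }
    return needed;
}

// Usage: a ranking-only scan never touches the group that holds page content.
void demo() {
    std::vector<std::string> columns;
    columns.push_back("PageRank");
    columns.push_back("ClickRank");
    std::set<std::string> needed = groups_to_read(crawldb_access_groups(), columns);
    assert(needed.size() == 1 && needed.count("ranking") == 1);
}
```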

29 Filesystem Broker Architecture
• Hypertable can run on top of any distributed filesystem (e.g. Hadoop, KFS)

30 Keys To Performance
• C++
• Asynchronous communication

31 C++ vs. Java
• Hypertable is CPU intensive
    • Manages large in-memory key/value map
    • Alternate compression codecs (e.g. BMZ)
• Hypertable is memory intensive
    • Java uses 2-3 times the amount of memory to manage large in-memory map (e.g. TreeMap)
    • Poor processor cache performance

32 Performance Test (AOL Query Logs)
• 75,274,825 inserted cells
• 8-node cluster
    • 1 x 1.8 GHz dual-core Opteron
    • 4 GB RAM
    • 3 x 7200 RPM SATA drives
• Average row key: 7 bytes
• Average value: 15 bytes
• Replication factor: 3
• 4 simultaneous insert clients
• 500K random inserts/s
• 680K scanned cells/s

33 Performance Test II
• Simulated AOL query log data
• 1 TB data
• 9-node cluster
    • 1 x 2.33 GHz quad-core Intel
    • 16 GB RAM
    • 3 x 7200 RPM SATA drives
• Average row key: 9 bytes
• Average value: 18 bytes
• Replication factor: 3
• 4 simultaneous insert clients
• Over 1M random inserts/s (sustained)

34 Weaknesses
• Range data managed by a single range server
    • Though no data loss, can cause periods of unavailability
    • Can be mitigated with client-side cache or memcached

35 Project Status
• Currently in “alpha”
• Just released version 0.9.0.7
• Will release “beta” version end of August
• Waiting on Hadoop JIRA 1700

36 License
• GPL 2.0
• Why not Apache?

37 Questions?
• www.hypertable.org

