Christopher Shain, Software Development Lead, Tresata

- Hadoop for Financial Services
- First completely Hadoop-powered analytics application
- Widely recognized as a "Big Data Startup to Watch"
- Winner of the 2011 NCTA Award for Emerging Tech Company of the Year
- Based in Charlotte, NC
- We are hiring!

- Software Development Lead at Tresata
- Background in Financial Services IT
  - End-User Applications
  - Data Warehousing
  - ETL

- What is HBase?
- From: "HBase is the Hadoop database."
- I think this is a confusing statement

- 'Database', to many, means:
  - Transactions
  - Joins
  - Indexes
- HBase has none of these (more on this later)

- HBase is a data storage platform designed to work hand-in-hand with Hadoop
  - Distributed
  - Failure-tolerant
  - Semi-structured
  - Low latency
  - Strictly consistent
  - HDFS-aware
  - "NoSQL"

- Need for a low-latency, distributed datastore with unlimited horizontal scale
- Hadoop (MapReduce) doesn't provide low latency
- Traditional RDBMSs don't scale out horizontally

- November 2006: Google BigTable whitepaper published
- February 2007: initial HBase prototype
- October 2007: first 'usable' HBase
- January 2008: HBase becomes an Apache subproject of Hadoop
- March 2009: HBase
- May 10th, 2010: HBase becomes an Apache Top Level Project

- Web indexing
- Social graphs
- Messaging (etc.)

- HBase is written almost entirely in Java
- JVM clients are first-class citizens

[Diagram: JVM clients talk directly to the HBase Master and RegionServers; non-JVM clients go through a proxy (Thrift or REST)]

- All data is stored in Tables
- Table rows have exactly one Key, and all rows in a table are physically ordered by key
- Tables have a fixed number of Column Families (more on this later!)
- Each row can have many Columns in each column family
- Each column has a set of values, each with a timestamp
- Each row:family:column:timestamp combination represents the coordinates of a Cell
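The coordinate model above can be sketched as a sparse, sorted map. Here is a toy Python illustration; `put` and `get` are made-up helpers that mirror HBase verbs, not the HBase client API:

```python
# Toy model of HBase's logical data model: a sparse map from
# (row, family:column, timestamp) coordinates to cell values.
table = {}

def put(row, family, column, value, ts):
    table[(row, f"{family}:{column}", ts)] = value

def get(row, family, column):
    """Return the newest cell value at a row/family:column coordinate."""
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if r == row and c == f"{family}:{column}"]
    return max(versions)[1] if versions else None

put("chris", "attributes", "age", "30", ts=1)
put("chris", "attributes", "age", "31", ts=2)
print(get("chris", "attributes", "age"))  # prints 31 -- the newest version wins
```

Missing coordinates simply have no entry, which is why the map is "sparse": rows pay no storage cost for columns they do not use.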

- Defined by the Table
- A Column Family is a group of related columns with its own name
- All columns must be in a column family
- Each row can have a completely different set of columns for a column family

  Row    | Column Family | Columns
  -------|---------------|------------------------------
  Chris  | Friends       | Friends:Bob
  Bob    | Friends       | Friends:Chris, Friends:James
  James  | Friends       | Friends:Bob

- Not exactly the same as rows in a traditional RDBMS
- Key: a byte array (usually a UTF-8 string)
- Data: Cells, qualified by column family, column, and timestamp (not shown here)

  Row Key | Column Families        | Columns                  | Cells
          | (defined by the Table) | (defined by the Row;     | (created with Columns)
          |                        |  may vary between rows)  |
  --------|------------------------|--------------------------|------------------------------
  Chris   | Attributes             | Attributes:Age           | 30
          |                        | Attributes:Height        | 68
          | Friends                | Friends:Bob              | 1 (Bob's a cool guy)
          |                        | Friends:Jane             | 0 (Jane and I don't get along)

- All cells are created with a timestamp
- The column family defines how many versions of a cell to keep
- Updates always create a new cell
- Deletes create a tombstone (more on that later)
- Queries can include an "as-of" timestamp to return point-in-time values

- HBase deletes are a form of write called a "tombstone"
- A tombstone indicates that "beyond this point any previously written value is dead"
- Old values can still be read using point-in-time queries

  Timestamp | Write Type      | Resulting Value | Point-in-Time Value "as of" T+1
  ----------|-----------------|-----------------|--------------------------------
  T + 0     | PUT ("Foo")     | "Foo"           |
  T + 1     | PUT ("Bar")     | "Bar"           |
  T + 2     | DELETE          |                 | "Bar"
  T + 3     | PUT ("Foo Too") | "Foo Too"       | "Bar"
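The tombstone semantics in the table can be modeled for a single cell in a few lines of Python. This is just the idea, not the HBase API: every write, including a delete, is a new timestamped entry, and an "as of" read sees the newest entry at or before that timestamp unless it is a tombstone:

```python
# Model of versioned writes to one cell, with deletes as tombstones.
TOMBSTONE = object()
writes = []  # (timestamp, value) entries; value may be TOMBSTONE

def put(ts, value):
    writes.append((ts, value))

def delete(ts):
    writes.append((ts, TOMBSTONE))  # a delete is just another write

def read_as_of(ts):
    visible = [(t, v) for t, v in writes if t <= ts]
    if not visible:
        return None
    _, newest = max(visible)        # newest entry at or before ts
    return None if newest is TOMBSTONE else newest

put(0, "Foo")
put(1, "Bar")
delete(2)
put(3, "Foo Too")

print(read_as_of(1))  # "Bar"     -- old value still readable
print(read_as_of(2))  # None      -- the tombstone hides it
print(read_as_of(3))  # "Foo Too"
```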

- Requirement: store real-time stock tick data
- Requirement: accommodate many simultaneous readers and writers
- Requirement: allow reading of the current price for any ticker at any point in time

  Ticker | Timestamp | Sequence | Bid | Ask
  -------|-----------|----------|-----|----
  IBM    | 09:15:03  | ...      | ... | ...
  MSFT   | 09:15:04  | ...      | ... | ...
  GOOG   | 09:15:04  | ...      | ... | ...
  IBM    | 09:15:04  | ...      | ... | ...

Historical Prices:

  Keys | Column          | Data Type
  -----|-----------------|----------
  PK   | Ticker          | Varchar
  PK   | Timestamp       | DateTime
  PK   | Sequence_Number | Integer
       | Bid_Price       | Decimal
       | Ask_Price       | Decimal

Latest Prices:

  Keys | Column    | Data Type
  -----|-----------|----------
  PK   | Ticker    | Varchar
       | Bid_Price | Decimal
       | Ask_Price | Decimal

  Row Key                                        | Family:Column
  -----------------------------------------------|---------------
  [Ticker].[Rev_Timestamp].[Rev_Sequence_Number] | Prices:Bid
                                                 | Prices:Ask

- HBase throughput will scale linearly with the number of nodes
- No need to keep a separate "latest price" table
- A scan starting at [Ticker] will always return the latest price row
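The reversed-key trick can be sketched in Python. Subtracting the timestamp and sequence number from a maximum makes newer ticks sort first under plain lexicographic comparison (which is all HBase does with byte keys); the `MAX_*` bounds and zero-padded field widths here are assumptions for illustration:

```python
MAX_TS = 10**13   # assumed upper bound for millisecond timestamps
MAX_SEQ = 10**6   # assumed upper bound for sequence numbers

def row_key(ticker, ts_millis, seq):
    # Zero-pad the reversed fields so lexicographic order matches numeric order.
    return f"{ticker}.{MAX_TS - ts_millis:013d}.{MAX_SEQ - seq:07d}"

ticks = [
    row_key("IBM", ts_millis=1000, seq=1),
    row_key("IBM", ts_millis=2000, seq=1),   # newer tick
    row_key("IBM", ts_millis=2000, seq=2),   # same millisecond, later sequence
]

# A lexicographic sort puts the newest tick first, so a scan starting
# at the "IBM." prefix returns the latest price immediately:
print(sorted(ticks)[0] == row_key("IBM", 2000, 2))  # True
```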

- HBase scales horizontally
- It needs to split data over many RegionServers
- Regions are the unit of scale

- All HBase tables are broken into 1 or more Regions
- Regions have a start row key and an end row key
- Each Region lives on exactly one RegionServer
- RegionServers may host many Regions
- When a RegionServer dies, the Master detects this and assigns its Regions to other RegionServers

.META. table:

  Table | Region                | Server
  ------|-----------------------|-------
  Users | "Aaron" – "George"    | Node01
  Users | "George" – "Matthew"  | Node02
  Users | "Matthew" – "Zachary" | Node01

"Users" table:
  Row keys in Region "Aaron" – "George": "Aaron", "Bob", "Chris"
  Row keys in Region "George" – "Matthew": "George"
  Row keys in Region "Matthew" – "Zachary": "Matthew", "Nancy", "Zachary"
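The lookup a client performs against this mapping is just a sorted search over region start keys: the right region is the last one whose start key is less than or equal to the row key. A rough Python sketch, reusing the example's region boundaries and node names (the `find_region` helper is made up):

```python
import bisect

region_starts = ["Aaron", "George", "Matthew"]  # sorted region start keys
region_servers = {"Aaron": "Node01", "George": "Node02", "Matthew": "Node01"}

def find_region(row_key):
    # Last region whose start key <= row_key.
    i = bisect.bisect_right(region_starts, row_key) - 1
    return region_starts[max(i, 0)]

print(region_servers[find_region("Bob")])    # in "Aaron"-"George"    -> Node01
print(region_servers[find_region("George")]) # in "George"-"Matthew"  -> Node02
print(region_servers[find_region("Nancy")])  # in "Matthew"-"Zachary" -> Node01
```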

Deceptively simple

[Diagram: a ZooKeeper cluster coordinates the active HBase Master and a backup Master; JVM clients connect directly to the Master and RegionServers, while non-JVM clients go through a proxy (Thrift or REST)]

- ZooKeeper
  - Keeps track of which server is the current HBase Master
- HBase Master
  - Keeps track of the Region-to-RegionServer mapping
  - Manages the -ROOT- and .META. tables
  - Responsible for updating ZooKeeper when these change

- RegionServer
  - Stores table Regions
- Clients
  - Need to be smarter than RDBMS clients
  - First connect to ZooKeeper to find the RegionServer for a given Table/Region
  - Then connect directly to the RegionServer to interact with the data
  - All connections are over Hadoop RPC; non-JVM clients use a proxy (Thrift or REST (Stargate))

-ROOT- table:
  Row key: .META.[region]
  Columns: info:regioninfo, info:server, info:serverstartcode
  Points to the server hosting each .META. region.

.META. table:
  Row key: [table],[region start key],[region id]
  Columns: info:regioninfo, info:server, info:serverstartcode
  Points to the server hosting each user-table region.

- The HBase Master is not necessarily a single point of failure (SPOF)
  - Multiple masters can be running
  - The current 'active' Master is controlled via ZooKeeper
  - Make sure you have enough ZooKeeper nodes!
- The Master is not needed for client connectivity
  - Clients connect directly to ZooKeeper to find Regions
  - Everything the Master does can be put off until one is elected

[Diagram: a ZooKeeper quorum of several ZooKeeper nodes tracks the current HBase Master and a standby Master]

- HBase tolerates RegionServer failure when running on HDFS
- Data is replicated by HDFS (the dfs.replication setting)
- There are lots of issues around fsync and failure before data is flushed; some are probably still not fixed
- Thus, data can still be lost if a node fails right after a write
- The HDFS NameNode is still a SPOF, even for HBase

- Similar to the log in many RDBMSs
- By default, all operations are written to the log before being considered 'committed' (can be overridden for 'disposable' fast writes)
- The log can be replayed when a Region is moved to another RegionServer
- One WAL per RegionServer

  Writes -> WAL (flushed periodically, 10s by default)
         -> MemStore -> HFile (flushed when the MemStore gets too big)
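A minimal Python simulation of this write path, log first, then memory, then flush. The three-cell flush threshold is invented to keep the example small (the real default is the 64 MB MemStore size mentioned later):

```python
FLUSH_THRESHOLD = 3  # cells per flush; toy value for illustration

wal = []        # write-ahead log: replayable if the server dies
memstore = {}   # in-memory buffer of recent edits
hfiles = []     # immutable flushed files

def write(key, value):
    wal.append((key, value))   # 1. log first -- the edit survives a crash
    memstore[key] = value      # 2. then apply in memory
    if len(memstore) >= FLUSH_THRESHOLD:
        # 3. flush the MemStore to a new sorted, immutable "HFile"
        hfiles.append(dict(sorted(memstore.items())))
        memstore.clear()

for i in range(4):
    write(f"row{i}", f"v{i}")

print(len(hfiles), len(memstore))  # 1 flushed HFile, 1 edit still in memory
```

Replaying the `wal` list from the start would rebuild the MemStore exactly, which is what happens when a Region moves to another RegionServer.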

[Diagram: a RegionServer hosts Regions; each Region has a Log and one Store per column family, each Store holding a MemStore and StoreFiles (HFiles); HFiles are written through the HDFS client as blocks on HDFS DataNodes]

- A RegionServer is not guaranteed to be on the same physical node as its data
- Compaction causes the RegionServer to write preferentially to the local node
- But this is a function of the HDFS client, not HBase

- All data is in memory initially (the MemStore)
- HBase is a write-only (append-only) system
  - Modifications and deletes are just writes with later timestamps
  - This is a function of HDFS being append-only
- Eventually, old writes need to be discarded
- There are 2 types of compactions:
  - Minor
  - Major

- All HBase edits are initially stored in memory (the MemStore)
- Flushes occur when the MemStore reaches a certain size
  - By default 67,108,864 bytes (64 MB)
  - Controlled by the hbase.hregion.memstore.flush.size configuration property
- Each flush creates a new HFile

Minor compactions:
- Triggered when a certain number of HFiles exist for a given Region Store (plus some other conditions)
  - By default 3 HFiles
  - Controlled by the hbase.hstore.compactionThreshold configuration property
- Compacts the most recent HFiles into one
- By default, uses the RegionServer-local HDFS node
- Does not eliminate deletes
- Only touches the most recent HFiles
- NOTE: all column families are compacted at once (this might change in the future)

Major compactions:
- Triggered every 24 hours (with a random offset), or manually
- Large HBase installations usually leave this to manual operation
- Re-writes all HFiles into one
- Processes deletes:
  - Eliminates tombstones
  - Erases the earlier entries they mask
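The difference between the two compaction types can be sketched like this. It is a deliberately simplified Python model over (key, timestamp, value) cells, where `None` stands in for a tombstone; real HFiles are sorted immutable files on HDFS:

```python
def minor_compact(hfiles):
    # Merge several HFiles into one sorted file; tombstones are KEPT.
    merged = sorted(cell for f in hfiles for cell in f)
    return [merged]

def major_compact(hfiles):
    # Merge everything, then drop tombstones and the entries they mask.
    merged = sorted(cell for f in hfiles for cell in f)
    latest = {}
    for key, ts, value in merged:   # newest entry per key wins
        latest[key] = (ts, value)
    return [[(k, ts, v) for k, (ts, v) in sorted(latest.items())
             if v is not None]]

files = [
    [("a", 1, "x"), ("b", 1, "y")],
    [("a", 2, None)],               # "a" was deleted at ts=2
]
print(minor_compact(files))  # one merged file; the tombstone survives
print(major_compact(files))  # only ("b", 1, "y") remains
```

This is why point-in-time reads of a deleted value keep working after minor compactions but stop working after a major compaction has run.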

- HBase does not have transactions
- However:
  - Row-level modifications are atomic: all modifications to a row succeed or fail as a unit
  - Gets are consistent for a given point in time
    - But Scans may return rows from different points in time
  - All data read has been 'durably stored'
    - This does NOT mean flushed to disk; it can still be lost!

- DO: Design your schema for linear range scans on your most common queries
  - Scans are the most efficient way to query a lot of rows quickly
- DON'T: Use more than 2 or 3 column families
  - Some operations (flushing and compacting) operate on the whole row
- DO: Be aware of the relative cardinality of column families
  - Wildly differing cardinality leads to sparsity and poor scanning performance

- DO: Be mindful of the size of your row and column keys
  - They are used in indexes and queries, and can get quite large!
- DON'T: Use monotonically increasing row keys
  - They can lead to hotspots on writes
- DO: Store timestamp keys in reverse
  - Rows in a table are read in key order, and usually you want the most recent first
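One common workaround for the monotonically-increasing-key hotspot (not from the slides, but a widely used pattern) is to prefix keys with a small deterministic "salt" so consecutive writes spread across regions. The bucket count here is an arbitrary choice for illustration:

```python
import hashlib

NUM_BUCKETS = 4  # arbitrary; roughly match your number of regions

def salted_key(seq_id):
    # Stable hash so the same id always maps to the same bucket.
    bucket = hashlib.md5(str(seq_id).encode()).digest()[0] % NUM_BUCKETS
    return f"{bucket:02d}.{seq_id:012d}"

keys = [salted_key(i) for i in range(1000)]
buckets = {k.split(".", 1)[0] for k in keys}
print(sorted(buckets))  # sequential ids now spread over several key prefixes
```

The trade-off is that a range scan must now fan out across all NUM_BUCKETS prefixes, so salting suits write-heavy tables more than scan-heavy ones.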

- DO: Query single rows using exact-match on key (Gets), or Scans for multiple rows
  - Scans allow efficient I/O vs. multiple Gets
- DON'T: Use regex-based or non-prefix column filters
  - They are very inefficient
- DO: Tune the scan cache and batch size parameters
  - This drastically improves performance when returning lots of rows

Deceptively simple

[Recap: the architecture diagrams from earlier are repeated here -- client architecture (Master, RegionServers, Thrift/REST proxies), the ZooKeeper quorum with current and standby Masters, and RegionServer internals]

- Requirement: store an arbitrary set of preferences for all users
- Requirement: each user may choose to store a different set of preferences
- Requirement: preferences may be of different data types (strings, integers, etc.)
- Requirement: developers will add new preference options all the time, so we shouldn't need to modify the database structure when adding them

- One possible RDBMS solution: a key/value table
- All values stored as strings
- Flexible, but wastes space

  Keys | Column          | Data Type
  -----|-----------------|----------
  PK   | UserID          | Int
  PK   | PreferenceName  | Varchar
       | PreferenceValue | Varchar

- Store all preferences in the Preferences column family
- Preference name as the column name, preference value as a (serialized) byte array
- The HBase client library provides methods for serializing many common data types

  Row Key | Family      | Column    | Value
  --------|-------------|-----------|---------------
  Chris   | Preferences | Age       | 30
          |             | Hometown  | "Mineola, NY"
  Joe     | Preferences | Birthdate | 11/13/1987
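A rough Python sketch of the serialization idea. HBase's actual helper is the Java `Bytes` class; this uses plain `struct` and UTF-8 only to make the "everything is a byte array" point concrete, and the `to_bytes`/`to_int` helpers are made up:

```python
import struct

def to_bytes(value):
    if isinstance(value, int):
        return struct.pack(">q", value)   # 8-byte big-endian long
    return str(value).encode("utf-8")     # strings as UTF-8 bytes

def to_int(raw):
    return struct.unpack(">q", raw)[0]

# Each preference becomes a byte-array cell under the Preferences family;
# the schema never changes when a new preference name appears.
prefs = {
    ("Chris", "Preferences:Age"): to_bytes(30),
    ("Chris", "Preferences:Hometown"): to_bytes("Mineola, NY"),
    ("Joe", "Preferences:Birthdate"): to_bytes("11/13/1987"),
}
print(to_int(prefs[("Chris", "Preferences:Age")]))  # 30
```

Because the table only ever sees bytes, the application (not the database) is responsible for remembering which column holds which type.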