Published by Bathsheba Hawkins. Modified over 6 years ago.
1
HBase on MapR
Lohit VijayaRenu, MapR Technologies, Inc.
HBase contributor day at Yahoo, June
2
Who am I?
Lohit VijayaRenu, Software Engineer at MapR Technologies
MapR combines the best of the Hadoop community contributions with significant internally financed infrastructure development to provide a complete distribution for Apache Hadoop.
3
HBase on MapR
Backups using Snapshots
Performance on MapR
Highly available MapR
MapR Control System
4
HBase Backups
"We're trying to come up with right strategy for backing up HBase tables ... Currently, we're employing exports (writing onto HDFS of another cluster directly), but is taking too long (~5 hours to export ~5GB of data)..." Manoj Murumkar
"...Recently I encountered a problem about data loss of HBase. So it comes to the question that how to backup HBase data to recover table records... What about copy the directory of HBase to another directory in HDFS?..." Liu Xianglong
Source: hbase-user group
Available options
Export/Import
CopyTable
Distcp
Backup from Mozilla
Cluster Replication
Table Snapshots
Source:
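To put the first quote in perspective, the effective export rate can be worked out from the figures it gives. This is only a back-of-the-envelope sketch: it assumes "~5GB" means 5 × 1024 MB and treats the approximate numbers as exact.

```python
def throughput_mb_per_s(gigabytes: float, hours: float) -> float:
    """Average transfer rate in MB/s, assuming 1 GB = 1024 MB."""
    return gigabytes * 1024 / (hours * 3600)


# Figures from the hbase-user quote: ~5 GB exported in ~5 hours.
export_rate = throughput_mb_per_s(5, 5)
print(f"{export_rate:.2f} MB/s")
```

At well under 1 MB/s, the export is far below what a single disk or network link can sustain, which is why the slide goes on to list alternatives.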
5
MapR Snapshots
Entire /hbase can be snapshotted while HBase is running
Snapshots are consistent
Saves space by sharing blocks
Lightning fast
Zero performance loss on writing to original
Scheduled, or on-demand
REST API for creation and deletion of snapshots
[Diagram: HBase reads and writes /hbase; three snapshots are exposed under /hbase/.snapshot/; MapR redirects on write for snapshots; data blocks A, B, C, C', D]
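The block-sharing idea on this slide can be illustrated with a toy model. This is a minimal sketch, not MapR's implementation: the `Volume` class and its method names are hypothetical, and a Python dict stands in for the block map. It shows a snapshot copying only references, and a later write being redirected to new content (C becomes C') while the snapshot keeps the old block.

```python
class Volume:
    """Toy volume: a block map from block names to block contents."""

    def __init__(self):
        self.live = {}       # writable view: block name -> content
        self.snapshots = {}  # snapshot name -> frozen block map

    def write(self, name, content):
        # Redirect on write: the live map points at the new content,
        # and block maps held by snapshots are left untouched.
        self.live[name] = content

    def snapshot(self, snap_name):
        # Copy only the block map (references), not the block contents.
        self.snapshots[snap_name] = dict(self.live)

    def restore(self, snap_name):
        # Point the live view back at the blocks the snapshot references.
        self.live = dict(self.snapshots[snap_name])


vol = Volume()
for name in "ABCD":
    vol.write(name, f"data-{name}")

vol.snapshot("snap1")
vol.write("C", "data-C'")  # live view now sees C', the snapshot still sees C
```

Because the snapshot holds references rather than copies, unchanged blocks A, B and D are stored once and shared by the live view and the snapshot.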
6
MapR Snapshots
HBase table in DFS
Take snapshot on running HBase
Restore from snapshot
7
MapR Control System
Snapshot information
Snapshot Schedules
All UI operations have REST APIs
More info at
8
MapR Mirroring
Mirror is a physical copy of data
Consistent, point-in-time data replication to a different cluster
Differential deltas are updated
Compressed and check-summed
Scheduled or on-demand
REST API for setup, start and stop mirror
[Diagram: Production (Datacenter 1) mirrored over WAN to Backup (Datacenter 2)]
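The combination of differential deltas, compression and checksums can be sketched in a few lines. This is an illustration under stated assumptions, not MapR's mirroring protocol: blocks are modelled as dict entries, `build_delta` and `apply_delta` are hypothetical helper names, SHA-256 and zlib are stand-ins for whatever checksum and compression the product uses, and deletions are not handled.

```python
import hashlib
import zlib


def digest(data: bytes) -> str:
    """Checksum used to detect changed blocks and verify transfers."""
    return hashlib.sha256(data).hexdigest()


def build_delta(source: dict, mirror: dict) -> dict:
    """Collect compressed payloads for blocks that are new or changed at the source."""
    delta = {}
    for name, data in source.items():
        if name not in mirror or digest(mirror[name]) != digest(data):
            delta[name] = (zlib.compress(data), digest(data))
    return delta


def apply_delta(mirror: dict, delta: dict) -> None:
    """Decompress each block, verify its checksum, then update the mirror."""
    for name, (payload, expected) in delta.items():
        data = zlib.decompress(payload)
        if digest(data) != expected:
            raise ValueError(f"checksum mismatch for block {name}")
        mirror[name] = data


production = {"A": b"a" * 100, "B": b"b" * 100, "C": b"c-new" * 20}
backup = {"A": b"a" * 100, "C": b"c-old" * 20}

delta = build_delta(production, backup)  # only B (new) and C (changed) are sent
apply_delta(backup, delta)
```

Only the changed blocks cross the WAN, which is the point of a differential mirror; unchanged block A is never retransmitted.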
9
HBase performance
"...Initially, when the table was empty I was getting around 300 inserts per second with 50 writing threads. Then, when the region split and a second server was added the rate suddenly jumped to 3000 inserts/sec per server, so ~6000 for the two servers..." Eran Kutner
"...My scenario is similar, we need under 10k rows, columns and which can have thousands of version with value not greater than 300 bytes... Can we get 40-50k records/sec insertion speed in HBase??..." Gaurav Vashishth
Source: hbase-user group
10
YCSB setup
Modified YCSB to use ZooKeeper to have coordinated start.
HMaster and RegionServer running on MapR
YCSB Client running on RS nodes
[Diagram: ZooKeeper; four YCSB clients, each on an RS node; Master; all running on MapR]
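A coordinated start means every client waits until all clients are ready, then begins the workload together. The slides do not show how the modified YCSB does this with ZooKeeper, so the sketch below swaps in Python's `threading.Barrier` as a single-process stand-in for a distributed barrier. The function name `ycsb_client` and the simulated setup delays are assumptions for illustration only.

```python
import threading
import time

NUM_CLIENTS = 4  # the slide's diagram shows four YCSB clients

start_barrier = threading.Barrier(NUM_CLIENTS)
start_times = [None] * NUM_CLIENTS


def ycsb_client(index: int, setup_seconds: float) -> None:
    time.sleep(setup_seconds)              # simulate uneven client setup time
    start_barrier.wait()                   # block until every client is ready
    start_times[index] = time.monotonic()  # the workload would start here


threads = [
    threading.Thread(target=ycsb_client, args=(i, 0.1 * i))
    for i in range(NUM_CLIENTS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Difference between the earliest and latest client start.
spread = max(start_times) - min(start_times)
```

Without the barrier the clients would start up to 0.3 seconds apart in this simulation; with it, the spread is limited to thread wake-up latency, so per-node throughput numbers cover the same time window.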
11
YCSB operations from nodes
YCSB clients doing inserts from all cluster nodes.
Throughput rates were similar from all nodes.
All operations in the cluster completed around the same time.
12
Insert performance
Dataset: 1B rows
Row size: 1K
10 RS, 11 2TB @7200
8 Cores, 24GB RAM, 2Gbps
3 Replication, No compression
[Chart: Insert (one node); axes: Ops, Seconds]
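The test parameters imply a rough on-disk footprint, which helps in reading the insert numbers. The sketch below is an estimate under stated assumptions: "1K" is read as 1024 bytes, and HBase key and column overhead, write-ahead logs and compaction are all ignored. The function name `dataset_footprint` is hypothetical.

```python
ROWS = 1_000_000_000  # "1B rows" from the slide
ROW_BYTES = 1024      # "Row size: 1K", read here as 1024 bytes (assumption)
REPLICATION = 3       # "3 Replication"
REGION_SERVERS = 10   # "10 RS"


def dataset_footprint(rows: int, row_bytes: int, replication: int, servers: int) -> dict:
    """Estimate raw payload size, ignoring HBase storage overhead."""
    logical = rows * row_bytes
    physical = logical * replication
    return {
        "logical_bytes": logical,
        "physical_bytes": physical,
        "physical_bytes_per_server": physical // servers,
    }


footprint = dataset_footprint(ROWS, ROW_BYTES, REPLICATION, REGION_SERVERS)
for key, value in footprint.items():
    print(f"{key}: {value / 1e12:.3f} TB")
```

Under these assumptions the run writes about 1 TB of payload, about 3 TB after replication, or roughly 0.3 TB per RegionServer node.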
13
Read performance
Dataset: 0.9B rows
Row size: 1K
9 RS, 5 500G @7200
8 cores, 24GB RAM, 2Gbps
[Chart: Read (one node); axes: Ops, Seconds]
14
HBase High Availability
"...In HBase 0.90 I have seen that it has a fault tolerant behavior of triggering lease recovery and closing the file when the writer dies in the middle. Yet does hbase have any workaround/recovery when NameNode is restarted in the middle of the file write (possibly the HLog file, after some syncs)???..." Gokulakannan M
Source: hbase-user group
15
MapR High Availability
No single point of failure
Distributed NameNode
Automatic and transparent failover
Better performance
Replicated and persisted to disk
Fully distributed and highly scalable
Real time
[Diagram: HBase on MapR; HBase reads and writes to MapR (No Single Point of Failure); every node in the cluster carries an NN]
16
MapR Heatmap™
Intuitive
Insightful
Comprehensive
One node or thousands
More at
17
Credits
Michael Stack and Ryan Rawson for their valuable feedback.
Brian Cooper and Adam Silberstein for their help with YCSB.
Active and helpful HBase community.
More Information
http://www.mapr.com
Follow
Download and try from