MongoDB Replica, Shard Cluster. National Central University Computer Center, 楊素秋, 2014-05-05.

Presentation transcript:

MongoDB Replica, Shard Cluster. National Central University Computer Center, 楊素秋

OUTLINE
1. MongoDB Replica
2. Deploy a Replica Set
3. Sharded Cluster
4. Deploy a Sharded Cluster
5. Conclusion

1. MongoDB Replica
Provides redundancy
–protects a database from the loss of a single server
Increases data availability
–recovers from hardware failures and service interruptions

2. Deploy a Replica Set
Install and start MongoDB on each host
–configure /etc/mongod.conf
–service mongod restart
On the primary host ( )
–rs.initiate()
–rs.add(" :27017")
–rs.add(" :27017")
–rs.addArb(" :27017")

2. Deploy a Replica Set (cont.)
Priority
–cfg = rs.conf()
–cfg.members[0].priority = 2.0
–cfg.members[1].priority = 0.5
–rs.reconfig(cfg)
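The effect of the priority values above can be illustrated with a small Java sketch. It assumes a heavily simplified rule: among reachable members with a priority above 0, the one with the highest priority is preferred as primary. MongoDB's real election also depends on factors such as voting majorities and how up to date each member is, so this is only a toy model. The class name PrioritySketch and the host names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified illustration only; this is NOT MongoDB's actual election protocol.
class PrioritySketch {
    static final class Member {
        final String host;        // hypothetical host name
        final double priority;    // mirrors cfg.members[i].priority
        final boolean reachable;  // whether the member is currently up

        Member(String host, double priority, boolean reachable) {
            this.host = host;
            this.priority = priority;
            this.reachable = reachable;
        }
    }

    // Returns the reachable member with the highest non-zero priority, if any.
    static Optional<String> preferredPrimary(List<Member> members) {
        return members.stream()
                .filter(m -> m.reachable && m.priority > 0)
                .max(Comparator.comparingDouble((Member m) -> m.priority))
                .map(m -> m.host);
    }

    public static void main(String[] args) {
        List<Member> members = Arrays.asList(
                new Member("hostA:27017", 2.0, true),
                new Member("hostB:27017", 0.5, true));
        System.out.println(preferredPrimary(members).orElse("none"));
    }
}
```

With priorities 2.0 and 0.5, the sketch prefers the first member while it is reachable and falls back to the second when it is not. A member with priority 0 is never selected.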

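2. Deploy a Replica Set (cont.)
Slave Node
–MongoDBManager.java

    import com.mongodb.DB;
    import com.mongodb.Mongo;

    public class MongoDBManager {
        // Shared client, created lazily on first use.
        private static Mongo mongo;

        // Returns the "fdns" database from the shared client.
        public static synchronized DB getDB() throws Exception {
            return getMongo().getDB("fdns");
        }

        public static synchronized Mongo getMongo() throws Exception {
            if (mongo == null) {
                mongo = new Mongo();
                mongo.slaveOk(); // allow reads from secondary (slave) members
            }
            return mongo;
        }
    }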

2. Deploy a Replica Set (cont.)
Slave Node
–mongo shell
  use fdns
  rs.slaveOk()
  show collections
–mongo shell
  db.collectionName.remove() // FAIL
  db.collectionName.drop()

3. Sharded Cluster
Single machine challenges
–High query rates exhaust CPU capacity
–Larger data sets exceed the storage capacity

3. Sharded Cluster (cont.)
Sharded Cluster Components
–Shards
  each shard holds a subset of a collection's data
  each shard is a single mongod instance or a replica set
–Config Servers
  each config server is a mongod instance
  –holds metadata about the cluster
  –the metadata maps chunks to shards

3. Sharded Cluster (cont.)
–Routing Instances
  each routing instance is a mongos instance
  –routes the reads and writes from applications to the shards
  Applications do not access the shards directly

3. Sharded Cluster (cont.)
Vertical scaling
–adds more CPU and storage resources to a single server to increase capacity
Horizontal scaling (sharding)
–divides the data set and distributes the data over multiple servers (shards)
–each shard is an independent database
–together, the shards make up a single logical database

Range based partitioning
–MongoDB divides the data set into ranges determined by the shard key values
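Range based partitioning can be pictured as a sorted table of range boundaries. The Java sketch below is a toy model, not MongoDB code: it assumes an integer shard key, and the class name RangePartitionSketch, the boundaries, and the shard names are all made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of range based partitioning; boundaries and shard names are hypothetical.
class RangePartitionSketch {
    // Inclusive lower bound of each range -> shard that holds the range.
    private final TreeMap<Integer, String> lowerBoundToShard = new TreeMap<>();

    void addRange(int lowerBoundInclusive, String shard) {
        lowerBoundToShard.put(lowerBoundInclusive, shard);
    }

    // Finds the range with the greatest lower bound that is <= shardKey.
    String shardFor(int shardKey) {
        Map.Entry<Integer, String> range = lowerBoundToShard.floorEntry(shardKey);
        if (range == null) {
            throw new IllegalArgumentException("no range covers key " + shardKey);
        }
        return range.getValue();
    }

    public static void main(String[] args) {
        RangePartitionSketch ranges = new RangePartitionSketch();
        ranges.addRange(Integer.MIN_VALUE, "shardA"); // keys below 100
        ranges.addRange(100, "shardB");               // keys 100..199
        ranges.addRange(200, "shardC");               // keys 200 and above
        System.out.println(ranges.shardFor(25));
        System.out.println(ranges.shardFor(26));
        System.out.println(ranges.shardFor(150));
    }
}
```

In this model, keys that are close together (such as 25 and 26) fall into the same range, which is what makes range queries on the shard key cheap to route.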

Hash Based Sharding
MongoDB computes a hash of a field's value
–uses these hashes to create chunks
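The idea of hashing a field value and assigning the hash to a chunk can be sketched in Java as follows. This is an illustration under stated assumptions, not MongoDB's implementation: it takes the first 64 bits of an MD5 digest as the hash and splits the unsigned 64-bit hash space into equal-width chunks. The class name HashShardSketch and the chunk count are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Toy model of hash based sharding; hash choice and chunk layout are assumptions.
class HashShardSketch {
    // Hashes the field value and keeps the first 8 bytes of the digest as a long.
    static long hashOf(String fieldValue) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(fieldValue.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Maps a hash to a chunk by cutting the unsigned 64-bit space into equal ranges.
    static int chunkFor(String fieldValue, int numChunks) {
        if (numChunks < 1) {
            throw new IllegalArgumentException("numChunks must be at least 1");
        }
        if (numChunks == 1) {
            return 0;
        }
        long chunkWidth = Long.divideUnsigned(-1L, numChunks) + 1;
        return (int) Long.divideUnsigned(hashOf(fieldValue), chunkWidth);
    }

    public static void main(String[] args) {
        for (int key = 25; key <= 28; key++) {
            System.out.println(key + " -> chunk " + chunkFor(Integer.toString(key), 4));
        }
    }
}
```

Because the chunk is chosen from the hash rather than from the value itself, adjacent field values are generally scattered across chunks instead of being stored together.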

4. Deploy a Shard Cluster – y-shard-cluster/

5. Conclusion
Replica Set
–Primary: service
–Secondary: data mining
  Apache Mahout: clustering, classification
  Mining flooding and attack traffic
–Arbiter
Sharded Cluster
–Load balancing
–Scalability