Christian Stark and Odbayar Badamjav


NoSQL: Dynamic DB
Christian Stark and Odbayar Badamjav

History of NoSQL
Eric Evans of Rackspace, a committer on the Cassandra project, is credited with reintroducing the term NoSQL in 2009.
Amazon released its research paper on Dynamo in 2007.
MongoDB started in 2007 as part of an open source cloud platform.
Facebook released Cassandra as open source in 2008 (the project is now maintained by Apache).
The advent of distributed and parallel computing gave rise to alternatives to relational databases.
As cloud computing became more affordable and mainstream, homegrown NoSQL databases went open source.

What is NoSQL?

What is NoSQL?
Not to be confused with the NoSQL database system (an RDBMS).
"NoSQL", or "not only SQL", is a class of databases that is broader than SQL-based databases.
It covers data models such as key/value, document, tabular, and graph stores, which relational databases do not offer natively.

More about NoSQL
It does not have a fixed data model or predefined schema.
It is not necessarily a one-size-fits-all solution.
It leaves room for more tailored solutions on a per-application basis.
There are many different types of implementations.
SQL term -> NoSQL term (document stores)
  Tables -> Collections
  Rows -> Documents
  Columns -> Fields
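To make the "no predefined schema" point concrete, here is a minimal sketch in plain Python. It is not a real database API: the `users` list stands in for a collection, each dict stands in for a document, and the `find` helper is a hypothetical name introduced only for this illustration.

```python
# A "collection" modeled as a plain list of dict "documents".
# This is an illustrative stand-in, not the API of any real NoSQL database.
users = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    # This document has no "email" field and adds a "languages" field.
    {"_id": 2, "name": "Grace", "languages": ["COBOL", "FORTRAN"]},
]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# A new field ("os") appears on one document without any schema migration.
users.append({"_id": 3, "name": "Linus", "email": "linus@example.com", "os": "Linux"})

by_name = find(users, name="Grace")
with_os = find(users, os="Linux")
```

In a relational table, adding the `os` column would normally require altering the table definition for every row; here only the document that needs the field carries it.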

Why use NoSQL?
Usually favorable in distributed, large data settings
High availability
High fault tolerance
Open source
Easy to implement
Wide variety of solutions, both as a service and as different implementations
The appeal of NoSQL is that it handles large quantities of data quickly across a cluster of servers that share resources, making it both fast and reliable. Because many implementations are open source, costs stay down, and they are often easier to use than conventional databases.

Brewer's CAP Theorem
Consistency
Availability
Partition Tolerance
The theorem states that a distributed database system cannot guarantee all three of these properties at once. At most two can be fully provided, at the cost of compromising the third.
For most NoSQL systems, the trade-off is giving up strict consistency in exchange for partition tolerance and availability.
Most solutions lie somewhere on a continuum between ACID and BASE.

Brewer CAP Theorem

ACID or BASE?
BASE (NoSQL)
  Basically Available -- The system can be expected to give a timely response.
  Soft state -- What the database replies is good enough for now.
  Eventual consistency -- Data converges over time; code is written to handle each type of inconsistency as it is discovered.
ACID (RDBMS)
  Atomicity -- All or nothing, per transaction.
  Consistency -- Transactions leave the database in a consistent state.
  Isolation -- One transaction does not interfere with another.
  Durability -- Committed transactions survive restarts and other interruptions in the database engine.

Amazon Stock Quantity Example
If two users place the same item in their carts and purchase it within a short period of time when only one unit is left, what should happen?
An RDBMS would enforce consistency, a process that can take longer than a customer will tolerate. If it did answer in time, it would report success to one customer and failure to the other, and possibly offer to backorder the item for the second customer.

Continued Amazon Example
A NoSQL/eventual consistency approach would most likely accept both purchases. When the system later discovers the inconsistency in the data, it alerts the customer who ordered last that the item has been placed on backorder.
Companies have found that delays in completing these types of transactions can carry severe penalties for future traffic.
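The eventual-consistency flow in this example can be sketched as a small simulation. This is a toy model under stated assumptions, not how Amazon or any particular database implements it: the `Inventory` class, `place_order`, and `reconcile` are hypothetical names, and "ordered last" is modeled simply as position in an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Toy model: orders are accepted first and checked against stock later."""
    stock: int
    orders: list = field(default_factory=list)  # accepted orders, in arrival order

    def place_order(self, customer: str) -> str:
        # Fast path: accept immediately, with no stock check or locking.
        self.orders.append(customer)
        return "accepted"

    def reconcile(self):
        """Background step: detect overselling and split the orders.

        The earliest orders are fulfilled from the available stock;
        any later orders are moved to backorder.
        """
        fulfilled = self.orders[:self.stock]
        backordered = self.orders[self.stock:]
        return fulfilled, backordered


inv = Inventory(stock=1)
first = inv.place_order("alice")   # both customers get an immediate answer
second = inv.place_order("bob")
fulfilled, backordered = inv.reconcile()
```

Both calls return right away, which is the availability benefit. The price is that the second customer only learns about the backorder after reconciliation runs.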

Real world usage
What companies or organizations use NoSQL?
Google (BigTable)
eBay (Hadoop)
Amazon (Dynamo)
Twitter (FlockDB, a graph-type db, and Cassandra)
Yahoo (Hadoop)
Facebook (Hadoop)
Craigslist (MongoDB)
Netflix (Apache's Cassandra)
Many companies use NoSQL and RDBMS together for different parts of applications.

How it works: A look at MongoDB
Features:
Dynamic schemas, JSON-style documents
Full indexing for all fields/attributes
Scales horizontally
Fast, in-place updates to data
GridFS: stores files of any size, distributed
Ad-hoc queries allow for dynamic queries that are similar to those of an RDBMS
"Sharding" -- Automatic partitioning of data across servers, for scaling and load balancing
Official drivers exist for Java, Ruby, PHP, Perl, C, C++, Erlang, Haskell, JavaScript, Python, and Scala. Many community-supported drivers are also available.
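As a rough illustration of the idea behind sharding, the sketch below routes documents to one of several in-memory "shards" by hashing a key. It is an assumption-laden toy, not MongoDB's actual mechanism (which manages data in chunks and rebalances them automatically): `NUM_SHARDS`, `shard_for`, `insert`, `find_by_user`, and the `user_id` shard key are all hypothetical names chosen for this example.

```python
import hashlib

NUM_SHARDS = 3  # assumed cluster size, for illustration only
shards = [[] for _ in range(NUM_SHARDS)]  # each list stands in for one server


def shard_for(key: str) -> int:
    """Map a shard key to a shard index with a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


def insert(doc: dict) -> int:
    """Store the document on the shard chosen by its user_id."""
    index = shard_for(doc["user_id"])
    shards[index].append(doc)
    return index


def find_by_user(user_id: str) -> list:
    """Query only the one shard that can hold this user_id."""
    return [d for d in shards[shard_for(user_id)] if d["user_id"] == user_id]


for i in range(9):
    insert({"user_id": f"user{i}", "visits": i})

result = find_by_user("user4")
```

Because the same key always hashes to the same shard, a lookup by `user_id` touches one server instead of all of them, which is what lets capacity grow by adding machines.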

Database comparison: CouchDB, MongoDB, MySQL
Data Model
  CouchDB: Document-oriented (JSON)
  MongoDB: Document-oriented (BSON)
  MySQL: Relational
Data Types
  CouchDB: string, number, boolean, array, object
  MongoDB: string, int, double, boolean, date, bytearray, object, array, others
  MySQL: link
Large Objects (Files)
  CouchDB: Yes (attachments)
  MongoDB: Yes (GridFS)
  MySQL: Blobs
Horizontal Partitioning Scheme
  CouchDB: CouchDB Lounge
  MongoDB: Auto-sharding
  MySQL: Partitioning
Replication
  CouchDB: Master-master (with developer-supplied conflict resolution)
  MongoDB: Master-slave and replica sets
  MySQL: Master-slave, multi-master, and circular replication
Object (Row) Storage
  CouchDB: One large repository
  MongoDB: Collection-based
  MySQL: Table-based
Query Method
  CouchDB: Map/reduce of JavaScript functions to lazily build an index per query
  MongoDB: Dynamic; object-based query language
  MySQL: Dynamic; SQL
Secondary Indexes
  Yes

Database comparison (continued): CouchDB, MongoDB, MySQL
Interface
  CouchDB: REST
  MongoDB: Native drivers; REST add-on
  MySQL: Native drivers
Server-side Batch Data Manipulation
  CouchDB: ?
  MongoDB: Map/Reduce, server-side JavaScript
  MySQL: Yes (SQL)
Written in
  CouchDB: Erlang
  MongoDB: C++
Concurrency Control
  CouchDB: MVCC
  MongoDB: Update in place
Geospatial Indexes
  CouchDB: GeoCouch
  MongoDB: Yes
  MySQL: Spatial extensions
Distributed Consistency Model
  CouchDB: Eventually consistent (master-master replication with versioning and version reconciliation)
  MongoDB: Strong consistency. Eventually consistent reads from secondaries are available.
Atomicity
  CouchDB: Single document
  MongoDB: Single document
  MySQL: Yes - advanced

Available Hosted Services
Amazon's DynamoDB -- pay for what you use
Amazon's SimpleDB -- pay for what you use
MongoLab (MongoDB) -- free plan
IrisCouch (CouchDB) -- free for modest use
Cloudant (CouchDB) -- free plan
Many more are available with free starter plans as well. Amazon and most others let you pay only for what you use, which saves costs.

NoSQL Projects
Apache's CouchDB -- Document-store type, incremental replication with "bi-directional conflict detection and resolution"
Apache's Cassandra -- Linear scalability and high availability
MongoDB -- "scalable, high performance open source, NoSQL database"
Apache's HBase -- Sits on Hadoop/HDFS
Redis -- In-memory, distributed key/value store, with optional persistence
Google's BigTable -- Available as Google App Engine Datastore; tabular (3-dimensional mapping)

Conclusion
NoSQL can be a valuable tool for large, distributed data sets that need to scale and sustain high read/write throughput.
NoSQL is not a replacement for RDBMS, but a supplement to it. Many applications use both, for different use cases.
NoSQL can be a simple, resilient database that is easy to deploy.
More information available. See Wikipedia.