NoSQL W2013 CSCI 2141.



OLTP vs. OLAP
IT systems can be divided into transactional (OLTP) and analytical (OLAP) systems. In general, OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.

Challenges of Scale Differ

A Comparison of SQL and NoSQL Databases
Slides from: Keith W. Hare, Metadata Open Forum
More reading: http://martinfowler.com/articles/nosqlKeyPoints.html

Abstract
NoSQL databases (either no-SQL or Not Only SQL) are currently a hot topic in some parts of computing. In fact, one website lists over a hundred different NoSQL databases. This presentation reviews the features common to the NoSQL databases and compares those features to the features and capabilities of SQL databases.
BIG DATA!
20 April 2017


SQL Characteristics
Data stored in columns and tables
Relationships represented by data
Data Manipulation Language
Data Definition Language
Transactions
Abstraction from physical layer

SQL Physical Layer Abstraction
Applications specify what, not how
Query optimization engine
Physical layer can change without modifying applications
  Create indexes to support queries
  In Memory databases

Data Manipulation Language (DML)
Data manipulated with Select, Insert, Update, & Delete statements
  Select T1.Column1, T2.Column2 …
  From Table1 T1, Table2 T2 …
  Where T1.Column1 = T2.Column1 …
Data Aggregation
Compound statements
Functions and Procedures
Explicit transaction control
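The Select/From/Where pattern on this slide can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the `customers` and `orders` tables and their rows are hypothetical and are not part of the original slides. It joins two tables and aggregates the result.

```python
import sqlite3

# In-memory database; the tables and rows below are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# Declarative query: we state WHAT we want (a join plus an aggregate),
# and the engine decides HOW to retrieve it.
rows = conn.execute("""
    SELECT c.name, COUNT(*) AS n_orders, SUM(o.amount) AS total
    FROM customers c, orders o
    WHERE c.id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for name, n_orders, total in rows:
    print(name, n_orders, total)
```

The query names no index and no access path, which is the "what, not how" property the SQL slides describe.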

Data Definition Language
Schema defined at the start
Create Table Table1 (Column1 Datatype1, Column2 Datatype2, …)
Constraints to define and enforce relationships
  Primary Key
  Foreign Key
  Etc.
Triggers to respond to Insert, Update, & Delete
Stored Modules
Alter …
Drop …
Security and Access Control
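To make "constraints to define and enforce relationships" concrete, here is a small sketch using Python's sqlite3 module. The `departments` and `employees` tables are hypothetical examples. SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` is set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off by default; turn it on.
conn.execute("PRAGMA foreign_keys = ON")

# Schema defined up front (hypothetical tables for illustration).
conn.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employees (
        id      INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES departments(id)
    )
""")

conn.execute("INSERT INTO departments VALUES (1, 'Research')")
conn.execute("INSERT INTO employees VALUES (1, 'Ada', 1)")   # valid reference

# An employee pointing at a non-existent department violates the foreign key.
try:
    conn.execute("INSERT INTO employees VALUES (2, 'Bob', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("invalid row rejected:", rejected)
```

The database, not the application, refuses the inconsistent row. The schema-less stores discussed later typically leave that check to the programmer.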

Transactions: ACID Properties
Atomic: All of the work in a transaction completes (commit) or none of it completes.
Consistent: A transaction transforms the database from one consistent state to another consistent state. Consistency is defined in terms of constraints.
Isolated: The results of any changes made during a transaction are not visible until the transaction has committed.
Durable: The results of a committed transaction survive failures.
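Atomicity and constraint-based consistency can be illustrated with a short sketch using Python's sqlite3 module. The `accounts` table, the `transfer` helper, and the balances are hypothetical. A transfer that would violate the CHECK constraint is rolled back as a whole, including the credit that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency is expressed as a constraint: balances may not go negative.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # debit violates the CHECK constraint
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits bob before the debit fails, yet bob's balance is unchanged afterwards: none of the work in the failed transaction survives.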

NewSQL: more OLTP throughput, real-time analytics
1) SQL as the primary mechanism for application interaction
2) ACID support for transactions
3) A non-locking concurrency control mechanism, so real-time reads will not conflict with writes and thereby cause them to stall
4) An architecture providing much higher per-node performance than available from the traditional "elephants"
5) A scale-out, shared-nothing architecture, capable of running on a large number of nodes without bottlenecking

NoSQL Definition
From www.nosql-database.org:
Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge data amount, and more.

NoSQL Products/Projects
http://www.nosql-database.org/ lists 122 NoSQL Databases
  Cassandra
  CouchDB
  Hadoop & HBase
  MongoDB
  StupidDB
  Etc.


NoSQL Distinguishing Characteristics
Large data volumes
  Google's "big data"
Scalable replication and distribution
  Potentially thousands of machines
  Potentially distributed around the world
Queries need to return answers quickly
Mostly query, few updates
Asynchronous Inserts & Updates
Schema-less
ACID transaction properties are not needed: BASE
CAP Theorem
Open source development

BASE Transactions
Acronym contrived to be the opposite of ACID
  Basically Available,
  Soft state,
  Eventually Consistent
Characteristics
  Weak consistency: stale data OK
  Availability first
  Best effort
  Approximate answers OK
  Aggressive (optimistic)
  Simpler and faster
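"Weak consistency: stale data OK" can be sketched with a toy in-memory model. This is not how any particular product works. The `Replica` and `EventuallyConsistentStore` classes are hypothetical, and replication is reduced to a queue that is drained on demand.

```python
from collections import deque


class Replica:
    """One copy of the data (a plain dict in this toy model)."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Writes land on replica 0 at once and reach other replicas later."""

    def __init__(self, n_replicas=2):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.pending = deque()  # (replica_index, key, value) not yet applied

    def write(self, key, value):
        # Availability first: acknowledge after updating a single replica.
        self.replicas[0].data[key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, key, replica=0):
        return self.replicas[replica].data.get(key)

    def sync(self):
        # Stand-in for asynchronous replication catching up.
        while self.pending:
            i, key, value = self.pending.popleft()
            self.replicas[i].data[key] = value


store = EventuallyConsistentStore()
store.write("x", 1)
stale = store.read("x", replica=1)  # replica 1 has not seen the write yet
store.sync()
fresh = store.read("x", replica=1)  # replicas have converged
print(stale, fresh)
```

Between `write` and `sync`, a reader on replica 1 gets an old answer. That is the "soft state" the acronym refers to, and the replicas agree once propagation finishes.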

Brewer’s CAP Theorem
A distributed system can support only two of the following characteristics:
  Consistency
  Availability
  Partition tolerance

NoSQL Database Types
Discussing NoSQL databases is complicated because there are a variety of types:
  Column Store: Each storage block contains data from only one column
  Document Store: Stores documents made up of tagged elements
  Key-Value Store: Hash table of keys
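The three layouts can be contrasted with plain Python data structures. This is only a rough illustration of how the data is organized, not of how any real product stores it. The sample records and the `user:<id>` key format are hypothetical.

```python
import json

# Hypothetical sample records used only for illustration.
records = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Grace", "city": "Arlington"},
]

# Key-value store: a hash table from a key to an opaque value.
# The store does not interpret the value; here it is just a JSON string.
key_value = {f"user:{r['id']}": json.dumps(r) for r in records}

# Document store: each document is a set of tagged (named) elements,
# and documents need not share the same fields.
documents = {
    1: {"name": "Ada", "city": "London", "tags": ["mathematics"]},
    2: {"name": "Grace", "city": "Arlington"},
}

# Column store: values of one column are kept together, so a scan of
# "city" never touches the "name" values.
column_store = {field: [r[field] for r in records] for field in ("id", "name", "city")}

print(json.loads(key_value["user:1"])["name"])
print(documents[1].get("tags"), documents[2].get("tags"))
print(column_store["city"])
```

The layout decides which operations are cheap: lookup by key for the key-value table, access to individual fields for documents, and scans over a single attribute for the column layout.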

Other Non-SQL Databases
XML Databases
Graph Databases
CODASYL Databases
Object Oriented Databases
Etc…
Will not address these today

Storing and Modifying Data
Syntax varies
  HTML
  JavaScript
  Etc.
Asynchronous: Inserts and updates do not wait for confirmation
Versioned
Optimistic Concurrency
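"Versioned" and "Optimistic Concurrency" usually go together: each item carries a version, and a write succeeds only if the writer saw the latest one. The sketch below is a generic in-memory illustration. The `VersionedStore` class and `VersionConflict` exception are hypothetical names, not the API of any specific database.

```python
class VersionConflict(Exception):
    """Raised when a write is based on an out-of-date version."""


class VersionedStore:
    def __init__(self):
        self._items = {}  # key -> (version, value)

    def get(self, key):
        # Unknown keys are reported as version 0 with no value.
        return self._items.get(key, (0, None))

    def put(self, key, value, expected_version):
        current_version, _ = self.get(key)
        if current_version != expected_version:
            # Someone else wrote first; the caller must re-read and retry.
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._items[key] = (current_version + 1, value)
        return current_version + 1


store = VersionedStore()

# Two clients read the same (empty) item, so both see version 0.
v_a, _ = store.get("cart")
v_b, _ = store.get("cart")

store.put("cart", ["book"], expected_version=v_a)  # client A wins

try:
    store.put("cart", ["pen"], expected_version=v_b)  # client B is stale
    conflict = False
except VersionConflict:
    conflict = True

print(conflict, store.get("cart"))
```

No lock is held while a client works. Conflicts are detected at write time, and the losing client is expected to re-read and retry.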

Retrieving Data
Syntax varies
  No set-based query language
  Procedural program languages such as Java, C, etc.
Application specifies retrieval path
No query optimizer
Quick answer is important
May not be a single "right" answer

Open Source
Small upfront software costs
Suitable for large scale distribution on commodity hardware

NoSQL Summary
NoSQL databases reject:
  Overhead of ACID transactions
  "Complexity" of SQL
  Burden of up-front schema design
  Declarative query expression
  Yesterday's technology
Programmer responsible for:
  Step-by-step procedural language
  Navigating access path

Summary
SQL Databases
  Predefined Schema
  Standard definition and interface language
  Tight consistency
  Well defined semantics
NoSQL Database
  No predefined Schema
  Per-product definition and interface language
  Getting an answer quickly is more important than getting a correct answer

Web References
“NoSQL -- Your Ultimate Guide to the Non-Relational Universe!” http://nosql-database.org/links.html
“NoSQL (RDBMS)” http://en.wikipedia.org/wiki/NoSQL
“Towards Robust Distributed Systems”, PODC Keynote, July 19, 2000. Dr. Eric A. Brewer, Professor, UC Berkeley, Co-Founder & Chief Scientist, Inktomi. www.eecs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
“Brewer's CAP Theorem”, posted by Julian Browne, January 11, 2009. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
“How to write a CV”, Geek & Poke Cartoon. http://geekandpoke.typepad.com/geekandpoke/2011/01/nosql.html

Web References
“Exploring CouchDB: A document-oriented database for Web applications”, Joe Lennon, Software developer, Core International. http://www.ibm.com/developerworks/opensource/library/os-couchdb/index.html
“Graph Databases, NOSQL and Neo4j”, posted by Peter Neubauer on May 12, 2010 at http://www.infoq.com/articles/graph-nosql-neo4j
“Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison”, Kristóf Kovács. http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
“Distinguishing Two Major Types of Column-Stores”, posted by Daniel Abadi on March 29, 2010. http://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html

Web References
“MapReduce: Simplified Data Processing on Large Clusters”, Jeffrey Dean and Sanjay Ghemawat, December 2004. http://labs.google.com/papers/mapreduce.html
“Scalable SQL”, ACM Queue, Michael Rys, April 19, 2011. http://queue.acm.org/detail.cfm?id=1971597
“a practical guide to noSQL”, posted by Denise Miura on March 17, 2011 at http://blogs.marklogic.com/2011/03/17/a-practical-guide-to-nosql/