A free and open-source distributed NoSQL database

Slides:

Advertisements

Similar presentations

CASSANDRA-A Decentralized Structured Storage System Presented By Sadhana Kuthuru.

Advertisements

Large Scale Computing Systems

Data Management in the Cloud Paul Szerlip. The rise of data Think about this o For the past two decades, the largest generator of data was humans -- now.

NoSQL Databases: MongoDB vs Cassandra

 Need for a new processing platform (BigData)  Origin of Hadoop  What is Hadoop & what it is not ?  Hadoop architecture  Hadoop components (Common/HDFS/MapReduce)

NoSQL Database.

Inexpensive Scalable Information Access Many Internet applications need to access data for millions of concurrent users Relational DBMS technology cannot.

Cloud Storage: All your data belongs to us! Theo Benson This slide includes images from the Megastore and the Cassandra papers/conference slides.

Massively Parallel Cloud Data Storage Systems S. Sudarshan IIT Bombay.

A Study in NoSQL & Distributed Database Systems John Hawkins.

1 Yasin N. Silva Arizona State University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Distributed Data Stores and No SQL Databases S. Sudarshan IIT Bombay.

Databases with Scalable capabilities Presented by Mike Trischetta.

AN INTRODUCTION TO NOSQL DATABASES Karol Rástočný, Eduard Kuric.

Zhang Gang Big data High scalability One time write, multi times read …….(to be add )

Distributed Data Stores and No SQL Databases S. Sudarshan Perry Hoekstra (Perficient) with slides pinched from various sources such as Perry Hoekstra (Perficient)

Modern Databases NoSQL and NewSQL Willem Visser RW334.

High Throughput Computing on P2P Networks Carlos Pérez Miguel

LOGO Discussion Zhang Gang 2012/11/8. Discussion Progress on HBase 1 Cassandra or HBase 2.

Apache Cassandra - Distributed Database Management System Presented by Jayesh Kawli.

The Lightning Way XIV Encontro da comunidade SQLPort LX

Changwon Nati Univ. ISIE 2001 CSCI5708 NoSQL looks to become the database of the Internet By Lawrence Latif Wed Dec Nhu Nguyen and Phai Hoang CSCI.

NoSQL Databases Oracle - Berkeley DB Rasanjalee DM Smriti J CSC 8711 Instructor: Dr. Raj Sunderraman.

NoSQL Databases Oracle - Berkeley DB. Content A brief intro to NoSQL About Berkeley Db About our application.

Introduction to Hbase. Agenda  What is Hbase  About RDBMS  Overview of Hbase  Why Hbase instead of RDBMS  Architecture of Hbase  Hbase interface.

CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.

NoSQL Or Peles. What is NoSQL A collection of various technologies meant to work around RDBMS limitations (mostly performance) Not much of a definition...

NOSQL DATABASE Not Only SQL DATABASE

HDB++: High Availability with

{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.

Data and Information Systems Laboratory University of Illinois Urbana-Champaign Data Mining Meeting Mar, From SQL to NoSQL Xiao Yu Mar 2012.

Introduction to NoSQL Databases Chyngyz Omurov Osman Tursun Ceng,Middle East Technical University.

Department of Computer Science, Johns Hopkins University EN Instructor: Randal Burns 24 September 2013 NoSQL Data Models and Systems.

BIG DATA/ Hadoop Interview Questions.

Amirhossein Saberi May CASSANDRA NAME A daughter of the Trojan king Priam, who was given the gift of prophecy by Apollo. When she cheated him, however,

1 Gaurav Kohli Xebia Breaking with DBMS and Dating with Relational Hbase.

NoSQL: Graph Databases

Neo4j: GRAPH DATABASE 27 March, 2017

Plan for Final Lecture What you may expect to be asked in the Exam?

CS 405G: Introduction to Database Systems

NO SQL for SQL DBA Dilip Nayak & Dan Hess.

NoSQL: Graph Databases

and Big Data Storage Systems

Cloud Computing and Architecuture

Hadoop Aakash Kag What Why How 1.

Introduction to Distributed Platforms

How did it start? • At Google • • • • Lots of semi structured data

CS122B: Projects in Databases and Web Applications Winter 2017

MongoDB Er. Shiva K. Shrestha ME Computer, NCIT

NoSQL Database and Application

Azure Cosmos DB Venitta J Microsoft Connect /6/2018 4:36 PM

Modern Databases NoSQL and NewSQL

The NoSQL Column Store used by Facebook

Dineesha Suraweera.

Christian Stark and Odbayar Badamjav

Introduction to NewSQL

NOSQL databases and Big Data Storage Systems

A Comparison of SQL and NoSQL Databases

NoSQL Systems Overview (as of November 2011).

Storage Systems for Managing Voluminous Data

Massively Parallel Cloud Data Storage Systems

1 Demand of your DB is changing Presented By: Ashwani Kumar

NoSQL Databases An Overview

NoSQL Databases Antonino Virgillito.

Overview of big data tools

NoSQL W2013 CSCI 2141.

CSE 482 Lecture 5: NoSQL.

Transaction Properties: ACID vs. BASE

NoSQL databases An introduction and comparison between Mongodb and Mysql document store.

Presentation transcript:

A free and open-source distributed NoSQL database Computer Clusters and Grids Prof. Andrey Shevel A free and open-source distributed NoSQL database Carl Kugblenu ITMO University – St. Petersbury

What is NoSQL? Stands for Not Only SQL Class of non-relational data storage systems Usually do not require a fixed table schema nor do they use the concept of joins -> NoSQL was a term coined by Eric Evans. He states that ‘… but the whole point of seeking alternatives is that you need to solve a problem that relational databases are a bad fit for. … -> This is why people are continually interpreting nosql to be anti-RDBMS, it's the only rational conclusion when the only thing some of these projects share in common is that they are not relational databases.’ -> Emil Elfrem stated it is not a ‘NO’ SQL but more of a ‘Not Only” SQL.

Distributed Database

Eventual Consistency When no updates occur for a long period of time, eventually all updates will propagate through the system and all the nodes will be consistent Known as BASE (Basically Available, Soft state, Eventual consistency), as opposed to ACID -> http://en.wikipedia.org/wiki/Eventual_consistency -> http://www.allthingsdistributed.com/2008/12/eventually_consistent.html The types of large systems based on CAP aren't ACID they are BASE (http://queue.acm.org/detail.cfm?id=1394128): * Basically Available - system seems to work all the time * Soft State - it doesn't have to be consistent all the time * Eventually Consistent - becomes consistent at some later time Everyone who builds big applications builds them on CAP and BASE: Google, Yahoo, Facebook, Amazon, eBay, etc

Cassandra Originally developed at Facebook Follows the BigTable data model: column-oriented Uses the Dynamo Eventual Consistency model Written in Java Open-sourced and exists within the Apache family

Why Use Cassandra

Features Fault Tolerant Architecture: Cassandra has no single point of failure. Every node in the cluster is identical and can run on commodity hardware. Performant: Cassandra supports very high throughput. In particular the Cassandra write path allows for high performance, robust writes. Resilience: Data is automatically replicated to multiple nodes Scalable: Cassandra is horizontally scalable. Flexible data storage: Cassandra accommodates all possible data formats including: structured, semi-structured, and unstructured.

Architecture presentation title

Cassandra: Requests Flow presentation title

Internode Communication Cassandra uses Gossip communication protocol in which nodes periodically exchange state information about themselves and other nodes they know about Gossip messages are versioned During a gossip exchange, older information is overwritten with the most current state for a particular node

Data Model

Advantages Write Speeds Multi-DC Replication Tunable Consistency

Drawbacks No Ad-Hoc Queries No Aggregations Unpredictable Performance JVM Based

Use Cases Immutable data where updates and deletions are the exception rather than the rule. High-throughput firehoses of immutable events such as personalization, fraud detection, time series and IOT/sensors.

Demo Deployment

References http://cassandra.apache.org/ http://www.datastax.com/dev/blog/multi-datacenter-replication https://www.edureka.co/blog/apache-cassandra-advantages/ https://en.wikipedia.org/wiki/Apache_Cassandra https://www.digitalocean.com/community/tutorials/how-to-install-cassandra-and-run-a-single-node-cluster-on-ubuntu-14-04 https://opencredo.com/fulfilling-promise-apache-cassandra/ https://www.quora.com/What-are-the-pros-and-cons-of-using-the-Cassandra-database