Non-traditional Databases


DBMAN 10: Non-traditional Databases
SzaboZs

Non-Traditional = schema-less
- Many implementations, each working very differently and serving a specific need
- Common aims: less complex storage, more flexibility, faster access, server clustering
- Usually also: arbitrary types of relations, or even relation-less structures
- NoSQL = Not SQL = Not Only SQL

Comparison
- Structure and type of data being kept: SQL/relational databases require a structure with defined attributes to hold the data, unlike NoSQL databases, which usually allow free-flow operations.
- Querying: Regardless of their licences, relational databases all implement the SQL standard to a certain degree, so they can be queried using the Structured Query Language (SQL). NoSQL databases, on the other hand, each implement a unique way to work with the data they manage.

Comparison
- Scaling: Both solutions are easy to scale vertically (i.e. by increasing system resources). However, being more modern (and simpler) applications, NoSQL solutions usually offer much easier means to scale horizontally (i.e. by creating a cluster of multiple machines).
- Reliability: When it comes to data reliability and guaranteed safe execution of transactions, SQL databases are still the better bet.

Comparison
- Support: Relational database management systems have a decades-long history. They are extremely popular, and it is very easy to find both free and paid support. If an issue arises, it is therefore much easier to solve than with the more recently popular NoSQL databases, especially if the solution in question is complex in nature (e.g. MongoDB).
- Complex data keeping and querying needs: By nature, relational databases are the go-to solution for complex querying and data keeping needs. They are much more efficient and excel in this domain.

NoSQL advantages: “BIG DATA” (VVVC = Velocity, Variety, Volume, Complexity)
- High data velocity: lots of data coming in very quickly, possibly from different locations.
- Data variety: storage of data that is structured, semi-structured and unstructured.
- Data volume: data that involves many terabytes or petabytes in size.
- Data complexity: data that is stored and managed in different locations or data centers.

Typical “BIG DATA” use cases
- Flexible data models
  - A NoSQL database is able to accept all types of data (structured, semi-structured, and unstructured) much more easily than a relational database, which relies on a predefined schema
  - Faster operations on semi-structured / unstructured data
- Analytics and business intelligence
  - Ability to mine the data that is being collected
  - “Real-time data warehousing”: no need for a middle layer, no GROUP BY and JOIN queries

Typical “BIG DATA” requirements: CAP
- Modern transactional capabilities (vs ACID!)
- Consistency: all nodes see the same data at the same time (not the same as the “C” in ACID)
- Availability: a guarantee that every request receives a response, even if it is not up to date
- Partition tolerance: the system continues to operate despite arbitrary partitioning due to network failures
- P is a minimum requirement, so in case of a communication error (partition) we usually must choose between A and C
- A common simplification: CA without P = RDBMS; distributed NoSQL systems choose CP or AP, because C, A and P cannot all be guaranteed during a partition
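
The A-versus-C choice during a partition can be sketched with a toy model. This is only an illustration, not how any particular database implements it: the class `TwoNodeStore` and the mode labels "CP" / "AP" are hypothetical names invented for this sketch.

```javascript
// Toy model of one value replicated on two nodes; not a real database.
// "CP" and "AP" are hypothetical mode labels used only in this sketch.
class TwoNodeStore {
  constructor(mode) {
    this.mode = mode;
    this.nodes = [{ value: 0 }, { value: 0 }];
    this.partitioned = false; // true = the two nodes cannot reach each other
  }

  // Write through node `i`; returns true if the write was accepted.
  write(i, value) {
    if (!this.partitioned) {
      // Healthy network: replicate the write to both nodes.
      this.nodes.forEach((n) => { n.value = value; });
      return true;
    }
    if (this.mode === "CP") {
      return false; // refuse the write: stay consistent, give up availability
    }
    this.nodes[i].value = value; // accept locally: stay available, give up consistency
    return true;
  }

  read(i) {
    return this.nodes[i].value;
  }
}

const cpStore = new TwoNodeStore("CP");
const apStore = new TwoNodeStore("AP");
cpStore.partitioned = true;
apStore.partitioned = true;

const cpAccepted = cpStore.write(0, 42); // rejected, both nodes still agree
const apAccepted = apStore.write(0, 42); // accepted, nodes now disagree
console.log(cpAccepted, cpStore.read(0), cpStore.read(1));
console.log(apAccepted, apStore.read(0), apStore.read(1));
```

In the AP case a real system would also need some way to reconcile the two nodes once the partition heals; that step is left out here.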

Typical “BIG DATA” requirements: PACELC
- In case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C)
- Else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C)
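
The "else" half of PACELC (latency versus consistency without any partition) can be illustrated with a read that contacts a configurable number of replicas. The latencies and version numbers below are made-up values, and `readFromFastest` is a hypothetical helper, not an API of any database.

```javascript
// Each replica has a round-trip latency and the version of the data it holds.
// All numbers are invented for illustration.
const replicas = [
  { latencyMs: 5, version: 7 },   // nearby replica that lags behind
  { latencyMs: 40, version: 9 },
  { latencyMs: 120, version: 9 }, // far-away replica
];

// Contact the `count` fastest replicas in parallel and return the newest
// version seen, together with how long the slowest contacted replica took.
function readFromFastest(allReplicas, count) {
  const contacted = [...allReplicas]
    .sort((a, b) => a.latencyMs - b.latencyMs)
    .slice(0, count);
  return {
    version: Math.max(...contacted.map((r) => r.version)),
    waitedMs: Math.max(...contacted.map((r) => r.latencyMs)),
  };
}

const fastRead = readFromFastest(replicas, 1); // low latency, possibly stale
const safeRead = readFromFastest(replicas, 3); // newest data, higher latency
console.log(fastRead, safeRead);
```

Reading from one replica returns quickly but may miss the latest version; reading from all of them sees the latest version but waits for the slowest node.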

NoSQL types
- Key-value
- Graph database
- Wide column
- Document storage
- (Search engines)
- LOTS of various implementations, with extremely rapidly changing features and applications ...
- Especially with document databases and the chaotic world of JS in the past 2-3 years: “How it feels to learn JavaScript in 2016” https://hackernoon.com/how-it-feels-to-learn-javascript-in-2016-d3a717dd577f

NoSQL types: key-value store
- Least complex, like Dictionary<string, string> in C#
- Memcached vs MemcacheDB (cache vs storage), Redis
- Oracle NoSQL Database (eventually consistent)
- When to use
  - Caching
  - Queueing: distributing information / tasks
  - Keeping live information
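
The caching use case can be sketched as a dictionary whose entries expire. This is a minimal in-memory illustration, not the API of Memcached or Redis; `TtlCache` and the key name are hypothetical, and the clock is injected so the example runs deterministically.

```javascript
// Minimal in-memory key-value cache with per-entry time-to-live.
// `now` is an injectable clock so the behaviour can be shown without waiting.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.entries = new Map();
    this.now = now;
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
}

let fakeTime = 0;
const cache = new TtlCache(() => fakeTime);
cache.set("session:abc123", "alice", 1000);

const beforeExpiry = cache.get("session:abc123"); // hit
fakeTime = 1500;
const afterExpiry = cache.get("session:abc123");  // miss, entry has expired
console.log(beforeExpiry, afterExpiry);
```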

NoSQL types: graph database
- Data is stored in graphs
- OrientDB, Neo4J, etc.
- Google: linked data, linked open data
- RDF = Resource Description Framework
- When to use
  - Handling complex relational information
  - Modelling and handling classifications
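
Why graphs suit "complex relational information" can be seen in a multi-hop question such as "who can be reached within two steps?". The sketch below uses a plain adjacency list and a breadth-first search; the data and the helper `withinHops` are invented for illustration and are unrelated to the query languages of OrientDB or Neo4J.

```javascript
// Adjacency list: who follows whom (made-up data).
const follows = new Map([
  ["ann", ["bob", "cid"]],
  ["bob", ["dan"]],
  ["cid", ["dan"]],
  ["dan", ["eve"]],
  ["eve", []],
]);

// Breadth-first search: every node reachable from `start` in at most `maxHops` steps.
function withinHops(graph, start, maxHops) {
  const distance = new Map([[start, 0]]);
  const queue = [start];
  while (queue.length > 0) {
    const node = queue.shift();
    const d = distance.get(node);
    if (d === maxHops) continue; // do not expand beyond the hop limit
    for (const next of graph.get(node) || []) {
      if (!distance.has(next)) {
        distance.set(next, d + 1);
        queue.push(next);
      }
    }
  }
  distance.delete(start);
  return [...distance.keys()].sort();
}

const twoHops = withinHops(follows, "ann", 2);
console.log(twoHops);
```

In a relational schema the same question typically needs one self-join per hop, which is the kind of query graph stores are designed to avoid.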

NoSQL types: column store / wide-column store
- Sounds like the inverse of a standard database
- Very high performance and a highly scalable architecture
- Apache Cassandra: regarded as the fastest
- When to use
  - Keeping unstructured, non-volatile information
  - Scaling
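
One way to picture the wide-column data model is a map from row key to a sparse map of columns, where rows need not share the same columns. This is a simplified assumption for illustration, not Cassandra's actual storage layout; `putCell`, `getRow` and the sample data are hypothetical.

```javascript
// row key -> (column name -> value); each row may have different columns.
const wideTable = new Map();

function putCell(rowKey, column, value) {
  if (!wideTable.has(rowKey)) wideTable.set(rowKey, new Map());
  wideTable.get(rowKey).set(column, value);
}

// Return one row as a plain object; an unknown row key yields an empty object.
function getRow(rowKey) {
  return Object.fromEntries(wideTable.get(rowKey) || []);
}

// A time series: every new measurement becomes a new column in the same row.
putCell("sensor-1", "2017-03-27T10:00", 21.5);
putCell("sensor-1", "2017-03-27T10:05", 21.7);
// Another row with a completely different column.
putCell("sensor-2", "location", "roof");

console.log(getRow("sensor-1"), getRow("sensor-2"));
```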

Document storage formats
- HTML = document descriptor, bad for storage, not strict
- XML, YAML = data descriptor, strict format
- JSON = object descriptor, strict format

Storage formats
- XML: many tools, fast and memory efficient, hard to read, hard to parse, XPATH is official
- YAML: few tools, 2x memory, easy to read, easy to parse
- JSON: more and more tools, 2x memory, hard to read, easy to parse, JSONPATH is unofficial

NoSQL types: document database
- Expands on the basic idea of key-value stores
- “Documents” contain data (using a storage format!) and each document is assigned a unique key
- Apache CouchDB, Lotus Notes
- JSON: MongoDB
- XML and JSON field formats exist in most modern relational databases
- When to use
  - Nested information
  - JSON: JavaScript/programming language friendly (MEAN = MongoDB, Express.js, Angular, Node.js)
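
The idea "key-value store whose values are documents in a storage format" can be sketched in a few lines. `DocumentStore` is a hypothetical class for illustration only (real document databases add indexing, querying and persistence); keys are generated from a simple counter.

```javascript
// Minimal document store: unique key -> document serialised as JSON.
class DocumentStore {
  constructor() {
    this.docs = new Map();
    this.nextId = 1;
  }

  // Store a document and return the unique key assigned to it.
  insert(doc) {
    const id = String(this.nextId++);
    this.docs.set(id, JSON.stringify(doc)); // the storage format here is JSON
    return id;
  }

  get(id) {
    const raw = this.docs.get(id);
    return raw === undefined ? undefined : JSON.parse(raw);
  }
}

const store = new DocumentStore();
const orderId = store.insert({
  customer: { name: "Ann", city: "Budapest" },        // nested object
  items: [{ sku: "A1", qty: 2 }, { sku: "B7", qty: 1 }], // nested array
});

const order = store.get(orderId);
const totalQty = order.items.reduce((sum, item) => sum + item.qty, 0);
console.log(order.customer.city, totalQty);
```

The nested customer and item data travel together in one document, where a relational design would usually split them across several tables.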

Document databases: XML databases
- “XML enabled” vs “native XML” databases
- The “big four” support the XML/JSON type in CLOB fields
- Typically, an XML-enabled database is best suited where the majority of data are non-XML
- For datasets where the majority of data are XML/JSON, a native database is better suited
- Native XML database: BDB (not really used any more...)
  - Defines a (logical) model for an XML document
  - Has an XML document as its fundamental unit of (logical) storage
  - Is not required to have any particular underlying physical storage model

Search engines
- RDBMS systems have been weak in varchar indexing
  - Especially in full-text indexing (list of words)
  - In the past 5 years, things have been getting better, but still…
- Elasticsearch, Splunk, Solr
- Elasticsearch in C#: NEST
  - https://www.c-sharpcorner.com/article/getting-started-with-elasticsearch/
  - LINQ-like methods and syntax
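
Full-text indexing ("list of words") is commonly built on an inverted index: a map from each word to the documents that contain it. The sketch below is a bare-bones illustration with made-up documents; it is not how Elasticsearch or Solr is implemented, and `buildIndex` / `search` are hypothetical helpers.

```javascript
// Split text into lowercase words (letters and digits only).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build the inverted index: word -> Set of document ids (array positions).
function buildIndex(documents) {
  const index = new Map();
  documents.forEach((text, id) => {
    for (const word of tokenize(text)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  });
  return index;
}

// Return the ids of documents that contain every word of the query.
function search(index, query) {
  const words = tokenize(query);
  if (words.length === 0) return [];
  const sets = words.map((w) => index.get(w) || new Set());
  return [...sets[0]]
    .filter((id) => sets.every((s) => s.has(id)))
    .sort((a, b) => a - b);
}

const documents = [
  "NoSQL databases scale horizontally",
  "Relational databases use SQL",
  "Graph databases store relations",
];
const wordIndex = buildIndex(documents);
console.log(search(wordIndex, "databases"), search(wordIndex, "relational databases"));
```

A lookup touches only the sets for the query words instead of scanning every text, which is what a LIKE '%word%' query over a varchar column has to do.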

Document databases: JSON / MongoDB
- JSON-like documents (called BSON: Binary JSON)
  - “Libbson expects that you are always working with UTF-8 encoded text”
- Ad hoc queries that can include user-defined JavaScript functions
- Indexing is similar to RDBMSes
- Replication / load balancing: high availability with replica sets; one replica set = two or more copies of the data
- VERY efficient storage and operations for HIGHLY flexible data

SQL vs MongoDB terminology

SQL                            MongoDB
database                       database
table                          collection
row                            document or BSON document
column                         field
index                          index
table joins                    embedded documents and linking
primary key                    primary key (automatically set to the _id field)
aggregation (e.g. group by)    aggregation pipeline

MongoDB vs SQL
- SQL: CREATE TABLE
- MongoDB: implicitly created when inserting the first record
    db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
  (later, we can insert users with more attributes)
- or:
    db.createCollection("users")

MongoDB vs SQL
- SQL: ALTER TABLE ... ADD ...
- MongoDB:
    db.users.update( { }, { $set: { join_date: new Date() } }, { multi: true } )

MongoDB vs SQL
- SQL: ALTER TABLE ... DROP COLUMN ...
- MongoDB:
    db.users.update( { }, { $unset: { join_date: "" } }, { multi: true } )
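
What `$set` and `$unset` with `multi: true` do to each document can be imitated on plain JavaScript objects. This is an in-memory approximation for illustration, not the MongoDB driver; `applyUpdate` and the sample documents are hypothetical, and only these two operators are handled.

```javascript
// Apply a MongoDB-style update document to one plain object and return a copy.
// Only $set and $unset are supported in this sketch.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value; // add the field or overwrite it
  }
  for (const field of Object.keys(update.$unset || {})) {
    delete result[field]; // remove the field if present
  }
  return result;
}

// Documents in the same collection do not need the same fields.
const people = [
  { user_id: "abc123", age: 55 },
  { user_id: "bcd001", age: 30, status: "A" },
];

// "ALTER TABLE ... ADD": set the field on every document (like multi: true).
const withJoinDate = people.map((d) => applyUpdate(d, { $set: { join_date: "2017-03-27" } }));
// "ALTER TABLE ... DROP COLUMN": remove it from every document again.
const withoutJoinDate = withJoinDate.map((d) => applyUpdate(d, { $unset: { join_date: "" } }));
console.log(withJoinDate, withoutJoinDate);
```

Unlike ALTER TABLE, nothing changes in a schema; the update simply rewrites the matching documents.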

MongoDB vs SQL
- SQL: INSERT INTO / UPDATE / DELETE
- MongoDB:
    db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
    db.users.remove( { status: "D" } )
    db.users.update( { age: { $gt: 25 } }, { $set: { status: "C" } }, { multi: true } )
    db.users.update( { status: "A" }, { $inc: { age: 3 } }, { multi: true } )

MongoDB vs SQL
- SQL: SELECT ... FROM ... WHERE
- MongoDB: db.users.find() takes the where condition + the fields to return
    db.users.find().limit(5).skip(10)
    db.users.find( { }, { user_id: 1, status: 1 } )
    db.users.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
    db.users.find( { age: { $gt: 25, $lte: 50 } } )
    db.users.find( { user_id: /^bc/ } )
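
How a `find()` filter document is interpreted can be approximated on an in-memory array. The helper `matchesFilter` below is a hypothetical, heavily reduced imitation that supports only equality, `$gt`, `$lte` and regular expressions; MongoDB's real matching rules cover many more operators and types.

```javascript
// Does `doc` satisfy a MongoDB-style filter? Supports equality, $gt, $lte, RegExp.
function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const value = doc[field];
    if (condition instanceof RegExp) {
      return typeof value === "string" && condition.test(value);
    }
    if (condition !== null && typeof condition === "object") {
      // Operator object such as { $gt: 25, $lte: 50 }: all operators must hold.
      return Object.entries(condition).every(([op, arg]) => {
        if (op === "$gt") return value > arg;
        if (op === "$lte") return value <= arg;
        throw new Error(`unsupported operator: ${op}`);
      });
    }
    return value === condition; // plain value means equality
  });
}

const users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd001", age: 30, status: "A" },
  { user_id: "xyz789", age: 22, status: "D" },
];

const byAge = users.filter((u) => matchesFilter(u, { age: { $gt: 25, $lte: 50 } }));
const byPrefix = users.filter((u) => matchesFilter(u, { user_id: /^bc/ }));
const byStatus = users.filter((u) => matchesFilter(u, { status: "A" }));
console.log(byAge, byPrefix, byStatus);
```

The regular expression `/^bc/` plays the role of `LIKE 'bc%'` in SQL.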

SQL vs MongoDB aggregation operators

SQL         MongoDB
WHERE       $match
GROUP BY    $group
HAVING      $match
SELECT      $project
ORDER BY    $sort
LIMIT       $limit
SUM()       $sum
COUNT()     $sum
join        $lookup in MongoDB 3.2 and later; before that, no direct corresponding operator (use $unwind for somewhat similar functionality)

MongoDB vs SQL
- SQL: SELECT COUNT(*) ...
- MongoDB:
    db.users.count()
  or
    db.users.find().count()
- SUM() and GROUP BY with the aggregation pipeline:
    db.orders.aggregate( [ { $group: { _id: null, total: { $sum: "$price" } } } ] )
    db.orders.aggregate( [ { $group: { _id: "$cust_id", total: { $sum: "$price" } } }, { $sort: { total: 1 } } ] )
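
The result shape of a `$group` + `$sum` + `$sort` pipeline can be reproduced in plain JavaScript. This is only an in-memory imitation with made-up orders; `groupAndSum` is a hypothetical helper, not part of MongoDB.

```javascript
// Made-up orders collection.
const orders = [
  { cust_id: "A", price: 25 },
  { cust_id: "B", price: 10 },
  { cust_id: "A", price: 5 },
  { cust_id: "C", price: 50 },
];

// Imitates: $group by `keyField` with total = $sum of `sumField`, then $sort { total: 1 }.
function groupAndSum(docs, keyField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) || 0) + doc[sumField]);
  }
  return [...totals]
    .map(([_id, total]) => ({ _id, total }))
    .sort((a, b) => a.total - b.total); // ascending, like { total: 1 }
}

const totalsPerCustomer = groupAndSum(orders, "cust_id", "price");
console.log(totalsPerCustomer);
```

Each output object has the group key in `_id` and the aggregated value in `total`, matching the field names used in the pipeline on the slide.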

MongoDB
- Usually works well if bi-directional speed / reports are not that important
- Practice has shown that a RamDisk + LOAD DATA INFILE + MyISAM can still be faster than thousands of INSERT INTO commands on a MongoDB
  - Everything depends on the design: a badly designed database with a fast engine is always SLOWER AND WORSE than a well-designed database with a slow engine
- When using NoSQL, “jury rigging” is harder
- You must be very careful about the configuration of the individual components

NoSQL in C#
- Redis: ServiceStack.Redis.Complete
    RedisClient.Set(key, value);
    RedisClient.Get<type>(key)
- MongoDB: MongoDB.Bson + MongoDB.Driver.Core + MongoDB.Driver
  - Entity objects with [BsonElement] (code-first, with _id)
    MongoClient.GetDatabase(dbName)
    IMongoDatabase.CreateCollection(collName)
    IMongoDatabase.GetCollection<BsonDocument>(collName)
    IMongoCollection<Type>.InsertOne() / InsertMany() + …Async()
  - LINQ support: https://www.codementor.io/pmbanugo/working-with-mongodb-in-net-1-basics-g4frivcvz
