Non-traditional Databases

DBMAN 10 Non-traditional Databases SzaboZs

Non-Traditional = schema-less
- Many implementations, each working very differently and serving a specific need
- Common aims: less complex storage, more flexibility, faster access, server clustering
- Usually also: arbitrary types of relations, or even relation-less structures
- NoSQL = Not SQL = Not Only SQL

Comparison
- Structure and type of data being kept: SQL/relational databases require a structure with defined attributes to hold the data, whereas NoSQL databases usually accept free-form records.
- Querying: Regardless of their licences, relational databases all implement the SQL standard to some degree, so they can be queried using the Structured Query Language (SQL). NoSQL databases, on the other hand, each implement their own way of working with the data they manage.
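
To make the structural difference concrete, here is a minimal sketch in plain JavaScript. It is not tied to any database product: `userSchema`, `validateRow` and the sample records are illustrative assumptions. It checks the same records once against a fixed attribute list (relational style) and once simply stores them as they come (schema-less style).

```javascript
// Hypothetical fixed schema, similar to what a relational table enforces.
const userSchema = { user_id: "string", age: "number" };

// Returns true only if the row has exactly the declared attributes and types.
function validateRow(schema, row) {
  const keys = Object.keys(row);
  if (keys.length !== Object.keys(schema).length) return false;
  return keys.every((k) => typeof row[k] === schema[k]);
}

// A schema-less collection accepts records of any shape.
const users = [];
users.push({ user_id: "abc123", age: 55 });
users.push({ user_id: "def456", status: "A", tags: ["new"] });

// Which of the stored records would a fixed schema have accepted?
const relationalOk = users.map((u) => validateRow(userSchema, u));
console.log(relationalOk);
```

The second record fits the schema-less collection but fails the fixed-schema check, which is the flexibility (and the loss of guarantees) the comparison describes.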

Comparison
- Scaling: Both solutions are easy to scale vertically (i.e. by increasing system resources). However, being more modern (and simpler) applications, NoSQL solutions usually offer much easier means to scale horizontally (i.e. by creating a cluster of multiple machines).
- Reliability: When it comes to data reliability and guaranteed completion of transactions, SQL databases are still the better bet.

Comparison
- Support: Relational database management systems have a decades-long history. They are extremely popular, and it is very easy to find both free and paid support. If an issue arises, it is therefore much easier to solve than with the more recently popular NoSQL databases, especially if the solution in question is complex in nature (e.g. MongoDB).
- Complex data keeping and querying needs: By nature, relational databases are the go-to solution for complex querying and data keeping needs. They are much more efficient and excel in this domain.

NoSQL advantages – “BIG DATA” – VVVC
Velocity, Variety, Volume, Complexity
- High data velocity: lots of data coming in very quickly, possibly from different locations
- Data variety: storage of data that is structured, semi-structured and unstructured
- Data volume: data that is many terabytes or petabytes in size
- Data complexity: data that is stored and managed in different locations or data centers

Typical “BIG DATA” use cases
- Flexible data models
  - A NoSQL database is able to accept all types of data (structured, semi-structured, and unstructured) much more easily than a relational database, which relies on a predefined schema
  - Faster operations on semi-structured / unstructured data
- Analytics and business intelligence
  - Ability to mine the data that is being collected
  - “Real-time data warehousing”: no need for a middle layer, no GROUP BY and JOIN queries

Typical “BIG DATA” requirements – CAP
Modern transactional capabilities (vs ACID!)
- Consistency: all nodes see the same data at the same time (not the same as the “C” in ACID)
- Availability: a guarantee that every request receives a response, even if it is not up to date
- Partition tolerance: the system continues to operate despite arbitrary partitioning due to network failures
- P is a minimum → in case of a communication error (partition) we usually must choose between A and C
- Some people say that CA without P = RDBMS, while NoSQL = P plus a choice between C and A
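
The choice between A and C during a partition can be illustrated with a toy model. This is only a sketch under simplifying assumptions: two in-memory replicas, a boolean `partitioned` flag, and the hypothetical helpers `makeCluster` and `write`. It is not how any real database implements replication.

```javascript
// Toy model: two replicas of one value, plus a flag for a network partition.
// `mode` is "CP" (prefer consistency) or "AP" (prefer availability).
function makeCluster(mode) {
  return { mode, partitioned: false, replicas: [{ value: 0 }, { value: 0 }] };
}

// Write through replica `i`. Returns true if the write was accepted.
function write(cluster, i, value) {
  if (!cluster.partitioned) {
    // Healthy network: the write reaches every replica.
    cluster.replicas.forEach((r) => { r.value = value; });
    return true;
  }
  if (cluster.mode === "CP") return false; // refuse: the other node is unreachable
  cluster.replicas[i].value = value;       // AP: accept locally, replicas diverge
  return true;
}

const cp = makeCluster("CP");
const ap = makeCluster("AP");
cp.partitioned = true;
ap.partitioned = true;

const cpAccepted = write(cp, 0, 42);
const apAccepted = write(ap, 0, 42);
console.log(cpAccepted, cp.replicas.map((r) => r.value));
console.log(apAccepted, ap.replicas.map((r) => r.value));
```

Under a partition the CP cluster rejects the write and stays consistent, while the AP cluster answers the request but its replicas temporarily disagree.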

Typical “BIG DATA” requirements - PACELC
- In case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C)
- Else (E), when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C)

NoSQL types
- Key-value
- Graph database
- Wide column
- Document storage
- (Search engines)
LOTS of various implementations, with extremely rapidly changing features and applications ... especially with document databases and the chaotic world of JS in the past 2-3 years: “How it feels to learn JavaScript in 2016”

NoSQL types: Key-value store
- Least complex, like Dictionary<string, string> in C#
- Memcached vs MemcacheDB (cache vs storage), Redis
- Oracle NoSQL Database (eventually consistent)
- When to use
  - Caching
  - Queueing: distributing information / tasks
  - Keeping live information
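
As an illustration of the caching use case, here is a minimal in-memory key-value cache with per-entry expiry. It is a sketch built on a JavaScript `Map`, not the API of Memcached or Redis. The class name `TtlCache` and the injectable `now` clock are assumptions made for this example.

```javascript
// Minimal key-value cache with per-entry expiry, built on a Map.
// `now` is injectable so the behaviour can be checked without waiting.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.store = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttlMs) {
    this.store.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  get(key) {
    const entry = this.store.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }
}

// Usage with a fake clock: the entry is visible before its TTL and gone after.
let fakeTime = 0;
const cache = new TtlCache(() => fakeTime);
cache.set("session:abc123", "logged-in", 1000);
const fresh = cache.get("session:abc123");
fakeTime = 1500;
const expired = cache.get("session:abc123");
console.log(fresh, expired);
```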

NoSQL types: Graph database
- Data is stored in graphs
- OrientDB, Neo4j, etc.
- Google: Linked data, Linked open data
- RDF = Resource Description Framework
- When to use
  - Handling complex relational information
  - Modelling and handling classifications
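
To show why graph-shaped storage suits relationship-heavy data, the sketch below keeps a small "knows" graph as an adjacency list and answers a "how many hops apart" question with a breadth-first search. The graph contents and the function name `hops` are invented for illustration. Real graph databases such as Neo4j expose their own query languages for this.

```javascript
// Tiny graph of "knows" relationships stored as an adjacency list.
const knows = new Map([
  ["alice", ["bob", "carol"]],
  ["bob", ["dave"]],
  ["carol", ["dave"]],
  ["dave", ["erin"]],
  ["erin", []],
]);

// Breadth-first search: number of hops from `start` to `goal`, or -1 if unreachable.
function hops(graph, start, goal) {
  const seen = new Set([start]);
  let frontier = [start];
  for (let depth = 0; frontier.length > 0; depth++) {
    if (frontier.includes(goal)) return depth;
    const next = [];
    for (const node of frontier) {
      for (const neighbour of graph.get(node) || []) {
        if (!seen.has(neighbour)) {
          seen.add(neighbour);
          next.push(neighbour);
        }
      }
    }
    frontier = next;
  }
  return -1;
}

const aliceToErin = hops(knows, "alice", "erin");
const erinToAlice = hops(knows, "erin", "alice");
console.log(aliceToErin, erinToAlice);
```

In a relational model the same question would typically need a chain of self-joins or a recursive query, which is the kind of "complex relational information" the slide refers to.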

NoSQL types: Column store / wide-column stores
- Sounds like the inverse of a standard database (data is grouped by column rather than by row)
- Very high performance and a highly scalable architecture
- Apache Cassandra → often cited as the fastest
- When to use
  - Keeping unstructured, non-volatile information
  - Scaling

Document storage formats
- HTML = document descriptor, bad for storage, not strict
- XML, YAML = data descriptor, strict format
- JSON = object descriptor, strict format

Storage formats
- XML → many tools, fast and memory efficient, hard to read, hard to parse, XPath is official
- YAML → few tools, 2x memory, easy to read, easy to parse
- JSON → more and more tools, 2x memory, hard to read, easy to parse, JSONPath is unofficial
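
A short sketch of what "strict format, easy to parse" means for JSON in practice, using only the built-in `JSON.stringify` and `JSON.parse`. The sample object and the helper `isValidJson` are illustrative assumptions.

```javascript
// Round trip: object -> JSON text -> object.
const user = { user_id: "abc123", age: 55, address: { city: "Budapest" } };
const text = JSON.stringify(user);
const back = JSON.parse(text);

// Strict format: anything that is not well-formed JSON is rejected outright.
function isValidJson(s) {
  try {
    JSON.parse(s);
    return true;
  } catch (e) {
    return false;
  }
}

const strictOk = isValidJson('{"age": 55}');
const strictBad = isValidJson("{age: 55,}"); // unquoted key and trailing comma
console.log(text, back.address.city, strictOk, strictBad);
```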

NoSQL types: Document database
- Expands on the basic idea of key-value stores
- “Documents” contain data (using a storage format!) and each document is assigned a unique key
- Apache CouchDB, Lotus Notes
- JSON: MongoDB
- XML and JSON field formats exist in most modern relational databases
- When to use
  - Nested information
  - JSON: JavaScript/programming language friendly (MEAN = MongoDB, Express.js, Angular, Node.js)
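
The idea "key-value store plus structured documents" can be sketched in a few lines. This is a toy in-memory store, not CouchDB's or MongoDB's API. The class `DocumentStore`, its counter-based `_id` and the sample order are assumptions for illustration.

```javascript
// Toy document store: each document gets a unique key and may be nested freely.
class DocumentStore {
  constructor() {
    this.docs = new Map(); // _id -> document
    this.nextId = 1;
  }

  // Stores a deep copy of the document and returns its generated key.
  insert(doc) {
    const _id = this.nextId++;
    this.docs.set(_id, JSON.parse(JSON.stringify({ ...doc, _id })));
    return _id;
  }

  get(id) {
    return this.docs.get(id);
  }
}

const store = new DocumentStore();
const orderId = store.insert({
  customer: "abc123",
  items: [
    { sku: "p1", qty: 2 },
    { sku: "p2", qty: 1 },
  ],
});

// Nested information is read back in one lookup, without joins.
const totalQty = store.get(orderId).items.reduce((n, item) => n + item.qty, 0);
console.log(orderId, totalQty);
```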

Document databases: XML databases
“XML Enabled” vs “native XML” databases The “big four” supports the XML/JSON type in CLOB fields Typically an XML enabled database is best suited where the majority of data are non-XML For datasets where the majority of data are XML/JSON, a native database is better suited. Native XML database: BDB – not really used any more... Defines a (logical) model for an XML document Has an XML document as its fundamental unit of (logical) storage not required to have any particular underlying physical storage model SzaboZs

Search engines
- RDBMS systems have been weak in varchar indexing
  - Especially in full-text indexing (list of words)
  - In the past 5 years things have been getting better, but still…
- Elasticsearch, Splunk, Solr
- Elasticsearch in C#: NEST → LINQ-like methods & syntax

Document databases: JSON / MongoDB
- JSON-like documents (called BSON: Binary JSON)
  - “Libbson expects that you are always working with UTF-8 encoded text”
- Ad hoc queries that can include user-defined JavaScript functions
- Indexing is similar to RDBMSes
- Replication / load balancing: high availability with replica sets; one replica set = two or more copies of the data
- VERY efficient storage and operations for HIGHLY flexible data

MongoDB vs SQL: terminology

SQL                            MongoDB
database                       database
table                          collection
row                            document or BSON document
column                         field
index                          index
table joins                    embedded documents and linking
primary key                    primary key (automatically set to the _id field)
aggregation (e.g. group by)    aggregation pipeline

MongoDB vs SQL
SQL: CREATE TABLE
MongoDB: implicitly created when inserting the first record
    db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
(later, we can insert users with more attributes)
or:
    db.createCollection("users")

MongoDB vs SQL
SQL: ALTER TABLE ... ADD ...
MongoDB:
    db.users.update( { }, { $set: { join_date: new Date() } }, { multi: true } )

MongoDB vs SQL
SQL: ALTER TABLE ... DROP COLUMN ...
MongoDB:
    db.users.update( { }, { $unset: { join_date: "" } }, { multi: true } )

MongoDB vs SQL
SQL: INSERT INTO / UPDATE / DELETE
MongoDB:
    db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
    db.users.remove( { status: "D" } )
    db.users.update( { age: { $gt: 25 } }, { $set: { status: "C" } }, { multi: true } )
    db.users.update( { status: "A" } , { $inc: { age: 3 } }, { multi: true } )

MongoDB vs SQL
SQL: SELECT ... FROM ... WHERE
MongoDB: db.users.find()  // arguments: where + fields
    db.users.find().limit(5).skip(10)
    db.users.find( { }, { user_id: 1, status: 1 } )
    db.users.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
    db.users.find( { age: { $gt: 25, $lte: 50 } } )
    db.users.find( { user_id: /^bc/ } )
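
To clarify how such filter documents are read, the sketch below evaluates a small subset of them against an in-memory array. It is only an emulation written for this transcript, not MongoDB's implementation: the `operators` table and the `matches` helper are hypothetical and cover just equality, regular expressions and the four range operators shown.

```javascript
// Supported comparison operators (a small subset chosen for this sketch).
// An operator outside this table would cause a TypeError in `matches`.
const operators = {
  $gt: (value, bound) => value > bound,
  $gte: (value, bound) => value >= bound,
  $lt: (value, bound) => value < bound,
  $lte: (value, bound) => value <= bound,
};

// True if `doc` satisfies every condition in `filter` (conditions are ANDed).
function matches(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    const value = doc[field];
    if (cond instanceof RegExp) {
      return typeof value === "string" && cond.test(value);
    }
    if (cond !== null && typeof cond === "object") {
      return Object.entries(cond).every(([op, bound]) => operators[op](value, bound));
    }
    return value === cond; // plain equality
  });
}

const users = [
  { user_id: "abc123", age: 55, status: "A" },
  { user_id: "bcd234", age: 30, status: "A" },
  { user_id: "xyz789", age: 20, status: "D" },
];

const midAge = users.filter((u) => matches(u, { age: { $gt: 25, $lte: 50 } }));
const prefixed = users.filter((u) => matches(u, { user_id: /^bc/ }));
const active = users.filter((u) => matches(u, { status: "A" }));
console.log(midAge.length, prefixed.length, active.length);
```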

MongoDB vs SQL: aggregation

SQL         MongoDB
WHERE       $match
GROUP BY    $group
HAVING      $match
SELECT      $project
ORDER BY    $sort
LIMIT       $limit
SUM()       $sum
COUNT()     $sum
join        No direct corresponding operator; use $unwind for somewhat similar functionality (MongoDB 3.2 and later also offer $lookup)

MongoDB vs SQL
SQL: SELECT COUNT(*) ...
MongoDB:
    db.users.count() or db.users.find().count()
    db.orders.aggregate( [ { $group: { _id: null, total: { $sum: "$price" } } } ] )
    db.orders.aggregate( [ { $group: { _id: "$cust_id", total: { $sum: "$price" } } }, { $sort: { total: 1 } } ] )
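
What the two `$group` pipelines compute can be reproduced on a plain array. The following is a sketch only: the `orders` sample data and the helper `groupSum` are invented for illustration and do not use the MongoDB driver.

```javascript
// Sample data standing in for the `orders` collection (values are made up).
const orders = [
  { cust_id: "A", price: 25 },
  { cust_id: "B", price: 10 },
  { cust_id: "A", price: 5 },
  { cust_id: "C", price: 40 },
];

// Groups by `keyField` and sums `sumField`, in the spirit of
// { $group: { _id: "$cust_id", total: { $sum: "$price" } } }.
function groupSum(docs, keyField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) || 0) + doc[sumField]);
  }
  return Array.from(totals, ([_id, total]) => ({ _id, total }));
}

// Equivalent of the second pipeline: group, then sort ascending by total.
const byCustomer = groupSum(orders, "cust_id", "price").sort((a, b) => a.total - b.total);

// Equivalent of the first pipeline (_id: null): one total over all documents.
const grandTotal = orders.reduce((sum, o) => sum + o.price, 0);

console.log(byCustomer, grandTotal);
```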

MongoDB
- Usually works well if bi-directional speed / reports are not that important
- Practice has shown that a RAM disk + LOAD DATA INFILE + MyISAM can still be faster than thousands of insert commands on MongoDB (everything depends on the design: a badly designed db with a fast db engine is always SLOWER AND WORSE than a well-designed db with a slow db engine)
- When using NoSQL, “jury rigging” is harder
- Must be very careful about the configuration of the individual components

NoSQL in C#
- Redis: ServiceStack.Redis.Complete
  - RedisClient.Set(key, value); RedisClient.Get<type>(key)
- MongoDB: MongoDB.Bson + MongoDB.Driver.Core + MongoDB.Driver
  - Entity objects with [BsonElement] (code-first, with _id)
  - MongoClient.GetDatabase(dbName)
  - IMongoDatabase.CreateCollection(collName)
  - IMongoDatabase.GetCollection<BsonDocument>(collName)
  - IMongoCollection<Type>.InsertOne() / InsertMany() + …Async()
  - LINQ SUPPORT
