V 1.0 DBMAN 10 Non-traditional Databases



Non-Traditional = schema-less
– Many implementations, each working very differently and serving a specific need
– Common aim: less complex storage, more flexibility, faster access
– Usually also: arbitrary types of relations, or even relation-less structures

Comparison
Structure and type of data being kept:
– SQL/relational databases require a structure with defined attributes to hold the data, unlike NoSQL databases, which usually allow free-flow operations.
Querying:
– Regardless of their licences, relational databases all implement the SQL standard to a certain degree and thus can be queried using the Structured Query Language (SQL). NoSQL databases, on the other hand, each implement a unique way to work with the data they manage.
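A minimal sketch of the structural difference, using Python's built-in `sqlite3` for the relational side and a plain list of dicts as a stand-in for a schema-less collection. The `nickname` field and the sample rows are assumptions made for illustration; no real NoSQL server is involved.

```python
import sqlite3

# Relational side: the table fixes the set of attributes up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, age INTEGER, status TEXT)")
conn.execute("INSERT INTO users VALUES ('abc123', 55, 'A')")

try:
    # 'nickname' was never declared, so the statement is rejected.
    conn.execute("INSERT INTO users (user_id, nickname) VALUES ('bcd234', 'Bea')")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Schema-less stand-in: every "document" may carry its own set of fields.
collection = []
collection.append({"user_id": "abc123", "age": 55, "status": "A"})
collection.append({"user_id": "bcd234", "nickname": "Bea"})

print("relational insert rejected:", rejected)
print("documents stored:", len(collection))
```

The price of the flexibility is that every reader of the collection has to cope with fields that may be missing.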

Comparison
Scaling:
– Both solutions are easy to scale vertically (i.e. by increasing system resources). However, being more modern (and simpler) applications, NoSQL solutions usually offer much easier means to scale horizontally (i.e. by creating a cluster of multiple machines).
Reliability:
– When it comes to data reliability and guarantees that performed transactions are safe, SQL databases are still the better bet.

Comparison
Support:
– Relational database management systems have a decades-long history. They are extremely popular, and it is very easy to find both free and paid support. If an issue arises, it is therefore much easier to solve than with the more recently popular NoSQL databases, especially if the solution in question is complex in nature (e.g. MongoDB).
Complex data keeping and querying needs:
– By nature, relational databases are the go-to solution for complex querying and data keeping needs. They are much more efficient and excel in this domain.

NoSQL advantages – “BIG DATA”
VVVC = Velocity, Variety, Volume, Complexity
– High data velocity: lots of data coming in very quickly, possibly from different locations.
– Data variety: storage of data that is structured, semi-structured and unstructured.
– Data volume: data that involves many terabytes or petabytes in size.
– Data complexity: data that is stored and managed in different locations or data centers.

Typical “BIG DATA” requirements
Continuous Data Availability & Real Location Independence & Better Architecture
– Easy replication, because of the simpler data format
Modern Transactional Capabilities (CAP vs ACID!)
– Consistency (all nodes see the same data at the same time)
– Availability (a guarantee that every request receives a response about whether it succeeded or failed)
– Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)
– Pick TWO!
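The "pick two" trade-off can be sketched as a toy model: two replicas, a network partition between them, and a write arriving at one side. The `Replica` class, the `write` function and the `"CP"` / `"AP"` mode labels are all hypothetical names for this illustration; real systems are far more nuanced.

```python
class Replica:
    """One node holding a copy of the data."""
    def __init__(self):
        self.data = {}

def write(replicas, target, key, value, partitioned, mode):
    """Toy write path. mode is 'CP' (stay consistent) or 'AP' (stay available)."""
    if not partitioned:
        # Healthy network: every replica receives the write.
        for replica in replicas:
            replica.data[key] = value
        return True
    if mode == "CP":
        # Cannot reach all replicas, so refuse the write (gives up availability).
        return False
    # AP: accept the write locally; the replicas now disagree (gives up consistency).
    replicas[target].data[key] = value
    return True

cp = [Replica(), Replica()]
cp_ok = write(cp, 0, "x", 1, partitioned=True, mode="CP")

ap = [Replica(), Replica()]
ap_ok = write(ap, 0, "x", 1, partitioned=True, mode="AP")

print("CP accepted:", cp_ok, "replicas equal:", cp[0].data == cp[1].data)
print("AP accepted:", ap_ok, "replicas equal:", ap[0].data == ap[1].data)
```

While the partition lasts, the system has to give up one of the other two properties; which one it gives up is the design decision.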

Typical “BIG DATA” requirements
Flexible Data Models
– A NoSQL database is able to accept all types of data (structured, semi-structured, and unstructured) much more easily than a relational database, which relies on a predefined schema
– Faster operations on semi-structured / unstructured data
Analytics and Business Intelligence
– Ability to mine the data that is being collected
– “Real-time data warehousing”: no need for a middle layer, no GROUP BY and JOIN queries
NO NEED FOR GROUP BY AND JOIN? HOW?
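One common answer to the question above is denormalization: related records are embedded in the parent document and aggregates are stored alongside them, so reading needs neither a join nor a GROUP BY. This is an assumption about what the slide has in mind. The sketch below compares a `sqlite3` join with a hand-built embedded document; the table names, field names and the `order_total` field are made up for illustration.

```python
import sqlite3

# Normalized, relational layout: two tables that must be joined on read.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE orders (cust_id TEXT, price INTEGER);
    INSERT INTO customers VALUES ('c1', 'Ann');
    INSERT INTO orders VALUES ('c1', 10);
    INSERT INTO orders VALUES ('c1', 25);
""")
joined = conn.execute("""
    SELECT c.name, o.price
    FROM customers c JOIN orders o ON o.cust_id = c.cust_id
    ORDER BY o.price
""").fetchall()

# Denormalized, document layout: orders are embedded and the total is
# maintained at write time (hypothetical field 'order_total').
customer_doc = {
    "_id": "c1",
    "name": "Ann",
    "orders": [{"price": 10}, {"price": 25}],
    "order_total": 35,
}
embedded = [(customer_doc["name"], o["price"]) for o in customer_doc["orders"]]

print(joined == embedded)
```

The cost moves to the write path: every new order has to update the embedded list and the stored total.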

Storage formats
– HTML = document descriptor
– XML, YAML = data descriptor
– JSON = object descriptor

Storage formats
– XML: many tools, fast and memory efficient, hard to read, hard to parse, XPath is official
– YAML: few tools, 2x memory, easy to read, easy to parse
– JSON: more and more tools, 2x memory, hard to read, easy to parse, JSONPath is unofficial
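To make the formats concrete, here is the same record written as JSON and as XML with Python's standard library (YAML is left out because it has no stdlib parser). The record itself is sample data, and the XML layout, one child element per field, is just one possible encoding.

```python
import json
import xml.etree.ElementTree as ET

record = {"user_id": "abc123", "age": 55, "status": "A"}

# JSON: maps directly onto objects, numbers and strings.
json_text = json.dumps(record)
from_json = json.loads(json_text)

# XML: build <user> with one child element per field.
user = ET.Element("user")
for key, value in record.items():
    ET.SubElement(user, key).text = str(value)
xml_text = ET.tostring(user, encoding="unicode")

# Reading XML back yields text only; types must be restored by the caller.
from_xml = {child.tag: child.text for child in ET.fromstring(xml_text)}

print(json_text)
print(xml_text)
print(type(from_json["age"]).__name__, type(from_xml["age"]).__name__)
```

The JSON round trip returns `age` as a number, while the XML round trip returns it as text, so the reader has to restore the type.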

NoSQL types
Graph database
– Data is stored in graphs
– OrientDB, Neo4j, etc.
– Google: Linked data, Linked open data
– RDF = Resource Description Framework
When to use
– Handling complex relational information
– Modelling and handling classifications

NoSQL types
Column store / wide-column stores
– Sounds like the inverse of a standard database
– Very high performance and a highly scalable architecture
– Apache Cassandra: fastest
When to use
– Keeping unstructured, non-volatile information
– Scaling

NoSQL types
Key-Value store
– Least complex, like Dictionary in C#
– Memcached vs MemcacheDB (cache vs storage)
– Oracle NoSQL Database (Eventually-Consistent)
When to use
– Caching
– Queueing
– Distributing information / tasks
– Keeping live information
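The "Dictionary" analogy and the caching use case can be sketched with a plain Python dict. `KeyValueStore`, `load_profile` and `cached_profile` are hypothetical names; a real deployment would talk to a server such as Memcached rather than an in-process dict.

```python
class KeyValueStore:
    """Smallest possible key-value store: opaque values looked up by key."""
    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key, default=None):
        return self._items.get(key, default)

    def delete(self, key):
        self._items.pop(key, None)

calls = []  # records every trip to the (pretend) slow backend

def load_profile(user_id):
    """Stand-in for an expensive lookup, e.g. a relational query."""
    calls.append(user_id)
    return {"user_id": user_id, "status": "A"}

cache = KeyValueStore()

def cached_profile(user_id):
    """Cache-aside: try the store first, fall back to the backend on a miss."""
    key = "profile:" + user_id
    value = cache.get(key)
    if value is None:
        value = load_profile(user_id)
        cache.put(key, value)
    return value

first = cached_profile("abc123")
second = cached_profile("abc123")
print(len(calls))
```

The second lookup is served from the store, so the backend is hit only once.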

NoSQL types
Document database
– Expands on the basic idea of key-value stores
– “Documents” contain data (using a storage format!) and each document is assigned a unique key
– Apache CouchDB, Lotus Notes
– MongoDB
– XML databases (DB2, SQL Server, Oracle, PostgreSQL)
When to use
– Nested information
– JavaScript/programming language friendly
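A rough sketch of the document idea: a key-value store whose values are structured, nested documents, each stored under a unique key. The `insert` helper, the `documents` dict and the use of a random UUID as key are assumptions for illustration, not how any particular product generates its keys.

```python
import uuid

documents = {}  # unique key -> document

def insert(doc):
    """Store a document under a unique key, generating one if it has none."""
    key = doc.setdefault("_id", uuid.uuid4().hex)
    documents[key] = doc
    return key

post_id = insert({
    "title": "Hello documents",
    "author": {"name": "Ann", "email": "ann@example.com"},
    "comments": [
        {"by": "Bob", "text": "Nice"},
        {"by": "Cy", "text": "+1"},
    ],
})

# Nested information is reached by walking the document; no join is needed.
commenters = [comment["by"] for comment in documents[post_id]["comments"]]
print(commenters)
```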

MongoDB features
– JSON-like documents (called BSON: Binary JSON)
– Ad hoc queries that can include user-defined JavaScript functions
– Indexing is similar to RDBMSes
– Replication/Load balancing: high availability with replica sets; one replica set = two or more copies of the data
– Biggest problem: only byte-wise comparison, no UTF-8 order by / replace features!
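Why byte-wise comparison hurts can be seen without a database: sorting strings by their UTF-8 bytes puts upper case before lower case and accented letters after everything in ASCII. The sketch below contrasts that with a crude accent- and case-insensitive key. The sample names and the `folded` helper are assumptions, and the helper is only an approximation, not a real language-specific collation. Later MongoDB releases document a collation option, so the slide's remark may not hold for the version you use.

```python
import unicodedata

names = ["Zoltán", "Árpád", "Bence", "adam"]

# Byte-wise order: compares the raw UTF-8 encoding of each string.
bytewise = sorted(names, key=lambda s: s.encode("utf-8"))

def folded(s):
    """Crude sort key: strip combining accents, then ignore case."""
    decomposed = unicodedata.normalize("NFD", s)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_accents.casefold()

human_friendly = sorted(names, key=folded)

print(bytewise)
print(human_friendly)
```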

SQL                            MongoDB
database                       database
table                          collection
row                            document or BSON document
column                         field
index                          index
table joins                    embedded documents and linking
primary key                    primary key (automatically set to the _id field)
aggregation (e.g. group by)    aggregation pipeline

MongoDB vs SQL
SQL: CREATE TABLE
MongoDB: implicitly created when inserting the first record
    db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
or:
    db.createCollection("users")
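Implicit creation can be mimicked with a `defaultdict`: the "collection" comes into existence the first time something is inserted into it. This is an analogy in plain Python, with `db` as a hypothetical stand-in for a database handle; it makes no claim about how MongoDB implements it.

```python
from collections import defaultdict

# Each missing collection name is created as an empty list on first access.
db = defaultdict(list)

existed_before = "users" in db   # membership tests do not create the key
db["users"].append({"user_id": "abc123", "age": 55, "status": "A"})
exists_after = "users" in db

print(existed_before, exists_after)
```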

MongoDB vs SQL
SQL: ALTER TABLE... ADD...
MongoDB:
    db.users.update( { }, { $set: { join_date: new Date() } }, { multi: true } )

MongoDB vs SQL
SQL: ALTER TABLE... DROP COLUMN...
MongoDB:
    db.users.update( { }, { $set: { join_date: "" } }, { multi: true } )
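Because there is no table definition to alter, "adding a column" and "dropping a column" turn into updates that touch every document. A rough emulation over plain dicts follows; `set_field` and `unset_field` are hypothetical helpers that imitate the effect of `$set` and `$unset` with an empty filter and `multi: true`.

```python
from datetime import datetime, timezone

users = [
    {"user_id": "abc123", "age": 55, "status": "A"},
    {"user_id": "bcd234", "age": 30, "status": "D"},
]

def set_field(docs, field, value):
    """Like $set on every document: add the field or overwrite it."""
    for doc in docs:
        doc[field] = value

def unset_field(docs, field):
    """Like $unset on every document: remove the field if it is present."""
    for doc in docs:
        doc.pop(field, None)

set_field(users, "join_date", datetime.now(timezone.utc))
all_have_date = all("join_date" in doc for doc in users)

unset_field(users, "join_date")
none_have_date = all("join_date" not in doc for doc in users)

print(all_have_date, none_have_date)
```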

MongoDB vs SQL
SQL: INSERT INTO / UPDATE / DELETE
MongoDB:
    db.users.insert( { user_id: "abc123", age: 55, status: "A" } )
    db.users.remove( { status: "D" } )
    db.users.update( { age: { $gt: 25 } }, { $set: { status: "C" } }, { multi: true } )
    db.users.update( { status: "A" }, { $inc: { age: 3 } }, { multi: true } )
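The slide names the SQL statements without spelling them out. The sketch below runs plausible SQL counterparts of the four MongoDB commands, in the same order, using Python's `sqlite3`. The table layout and the extra sample rows are assumptions chosen so that every statement has a visible effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, age INTEGER, status TEXT)")

# insert(...)  ->  INSERT INTO
conn.executemany(
    "INSERT INTO users (user_id, age, status) VALUES (?, ?, ?)",
    [("abc123", 55, "A"), ("bcd234", 20, "D"), ("cde345", 30, "B"), ("def456", 22, "A")],
)

# remove({ status: "D" })  ->  DELETE
conn.execute("DELETE FROM users WHERE status = 'D'")

# update({ age: { $gt: 25 } }, { $set: { status: "C" } }, multi)  ->  UPDATE ... SET
conn.execute("UPDATE users SET status = 'C' WHERE age > 25")

# update({ status: "A" }, { $inc: { age: 3 } }, multi)  ->  UPDATE with arithmetic
conn.execute("UPDATE users SET age = age + 3 WHERE status = 'A'")

rows = conn.execute("SELECT user_id, age, status FROM users ORDER BY user_id").fetchall()
for row in rows:
    print(row)
```

A SQL `UPDATE` always affects every matching row, which is why the MongoDB versions need `multi: true` to behave the same way.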

MongoDB vs SQL
SQL: SELECT... FROM
MongoDB:
    db.users.find()
    db.users.find().limit(5).skip(10)
    db.users.find( { }, { user_id: 1, status: 1 } )
    db.users.find( { status: "A" }, { user_id: 1, status: 1, _id: 0 } )
    db.users.find( { age: { $gt: 25, $lte: 50 } } )
    db.users.find( { user_id: /^bc/ } )
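For comparison, here are SQL queries that roughly correspond to the `find()` calls, run with Python's `sqlite3`. The sample rows, the `q` helper and the smaller `LIMIT 2 OFFSET 1` page are assumptions; an `ORDER BY` is added so the results are deterministic. Note that SQLite's `LIKE` is case-insensitive for ASCII by default, whereas the regular expression `/^bc/` is case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id TEXT, age INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("abc123", 55, "A"), ("bcd234", 30, "A"), ("bce345", 26, "D"), ("xyz789", 25, "A")],
)

def q(sql):
    """Run a query and return all rows."""
    return conn.execute(sql).fetchall()

# find({ status: "A" }, { user_id: 1, status: 1, _id: 0 })
projected = q("SELECT user_id, status FROM users WHERE status = 'A' ORDER BY user_id")

# find({ age: { $gt: 25, $lte: 50 } })
in_range = q("SELECT user_id FROM users WHERE age > 25 AND age <= 50 ORDER BY user_id")

# find({ user_id: /^bc/ })
prefixed = q("SELECT user_id FROM users WHERE user_id LIKE 'bc%' ORDER BY user_id")

# find().limit(2).skip(1)
paged = q("SELECT user_id FROM users ORDER BY user_id LIMIT 2 OFFSET 1")

print(projected, in_range, prefixed, paged, sep="\n")
```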

SQL         MongoDB
WHERE       $match
GROUP BY    $group
HAVING      $match
SELECT      $project
ORDER BY    $sort
LIMIT       $limit
SUM()       $sum
COUNT()     $sum
join        No direct corresponding operator; use $unwind for somewhat similar functionality

MongoDB vs SQL
SQL: SELECT COUNT(*)...
MongoDB:
    db.users.count()
or
    db.users.find().count()
    db.orders.aggregate( [ { $group: { _id: null, total: { $sum: "$price" } } } ] )
    db.orders.aggregate( [ { $group: { _id: "$cust_id", total: { $sum: "$price" } } }, { $sort: { total: 1 } } ] )
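The two `aggregate` calls correspond to a plain `SUM` and to a `GROUP BY` with `ORDER BY`. The sketch runs the SQL side with `sqlite3` and imitates the `$group` + `$sort` stages in plain Python. The `orders` sample data and the `group_sum` helper are assumptions for illustration.

```python
import sqlite3
from collections import defaultdict

orders = [
    {"cust_id": "c1", "price": 10},
    {"cust_id": "c2", "price": 5},
    {"cust_id": "c1", "price": 25},
    {"cust_id": "c3", "price": 20},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (cust_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO orders VALUES (:cust_id, :price)", orders)

# $group with _id: null  ->  one total over the whole table
sql_total = conn.execute("SELECT SUM(price) FROM orders").fetchone()[0]

# $group by "$cust_id" then $sort by total ascending
sql_by_customer = conn.execute(
    "SELECT cust_id, SUM(price) AS total FROM orders GROUP BY cust_id ORDER BY total"
).fetchall()

def group_sum(docs, key, field):
    """Imitate {$group: {_id: key, total: {$sum: field}}} followed by {$sort: {total: 1}}."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key]] += doc[field]
    return sorted(totals.items(), key=lambda item: item[1])

py_by_customer = group_sum(orders, "cust_id", "price")

print(sql_total)
print(sql_by_customer)
print(py_by_customer)
```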

MongoDB
– Usually works well if bi-directional speed / reports are not that important
– Practice has shown that a RAM disk + LOAD DATA INFILE + MyISAM is still faster than thousands of insert commands on MongoDB
– With MongoDB, we lose the power of joins, and we lose the performance gain of GROUP BY queries...
– We have to be careful when using it!

XML databases
“XML Enabled” vs “native XML” databases
The “big four” support the XML type in CLOB fields
– Typically, an XML-enabled database is best suited where the majority of data are non-XML
– For datasets where the majority of data are XML, a native XML database is better suited
Native XML database: BDB
– Defines a (logical) model for an XML document
– Has an XML document as its fundamental unit of (logical) storage
– Is not required to have any particular underlying physical storage model

XML databases
– Querying documents via XQuery / XPath
– Results are transformed via XSLT
– XML databases are no longer widespread, so they are not covered further here
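For a feel of path-based querying, Python's `xml.etree.ElementTree` supports a limited subset of XPath (it offers neither XQuery nor XSLT). The document below and its element and attribute names are sample data, not taken from any particular XML database.

```python
import xml.etree.ElementTree as ET

xml_text = (
    "<users>"
    '<user status="A"><user_id>abc123</user_id><age>55</age></user>'
    '<user status="D"><user_id>bcd234</user_id><age>30</age></user>'
    '<user status="A"><user_id>cde345</user_id><age>26</age></user>'
    "</users>"
)
root = ET.fromstring(xml_text)

# Path with an attribute predicate: all <user> children whose status is "A".
active_ids = [u.findtext("user_id") for u in root.findall("./user[@status='A']")]

# Plain child path: every <age> under every <user>.
ages = [int(age.text) for age in root.findall("./user/age")]

print(active_ids)
print(ages)
```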

List of questions
– Types of data models
– Basic units of the RDBMS systems
– Types of relations between entities/tables in an RDBMS
– Types of keys in RDBMS tables
– Elements of an ER diagram
– Purpose of normalization, anomalies
– Normalization levels: base model.. BCNF
– Verification of dependency preserving decomposition
– Verification of lossless decomposition
– SQL: list of sub-languages, commands, dialects, suffixes

List of questions
– SQL: Purpose and usage of group by, CUBE, ROLLUP
– SQL: Types of table joins (inner, right outer, left outer, full outer)
– SQL: object types (table, view, user, role, cursor, procedure, function, trigger)
– SQL: Constraint types
– OLTP vs OLAP comparison: usage
– OLTP vs OLAP comparison: normalized structure vs star
– Transaction management, definition of the ACID principles
– Storage models of an RDBMS

List of questions
– RAID levels
– Heap file vs indexed file
– Index properties
– Relational algebra and calculus: approach, principle
– OOP vs RDBMS: examples to HAS-A, PART-OF, IS-A, INSTANCE-OF
– Mapping OOP inheritance to tables
– Principles of the ORM layer and the Repository Pattern
– NoSQL advantages/requirements
– NoSQL database types
– Document Storage formats

Never forget!
“The requirement for consistency is the convergence of the coherency”
