MongoDB Introduction © 2014 - Zoran Maksimovic www.agile-code.com

MongoDB is a scalable, high-performance, open source, schema-free, document-oriented database.

History
- First developed (by 10gen)
- Became open source
- Considered production ready (v1.4 and later)
- MongoDB closes $150 million in funding
- Latest stable version (v2.6)
- Today: more than $231 million in total investment since 2007; MongoDB Inc. valued at $1.2B

NoSQL Breakdown
NoSQL encompasses a wide variety of database technologies that were developed in response to a rise in the volume of data.
- Document databases pair each key with a complex data structure known as a document (MongoDB, Couchbase Server, CouchDB).
- Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or "key"), together with its value (DynamoDB, Windows Azure Table Storage, Riak, Redis, LevelDB, Dynomite).
- Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together instead of rows.
- Graph stores are used to store information about networks, such as social connections. Graph stores include Neo4j and HyperGraphDB.

NoSQL made by big vendors
- Oracle: NoSQL Database (key-value store)
- Microsoft: Azure Table Storage (key-value store)
- Google: BigTable (proprietary)
- Google: LevelDB (open source key-value store)
- Amazon: SimpleDB (wide-column store)
- Amazon: DynamoDB (key-value store)
- Apache: HBase, …
- Basho: Riak (key-value store)
- Facebook: Cassandra (wide-column store)

MongoDB in a nutshell
- Document-Oriented Storage » JSON-style documents with dynamic schemas offer simplicity and power.
- Full Index Support » Index on any attribute, just like you're used to.
- Replication & High Availability » Mirror across LANs and WANs for scale and peace of mind.
- Auto-Sharding » Scale horizontally without compromising functionality.
- Querying » Rich, document-based queries.
- Fast In-Place Updates » Atomic modifiers for contention-free performance.
- Map/Reduce » Flexible aggregation and data processing.
- GridFS » Store files of any size without complicating your stack.
- MongoDB Management Service » Monitoring and backup designed for MongoDB.
- Professional Support by MongoDB » Enterprise class support, training, and consulting available.

MongoDB is a document-oriented database
- Think of “documents” as database records.
- No schema!
- Documents are basically just JSON objects that Mongo stores in a binary format (BSON).
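To make the "documents are just JSON objects" point concrete, here is a minimal sketch in plain JavaScript. It assumes nothing about MongoDB itself: the collection is modeled as an array, and the field names (`name`, `tags`, `address`) are invented for illustration.

```javascript
// Two documents that could live in the same collection.
// All field names here are hypothetical.
const tom = { _id: 1, name: "Tom", tags: ["admin", "dev"] };
const nick = { _id: 2, name: "Nick", address: { city: "Zurich" } };

// The collection is modeled as a plain array; no schema forces the
// documents to share the same set of fields.
const people = [tom, nick];

// Round-trip through JSON text to show that a document is plain JSON data.
// (MongoDB stores BSON, a binary encoding; this sketch does not produce BSON.)
const restored = JSON.parse(JSON.stringify(people));

// Helper: the sorted field names of a document.
function fieldNames(doc) {
  return Object.keys(doc).sort();
}

console.log(fieldNames(restored[0]));
console.log(fieldNames(restored[1]));
```

The two documents print different field lists, which is what "no schema" means in practice: the shape belongs to each document, not to the collection.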

MongoDB database structure

Embedded Data Model
When to use:
- “Contains” relationships between entities.
- One-to-many relationships between entities. In these relationships the “many” or child documents always appear with, or are viewed in the context of, the “one” or parent documents.
Result: data is retrieved in one query, at the cost of data redundancy.
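The embedded model can be sketched with a plain JavaScript object. This is an illustration under stated assumptions, not MongoDB code: the blog post, its `comments` array, and the helper `findPostWithComments` are all hypothetical, and the collection is an in-memory array.

```javascript
// A hypothetical blog post with its comments embedded in the parent
// document (a "contains", one-to-many relationship).
const post = {
  _id: 1,
  title: "MongoDB Introduction",
  comments: [
    { author: "Tom", text: "Nice overview" },
    { author: "Nick", text: "Thanks" },
  ],
};
const posts = [post];

// A single lookup returns the parent together with all of its children.
function findPostWithComments(collection, id) {
  return collection.find((doc) => doc._id === id);
}

const found = findPostWithComments(posts, 1);
console.log(found.title, found.comments.length);
```

Because the children travel with the parent, one read is enough. The redundancy appears as soon as the same child data (for example an author's details) is copied into several parents.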

Document oriented database – Normalized data model
When to use:
- When embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication.
- To represent more complex many-to-many relationships.
- To model large hierarchical data sets.
Requires multiple queries!
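A normalized model stores a reference instead of the data itself, so reading the full picture takes more than one lookup. The sketch below assumes two hypothetical in-memory collections (`publishers`, `books`) and a made-up `publisherId` reference field; it only illustrates the access pattern.

```javascript
// Hypothetical collections: each book references its publisher by id
// instead of embedding a copy of the publisher document.
const publishers = [{ _id: "p1", name: "Example Press" }];
const books = [
  { _id: 1, title: "First Book", publisherId: "p1" },
  { _id: 2, title: "Second Book", publisherId: "p1" },
];

function findById(collection, id) {
  return collection.find((doc) => doc._id === id);
}

// Two lookups are needed: one for the book, one to resolve the reference.
function findBookWithPublisher(bookId) {
  const book = findById(books, bookId); // lookup 1
  const publisher = findById(publishers, book.publisherId); // lookup 2
  return { ...book, publisher };
}

const result = findBookWithPublisher(2);
console.log(result.title, result.publisher.name);
```

The publisher is stored once and shared by both books, which avoids duplication; the price is the second lookup on every read that needs publisher details.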

Indexing
All indexes in MongoDB are B-tree indexes.
Index types:
- Single field index
- Compound index: covers more than one field in the collection.
- Multikey index: index on array fields.
- Geospatial index and queries.
- Text index
- TTL (time to live) index: documents are removed automatically after a limited time.
- Unique index: the value of the indexed field has to be unique.
- Sparse index: stores an index entry only for documents that contain the given field.
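The difference between the unique and sparse options can be illustrated with a toy index. This is a sketch only: it uses a JavaScript `Map` rather than a B-tree, and `buildIndex`, its options, and the `users` data are hypothetical names, not a MongoDB API.

```javascript
// Build a toy index over one field: value -> array of document _ids.
// A Map stands in for the B-tree; only the unique/sparse rules are modeled.
function buildIndex(docs, field, { unique = false, sparse = false } = {}) {
  const index = new Map();
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue; // sparse: skip documents without the field
    const key = hasField ? doc[field] : null; // a missing field is indexed as null
    const ids = index.get(key) || [];
    if (unique && ids.length > 0) {
      throw new Error(`duplicate key for ${field}: ${key}`);
    }
    ids.push(doc._id);
    index.set(key, ids);
  }
  return index;
}

const users = [
  { _id: 1, email: "tom@example.com" },
  { _id: 2, email: "nick@example.com" },
  { _id: 3 }, // no email field
  { _id: 4 }, // no email field
];

// Unique + sparse: documents without the field are simply not indexed.
const sparseUnique = buildIndex(users, "email", { unique: true, sparse: true });
console.log(sparseUnique.size);

// Unique without sparse: both documents lacking the field collide on null.
let rejected = false;
try {
  buildIndex(users, "email", { unique: true });
} catch (err) {
  rejected = true;
}
console.log(rejected);
```

The second call fails because two documents without the field map to the same key, which is the usual reason for combining the unique and sparse options on optional fields.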

Security
Authentication:
- MongoDB’s default username/password authentication
- x.509 certificate authentication
- LDAP proxy authentication
- Kerberos authentication
Authorization:
- Role-based access control

Replication
Replication provides redundancy and increases data availability.

Sharding (horizontal scaling)
- Sharding is a method for storing data across multiple machines.
- Used when the HDD, CPU, or RAM limits of a single machine are reached.
- Vertical scaling vs. horizontal scaling.
- Range-based vs. hash-based sharding.
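The two partitioning strategies named above can be sketched as functions that map a shard key to a shard. Everything here is an assumption made for illustration: the shard names, the range boundaries, and the multiplicative hash are invented and are not what MongoDB uses internally.

```javascript
// Hypothetical cluster of three shards.
const SHARDS = ["shard0", "shard1", "shard2"];

// Range-based: contiguous key ranges map to shards.
// Assumed boundaries: keys < 1000 -> shard0, keys < 2000 -> shard1, rest -> shard2.
const RANGE_UPPER_BOUNDS = [1000, 2000];

function rangeShard(key) {
  const i = RANGE_UPPER_BOUNDS.findIndex((bound) => key < bound);
  return SHARDS[i === -1 ? SHARDS.length - 1 : i];
}

// Hash-based: a hash of the key picks the shard.
// A simple 32-bit multiplicative hash stands in for a real hash function.
function hashShard(key) {
  const h = Math.imul(key, 2654435761) >>> 0; // force an unsigned 32-bit value
  return SHARDS[h % SHARDS.length];
}

for (const key of [5, 6, 1500, 999999]) {
  console.log(key, rangeShard(key), hashShard(key));
}
```

Range-based placement keeps neighboring keys on the same shard, which helps range queries but can concentrate load on one shard. Hash-based placement ignores key order, so adjacent keys tend to land on different shards and range queries generally have to visit several of them.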

How to access MongoDB?
- Drivers:
- Administration interfaces:

C# code example

    // Uses the legacy 1.x C# driver API (GetServer, Insert, FindOne, Save, Update, Remove).
    using MongoDB.Bson;
    using MongoDB.Driver;
    using MongoDB.Driver.Builders;

    var connectionString = "mongodb://localhost";
    var client = new MongoClient(connectionString);
    var server = client.GetServer();
    var database = server.GetDatabase("test");
    var collection = database.GetCollection<Entity>("entities");

    // Insert a new entity
    var entity = new Entity { Name = "Tom" };
    collection.Insert(entity);
    var id = entity.Id;

    // Retrieve
    var query = Query<Entity>.EQ(e => e.Id, id);
    entity = collection.FindOne(query);

    // Save (update): sends the full content of the entity to be updated.
    entity.Name = "Nick";
    collection.Save(entity);

    // Update: sends only partial content of the entity to be updated.
    var update = Update<Entity>.Set(e => e.Name, "Harry");
    collection.Update(query, update);

    // Delete the entity
    collection.Remove(query);

    public class Entity
    {
        public ObjectId Id { get; set; }
        public string Name { get; set; }
    }

Document states shown on the slide: { _id: " ", Name: "Tom" }, { _id: " ", Name: "Nick" }, { _id: " ", Name: "Nick" }

Some of the MongoDB Shell methods

    db.inventory.find( { type: "snacks" } )
    db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
    db.inventory.insert( { _id: 10, type: "misc", item: "card", qty: 15 } )
    db.inventory.find( { type: 'food' } ).explain()

Output of explain():

    {
      "cursor": "BtreeCursor type_1",
      "isMultiKey": false,
      "n": 5,
      "nscannedObjects": 5,
      "nscanned": 5,
      "nscannedObjectsAllPlans": 5,
      "nscannedAllPlans": 5,
      "scanAndOrder": false,
      "indexOnly": false,
      "nYields": 0,
      "nChunkSkips": 0,
      "millis": 0,
      "indexBounds": { "type": [ [ "food", "food" ] ] },
      "server": "mongodbo0.example.net:27017"
    }
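To show how a query document such as `{ type: 'food', price: { $lt: 9.95 } }` reads, here is a small in-memory stand-in written in plain JavaScript. It is a sketch, not how MongoDB evaluates queries: the functions `matches` and `find` and the sample `inventory` data are hypothetical, and only equality and `$lt` are supported.

```javascript
// Minimal matcher: every field in the filter must match the document.
// Supports plain equality and the $lt operator only.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, cond]) => {
    if (cond !== null && typeof cond === "object" && "$lt" in cond) {
      return doc[field] < cond.$lt; // a missing field compares as false
    }
    return doc[field] === cond;
  });
}

// In-memory stand-in for a collection's find().
function find(collection, filter) {
  return collection.filter((doc) => matches(doc, filter));
}

// Made-up sample data.
const inventory = [
  { _id: 1, type: "food", item: "apple", price: 1.5 },
  { _id: 2, type: "food", item: "cake", price: 12.0 },
  { _id: 3, type: "snacks", item: "chips", price: 2.25 },
];

const cheapFood = find(inventory, { type: "food", price: { $lt: 9.95 } });
console.log(cheapFood.map((doc) => doc.item));
```

The filter is itself a document: field names on the left, and either a literal value or an operator object on the right, with all conditions combined as a logical AND.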

What is missing (from the RDBMS perspective)
- No JOIN support
- No complex transaction support
- No constraint support (constraints have to be implemented at the application level)

Where/When to use?
Main drivers:
- Large amounts of data (Twitter: ~12TB of data per day!)
- Easier development (according to surveys): addresses the impedance mismatch problem!
In general:
- Content Management and Delivery: serve content, as well as the associated metadata (attachments, images, binary)
- Big Data too diverse, fast-changing, or massive… These include a wide variety of apps such as genomics, clickstream analysis, customer sentiment analysis, log data collection etc…
- Analytics and Reporting (data warehouse)
- Market Data Management

Problems
- Maturity!!!
- Skillset?
- Organizational change?
- What about the future?

Q&A