NoSQL continued CMSC 461 Michael Wilson

MongoDB
• MongoDB is another NoSQL solution
• Provides a bit more structure than a solution like Accumulo
• Data is stored as BSON (Binary JSON)
  • Binary-encoded JSON that extends JSON
• Allows storage of large amounts of data

SQL vs. MongoDB
• SQL has databases, tables, rows, columns
• MongoDB has databases, collections, documents, fields
• Both have primary keys, indexes
• Collection structures are not enforced heavily
  • Inserts implicitly define a document's structure; no schema is declared up front
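To make the loose collection structure concrete, here is a minimal sketch in plain Python rather than MongoDB itself. The collection is modeled as a list and each document as a dict; the name `students` and the sample fields are hypothetical, and only `_id` (MongoDB's primary key field) comes from MongoDB.

```python
# Stand-in for a MongoDB collection: a list of dict "documents".
# Nothing here talks to a real server; it only mirrors the data model.
students = []  # hypothetical collection name

# Two documents in the same collection with different sets of fields.
students.append({"_id": 1, "name": "Ada", "major": "CMSC"})
students.append({"_id": 2, "name": "Grace", "year": 3, "clubs": ["chess"]})

# Each document carries its own field names, unlike rows in a SQL table.
fields_per_document = [sorted(doc.keys()) for doc in students]

# The union of fields is what a fixed SQL schema would have had to declare.
all_fields = sorted({field for doc in students for field in doc})
```

A SQL table would need `major`, `year`, and `clubs` declared as columns (with NULLs filling the gaps) before either row could be inserted.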

Interacting with MongoDB
• Multiple databases within MongoDB
• Switch databases
  • use newDb
  • A new database is only stored once something is inserted into it
• Create collection
  • db.createCollection("collectionName")
  • Not necessary, collections are implicitly created on insert

BSON
• MongoDB uses BSON very heavily
• Binary JSON
  • Like JSON with a binary serialization method
  • Has extensions so that it can represent data types that JSON cannot
• Used to represent documents, provide input to queries
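The binary serialization can be sketched by hand for a tiny subset of BSON. The encoder below is an illustration only: it handles just UTF-8 strings and 32-bit integers, and the function names are hypothetical. A real application would use the BSON implementation that ships with a MongoDB driver.

```python
import struct


def encode_cstring(text):
    """Field names are stored as NUL-terminated UTF-8 strings."""
    return text.encode("utf-8") + b"\x00"


def encode_element(name, value):
    """Encode one field as: type byte, field name, then the payload."""
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02 (string): int32 byte length including the NUL, then bytes.
        return b"\x02" + encode_cstring(name) + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # Type 0x10 (int32): little-endian signed 32-bit integer.
        return b"\x10" + encode_cstring(name) + struct.pack("<i", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def encode_document(doc):
    """A document is: int32 total size, the elements, then a 0x00 terminator."""
    body = b"".join(encode_element(name, value) for name, value in doc.items())
    total_size = 4 + len(body) + 1  # size prefix + elements + terminator
    return struct.pack("<i", total_size) + body + b"\x00"


encoded = encode_document({"hello": "world"})
```

The type byte in front of every field is what lets BSON carry types that JSON has no syntax for, such as dates and raw binary data.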

Selects/queries
• In MongoDB, querying typically consists of providing an appropriately crafted BSON document
• SELECT * FROM collectionName
  • db.collectionName.find()
• SELECT * FROM collectionName WHERE field = value
  • db.collectionName.find( {field: value} )
• SELECT * FROM collectionName WHERE field > 5
  • db.collectionName.find( {field: {$gt: 5} } )
• Other functions that take a query argument expect queries formatted the same way
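How a query document turns into a filter can be sketched in plain Python. This is a simplified model, not MongoDB's matching algorithm: it supports only equality and `$gt`, ignores arrays and nested documents, and the helper names `matches` and `find` are hypothetical.

```python
def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"field": {"$gt": 5}}.
            for operator, operand in condition.items():
                if operator == "$gt":
                    if not value > operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator in this sketch: {operator}")
        elif value != condition:
            # Plain form, e.g. {"field": 5}, means equality.
            return False
    return True


def find(collection, query=None):
    """An empty or missing query matches everything, like find() with no argument."""
    return [doc for doc in collection if matches(doc, query or {})]


collection = [{"field": 3}, {"field": 5}, {"field": 8}, {"other": 1}]

everything = find(collection)                       # SELECT * FROM collectionName
equal_to_5 = find(collection, {"field": 5})         # ... WHERE field = 5
greater_than_5 = find(collection, {"field": {"$gt": 5}})  # ... WHERE field > 5
```

The query is itself data, which is why the same query document can be passed unchanged to update and delete operations.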

Interacting with MongoDB
• Insert
  • db.collectionName.insert( {documentBSON} )
• Update
  • db.collectionName.update( {queryBSON}, {updateBSON}, {optionBSON} )
  • updateBSON
    • Set field to 5: {$set: {field: 5}}
    • Increment field by 1: {$inc: {field: 1}}
  • optionBSON
    • Options that determine whether to create a new document when none matches (upsert), whether to update more than one document (multi), and the write concern
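The effect of the two update operators can be sketched on a single in-memory document. This is plain Python, not the MongoDB driver; `apply_update` is a hypothetical helper that covers only `$set` and `$inc` on top-level fields.

```python
def apply_update(document, update):
    """Apply a MongoDB-style update document to a dict, in place."""
    for operator, changes in update.items():
        for field, operand in changes.items():
            if operator == "$set":
                # Overwrite the field, creating it if it does not exist.
                document[field] = operand
            elif operator == "$inc":
                # Add to the field; a missing field starts from 0.
                document[field] = document.get(field, 0) + operand
            else:
                raise ValueError(f"unsupported operator in this sketch: {operator}")
    return document


doc = {"name": "widget", "field": 1}

apply_update(doc, {"$set": {"field": 5}})
after_set = doc["field"]

apply_update(doc, {"$inc": {"field": 1}})
after_inc = doc["field"]
```

Because the update document names only the fields it touches, the rest of the document (here, `name`) is left as it was.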

Interacting with MongoDB
• Delete
  • db.collectionName.remove( {queryBSON} )

Apache Hive
• Also runs on Hadoop, uses HDFS as a data store
• Queryable like SQL
  • Using an SQL-inspired language, HiveQL

Hive data organization
• Databases
• Tables
• Partitions
  • Tables are broken down into partitions
  • Partition keys allow data to be stored in separate data files on HDFS
  • Can query on particular partitions
• Buckets
  • Can bucket by column to sample data
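The idea behind partitions and buckets can be sketched without Hive. In the Python below, the `/warehouse` prefix, the table name `clicks`, and the sample rows are hypothetical; the `key=value` directory naming follows Hive's partition layout, while `zlib.crc32` is only a stand-in for Hive's own bucketing hash.

```python
import zlib
from collections import defaultdict


def partition_path(table, partition_key, value):
    """One directory per partition value, named key=value."""
    return f"/warehouse/{table}/{partition_key}={value}"


def write_partitioned(rows, table, partition_key):
    """Group rows by the directory their partition value maps to."""
    files = defaultdict(list)
    for row in rows:
        files[partition_path(table, partition_key, row[partition_key])].append(row)
    return dict(files)


def bucket_of(value, num_buckets):
    """Assign a column value to a bucket by hashing it (stand-in hash)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


rows = [
    {"user": "ann", "country": "US", "clicks": 3},
    {"user": "bob", "country": "DE", "clicks": 7},
    {"user": "cat", "country": "US", "clicks": 1},
]

files = write_partitioned(rows, "clicks", "country")

# A query restricted to country = 'US' only has to read one directory.
us_rows = files[partition_path("clicks", "country", "US")]

# Sampling: reading a single bucket sees roughly 1/num_buckets of the users.
sample = [row for row in rows if bucket_of(row["user"], 4) == 0]
```

Partitioning pays off when queries filter on the partition key; bucketing pays off when a representative slice of the data is enough.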

Purpose of Hive
• Provide analytics, query large volumes of data
• NOT to be used for real-time queries the way Postgres or Oracle are
  • Hive queries are slow: expect long waits rather than interactive response times
  • Partitions and buckets can help reduce this amount of time

Hive queries
• Hive queries actually generate MapReduce jobs
  • MapReduce jobs take a while to set up and run
• MapReduce jobs can be run manually, but for structured data and analytics, Hive can be used
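As a rough picture of what such a generated job does, the sketch below runs a query like `SELECT country, COUNT(*) FROM clicks GROUP BY country` as map, shuffle, and reduce steps in a single Python process. The table, column, and function names are hypothetical, and this is not the plan Hive would actually emit; a real job spreads these steps across a cluster.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: emit (group key, 1) for every input row."""
    for row in rows:
        yield row["country"], 1


def shuffle(pairs):
    """Shuffle: bring all values with the same key together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into one result, here COUNT(*)."""
    return {key: sum(values) for key, values in groups.items()}


rows = [
    {"user": "ann", "country": "US"},
    {"user": "bob", "country": "DE"},
    {"user": "cat", "country": "US"},
]

counts = reduce_phase(shuffle(map_phase(rows)))
```

Writing the three functions by hand is the "run manually" route; Hive's value is that the one-line query is enough for structured data.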