NoSQL, JSON and BSON Overview

Slides:



Advertisements
Similar presentations
Power BI Sites and Mobile BI. What You Will Learn Sharing and Collaboration Introducing Power BI Exploring Power BI Features and Services Partner Opportunities.
Advertisements

HadoopDB Inneke Ponet.  Introduction  Technologies for data analysis  HadoopDB  Desired properties  Layers of HadoopDB  HadoopDB Components.
A Fast Growing Market. Interesting New Players Lyzasoft.
NoSQL Databases: MongoDB vs Cassandra
In 10 minutes Mohannad El Dafrawy Sara Rodriguez Lino Valdivia Jr.
T Sponsors Paul Larsen Principal Program Manager, Microsoft Integrating cloud with existing IBM Systems BizTalk Summit 2015 – London ExCeL London | April.
GGF Toronto Spitfire A Relational DB Service for the Grid Peter Z. Kunszt European DataGrid Data Management CERN Database Group.
Overview Distributed vs. decentralized Why distributed databases
Presentation by Krishna
Chapter 1 Introduction to Databases
Module 14: Scalability and High Availability. Overview Key high availability features available in Oracle and SQL Server Key scalability features available.
Passage Three Introduction to Microsoft SQL Server 2000.
A Study in NoSQL & Distributed Database Systems John Hawkins.
Cross Platform Mobile Backend with Mobile Services James
A Brief Overview by Aditya Dutt March 18 th ’ Aditya Inc.
Software Engineer, #MongoDBDays.
Overview of SQL Server Alka Arora.
Training Workshop Windows Azure Platform. Presentation Outline (hidden slide): Technical Level: 200 Intended Audience: Developers Objectives (what do.
Sofia, Bulgaria | 9-10 October SQL Server 2005 High Availability for developers Vladimir Tchalkov Crossroad Ltd. Vladimir Tchalkov Crossroad Ltd.
HBase A column-centered database 1. Overview An Apache project Influenced by Google’s BigTable Built on Hadoop ▫A distributed file system ▫Supports Map-Reduce.
Chapter 7: Database Systems Succeeding with Technology: Second Edition.
Goodbye rows and tables, hello documents and collections.
Physical Database Design Chapter 6. Physical Design and implementation 1.Translate global logical data model for target DBMS  1.1Design base relations.
Introduction to Hadoop and HDFS
DBSQL 14-1 Copyright © Genetic Computer School 2009 Chapter 14 Microsoft SQL Server.
NoSQL Databases Oracle - Berkeley DB Rasanjalee DM Smriti J CSC 8711 Instructor: Dr. Raj Sunderraman.
NoSQL Databases Oracle - Berkeley DB. Content A brief intro to NoSQL About Berkeley Db About our application.
Massively Distributed Database Systems - Distributed DBS Spring 2014 Ki-Joune Li Pusan National University.
WINDOWS AZURE STORAGE SERVICES A brief comparison and overview of storage services offered by Microsoft.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Architecture.
 2009 Calpont Corporation 1 Calpont Open Source Columnar Storage Engine for Scalable MySQL Data Warehousing April 22, 2009 MySQL User Conference Santa.
MongoDB is a database management system designed for web applications and internet infrastructure. The data model and persistence strategies are built.
Securely Synchronize and Share Enterprise Files across Desktops, Web, and Mobile with EasiShare on the Powerful Microsoft Azure Cloud Platform MICROSOFT.
Windows Azure. Azure Application platform for the public cloud. Windows Azure is an operating system You can: – build a web application that runs.
NoSQL Or Peles. What is NoSQL A collection of various technologies meant to work around RDBMS limitations (mostly performance) Not much of a definition...
Senior Solutions Architect, MongoDB Inc. Massimo Brignoli #MongoDB Introduction to Sharding.
IAnywhere Solutions Mobile Computing on Linux Eyun Lindberg
ViaSQL Technical Overview. Viaserv, Inc. 2 ViaSQL Support for S/390 n Originally a VSE product n OS/390 version released in 1999 n Identical features.
Microsoft Azure and DataStax: Start Anywhere and Scale to Any Size in the Cloud, On- Premises, or Both with a Leading Distributed Database MICROSOFT AZURE.
JSON C# Libraries Parsing JSON Files “Deserialize” OR Generating JSON Files “Serialize” JavaScriptSerializer.NET Class JSON.NET.
Introduction to Core Database Concepts Getting started with Databases and Structure Query Language (SQL)
 Cloud Computing technology basics Platform Evolution Advantages  Microsoft Windows Azure technology basics Windows Azure – A Lap around the platform.
BIG DATA/ Hadoop Interview Questions.
Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.
Abstract MarkLogic Database – Only Enterprise NoSQL DB Aashi Rastogi, Sanket V. Patel Department of Computer Science University of Bridgeport, Bridgeport,
DreamFactory for Microsoft Azure Is an Open Source REST API Platform That Enables Mobilization of Data in Minutes across Frameworks and Storage Methods.
Microsoft Ignite /28/2017 6:07 PM
Course: Cluster, grid and cloud computing systems Course author: Prof
NO SQL for SQL DBA Dilip Nayak & Dan Hess.
and Big Data Storage Systems
Data Platform and Analytics Foundational Training
Univa Grid Engine Makes Work Management Automatic and Efficient, Accelerates Deployment of Cloud Services with Power of Microsoft Azure MICROSOFT AZURE.
Organizations Are Embracing New Opportunities
MongoDB Er. Shiva K. Shrestha ME Computer, NCIT
Open Source distributed document DB for an enterprise
MongoDB Distributed Write and Read
Learning MongoDB ZhangGang
NOSQL.
Couchbase Server is a NoSQL Database with a SQL-Based Query Language
The Improvement of PaaS Platform ZENG Shu-Qing, Xu Jie-Bin 2010 First International Conference on Networking and Distributed Computing SQUARE.
Twitter & NoSQL Integration with MVC4 Web API
NOSQL databases and Big Data Storage Systems
What is database? Types and Examples
CS5220 Advanced Topics in Web Programming Introduction to MongoDB
Database Management Systems
CMPE 280 Web UI Design and Development March 14 Class Meeting
NoSQL databases An introduction and comparison between Mongodb and Mysql document store.
Server & Tools Business
Presentation transcript:

NoSQL, JSON and BSON Overview Informix Chat With The Lab NoSQL, JSON and BSON Overview John F. Miller III Lead Architect for Informix

Outline Technical Opportunities / Motivation Quick Overview of JSON Hybrid Storage and Application Hybrid Analytics What is Sharding? Programming Compatibility

New Era in Application Requirements Store data from web/mobile application in their native form New web applications use JSON for storing and exchanging information Very lightweight – write more efficient applications It is also the preferred data format for mobile application back-ends Move from development to production in no time! Ability to create and deploy flexible JSON schema Gives power to application developers by reducing dependency on IT Ideal for agile, rapid development and continuous integration

What is a NoSQL Document Store? Not Only SQL or NOt allowing SQL A non-relational database management systems Flexible schema Avoids join operations Scales horizontally Eventually consistent (no ACID) Good with distributing data and fast application development Provides a mechanism for storage and retrieval of data while providing horizontal scaling.

Partnership with IBM and MongoDB MongoDB and IBM announced a partnership in June 2013 There are many common use cases of interest addressed by the partnership Accessing JSON Data in DB2, Informix MongoDB using JSON query Schema-less JSON Data for variety of applications Making JSON data available for variety of applications Securing JSON Data IBM and MongoDB are collaborating in 3 areas: Open Governance: Standards and Open Source Technology areas of mutual interest Products and solutions

IBM Use Case Characteristics for JSON Application not constrained by fixed pre-defined schema Ability to handle a mix of structured and unstructured data Schema flexibility and development agility Rapid horizontal scalability Ability to add or delete nodes dynamically in the Cloud/Grid Application transparent elasticity Dynamic elasticity 24x7x365 availability Online maintenance operations Ability to upgrade hardware or software without down time Continuous availability Ability to handle thousands of users Typically millisecond response time Consistent low latency, even under high loads Commonly available hardware (Windows & Linux,…) Low cost infrastructure Ease of deployment Install, configure add to exiting environment in minutes Reduced administration and maintenance

Example of Supported JSON Types There are 6 types of JSON Values Example of each JSON type Mongo-specific JSON types in blue date { “Key”:”Value” "string":"John", "number":123.45, "boolean":true, "array":[ "a", "b", "c" ], "object: { "str":"Miller", "num":711 }, "value": NULL, "date": ISODate("2013-10-01T00:33:14.000Z") }

Basic Translation Terms/Concepts {"name":"John","age":21} {"name":"Tim","age":28} {"name":"Scott","age":30} Collection Name Age John 21 Tim 28 Scott 30 Key Value Document

Building a real Life Application

IOD Attendee Photo Application Allow conference attendee to take and share photo! Web application geared for smart devices allowing attendees to take and view photos View the most popular pictures View the pictures you took See what pictures are trending Allow users to ask for more information

Technology Highlights Create a hybrid application using NoSQL, traditional SQL, timeseries mobile web application Utilizing both JSON collections, SQL tables and timeseries Utilize IBM Dojo Mobile tools to build a mobile application Leverage new mongo client side drivers for fast application development and deployment Demonstrate scale-out using sharding with over 10 nodes Cloud based solution using SoftLayer Can be deployed on PureFlex or Amazon Cloud Provide real-time analytics on all forms of data Leverage existing popular analytic front-end IBM-Congos Utilize an in-memory columnar database accelerator to provide real- time trending analytics on data

Mobile Device Application Architecture IOD Photo App - UPLOAD Van Gogh tag Apache Web Server Photo Application IBM Dojo Mobile Informix Photo collection Informix JSON Listener User Table

Photo Application Schema TimeSeries NoSQL Collections activity_photos activity_data timeseries(photo_like) photos Data BSON Contacts Data BSON

Application Considerations Flexible Schema Photo meta-data varies from camera to camera A Picture and all its meta data are stored in-document Pictures are stored in a JSON collection Pre-processing on the phone ensures only reasonable size photos are sent over the network.

Example of Live JSON Photo Data JSON Data {"_id":ObjectId("526157c8112c2fe70cc06a75"), "Make":"NIKON CORPORA TION","Model":"NIKON D60","Orientation":"1","XResolution":"300"," YResolution":"300","ResolutionUnit":"2","Software":"Ver.1.00 ","D ateTime":"2013:05:15 19:46:36","YCbCrPositioning":"2","ExifIFDPoi nter":"216","ExposureTime":"0.005","FNumber":"7.1","ExposureProgr am":"Not defined","ISOSpeedRatings":"100", "Contrast":"Normal","Saturation":"Normal","Sharpness":"Normal", "SubjectDistanceRange":"Unknown","name":"DSC_0078.JPG","img_d ata":"data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDABcQ

Why do you need hybrid access? Data model should not restrict Data Access

Access to relational tables & JSON Collections SQL API Standard ODBC, JDBC, .NET, OData, etc. Language SQL. MongoDB API (NoSQL) Mongo APIs for Java, Javascript, C++, C#, etc. Direct SQL Access. Dynamic Views Row types Mongo APIs for Java, Javascript, C++, C#, etc.

Ability for All Clients to Access All Data Models Informix 12.1 SQL APIs JDBC, ODBC SQL Tables MongoDB Drivers JSON Collections TimeSeries MQ Series 18

Enterprise replication + Flexible Grid + Sharding MongoAPI Accessing Both NoSQL and Relational Tables Mongo Application JSON JSON Informix IBM Wire Listener db.customer.find({state:”MO”}) db.partners.find({state:”CA”}) Access JSON Access Relational SELECT bson_new(bson, ‘{}’) FROM customer WHERE bson_value_lvarchar(bson,‘state’)=“MO” SELECT * FROM partners WHERE state=“CA” JSON Collections Tables Customer IDXs Distributed Queries Tables Relational Tables partners Logs IDXs Enterprise replication + Flexible Grid + Sharding

Informix Specific Advantages with Mongo Drivers Traditional SQL tables and JSON collections co-existing in the same database Using the MongoDB client drivers Query, insert, update, delete JSON collections Traditional SQL tables Timeseries data Join SQL tables to JSON collections utilizing indexes Execute business logic in stored procedures Provide a view of JSON collections as a SQL table Allows existing SQL tools to access JSON data Enterprise level functionality

Real Time Analytics Customer Issues Solution Several different models of data (SQL, NoSQL, TimeSeries/Sensor) NoSQL is not strong building relations between collections Most valuable analytics combine the results of all data models Most prominent analytic system written using standard SQL ETL & YAS (Yet Another System) Solution Enables common tools like Cognos Provide a mapping of the required data in SQL form

Analytics on a Hybrid Database Informix Photo collection SQL MongoAPI User Table

Photo Application SQL Mapping of NoSQL PHOTO Collection activity_photos activity_data timeseries(photo_like) photos Data BSON Contacts Data BSON

New Informix NoSQL/JSON Capabilities 24

Two New Data Types JSON and BSON Native JSON and BSON data types Index support for NoSQL data types Native operators and comparator functions allow for direct manipulation of the BSON data type Database Server seamlessly converts to and from JSON  BSON Character data  JSON

Indexing Supports B-Tree indexes on any key-value pairs. Typed indices could be on simple basic type (int, decimal,) Type-less indices could be created on BSON and use BSON type comparison Informix translates ensureIndex() to CREATE INDEX Informix translates dropIndex() to DROP INDEX Mongo Operation SQL Operation db.customers.ensureIndex( {orderDate:1, zip:-1}) CREATE INDEX IF NOT EXISTS v_customer_2 ON customer (bson_extract(data,‘orderDate') ASC, bson_extract(data,‘zip') DESC) USING BSON {orderDate:1},{unique:true}) CREATE UNIQUE INDEX IF NOT EXISTS v_customer_3 ON customer (bson_extract(data,'c1') ASC USING BSON

Sharding Dynamic Elasticity Rapid horizontal scalability Ability for the application to grow by adding low cost hardware to the solution Ability to add or delete nodes dynamically Ability rebalance the data dynamically Application transparent elasticity Sharding

Difference between Sharding Data VS Replication Shard Key state= “CA” Shard Key state= “WA” Sharding Replication Each node holds a portion of the data Hash Expression Same data on each node Inserted data is placed on the correct node Data is copied to all nodes Operations are shipped to applicable nodes Work on local copy and modification are propagated Shard Key state= “OR”

Scaling Out Using Sharded Inserts Shard Key state= “CA” Shard Key state= “WA” Row state = “OR” Insert row sent to your local shard Automatically forward the data to the proper shard Shard Key state= “OR” 29

Scaling Out Adding a Shard Shard Key state= “CA” Shard Key state= “WA” Shard Key state= “OR” Shard Key state= “NV” Send command to local node New shard dynamically added, data re-distributed (if required) Command Add Shard “NV”

Sharding is not for Data Availability Sharding is for growth, not availability Redundancy of a node provides high availability for the data Both Mongo and Informix allow for multiple redundant nodes Mongo refers to this as Replica Sets and the additional nodes slaves Informix refers to this as H/A, and additional secondary nodes Term Description Informix Term Shard A single node or a group of nodes holding the same data (replica set) Instance Replica Set A collection of nodes contain the same data HA Cluster Shard Key The field that dictates the distribution of the documents. Must always exist in a document. Sharded Cluster A group shards were each shard contains a portion of the data. Grid/Region Slave A server which contains a second copy of the data for read only processing. HA Secondary Server

Informix Secondary Servers Features of Informix secondary server: Provide high availability Can have one or more secondary servers Synchronous or asynchronous secondary servers Automatic promotion upon server failure Scale out Execute select Allow Insert/Update/Deletes on the secondary servers Secondary server can have their own disk or share disks with the master node Connection manager routes users connection based on policies and server availability

Informix NoSQL Cluster Architecture Overview Scaling in both directions Secondary server(s) provide HA and scaling Informix Shard 1 Informix/1 Secondary Disk or Diskless Secondary 1 Informix Shard 2 Informix/1 Secondary Disk or Diskless Secondary 2 Informix Shard 3 Informix/1 Secondary Disk or Diskless Secondary 3 Informix Shard 4 Informix/1 Secondary Disk or Diskless Secondary 4 Flexible Grid + Sharding

Provide mongodb Compatible Programming

Client Applications New Wire Protocol Listener supports existing MongoDB drivers Connect to MongoDB or Informix with same application! MongoDB native Client MongoDB web browser Mobile Applications MongoDB Wire Protocol MongoDB driver JDBC Driver IBM NoSQL Wire Protocol Listener Informix DB

MongoDB Application Driver Compatibly Ability to use any of the MongoDB client drivers and frameworks against the Informix Database Server Little to no change required when running MongoDB programs Informix listens on the same default port as mongo, no need to change. Leverage the different programming languages available Support Languages C Perl C# PHP Erland Python Java Ruby JavaScript Scala Node.js Community Languages ActionScript3 Erlang Lua Clojure Factor MatLab ColdFusion Fantom OCaml D F# Opa Dart Go Prolog Delphi Groovy R Entity Lisp Smalltalk << MORE >>

Ability for All Clients to Access All Data Models Informix SQLI Drivers MongoDB Drivers IBM DRDA Drivers Traditional SQL NoSQL - JSON TimeSeries MQ Series

MongoAPI Accessing Both NoSQL and Relational Tables Typically NoSQL does not involve transactions In many cases, a document update is atomic, but not the application statement Example 7 targeted for deletion, but only 4 are removed Informix-NoSQL provides transactions on all application statements Each server operation INSERT, UPDATE, DELETE, SELECT will automatically be committed after each operation. In Informix you can utilize multi-statement transactions Default isolation level is DIRTY READ All standard isolation level support

The Hybrid Solution Informix has the Best of Both Worlds Relational and non-relational data in one system NoSQL/MongoDB Apps can access Informix Relational Tables Distributed Queries Multi-statement Transactions Enterprise Proven Reliability Enterprise Scalability Enterprise Level Availability Informix provides the capability to leverage the abilities of both relational DBMS and document store systems.

Please Visit www.nosqldemo.com And have Fun

The Board of Directors of the International Informix Users Group (IIUG) announce the: 2014 IIUG Informix Conference April 27 – May 1, 2014 J.W. Marriott Hotel (Brickell) Miami, Florida, USA For more details visit the official conference web site www.iiug2014.org Registration is now open. Beat the price increase on January 15, 2014 and save $225!

Questions