Seminar: Deep Dive into Oracle NoSQL Technologies and Solutions Presenter: Zohar Elkayam, CTO, Brillix.



2 Daily schedule for Oracle Week
- 09:00-10:30: Lecture
- 10:30-10:45: Coffee break
- 10:45-12:30: Lecture
- 12:30-13:30: Lunch
- 13:30-14:50: Lecture
- 14:50-15:00: Break
- 15:00-16:30: Lecture

3 Agenda
- Introduction to Big Data and Oracle NoSQL Database
  - Oracle NoSQL Database Features
  - Oracle NoSQL Database Architecture
- Oracle NoSQL Database Planning and Installation Workflow
- Create and Configure KV Store
- Manage Memory Operations
- Optimizing KVStore Performance and Memory Sizing
- Setting Replication Node Policy
- Working with Storage Node Parameters
- Perform Store Backup and Recovery
- Diagnosing and Troubleshooting the KVStore

Big Data Technology

5 What is Big Data?

6 Volume

7 Velocity

8 Variety

9 And let’s remember our goal: Value!

How do we “do” Big Data? 10

Big Data: Infrastructure Requirements
Phases: Acquire, Organize, Analyze
Requirements shown on the diagram:
- Low, predictable Latency
- High Transaction Volume
- Flexible Data Structures
- High Throughput
- In-Place Preparation
- All Data Sources/Structures
- Deep Analytics
- Agile Development
- Massive Scalability
- Real Time Results

Divided Solution Spectrum
Phases: Acquire, Organize, Analyze
Data variety axis: Dynamic Schema to Schema
- NoSQL side (Flexible, Specialized, Developer Centric): Distributed File Systems, Transaction (Key-Value) Stores, MapReduce Solutions
- SQL side (Trusted, Secure, Administered): DBMS (OLTP), DBMS (DW), Advanced Analytics
- ETL

Oracle Integrated Software Solution Stack
Phases: Acquire, Organize, Analyze
Data variety axis: Dynamic Schema to Schema
Components shown on the diagram:
- Oracle NoSQL DB
- HDFS
- Hadoop
- Oracle Data Integrator
- Oracle Loader for Hadoop
- Oracle Database (OLTP)
- Oracle Database (DW)
- In-DB Analytics (“R”, Mining, Text, Graph, Spatial)
- Oracle BI EE

Short Intro to NoSQL (and basic database theories…) 14

15 The Challenge RDBMS is too generic and doesn’t cut it any more. We want scalable, durable, high-volume, non-structured, distributed data storage that fits our specific needs.

16 The solution: NoSQL Databases Let’s take some parts of the standard RDBMS out and replace them with things we actually need. NoSQL databases are designed for specific uses. NoSQL databases have been around for ages under different names/solutions.

17 Kinds of NoSQL
Key-Value stores
- Simple K/V lookups (DHT)
Column stores
- Each key is associated with many attributes (columns)
- NoSQL column stores are actually hybrid row/column stores
Document stores
- Store semi-structured documents (JSON)
- Map/Reduce based materialization, sorting, aggregation, etc.
Graph databases
- Not exactly NoSQL: can’t satisfy the requirements for High Availability and Scalability/Elasticity very well

18 What Is a NoSQL Database? What does NoSQL stand for? Is it No SQL or “Not Only” SQL? What does ACID transaction mean? What is the CAP theorem?

19 ACID Transactions
RDBMS are built with ACID transactions in mind:
- Atomicity: All or nothing.
- Consistency: Any transaction takes the DB from one consistent state to another, with no broken constraints.
- Isolation: Other operations cannot access data modified by a transaction that has not completed yet.
- Durability: Committed transaction updates can be recovered after any kind of system failure (transaction log).

20 ACID Transactions (cont.) ACID is usually implemented by a locking mechanism/manager. In distributed systems, central locking would be a bottleneck. Most NoSQL databases do not use ACID transactions and replace them with something else…
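
To make the locking point concrete, here is a minimal sketch of all-or-nothing updates guarded by a single lock. The `TinyStore` class and its `transact` method are hypothetical names invented for this illustration; they are not part of any Oracle product, and durability (the transaction log) is deliberately not modeled.

```python
import threading


class TinyStore:
    """Toy in-memory store (hypothetical): one global lock, all-or-nothing batches."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # stands in for a central lock manager

    def transact(self, updates, check):
        """Apply the dict `updates` atomically; apply nothing if `check` rejects the result."""
        with self._lock:                  # isolation: one transaction at a time
            candidate = dict(self._data)  # work on a private copy
            candidate.update(updates)
            if not check(candidate):      # consistency: constraint check
                return False              # atomicity: the store is left untouched
            self._data = candidate        # commit: swap in the new state
            return True

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def no_overdraft(state):
    """Constraint used in the example: no balance may go negative."""
    return all(balance >= 0 for balance in state.values())


store = TinyStore()
store.transact({"alice": 100, "bob": 50}, no_overdraft)
ok = store.transact({"alice": 150, "bob": 0}, no_overdraft)         # valid transfer
failed = store.transact({"alice": 300, "bob": -150}, no_overdraft)  # rejected as a whole
```

Every transaction in this sketch waits on the same lock, which is exactly the bottleneck the slide describes once the data is spread across many machines.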

21 The CAP Theorem The CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: consistency, availability, and partition tolerance.

22 The CAP Theorem properties
- Consistency: do all nodes see the same data at the same time?
- Availability: does the system guarantee that every request receives a response about whether it succeeded or failed?
- Partition tolerance: does the system continue to operate despite arbitrary message loss or failure of part of the system?

23 The CAP Theorem (cont.) According to the theorem, a distributed system can satisfy any two of these guarantees at the same time, but not all three. NoSQL databases are often designed to “give up” one of the CAP properties in order to be distributed and therefore very scalable.

24 C A P: Consistency, Availability, Partition-resilience
- CA: available, and consistent, unless there is a partition.
- AP: a reachable replica provides service even in a partition, but may be inconsistent if there is a failure.
- CP: always consistent, even in a partition, but a reachable replica may deny service without agreement of the others (e.g., quorum).
Systems placed on the diagram: Single site DB, Cluster DB (RAC), Distributed DB, DNS
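
The quorum mentioned under CP can be illustrated with the common rule of thumb that a read quorum R and a write quorum W over N replicas always share at least one replica when R + W > N. The rule and the function names below are not from the slides; this is a small sketch that checks the rule by brute force.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """True if every read set of size r shares a replica with every write set of size w."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


def is_strict_quorum(n, r, w):
    """The R + W > N rule of thumb."""
    return r + w > n


# With 3 replicas, reading 2 and writing 2 always overlap, so a read
# reaches at least one replica that holds the latest write.
overlap_2_2 = quorums_overlap(3, 2, 2)
# Reading 1 and writing 2 can miss each other, so a read may be stale.
overlap_1_2 = quorums_overlap(3, 1, 2)
```

The trade-off is the one the CP bullet describes: when too few replicas are reachable to form a quorum, the request is refused rather than answered with possibly inconsistent data.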

What is Oracle NoSQL? 25

26 Oracle NoSQL
Oracle NoSQL Database is:
- A key-value database
- Written in Java
- Accessible using Java APIs
- Built on Oracle Berkeley DB Java Edition
- The Oracle solution to acquiring big data

27 Benefits of Using Oracle NoSQL Database
Oracle NoSQL Database offers the following benefits:
- It is easy to install and configure.
- It is highly reliable.
- It is a general-purpose database system.
- It has scalable throughput and predictable latency.
- It has configurable consistency and durability.
- It has a web console for administration.

28 Supported Data Types

Common uses for the Key-Value Store
- Large dynamic schema based data repositories
- Data capture
- Web applications
- Online retail
- Sensor/statistics/network capture/Mobile Devices
- Data services
- Scalable authentication
- Real-time communication (MMS, SMS, routing)
- Personalization / Localization
- Social Networks

Oracle NoSQL DB: A distributed, scalable key-value database
Simple Data Model
- Key-value pair with major+sub-key paradigm
- Read/insert/update/delete operations
Scalability
- Dynamic data partitioning and distribution
- Optimized data access via intelligent driver
High availability
- One or more replicas
- Disaster recovery through location of replicas
- Resilient to partition master failures
- No single point of failure
Transparent load balancing
- Reads from master or replicas
- Driver is network topology & latency aware
Diagram: applications, each with a NoSQLDB Driver, connect to Storage Nodes in Data Center A and Data Center B.
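
The major+sub-key paradigm can be pictured as a two-level map: the major path groups related records, and the sub-key (minor path) picks one record inside the group. The sketch below is a plain in-memory model under that assumption; `MajorMinorStore` and its methods are hypothetical names for illustration, not the Oracle NoSQL Java API.

```python
from collections import defaultdict


class MajorMinorStore:
    """Toy two-level key-value model (hypothetical): major path -> {minor path: value}."""

    def __init__(self):
        self._groups = defaultdict(dict)

    def put(self, major, minor, value):
        """Insert or update the value stored under (major, minor)."""
        self._groups[tuple(major)][tuple(minor)] = value

    def get(self, major, minor):
        """Read one value, or None if the key is absent."""
        return self._groups.get(tuple(major), {}).get(tuple(minor))

    def delete(self, major, minor):
        """Delete one record; returns True if it existed."""
        group = self._groups.get(tuple(major), {})
        if tuple(minor) in group:
            del group[tuple(minor)]
            return True
        return False

    def multi_get(self, major):
        """Read every record that shares the same major path."""
        return dict(self._groups.get(tuple(major), {}))


kv = MajorMinorStore()
kv.put(["users", "42"], ["profile"], b"Zohar")            # insert
kv.put(["users", "42"], ["settings", "lang"], b"he")      # insert
kv.put(["users", "42"], ["profile"], b"Zohar Elkayam")    # update
user_records = kv.multi_get(["users", "42"])              # read the whole group
removed = kv.delete(["users", "42"], ["settings", "lang"])  # delete
```

Grouping by major path is what lets related records be fetched together in one call in this model.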

Client request: Operation + Key[M,m] + Value + Transaction Policy
1. Hash Major Key to determine Partition id
2. Use Partition Map to map Partition id to a Rep Group
3. Use State Table to determine eligible Storage Node(s) within Rep Group
4. Use Load Balancer to select best eligible Rep Node
5. Contact Rep Node directly
Returned to the client: Operation result, New Partition Map, RepNodeStorageTable information
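
The routing steps on this slide can be sketched in a few lines. Everything concrete below is an assumption made for illustration: the SHA-256 hash, the partition count, the layout of the partition map and state table, the rule that writes go only to the master, and the use of lowest latency as the "best" node. The slide does not say how the real driver implements any of these.

```python
import hashlib

N_PARTITIONS = 12  # assumed partition count

# Partition map (assumed layout): partition id -> replication group.
PARTITION_MAP = {pid: f"rg{pid % 3 + 1}" for pid in range(N_PARTITIONS)}


def node(name, role, up, latency_ms):
    """Build one state-table entry."""
    return {"node": name, "role": role, "up": up, "latency_ms": latency_ms}


# State table (assumed layout): what the driver knows about each group's nodes.
STATE_TABLE = {
    "rg1": [node("sn1", "master", True, 5), node("sn2", "replica", True, 2),
            node("sn3", "replica", False, 1)],
    "rg2": [node("sn4", "master", True, 3), node("sn5", "replica", True, 6),
            node("sn6", "replica", True, 4)],
    "rg3": [node("sn7", "master", True, 8), node("sn8", "replica", False, 1),
            node("sn9", "replica", True, 7)],
}


def partition_id(major_key):
    """Step 1: hash the major key to a partition id."""
    digest = hashlib.sha256("/".join(major_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % N_PARTITIONS


def route(major_key, is_write):
    """Steps 2-4: pick the rep group, filter eligible nodes, choose the best one."""
    pid = partition_id(major_key)
    group = PARTITION_MAP[pid]                                   # step 2
    eligible = [n for n in STATE_TABLE[group]                    # step 3
                if n["up"] and (not is_write or n["role"] == "master")]
    if not eligible:
        raise RuntimeError(f"no eligible node in {group}")
    best = min(eligible, key=lambda n: n["latency_ms"])          # step 4
    return pid, group, best["node"]                              # step 5: contact this node


write_target = route(["users", "42"], is_write=True)
read_target = route(["users", "42"], is_write=False)
```

Because only the major key is hashed, every record under the same major key lands in the same partition, and the client can contact the chosen node directly without an intermediate router.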

Oracle NoSQL DB Differentiation
Commercial Grade Software and Support
- General-purpose
- Reliable: based on proven Berkeley DB JE HA
- Easy to install and configure
- Scalable throughput, bounded latency
Simple Programming and Operational Model
- Simple Major + Sub key and Value data structure
- ACID transactions
- Configurable consistency & durability
Easy Management
- Web-based console, API accessible
- Manages and Monitors: Topology; Load; Performance; Events; Alerts
Completes Oracle large scale data storage offerings

33 More stuff EE or CE? What to choose?

Database components 34

Oracle KVLite 35

Schema Considerations 36

Accessing and manipulating the data using the Java API 37

Understanding consistency, transactions and versioning 38

Understanding Durability 39

Using the Admin console to configure the KVStore 40

Summary More info will be available on my blog: ZoharElkayam.wordpress.com 41

Questions and Answers 42

Thank You! Zohar Elkayam