Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

Globally Distributed Data Management
How Oracle NoSQL Database addresses the challenge
Ashok Joshi, Senior Director, Oracle NoSQL Database development

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Agenda
– Oracle NoSQL Database
– Who uses it (and why)
– Key concepts of distributed systems
– Datacenter Scenarios

What is Oracle NoSQL Database
Reliable, Flexible, Fast, Simple
An advanced key-value database designed as a cost-effective, high-performance solution for simple operations on collections of data, with built-in high availability and elastic scale-out.
Less is more.

Oracle NoSQL Database For Developers and IT
– Simple: Setup, Admin, API & Integration
– Fast: Parallel Access & Scale-out
– Flexible: Flexible schema & Agile development
– Reliable: Built-in HA, Predictable Performance

Feature overview
Key-Value Features
– Elastic
– BASE Operations
– Tables / JSON / Binary
– Online management
– Data Center Support
– Secondary Indexes
– Secure Access
– Flexible schema
Differentiators
– ACID transactions
– Online rolling upgrades
– Oracle technology integrated
– Engineered systems
– Streaming large object support
[Diagram: applications, each with a NoSQL DB Driver, connected to storage nodes in Datacenter A and Datacenter B]

Enterprise ready: integrated out of the box
– Query NoSQL data from Oracle Database
– Access NoSQL data from Hadoop for DW and analytics
– Share data with Oracle Coherence for an extensible in-memory cache grid
– Persist history and event streams for processing with Oracle Event Processing
– Store and query RDF data using Oracle RDF for NoSQL
– Replicate changes in Oracle Database to NoSQL DB using Oracle GoldenGate
– Monitor your NoSQL cluster using Oracle Enterprise Manager

Oracle Big Data Management System
[Diagram: Data Reservoir + Data Warehouse. Components shown: Oracle Big Data Connectors, Oracle Big Data SQL, Oracle Advanced Analytics, Oracle Database, Oracle Spatial & Graph, Cloudera Hadoop, Oracle R Distribution, Oracle NoSQL Database, Oracle Industry Models, Oracle GoldenGate, Oracle Data Integrator, Oracle Event Processing, Apache Flume]

YCSB on SSD-backed commodity servers
– 1.25M ops/sec
– 2 billion records
– 2 TB of data
– 95% read, 5% update
– Low latency, high scalability
What’s the big deal?
– Twitter sees ~500M tweets/day
– This is about 6K tweets/sec
– Capture all tweets with 3 commodity servers
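
The tweets-per-second figure on this slide is plain arithmetic, and a short sketch makes the rounding visible. The variable names are illustrative. The ratio at the end is only a rough sense of scale, since the YCSB run was 95% reads while tweet capture is a write workload.

```python
# Convert the quoted daily tweet volume into a per-second rate.
SECONDS_PER_DAY = 24 * 60 * 60           # 86,400

tweets_per_day = 500_000_000             # "~500M tweets/day" from the slide
tweets_per_sec = tweets_per_day / SECONDS_PER_DAY

# Benchmark throughput quoted on the slide (mixed 95% read / 5% update).
benchmark_ops_per_sec = 1_250_000

# Rough ratio of benchmark throughput to tweet arrival rate.
ratio = benchmark_ops_per_sec / tweets_per_sec

print(f"{tweets_per_sec:,.0f} tweets/sec")   # a little under 6K
print(f"benchmark rate is about {ratio:.0f}x the tweet rate")
```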

Agenda
– Oracle NoSQL Database
– Who uses it (and why)
– Key concepts of distributed systems
– Datacenter Scenarios

NoSQL for Fraud Scoring
Financial Services coordinated theft prevention
Objectives
– Combine data sources for complex scoring
– Detect and alert analysts with low latency
– Handle bursts in seasonal transaction volumes
Solution
– Oracle Coherence cluster for real-time transaction object management
– Oracle NoSQL Database for fraud model and customer profile management
– Oracle Database for statistics and fraud modeling-related data
Benefits
– Simple data model, ACID transactions
– Scalability, reliability, low latency
– Elasticity of sharded data repositories
– Easy configuration and administration
[Diagram: Data Ingestion and two Transaction Authorization Processors feed the Application, which uses the NoSQL DB Driver]

NoSQL for Customer Loyalty
Coupon redemption, Vendor recommendation
Objectives
– Scalable customer loyalty portal
– New multi-channel consumer model
– Improve operational efficiency
Solution
– Personalized multi-channel coupon generation and redemption
– Cross-promote affiliated vendors
– Scale system with customers and participating retailers
Benefits
– Scalable customer portal
– Predictable performance for all operations
– Reduced time to market
– Easy application evolution
[Diagram: Application with NoSQL DB Driver; Retail Partners, Customer Profiles, End Customers, Available Coupons, Market Segmentation]

NoSQL for Social, Online Betting
Real-time, In-Play Gaming Platform
Objectives
– Scalable in-play sports betting platform
– Increase new business revenue
– Improve operational efficiency
Solution
– Match in-play bets with incoming events
– Promote interaction between customers
– Scale system with customers and events
– Feeds MySQL database for revenue tracking and operational reporting
James Anthony, Chief Technology Officer, Passoker: “Oracle NoSQL Database enabled the rapid, scalable processing of incoming XML, ensuring high available and guaranteed event ordering.”
[Diagram: Real-Time, In-Play Sports Betting Providers send XML to the App; NoSQL DB handles Event Capture & Store; MySQL handles Accounting & Operations; Customers]

NoSQL for Sensor Event Storage & Processing
Large scale sensor data capture and analysis
Objectives
– Increase scalability of data storage
– Deliver higher-concurrency analytic data access
– Scale data loading independently from analysis
– Commercial support for a mission-critical system
Solution
– Oracle NoSQL Database for high-speed storage and range-based extraction of time series data
– Oracle NoSQL Database for agile schema; replaced the HDF5 storage format and kept the analysis client program
– Oracle Big Data Appliance for efficient manageability and lowest TCO
– Hadoop post-processing and RDBMS connectivity to enterprise systems
Benefits
– Improve scale of storage for flight test sensor data
– Increase concurrency of access to data for analysis
– Improve system availability for analysts by allowing simultaneous data ingestion and analysis
[Diagram: Big Data Appliance; NoSQL DB Driver; Event Ingestion and Extraction; NoSQL DB / Oracle RDBMS; Hadoop / Oracle RDBMS; Oracle or any third-party SQL/Data Analytics Tools]

NoSQL for Web Commerce Customer Service
Call center routing and context retrieval
Objectives
– Web commerce customer service
– Enable call center routing and dispatch
– Improved product up-sell and cross-sell
Solution
– Oracle NoSQL Database for customer profile data capture and access
– Build repository for unstructured and variable data record formats
– Deploy distributed database for world-wide access
Benefits
– Highly scalable and available database
– Flexible data formats
– Transactional key-value access
– Geographically distributed access
[Diagram: Application with NoSQL DB Driver; Customer Service, Customer Profiles, End Customers, Click-2-Call]

Agenda
– Oracle NoSQL Database
– Who uses it (and why)
– Key concepts of distributed systems
– Datacenter Scenarios

Distributed Systems: benefits and challenges
Benefits
– Improved availability (multiple copies) and disaster resilience
– Improved scalability and performance
  – Sharding
  – Read from most appropriate copy (consistency, latency, …)
– Better TCO
Challenges
– Propagation of changes to all copies
  – Server semantics of durability
  – Application semantics for updates and reads
– Network/machine/datacenter outage issues
– Ease of deployment and manageability

Distributed System Concepts
Oracle NoSQL Database context
Multiple copies of data (replication factor = number of copies)
– Durability and availability
– Read scalability
What constitutes durability of a change
– Propagated to disk
– Propagated to replicas (acks)
– Many variations possible
– Relaxing the notion of durability to improve performance can be risky, and is extremely difficult for an application developer to handle correctly
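
The "propagated to replicas (acks)" choice can be illustrated with a small model. This is a sketch in Python, not the Oracle NoSQL Database API: the policy names and the function `replica_acks_needed` are hypothetical, and the model assumes the master's own copy counts toward a majority.

```python
def replica_acks_needed(policy: str, replication_factor: int) -> int:
    """Return how many replica acknowledgements a write waits for.

    Hypothetical model: one copy is the master, the rest are replicas.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    replicas = replication_factor - 1  # copies other than the master
    if policy == "none":
        # Master does not wait for any replica.
        return 0
    if policy == "simple_majority":
        # A majority of all copies is RF // 2 + 1; the master is one of them.
        return replication_factor // 2
    if policy == "all":
        # Every replica must acknowledge.
        return replicas
    raise ValueError(f"unknown policy: {policy}")


for policy in ("none", "simple_majority", "all"):
    print(policy, replica_acks_needed(policy, replication_factor=3))
```

Waiting for fewer acknowledgements lowers write latency but widens the window in which a change exists on only one machine, which is the risk the slide warns about.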

Logical System Architecture
[Diagram: a store made up of shards, each with one master (M) and several replicas (R); the application connects through the NoSQL Driver]
– Elastic
– Auto sharding
– Writes go to the elected node in a shard
– Reads can come from any node in a shard
– Auto re-balance of data on expansion

Physical System Architecture
[Diagram: masters (M1, M2, M3) and replicas (R1, R2, R3) of three shards spread across Machine1, Machine2, and Machine3, each machine running an agent (A); the application connects through the NoSQL Driver]
– Three machines
– Nine replication nodes
– Replication factor = 3
– Intelligent placement of replication nodes

Benefits
– NoSQL Database delivers availability
  – Disk failures
  – System failures
  – NoSQL Database enforces durability rules (update locally, and propagate to replicas)
– NoSQL Database delivers read scalability
  – Read from any copy that satisfies application requirements
– NoSQL Database provides per-operation application flexibility
  – Choice of what constitutes a durable change (stable everywhere, “commit to network”, etc.)
  – Choice of read consistency (latest, any, version, time-lag) for read operations
– NoSQL Database automatically chooses the correct and optimal node to serve each request
– NoSQL Database is sharded
  – Linear scalability

Elections
– “Electable Master” node for each shard
  – Serves read and write requests
– Replicas
  – Serve read requests
– If a node dies, the surviving members of the shard hold an election when one is needed
  – If the current master node is alive, there is no need for an election
  – If the current master node goes down, the survivors elect a new master node
– Elections need majority votes (voting protocol)
– Elections are transparent to the application
  – NoSQL Database driver automatically “discovers” the new master
  – Application needs to handle transient exceptions during master re-election
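
The election rules on this slide reduce to a majority check. The sketch below is a simplified Python model under stated assumptions, not the actual voting protocol: the function names are hypothetical, every electable node has one vote, and `reachable_nodes` counts the survivors that can talk to each other.

```python
def has_quorum(electable_nodes: int, reachable_nodes: int) -> bool:
    """True when the reachable nodes form a strict majority of the shard."""
    return reachable_nodes > electable_nodes // 2


def after_node_failure(electable_nodes: int, reachable_nodes: int,
                       master_alive: bool) -> str:
    """Describe what a shard does after losing one or more nodes."""
    if master_alive:
        # A lost replica does not trigger an election.
        return "no election needed"
    if has_quorum(electable_nodes, reachable_nodes):
        # Survivors hold a majority, so they can elect a new master.
        return "elect new master"
    # Too few survivors to win a vote; the shard has no master for now.
    return "no master until quorum returns"


# Shard with three electable nodes.
print(after_node_failure(3, 2, master_alive=True))    # replica lost
print(after_node_failure(3, 2, master_alive=False))   # master lost
print(after_node_failure(3, 1, master_alive=False))   # two nodes lost
```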

Elections
[Diagram: store with several shards, each with one master (M) and replicas (R); the application connects through the NoSQL Driver]

Elections…
[Diagram: same store; one replica in a shard is unavailable]
– Replica not available
– No effect on shard

Elections…
[Diagram: same store; the master of one shard is unavailable]
– Current master not available
– Hold an election

Elections…
[Diagram: same store; a former replica in the affected shard is now marked M]
– Elect new master

Elections…
[Diagram: same store; the affected shard runs with its new master]
– Shard continues processing

Elections…
[Diagram: same store; the old master is back in the shard as a replica (R)]
– Old node is repaired and restarted
– Rejoins shard as a replica, syncs up, and serves read requests
– Note that the election is localized to the shard; other shards are not affected

Elections…
[Diagram: same store; the master role is back on the original node]
– Old node is repaired and restarted
– Rejoins shard as a replica, syncs up, and serves read requests
– NoSQL Database might choose to move the master back to the original node to “balance” the cluster

Original topology
[Diagram: Shard1, Shard2, and Shard3 with masters (M1, M2, M3) and replicas (R1, R2, R3) spread across Machine1, Machine2, and Machine3, each machine running an agent (A); the application connects through the NoSQL Driver]
– Master and replicas of a shard are placed on different machines to avoid a single point of failure (SPOF)

After machine failure
[Diagram: same three-machine layout; the Shard1 master (M1) now runs on Machine3]
– New master elected; placed on machine 3
– Each shard now has two copies
– System fully available

Re-balanced topology
[Diagram: same three-machine layout, restored to the original placement]
– Once Machine 1 is repaired, the shard 1 master is moved back to machine 1
– Transparent rebalancing to distribute load evenly
– System fully available

Summary
– Multiple copies ensure much better availability (eliminate SPOF)
– NoSQL Database automatically enforces durability semantics
– Elections need majority votes
– Application has per-operation durability and read consistency choices
– Failure of a node does not disrupt availability of the system
– NoSQL Database automatically re-balances the cluster for improved performance

Agenda
– Oracle NoSQL Database
– Who uses it
– Fundamentals of distributed systems
– Datacenter Scenarios

Creating the cluster
1. Provision hardware (could be in different locations)
   – Specify capacity of each machine
   – Start Storage Node Agent (listener) on each machine
2. Name the cluster
3. Identify zones (groups of machines for fault isolation, read-only applications, etc.)
   – Specify replication factor for each zone (cumulative RF = RF of the cluster)
   – Specify type of zone (primary or secondary)
4. Associate provisioned hardware with each zone
5. Deploy the server instances (replication nodes) on the provisioned hardware

Provision Storage Nodes
– Start the storage node agent (“listener”) on each storage node
– Specify the capacity of each storage node. In this example, the capacity of each SN is 1.

Name the cluster
– Name the store (this creates the administration database on the node that the admin client connected to)
[Diagram: Admin client machine connected to MyStore]
At this point:
– You have a set of machines
– Each machine has a capacity
– Each machine is running the storage node agent
– You have named the store and created the admin database

Start Admin process and define zone
– A zone is characterized by a name and a replication factor
– Zone1 with replication factor (RF) = 2
– Administration process for Zone1
[Diagram: MyStore with Zone 1 (RF = 2)]

Storage Node Pool
– Create Zone1Pool
– Assign storage nodes to the pool
[Diagram: MyStore with Zone 1 (RF = 2) and Zone1Pool]

Multi-zone configuration
– Create Zone2 and Zone2Pool
– Assign storage nodes to the pool
[Diagram: MyStore with Zone 1 (Zone1Pool, RF = 2) and Zone 2 (Zone2Pool, RF = 1)]

Deploy replication nodes (server instances)
[Diagram: MyStore with Zone 1 (RF = 2) and Zone 2 (RF = 1), hosting a blue shard and a green shard]
– Two shards (two copies in Zone 1, one copy in Zone 2): blue shard and green shard
– Note: at least one copy of each shard in each zone
– Six machines, each with capacity = 1. Total capacity = 6.
– Total RF for cluster (zone1 + zone2) = 3
– Capacity/RF = 2 shards
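
The shard count on this slide follows from integer division of total capacity by the store's replication factor. The Python sketch below is a simplified model: the function name `shard_count` is hypothetical, and it ignores per-zone placement constraints, so it reports only how many complete shards the total capacity supports and how many capacity slots are left over.

```python
def shard_count(machine_capacities: list[int],
                zone_rfs: dict[str, int]) -> tuple[int, int]:
    """Return (number of shards, unused capacity slots)."""
    total_capacity = sum(machine_capacities)
    store_rf = sum(zone_rfs.values())      # total copies of each shard
    if store_rf < 1:
        raise ValueError("store replication factor must be at least 1")
    shards = total_capacity // store_rf    # each shard needs store_rf slots
    unused = total_capacity - shards * store_rf
    return shards, unused


zones = {"Zone1": 2, "Zone2": 1}

print(shard_count([1] * 6, zones))   # six machines, capacity 1 each
print(shard_count([1] * 7, zones))   # a seventh machine adds a spare slot
print(shard_count([2] * 6, zones))   # six machines, capacity 2 each
```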

Illegal topology
[Diagram: MyStore with Zone 1 (RF = 2) and Zone 2 (RF = 1), marked NOT ALLOWED]
– Every zone must have at least one copy of the data
– NoSQL Database enforces this rule

Unused capacity
[Diagram: MyStore with Zone 1 (RF = 2) and Zone 2 (RF = 1)]
– Seventh machine not assigned to cluster
– Seven machines, each with capacity = 1. Total capacity = 7.
– Total RF for cluster (zone1 + zone2) = 3
– Capacity/RF = 2 shards

Double the capacity
[Diagram: MyStore with Zone 1 (RF = 2) and Zone 2 (RF = 1)]
– Four shards (two copies in Zone 1, one copy in Zone 2)
– Six machines, each with capacity = 2. Total capacity = 12.
– Total RF for cluster (zone1 + zone2) = 3
– Capacity/RF = 4 shards

Notes
– Zones are used for fault isolation
  – In the previous examples, the two zones might be located on different floors of the same datacenter, or in two different datacenters. Each zone may have an independent power supply, network connection, etc.
– Primary and secondary zones
  – Primary zone members can host electable masters
  – Secondary zone members cannot host electable masters (but can serve read requests)
  – The intention of secondary zones is to segregate R/W and RO workloads
– Flavors of replication factor (number of copies)
  – Store replication factor (total for the store)
  – Zone replication factor (zone specific)
  – Primary zone replication factor (sum of RFs for all primary zones); used to determine majority
  – Secondary zone replication factor (sum of RFs for all secondary zones)
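
The replication factor flavors listed above are sums over the zone definitions. The sketch below shows the bookkeeping in Python; the `Zone` class and `rf_summary` function are hypothetical names for illustration and are not part of any Oracle API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    rf: int          # zone replication factor
    primary: bool    # True for a primary zone, False for a secondary zone


def rf_summary(zones: list[Zone]) -> dict[str, int]:
    """Compute the store, primary, and secondary RFs plus the majority size."""
    store_rf = sum(z.rf for z in zones)
    primary_rf = sum(z.rf for z in zones if z.primary)
    return {
        "store": store_rf,
        "primary": primary_rf,
        "secondary": store_rf - primary_rf,
        # Majority is taken over primary copies only.
        "majority": primary_rf // 2 + 1,
    }


print(rf_summary([Zone("Zone1", 2, True), Zone("Zone2", 1, True)]))
print(rf_summary([Zone("Zone1", 2, True), Zone("Zone2", 1, False)]))
```

With one primary zone of RF 2 and one secondary zone of RF 1, the majority is 2, so both primary copies must be reachable for the shard to accept writes.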

Two Zone scenario

Two Primary Zones; RF=3
[Diagram: store with Shard 1, Shard 2, and Shard 3, each with a master (M) and replicas (R) spread across Zone 1 and Zone 2; the application connects through the NoSQL Driver]
– Zone1 hosts two copies; Zone2 hosts one copy
– Application accesses all nodes
– Master may “move” from one zone to another
– Loss of any single machine => full availability
– Zone2 down => full availability
– Zone1 down => read availability

Three Primary Zones

Three Primary Zones; RF=3
[Diagram: store with Shard 1, Shard 2, and Shard 3, each with a master (M) and replicas (R) spread across Zone 1, Zone 2, and Zone 3; the application connects through the NoSQL Driver]
– Each zone hosts a single copy
– Application accesses all nodes
– Master may “move” from one zone to another
– Loss of any single machine => full availability
– Single zone down => full availability
– Any two zones down => read availability

Secondary Zones

Secondary Zones
[Diagram: store with Zone 1 (primary) and Zone 2 (secondary); two applications, each connecting through a NoSQL Driver]
– Secondary zones are intended for read-intensive workloads
– Secondary zone nodes are never chosen as master
– Kept up to date through asynchronous replication
– Read-intensive workloads can be directed to the secondary zone
– Configuration parameter to direct read requests exclusively to specified zones (set_read_zone)

Secondary Zones; RF=3
[Diagram: store with Shard 1, Shard 2, and Shard 3, each with a master (M) and replicas (R) spread across Zone 1 (primary) and Zone 2 (secondary); the application connects through the NoSQL Driver]
– Zone1 hosts two copies; Zone2 hosts one copy
– Application accesses all nodes
– Master may “move” from one node to another only within Zone1
– Loss of a machine in Zone2 => full availability
– Loss of a machine in Zone1 => RO for the shard
– Zone2 down => full availability
– Zone1 down => read availability
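
The availability outcomes listed for the zone scenarios follow from one rule: a shard accepts writes only while a majority of its primary copies is reachable, and it can serve reads while any copy is reachable. The Python sketch below is a simplified model of that rule; `shard_availability` is a hypothetical name, and the model ignores consistency settings and network partitions.

```python
def shard_availability(primary_rf: int, live_primary: int,
                       live_secondary: int = 0) -> str:
    """Classify a shard as 'full', 'read-only', or 'unavailable'."""
    if live_primary > primary_rf // 2:
        # Majority of primary copies reachable: a master can be elected.
        return "full"
    if live_primary + live_secondary > 0:
        # No majority, but at least one copy can still answer reads.
        return "read-only"
    return "unavailable"


# Two primary zones, RF = 3 (two copies in Zone1, one in Zone2).
print(shard_availability(3, live_primary=2))   # Zone2 down
print(shard_availability(3, live_primary=1))   # Zone1 down

# Primary Zone1 (RF = 2) plus secondary Zone2 (RF = 1).
print(shard_availability(2, live_primary=2, live_secondary=0))  # Zone2 down
print(shard_availability(2, live_primary=1, live_secondary=1))  # one Zone1 machine lost
print(shard_availability(2, live_primary=0, live_secondary=1))  # Zone1 down
```

The secondary-zone case shows the trade-off: with a primary RF of 2, the majority is 2, so losing a single primary machine makes the shard read-only even though two copies of the data are still online.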

Conclusion
NoSQL Database for enterprise-grade performance, mixed workloads and business continuity
– NoSQL Database addresses many common distributed system scenarios
  – Preserves correctness
  – Provides application flexibility
– NoSQL Database provides enterprise-grade performance and supports mixed workloads and business continuity for a wide variety of scenarios

Oracle NoSQL Database presentations at Open World
Title | Date/Time/Location
THT: Real Time with Oracle NoSQL | Sep 29 / 4:00 PM / Mos. South Big Data Showcase
HOL9349: Oracle NoSQL Database for Application Developers | Sep 29 / 5:45 PM and Sep 30 / 10:15 AM / Hotel Nikko - Peninsula
CON 8060: Oracle NoSQL Database: What’s new – Functionality and use cases | Sep 30 / noon / Mos. South 303
MTE9350: Oracle’s Big Data Management System | Sep 30 / 6:00 PM / Mos. South 303
MTE9328: Oracle NoSQL Database: Meet the Experts | Sep 30 / 7:00 PM / Mos. South 303
CON 8062: Oracle NoSQL Database: A Practical Introduction | Oct 1 / 10:15 AM / Mos. South 305
CON 8082: REST and Oracle NoSQL Database | Oct 1 / 2:00 PM / Mos. South 305
HOL9327: Oracle NoSQL Database for Administrators | Oct 1 / 2:45 PM and Oct 2 / 2:30 PM / Hotel Nikko - Peninsula