Ignite in Sberbank: In-Memory Data Fabric for Financial Services

Slides:



Advertisements
Similar presentations
From Startup to Enterprise A Story of MySQL Evolution Vidur Apparao, CTO Stephen OSullivan, Manager of Data and Grid Technologies April 2009.
Advertisements

A Ridiculously Easy & Seriously Powerful SQL Cloud Database Itamar Haber AVP Ops & Solutions.
Cloud Computing: Theirs, Mine and Ours Belinda G. Watkins, VP EIS - Network Computing FedEx Services March 11, 2011.
HadoopDB Inneke Ponet.  Introduction  Technologies for data analysis  HadoopDB  Desired properties  Layers of HadoopDB  HadoopDB Components.
Real-Time Big Data Use Cases John Leach CTO, Splice Machine.
Spark: Cluster Computing with Working Sets
Discretized Streams: Fault-Tolerant Streaming Computation at Scale Wenting Wang 1.
Paula Ta-Shma, IBM Haifa Research 1 “Advanced Topics on Storage Systems” - Spring 2013, Tel-Aviv University Big Data and.
Business Continuity and DR, A Practical Implementation Mich Talebzadeh, Consultant, Deutsche Bank
Meanwhile RAM cost continues to drop Moore’s Law on total CPU processing power holds but in parallel processing… CPU clock rate stalled… Because.
--What is a Database--1 What is a database What is a Database.
© 2011 Citrusleaf. All rights reserved.1 A Real-Time NoSQL DB That Preserves ACID Citrusleaf Srini V. Srinivasan Brian Bulkowski VLDB, 09/01/11.
GridGain In-Memory Data Fabric:
Microsoft SQL Server x 46% 900+ For Hosting Service Providers
Mihai Pintea. 2 Agenda Hadoop and MongoDB DataDirect driver What is Big Data.
© 2014 ScaleArc. All Rights Reserved. 1 Creating an Agile Data Environment for Apps in the Cloud Summer 2014.
NoSQL and NewSQL Justin DeBrabant CIS Advanced Systems - Fall 2013.
Next Generation of Apache Hadoop MapReduce Arun C. Murthy - Hortonworks Founder and Architect Formerly Architect, MapReduce.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Word Wide Cache Distributed Caching for the Distributed Enterprise.
Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.
Database System Concepts and Architecture Lecture # 3 22 June 2012 National University of Computer and Emerging Sciences.
BI Technical Infrastructure Approach
Keith Bodell, Account Executive - Federal Matt Campbell, Systems Engineer - Federal NetezzaTwinFin.
Chapter 2 Database System Architecture. An “architecture” for a database system. A specification of how it will work, what it will “look like.” The “ANSI/SPARC”
Goodbye rows and tables, hello documents and collections.
Contents HADOOP INTRODUCTION AND CONCEPTUAL OVERVIEW TERMINOLOGY QUICK TOUR OF CLOUDERA MANAGER.
Hive Facebook 2009.
IMDGs An essential part of your architecture. About me
Introduction to Hadoop Programming Bryon Gill, Pittsburgh Supercomputing Center.
SQL Server 2014: Overview Phil ssistalk.com.
An Introduction to HDInsight June 27 th,
Introduction to Hbase. Agenda  What is Hbase  About RDBMS  Overview of Hbase  Why Hbase instead of RDBMS  Architecture of Hbase  Hbase interface.
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Matthew Winter and Ned Shawa
Copyright © 2006, GemStone Systems Inc. All Rights Reserved. Increasing computation throughput with Grid Data Caching Jags Ramnarayan Chief Architect GemStone.
Infrastructure for Data Warehouses. Basics Of Data Access Data Store Machine Memory Buffer Memory Cache Data Store Buffer Bus Structure.
Tool Integration with Data and Computation Grid “Grid Wizard 2”
Microsoft Azure and DataStax: Start Anywhere and Scale to Any Size in the Cloud, On- Premises, or Both with a Leading Distributed Database MICROSOFT AZURE.
Scalable data access with Impala Zbigniew Baranowski Maciej Grzybek Daniel Lanza Garcia Kacper Surdy.
Mick Badran Using Microsoft Service Fabric to build your next Solution with zero downtime – Lvl 300 CLD32 5.
1 HBASE – THE SCALABLE DATA STORE An Introduction to HBase XLDB Europe Workshop 2013: CERN, Geneva James Kinley EMEA Solutions Architect, Cloudera.
Next Generation of Apache Hadoop MapReduce Owen
Centre de Calcul de l’Institut National de Physique Nucléaire et de Physique des Particules Apache Spark Osman AIDEL.
SQL Basics Review Reviewing what we’ve learned so far…….
1 Case Study: Business Intelligence & Customer Data Customer Support Web-based Dashboard VP Marketing SQL XSLT XML Data Grid Customer Data Customer Order.
BIG DATA/ Hadoop Interview Questions.
Abstract MarkLogic Database – Only Enterprise NoSQL DB Aashi Rastogi, Sanket V. Patel Department of Computer Science University of Bridgeport, Bridgeport,
Microsoft Ignite /28/2017 6:07 PM
Apache Ignite Compute Grid Research Corey Pentasuglia.
Apache Ignite Data Grid Research Corey Pentasuglia.
Table General Guidelines for Better System Performance
Data Platform and Analytics Foundational Training
Machine Learning Library for Apache Ignite
Introduction to Distributed Platforms
Open Source distributed document DB for an enterprise
Improving App Availability and Performance in the Cloud
Operational & Analytical Database
In-Memory Performance
Introduction to Spark.
Designed for Big Data Visual Analytics, Zoomdata Allows Business Users to Quickly Connect, Stream, and Visualize Data in the Microsoft Azure Platform MICROSOFT.
20 Questions with Azure SQL Data Warehouse
Ch 4. The Evolution of Analytic Scalability
Table General Guidelines for Better System Performance
Overview of big data tools
ANTOL, HOLIČ, KEDA PA195 SPRING 2016
SQL Server 2019 Bringing Apache Spark to SQL Server
SQL Server 2016 High Performance Database Offer.
Pig Hive HBase Zookeeper
Presentation transcript:

Ignite in Sberbank: In-Memory Data Fabric for Financial Services Dmitriy Setrakyan Committer & PMC, Apache Ignite Founder and Chief Product Officer, GridGain www.gridgain.com #gridgain or #dsetrakyan

Agenda Ignite In-Memory Data Fabric Data Consistency Query Capabilities Compute Capabilities Aggressive SLAs Financial Services Use Cases Sberbank Use Case Big Data & Fast Data Disk & Memory Converged Data Platform Q & A

Apache Ignite: We Are Hiring! Very Active Community Great Way to Learn Distributed Computing How To Contribute: https://ignite.apache.org/community/contr ibute.html#contribute https://cwiki.apache.org/confluence/displa y/IGNITE/How+to+Contribute Direct API for Fork/Join

In-Memory Data Fabric: Strategic Approach to IMC Supports Applications of various types and languages Open Source – Apache 2.0 Simple Java APIs 1 JAR Dependency High Performance & Scale Automatic Fault Tolerance Management/Monitoring Runs on Commodity Hardware Supports existing & new data sources No need to rip & replace

In-Memory Data Fabric: More Than Data Grid

Use Case: Financial Services High Performance and Throughput Low Latencies Multiple Ways to Talk to Data Strong Data Consistency ACID Transactions Collocation of Compute & Data Strong Querying Capabilities Aggressive SLAs

Apache Ignite Data Grid Need to Move Money on top of Data Grid Fault Tolerance and Scale ACID Transactions ANSI-99 SQL Querying Integration with RDBMS No Need to Replace the Database Read-Through and Write-Through Write-From DB Deadlock Management Deadlock-Free Transactions Deadlock Detection

Data Grid: Distributed Caching Partitioned Cache Replicated Cache

Data Grid: Query Capabilities ANSI-99 SQL ODBC & JDBC Drivers Always Consistent Fault Tolerant In-Memory Indexes (On-Heap and Off-Heap) Automatic Group By, Aggregations, Sorting Cross-Cache Joins, Unions, etc. Ad-Hoc SQL Support Collocated and Non-Collocated Joins

SQL Cross-Cache GROUP BY Example

Compute Grid: Collocate Compute & Data Direct API for ForkJoin Distributed Closures Execute Logic on the Cluster Zero Deployment Custom Cluster Groups Load Balancing Automatic Failover Direct API for Fork/Join

Apache Ignite for BI and Analytics

Share RDDs Across Spark Jobs IgniteRDD Deployment Modes Share RDD across tasks on the host Share RDD across tasks in the application Share RDD globally Embedded vs External Deployments Faster SQL In-Memory Indexes SQL on top of Shared RDD Direct API for Fork/Join

Ignite In-Memory File System Ignite In-Memory File System (IGFS) Hadoop-compliant Easy to Install On-Heap and Off-Heap Caching Layer for HDFS Write-through and Read-through HDFS Performance Boost Direct API for Fork/Join

Sberbank Use Case

Use Case: Jointly Developing Sberbank Requirements Why Apache Ignite Largest bank in Russia and Eastern Europe, and the third largest in Europe Sberbank Requirements Migrate to data grid architecture Move to open source Scale & Collocation Distributed SQL & ACID Transactions Multiple Ways to Talk to Data Less 5 min data center restarts Why Apache Ignite More than a Data Grid Best performance 10+ competitors evaluated Fault tolerance & scalability ANSI-99 SQL Support Transactional consistency Jointly Developing Disk-Only Data Sets Query Disk & Memory Together 130 Million Customers Deposit Withdrawal Statement Disk Store 1000+ Servers GridGain Security

Fast Data vs Big Data Fast Data OLTP mostly Smaller Operational Data Set High Throughput (ops/sec) Low Latencies Consistent or Transactional Big Data OLAP mostly Larger Historical Data Set Read-Mostly Throughput Not Important Low Query Latencies Good-enough for interactive analytics Direct API for Fork/Join

How To Bridge The Gap? Fast Data Add Disk-Only Data Sets Add Disk-Only Processing Big Data Add Shared Memory Store Add Transactions Direct API for Fork/Join

What is a Converged Data Platform? Fast Data + Big Data Distributed and Scalable Real Time Data-To-Action Hybrid Transactional and Analytical Processing (HTAP) Combine RAM and SSD/Flash No ETL Fast Data in Memory Big Data on Disk Query Historical and Analytical Data Transactions on Historical and Analytical Data Direct API for Fork/Join

Lambda Architecture: Fast Data + Big Data Converged Data Platform Lambda Architecture Transactions + Analytics (HTAP) Distributed and Scalable Fast Data in Memory Big Data On Disk Query Historical and Analytical Data Transactions on Historical and Analytical Data Aggressive SLAs Full Cluster Restart in Under 5 mins Direct API for Fork/Join

Ignite 2.0 – Path to Converged Data Platform More SQL Non-Collocated Joins Data Modification Language (DML) Dada Definition Language (DDL) More Disk Disk-only Data Sets Query Across Disk & Memory Backups & Snapshots Instantaneous Restarts Direct API for Fork/Join

Thank you for joining us. Follow the conversation. ANY Questions? Thank you for joining us. Follow the conversation. ignite.apache.org www.gridgain.com #dsetrakyan #gridgain