GridGain In-Memory Data Fabric:

Presentation transcript:

GridGain In-Memory Data Fabric: Ultimate Speed and Scale for Transactions and Analytics
Dmitriy Setrakyan, Founder & EVP Engineering
@dsetrakyan www.gridgain.com #gridgain

Agenda
- Evolution of In-Memory Computing
- GridGain In-Memory Data Fabric
- Distributed Cluster & Compute
  - Coding Example
- Distributed Data Grid
  - Coding Examples
- Distributed Streaming & CEP
- Plug-n-Play Hadoop Accelerator

What is In-Memory Computing?
- High Performance & Low Latencies
- Faster than Disk and Flash
- Cost Effective
- Distributed or Not
- Caching, Streaming, Computations
- Data Querying – SQL or Unstructured
- Volatile and Persistent
- OLAP and OLTP Use Cases

Notes: Data centers are moving to all-RAM designs. Cost: $30/MB vs. 1 cent per MB. Over a period of three years RAM costs less than disk: less cooling, less space, less electricity. RAM is 5,000 to 1 million times faster than disk.

Evolution of In-Memory Computing
[Diagram. Fabric components: Clustering & Compute Grid, Data Grid, Streaming, Hadoop Acceleration. Product categories shown: Database IM options, Hadoop accelerators, Streaming, In-Memory Data Grids, IMDBs, BI accelerators, Distributed Caching, Caching.]

Existing Market is Fragmented

Company | Product | Proprietary/Open Source | Characterization
Oracle | In-Memory Option for Oracle Database | Proprietary | Cost Option
Oracle | Times Ten | | Point Solution - IMDB
Oracle | Coherence | | Point Solution - IMDG
SAP | Hana | | Point Solution - IMDB
Microsoft | SQL Server 2014 | | Feature Upgrade
DataBricks | Apache Spark | | Point Solution - Hadoop
VoltDB | | | Point Solution - IMDB
Aerospike | | | Point Solution - NoSQL DB
IBM | DB2 with BLU Acceleration | |
Software AG | Terracotta | | Point Solution - IMDG
Hazelcast | | |

Notes: Let me tell you about the market that we simplified. In that world, having complexity and disparity is not strategic.

GridGain In-Memory Data Fabric: Strategic Approach to IMC
- Supports all Apps
- Open Source – Apache 2.0
- Apache Project - Ignite
- Simple Java APIs
- 1 JAR Dependency
- High Performance & Scale
- Automatic Fault Tolerance
- Management/Monitoring
- Runs on Commodity Hardware
Components: Streaming, Data Grid, Hadoop Acceleration, Clustering & Compute Grid
- Supports existing & new data sources
- No need to rip & replace
© 2014 GridGain Systems, Inc.

Clustering & Compute
- Direct API for MapReduce
- Direct API for Fork/Join
- Zero Deployment
- Cron-like Task Scheduling
- State Checkpoints
- Early and Late Load Balancing
- Automatic Failover
- Full Cluster Management
- Pluggable SPI Design
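
The MapReduce and Fork/Join items above follow a split, execute, reduce pattern. The sketch below illustrates that pattern with plain JDK classes only: worker threads stand in for cluster nodes, and the class and method names are hypothetical. It is not the GridGain or Ignite API, only a local approximation of the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Local stand-in for a compute grid: pool threads play the role of nodes.
// All names are hypothetical; this is not the GridGain API.
class SplitReduceSketch {

    // Split: one job per word. Map: each job returns its word's length.
    // Reduce: sum the partial results on the caller.
    static int totalLength(String phrase, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<CompletableFuture<Integer>> partials = new ArrayList<>();
            for (String word : phrase.trim().split("\\s+")) {
                partials.add(CompletableFuture.supplyAsync(word::length, pool));
            }
            int sum = 0;
            for (CompletableFuture<Integer> partial : partials) {
                sum += partial.join(); // wait for each job and fold it into the total
            }
            return sum;
        } finally {
            pool.shutdown(); // release the worker threads
        }
    }

    public static void main(String[] args) {
        // prints 32
        System.out.println(totalLength("Count characters using a compute grid", 4));
    }
}
```

In a real grid the jobs would be serialized and shipped to remote nodes, and features from the list such as failover and load balancing would decide where each job runs.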

Automatic Cluster Discovery

Closure Execution

Closure Execution

In-Memory Caching and Data Grid
- Distributed In-Memory Key-Value Store
- Replicated and Partitioned
- TBs of data, of any type
- On-Heap and Off-Heap Storage
- Backup Replicas / Automatic Failover
- Distributed ACID Transactions
- SQL queries and JDBC driver
- Collocation of Compute and Data
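
To make the "Partitioned" and "Backup Replicas / Automatic Failover" items concrete, here is a minimal single-process sketch. It assumes a simple hash-modulo placement with one backup on the next node; that placement rule, and every name in the code, is an illustrative assumption rather than how GridGain actually assigns partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy partitioned key-value store with one backup copy per key.
// Placement (hash modulo, backup on the next node) is an assumption for illustration.
class PartitionedCacheSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final boolean[] alive;

    PartitionedCacheSketch(int nodeCount) {
        if (nodeCount < 2) {
            throw new IllegalArgumentException("need at least 2 nodes to hold a backup");
        }
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
        alive = new boolean[nodeCount];
        Arrays.fill(alive, true);
    }

    int primaryFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    int backupFor(String key) {
        return (primaryFor(key) + 1) % nodes.size();
    }

    // Writes go to the primary and the backup, skipping nodes that are down.
    void put(String key, String value) {
        for (int node : new int[] {primaryFor(key), backupFor(key)}) {
            if (alive[node]) {
                nodes.get(node).put(key, value);
            }
        }
    }

    // Reads prefer the primary and fail over to the backup.
    String get(String key) {
        int primary = primaryFor(key);
        int owner = alive[primary] ? primary : backupFor(key);
        return alive[owner] ? nodes.get(owner).get(key) : null;
    }

    // Simulates a node crash: its in-memory data is lost.
    void failNode(int node) {
        alive[node] = false;
        nodes.get(node).clear();
    }

    public static void main(String[] args) {
        PartitionedCacheSketch cache = new PartitionedCacheSketch(3);
        cache.put("user:1", "Ada");
        cache.failNode(cache.primaryFor("user:1"));
        System.out.println(cache.get("user:1")); // prints Ada, served by the backup
    }
}
```

A replicated cache would instead copy every entry to every node, trading memory for read locality.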

Cache Operations

Cache Transaction

Distributed Data Structures
- Distributed Map (cache)
- Distributed Set
- Distributed Queue
- CountDownLatch
- AtomicLong
- AtomicSequence
- AtomicReference
- Distributed ExecutorService

Client-Server vs Affinity Colocation
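
The contrast named in this slide title can be sketched by counting how many values cross the network in each approach. The code below is a local simulation under stated assumptions: each inner list represents the data held by one node, "network transfer" is just a counter, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Compares data movement for a sum computed two ways.
// Each inner list is the data owned by one simulated node.
class ColocationSketch {

    // Client-server: every value is shipped to the client, which sums them.
    // Returns {sum, valuesSentOverNetwork}.
    static long[] clientServerSum(List<List<Integer>> nodes) {
        long sum = 0;
        long transferred = 0;
        for (List<Integer> nodeData : nodes) {
            for (int value : nodeData) {
                transferred++; // one value travels to the client
                sum += value;
            }
        }
        return new long[] {sum, transferred};
    }

    // Colocated: each node sums its own data; only the partial sum travels.
    // Returns {sum, valuesSentOverNetwork}.
    static long[] colocatedSum(List<List<Integer>> nodes) {
        long sum = 0;
        long transferred = 0;
        for (List<Integer> nodeData : nodes) {
            long partial = 0;
            for (int value : nodeData) {
                partial += value; // computed where the data lives
            }
            transferred++; // one partial result travels back
            sum += partial;
        }
        return new long[] {sum, transferred};
    }

    public static void main(String[] args) {
        List<List<Integer>> nodes = Arrays.asList(
            Arrays.asList(1, 2, 3), Arrays.asList(4, 5), Arrays.asList(6));
        System.out.println(Arrays.toString(clientServerSum(nodes))); // [21, 6]
        System.out.println(Arrays.toString(colocatedSum(nodes)));    // [21, 3]
    }
}
```

Both approaches return the same sum; the colocated version moves one value per node instead of one value per entry, which is the saving that grows with data size.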

In-Memory Streaming & CEP
- Streaming Data Never Ends
- Branching Pipelines
- CEP
- Sliding Windows
- Real Time Indexing
- Real Time Querying
- At Least Once Guarantee
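
Because streaming data never ends, queries run over a bounded window of recent events. The following is a minimal sketch of a count-based sliding window that keeps a running average; the class name and the choice of a count-based (rather than time-based) window are assumptions for illustration, not the GridGain streaming API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Count-based sliding window over an unbounded stream of values.
// Hypothetical illustration; not the GridGain streaming API.
class SlidingWindowSketch {
    private final int capacity;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum;

    SlidingWindowSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds an event, evicts the oldest one once the window is full,
    // and returns the average of the events currently in the window.
    double add(double value) {
        window.addLast(value);
        sum += value;
        if (window.size() > capacity) {
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        SlidingWindowSketch lastThree = new SlidingWindowSketch(3);
        lastThree.add(1);
        lastThree.add(2);
        System.out.println(lastThree.add(3)); // 2.0 (window: 1, 2, 3)
        System.out.println(lastThree.add(4)); // 3.0 (window: 2, 3, 4)
    }
}
```

Memory use stays constant no matter how long the stream runs, since each new event pushes the oldest one out.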

Plug-n-Play Hadoop Accelerator
- Up to 100x Acceleration
- In-Memory Native MapReduce
  - In-Process Data Colocation
  - Eager Push Scheduling
- GGFS In-Memory File System
  - Pure In-Memory
  - Write-Through to HDFS
  - Read-Through from HDFS
  - Sync and Async Persistence
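
The write-through and read-through behavior listed for the in-memory file system can be sketched with two maps: one for the memory layer and one standing in for HDFS. This is a simplified, synchronous illustration with hypothetical names; it does not use GGFS or Hadoop classes.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory layer in front of a slower backing store.
// The backing map stands in for HDFS; all names are hypothetical.
class ReadWriteThroughSketch {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> backingStore;
    int storeReads; // how many reads had to go to the backing store

    ReadWriteThroughSketch(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    // Read-through: on a miss, load from the backing store and keep a copy in memory.
    String read(String path) {
        String data = memory.get(path);
        if (data == null) {
            storeReads++;
            data = backingStore.get(path);
            if (data != null) {
                memory.put(path, data);
            }
        }
        return data;
    }

    // Write-through (synchronous): update memory and the backing store together.
    void write(String path, String data) {
        memory.put(path, data);
        backingStore.put(path, data);
    }

    public static void main(String[] args) {
        Map<String, String> hdfs = new HashMap<>();
        hdfs.put("/logs/day1", "existing data");
        ReadWriteThroughSketch fs = new ReadWriteThroughSketch(hdfs);
        fs.read("/logs/day1");
        fs.read("/logs/day1");
        System.out.println(fs.storeReads); // 1: the second read is served from memory
        fs.write("/logs/day2", "new data");
        System.out.println(hdfs.get("/logs/day2")); // new data
    }
}
```

An asynchronous variant would queue the backing-store write and return immediately, which lowers write latency at the cost of a window in which the two layers differ.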

In-Memory Native MapReduce
- Zero Code Change
- Use existing MR code
- Use existing Hive queries
- No Name Node
- No Network Noise
- In-Process Data Colocation
- Eager Push Scheduling

DevOps Management and Monitoring - In-Memory

Thank You www.gridgain.com #gridgain @dsetrakyan