Apache Ignite Data Grid Research Corey Pentasuglia.

What is Apache Ignite?
- An In-Memory Data Fabric
- An open source Apache project that began in the Apache Incubator
- Started, and still mostly maintained, by a company named GridGain
- Contains several key components for high performance computing within a distributed architecture

Data Grid
- Designed with scalability in mind
- Data locality is a priority
- Data can be accessed as key-value pairs, much like a HashMap; the grid also supports standard SQL, including distributed SQL joins
- Implements the JCache specification (JSR 107)

Data Grid (Key Benefits)
- Allows for quick access to large amounts of data
  - With the popularity of "Big Data" comes the necessity to quickly access that data
- Values are cached in memory
  - When data is accessed on networked machines, the disk no longer needs to be accessed
  - The cache that's stored in memory can be accessed much faster
- The cache is always kept in sync with the data within the DB
- The Ignite documentation suggests creating indexes to make queries even faster. In fact, Ignite even has an implementation for "In Memory Indexing"
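To make the indexing point concrete, here is a minimal sketch in plain Java collections. It is not Ignite code, and it does not claim to show how Ignite builds its indexes; the class name, the id/age data, and the method names are all hypothetical. It only contrasts a full scan over in-memory values with a lookup through a sorted in-memory index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: plain Java collections, not Ignite code.
// All names here (InMemoryIndexSketch, ageById, idsByAge) are hypothetical.
class InMemoryIndexSketch {
    // Primary in-memory storage: person id -> age.
    private final Map<Integer, Integer> ageById = new HashMap<>();
    // Secondary index: age -> ids with that age, kept sorted by age.
    private final TreeMap<Integer, List<Integer>> idsByAge = new TreeMap<>();

    void put(int id, int age) {
        Integer oldAge = ageById.put(id, age);
        if (oldAge != null) {
            // Drop the stale index entry before adding the new one.
            List<Integer> ids = idsByAge.get(oldAge);
            ids.remove(Integer.valueOf(id));
            if (ids.isEmpty()) {
                idsByAge.remove(oldAge);
            }
        }
        idsByAge.computeIfAbsent(age, a -> new ArrayList<>()).add(id);
    }

    // Without an index: every entry has to be inspected.
    List<Integer> scanOlderThan(int minAge) {
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : ageById.entrySet()) {
            if (e.getValue() > minAge) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // With the index: only the matching tail of the sorted map is visited.
    List<Integer> indexedOlderThan(int minAge) {
        List<Integer> result = new ArrayList<>();
        for (List<Integer> ids : idsByAge.tailMap(minAge, false).values()) {
            result.addAll(ids);
        }
        return result;
    }

    public static void main(String[] args) {
        InMemoryIndexSketch people = new InMemoryIndexSketch();
        people.put(1, 30);
        people.put(2, 40);
        people.put(3, 25);
        System.out.println(people.scanOlderThan(28));
        System.out.println(people.indexedOlderThan(28));
    }
}
```

Both methods return the same ids; the indexed version simply avoids touching entries that cannot match, which is where the speedup on large data sets would come from.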

Replicated Vs. Partitioned Cache
Replicated Cache
- Creates a copy on every node of the cluster
- Slow updates
- Impacts performance and scalability
- High availability
Partitioned Cache
- Distributed among nodes
- Faster updates
- Very scalable and high performance
- High availability as the number of nodes increases
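The trade-off between the two modes can be sketched with plain Java maps standing in for cluster nodes. This is a simplified illustration under stated assumptions, not Ignite's implementation: the class and method names are hypothetical, the owner of a key is chosen with a simple hash modulo the node count (a real grid uses a more elaborate affinity function), and backups are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: each "node" is a local map, not a real cluster member.
// All names here are hypothetical and backups are not modeled.
class CacheModeSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private final boolean replicated;

    CacheModeSketch(int nodeCount, boolean replicated) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
        this.replicated = replicated;
    }

    // Assumption: owner = hash modulo node count (simplified key-to-node mapping).
    int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    // Returns how many nodes had to be written for this update.
    int put(String key, String value) {
        if (replicated) {
            // Replicated: every node stores a copy, so every update touches all nodes.
            for (Map<String, String> node : nodes) {
                node.put(key, value);
            }
            return nodes.size();
        }
        // Partitioned: only the owning node stores the entry.
        nodes.get(ownerOf(key)).put(key, value);
        return 1;
    }

    String get(String key) {
        // Replicated: any node can answer (node 0 here). Partitioned: ask the owner.
        Map<String, String> node = replicated ? nodes.get(0) : nodes.get(ownerOf(key));
        return node.get(key);
    }

    // Counts how many nodes currently hold the key.
    int copiesOf(String key) {
        int copies = 0;
        for (Map<String, String> node : nodes) {
            if (node.containsKey(key)) {
                copies++;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        CacheModeSketch replicatedCache = new CacheModeSketch(4, true);
        CacheModeSketch partitionedCache = new CacheModeSketch(4, false);
        System.out.println("replicated writes: " + replicatedCache.put("k", "v"));
        System.out.println("partitioned writes: " + partitionedCache.put("k", "v"));
    }
}
```

In the sketch, an update in replicated mode costs one write per node, while a partitioned update costs one write in total. That is the reason the replicated cache is listed with slow updates and the partitioned cache with faster ones.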

Project (Two Parts)
- Utilize four of the Linux lab machines to test the distributed setup and caching
  - A Java application is written and bundled to run on these machines, which allows a distributed cache to be modified
- Perform a proof of concept on my own machine to test out the distributed cache and automatic persistence
  - Run Ignite with a distributed cache that syncs with a MySQL DB running on my laptop

Project (Development Setup)
- Apache Ignite is fully "Mavenized", meaning all the dependency management and building can be configured in Apache Maven
- Installed on my laptop:
  - Apache Maven 3.0.5
  - NetBeans IDE 8.1
  - Apache Ignite 1.4.0
  - MySQL Community Edition 5.7.9
- Java applications created to run on the school machines will be bundled with their dependencies so they can run independently using the java -jar command (also known as a Super Jar)

Project (Lab Machines)
- The lab machines selected can be seen below
- Demonstration:
  - While the plain Ignite installation can be started and utilized, I have created custom JAR files that contain my code
  - These JARs can be run on any machine that has Java installed
  - I will start my application on Spunky and add to the distributed cache
  - I will then start my app on Goliath and query the cache
  - This will demonstrate the caching portion of the Data Grid

Project (Laptop)
- This portion of the project was performed on my laptop
- Automatic Persistence
  - Utilized the bundled Ignite tool to generate Java and XML code
  - Wrote initialization code
  - Created another Java project to interact with the DB through the cache (using generated code)
- Created an interface to perform SQL on the distributed DB
- Created an interface to perform lookups on the cache
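The idea of interacting with the DB through the cache can be sketched as a read-through and write-through wrapper. This is a hedged illustration in plain Java, not the code generated by the Ignite tool and not the project's actual code: the class name is hypothetical, and an ordinary in-memory map stands in for the MySQL table.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a Map stands in for the MySQL table.
// WriteThroughCacheSketch and its members are hypothetical names.
class WriteThroughCacheSketch {
    private final Map<Integer, String> database;                   // stand-in for the DB
    private final Map<Integer, String> memory = new HashMap<>();   // the in-memory cache
    private int databaseReads = 0;

    WriteThroughCacheSketch(Map<Integer, String> database) {
        this.database = database;
    }

    // Write-through: the update goes to the DB and to memory in the same call,
    // so the two never disagree about this key.
    void put(int key, String value) {
        database.put(key, value);
        memory.put(key, value);
    }

    // Read-through: serve from memory when possible, load from the DB on a miss.
    String get(int key) {
        String value = memory.get(key);
        if (value == null) {
            databaseReads++;
            value = database.get(key);
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new HashMap<>();
        table.put(1, "alice");
        WriteThroughCacheSketch cache = new WriteThroughCacheSketch(table);
        System.out.println(cache.get(1));
        System.out.println(cache.get(1));
        System.out.println("DB reads: " + cache.databaseReads());
    }
}
```

The second lookup of the same key is answered from memory, so the stand-in DB is read only once. Writes made through the wrapper reach the stand-in DB immediately, which is the behavior the automatic persistence setup is meant to provide.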

Community

Citation https://ignite.apache.org/ (Entire website, documentation, images, and linked videos)