Apache Ignite Compute Grid Research Corey Pentasuglia.


1 Apache Ignite Compute Grid Research Corey Pentasuglia

2 What is Apache Ignite? An in-memory data fabric. An open source Apache Incubator project. Started and still mostly maintained by a company named GridGain. Ignite contains several key components for high performance computing within a distributed architecture.

3 Compute Grid: Designed for high performance, low latency, and scalability. Availability is built in: jobs will execute as long as there is at least one node. Failover: a load balancer is included to reroute jobs that have failed.

4 Compute Grid (Key Benefits): Fault tolerance: if a node fails, its jobs are automatically transferred to another node (if available). Load balancing: work is automatically distributed among the available nodes so that they are used efficiently. Job scheduling: priority can be set for tasks that run on the grid; by default, however, tasks are worked off in random order. Direct MapReduce API.
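The fault-tolerance idea on this slide can be sketched in plain Java. This is only an illustration and not the Ignite API: the class `FailoverSketch`, the method `runWithFailover`, and the node names are hypothetical, and nodes are simulated by strings passed to the job.

```java
import java.util.List;
import java.util.function.Function;

// Plain-Java stand-in for failover; none of these names come from Ignite.
class FailoverSketch {

    // Tries the job on each node in order and returns the first successful result.
    static <R> R runWithFailover(List<String> nodes, Function<String, R> job) {
        RuntimeException last = null;
        for (String node : nodes) {
            try {
                return job.apply(node);   // the job succeeded on this node
            } catch (RuntimeException e) {
                last = e;                 // this node failed: move on to the next one
            }
        }
        throw new IllegalStateException("no node could run the job", last);
    }

    // Hypothetical job: fails on "node-1" and succeeds on every other node.
    static String demoJob(String node) {
        if (node.equals("node-1")) {
            throw new RuntimeException(node + " is down");
        }
        return "done on " + node;
    }
}
```

Calling `runWithFailover` with the nodes `node-1` and `node-2` and `demoJob` skips the failed first node and returns the result from the second one.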

5 Ignite vs. MPI
Apache Ignite Grid: Any node can be an orchestrator. Automatic network association. Highly portable. Really just requires Java to execute. Runs in a virtual environment (has improved).
MPI (Message Passing Interface): Beowulf clustering. Has a master node. Requires network configuration. Claims portability. May be subject to C libraries. No overhead of running virtualized.

6 Grid Configuration: The lab machines selected can be seen below. While the plain Ignite install can be started and used as is, I have created custom JAR files that contain my code. These JARs can be run on any machine that has Java installed. Machines that will not be the orchestrator can use the plain install of Apache Ignite. Code is delivered to remote nodes to be executed.

7 Closure and Runnable/Callable: Closure: essentially a lambda function, a block of code that encloses its body and any outside variables it uses. Ex. ignite.compute().broadcast(() -> System.out.println("Hello World!")); Runnable/Callable: built on either the Java Runnable or Callable interface (Ignite's IgniteRunnable and IgniteCallable extend them). A Runnable does not return a result; a Callable does. These can be defined to enclose your logic and simply passed to Ignite.
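The difference between the two interfaces can be seen with the standard Java types alone. The following is a minimal sketch, assuming a single-threaded `ExecutorService` as a stand-in for a remote node; `ClosureSketch` and `runBoth` are hypothetical names, and no Ignite classes are involved.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ClosureSketch {

    // Runs one Runnable and one Callable closure; returns the Callable's result.
    static int runBoth(String greeting) {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // stand-in for a node
        try {
            // Runnable closure: captures 'greeting' and returns nothing.
            Runnable hello = () -> System.out.println(greeting);
            pool.submit(hello).get();

            // Callable closure: captures 'greeting' and returns a value.
            Callable<Integer> length = () -> greeting.length();
            Future<Integer> result = pool.submit(length);
            return result.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Both lambdas enclose the outside variable `greeting`, which is what makes them closures; only the Callable hands a value back to the caller.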

8 Ignite MapReduce: Apache Ignite comes with a simplified in-memory MapReduce. Ignite is able to optimize the MapReduce paradigm by working with data in memory. Personally, I found most of the Ignite APIs really easy to work with and well developed. Configurable result policies: WAIT – waits for remaining jobs to complete; REDUCE – immediately moves to the reduce() method; FAILOVER – fails the job over to another node.
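How the three result policies steer a task can be sketched in plain Java. This is a simulation under stated assumptions, not Ignite's implementation: jobs run sequentially as `Supplier` objects, failover is modeled as a single retry, and the rule in `onResult` (stop once a running total reaches `limit`) is a hypothetical example. Only the policy names WAIT, REDUCE, and FAILOVER come from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

class ResultPolicySketch {
    enum Policy { WAIT, REDUCE, FAILOVER }

    // Decides what to do after one job finishes (hypothetical rule for this sketch).
    static Policy onResult(Integer value, RuntimeException error, int runningTotal, int limit) {
        if (error != null) return Policy.FAILOVER;                // job failed: run it again
        if (runningTotal + value >= limit) return Policy.REDUCE;  // enough data: reduce now
        return Policy.WAIT;                                       // keep collecting results
    }

    static int execute(List<Supplier<Integer>> jobs, int limit) {
        List<Integer> received = new ArrayList<>();
        int total = 0;
        for (Supplier<Integer> job : jobs) {
            Integer value = null;
            RuntimeException error = null;
            try { value = job.get(); } catch (RuntimeException e) { error = e; }

            Policy policy = onResult(value, error, total, limit);
            if (policy == Policy.FAILOVER) {
                value = job.get();   // single retry; a second failure propagates
            }
            received.add(value);
            total += value;
            if (policy == Policy.REDUCE) break;   // skip the remaining jobs
        }
        return reduce(received);
    }

    // Reduce step: sums whatever results were received.
    static int reduce(List<Integer> results) {
        int sum = 0;
        for (int r : results) sum += r;
        return sum;
    }

    // Hypothetical job that fails on its first call and succeeds afterwards.
    static Supplier<Integer> flaky(int value) {
        AtomicBoolean failedOnce = new AtomicBoolean(false);
        return () -> {
            if (failedOnce.compareAndSet(false, true)) {
                throw new RuntimeException("simulated node failure");
            }
            return value;
        };
    }

    // Four jobs returning 1, 2 (after one failure), 3, and 4.
    static int demo(int limit) {
        List<Supplier<Integer>> jobs = new ArrayList<>();
        jobs.add(() -> 1);
        jobs.add(flaky(2));
        jobs.add(() -> 3);
        jobs.add(() -> 4);
        return execute(jobs, limit);
    }
}
```

With a limit of 6 the third job triggers REDUCE and the fourth job never runs; with a large limit every job runs and the sketch behaves like WAIT throughout.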

9 Node Sharing: Similar to ordinary shared state between local threads. Keeps state on a given node.

10 Collocated Computing: Data locality: Ignite provides the ability to configure jobs to run on the nodes where their data is local. This reduces the need for network I/O. It uses a concept similar to affinity to identify the node to execute on.
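The affinity idea can be illustrated with a small plain-Java model. This sketch assumes a hash-modulo rule for assigning keys to nodes, which is a simplification chosen for illustration and not how Ignite computes affinity; `AffinitySketch`, `ownerOf`, and `affinityCall` are hypothetical names, and each node is modeled as a local `HashMap`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class AffinitySketch {
    // One local key-value store per simulated node.
    final List<Map<Integer, String>> nodes = new ArrayList<>();

    AffinitySketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Maps a key to the index of the node that owns it (hypothetical hash-mod rule).
    int ownerOf(int key) {
        return Math.floorMod(Integer.hashCode(key), nodes.size());
    }

    // Stores the value on the owning node only.
    void put(int key, String value) {
        nodes.get(ownerOf(key)).put(key, value);
    }

    // Runs the job against the owner's local store, so no data has to move.
    <R> R affinityCall(int key, Function<Map<Integer, String>, R> job) {
        return job.apply(nodes.get(ownerOf(key)));
    }
}
```

The same function that decides where a value is stored also decides where the job runs, which is why the job finds its data locally.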

11 Checkpointing: Apache Ignite also provides the ability to “checkpoint” the state of a running job. This protects against failures and provides the ability to restart failed nodes. ComputeTaskSession (interface) methods: loadCheckpoint(String), removeCheckpoint(String), saveCheckpoint(String, Object).
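The save, load, and remove cycle can be sketched with an in-memory map standing in for checkpoint storage. The method names mirror those listed on the slide, but this class is a hypothetical stand-in and not `ComputeTaskSession`; the job (summing 1 to n) and the `failAt` parameter exist only to simulate a crash.

```java
import java.util.HashMap;
import java.util.Map;

class CheckpointSketch {
    // In-memory stand-in for checkpoint storage.
    private final Map<String, Object> store = new HashMap<>();

    void saveCheckpoint(String key, Object state) { store.put(key, state); }
    Object loadCheckpoint(String key) { return store.get(key); }
    boolean removeCheckpoint(String key) { return store.remove(key) != null; }

    // Sums 1..n, saving progress after every step.
    // If failAt > 0, a failure is simulated when that step is reached.
    long sumTo(int n, int failAt) {
        Object saved = loadCheckpoint("sum");
        // State layout: {last completed step, partial sum}.
        long[] state = saved != null ? (long[]) saved : new long[] {0, 0};
        for (long i = state[0] + 1; i <= n; i++) {
            if (i == failAt) {
                throw new IllegalStateException("simulated failure at step " + i);
            }
            state = new long[] {i, state[1] + i};
            saveCheckpoint("sum", state);
        }
        removeCheckpoint("sum");   // the job finished, so the checkpoint is no longer needed
        return state[1];
    }
}
```

If the first call fails at step 6, a second call on the same object resumes from step 6 instead of starting over, because steps 1 to 5 were already saved.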

12 Example 1 (Hello World): Use four of the Linux lab machines to run Hello World in the Ignite Compute Grid. A Java application has been written to broadcast “Hello World” code to each of the nodes. With the following code, one could broadcast only to remote nodes: ClusterGroup rmts = ignite.cluster().forRemotes(); ignite.compute(rmts).broadcast(() -> System.out.println("Hello World!")); Notice the use of the cluster group: it is a way of defining the particular nodes to execute on.

13 Example 2 (Word Count): Use four of the Linux lab machines to run an application in the Ignite Compute Grid. A Java application has been written to broadcast a word counting closure to each of the nodes. Each node will receive a word to be counted. The results will be aggregated at the orchestrating node.
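The map and aggregate steps of this example can be sketched without a cluster. The code below is a plain-Java approximation, assuming that “counting” a word means returning its number of characters and that worker threads stand in for grid nodes; `WordCountSketch` and `countCharacters` are hypothetical names, and the author's actual application is not shown in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WordCountSketch {

    // Sends each word to a worker thread and sums the per-word character counts.
    static int countCharacters(String sentence, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            for (String word : sentence.trim().split("\\s+")) {
                Callable<Integer> job = () -> word.length();  // map step: one word per job
                parts.add(pool.submit(job));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get();   // aggregation happens on the calling side
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller plays the role of the orchestrating node: it hands out one word per job and adds up the results as they come back.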

14 Master's Project Work: Use four of the Linux lab machines to run an application in the Ignite Compute Grid. Researching distributed machine learning in preparation for doctoral work. Develop a distributed classification application to run in Apache Ignite. The application will take a dataset to be used as training data. Subsequent datasets can then be classified against the training set using the K-Nearest Neighbors algorithm. Results will be aggregated at the acting master node.
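The classification step named on this slide can be sketched on a single machine. This is a minimal K-Nearest Neighbors example, assuming numeric feature vectors, squared Euclidean distance, and majority voting; the class and method names are hypothetical, and the distribution of work across grid nodes described on the slide is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class KnnSketch {

    // One labeled training example.
    static final class Sample {
        final double[] features;
        final String label;

        Sample(double[] features, String label) {
            this.features = features;
            this.label = label;
        }
    }

    // Squared Euclidean distance; the square root is not needed for ranking.
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the majority label among the k training samples nearest to the query.
    static String classify(List<Sample> training, double[] query, int k) {
        List<Sample> sorted = new ArrayList<>(training);
        sorted.sort(Comparator.comparingDouble(s -> distance(s.features, query)));

        Map<String, Integer> votes = new HashMap<>();
        String best = null;
        int bestVotes = 0;
        for (Sample s : sorted.subList(0, Math.min(k, sorted.size()))) {
            int v = votes.merge(s.label, 1, Integer::sum);
            if (v > bestVotes) {   // ties go to the label that reached the count first
                bestVotes = v;
                best = s.label;
            }
        }
        return best;
    }
}
```

In a distributed version, each node could classify its own share of the query points against the training set and return the labels to be collected on the master node, though the slides do not say how the work is actually partitioned.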

15 Further Work: I’d like to explore more examples with the Apache Ignite Compute Grid. It would be interesting to compare latency against MPI. Working on my Master’s project using Apache Ignite.

16 Community

17 Citation: https://ignite.apache.org/ (entire website, documentation, images, and linked videos). GitHub - https://github.com/pentasc

