Introducing Spring XD
Mark Pollack, Sr. Software Engineer, Pivotal
© 2014 Pivotal

Spring XD: XD = eXtreme Data

Spring XD: “One stop shop for developing and deploying Big Data Applications”

What is a Big Data Application?

Big Data Architecture
[Diagram labels: Files, Social, Sensors, Mobile; Spring XD (Ingest, Stream Processing, Analytics, Workflow Orchestration, Export, XD> shell); Master Dataset; Predictive Modeling; Batch Views; Realtime Views; three Spring Boot applications]

Lambda Architecture
[Diagram labels: Files, Social, Sensors, Mobile; Spring XD (Ingest, Stream Processing, Analytics, Workflow Orchestration, Export, XD> shell); Master Dataset; Predictive Modeling; Batch Views; Realtime Views; three Spring Boot applications; grouped into Speed Layer, Batch Layer, and Serving Layer]

[Diagram labels: Files, Social, Sensors, Mobile; Spring XD (Ingest, Stream Processing, Analytics, Workflow Orchestration, Export, XD> shell); Master Dataset; Predictive Modeling; Batch Views; Realtime Views; three Spring Boot applications; Speed Layer, Batch Layer, and Serving Layer; GemFire XD shown in two places]

Spring IO Platform

Spring XD: 10,000 ft view
[Diagram labels: Files, Sensors, Social, Mobile; Spring XD]

Streams
Sources: HTTP, Tail, File, Mail, Twitter, Gemfire, Syslog, TCP, UDP, JMS, RabbitMQ, MQTT, Trigger, Reactor TCP/UDP
Processors: Filter, Transformer, Object-to-JSON, JSON-to-Tuple, Splitter, Aggregator, HTTP Client, Groovy Scripts, Java Code, JPMML Evaluator
Sinks: File, HDFS, JDBC, TCP, Log, Mail, RabbitMQ, Gemfire, Splunk, MQTT, Dynamic Router, Counters

Streams
How can we make this easier?
http | filter | file
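
The pipe notation above reads like a Unix pipeline: the first name is where data comes from, the last is where it goes, and anything in between transforms it. A minimal Python sketch of that idea follows. It is not Spring XD code: `parse_stream`, `run_stream`, and the three registries are hypothetical names, and the modules are in-memory stand-ins for the real http, filter, and file modules.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def parse_stream(definition: str) -> List[str]:
    """Split a pipe-delimited stream definition into module names."""
    names = [part.strip() for part in definition.split("|")]
    if len(names) < 2 or any(not name for name in names):
        raise ValueError(f"invalid stream definition: {definition!r}")
    return names


def run_stream(
    definition: str,
    sources: Dict[str, Callable[[], Iterator[str]]],
    processors: Dict[str, Callable[[Iterable[str]], Iterable[str]]],
    sinks: Dict[str, Callable[[Iterable[str]], None]],
) -> None:
    """Wire source -> processors -> sink in the order given by the definition."""
    names = parse_stream(definition)
    messages: Iterable[str] = sources[names[0]]()
    for name in names[1:-1]:
        messages = processors[name](messages)
    sinks[names[-1]](messages)


# Stand-in modules (assumptions for illustration only).
received: List[str] = []
sources = {"http": lambda: iter(["GET /a", "", "POST /b"])}
processors = {"filter": lambda msgs: (m for m in msgs if m)}  # drop empty payloads
sinks = {"file": received.extend}  # "writes" to a list instead of a file

run_stream("http | filter | file", sources, processors, sinks)
print(received)
```

The point of the sketch is only the shape of the definition: adding a second processor means adding one more name between the pipes, not writing new wiring code.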

Taps
• “Listen” to data on another stream
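
One way to picture a tap is as an extra listener attached to a stream: every message still reaches the stream's own sink, and the tap receives a copy. The sketch below assumes a hypothetical in-memory `Stream` class; it illustrates the concept only and does not reflect how Spring XD implements taps.

```python
from typing import Callable, List


class Stream:
    """Hypothetical stream: delivers each message to its sink, then to any taps."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self._sink = sink
        self._taps: List[Callable[[str], None]] = []

    def tap(self, listener: Callable[[str], None]) -> None:
        """Register a listener that observes messages without altering them."""
        self._taps.append(listener)

    def send(self, message: str) -> None:
        self._sink(message)
        for listener in self._taps:
            listener(message)


stored: List[str] = []       # what the main stream's sink writes
tap_seen: List[str] = []     # what the tap observes

main = Stream(sink=stored.append)
main.tap(tap_seen.append)

for msg in ["a", "b", "c"]:
    main.send(msg)

print(stored, tap_seen)
```

The main stream's output is the same with or without the tap, which is the property that makes taps useful for adding analytics to a stream that is already running.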

Analytics
• Counters and Gauges
  - Simple & Field Value Counter: How many tweets for #java?
  - Aggregate Counter: How many tweets for #java in the week/day/hour?
  - Gauge & Rich Gauge: How many requests per minute?
• Abstract API, implemented in-memory and in Redis
• Predictive Models: Is this transaction fraudulent?
• Based on JPMML Evaluator: wide range of model types
• Interoperable with R, Rattle, KNIME, RapidMiner
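
The three counter and gauge types listed above differ mainly in what they key on: a field's value, a time bucket, or a running measurement. The following in-memory Python sketch shows that distinction. The class names echo the slide, but the methods, the `hashtags` and `at` field names, and the sample tweets are all made up for illustration and are not the Spring XD analytics API.

```python
from collections import Counter
from datetime import datetime
from typing import Optional


class FieldValueCounter:
    """Counts occurrences of each value found in one field of a message."""

    def __init__(self, field: str) -> None:
        self.field = field
        self.counts: Counter = Counter()

    def process(self, message: dict) -> None:
        for value in message.get(self.field, []):
            self.counts[value] += 1


class AggregateCounter:
    """Counts events per hour and per day so totals can be read at either grain."""

    def __init__(self) -> None:
        self.by_hour: Counter = Counter()
        self.by_day: Counter = Counter()

    def increment(self, when: datetime) -> None:
        self.by_hour[when.strftime("%Y-%m-%d %H")] += 1
        self.by_day[when.strftime("%Y-%m-%d")] += 1


class RichGauge:
    """Keeps the latest value plus running count, mean, min, and max."""

    def __init__(self) -> None:
        self.value: Optional[float] = None
        self.count = 0
        self.total = 0.0
        self.min: Optional[float] = None
        self.max: Optional[float] = None

    def set(self, value: float) -> None:
        self.value = value
        self.count += 1
        self.total += value
        self.min = value if self.min is None else min(self.min, value)
        self.max = value if self.max is None else max(self.max, value)

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


# Made-up sample data.
tweets = [
    {"hashtags": ["java", "spring"], "at": datetime(2014, 5, 1, 10, 15)},
    {"hashtags": ["java"], "at": datetime(2014, 5, 1, 10, 45)},
    {"hashtags": ["scala"], "at": datetime(2014, 5, 1, 11, 5)},
]

hashtag_counter = FieldValueCounter("hashtags")
java_over_time = AggregateCounter()
for tweet in tweets:
    hashtag_counter.process(tweet)
    if "java" in tweet["hashtags"]:
        java_over_time.increment(tweet["at"])

requests_per_minute = RichGauge()
for sample in [10.0, 30.0, 20.0]:
    requests_per_minute.set(sample)

print(hashtag_counter.counts["java"], java_over_time.by_hour, requests_per_minute.mean)
```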

Jobs: CSV to JDBC, FTP to HDFS, JDBC to HDFS, HDFS to JDBC, HDFS to MongoDB

Spring XD Runtime
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin (leader); XD Admin; XD Container; Data Transport; ZooKeeper (Container State)]

Spring XD Runtime
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin (leader); XD Admin; XD Container with a Spring App Context for M1; Data Transport; ZooKeeper (Container State)]

Spring XD Runtime
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin (leader); XD Admin; XD Container with a Spring App Context for M1, and M2; Data Transport; ZooKeeper (Container State)]
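
The request in the diagrams, an HTTP POST to /streams/aStream carrying the definition “M1 | M2”, ends with modules M1 and M2 running in a container. A rough Python sketch of that placement step is below. The slides do not say how modules are assigned to containers, so the round-robin strategy, the `deploy_stream` function, and the container names are assumptions, and the returned dict is only a stand-in for state that the diagram shows being kept in ZooKeeper.

```python
from itertools import cycle
from typing import Dict, List


def deploy_stream(name: str, definition: str, containers: List[str]) -> Dict[str, List[str]]:
    """Assign each module of a stream definition to a container, round-robin.

    Returns a mapping of container name -> deployed module names.
    """
    if not containers:
        raise RuntimeError("no containers available")
    modules = [m.strip() for m in definition.split("|")]
    placement: Dict[str, List[str]] = {c: [] for c in containers}
    for module, container in zip(modules, cycle(containers)):
        placement[container].append(f"{name}.{module}")
    return placement


# One container, as in the diagram: both modules land in it.
single = deploy_stream("aStream", "M1 | M2", ["container-1"])

# Two containers (hypothetical): the modules are spread out.
spread = deploy_stream("aStream", "M1 | M2", ["container-1", "container-2"])

print(single)
print(spread)
```

When modules of one stream sit in different containers, messages between them have to cross the data transport shown in the diagram, which is why the transport is drawn as a separate component.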

Predictive Models

Concepts
• Model: a parameterized algorithm
• Model Building: derive a parameterized algorithm from the data. Slow process; done offline, as a batch process, due to the amount of data involved.
• Model Scoring: use the model to predict new information. Fast process; can be done as part of stream processing.
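
The split between building and scoring can be shown with the simplest possible model, a straight line fitted by least squares. In the Python sketch below, `build_model` stands for the slow offline step that reads the whole data set, and `score` for the cheap per-message step that could run inside a stream. The function names and the sample numbers are illustrative assumptions, not part of Spring XD.

```python
from typing import Dict, Iterable, Iterator, List


def build_model(xs: List[float], ys: List[float]) -> Dict[str, float]:
    """Offline step: derive the parameters (slope, intercept) from historical data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def score(model: Dict[str, float], x: float) -> float:
    """Online step: apply the stored parameters to one new value."""
    return model["intercept"] + model["slope"] * x


def score_stream(model: Dict[str, float], messages: Iterable[float]) -> Iterator[float]:
    """Score messages one at a time, as a stream processor would."""
    for x in messages:
        yield score(model, x)


# Batch: fit once on historical data (made-up numbers lying on y = 2x + 1).
model = build_model([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Stream: score each incoming value with the already-built model.
predictions = list(score_stream(model, [10.0, 0.0]))
print(model, predictions)
```

Building touches every historical row, while scoring is a single multiply and add per message, which is the cost difference the slide describes.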

PMML
• Predictive Model Markup Language
• XML interchange format for analytical models
• From the Data Mining Group, http://www.dmg.org
• Processing + models
• Supported by statistics and data mining tools: R/Rattle, SAS Enterprise Miner, SPSS, Weka
• Java Evaluator API: the JPMML-Evaluator project provides model scoring
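
Because PMML is plain XML, a model exported by one tool can be read and scored by another. The Python sketch below reads a deliberately cut-down regression model and scores one record. The document is a simplification: it keeps the `RegressionModel`, `RegressionTable`, and `NumericPredictor` elements but omits the XML namespace and the header, data dictionary, and mining schema that a valid PMML file carries. The field names and coefficients are invented, and this hand-rolled reader is in no way a substitute for JPMML-Evaluator.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Simplified, PMML-like document (assumption: namespace and required
# sections omitted for brevity; values are made up).
PMML_DOC = """
<PMML version="4.2">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.0">
      <NumericPredictor name="amount" coefficient="0.5"/>
      <NumericPredictor name="age" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

Model = Tuple[float, Dict[str, float]]


def load_regression(pmml_text: str) -> Model:
    """Extract the intercept and per-field coefficients from the document."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("./RegressionModel/RegressionTable")
    if table is None:
        raise ValueError("no RegressionTable found")
    intercept = float(table.get("intercept", "0"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("NumericPredictor")
    }
    return intercept, coefficients


def score(model: Model, record: Dict[str, float]) -> float:
    """Linear score: intercept plus coefficient times field value, summed."""
    intercept, coefficients = model
    return intercept + sum(coef * record[name] for name, coef in coefficients.items())


model = load_regression(PMML_DOC)
result = score(model, {"amount": 10.0, "age": 3.0})
print(result)
```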

Distributed, Fault Tolerant Runtime

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin (leader); XD Admin; XD Container with a Spring App Context for M1, and M2; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin (leader); XD Admin; XD Container; M2; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M1 | M2”; XD Admin (leader); XD Admin; XD Container; M2, M1; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; XD Admin (leader); XD Container; M2, M1; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; XD Admin (leader); two XD Containers; M2, M1; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; XD Admin (leader); XD Container; M2, M1; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; XD Admin (leader); XD Admin; XD Container; M2, M1; Data Transport; ZooKeeper (Container State)]

Spring XD – Runtime – Fault Tolerance
[Diagram labels: XD Shell; HTTP POST /streams/aStream “M3 | M4”; XD Admin (leader); XD Admin; three XD Containers; M3, M4, M2, M1; Data Transport; ZooKeeper (Container State)]
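
The fault-tolerance diagrams show modules M1 and M2 still present after the set of containers changes. One plausible reading is that modules from a lost container are redeployed onto the containers that remain. The Python sketch below models only that step. The `handle_container_loss` function, the least-loaded placement rule, and the container names are assumptions; the slides do not describe the actual algorithm, and the dict merely stands in for the container state that the diagram places in ZooKeeper.

```python
from typing import Dict, List


def handle_container_loss(placement: Dict[str, List[str]], lost: str) -> Dict[str, List[str]]:
    """Return a new placement with the lost container's modules moved to survivors.

    Each orphaned module goes to whichever surviving container currently
    runs the fewest modules.
    """
    updated = {c: list(mods) for c, mods in placement.items() if c != lost}
    orphaned = placement.get(lost, [])
    if orphaned and not updated:
        raise RuntimeError("no surviving containers to take over modules")
    for module in orphaned:
        target = min(updated, key=lambda c: len(updated[c]))
        updated[target].append(module)
    return updated


# Two containers, one module each; container-1 disappears.
before = {"container-1": ["M1"], "container-2": ["M2"]}
after = handle_container_loss(before, "container-1")

# With an idle third container, the orphaned module goes to the idle one.
before_three = {"container-1": ["M1"], "container-2": ["M2"], "container-3": []}
after_three = handle_container_loss(before_three, "container-1")

print(after)
print(after_three)
```

The function returns a new mapping rather than mutating its input, so the placement before the failure stays available for comparison.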

