1
Pipe Engineering
2
IdoFriedman.yml
Name: Ido Friedman
Past: [SQL Server consultant, Instructor, Team Leader]
Present: [Data engineer, Architect]
Technologies: [Elasticsearch, CouchBase, MongoDB, Python, SQL …]
WorkPlace: Perion
3
Lambda Architecture
Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods.
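To make the definition above concrete, here is a minimal in-memory sketch of the three usual Lambda layers (batch, speed, serving). The event shape, the page names, and the function names are all hypothetical and chosen only for illustration; a real deployment would use a batch engine and a stream processor rather than Python counters.

```python
from collections import Counter


def batch_view(events):
    """Batch layer: recompute page counts from the full historical dataset."""
    return Counter(e["page"] for e in events)


class SpeedLayer:
    """Speed layer: incrementally count events that arrived after the last batch run."""

    def __init__(self):
        self.view = Counter()

    def on_event(self, event):
        self.view[event["page"]] += 1


def query(batch, realtime, page):
    """Serving layer: merge the batch view and the real-time view at query time."""
    return batch[page] + realtime[page]


# Hypothetical data: events already covered by the batch run, and newer ones.
historical = [{"page": "home"}, {"page": "home"}, {"page": "pricing"}]
recent = [{"page": "home"}, {"page": "docs"}]

batch = batch_view(historical)
speed = SpeedLayer()
for event in recent:
    speed.on_event(event)

home_total = query(batch, speed.view, "home")
docs_total = query(batch, speed.view, "docs")
```

The point of the split is that the batch view can be slow but complete, while the speed layer only has to cover the gap since the last batch run.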
4
Lambda Example
5
Lambda Example
6
Data processing
Batch
Micro batch
Streaming
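The three processing styles listed above differ mainly in how much data is handled per step. The following sketch, which assumes a toy stream of integers and a hypothetical `micro_batches` helper, computes the same sum in each style.

```python
from itertools import islice


def micro_batches(stream, size):
    """Yield consecutive lists of at most `size` items from `stream`."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


events = range(7)  # hypothetical stand-in for incoming records

# Batch: the whole dataset is processed in one pass.
batch_result = sum(events)

# Micro batch: small fixed-size chunks are processed one after another.
micro_results = [sum(chunk) for chunk in micro_batches(events, 3)]

# Streaming: state is updated record by record.
running = 0
stream_results = []
for value in events:
    running += value
    stream_results.append(running)
```

All three arrive at the same total; what changes is latency (how soon a partial result is available) and how much data must be held at once.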
7
Tools of the trade
Processing
Source/Targets
HDFS, SQL Server, Amazon S3, MongoDB, Azure EventHub, ELK Stack, Azure blob storage, AWS Kinesis
8
All very nice BUT…
Lots of systems = Lots of issues + Lots of data movements
Lots of knowledge required
9
ETL vs Streaming
ETL:
Batch processing
Known data window
Known data structure
Known and expected behavior patterns
Small number of data platforms in one process
Streaming:
Row by row (not always)
Shifting data window
Supports many data structures in many cases
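The contrast between a known data window and a shifting data window can be sketched as follows. The arrival timestamps (in seconds), the 60-second width, and the class and function names are assumptions made for illustration, and the sliding counter assumes events arrive in timestamp order.

```python
from collections import deque


def batch_count(timestamps, start, end):
    """ETL style: count events inside a window whose bounds are known up front."""
    return sum(1 for ts in timestamps if start <= ts < end)


class SlidingWindowCounter:
    """Streaming style: count events in the last `width` seconds, window moves with each event."""

    def __init__(self, width):
        self.width = width
        self.timestamps = deque()

    def add(self, ts):
        self.timestamps.append(ts)
        # Evict events that are no longer inside the window (ts - width, ts].
        while self.timestamps[0] <= ts - self.width:
            self.timestamps.popleft()
        return len(self.timestamps)


arrivals = [0, 10, 20, 65, 70, 130]  # hypothetical event times in seconds

# Known window: the job is told exactly which minute to process.
first_minute = batch_count(arrivals, 0, 60)
second_minute = batch_count(arrivals, 60, 120)

# Shifting window: the count is re-evaluated as every event arrives.
counter = SlidingWindowCounter(60)
sliding = [counter.add(ts) for ts in arrivals]
```

In the batch case the window bounds are an input; in the streaming case they are derived from the data itself, which is why late or out-of-order events need extra handling that this sketch leaves out.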
10
StreamSets http://www.streamsets.com
Performance Management for Data Flows
Not an ETL tool
Open Source
Many connectors and integrations
Integration with Kafka / Hadoop in cluster mode
No coding is required, but it is fully supported
VERY simple deployment
11
Piping Challenges
Data Enrichment
Windowing
Performance
Expected results
Data visibility
Flexibility
Scaling
Monitoring
12
StreamSets SDC
13
SDC Cluster and Streaming mode
SDC runs as an application within Spark Streaming
SDC runs as an application on top of MapReduce
14
What are we doing with SDC
Error and application log analysis (ELK = SDC+K)
Clickstream
Ad hoc needs
15
What are we connecting
AWS Kinesis
SQL Server
Amazon S3
JDBC Consumer
ElasticSearch
RabbitMQ
File Tail
Redis
16
Some numbers
100M+ events per day, including record-level operations
On a 2 CPU, 8 GB machine (CentOS 7)
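As a rough back-of-the-envelope check of the figure above, the daily volume can be converted to an average per-second rate. This takes 100 million as a lower bound and assumes traffic is spread evenly over the day, which real workloads rarely are, so peak rates would be higher.

```python
events_per_day = 100_000_000      # lower bound taken from "100M+"
seconds_per_day = 24 * 60 * 60    # 86,400

# Average sustained rate, assuming a perfectly even load.
avg_events_per_second = events_per_day / seconds_per_day

# Naive split across the 2 CPUs mentioned on the slide.
avg_per_cpu = avg_events_per_second / 2

print(f"{avg_events_per_second:.0f} events/s on average, "
      f"about {avg_per_cpu:.0f} per CPU")
```

That works out to a little under 1,200 events per second on average for the whole machine.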
18
Recommended reading