Pipe Engineering.




1 Pipe Engineering

2 IdoFriedman.yml
Name: Ido Friedman
Past: [SQL Server consultant, Instructor, Team Leader]
Present: [Data engineer, Architect]
Technologies: [Elasticsearch, Couchbase, MongoDB, Python, SQL …]
WorkPlace: Perion

3 Lambda Architecture Lambda architecture is a data-processing architecture designed to handle massive quantities of data by taking advantage of both batch- and stream-processing methods.
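The definition above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the page-view events, the function names, and the use of in-memory counters are all assumptions chosen to show how a batch view and a real-time view are merged at query time.

```python
from collections import Counter

def batch_view(master_dataset):
    """Batch layer: recompute counts from the full, immutable dataset."""
    return Counter(event["page"] for event in master_dataset)

def update_speed_view(speed_view, event):
    """Speed layer: incrementally fold in one event the batch run has not seen yet."""
    speed_view[event["page"]] += 1

def query(batch, speed, page):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[page] + speed[page]

# Hypothetical data: events already covered by the last batch run...
master_dataset = [{"page": "/home"}, {"page": "/home"}, {"page": "/about"}]
# ...and events that arrived after it.
recent_events = [{"page": "/home"}, {"page": "/pricing"}]

batch = batch_view(master_dataset)
speed = Counter()
for event in recent_events:
    update_speed_view(speed, event)

home_total = query(batch, speed, "/home")
pricing_total = query(batch, speed, "/pricing")
print(home_total, pricing_total)
```

In a real deployment the batch view would be rebuilt periodically and the speed view discarded for the range the batch run now covers; the sketch leaves that hand-off out.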

4 Lambda Example

5 Lambda Example

6 Data processing
Batch
Micro batch
Streaming
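As a rough illustration of how the three modes differ, the sketch below runs the same sum over the same input in each style. The function names and the sample values are made up for this example; real engines add scheduling, state, and fault tolerance on top.

```python
from itertools import islice

def process_batch(records):
    """Batch: the whole bounded dataset is processed in one pass."""
    return sum(records)

def process_micro_batches(records, size):
    """Micro batch: the input is cut into small chunks, each handled as a mini batch."""
    it = iter(records)
    totals = []
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            break
        totals.append(sum(chunk))
    return totals

def process_stream(records):
    """Streaming: every record updates the result as soon as it arrives."""
    running = 0
    for value in records:
        running += value
        yield running

values = [1, 2, 3, 4, 5]
batch_total = process_batch(values)
micro_totals = process_micro_batches(values, 2)
stream_totals = list(process_stream(values))
print(batch_total, micro_totals, stream_totals)
```

The batch call yields one answer at the end, the micro-batch call yields one answer per chunk, and the streaming generator yields an updated answer after every record.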

7 Tools of the trade
Processing
Source/Targets: HDFS, SQL Server, Amazon S3, MongoDB, Azure Event Hubs, ELK Stack, Azure Blob Storage, AWS Kinesis

8 All very nice BUT…
Lots of systems = Lots of issues + Lots of data movement
Lots of knowledge required

9 ETL vs Streaming
ETL: Batch processing; Known data window; Known data structure; Known and expected behavior patterns; Small number of data platforms in one process
Streaming: Row by row (not always); Shifting data window; Supports many data structures, in many cases
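The contrast between a known data window and a shifting one can be sketched as follows. This is an assumption-laden toy, not anything shown in the talk: timestamps are plain integers in seconds, they are assumed to arrive in order, and both function names are hypothetical.

```python
from collections import deque

def fixed_window_counts(timestamps, window_seconds):
    """ETL-style: bucket events into known, non-overlapping windows."""
    counts = {}
    for ts in timestamps:
        start = ts - (ts % window_seconds)  # start of the bucket this event falls in
        counts[start] = counts.get(start, 0) + 1
    return counts

def sliding_window_counts(timestamps, window_seconds):
    """Streaming-style: after each event, count the events in the trailing window."""
    window = deque()
    result = []
    for ts in timestamps:  # assumes timestamps arrive in ascending order
        window.append(ts)
        # Evict events that have fallen out of the trailing window.
        while window[0] <= ts - window_seconds:
            window.popleft()
        result.append(len(window))
    return result

timestamps = [0, 10, 25, 61, 65, 130]
fixed = fixed_window_counts(timestamps, 60)
sliding = sliding_window_counts(timestamps, 60)
print(fixed)
print(sliding)
```

The fixed version can only report once a bucket is complete, while the sliding version reports after every event, which is why the window boundary keeps moving.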

10 StreamSets http://www.streamsets.com
Performance Management for Data Flows
Not an ETL tool
Open source
Many connectors and integrations
Integration with Kafka / Hadoop in cluster mode
No coding is required, but it is fully supported
Very simple deployment

11 Piping Challenges
Data enrichment
Windowing
Performance
Expected results
Data visibility
Flexibility
Scaling
Monitoring

12 StreamSets SDC

13 SDC Cluster and Streaming mode
SDC runs as an application within Spark Streaming
SDC runs as an application on top of MapReduce

14 What are we doing with SDC
Error and application log analysis (ELK = SDC+K)
Clickstream
Ad hoc needs

15 What are we connecting
Amazon S3, JDBC Consumer, Elasticsearch, RabbitMQ, AWS Kinesis, SQL Server, File Tail, Redis

16 Some numbers
100M+ events per day, including record-level operations
On a 2-CPU, 8 GB machine (CentOS 7)
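To put the daily figure in per-second terms, here is a back-of-the-envelope calculation. It treats 100 million as a lower bound and assumes events are spread evenly over the day, which real traffic rarely is, so peak rates would be higher than this average.

```python
EVENTS_PER_DAY = 100_000_000   # lower bound taken from the slide ("100M+")
SECONDS_PER_DAY = 24 * 60 * 60
CPU_COUNT = 2                  # machine size taken from the slide

# Average rate, assuming a perfectly even distribution over the day.
avg_events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
avg_events_per_second_per_cpu = avg_events_per_second / CPU_COUNT

print(f"{avg_events_per_second:.0f} events/s on average")
print(f"{avg_events_per_second_per_cpu:.0f} events/s per CPU on average")
```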

17

18 Recommended reading

