Presentation transcript:

Apache Flink. Stephan Ewen, Flink committer, co-founder / CTO @ data Artisans (@StephanEwen)

Looking back one year

April 16, 2014

Stratosphere stack: Pact API (Java), DataSet API (Scala), Stratosphere Optimizer, Stratosphere Runtime, local and remote execution. Iterations, YARN support, local execution, accumulators, web frontend, HBase, JDBC, Windows compatibility, Maven Central. Batch processing on a pipelining engine, with iterations …

Looking at now…

What is Apache Flink? Real-time data streams (master): low latency, windowing, aggregations, ... ETL, graphs, machine learning, relational, ... Inputs: event logs (Kafka, RabbitMQ, ...) and historic data (HDFS, JDBC, ...).

What is Apache Flink? APIs: DataSet (Java/Scala), DataStream (Java/Scala), Hadoop M/R, Python. Libraries: Gelly, Table, ML, SAMOA dataflow. Connectors: HDFS, HCatalog, HBase, JDBC, Kafka, RabbitMQ, Flume. Flink Optimizer and Stream Builder sit on top of the Flink Dataflow Runtime. Deployment: Local, Remote, YARN, Tez, Embedded.

Batch / Streaming APIs

case class Word(word: String, frequency: Int)

DataSet API (batch):

val lines: DataSet[String] = env.readTextFile(...)
lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .groupBy("word").sum("frequency")
  .print()

DataStream API (streaming):

val lines: DataStream[String] = env.fromSocketStream(...)
lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
  .window(Count.of(1000)).every(Count.of(100))
  .groupBy("word").sum("frequency")
  .print()
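For reference, a fully self-contained version of the batch word count above might look like this (a sketch: the input path is a placeholder, and whether print() triggers execution on its own depends on the Flink version):

import org.apache.flink.api.scala._

object WordCount {
  case class Word(word: String, frequency: Int)

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Placeholder input path
    val lines: DataSet[String] = env.readTextFile("hdfs:///path/to/text")

    lines.flatMap { line => line.split(" ").map(word => Word(word, 1)) }
      .groupBy("word")
      .sum("frequency")
      .print() // in recent Flink versions print() executes the program eagerly
  }
}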

Technology inside Flink

case class Path(from: Long, to: Long)
val tc = edges.iterate(10) { paths: DataSet[Path] =>
  val next = paths
    .join(edges)
    .where("to")
    .equalTo("from") { (path, edge) => Path(path.from, edge.to) }
    .union(paths)
    .distinct()
  next
}

Pre-flight (client): type extraction stack, dataflow graph, cost-based optimizer, program deployment. Master: deploys operators, task scheduling, recovery metadata, tracks intermediate results. Workers: memory manager, out-of-core algorithms, batch & streaming, state & checkpoints. [Example plan in the slide: DataSource orders.tbl / lineitem.tbl, Filter, Map, Hybrid Hash Join (build HT / probe, hash-part [0]), GroupRed (sort), forward.]

Flink by Feature / Use Case

Data Streaming Analysis

Life of data streams Create: create streams from event sources (machines, databases, logs, sensors, …) Collect: collect and make streams available for consumption (e.g., Apache Kafka) Process: process streams, possibly generating derived streams (e.g., Apache Flink)
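As a rough sketch of the collect-and-process part, a Flink streaming job can consume a Kafka topic roughly like this (connector class names, packages, and constructor arguments have changed across Flink versions, so treat the details as assumptions; the topic name and broker address are placeholders):

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

object KafkaIngest {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092") // placeholder broker
    props.setProperty("group.id", "flink-demo")

    // "events" is a placeholder topic name
    val stream: DataStream[String] =
      env.addSource(new FlinkKafkaConsumer[String]("events", new SimpleStringSchema(), props))

    stream.print()
    env.execute("Kafka ingestion")
  }
}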

Stream Analysis in Flink More at: http://flink.apache.org/news/2015/02/09/streaming-example.html

Defining windows in Flink
- Trigger policy: when to trigger the computation on the current window
- Eviction policy: when data points should leave the window; defines the window width/size
- E.g., a count-based policy: evict when #elements > n, start a new window every n-th element
- Built-in: Count, Time, Delta policies
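Put together with the DataStream snippet from earlier, a count-based window of 1000 elements that is re-evaluated every 100 elements looks roughly like this (a sketch against the pre-1.0 windowing API used in these slides; the socket source is a placeholder):

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.helper.Count

object WindowedWordCount {
  case class Word(word: String, frequency: Int)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val lines: DataStream[String] = env.socketTextStream("localhost", 9999) // placeholder source

    lines.flatMap { line => line.split(" ").map(w => Word(w, 1)) }
      .window(Count.of(1000))  // eviction policy: keep the last 1000 elements
      .every(Count.of(100))    // trigger policy: evaluate every 100 elements
      .groupBy("word").sum("frequency")
      .print()

    env.execute("Windowed word count")
  }
}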

Checkpointing / Recovery
- Flink acknowledges batches of records: less overhead in the failure-free case
- Currently tied to fault-tolerant data sources (e.g., Kafka)
- Flink operators can keep state, and that state is checkpointed
- Checkpointing and record acks go together: exactly-once semantics for state
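Enabling this is a single call on the streaming environment (a minimal sketch; the interval value is an arbitrary choice):

import org.apache.flink.streaming.api.scala._

object CheckpointedJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Draw a consistent snapshot of all operator state every 5 seconds.
    // On failure, Flink restores the latest completed snapshot and the
    // source (e.g., Kafka offsets) is replayed from that point.
    env.enableCheckpointing(5000)

    // ... define sources, transformations, and sinks here ...

    env.execute("Job with checkpointing enabled")
  }
}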

Checkpointing / Recovery
A Chandy-Lamport-style algorithm for consistent, asynchronous distributed snapshots: checkpoint barriers are pushed through the data flow. Records before a barrier are part of the snapshot (backed up until the next snapshot); records after the barrier are not. [Slide figure: a data stream annotated with "operator checkpoint starting", "checkpoint in progress", and "checkpoint done" as the barrier travels through.]
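Conceptually, an operator aligns barriers from all of its inputs, snapshots its state, and then forwards the barrier downstream. The following is a simplified illustration of that idea, not Flink's actual internals (the real implementation also buffers records from inputs whose barrier has already arrived, to keep the snapshot consistent):

// Conceptual sketch only; these are not Flink classes.
sealed trait StreamElement
case class Record(payload: Any) extends StreamElement
case class Barrier(checkpointId: Long) extends StreamElement

class BarrierAligningTask(numInputs: Int, snapshotState: Long => Unit) {
  private var barriersSeen = 0

  def onElement(elem: StreamElement, emit: StreamElement => Unit): Unit = elem match {
    case r: Record =>
      // Records arriving before the barrier belong to the snapshot in progress.
      emit(r)
    case b: Barrier =>
      barriersSeen += 1
      if (barriersSeen == numInputs) {
        // All inputs have reached the barrier: snapshot state, then forward it.
        snapshotState(b.checkpointId)
        emit(b)
        barriersSeen = 0
      }
      // Otherwise wait (align) until the barrier has arrived on every input.
  }
}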

Heavy ETL Pipelines

Heavy Data Pipelines: complex ETL programs. (Apology: the graph had to be blurred for the online slides, due to confidentiality.)

Memory Management
Flink contains its own memory management stack. Memory is allocated, de-allocated, and used strictly through an internal buffer-pool implementation. To do that, Flink contains its own type extraction and serialization components.

public class WC {
  public String word;
  public int count;
}

Managed memory is a pool of memory pages used for sorting, hashing, caching, shuffling, and broadcasts; user code objects live in the unmanaged part of the heap.
More at: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=53741525
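The core idea can be illustrated with a small sketch (purely conceptual; this is not Flink's MemorySegment or MemoryManager API): a fixed number of equally sized pages is allocated once, handed out to operators, and returned when no longer needed, so the heap footprint stays bounded and predictable.

import scala.collection.mutable

// Conceptual sketch of a page pool: fixed-size byte arrays allocated up front,
// borrowed by operators (sorting, hashing, shuffling) and returned when done.
class PagePool(numPages: Int, pageSize: Int = 32 * 1024) {
  private val free = mutable.Queue.fill(numPages)(new Array[Byte](pageSize))

  def request(): Option[Array[Byte]] =
    if (free.nonEmpty) Some(free.dequeue())
    else None // no OutOfMemoryError: the caller spills to disk instead

  def release(page: Array[Byte]): Unit = free.enqueue(page)
}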

Smooth out-of-core performance
[Chart: single-core join of 1 KB Java objects beyond memory (4 GB); blue bars are in-memory, orange bars (partially) out-of-core.]
More at: http://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html

Benefits of managed memory: more reliable and stable performance (fewer GC effects, easy to go to disk)

Table API

val customers = env.readCsvFile(…).as('id, 'mktSegment)
  .filter( 'mktSegment === "AUTOMOBILE" )

val orders = env.readCsvFile(…)
  .filter( o => dateFormat.parse(o.orderDate).before(date) )
  .as('orderId, 'custId, 'orderDate, 'shipPrio)

val items = orders
  .join(customers).where('custId === 'id)
  .join(lineitems).where('orderId === 'id)
  .select('orderId, 'orderDate, 'shipPrio,
          'extdPrice * (Literal(1.0f) - 'discount) as 'revenue)

val result = items
  .groupBy('orderId, 'orderDate, 'shipPrio)
  .select('orderId, 'revenue.sum, 'orderDate, 'shipPrio)

Iterations in Data Flows: Machine Learning Algorithms

Iterate by looping
[Figure: the client drives a sequence of steps.] A for/while loop in the client submits one job per iteration step; data reuse relies on caching in memory and/or on disk.
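Roughly, the driver-side pattern looks like this (a conceptual sketch; runStep stands in for whatever job the driver submits each round):

// Conceptual sketch of client-driven iteration: the driver submits one job per step
// and feeds each result into the next step, relying on caching for data reuse.
object LoopingClient {
  def runStep(input: Seq[Double]): Seq[Double] =
    input.map(_ * 0.9) // hypothetical per-iteration computation

  def main(args: Array[String]): Unit = {
    var current: Seq[Double] = Seq(1.0, 2.0, 3.0)
    for (_ <- 1 to 10) {
      current = runStep(current) // each call corresponds to one submitted job
    }
    println(current)
  }
}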

Iterate in the Dataflow
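With Flink's native (bulk) iterations, the loop lives inside the dataflow, so the whole iteration runs as a single job. A minimal sketch with the DataSet API (the data and step function are illustrative):

import org.apache.flink.api.scala._

object NativeIteration {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val initial: DataSet[Double] = env.fromElements(1.0, 2.0, 3.0)

    // The step function is applied 10 times inside one dataflow,
    // without submitting a new job per round.
    val result = initial.iterate(10) { current =>
      current.map(_ * 0.9) // hypothetical per-iteration computation
    }

    result.print()
  }
}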

Large-Scale Machine Learning Factorizing a matrix with 28 billion ratings for recommendations (Scale of Netflix or Spotify) More at: http://data-artisans.com/computing-recommendations-with-flink.html

State in Iterations: Graphs and Machine Learning

Iterate natively with deltas
[Figure: a delta iteration keeps a solution set and a workset; each step consumes the current workset, produces a delta that is merged into the solution set, and emits the next workset; the final solution set is the result and can be combined with other datasets.]
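In code, this corresponds to the delta-iteration construct of the DataSet API; a rough sketch (the key field, data, and step logic are illustrative):

import org.apache.flink.api.scala._

object DeltaIterationExample {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // (id, value) pairs; the solution set is keyed on field 0.
    val initialSolution: DataSet[(Long, Long)] = env.fromElements((1L, 10L), (2L, 20L))
    val initialWorkset: DataSet[(Long, Long)] = initialSolution

    val result = initialSolution.iterateDelta(initialWorkset, 10, Array(0)) {
      (solution, workset) =>
        // Hypothetical step: derive updated elements from the current workset ...
        val delta = workset.map(x => (x._1, x._2 + 1))
        // ... the delta is merged into the solution set and becomes the next workset.
        (delta, delta)
    }

    result.print()
  }
}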

Effect of delta iterations…
[Chart: number of elements updated per iteration.]

… very fast graph analysis Performance competitive with dedicated graph analysis systems … and mix and match ETL-style and graph analysis in one program More at: http://data-artisans.com/data-analysis-with-flink.html

Closing

Flink Roadmap for 2015
- Out-of-core state in streaming
- Monitoring and scaling for streaming
- Streaming machine learning with SAMOA
- More additions to the libraries: batch machine learning, graph library additions (more algorithms)
- SQL on top of the expression language
- Master failover

Flink community
[Chart: number of unique contributor IDs by git commits.] dev list: 300-400 messages/month (record: 1000 messages).

flink.apache.org @ApacheFlink

Backup

Cornerstones of Flink Design
- Flexible data streaming engine: low-latency stream processing, highly flexible windows
- Robust algorithms on managed memory: no OutOfMemoryErrors, scales to very large JVMs, efficient and robust processing
- High-level APIs, beyond key/value pairs: Java / Scala / Python (upcoming), relational-style optimizer
- Pipelined execution of batch programs: better shuffle performance, scales to very large groups
- Native iterations: very fast graph processing, stateful iterations for ML
- Active library development: graphs / machine learning, streaming ML (coming)

Program optimization

A simple program

val orders = …
val lineitems = …

val filteredOrders = orders
  .filter(o => dateFormat.parse(o.shipDate).after(date))
  .filter(o => o.shipPrio > 2)

val lineitemsOfOrders = filteredOrders
  .join(lineitems)
  .where("orderId").equalTo("orderId")
  .apply((o, l) => new SelectedItem(o.orderDate, l.extdPrice))

val priceSums = lineitemsOfOrders
  .groupBy("orderDate").sum("extdPrice")

Two execution plans
[Figure: two alternative plans for the same program. Both read orders.tbl (Filter) and lineitem.tbl (Map) and join them with a hybrid hash join (build HT / probe) feeding a GroupRed with sort; one plan broadcasts one input, the other hash-partitions both inputs on [0] and uses a combiner with hash-part [0,1] before the final reduce.] The best plan depends on the relative sizes of the input files.

Examples of optimization
- Task chaining: coalesce map/filter/etc. tasks
- Join optimizations: broadcast vs. partition, build vs. probe side, hash vs. sort-merge
- Interesting properties: re-use partitioning and sorting for later operations
- Automatic caching, e.g., for iterations
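Some of these decisions can also be influenced explicitly through join hints on the DataSet API (a sketch; the datasets are placeholders):

import org.apache.flink.api.scala._
import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint

object JoinHintExample {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val small: DataSet[(Int, String)] = env.fromElements((1, "a"), (2, "b"))
    val large: DataSet[(Int, Double)] = env.fromElements((1, 1.5), (2, 2.5))

    // Hint the optimizer to broadcast the first (small) input and build the
    // hash table on it, instead of re-partitioning both sides.
    val joined = small.join(large, JoinHint.BROADCAST_HASH_FIRST)
      .where(0).equalTo(0)

    joined.print()
  }
}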

Visualization

Visualization tools
