A Dataflow Programming Language and Its Compiler for Streaming Systems

Slides:



Advertisements
Similar presentations
Copyright 2001, ActiveState. XSLT and Scripting Languages or…XSLT: what is everyone so hot and bothered about?
Advertisements

School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
Presenter MaxAcademy Lecture Series – V1.0, September 2011 Stream Scheduling.
POLITECNICO DI MILANO Parallelism in wonderland: are you ready to see how deep the rabbit hole goes? ILP: VLIW Architectures Marco D. Santambrogio:
Bottleneck Elimination from Stream Graphs S. M. Farhad The University of Sydney Joint work with Yousun Ko Bernd Burgstaller Bernhard Scholz.
Optimus: A Dynamic Rewriting Framework for Data-Parallel Execution Plans Qifa Ke, Michael Isard, Yuan Yu Microsoft Research Silicon Valley EuroSys 2013.
11 1 Hierarchical Coarse-grained Stream Compilation for Software Defined Radio Yuan Lin, Manjunath Kudlur, Scott Mahlke, Trevor Mudge Advanced Computer.
CIS 101: Computer Programming and Problem Solving Lecture 8 Usman Roshan Department of Computer Science NJIT.
Phased Scheduling of Stream Programs Michal Karczmarek, William Thies and Saman Amarasinghe MIT LCS.
Multiprocessors Andreas Klappenecker CPSC321 Computer Architecture.
Programming Language Semantics Mooly SagivEran Yahav Schrirber 317Open space html://
Context-sensitive Analysis, II Ad-hoc syntax-directed translation, Symbol Tables, andTypes.
Chapter 13 Reduced Instruction Set Computers (RISC) Pipelining.
1 New Architectures Need New Languages A triumph of optimism over experience! Ian Watson 3 rd July 2009.
February 12, 2009 Center for Hybrid and Embedded Software Systems Model Transformation Using ERG Controller Thomas H. Feng.
CSE S. Tanimoto Syntax and Types 1 Representation, Syntax, Paradigms, Types Representation Formal Syntax Paradigms Data Types Type Inference.
CS294-6 Reconfigurable Computing Day 23 November 10, 1998 Stream Processing.
General Computer Science for Engineers CISC 106 Lecture 26 Dr. John Cavazos Computer and Information Sciences 04/24/2009.
ASP.NET Programming with C# and SQL Server First Edition
Seoul National University Memory Efficient Software Synthesis from Dataflow Graph Wonyong Sung, Junedong Kim, Soonhoi Ha Codesign and Parallel Processing.
Building An Interpreter After having done all of the analysis, it’s possible to run the program directly rather than compile it … and it may be worth it.
SS ZG653Second Semester, Topic Architectural Patterns Pipe and Filter.
SEC(R) 2008 Intel® Concurrent Collections for C++ - a model for parallel programming Nikolay Kurtov Software and Services.
Mapping Stream Programs onto Heterogeneous Multiprocessor Systems [by Barcelona Supercomputing Centre, Spain, Oct 09] S. M. Farhad Programming Language.
Topics in Software Dynamic White-box Testing Part 2: Data-flow Testing
CMPE 511 DATA FLOW MACHINES Mustafa Emre ERKOÇ 11/12/2003.
High Performance Architectures Dataflow Part 3. 2 Dataflow Processors Recall from Basic Processor Pipelining: Hazards limit performance  Structural hazards.
Department of Computer Science A Static Program Analyzer to increase software reuse Ramakrishnan Venkitaraman and Gopal Gupta.
Software Pipelining for Stream Programs on Resource Constrained Multi-core Architectures IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEM 2012 Authors:
Hinde Bouziane – CBHPC’08 – October 2008 Marco ALDINUCCI and Marco DANELUTTO UNIPI - University of Pisa (Italy) Hinde Lilia BOUZIANE and Christian.
Automated Design of Custom Architecture Tulika Mitra
1 Dryad Distributed Data-Parallel Programs from Sequential Building Blocks Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly of Microsoft.
Static Translation of Stream Programs S. M. Farhad School of Information Technology The University of Sydney.
StreamX10: A Stream Programming Framework on X10 Haitao Wei School of Computer Science at Huazhong University of Sci&Tech.
C++ History C++ was designed at AT&T Bell Labs by Bjarne Stroustrup in the early 80's Based on the ‘C’ programming language C++ language standardised in.
Pipes & Filters Architecture Pattern Source: Pattern-Oriented Software Architecture, Vol. 1, Buschmann, et al.
Programming for Beginners Martin Nelson Elizabeth FitzGerald Lecture 6: Object-Oriented Programming.
CS333 Intro to Operating Systems Jonathan Walpole.
CS 460/660 Compiler Construction. Class 01 2 Why Study Compilers? Compilers are important – –Responsible for many aspects of system performance Compilers.
CS536 Semantic Analysis Introduction with Emphasis on Name Analysis 1.
BEGINNING PROGRAMMING.  Literally – giving instructions to a computer so that it does what you want  Practically – using a programming language (such.
Copyright Curt Hill Variables What are they? Why do we need them?
ECEG-3202 Computer Architecture and Organization Chapter 7 Reduced Instruction Set Computers.
RUBY by Ryan Chase.
Programming for Performance CS 740 Oct. 4, 2000 Topics How architecture impacts your programs How (and how not) to tune your code.
VEAL: Virtualized Execution Accelerator for Loops Nate Clark 1, Amir Hormati 2, Scott Mahlke 2 1 Georgia Tech., 2 U. Michigan.
High Performance Embedded Computing © 2007 Elsevier Lecture 10: Code Generation Embedded Computing Systems Michael Schulte Based on slides and textbook.
1 Iterative Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program.
Memory-Aware Compilation Philip Sweany 10/20/2011.
Operational Semantics Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
Compilation of XSLT into Dataflow Graphs for Web Service Composition Peter Kelly Paul Coddington Andrew Wendelborn.
Linear Analysis and Optimization of Stream Programs Masterworks Presentation Andrew A. Lamb 4/30/2003 Professor Saman Amarasinghe MIT Laboratory for Computer.
Architectural Mismatch: Why reuse is so hard? Garlan, Allen, Ockerbloom; 1994.
Static Translation of Stream Program to a Parallel System S. M. Farhad The University of Sydney.
Memory Management in Java Mr. Gerb Computer Science 4.
Apache Tez : Accelerating Hadoop Query Processing Page 1.
Aryeh Tasher Brian Ramos Qijun Zhong Michael Li Tian Zhang.
The Object-Oriented Thought Process Chapter 03
Functional Programming
Efficient Evaluation of XQuery over Streaming Data
Parallel processing is not easy
Objects as a programming concept
Parallel Programming By J. H. Wang May 2, 2017.
Linear Filters in StreamIt
ESE532: System-on-a-Chip Architecture
MXNet Internals Cyrus M. Vahid, Principal Solutions Architect,
Binding Times Binding is an association between two things Examples:
Chapter 12 Pipelining and RISC
Architectural Mismatch: Why reuse is so hard?
CS 201 Compiler Construction
Presentation transcript:

A Dataflow Programming Language and Its Compiler for Streaming Systems Haitao Wei, Stephane Zuckerman, Xiaoming Li, and Guang Gao University of Delaware

Why a new dataflow programming language?

Why a new dataflow programming language?

Why a new dataflow programming language?

Why a new dataflow programming language? Arbitrary dataflow graph with multiple input/output operators Beyond static mapping True asynchronous processing Primitive and compositional coding Code reuse: using subgraph template Easy to construct dataflow graph: using explicit variable/name passing to connect the computation node Efficiency under dynamic program configurations Automatic software pipelining Optimized resource allocation and mapping

COStream Programming Language Multiple I/O Operators VS Single I/O Operator Split Join Fork-join style dataflow graph in Streamit with single input/output and split-join structure Arbitrary Dataflow graph in COSTream with multiple Input/Output Operator and stream

COStream Programming Language composite Connect operators to construct a dataflow subgraph Can be instantiated to reuse the code to explore the parallelism

COStream Programming Language Stream variable and dataflow graph construct Split composite Main(){ graph stream<float x> S=A(){… } stream<float x> T=B(){…} stream<float x> P=C(T,S){…} A B A B Join C Costream: Easy to construct and make it simple C StreamIt: Introduce useless split-join nodes which makes programming hard and brings the overhead

Example: Moving Average Source composite Main(){ Int N=10; graph stream<float x> S=Source(){ int x; init{x=0;} work{ S[0].x=x; x++; } //initialize an instance of composite MovAve stream<float x> P=MovAver(S)(N); Sink(P){ print(P[0].x); }} composite MovAver(output O, input rawIn){ param int N; graph stream<float x> O=Aver(rawIn){ float w[N]; init{ for(i=0;i<N;i++) w[i]=i;} work{ int sum=0, i; sum += rawIn[i].x*w[i]; O[0].i=sum/N; } window{ rawIn sliding(N,1) O tumbling(1);} }} … N Sliding Window Average tumbling Window 1 Sink

Example: Moving Average Source composite Main(){ Int N=10; graph stream<float x> S=Source(){ int x; init{x=0;} work{ S[0].x=x; x++; } //initialize an instance of composite MovAve stream<float x> P=MovAver(S)(N); Sink(P){ print(P[0].x); }} composite MovAver(output O, input rawIn){ param int N; graph stream<float x> O=Aver(rawIn){ float w[N]; init{ for(i=0;i<N;i++) w[i]=i;} work{ int sum=0, i; sum += rawIn[i].x*w[i]; O[0].i=sum/N; } window{ rawIn sliding(N,1) O tumbling(1);} }} … N Sliding Window Average tumbling Window 1 Sink

Example: Moving Average Source composite Main(){ Int N=10; graph stream<float x> S=Source(){ int x; init{x=0;} work{ S[0].x=x; x++; } //initialize an instance of composite MovAve stream<float x> P=MovAver(S)(N); Sink(P){ print(P[0].x); }} composite MovAver(output O, input rawIn){ param int N; graph stream<float x> O=Aver(rawIn){ float w[N]; init{ for(i=0;i<N;i++) w[i]=i;} work{ int sum=0, i; sum += rawIn[i].x*w[i]; O[0].i=sum/N; } window{ rawIn sliding(N,1) O tumbling(1);} }} … N Sliding Window Average tumbling Window 1 Sink

Example: Moving Average Source composite Main(){ Int N=10; graph stream<float x> S=Source(){ int x; init{x=0;} work{ S[0].x=x; x++; } //initialize an instance of composite MovAve stream<float x> P=MovAver(S)(N); Sink(P){ print(P[0].x); }} composite MovAver(output O, input rawIn){ param int N; graph stream<float x> O=Aver(rawIn){ float w[N]; init{ for(i=0;i<N;i++) w[i]=i;} work{ int sum=0, i; sum += rawIn[i].x*w[i]; O[0].i=sum/N; } window{ rawIn sliding(N,1) O tumbling(1);} }} … N Sliding Window Average tumbling Window 1 Sink

Example: Moving Average Source composite Main(){ Int N=10; graph stream<float x> S=Source(){ int x; init{x=0;} work{ S[0].x=x; x++; } //initialize an instance of composite MovAve stream<float x> P=MovAver(S)(N); Sink(P){ print(P[0].x); }} composite MovAver(output O, input rawIn){ param int N; graph stream<float x> O=Aver(rawIn){ float w[N]; init{ for(i=0;i<N;i++) w[i]=i;} work{ int sum=0, i; sum += rawIn[i].x*w[i]; O[0].i=sum/N; } window{ rawIn sliding(N,1) O tumbling(1);} }} … N Sliding Window Average tumbling Window 1 Sink

Compiler Framework COStream program B A C E Synchronization Data Flow Front-end Lexer, parser and construct AST B MIR Construct stream graph from AST A C E Synchronization Data Flow D Symbol Execute Initial && Periodic schedule of SG Runtime System Low-level c/c++ compiler Optimization Parallel exploring Buffer allocation Synchronization Multi/Many-core Code Generation

Software Pipelining A0 21 A1 B0 9 B1 B2 Core1 Core2 Core3 Step1:Task Assignment

Software Pipelining Stage1 Stage0 A0 21 A1 B0 9 B1 B2 Step2:Stage Assignment

Software Pipelining Time B0 B1 B2 Core1 Core2 Core3 A0 A1 … Barrier

Ongoing Research Compilation Backend Currently using pthread Software Pipelining within codelets Backend Currently using pthread Will use Fresh Breeze runtime

Ongoing Research (cont’d) Runtime Currently re-optimize for every new configuration Graceful transition between configurations

Transition between configurations

Transition between configurations

Transition between configurations

Transition between configurations

Acknowledgements AFOSR FA9550-13-1-0213 NSF CCF-0925863, CCF-0937907 and OCI-0904534