Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly of Microsoft.

Presentation transcript:

Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks 1
Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly
Microsoft Research, Silicon Valley
Presented by: Thomas Hummel

Agenda 2
• Introduction
• System Overview
• Dryad Graph
• Program Development
• Program Execution
• Experimental Results
• Future Work

Introduction 3
• Problem
• How to write efficient distributed programs easily?
• Environment
• Parallel Processors
• High Speed Links
• Administered Domain
• Ignore Low Level Issues

Introduction 4
• Parallel Execution
• Faster Execution
• Automatic Specification
• Manual Specification
• GPU Shader
• Distributed Databases
• MapReduce

Introduction 5
• Graph Model
• Vertices Are Programs
• Edges Are Communication Links
• Forced Parallelism Mindset
• Necessary Abstraction

Introduction 6
• GPU Shader
• Low Level
• Hardware Specific
• MapReduce
• Simplicity Paramount
• Performance Sacrificed
• Database
• Implicit Communication
• Algebra Optimized

Introduction 7
• Dryad
• Fine Communication Control
• Multiple Input/Output Sets
• Must Consider Resources
• Execution Engine
• Executes DAG Of Programs
• Outputs Directed To Inputs
• No Recursion

System Overview 8
• Dryad Job
• DAG
• Data Passed On Edges
• Vertex Is a Program
• Message Structure
• User Defined
• Shared Memory
• TCP
• Files
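
As a rough illustration of the job model on this slide, the sketch below represents a job as a DAG whose vertices are ordinary sequential functions and whose edges carry the data. All names here (`Job`, `add_vertex`, `connect`, `run`) are hypothetical and are not Dryad's API, and the channels are modeled as in-memory Python lists rather than files, TCP, or shared memory.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Job:
    """Toy model of a job: a DAG of sequential programs (hypothetical API)."""

    def __init__(self):
        self.programs = {}               # vertex name -> callable
        self.inputs = defaultdict(list)  # vertex name -> upstream vertex names

    def add_vertex(self, name, program):
        self.programs[name] = program
        self.inputs[name]  # register the vertex even if it has no inputs

    def connect(self, src, dst):
        # An edge directs the output of `src` to an input of `dst`.
        self.inputs[dst].append(src)

    def run(self):
        # TopologicalSorter raises CycleError if the graph is not a DAG.
        order = TopologicalSorter(self.inputs).static_order()
        outputs = {}
        for name in order:
            args = [outputs[src] for src in self.inputs[name]]
            outputs[name] = self.programs[name](*args)
        return outputs


job = Job()
job.add_vertex("read", lambda: [3, 1, 2])
job.add_vertex("sort", lambda items: sorted(items))
job.add_vertex("total", lambda items: sum(items))
job.connect("read", "sort")
job.connect("read", "total")
results = job.run()
```

Each vertex function is written as plain sequential code; the parallel structure lives only in the graph, which is the separation the slide describes.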

System Overview 10
• System Organization
• Job Manager
• Name Server
• Daemon (Worker Nodes)

Dryad Graph 11
• Graph Description Language
• "Embedded" in C++
• Combine Sub-Graphs
• C++ Class
• Inherited By Vertex Program
• Program Name
• Program Factory

Dryad Graph 12
• Vertex Creation
• C++ Class
• Inherited By Vertex Program
• Program Name
• Program Factory
• One Vertex Is a Graph
• Factory Called
• Program Specific Arguments Applied

Dryad Graph 13
• Edge Creation
• Composition (Combine) Operation
• Two Graphs
• Varying Assignment Methods
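
The Dryad paper describes graph-building operators embedded in C++, including cloning (`^`), pointwise composition (`>=`), and complete bipartite composition (`>>`). The Python sketch below imitates those three with operator overloading to show how "varying assignment methods" produce different edge sets from the same two graphs. The `Graph` class, its fields, and the `name#i` naming of clones are assumptions made for illustration, not Dryad's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graph:
    """Toy graph value: vertices, edges, and designated input/output vertices."""
    vertices: tuple
    edges: tuple = ()
    inputs: tuple = ()
    outputs: tuple = ()

    @staticmethod
    def vertex(name):
        # A single vertex is itself a graph; it is both the input and the output.
        return Graph((name,), (), (name,), (name,))

    def __xor__(self, k):
        # Clone: k renamed copies of this graph with no edges between copies.
        def tag(v, i):
            return f"{v}#{i}"
        copies = range(k)
        return Graph(
            tuple(tag(v, i) for i in copies for v in self.vertices),
            tuple((tag(a, i), tag(b, i)) for i in copies for a, b in self.edges),
            tuple(tag(v, i) for i in copies for v in self.inputs),
            tuple(tag(v, i) for i in copies for v in self.outputs),
        )

    def _join(self, other, new_edges):
        return Graph(
            self.vertices + other.vertices,
            self.edges + other.edges + tuple(new_edges),
            self.inputs,
            other.outputs,
        )

    def __ge__(self, other):
        # Pointwise: pair outputs with inputs, wrapping round-robin if counts differ.
        outs, ins = self.outputs, other.inputs
        n = max(len(outs), len(ins))
        return self._join(other, ((outs[i % len(outs)], ins[i % len(ins)]) for i in range(n)))

    def __rshift__(self, other):
        # Complete bipartite: every output is connected to every input.
        return self._join(other, ((o, i) for o in self.outputs for i in other.inputs))


readers = Graph.vertex("read") ^ 4
sorters = Graph.vertex("sort") ^ 4
mergers = Graph.vertex("merge") ^ 2
pointwise = readers >= sorters   # 4 edges, one per reader/sorter pair
shuffle = sorters >> mergers     # 8 edges, every sorter feeds every merger
uneven = sorters >= mergers      # 4 edges, assigned round-robin to 2 mergers
```

One caveat of borrowing `>=` in Python: a chain such as `a >= b >= c` is evaluated as `(a >= b) and (b >= c)`, so multi-stage pipelines need explicit parentheses.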

Dryad Graph 14

Dryad Graph 15
• Communication Channel
• File I/O By Default
• TCP
• Shared Memory
• Pitfall: Connected Vertices Must Be In the Same Process
• Deadlock Avoidance
• DAG Architecture
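
To make the channel trade-off concrete, here is a minimal sketch of two interchangeable channel types behind one interface: a file-backed channel and an in-process queue. The class and method names (`FileChannel`, `MemoryChannel`, `write_all`, `read_all`) and the JSON-lines encoding are assumptions for illustration; Dryad's real channels are C++ objects with their own serialization, and TCP is omitted here.

```python
import json
import os
import tempfile
from collections import deque


class FileChannel:
    """Items are written to a temporary file, then read back by the consumer."""

    def __init__(self):
        fd, self.path = tempfile.mkstemp(suffix=".chan")
        os.close(fd)

    def write_all(self, items):
        with open(self.path, "w", encoding="utf-8") as f:
            for item in items:
                f.write(json.dumps(item) + "\n")

    def read_all(self):
        # The file persists, so a restarted consumer could read it again.
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f]

    def close(self):
        os.remove(self.path)


class MemoryChannel:
    """In-process FIFO: usable only when both endpoints share one process."""

    def __init__(self):
        self.queue = deque()

    def write_all(self, items):
        self.queue.extend(items)

    def read_all(self):
        # Reading drains the queue, so the data cannot be replayed.
        items = list(self.queue)
        self.queue.clear()
        return items

    def close(self):
        self.queue.clear()


def transfer(channel, items):
    """Push items through a channel and return what the consumer sees."""
    channel.write_all(items)
    try:
        return channel.read_all()
    finally:
        channel.close()


via_file = transfer(FileChannel(), [1, "two", [3]])
via_memory = transfer(MemoryChannel(), [1, "two", [3]])
```

Because both classes expose the same methods, vertex code does not need to know which transport connects it to its neighbors.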

Program Development 16
• Vertex Program Development
• C++ Base Classes
• Status And Errors Reported to Job Manager
• Standard "Main" Method
• Channel Readers/Writers
• Supplied Via Argument List
• Legacy Programs
• C++ Wrapper
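
The shape of a vertex program described above can be sketched in Python as a base class with a standard `main` method that receives channel readers and writers as arguments, plus a wrapper that reports status. Dryad's actual base classes are C++; the names `VertexProgram`, `run`, and the `report` callback below are hypothetical stand-ins, with `report` playing the role of the job manager.

```python
from abc import ABC, abstractmethod


class VertexProgram(ABC):
    """Hypothetical Python analogue of a vertex base class."""

    @abstractmethod
    def main(self, readers, writers):
        """Consume items from `readers` and append results to `writers`."""

    def run(self, readers, writers, report):
        # Wrap main() so completion or failure is reported rather than lost.
        try:
            self.main(readers, writers)
        except Exception as exc:
            report(type(self).__name__, "failed", str(exc))
            return False
        report(type(self).__name__, "completed", "")
        return True


class Merge(VertexProgram):
    def main(self, readers, writers):
        # Plain sequential code: merge every input channel into one sorted output.
        merged = sorted(item for reader in readers for item in reader)
        writers[0].extend(merged)


class Broken(VertexProgram):
    def main(self, readers, writers):
        raise ValueError("bad record")


log = []
out = []
ok = Merge().run([iter([3, 1]), iter([2])], [out], lambda *status: log.append(status))
failed = Broken().run([], [], lambda *status: log.append(status))
```

The author of `Merge` writes only `main`; error reporting is inherited from the base class.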

Program Development 17
• Pipelined Execution
• Assuming Sequential Code
• Event Based Programming
• Channels Are Asynchronous
• Thread Pool
• Optimized For Vertices

Program Execution 18
• Job Manager
• Job Ends If JM Machine Fails
• Different Schemes Possible To Avoid This
• Versioning System For Execution Instances
• Vertex Execution
• Starts When All Input Channels Ready
• User Can Specify Execution Machine
• Can Be Re-Run On Failures
• Job Ends After All Vertices Have Run
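
The rule "a vertex starts when all its input channels are ready, and the job ends after all vertices have run" can be sketched as a small readiness-counting scheduler. This is a single-threaded illustration under the assumption that a channel becomes ready only when its producer has finished (as with file channels); the function name `schedule` and the vertex names are hypothetical.

```python
from collections import deque


def schedule(inputs):
    """Return the order in which vertices become runnable.

    `inputs` maps each vertex name to the list of vertices feeding it.
    """
    pending = {v: len(srcs) for v, srcs in inputs.items()}
    consumers = {v: [] for v in inputs}
    for v, srcs in inputs.items():
        for s in srcs:
            consumers[s].append(v)

    # Vertices with no inputs are runnable immediately.
    ready = deque(v for v, n in pending.items() if n == 0)
    started = []
    while ready:
        v = ready.popleft()
        started.append(v)        # stand-in for running the vertex to completion
        for c in consumers[v]:
            pending[c] -= 1      # one more input channel of `c` is now ready
            if pending[c] == 0:
                ready.append(c)

    # The job is complete only if every vertex has run.
    if len(started) != len(inputs):
        raise ValueError("graph has a cycle; some vertices never became ready")
    return started


order = schedule({
    "read_a": [],
    "read_b": [],
    "join": ["read_a", "read_b"],
    "agg": ["join"],
})
```

In a real cluster the vertices in `ready` at the same moment could be dispatched to different machines in parallel; the sketch runs them one at a time for clarity.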

Program Execution 19
• Fault Tolerance
• Re-Run Vertex If Failed
• Channel Re-Creation (File Recreation)
• TCP/Shared Memory Failures Cause Failures On All Connected Vertices
• Staged Execution Allows Intermediate Error Checking
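
Re-running a failed vertex can be sketched as a retry loop in which each execution attempt carries a version number. The sketch assumes the vertex is deterministic and that its inputs can be read again (for example, because they are stored in files). `run_with_retries`, `flaky_sum`, and the use of `RuntimeError` to simulate a machine failure are all hypothetical.

```python
def run_with_retries(program, inputs, max_attempts=3):
    """Re-run a deterministic vertex program until it succeeds.

    Returns (output, version), where version counts execution attempts from 1.
    """
    last_error = None
    for version in range(1, max_attempts + 1):
        try:
            return program(inputs, version), version
        except RuntimeError as exc:  # stand-in for a machine or process failure
            last_error = exc
    raise RuntimeError(f"vertex failed after {max_attempts} attempts") from last_error


def flaky_sum(inputs, version):
    # Hypothetical vertex that fails on its first execution only.
    if version == 1:
        raise RuntimeError("simulated machine failure")
    return sum(inputs)


output, version = run_with_retries(flaky_sum, [1, 2, 3])
```

With TCP or shared-memory channels the upstream data is not persisted, which is why the slide notes that a failure there takes down every connected vertex rather than just one.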

Experimental Results 20
• SQL Operation
• 10 Computer Cluster
• Gigabit Connections
• Data Mining Operation
• 1800 Computer Cluster
• 10 TB Data Set
• 11 Minute Execution Time

Future Work 21
• Scripting Language
• Nebula
• Additional Abstraction
• SSIS Integration
• SQL Server Integration
• Distributed SQL Queries
• Query Optimizer