High level programming for the Grid
Gosia Wrzesinska, Dept. of Computer Science, Vrije Universiteit Amsterdam

Distributed supercomputing
Parallel processing on geographically distributed computing systems (grids).
Programming for grids is hard:
– Heterogeneity
– Slow network links
– Nodes joining, leaving, crashing
We need a grid programming environment to hide this complexity.
[Map: Amsterdam, Berlin, Brno, and Cardiff connected via the Internet]

Satin: Divide-and-Conquer for Grids
Divide-and-conquer (fork/join parallelism):
– Is inherently hierarchical (fits the platform)
– Has many applications: parallel rendering, SAT solving, VLSI routing, N-body simulation, multiple sequence alignment, grammar-based learning
Satin:
– High-level programming model
– Java-based
– Grid-aware load balancing
– Support for fault tolerance, malleability, migration
[Figure: fib(5) spawn tree distributed over cpu 1, cpu 2, and cpu 3]

Example: Raytracer

public class Raytracer {
    BitMap render(Scene scene, int x, int y, int w, int h) {
        if (w < THRESHOLD && h < THRESHOLD) {
            /* render sequentially; return the resulting BitMap */
        } else {
            BitMap res1 = render(scene, x, y, w/2, h/2);
            BitMap res2 = render(scene, x+w/2, y, w/2, h/2);
            BitMap res3 = render(scene, x, y+h/2, w/2, h/2);
            BitMap res4 = render(scene, x+w/2, y+h/2, w/2, h/2);
            return combineResults(res1, res2, res3, res4);
        }
    }
}

Parallelizing the Raytracer

interface RaytracerInterface extends ibis.satin.Spawnable {
    BitMap render(Scene scene, int x, int y, int w, int h);
}

public class Raytracer extends ibis.satin.SatinObject
                       implements RaytracerInterface {
    BitMap render(Scene scene, int x, int y, int w, int h) {
        if (w < THRESHOLD && h < THRESHOLD) {
            /* render sequentially; return the resulting BitMap */
        } else {
            BitMap res1 = render(scene, x, y, w/2, h/2);         /* spawned */
            BitMap res2 = render(scene, x+w/2, y, w/2, h/2);     /* spawned */
            BitMap res3 = render(scene, x, y+h/2, w/2, h/2);     /* spawned */
            BitMap res4 = render(scene, x+w/2, y+h/2, w/2, h/2); /* spawned */
            sync();
            return combineResults(res1, res2, res3, res4);
        }
    }
}

Running Satin applications
[Figure: job tree (Job1 … Job17) distributed over processors 1–3, with results r1, r2, and r4 returned between processors]

Performance on the Grid
GridLab testbed: 5 cities in Europe, 40 CPUs in total
– Different architectures and operating systems
– Large differences in processor speeds
– Latencies: 0.2–210 ms daytime, 0.2–66 ms at night
– Bandwidth: 9 KB/s – 11 MB/s
80% efficiency

Fault tolerance, malleability, migration
Join: let the new node start stealing
Leave, crash:
– avoid checkpointing
– recompute lost jobs
Optimizations:
– reusing orphan jobs
– reusing results from gracefully leaving processors
We can:
– Tolerate crashes with minimal loss of work
– Add and remove (gracefully) processors with no loss of work
– Efficiently migrate (add new nodes + remove old nodes)

The performance of FT and malleability
[Chart: run times with 16 CPUs in Amsterdam and 16 CPUs in Leiden]

Efficient migration
[Chart: 4 CPUs in Berlin, 4 CPUs in Brno, 8 CPUs in Leiden; the Leiden part migrated to Delft]

Shared data for d&c applications
A data-sharing abstraction is needed to extend Satin's applicability:
– Branch & bound, game tree search, etc.
Sequential consistency is inefficient on the grid:
– High latencies
– Nodes leaving and joining
Applications often allow weaker consistency.

Shared objects with guard consistency
Define consistency requirements with guard functions:
– A guard checks whether the local replica is consistent
Replicas are allowed to become inconsistent as long as the guards are satisfied:
– If a guard is unsatisfied, the replica is brought into a consistent state
Applications: VLSI routing, learning SAT solver, TSP, N-body simulation
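The guard check above can be sketched as follows. Guard, Replica, and GuardedInvoke are hypothetical names used only for illustration; the real Satin runtime derives this check from the guard_<method> naming convention shown in the guards example slide.

```java
// Sketch of how a guard gates a job invocation (hypothetical names).
interface Guard { boolean consistent(); }
interface Replica { void bringUpToDate(); }

class GuardedInvoke {
    static void invoke(Guard g, Replica r, Runnable job) {
        // Replicas may drift; only repair when the guard is unsatisfied.
        if (!g.consistent()) {
            r.bringUpToDate();   // e.g. fetch the missed updates from a peer
        }
        job.run();
    }
}
```

The point of the design is that the (expensive) repair step runs only when the application's own consistency requirement is violated, not on every access.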

Shared objects performance
3 clusters in France (Grid'5000), 120 nodes (Rennes, Nice, Bordeaux)
Wide-area, heterogeneous testbed
Latency: 4–10 ms
Bandwidth: Mbps
Ran the VLSI routing app: 86% efficiency

Summary
Satin: a grid programming environment that
– Allows rapid development of parallel applications
– Performs well on wide-area, heterogeneous systems
– Adapts to changing sets of resources
– Tolerates node crashes
– Provides a divide-and-conquer + shared objects programming model
– Applications: parallel rendering, SAT solving, VLSI routing, N-body simulation, multiple sequence alignment, grammar-based learning, etc.

Acknowledgements
Henri Bal, Jason Maassen, Rob van Nieuwpoort, Ceriel Jacobs, Kees Verstoep, Kees van Reeuwijk, Maik Nijhuis, Thilo Kielmann
Publications and software distribution available at:

Additional Slides

Guards: example

/* divide-and-conquer job */
List computeForces(byte[] nodeId, int iteration, Bodies bodies) {
    /* compute forces for the subtree rooted at nodeId */
}

/* guard function: the replica is consistent iff it has
   completed the previous iteration */
boolean guard_computeForces(byte[] nodeId, int iteration, Bodies bodies) {
    return (bodies.iteration + 1 == iteration);
}

Handling orphan jobs – example
[Animation, 7 slides: processor 2 crashes; processor 3 finishes its orphan jobs 9 and 15 and broadcasts the (9,cpu3) and (15,cpu3) tuples; processor 1 stores the tuples in its orphan table and, while recomputing, requests those results from cpu3 instead of recomputing them]

Processors leaving gracefully
[Animation, 6 slides: the leaving processor 2 sends its results to another processor, which treats them as orphans; the (11,cpu3), (9,cpu3), and (15,cpu3) tuples are broadcast and the results are reused by processor 1]

The Ibis system
Java-centric => portability:
– "write once, run anywhere"
Efficient communication:
– Efficient pure-Java implementation
– Optimized solutions for special cases with native code
High-level programming models:
– Divide & Conquer (Satin)
– Remote Method Invocation (RMI)
– Replicated Method Invocation (RepMI)
– Group Method Invocation (GMI)

Ibis design

Compiling Satin programs
source → Java compiler → bytecode → bytecode rewriter → JVM

Executing Satin programs
Spawn: put work in the work queue
Sync:
– Run work from the queue
– If the queue is empty: steal (load balancing)
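The spawn/sync behaviour above can be sketched on a single processor as follows. WorkQueue and Fib are hypothetical classes for illustration only; the real runtime keeps one queue per processor and steals from a victim when its own queue is empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Single-processor sketch of Satin's spawn/sync execution model.
class WorkQueue {
    private final Deque<Runnable> queue = new ArrayDeque<>();

    void spawn(Runnable job) {          // spawn: put work in the work queue
        queue.addFirst(job);
    }

    void sync() {                       // sync: run work from the queue;
        while (!queue.isEmpty()) {      // a distributed worker would steal here
            queue.removeFirst().run();
        }
    }
}

// A divide-and-conquer job expressed against the sketch.
class Fib {
    private final WorkQueue q = new WorkQueue();

    int fib(int n) {
        if (n < 2) return n;
        int[] x = new int[1], y = new int[1];   // result slots for the spawns
        q.spawn(() -> x[0] = fib(n - 1));
        q.spawn(() -> y[0] = fib(n - 2));
        q.sync();                               // both spawns done after this
        return x[0] + y[0];
    }
}
```

Because sync does not return until the queue is empty and jobs only leave the queue by running, every spawned job has executed before its spawner reads the result slots.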

Satin: load balancing for grids
Random Stealing (RS):
– Pick a victim at random
– Provably optimal on a single cluster (Cilk)
– Problems on multiple clusters: with C clusters, (C-1)/C of all steal requests go over the WAN, using a synchronous protocol

Grid-aware load balancing
Cluster-aware Random Stealing (CRS) [van Nieuwpoort et al., PPoPP 2001]:
– When idle:
  – Send an asynchronous steal request to a random node in a different cluster
  – In the meantime, steal locally (synchronously)
– Only one wide-area steal request at a time
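The victim selection in CRS can be sketched as follows. Node and CrsSelector are hypothetical names; the real implementation additionally tracks the single outstanding wide-area request and keeps stealing locally while it is pending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch of CRS victim selection (hypothetical classes).
class Node {
    final String name, cluster;
    Node(String name, String cluster) { this.name = name; this.cluster = cluster; }
}

class CrsSelector {
    private final Random rnd = new Random();

    // Wide-area steal: a random node in a *different* cluster (asynchronous).
    Node remoteVictim(List<Node> all, String myCluster) {
        List<Node> remote = new ArrayList<>();
        for (Node n : all)
            if (!n.cluster.equals(myCluster)) remote.add(n);
        return remote.get(rnd.nextInt(remote.size()));
    }

    // Local steal: a random other node in the *same* cluster (synchronous).
    Node localVictim(List<Node> all, String myCluster, String myName) {
        List<Node> local = new ArrayList<>();
        for (Node n : all)
            if (n.cluster.equals(myCluster) && !n.name.equals(myName)) local.add(n);
        return local.get(rnd.nextInt(local.size()));
    }
}
```

Keeping the wide-area request asynchronous hides the high WAN latency behind useful local stealing, which is why CRS outperforms plain RS on multiple clusters.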

Configuration

Location                    Type     OS       CPU        CPUs
Amsterdam, The Netherlands  Cluster  Linux    Pentium-3  8 x 1
Amsterdam, The Netherlands  SMP      Solaris  Sparc      1 x 2
Brno, Czech Republic        Cluster  Linux    Xeon       4 x 2
Cardiff, Wales, UK          SMP      Linux    Pentium-3  1 x 2
ZIB Berlin, Germany         SMP      Irix     MIPS       1 x 16
Lecce, Italy                SMP      Tru64    Alpha      1 x 4

Handling orphan jobs
For each finished orphan, broadcast a (jobID, processorID) tuple; abort the rest
All processors store these tuples in orphan tables
Processors perform a lookup in their orphan table for each recomputed job
– If the lookup succeeds: send a result request to the owner (asynchronously) and put the job on a stolen-jobs list
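The orphan table described above can be sketched as a simple map; OrphanTable is a hypothetical name, and the broadcast/request machinery is elided.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an orphan table: each processor stores the broadcast
// (jobID, processorID) tuples and consults them before recomputing a job.
class OrphanTable {
    private final Map<String, String> table = new HashMap<>();

    void onBroadcast(String jobId, String ownerCpu) {  // received tuple
        table.put(jobId, ownerCpu);
    }

    // The CPU holding the orphan's result, or null: recompute the job.
    String lookup(String jobId) {
        return table.get(jobId);
    }
}
```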

A crash of the master
Master: the processor that started the computation by spawning the root job
If the master crashes:
– Elect a new master
– Execute normal crash recovery
– The new master restarts the application
– In the new run, all results from the previous run are reused

Some remarks about scalability
Little data is broadcast (< 1% of jobs, just pointers)
Message combining
Lightweight broadcast: no need for reliability, synchronization, etc.
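Message combining for the orphan-table broadcasts might look like the following sketch; CombiningBroadcaster and the string payload format are assumptions for illustration, not the actual wire format.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of message combining: buffer several (jobID, processorID)
// tuples and flush them as one broadcast payload.
class CombiningBroadcaster {
    private final List<String> buffer = new ArrayList<>();

    void add(String jobId, String cpu) {
        buffer.add(jobId + "," + cpu);
    }

    // Combine all buffered tuples into a single message and clear the buffer.
    String flush() {
        String msg = String.join(";", buffer);
        buffer.clear();
        return msg;
    }
}
```

Combining many small tuples into one message keeps the broadcast traffic low, which is part of why less than 1% of jobs generate broadcast data.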

Job identifiers
rootId = 1
childId = parentId * branching_factor + child_no
Problem: this requires knowing the maximal branching factor of the tree
Solution: strings of bytes, one byte per tree level
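The byte-string scheme can be sketched as follows; JobId is a hypothetical class, and child numbers are assumed to fit in a byte.

```java
import java.util.Arrays;

// Sketch of byte-string job identifiers: one byte per tree level, so no
// global maximal branching factor is needed.
class JobId {
    static final byte[] ROOT = {};           // the root job: empty string

    // The id of a child is its parent's id with the child number appended.
    static byte[] child(byte[] parent, int childNo) {
        byte[] id = Arrays.copyOf(parent, parent.length + 1);
        id[parent.length] = (byte) childNo;
        return id;
    }
}
```

An id's length equals the job's depth in the spawn tree, and a job's ancestors are exactly the prefixes of its id, which makes subtree aborts a prefix test.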

Shared Objects – example

public interface BarnesHutInterface extends WriteMethods {
    void computeForces(

Satin “Hello world”: Satonacci
Single-threaded Java:

class Sat {
    int Sat(int n) {
        if (n < 2) return n;
        int x = Sat(n-1);
        int y = Sat(n-2);
        return x + y;
    }
}

[Figure: Sat(5) call tree]

Parallelizing Satonacci

public interface SatInter extends ibis.satin.Spawnable {
    public int Sat(int n);
}

class Sat extends ibis.satin.SatinObject implements SatInter {
    public int Sat(int n) {
        if (n < 2) return n;
        int x = Sat(n-1); /* spawned */
        int y = Sat(n-2); /* spawned */
        sync();
        return x + y;
    }
}

[Map: Leiden, Delft, Berlin, and Brno connected via the Internet]

Satonacci – cont.
[Figure: Sat(5) spawn tree distributed over processors 1–3]