Java-Based Parallel Computing on the Internet: Javelin 2.0 & Beyond Michael Neary & Peter Cappello Computer Science, UCSB.



Introduction: Goals
Service parallel applications that are:
– Large: too big for a cluster
– Coarse-grain: to hide communication latency
Simplicity of use
– Design focus: decomposition [composition] of computation
Scalable high performance
– despite large communication latency
Fault tolerance
– 1000s of hosts, each dynamically [dis]associating

Introduction Some Related Work

Introduction: Some Applications
– Search for extraterrestrial life
– Computer-generated animation
– Computer modeling of drugs for:
  – Influenza
  – Cancer
  – Reducing chemotherapy's side effects
– Financial modeling
– Storing nuclear waste

Outline
– Architecture
– Model of Computation
– API
– Scalable Computation
– Experimental Results
– Conclusions & Future Work

Architecture: Basic Components
– Brokers
– Clients
– Hosts

Architecture: Broker Discovery
[Diagram sequence: a new host H queries the Broker Naming System for a broker, pings candidate brokers ("PING (BID?)"), and attaches to a broker B in the broker network.]

Architecture: Network of Broker-Managed Host Trees
– Each broker manages a tree of hosts
– Brokers form a network
– A client contacts a broker
– The client gets host trees

Scalable Computation: Deterministic Work-Stealing Scheduler
[Diagram: each HOST holds a task container with operations addTask( task ), getTask( ), and stealTask( ).]
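The container on this slide can be sketched as a double-ended queue. The method names follow the slide; the deque policy (owner works LIFO at one end, thieves steal FIFO at the other) is an assumption based on standard work-stealing practice, not a detail stated here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of the per-host task container.
public class TaskContainer<T> {
    private final Deque<T> deque = new ArrayDeque<>();

    // Owner adds a newly created task.
    public synchronized void addTask(T task) { deque.addLast(task); }

    // Owner takes its most recently added task (LIFO: good locality).
    public synchronized T getTask() { return deque.pollLast(); }

    // Another host steals the oldest task (FIFO: likely a large subtree).
    public synchronized T stealTask() { return deque.pollFirst(); }

    public static void main(String[] args) {
        TaskContainer<String> c = new TaskContainer<>();
        c.addTask("t1"); c.addTask("t2"); c.addTask("t3");
        System.out.println(c.getTask());   // t3 (owner: LIFO)
        System.out.println(c.stealTask()); // t1 (thief: FIFO)
    }
}
```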

Scalable Computation: Deterministic Work-Stealing Scheduler
The CLIENT and HOSTS form a tree; a host out of work looks locally, then at its children, then asks its parent:

Task getWork( ) {
    if ( my deque has a task )
        return task;
    else if ( any child has a task )
        return child's task;
    else
        return parent.getWork( );
}

Models of Computation
– Master-slave
  – AFAIK all proposed commercial applications
– Branch-&-bound optimization
  – A generalization of master-slave

Models of Computation: Branch & Bound
[Animated tree search: initially UPPER = ∞, LOWER = 0. Expanding nodes raises LOWER (0, 2, 3, ...); a complete solution sets UPPER = 4, then a better one sets UPPER = 3; any subtree whose LOWER (e.g. 6) meets or exceeds UPPER is pruned.]

Models of Computation: Branch & Bound
– Tasks are created dynamically
– The upper bound is shared
– To detect termination, the scheduler detects tasks that have been:
  – Completed
  – Killed ("bounded")
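The model above can be condensed into a toy sequential loop: nodes are created dynamically, a shared upper bound prunes subtrees, and termination holds once every node is either completed or killed. The problem instance (minimize the sum of one value chosen per level) is hypothetical, purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy branch & bound illustrating the slide's model.
public class BranchAndBound {
    static final int[][] LEVELS = { {3, 1}, {4, 1}, {5, 9} };

    public static int solve() {
        int upper = Integer.MAX_VALUE;            // UPPER = ∞ initially
        Deque<int[]> stack = new ArrayDeque<>();  // nodes: {depth, costSoFar}
        stack.push(new int[] {0, 0});
        int completed = 0, killed = 0;            // termination bookkeeping
        while (!stack.isEmpty()) {
            int[] node = stack.pop();
            int depth = node[0], cost = node[1];
            if (depth == LEVELS.length) {         // complete solution
                completed++;
                if (cost < upper) upper = cost;   // new best: tighten UPPER
                continue;
            }
            for (int v : LEVELS[depth]) {         // branch: create children
                int lower = cost + v;             // child's LOWER bound
                if (lower < upper) stack.push(new int[] {depth + 1, lower});
                else killed++;                    // child is killed (bounded)
            }
        }
        return upper;
    }

    public static void main(String[] args) {
        System.out.println(solve()); // 1 + 1 + 5 = 7
    }
}
```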

API

public class Host implements Runnable {
    ...
    public void run() {
        while ( (node = jDM.getWork()) != null ) {
            if ( isAtomic() )
                compute();  // search space; return result
            else {
                child = node.branch();  // put children in child array
                for (int i = 0; i < node.numChildren; i++)
                    if ( child[i].setLowerBound() < UpperBound )
                        jDM.addWork( child[i] );
                    // else child is killed implicitly
            }
        }
    }
}

API

private void compute() {
    ...
    boolean newBest = false;
    while ( (node = stack.pop()) != null ) {
        if ( node.isComplete() ) {
            if ( node.getCost() < UpperBound ) {
                newBest = true;
                UpperBound = node.getCost();
                jDM.propagateValue( UpperBound );
                best = node;
            }
        } else {
            child = node.branch();
            for (int i = 0; i < node.numChildren; i++)
                if ( child[i].setLowerBound() < UpperBound )
                    stack.push( child[i] );
                // else child is killed implicitly
        }
    }
    if ( newBest )
        jDM.returnResult( best );
}

Scalable Computation: Weak Shared Memory Model
Slow propagation of the bound affects performance, not correctness.
[Animated diagram: the new bound propagates from host to host through the tree.]

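Why slow propagation is harmless can be shown with a sketch of the shared bound (the mechanism below is an assumption consistent with the slide's claim): each host applies incoming values with a monotone min-update, so a late or out-of-order update only weakens pruning for a while, never correctness.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the weakly shared UPPER bound.
public class SharedBound {
    private final AtomicInteger upper = new AtomicInteger(Integer.MAX_VALUE);

    // Apply a propagated bound; keep only improvements (monotone decrease).
    public void propagateValue(int candidate) {
        upper.getAndUpdate(u -> Math.min(u, candidate));
    }

    public int get() { return upper.get(); }

    public static void main(String[] args) {
        SharedBound b = new SharedBound();
        b.propagateValue(10);
        b.propagateValue(42); // stale, out-of-order update: ignored
        System.out.println(b.get()); // 10
    }
}
```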

Scalable Computation: Fault Tolerance via Eager Scheduling
When:
– All tasks have been assigned
– Some results have not been reported
– A host wants a new task
...re-assign a task!
Eager scheduling tolerates faults & balances the load.
– The computation completes if at least 1 host communicates with the client.
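The rule above fits in a few lines. This sketch is hypothetical (class and method names are not from the system): while any task lacks a result, a host asking for work receives some unfinished task, even one already assigned elsewhere, so a crashed host's task is re-issued automatically and fast hosts absorb the work of slow ones.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal eager-scheduling sketch.
public class EagerScheduler {
    private final List<Integer> tasks = new ArrayList<>();
    private final Set<Integer> done = new HashSet<>();
    private int next = 0;          // round-robin cursor over unfinished tasks

    public EagerScheduler(int numTasks) {
        for (int t = 0; t < numTasks; t++) tasks.add(t);
    }

    // Called by a host that wants work; returns -1 when all results are in.
    public synchronized int getTask() {
        for (int i = 0; i < tasks.size(); i++) {
            int t = tasks.get((next + i) % tasks.size());
            if (!done.contains(t)) {       // unfinished: assign (or re-assign)
                next = (next + i + 1) % tasks.size();
                return t;
            }
        }
        return -1;                         // every result has been reported
    }

    public synchronized void reportResult(int task) { done.add(task); }

    public static void main(String[] args) {
        EagerScheduler s = new EagerScheduler(2);
        int t0 = s.getTask();              // assigned to a host that crashes
        int t1 = s.getTask();
        s.reportResult(t1);
        System.out.println(s.getTask() == t0); // t0 is eagerly re-assigned
    }
}
```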

Scalable Computation: Fault Tolerance via Eager Scheduling
The scheduler must know which:
– Tasks have completed
– Nodes have been killed
Performance requires balancing:
– Centralized schedule info
– Decentralized computation

Experimental Results

Example of a “bad” graph

Conclusions
– Javelin 2 relieves the designer/programmer of managing a set of [Inter-]networked processors that is:
  – Dynamic
  – Faulty
– A wide set of applications is covered by:
  – The master-slave model
  – The branch & bound model
– Weak shared memory performs well.
– Use multicast (?) for:
  – Code distribution
  – Propagating values

Future Work
– Improve support for long-lived computation:
  – Do not require that the client run continuously.
– A dag model of computation
  – with limited weak shared memory.

Future Work: Jini/JavaSpaces Technology
[Diagram: a TaskManager (aka Broker) surrounded by hosts H.]
"Continuously" disperse tasks among brokers via a physics model.

Future Work: Jini/JavaSpaces Technology
– TaskManager uses a persistent JavaSpace
  – Host management: trivial
  – Eager scheduling: simple
– No single point of failure
  – Fat tree topology
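The reason eager scheduling becomes simple in this design can be shown with a self-contained stand-in for the JavaSpaces pattern (this is not the real Jini API; the class below is hypothetical): the TaskManager writes tasks into a space, hosts take them, and a taken-but-unfinished task whose lease expires is simply re-written for another host to take.

```java
import java.util.LinkedList;
import java.util.Queue;

// In-memory stand-in for the tuple-space write/take pattern.
public class TaskSpace<T> {
    private final Queue<T> space = new LinkedList<>();

    // TaskManager deposits a task into the space.
    public synchronized void write(T task) { space.add(task); }

    // A host removes a task; non-blocking here, unlike a real JavaSpace take.
    public synchronized T take() { return space.poll(); }

    public static void main(String[] args) {
        TaskSpace<String> space = new TaskSpace<>();
        space.write("task-1");
        String t = space.take();          // a host takes the task...
        space.write(t);                   // ...its lease expires: re-write it
        System.out.println(space.take()); // task-1 is available again
    }
}
```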

Future Work: Advanced Issues
– Privacy of data & algorithm
– Algorithms
  – New computation-communication complexity model
  – N-body problem, ...
– Accounting: associate specific work with a specific host
  – Correctness
  – Compensation (how to quantify?)
– Create an open source organization
  – System infrastructure
  – Application codes