JICOS: A Java-Centric Network Computing Service
Peter Cappello & Christopher James Coakley
Computer Science, University of California, Santa Barbara


API Goals
Application program is oblivious to:
– Number of processors
– Processor topology
– Inter-process communication
– Faulty compute servers

API: Divide & Conquer (DAC)
[Figure: divide-and-conquer task graph for f(4), with decomposition tasks f(3), f(2), f(1), f(0) and "+" composition tasks]

API: DAC
Common environment
– Input object
– Shared object
[Figure: divide-and-conquer task graph for f(4)]
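The decompose/compose pattern in the f(4) task graph can be sketched with the JDK's fork/join framework. This is a stand-in for illustration, not the JICOS API: the class name `FibTask` is hypothetical, and the sketch assumes f is the Fibonacci recurrence, which the slides do not state. As in the API goals, the task code never mentions how many processors run it.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative only: JDK fork/join used as a stand-in, not JICOS classes.
// Assumes f(n) = f(n-1) + f(n-2) with f(0) = 0 and f(1) = 1.
class FibTask extends RecursiveTask<Integer> {
    private final int n;

    FibTask(int n) {
        this.n = n;
    }

    @Override
    protected Integer compute() {
        if (n < 2) {
            return n;                       // atomic task: no further decomposition
        }
        FibTask left = new FibTask(n - 1);  // decompose into two subtasks
        FibTask right = new FibTask(n - 2);
        left.fork();                        // left subtask may run on another worker
        int rightResult = right.compute();  // right subtask runs in this worker
        return left.join() + rightResult;   // the "+" composition step
    }

    // Runs the whole task graph; the pool decides how many workers to use.
    static int fib(int n) {
        return ForkJoinPool.commonPool().invoke(new FibTask(n));
    }

    public static void main(String[] args) {
        System.out.println(fib(4));
    }
}
```

The task describes only how to split and how to combine; scheduling, worker count, and work distribution are left to the runtime.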

Architectural Goals
– Scalable
– Heterogeneous processors & OS
– Mobile code
– Support adaptively parallel computation
– Tolerate faulty compute servers
– Reduce or hide communication latency

Architecture
[Figure: one task server (S) and its hosts (H)]

[Figure: components M and C; operations: login, setComputation, getResult, logout]
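The four operations named on this slide read like a session lifecycle. A minimal sketch of that call order follows. Only the operation names come from the slide; the interface `ComputeSession`, the class `LocalSession`, and every signature are hypothetical, and the stub runs the computation in-process rather than on remote hosts.

```java
import java.util.function.Supplier;

// Hypothetical sketch: only the four operation names appear on the slide.
// All types and signatures here are invented for illustration.
interface ComputeSession {
    void login();
    void setComputation(Supplier<?> rootTask);
    Object getResult();
    void logout();
}

// In-process stub that enforces the login -> setComputation -> getResult -> logout order.
class LocalSession implements ComputeSession {
    private boolean loggedIn;
    private Supplier<?> task;

    public void login() {
        loggedIn = true;
    }

    public void setComputation(Supplier<?> rootTask) {
        requireLogin();
        task = rootTask;
    }

    public Object getResult() {
        requireLogin();
        if (task == null) {
            throw new IllegalStateException("no computation set");
        }
        return task.get();   // a real service would wait for remote hosts here
    }

    public void logout() {
        loggedIn = false;
        task = null;
    }

    private void requireLogin() {
        if (!loggedIn) {
            throw new IllegalStateException("not logged in");
        }
    }

    // Convenience wrapper showing the full lifecycle for one computation.
    static Object runOnce(Supplier<?> rootTask) {
        LocalSession session = new LocalSession();
        session.login();
        try {
            session.setComputation(rootTask);
            return session.getResult();
        } finally {
            session.logout();   // always release the session
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce(() -> 6 * 7));
    }
}
```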

Hiding Communication Latency
Task Caching
[Figure: divide-and-conquer task graph for f(4)]

Hiding Communication Latency
Task Pre-fetching
[Figure: divide-and-conquer task graph for f(4)]
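One way to picture task pre-fetching: while a host executes its current task, a background thread is already requesting the next one, so the fetch latency overlaps with computation. The sketch below is an assumption-laden illustration, not JICOS code; `PrefetchingHost`, `fetchFromServer`, `execute`, the 5 ms simulated latency, and the one-slot buffer are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch of task pre-fetching; all names are hypothetical.
class PrefetchingHost {
    private static final int NO_MORE_TASKS = -1;  // sentinel marking the end of the task stream

    // Simulated remote request: returns the task after a fixed latency.
    static int fetchFromServer(int taskId) throws InterruptedException {
        Thread.sleep(5);
        return taskId;
    }

    // Stand-in for real work on a task.
    static int execute(int task) {
        return task * task;
    }

    static List<Integer> run(int taskCount) {
        // One-slot buffer: at most one task is fetched ahead of the one being executed.
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(1);

        Thread prefetcher = new Thread(() -> {
            try {
                for (int id = 0; id < taskCount; id++) {
                    buffer.put(fetchFromServer(id));  // blocks while the slot is full
                }
                buffer.put(NO_MORE_TASKS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        prefetcher.setDaemon(true);
        prefetcher.start();

        List<Integer> results = new ArrayList<>();
        try {
            // Executing the current task overlaps with fetching the next one.
            for (int task = buffer.take(); task != NO_MORE_TASKS; task = buffer.take()) {
                results.add(execute(task));
            }
            prefetcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(4));
    }
}
```

The buffer size bounds how far ahead the host fetches; a larger buffer hides more latency but holds more tasks that would need reassigning if the host failed.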

Hiding Communication Latency
Execute Task on Server
[Figure: divide-and-conquer task graph for f(4)]

Tolerating Faulty Hosts
Transactions kill performance

Tolerating Faulty Hosts
[Figure: proxies with their TASKS lists, a TASK, and hosts (H)]

Performance Experiments
Problem: 200-city TSP
– 61,295 BranchAndBound tasks (2.05 s)
– 30,647 MinSolution tasks (< 1 ms)
120-processor experiments use 3 processor types (CX journal paper derives formula)

Fault Tolerance Experiments
Problem: 200-city TSP
Killed p processors after 1,500 s, for p = 2, 6, 20, 24, 26, 30.
% overhead: actual time / ideal time
[Figure: 16 hosts (H) and 1 server (S); 32 processors]

Fault Tolerance Experiments

Task Server Overhead
[Figure: two configurations of hosts (H) and a task server (S); in one, the server shares a machine with a host (S/H)]
– 22 processors; 11 hosts, 1 server; 12 machines; time: s
– 22 processors; 11 hosts, 1 server; 11 machines; time: s

Conclusions
API = Cilk + Common Task Environment
Architecture
– Network of servers, each serving many hosts
– Supports adaptive parallelism
– Efficiently tolerates faulty hosts
Excellent speedups
– 2 processors (1 host): 9 hours and 32 minutes
– 120 processors: < 12 minutes (96.66% of ideal)
– 3 application-controlled latency-hiding directives
Small server overhead: run a host on the server

THANK YOU!
URL: cs.ucsb.edu/projects/jicos
Download System Source Tutorial

A Distributed Computing Taxonomy
[Figure: decision tree; each criterion below has a NOT branch]
– Application fixes processor topology a priori
– Adaptively parallel
– Tolerates faulty compute servers
– Divide & Conquer API

Ancestry
Cilk, Linda, Atlas, Satin, Javelin, CX, JICOS, Piranha