Andrei Goldchleger, Fabio Kon, Alfredo Goldman and Marcelo Finger
Department of Computer Science, IME/USP
InteGrade: Object-Oriented Grid Middleware Leveraging Idle Computing Power of Desktop Machines

2 Motivation - need for computation
High demand for computationally-intensive applications
–multimedia processing
–scientific computing
–financial simulations and predictions
–weather forecasting
–oil drilling
–scheduling, planning, etc.

3 Motivation - waste of resources
Corporations, universities, and governments have hundreds or thousands of desktop computers for their employees and students.
Desktops are idle 99% of the time
–idle at night (6 PM to 8 AM)
–idle during work hours
–idle even when users are typing on the desktop keyboard
Dedicated clusters are also idle most of the time, generating heat and noise

4 Paradox
1. High demand for computational power
2. High level of idle resources
Third-world countries like Brazil cannot afford to waste resources like that. Developed countries should also manage their resources better, at least for environmental reasons.
InteGrade’s goal is to resolve this paradox.

5 Team Members
o Alfredo Goldman, Fabio Kon, Marcelo Finger, and Siang W. Song (DCC – IME/USP)
o Markus Endler and Renato Cerqueira (DI – PUC-Rio)
o Edson Cáceres and Henrique Mongelli (DCT – UFMS)
o Approximately 10 graduate students

6 InteGrade: Description
Middleware to build a grid out of commodity machines
Desktop users (resource providers) export their resources to the grid
Grid applications use only idle resources
Advantages over traditional dedicated clusters of commodity hardware

7 InteGrade: Key Features
Based on standard distributed object-oriented technology (CORBA)
Preserves the resource provider’s QoS at all costs
Supports a wide range of parallel applications
Usage pattern collection and analysis

8 InteGrade: OO CORBA Middleware
Communication and architecture based on the CORBA industry standard
–Object orientation at all levels
–Platform independent
–Language independent
Leverages existing CORBA services (e.g., Naming, Trading, Events)
Exports functionality as CORBA services
If desired, can also interoperate with other communication models
–Sockets, MPI, BSP, etc.

9 Feature: Preserves the Resource Provider’s QoS
User-level scheduler (DSRT) limits resource consumption of grid applications
Lightweight CORBA ORB (O2)
Configurable resource sharing (optional)

10 Feature: Usage Pattern Collection and Analysis
Enhances scheduling by offering an approximate view of resource utilization
Usage data is collected in short intervals (e.g., 5 min.) and analyzed
Data is grouped into larger intervals called periods
Clustering algorithms applied to the data derive behavioral categories (e.g., night, lunch break, weekdays)
Each machine learns about the utilization of its resources and uses knowledge of the past to predict the future
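The clustering step can be illustrated with a toy sketch (the real LUPA analysis is more elaborate; the period granularity, the made-up usage profiles, and the deterministic k-means initialization below are assumptions for illustration only):

```python
def dist(a, b):
    # Squared Euclidean distance between two usage profiles.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def kmeans(samples, iters=20):
    # Two behavioral categories; seed with the two most dissimilar profiles
    # so the demo is deterministic.
    pairs = [(dist(a, b), i, j) for i, a in enumerate(samples)
             for j, b in enumerate(samples) if i < j]
    _, i, j = max(pairs)
    centroids = [samples[i], samples[j]]
    clusters = [[], []]
    for _ in range(iters):
        clusters = [[], []]
        for s in samples:
            best = min(range(2), key=lambda c: dist(s, centroids[c]))
            clusters[best].append(s)
        centroids = [mean(c) if c else centroids[ci]
                     for ci, c in enumerate(clusters)]
    return clusters

# Hypothetical daily profiles: fraction of CPU idle during four coarse
# periods of a day (night, morning, afternoon, evening).
weekday = [[0.90, 0.20, 0.10, 0.70], [0.95, 0.30, 0.20, 0.60],
           [0.85, 0.10, 0.15, 0.80]]
weekend = [[0.95, 0.90, 0.85, 0.90], [0.90, 0.95, 0.90, 0.85]]
clusters = kmeans(weekday + weekend)
# The two derived categories separate weekday from weekend behavior.
```

Once such categories exist, each machine can predict from its own history which category the coming hours will fall into and advertise the expected idle capacity.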

11 Feature: Support for a Wide Range of Parallel Applications
Often unsupported by other grid initiatives, especially those that make opportunistic use of shared resources
–In most grid systems, parallel applications must have little or no communication among application nodes
InteGrade research focuses on other kinds of parallel applications (with communication)
Information about the links interconnecting nodes must be collected and used for scheduling
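As a toy illustration of why link information matters, a scheduler for a communicating parallel job might pick the node set whose weakest interconnecting link is fastest. This greedy sketch, with invented bandwidth figures, is not InteGrade's actual algorithm:

```python
from itertools import combinations

def place_parallel_job(nodes, bw, k):
    # bw maps an unordered node pair to its measured bandwidth (Mbps).
    link = lambda a, b: bw.get((a, b), bw.get((b, a)))
    # Seed with the best-connected pair, then greedily add the node whose
    # worst link to the already-chosen set is best.
    chosen = list(max(combinations(nodes, 2), key=lambda p: link(*p)))
    while len(chosen) < k:
        nxt = max((n for n in nodes if n not in chosen),
                  key=lambda n: min(link(n, c) for c in chosen))
        chosen.append(nxt)
    return chosen

# Hypothetical cluster: a-b share a fast switch, c-d a slower one, and
# cross-switch links are slow.
bw = {("a", "b"): 100, ("c", "d"): 50, ("a", "c"): 10,
      ("a", "d"): 10, ("b", "c"): 10, ("b", "d"): 10}
placement = place_parallel_job(["a", "b", "c", "d"], bw, 2)
# A 2-node communicating job lands on the well-connected pair a-b.
```

A bandwidth-oblivious scheduler could just as easily have picked a and c and crippled the job with a link an order of magnitude slower.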

12 Feature: Ensures Application Progress
Usage pattern collection and analysis provides hints, minimizing interruptions
Checkpointing for sequential applications
–Must be implemented in a machine- and OS-independent way
Progress of parallel applications is harder to ensure, requiring globally consistent checkpoints
Possible solution: use BSP as the parallel application model
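The appeal of BSP here can be sketched as follows: at a BSP barrier no messages are in flight, so a checkpoint taken there is globally consistent, and recovery simply restores the last snapshot and replays the remaining supersteps. The superstep function and pickle-based snapshot below are illustrative assumptions, not InteGrade's mechanism:

```python
import pickle

def run_bsp(states, superstep, first, last, every=2, store=None):
    """Execute supersteps [first, last); snapshot at every barrier that
    falls on a multiple of `every` supersteps."""
    store = {} if store is None else store
    for step in range(first, last):
        states = [superstep(s, step) for s in states]   # local phase
        # ---- barrier: all processes finished `step`, no messages in flight
        if (step + 1) % every == 0:
            store[step + 1] = pickle.dumps(states)      # consistent snapshot
    return states, store

inc = lambda s, step: s + step          # toy superstep: add the step index
final, store = run_bsp([0, 0], inc, 0, 5)
# Simulate a crash after superstep 5: roll back to the latest checkpoint
# (the barrier after superstep 4) and replay the remaining supersteps.
last_ckpt = max(store)
recovered, _ = run_bsp(pickle.loads(store[last_ckpt]), inc, last_ckpt, 5)
assert recovered == final               # rollback reproduces the same state
```

The trade-off is checkpoint frequency: snapshotting at every barrier minimizes lost work on failure, while a larger `every` reduces steady-state overhead.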

13 Architecture: Intra-Cluster
LRM - Local Resource Manager
GRM - Global Resource Manager

14 Architecture: Intra-Cluster
LUPA - Local Usage Pattern Analyzer
GUPA - Global Usage Pattern Analyzer

15 Architecture: Intra-Cluster
NCC - Node Control Center
ASCT - Application Submission and Control Tool

16 Architecture: Inter-Cluster

17 Related Work
Our work is influenced by 5 systems:
–Globus, Legion, Condor, and 2K
Condor (U. of Wisconsin-Madison)
–A pioneer system (started in the late 80s)
–A “hunter” of idle workstations on local networks
–Condor-G interfaces with Globus for integration with wide-area grids
–Support for parallel applications is limited
–We could not get its source code
Globus (Argonne National Labs / U. of Chicago / USC)
–Does not focus on QoS-preserving utilization of desktop machines
–Not object-oriented
–InteGrade uses CORBA and OO design

18 Related Work (continued)
Legion (U. of Virginia)
–Proprietary distributed object model
–InteGrade has a deeper focus on idle resource management and desktop machines
(U. of California, Berkeley)
–Hard-coded application
–No communication between application nodes
BOINC (U. of California, Berkeley)
–Limited support for parallel applications

19 Related Work (continued)
2K (U. of Illinois at Urbana-Champaign)
–A CORBA-based distributed operating system
–Does not focus on grid computing or parallel applications
–Provided a proof-of-concept prototype for some of the protocols we are using in InteGrade

20 Implementation Status
Already implemented:
–Intra-Cluster Information Update Protocol
–Intra-Cluster Execution Protocol (sequential and parametric applications)
Software used:
–GRM: Java using JacORB
–LRM: C++ using O2
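A highly simplified sketch of the two intra-cluster protocols (in Python for brevity; the real GRM and LRM are CORBA objects written in Java and C++, and the load metric and method names below are assumptions):

```python
class GRM:
    """Global Resource Manager: caches the latest report from each LRM
    and answers an execution request with a lightly loaded node."""
    def __init__(self):
        self.reports = {}                  # node name -> latest load (0..1)

    def update(self, node, load):          # Information Update Protocol
        self.reports[node] = load

    def schedule(self):                    # Execution Protocol
        if not self.reports:
            raise RuntimeError("no LRMs registered")
        return min(self.reports, key=self.reports.get)

class LRM:
    """Local Resource Manager: periodically reports its node's load."""
    def __init__(self, node, grm):
        self.node, self.grm = node, grm

    def report(self, load):
        self.grm.update(self.node, load)

grm = GRM()
for name, load in [("node1", 0.8), ("node2", 0.1), ("node3", 0.5)]:
    LRM(name, grm).report(load)
target = grm.schedule()                    # the least-loaded node wins
```

In the actual architecture the update is a remote CORBA invocation sent at intervals, and the GRM's view is only approximate, which is exactly why the usage-pattern hints from LUPA/GUPA are valuable.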

21 Implementation Status: ClusterView

22 Ongoing Sub-projects
Refinements and extensions to the architecture and core software infrastructure
Initial support for parallel applications
Network discovery and monitoring
User usage pattern collection and analysis
Global, wide-area scheduling
Migration and mobile agents
Lightweight middleware
Autonomic computing
–self-awareness, self-healing, self-adaptation
Security and privacy

23 Project Information
Source code available at FAPESP’s Incubadora (anonymous CVS checkout and Web front end)
Increasing number of students working on the project
Initial beta version expected by the end of 2003 (alpha version already up and running)