Grand Large, ACI MD GdX. INRIA meeting with NII, October 7, 2003. Grid eXplorer (GdX): A research tool for exploring Grid/P2P issues. Franck Cappello, INRIA.

Presentation transcript:

Grid eXplorer (GdX)
A research tool for exploring Grid/P2P issues
Research Program ACI “Data Mass”
INRIA meeting with NII, October 7, 2003
Franck Cappello, INRIA Grand-Large, LRI, University of Paris-Sud
Pierre Sens, INRIA Regal, LIP6, University of Paris 6
Pascale Primet, INRIA RESO, LIP, ENS de Lyon
Olivier Richard, INRIA Apache, ID, IMAG
Christophe Cérin, LaRIA, University of Amiens

Outline
– Motivating a large scale instrument for Grid
– A large scale instrument for exploring Grid issues in reproducible experimental conditions
– Concluding remarks

Open issues in Grid/P2P
– Security
– Data storage/consultation/movement
– Multi-user/multi-application scheduling
– Coordination (virtual, ephemeral infrastructure)
– Programming
– Fault tolerance!
– Scalability
– Performance
– Easy/efficient deployment techniques
– Application characterization techniques
– Etc.

Fundamental components of Grids
Systems
– nodes, OS
– distributed systems mechanisms (resource discovery, storage, scheduling, etc.)
– middleware, runtimes
– Fault (crash, transient)
– Workload (multiple users/multiple applications)
– Heterogeneity (resource diversity, performance)
– Malicious users/behaviors
Networks
– routers, links, topology
– protocols
– Theoretical features: synchronous, pseudo synchronous or asynchronous
– Disconnection
– Packet loss
– Congestion
Slide labels: Static, Dyn.

What are the current approaches for studying Systems and Networks?
Theoretical models:
– Scheduling, load balancing, performance, etc.
– Congestion, routing, packet loss, topology, traffic, etc.
→ Difficult to model dynamic behaviors and system complexity
Simulators:
– SimGrid, SimGrid2, GridSim, MicroGrid, Bricks
– NS, NS2, Cnet, Real, etc.
→ Strong limitations (scalability, differences from the execution of real codes, validation)
Experimental testbeds:
– For Grids: most testbeds are for production, and each testbed is specific
– Long tradition in networks (Arpanet, Magic, Geant, Renater, VTHD)
→ Shared testbeds are not fully decoupled from the production network (cost), experimental conditions are difficult to reproduce, representativeness?
We have no way to test ideas a) independently, at a significant scale, b) with realistic parameters and behaviors!

Case Study 1: XtremWeb-Auger
Understanding the origin of very high cosmic rays:
– Aires: Air Showers Extended Simulation
– Sequential, Monte Carlo. Time for a run: 5 to 10 hours (500 MHz PC)
– Estimated PC number: ~5000
– Trivial parallelism, Master-Worker paradigm
Architecture diagram: PC client, XtremWeb server, PC workers running Aires (air shower), connected through Internet and LAN; air shower parameter database (Lyon, France)
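The Master-Worker paradigm named on this slide can be sketched in a few lines. This is only an illustration, not XtremWeb code: `simulate_shower` is a hypothetical stand-in for one sequential Monte Carlo run of Aires, and a local thread pool plays the role of the remote PC workers.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_shower(seed: int) -> float:
    """Hypothetical stand-in for one sequential Monte Carlo run (not Aires)."""
    rng = random.Random(seed)  # one generator per task keeps each run reproducible
    samples = [rng.random() for _ in range(1000)]
    return sum(samples) / len(samples)


def master(seeds, n_workers: int = 4):
    """Master: hand independent tasks to a pool of workers, collect results in task order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(simulate_shower, seeds))


results = master(range(8))
print(len(results), "independent runs completed")
```

Because the tasks share no state, the master needs no coordination beyond dispatching inputs and gathering outputs, which is what makes the parallelism "trivial".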

Case Study 1: XtremWeb-Auger
Application: AIRES (Auger)
Deployment:
– Coordinator at LRI
– Madison: 700 workers, Pentium III, Linux (500 MHz + 933 MHz) (Condor pool)
– Grenoble Icluster: 146 workers (733 MHz), PBS
– LRI: 100 workers, Pentium III, Athlon, Linux (500 MHz, 733 MHz, 1.5 GHz) (Condor pool)
Deployment diagram: XW Client and XW Coordinator (lri.fr), LRI Condor pool and other labs on the U-psud network, connected through the Internet to Icluster Grenoble (PBS) and Madison, Wisconsin (Condor)

Case Study 1: XtremWeb-Auger
No way to reproduce the same experimental conditions (the configuration BUT ALSO the dynamics of the system).
How, then, to compare fundamental mechanisms (scheduling) at large scale ( nodes)?
Plot label: 3000 tasks

Case study 2: MPICH-V: Fault tolerant MPI for the Grid
Programmer’s view unchanged: PC client MPI_send() → PC client MPI_recv()
Architecture diagram: nodes, network, dispatcher, Channel Memory, checkpoint server
Problems:
1) volatile nodes (any number at any time)
2) firewalls (PC Grids)
3) non-named receptions (→ they must be replayed in the same order as in the previous, failed execution)
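Problem 3 can be illustrated with a toy model. The sketch below is not MPICH-V's implementation: it assumes that the first execution logs the sender of each reception, and that a re-execution may see the same messages arrive in a different interleaving. The function names and the message tuples are hypothetical, and the replay simply buffers all messages before delivering them.

```python
from collections import deque


def record_run(arrivals):
    """First execution: deliver messages as they arrive and log the sender order."""
    log = [sender for sender, _ in arrivals]
    delivered = [payload for _, payload in arrivals]
    return log, delivered


def replay_run(arrivals, log):
    """Re-execution: buffer messages per sender, then deliver them in the logged order."""
    pending = {}
    for sender, payload in arrivals:
        pending.setdefault(sender, deque()).append(payload)
    return [pending[sender].popleft() for sender in log]


# The same messages, with a different interleaving between senders 1 and 2.
first_arrivals = [(2, "c0"), (1, "a0"), (2, "c1"), (1, "a1")]
second_arrivals = [(1, "a0"), (1, "a1"), (2, "c0"), (2, "c1")]

log, first_delivery = record_run(first_arrivals)
replayed_delivery = replay_run(second_arrivals, log)
print(first_delivery == replayed_delivery)
```

The sketch relies on messages from a single sender keeping their relative order, so only the interleaving between senders has to be logged.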

Case study 2: MPICH-V: Fault tolerant MPI for the Grid
Icluster-Imag: 216 PIII 733 MHz, 256 MB/node
– 5 subsystems with 32 to 48 nodes, 100BaseT switch
– 1 Gb/s switch mesh between subsystems
– Linux, PGI Fortran or GCC compiler
Diagram labels: ~4.8 Gb/s, ~1 Gb/s
→ Very close to a typical building LAN
→ Simulate node volatility
Very close to the LRI network!

Case study 2: MPICH-V: Fault tolerant MPI for the Grid
MPICH-V vs. MPICH-P4: execution time with faults (fault injection)
Plot: total execution time (sec.) against number of faults during execution; curves: MPICH-V (CM but no logs), MPICH-V (CM with logs), MPICH-V (CM+CS+ckpt), MPICH-P4; annotations: base exec. without ckpt. and fault, ~1 fault/110 sec.
Interesting but, what about MPICH-V on nodes?

New generation of research tools
Emulab/NIST Net/Modelnet
– Network emulator with actual PCs + router emulator (dummynet)
– Reproducible experimental conditions
– Real applications and protocols
PlanetLab
– A real platform distributed over the Internet (planet wide)
– Real life conditions (not reproducible)
– Real applications and protocols
WANiLab
– Cluster with real routers
– Reproducible experimental conditions
– Real applications and protocols
Spectrum diagram (axis: log(abstraction); categories: math, simulation, emulation, live network, WANiLab): NS, SSFNet, QualNet, JavaSim, Mathis formula, DummyNet, EmuLab, ModelNet, WAIL, HENP, Abilene, CalREN, PlanetLab, CAIRN, NLR

Experiments range demand for Grid/P2P: “GRIDinLAB”
Virtual Grids (1 Grid node on 1 PC):
– Emulation of a Grid/P2P system at 1:1 scale
– Execute real applications on Grid nodes (possibly slowing down the CPU to emulate heterogeneity)
– Use actual routers or emulators (Dummynet)
– Inject traces (congestion, workload, fault)
Emulation of Grids (10 or 100 Grid nodes on 1 PC):
– Emulation of a Grid/P2P system at 10/100:1 scale
– Execute real applications or the core of applications
– Use network emulators
– Inject synthetic traces
Large scale simulation of Grids (1000 Grid nodes on 1 PC):
– Simulation of large Grid/P2P systems at 1000:1 scale
– Simulate the application
– Simulate the network
– Simulate dynamic conditions
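To make the three scales concrete, here is a small back-of-the-envelope calculation. It assumes the 1K CPU cluster mentioned in the project description and, as a simplification, one CPU per PC; the dictionary keys and the function name are only illustrative.

```python
CLUSTER_PCS = 1000  # "1K CPU cluster" from the project description; assumes 1 CPU per PC

# Grid nodes hosted per PC for each experiment type listed on the slide.
GRID_NODES_PER_PC = {
    "virtual grid (1:1)": 1,
    "emulation (10:1)": 10,
    "emulation (100:1)": 100,
    "large scale simulation (1000:1)": 1000,
}


def emulated_nodes(pcs: int, nodes_per_pc: int) -> int:
    """Number of Grid nodes that can be represented at a given scale."""
    return pcs * nodes_per_pc


for mode, ratio in GRID_NODES_PER_PC.items():
    print(f"{mode}: {emulated_nodes(CLUSTER_PCS, ratio):,} Grid nodes")
```

Under these assumptions the same cluster would cover systems from a thousand to a million Grid nodes, with realism decreasing as the ratio grows.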

“GRIDinLAB” in the methodology spectrum
Spectrum diagram (axes: log(abstraction), log(cost); categories: math, simulation, emulation, live network, WANiLab): NS, SSFNet, QualNet, JavaSim, Mathis formula, Optimization, Linear model, Nonlinear model, Stochastic model, DummyNet, EmuLab, ModelNet, WAIL, HENP, Abilene, CalREN, PlanetLab, CAIRN, NLR, Real WAN routers
Added on the spectrum: Emulation driven simulation, “GRIDinLAB”, Large scale simulation, Grid eXplorer
Credits: WANiLAB

Outline
– Motivating a large scale instrument for Grid
– A large scale instrument for exploring Grid issues in reproducible experimental conditions
– Concluding remarks

Grid eXplorer
A “GRIDinLAB” instrument for CS researchers (not a production facility)
For:
– the Grid/P2P researcher community
– the Network researcher community
→ Addressing specific issues of each domain
→ Enabling research studies combining the 2 domains
→ Easing and developing collaborations between the two communities

Grid eXplorer
– An experimental conditions database or generation
– An experimental platform: a cluster + HP network + software
– A tool set for conducting experiments & measurements, result analysis

Grid eXplorer: the big picture
– A set of sensors
– An experimental conditions database
– Emulator core: hardware + software for emulation and simulation
– A set of tools for analysis
– Validation on real life testbed

Grid eXplorer (GdX) research project
1) Build the instrument: “Design and develop, for the community of Computer Science Researchers, an emulation platform for Large Scale Distributed Systems (Grid, P2P and other distributed systems)”
2) Use the instrument for “a set of research experiments investigating the impact of Large Scale in distributed systems and especially related to large data sets (security, reliability, performance).”
– 1K CPU cluster (maybe only 500, depending on the budget)
– Configurable network (Ethernet, Myrinet, others?)
– Configurable OS (kernel, distribution, etc.)
– A set of emulation/simulation tools (existing + new ones)
– Multi-user
– Located at and managed by IDRIS

Laboratories involved in GdX: 13 labs
– IMAG, ID (UMR 5132), Laboratoire d’Informatique et Distribution, Université de Grenoble
– LaRIA (UPRES EA 2083), Laboratoire de Recherche en Informatique d’Amiens, Université de Picardie Jules Verne
– LRI (UMR 8623), Laboratoire de Recherche en Informatique, Université de Paris-sud
– LAAS-CNRS (UPR 8001), Laboratoire d'Analyse et d'Architecture des Systèmes
– LORIA (UMR 7503), Laboratoire lorrain de recherche en informatique et ses applications
– LIP-ENS Lyon (UMR 5668), Laboratoire de l'Informatique du Parallélisme
– LIFL (ESA 8022), Laboratoire d’Informatique Fondamentale de Lille
– INRIA Sophia Antipolis, UNSA, I3S-CNRS
– LIP6 (UMR 7606), Laboratoire d'Informatique de Paris 6
– LABRI (UMR 5800), Laboratoire Bordelais de Recherche en Informatique
– IBCP (UMR 5086), Institut de Biologie et Chimie des Protéines
– CEA, Direction des Technologies de l'Information (Saclay)
– IRISA, Institut de Recherche en Informatique et Systèmes Aléatoires

84 members:
Alain Lecluse (IBCP), Alexandre Genoud (OASIS project, INRIA Sophia Antipolis), Antoine Vernois (IBCP), Arnaud Contes (OASIS project, INRIA Sophia Antipolis), Aurélien Bouteiller (LRI), Bénedicte Legrand (LIP6), Brice Goglin (PhD student, INRIA LIP RESO), Brigitte Rozoy (LRI), Cécile Germain (LRI), Christophe Blanchet (IBCP), Christophe Cérin (Amiens, Laria), Christophe Chassot (LAAS-ENSICA), Colette Johnen (LRI), CongDuc Pham (LIP), Cyril Randriamaro (LaRIA), Denis Caromel (OASIS project, INRIA Sophia Antipolis), Eddy Caron (LIP/ENS Lyon), Emmanuel Jeannot (Loria), Eric Totel (Supélec Rennes), Fabrice Huet (OASIS project, INRIA Sophia Antipolis), Faycal Bouhaf (DEA student, INRIA LIP RESO), Franck Cappello (LRI), Françoise Baude (OASIS project, INRIA Sophia Antipolis), Frédéric Desprez (LIP/INRIA Rhône-Alpes), Frédéric Magniette (LRI), Gabriel Antoniu (IRISA/INRIA Rennes), George Bosilca (LRI), Georges Da Costa (ID-IMAG), Géraud Krawezik (LRI), Gil Utard (LaRIA), Gilles Fedak (LRI), Grégory Mounié (ID-IMAG), Guillaume Auriol (LAAS-ENSICA), Guillaume Mercier (LaBRI), Guy Bergère (LIFL, GrandLarge INRIA Futur), Haiwu He (LIFL, GrandLarge INRIA Futur), Isaac Scherson (LIFL, GrandLarge, INRIA Futur), Jens Gustedt (LORIA & INRIA Lorraine), Joffroy Beauquier (LRI), Johanne Cohen (Loria), Kavé Salamatian (LIP6), Lamine Aouad (LIFL, GrandLarge, INRIA Futur), Laurent Baduel (OASIS project, INRIA Sophia Antipolis), Laurent Dairaine (LAAS), Luc Bougé (IRISA/ENS Cachan Antenne de Bretagne), Luciana Arantes (LIP6), Ludovic Mé (Supélec Rennes), Luis Angelo Estefanel (ID-IMAG), Marin Bertier (LIP6), Mathieu Goutelle (KIP), Mathieu Jan (IRISA), Michel Diaz (LAAS-ENSICA), Michel Koskas (Amiens, Laria), Nicolas Lacorne (IBCP), Nicolas Larrieu (LAAS-ENSICA), Nicolas Viovy (CEA-DSM-LSCE), Oleg Lodygensky (LRI), Olivier Richard (ID-IMAG), Olivier Soyez (LaRIA), Pascal Berthou (LAAS-ENSICA), Pascale Primet (LIP), Pascale Vicat-Blanc Primet (INRIA LIP RESO), Patrick Sénac (LAAS-ENSICA), Philippe d'Anfray (CEA-DTI/SISC), Philippe Gauron (LRI), Philippe Owezarski (LAAS), Pierre Fraigniaud (LRI), Pierre Lemarini (LRI), Pierre Sens (LIP6 / INRIA), Pierre-André Wacrenier (LaBRI), Raymond Namyst (LaBRI), Samir Djilali (LRI), Sébastien Tixeuil (LRI), Serge Petiton (LIFL, GrandLarge INRIA Futur), Stéphane Vialle (Supélec), Tanguy Pérennou (LAAS), Thierry Gayraud (LAAS-ENSICA), Thierry Priol (IRISA), Thomas Hérault (LRI), Timur Friedman (LIP6), Vincent Danjean (LaBRI), Vincent Néri (LRI)

Research Topics
The 4 research topics and their leaders:
– Infrastructure (hardware + system): Olivier Richard (ID-IMAG, Grenoble)
– Emulation: Pierre Sens (LIP6, Paris 6)
– Network: Pascale Primet (LIP, Lyon)
– Applications: Christophe Cérin (Laria, Amiens)

Experiments against research topics (Infrastructure, Emulation, Network, Application); one X per topic marked on the slide
I.1 Platform: XXXX
I.2 Virtual Grid: XX
I.3 Virt. Techniques: XX
I.4 Emul driven Simul: X
I.5 Network: XXX
I.6 Heterogeneity emul: X
I.7 Communication: X
I.8 Internet Emul.: XXX
II.1 Engineering tech.: XXX
II.2 Mobile objects: XX
II.3 Fault tolerance: XX
II.4 DHT: X
II.5 Data base: XX
II.6 Scheduling: XX
II.7 Comm. Optimizat.: X
II.8 Data sharing: XX
II.9 Uni and multicast: XX
II.10 Cellul. automaton: XX
II.11 Bioinformatics: X
II.12 P2P storage: XX
II.13 Reliability: XXX
II.14 Security: XXX
II.15 NG. Internet: XXX
II.16 Grid coupled sys.: X

Grid eXplorer (GdX) current status
First stage: building the instrument
– First GdX meeting was on September 16
– Hardware design meeting planned for October 15
– Hardware selection meeting on November 8
– Choosing the nodes (single or dual?)
– Choosing the CPU (Intel IA 32, IA64, Athlon 64, etc.)
– Choosing the experimental network (Myrinet, Ethernet, Infiniband, etc.)
– Choosing the general experiment production architecture (parallel OS architecture, user access, batch scheduler, result repository)
– Choosing the experimental database hardware
– Etc.

Example: Nearest Neighbor Scheduling (distribute a task set among a large number of nodes)
Self-stabilizing algorithm (sand heap) in three phases:
– Negotiation
– Distribution
– Execution
Negotiation rule (anticlockwise, between neighbors X and Y): if $X > Y$, then $X \leftarrow X - \frac{X-Y}{2}$ and $Y \leftarrow Y + \frac{X-Y}{2}$
Distribution rule: tasks and parameters follow the negotiation route, immediately
Execution rule: execution starts immediately when a local load balance is reached
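The negotiation rule can be tried out on a small ring of nodes. The sketch below is an illustration under stated assumptions rather than the simulator used by the authors: X and Y are taken to be the task counts of two neighbors, node `i + 1` is taken to be the anticlockwise neighbor of node `i`, and the halving uses integer division because tasks are indivisible. The function names are hypothetical.

```python
def negotiate(x: int, y: int):
    """Negotiation rule: if x > y, move half of the difference from x to y."""
    if x > y:
        moved = (x - y) // 2  # integer halving: tasks are assumed indivisible
        return x - moved, y + moved
    return x, y


def balance_ring(loads, max_sweeps: int = 10_000):
    """Sweep the ring until no pair of neighbors changes (local load balance)."""
    loads = list(loads)
    n = len(loads)
    for sweep in range(max_sweeps):
        changed = False
        for i in range(n):
            j = (i + 1) % n  # assumed anticlockwise neighbor of node i
            new_i, new_j = negotiate(loads[i], loads[j])
            if (new_i, new_j) != (loads[i], loads[j]):
                loads[i], loads[j] = new_i, new_j
                changed = True
        if not changed:
            return loads, sweep  # number of sweeps that still moved tasks
    return loads, max_sweeps


initial = [100, 0, 0, 0, 0, 0, 0, 0]  # all tasks start on one node
final, sweeps = balance_ring(initial)
print(final, "stable after", sweeps, "sweeps")
```

The rule only guarantees a local balance: once stable, no node holds more than one task above its anticlockwise neighbor, while the total number of tasks is conserved.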

Simulation/Emulation tools

Nearest Neighbor Scheduling with a 3D visualization tool
10K tasks on 900 nodes in a mesh
– Negotiation (red movie)
– Distribution (blue movie)
– Execution (green movie)
Observation results:
– Symmetry for the negotiation phase
– Asymmetry for the distribution and execution phases
– Waves
Several hours to get 1 movie → parallel simulation is required!

Outline
– Motivating a large scale instrument for Grid
– A large scale instrument for exploring Grid issues in reproducible experimental conditions
– Concluding remarks

Grid eXplorer (GdX)
– A long term effort
→ A medium term milestone: 2 years for a fully functional prototype
– For Grid and Network researcher communities
– Many scientific issues (large scale emulation, experimental conditions injection, distance to reality, etc.)
– A cluster of 1K CPU + experimental condition database + simulators/emulators + visualization tools