Published by William Holmes. Modified over 9 years ago.
1
Grid’5000 / GdX: Grid'5000 and Grid eXplorer
Large Scale Experimental Grids: Grid’5000 & Grid eXplorer
Franck Cappello, INRIA, fci@lri.fr
2
Grid experimental platforms: rationale
Grid computing raises many research issues: security, performance, fault tolerance, scalability, load balancing, coordination, message passing, data storage, programming, communication protocols and architecture, deployment, etc.
- Theoretical models and simulators cannot capture real-life conditions.
- Production platforms have great difficulty reproducing experimental conditions.
How can we test and compare:
- Fault tolerance protocols
- Security mechanisms
- Deployment tools
- etc.
3
Tools for Distributed System Studies
To investigate distributed system issues, we need:
1) Tools (models, simulators, emulators, experimental platforms)
2) Strong interaction between these research tools
[Figure: tools for large scale distributed systems, plotted as log(cost) against log(realism)]
- math: models of systems, applications, platforms, and conditions
- simulation: key system mechanisms, algorithms and application kernels, virtual platforms, synthetic conditions
- emulation: real systems, real applications, "in-lab" platforms, synthetic conditions
- live systems: real systems, real applications, real platforms, real conditions
4
We need a Grid experimental platform
[Figure: existing tools plotted as log(cost) against log(realism)]
- math: model, protocol proof
- simulation: SimGrid, MicroGrid, Bricks, NS, etc.
- emulation: Grid eXplorer, WANinLab, Emulab
- live systems: Grid’5000, TERAGrid, PlanetLab, Naregi Testbed
To our current knowledge, there is no large scale testbed dedicated to Grid experiments.
- Grid’5000 as a live system
- Grid eXplorer as a large scale emulator
5
Fundamental components of Grids
Systems
– Nodes, OS
– Distributed systems mechanisms (resource discovery, storage, scheduling, etc.)
– Middleware, runtimes
– Faults (crash, transient)
– Workload (multiple users/multiple applications)
– Heterogeneity (resource diversity, performance)
– Malicious users/behaviors
Networks
– Routers, links, topology
– Protocols
– Theoretical features: synchronous, pseudo-synchronous or asynchronous
– Disconnection
– Packet loss
– Congestion
Items are labeled on the slide as static or dynamic (Dyn.).
6
What do we need for Grid experiments?
1) Remotely controllable Grid nodes installed in geographically distributed laboratories
2) A "controllable" and "monitorable" network between the Grid nodes
3) A middleware infrastructure connecting the nodes (security)
4) A playground to prepare experiments
5) A toolkit to deploy, run, monitor, and control experiments and to collect results
We need these components for nationwide experimental platforms and emulators.
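As a rough illustration of item 5 above, the deploy, run, and collect steps, with a log standing in for monitoring, can be sketched in Python. Everything here is hypothetical: the `Experiment` record, the `run_experiment` function, and the node names are invented for this sketch and do not correspond to any actual Grid’5000 tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Experiment:
    """Hypothetical record of one experiment on a set of Grid nodes."""
    name: str
    nodes: List[str]
    log: List[str] = field(default_factory=list)          # monitoring trace
    results: Dict[str, int] = field(default_factory=dict)  # collected results


def run_experiment(exp: Experiment, task: Callable[[str], int]) -> Experiment:
    """Deploy on every node, then run the task and collect its result."""
    for node in exp.nodes:
        exp.log.append(f"deploy {node}")   # deploy the environment
    for node in exp.nodes:
        exp.log.append(f"run {node}")      # run and monitor
        exp.results[node] = task(node)     # collect the result
    exp.log.append(f"collected {len(exp.results)} results")
    return exp


# Placeholder workload: the "result" is just the length of the node name.
exp = run_experiment(
    Experiment("demo", ["orsay-1", "lyon-1", "rennes-1"]),
    task=lambda node: len(node),
)
print(exp.results)
print(exp.log[-1])
```

A real toolkit would replace the placeholder task with remote execution and would handle node failures; the sketch only shows the order of the steps.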
7
Grid’5000: Grid experiments under real life conditions
Steering Committee (organizer: Franck Cappello, Orsay):
- Thierry Priol (ACI Grid Director)
- Brigitte Plateau (President of ACI Grid SC)
- Dani Vandrome (Director of Renater)
- Frédéric Desprez (Lyon)
- Michel Daydé (Toulouse)
- Yvon Jégou (Rennes)
- Stéphane Lantéri (Sophia)
- Raymond Namyst (Bordeaux)
- Pascale Primet (Lyon)
- Olivier Richard (Grenoble)
Technical Committee:
- David Gueldrech (Sophia)
- Jean Claude Barbet (Orsay)
- Franck Bonnassieux (UREC)
- Julien le duc (Grenoble)
- Fred Desprez (Lyon)
- Yvon Jégou (Rennes)
- Olivier Coulaud (Bordeaux)
- Frédéric Barbaresco (Toulouse)
Forums:
- Deployment/exploitation: Franck Cappello (AS1, RTP8)
- Programming models: Raymond Namyst (AS2, RTP8)
8
The Grid’5000 Project
1) Build a nationwide experimental platform for Grid research (like a particle accelerator for computer scientists)
- 10/11 geographically distributed sites
- Every site hosts a cluster (from 256 CPUs to 1K CPUs)
- All sites are connected by RENATER (the French research and education network)
- RENATER hosts probes to trace network load conditions
- Design and develop a system/middleware environment to safely test and repeat experiments
2) Use the platform for Grid experiments in real-life conditions
- Address critical issues of Grid system/middleware: programming, scalability, fault tolerance, scheduling
- Address critical issues of Grid networking: high performance transport protocols, QoS
- Port and test applications
- Investigate original mechanisms: P2P resource discovery, Desktop Grids
9
Grid’5000 Big Picture
[Figure: three sites (Site 1, Site 2, Site 3) and a control site. Surviving labels: lab's network; lab firewall/router; firewall/NAT; front end; test cluster; control master; control slave; gateway + VPN (192. for all nodes); users (ssh login + password); "One machine"; "Can be seen as a Virtual Grid"; "Gateway".]
10
Grid’5000 Schedule
Milestones:
- Sept 03: Call for proposals
- Nov 03: Selection of 7 sites
- Jan 04: ACI GRID funding
- March 04: Call for Expression Of Interest
- Jun/July 04: Vendor selection
- Sept 04: Installation, first tests
- Oct 04: Final review
- Nov 04: First demo (SC04)
Activities on the timeline:
- Grid’5000 Hardware
- Grid’5000 System/middleware Forum
- Security prototypes (not full size)
- Control prototypes (not full size)
- Grid’5000 Programming Forum
- Grid’5000 Builder Community
- Grid’5000 Experiments
11
Grid’5000 in November 2004
(Sorry, we cannot give hardware details yet.)
[Figure: map of Grid’5000 nodes and Grid eXplorer. Surviving labels: "(soon 4)", "3", "Pau".]
12
Grid 5k + GdX Funding (ACI Grid + ACI DM + regional funding)
[Figure: map of funding per site. Labels: Grid’5000 (ACI Grid), Grid eXplorer (ACI Data Masses), Pau. Amounts shown: 0,6M€, ~0,4€, ~0,35€, ~0,5€, ~0,3?€, ~0,35€.]
~3M€ for hardware only (may increase)
13
Summary of Grid’5000 experiments by Grid’5000 members
Networking
– End host communication layer
– High performance long distance protocols
– High speed network emulation
Middleware / OS
– Grid’5000 control/access
– Grid’5000 experiment automation
– Scheduling / data distribution in Grid
– Fault tolerance in Grid
– Resource management
– Grid SSI OS and Grid I/O
– Desktop Grid/P2P systems
Programming
– Component programming for the Grid (Java, Corba)
– GRID-RPC
– GRID-MPI
– Code coupling
Applications
– Multi-parametric applications (climate modeling, functional genomics)
– Large scale experimentation of distributed applications (electromagnetism, multi-material fluid mechanics, parallel optimization algorithms, CFD, astrophysics)
– Medical images, collaborative tools in virtual 3D environments
14
Middleware 1 (XP) Grid5000
XP: eXPeriments on Grid5000
Grid’5000 control
- Computing environment deployment (Ka-tools)
- Experiment automation (security and control)
- VGrid: "mapping a virtual Grid on a real testbed"
- Monitoring, benchmarking, performance characterization and analysis
Scheduling / distribution
- Scheduling: data transfers, global communications, work stealing, ...
- Data redistribution in Grid
- Task distribution and load balancing in heterogeneous Grid
- Mixed parallelism (task and data parallelism)
- Mixing data management and task scheduling
- Hierarchical and distributed scheduling
Fault tolerance
- Fault tolerant Grid-RPC (RPC-V)
- Hierarchical fault tolerant MPI (MPICH-V)
- Fault tolerance in a data-flow approach (Athapascan)
15
Middleware 2 (XP) Grid5000
Management
- AROMA tool: resource management over a Grid of clusters with different classes of service
- Mobile agents for open Grid management
- Management of Grids and hosted services (security, QoS, monitoring & control, dynamic configuration, …)
- Optimization for wide area distributed query processing
- Virtualization of data storage on Grids
- Automatic deployment of the GridRPC middle tier
- Multi-cluster and lightweight Grid resource management (OAR/CIGRI)
Global Computing/P2P Middleware
- Executing Web Services on Desktop Grid workers (XtremWeb)
- Distributing the coordination in Desktop Grids (XtremWeb)
- Harnessing clusters as parallel workers
- Probabilistic certification in peer-to-peer systems
- Large scale data sharing service based on JXTA (JuxMem)
- Experimenting with management services for textual documents in P2P systems
Grid SSI OS and Grid I/O
- Grid file system (NFSG)
- Grid-aware OS (Kerrighed)
- Coupling Computational Grid with Reality Center
16
Network (XP) Grid5000
End host communication layer
- Intelligent usage of NICs for local and wide area communications
- Direct file access over Myrinet: ORFA/NFS and ORFA/LUSTRE
High performance long distance protocols
- Alternative transport for very high speed networks (backpressure)
- Differentiated transport with delay control on WAN
- Reliable active and non-active multicast
- Network bandwidth optimization in Grid (VTHD++, Paco++)
High speed network emulation
- Automatic deployment of emulated high speed domains
- Experiment design for grid flow interaction studies
Grid networking layer
- Network resource and QoS on demand
- Grid overlay and programmable routers
- Measurement services for network-aware middleware
17
Programming (XP) Grid5000
Component programming on the grid
- ProActive: a Java library (parallel, distributed, concurrent computing with security and mobility)
- Assessment of scalability, deployment, security and fault tolerance issues
- Hierarchical component architecture
- PadicoTM/Paco++: combining parallel and distributed computing
RPC environment
- Large scale experimentation of the DIET platform (Distributed Interactive Engineering Toolbox)
- Client/Agent/Server model following the GridRPC standard with distributed scheduling agents
MPI environment
- Time sharing of Grid resources
- Migration over clusters with heterogeneous high speed networks
Code coupling
- Application coupling with Athapascan
- Communication / method invocation rescheduling into ORB (HOMA)
- Fluid transfer simulation and geological code with PadicoTM/Paco++
18
Applications (XP) Grid5000
Multi-parametric applications
- ACI GRID-TLSE Project: expertise site for sparse linear algebra
- Climate modeling and global change
- DataGène Project: functional genomics
Large scale experimentation of distributed message passing applications
– JECS: a Java Environment for Computational Steering
  - Distributed computing and interactive visualization of 3D numerical simulations (Caiman and Oasis project-teams)
  - Collaborative environment
  - Computational electromagnetism application (JEM3D)
– MECAGRID (ACI GRID project, Smash project-team)
  - Massively parallel computations in multi-material fluid mechanics
  - Study of numerical algorithms for heterogeneous computing platforms
– Grid computing for medical applications (Epidaure project-team)
  - Interoperable medical image registration grid service
– Optimal design of complex systems (Coprin project-team)
  - Evaluation of parallel optimization algorithms based on interval analysis techniques
  - Study of load balancing strategies on heterogeneous resources
+ CFD, astrophysics, … applications
+ Collaborative tools in virtual 3D environments
19
Other experiments on Grid’5000
- Grid’5000 will be opened to other French Grid researchers (most likely through a selection procedure).
- Grid’5000 will be connected to the EU CoreGrid testbed and may be used as an experimental platform by CoreGrid researchers (again through some form of selection procedure).
20
Grid’5000 + Grid eXplorer: combining two Grid research instruments
[Figure: log(cost) against log(realism), from math through simulation and emulation to live systems]
- math: model, protocol proof
- simulation: SimGrid, MicroGrid, Bricks, NS, etc.
- emulation: Grid eXplorer
- live systems: Grid’5000
Other labels: "Relax Real Life Conditions"; "Relax Conditions Reproducibility"
21
Grid eXplorer: Grid experiments under synthetic reproducible conditions
1) Build the instrument:
- 1K CPU cluster (maybe only 500, depending on the budget)
- Configurable network (Ethernet, Myrinet, others?)
- Configurable OS (kernel, distribution, etc.)
- A set of emulation/simulation tools (existing + new ones)
- Multi-user
- Located at and managed by IDRIS
2) Study the impact of scale in Grid/P2P systems
- Address critical issues of Grid system/middleware: programming, scalability, fault tolerance, scheduling
- Address critical issues of Grid networking: high performance transport protocols, QoS
- Port and test applications
- Investigate original mechanisms: P2P resource discovery, Desktop Grids
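To make "synthetic reproducible conditions" concrete, here is a minimal Python sketch of a seeded packet-loss model. It is not how Grid eXplorer works; the `emulate_link` function and its parameters are assumptions chosen only to show that a fixed seed yields the same synthetic conditions on every run.

```python
import random
from typing import List


def emulate_link(num_packets: int, loss_rate: float, seed: int) -> List[bool]:
    """Return one delivered (True) or lost (False) flag per packet.

    A private random.Random(seed) makes the loss pattern repeatable:
    the same seed always produces the same trace.
    """
    rng = random.Random(seed)
    return [rng.random() >= loss_rate for _ in range(num_packets)]


first = emulate_link(num_packets=1000, loss_rate=0.05, seed=42)
second = emulate_link(num_packets=1000, loss_rate=0.05, seed=42)
other = emulate_link(num_packets=1000, loss_rate=0.05, seed=7)

print("same seed gives the same trace:", first == second)
print("delivered with seed 42:", sum(first))
print("delivered with seed 7:", sum(other))
```

Replaying an experiment with the same seed reproduces the loss pattern exactly, while a different seed gives another sample of the same loss rate. A live platform cannot offer this kind of control over its conditions.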
22
Grid eXplorer Big Picture
[Figure: components of Grid eXplorer]
- An experimental conditions database
- Emulator core; hardware + software: emulation & simulation
- A set of tools for analysis
- A set of sensors in Grid’5000
- Validation on Grid’5000
- Emulab cluster
23
Laboratories involved in GdX: 13 labs
- IMAG, ID (UMR 5132), Laboratoire d’Informatique et Distribution, Université de Grenoble
- LaRIA (UPRES EA 2083), Laboratoire de Recherche en Informatique d’Amiens, Université de Picardie Jules Verne
- LRI (UMR 8623), Laboratoire de Recherche en Informatique, Université de Paris-sud
- LAAS-CNRS (UPR 8001), Laboratoire d'Analyse et d'Architecture des Systèmes
- LORIA (UMR 7503), Laboratoire lorrain de recherche en informatique et ses applications
- LIP-ENS Lyon (UMR 5668), Laboratoire de l'Informatique du Parallélisme
- LIFL (ESA 8022), Laboratoire d’Informatique Fondamentale de Lille
- INRIA Sophia Antipolis, UNSA, I3S-CNRS
- LIP6 (UMR 7606), Laboratoire d'Informatique de Paris 6
- LABRI (UMR 5800), Laboratoire Bordelais de Recherche en Informatique
- IBCP (UMR 5086), Institut de Biologie et Chimie des Protéines
- CEA, Direction des Technologies de l'Information (Saclay)
- IRISA, Institut de Recherche en Informatique et Systèmes Aléatoires
24
Experiments | Infrastructure | Emulation | Network | Application
I.1 Platform | XXXX
I.2 Virtual Grid | XX
I.3 Virt. techniques | XX
I.4 Emul. driven simul. | X
I.5 Network emul. | XXX
I.6 Heterogeneity emul. | X
I.7 Communication | X
I.8 Internet emul. | XXX
II.1 Engineering tech. | XXX
II.2 Mobile objects | XX
II.3 Fault tolerance | XX
II.4 DHT | X
II.5 Database | XX
II.6 Scheduling | XX
II.7 Comm. optimization | X
II.8 Data sharing | XX
II.9 Uni and multicast | XX
II.10 Cellular automaton | XX
II.11 Bioinformatics | X
II.12 P2P storage | XX
II.13 Reliability | XXX
II.14 Security | XXX
II.15 NG. Internet | XXX
II.16 Grid coupled sys. | X
25
Grid eXplorer and Grid’5000 interactions
- Grid’5000: design, test/check, validation under real-life conditions
- Grid eXplorer: design, test/check, validation of scalability and under synthetic conditions
Other labels: "Integration to standard middleware, Deployment, Performance"; "Scalability, Fault tolerance"
26
Summary
Research in Grid and P2P needs large scale platforms
– To study protocols, systems, middleware, programming models and applications under real-life OR reproducible experimental conditions
Grid’5000 and Grid eXplorer
- Will be experimental platforms for Grid researchers (like a particle accelerator for physicists)
- A nationwide platform and a large scale emulator
- Strong relations between these two projects (the same researchers work on both!)
- Hardware should be installed by November 2004
- Prototypes (security and control) should work by November 2004
- Will be opened for experiments in early 2005
27
Q&A