Auger & XtremWeb: Monte Carlo computation on a Global Computing platform
O. Lodygensky, G. Fedak, V. Neri, A. Cordier, F. Cappello
Laboratoire de l’Accélérateur Linéaire; Laboratoire de Recherche en Informatique; CNRS, Université Paris-Sud, France.

Outline
– Introduction
– XtremWeb
– Auger distributed computing
– Conclusion

CHEP, O. Lodygensky, March 27, 2003

Different GRIDs: 2 distributed system types
– « GRID »
  – Traditional computing centers, clusters
  – Node characteristics: <100, stable, individually identified, trusted
– « Desktop GRID » / « Internet Computing »
  – Peer-to-peer systems (P2P), global computing systems
  – Windows, Linux, Mac OS
  – Node characteristics: ~, volatile, no individual identification, not trusted

Desktop GRID
– Dedicated applications
  – distributed.net, Decrypthon
– Production projects
  – Folderol
– Open source/research projects
  – XtremWeb, BOINC
– Commercial platforms
  – Entropia, Datasynapse, United Devices, Grid systems
– One server centralizes scheduling on volunteer PCs
[Diagram: a client application sets parameters and gets results through a server on the Internet; each volunteer PC receives parameters, then loads and executes a task.]
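
The centralized model on this slide (one server holds the tasks, volunteer PCs ask for them) can be illustrated with a minimal sketch. It is not XtremWeb code: the `Server` class, `volunteer_loop`, and their method names are hypothetical, and everything runs in one process purely to show the pull-based flow.

```python
from collections import deque


class Server:
    """Hypothetical central server: queues parameter sets, collects results."""

    def __init__(self, parameter_sets):
        # Each task is (task_id, parameters); the client supplies the parameters.
        self.queue = deque(enumerate(parameter_sets))
        self.results = {}

    def request_task(self):
        """Called by a volunteer PC; returns None when no work is left."""
        return self.queue.popleft() if self.queue else None

    def return_result(self, task_id, result):
        self.results[task_id] = result


def volunteer_loop(server, application):
    """A volunteer PC pulls tasks, executes them, and sends results back."""
    done = 0
    while (task := server.request_task()) is not None:
        task_id, params = task
        server.return_result(task_id, application(params))
        done += 1
    return done


server = Server([1, 2, 3])
volunteer_loop(server, lambda x: x * 2)
print(server.results)
```

The volunteers initiate every exchange, so the server never needs to reach into a volunteer PC; this is one common way to cope with hosts that sit behind firewalls or disappear without notice.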

Desktop Grid characteristics
– Scalability: up to 100 k, 1 M hosts
– Heterogeneity: different hardware and OSes
– Volatility: unpredictable participant behaviour
  – Napster, Kazaa, etc.: they work well despite volatility
– Sustainability: developments and upgrades must be easy
– Performance: ~30 Tflops; Kazaa (1 M users at 100 kb/s to 1 Mb/s each: 100 Gb/s to 1 Tb/s in aggregate?)
– Security: integrity of volunteer PCs and servers; prevention of application and result corruption; authentication
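
The Kazaa bandwidth figures follow from simple multiplication of the per-user rate by the number of users. The short check below only reproduces that arithmetic; the function name is illustrative.

```python
def aggregate_bandwidth_bps(users, per_user_bps):
    """Total bandwidth if every user contributes per_user_bps bits per second."""
    return users * per_user_bps


users = 1_000_000
low = aggregate_bandwidth_bps(users, 100_000)      # 100 kb/s per user
high = aggregate_bandwidth_bps(users, 1_000_000)   # 1 Mb/s per user

print(low / 1e9, "Gb/s")    # 100.0 Gb/s
print(high / 1e12, "Tb/s")  # 1.0 Tb/s
```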

XW: Architecture
– Centralized Global Computing (Peer to Peer)
– 3 entities: client / coordinator / worker
[Diagram: three configurations over Internet / LAN. Global Computing: a coordinator serving a client and PC workers. P2P: a coordinator with PCs acting as both client and worker. Hierarchical: coordinators arranged in a hierarchy.]

XW: Technology
– Prerequisites for installation: database (MySQL), Java > JDK 1.2
– Database: SQL, accessed through Java JDBC
– Server: Java
– Communication protocol: XML-RPC, SSL
– HTTP server: PHP 3-4
– Installation: GNU autotools
– Worker and client: Java
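
To give an idea of what an XML-RPC exchange looks like on the wire, the sketch below encodes and decodes a call with Python's standard `xmlrpc.client` module. XtremWeb itself is written in Java; the method name `getWork` and the parameter fields are assumptions made for illustration, not the actual XtremWeb protocol.

```python
import xmlrpc.client


def encode_call(method, *args):
    """Serialize a method call into an XML-RPC request document."""
    return xmlrpc.client.dumps(tuple(args), methodname=method)


def decode_call(payload):
    """Parse an XML-RPC request back into (method name, parameters)."""
    params, method = xmlrpc.client.loads(payload)
    return method, params


# Hypothetical request a worker might send to describe itself and ask for work.
request = encode_call("getWork", {"os": "linux", "cpu": "ix86"})
print(request)

method, params = decode_call(request)
print(method, params)
```

Because the payload is plain XML carried over HTTP, it passes through most firewalls and can be wrapped in SSL without changing the application logic.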

XW: Security
[Diagram: the client and the worker each communicate with the coordinator over ssh; on the worker, the loaded application runs inside a sandbox (SBLSM).]

XW: Fault tolerance model
– Every entity is inherently volatile
– Connectionless protocols => all entities are stand-alone
[Diagram: the client submits a task to the coordinator; Worker1 gets work and puts the result; the client syncs and retrieves the result. Client2, Worker2 and the coordinator go through the same exchanges with a synchronization step: sync/submit task, sync/get work, sync/put result, sync/retrieve result.]
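
One way to read "connectionless" and "stand-alone" is that each entity keeps its own state and re-synchronizes whenever its peer becomes reachable again. The sketch below illustrates that idea under stated assumptions: the `Coordinator` and `Worker` classes, their methods, and the in-memory storage are all hypothetical and are not taken from XtremWeb.

```python
class Coordinator:
    """In-memory stand-in for a coordinator; `up` simulates reachability."""

    def __init__(self):
        self.tasks = {}      # task_id -> parameters
        self.results = {}    # task_id -> result
        self.up = True

    def _check(self):
        if not self.up:
            raise ConnectionError("coordinator unreachable")

    def submit(self, task_id, params):
        self._check()
        self.tasks.setdefault(task_id, params)    # resubmitting is harmless

    def get_work(self):
        self._check()
        for task_id, params in self.tasks.items():
            if task_id not in self.results:
                return task_id, params
        return None

    def put_result(self, task_id, result):
        self._check()
        self.results.setdefault(task_id, result)  # re-sending is harmless

    def retrieve(self, task_id):
        self._check()
        return self.results.get(task_id)


class Worker:
    def __init__(self, coordinator, compute):
        self.coordinator = coordinator
        self.compute = compute
        self.pending = {}    # results computed but not yet delivered

    def sync(self):
        """Re-send undelivered results; True when nothing is left pending."""
        for task_id, result in list(self.pending.items()):
            try:
                self.coordinator.put_result(task_id, result)
                del self.pending[task_id]
            except ConnectionError:
                return False
        return True

    def step(self):
        """Deliver old results, then fetch and run one task if possible."""
        if not self.sync():
            return
        try:
            work = self.coordinator.get_work()
        except ConnectionError:
            return
        if work is not None:
            task_id, params = work
            self.pending[task_id] = self.compute(params)
            self.sync()
```

The operations are idempotent, so a worker that is unsure whether a result arrived can simply send it again after the next synchronization.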

Pierre Auger Observatory
– Understanding the origin of very high energy cosmic rays
– Aires: Air Showers Extended Simulation
  – Sequential, Monte Carlo
  – Time for a run: 5 to 10 hours
– Estimated number of PCs: ~5000
[Diagram: a PC client and the air shower parameter database (Lyon, France) feed an XtremWeb server; PC workers reached over the Internet and LAN each run Aires to simulate an air shower, alongside traditional supercomputing centers: CINES (Fr), Fermi Lab (USA).]
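
Since each Aires run is a sequential Monte Carlo job, many runs can proceed independently on different workers, provided each one uses its own random seed. The sketch below shows that pattern with a toy model; it is not AIRES, and `toy_shower`, `make_tasks`, `run_task` and the parameter names are hypothetical.

```python
import random
from statistics import mean


def toy_shower(energy, seed):
    """Toy stand-in for one simulation run: counts particles in a random cascade."""
    rng = random.Random(seed)   # private generator, so runs share no state
    particles, e = 1, float(energy)
    while e > 1.0:              # each generation halves the energy per particle
        particles += sum(1 for _ in range(particles) if rng.random() < 0.5)
        e /= 2.0
    return particles


def make_tasks(energy, n_runs, base_seed=0):
    """One task description per independent run, each with a distinct seed."""
    return [{"energy": energy, "seed": base_seed + i} for i in range(n_runs)]


def run_task(task):
    return toy_shower(task["energy"], task["seed"])


tasks = make_tasks(energy=1024, n_runs=8)
# On a desktop grid, each run_task call would execute on a different worker.
results = [run_task(t) for t in tasks]
print(mean(results))
```

Recording the seed in the task description makes every run reproducible, which helps when a result has to be recomputed after a worker disappears.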

Auger-XW (AIRES): High Energy Physics
– Application: AIRES
– Deployment:
  – Coordinator at LRI
  – Madison: 700 workers, Pentium III, Linux (500 MHz + 933 MHz) (Condor pool)
  – Grenoble Icluster: 146 workers (733 MHz), PBS
  – LRI: 100 workers, Pentium III, Athlon, Linux (500 MHz, 733 MHz, 1.5 GHz) (Condor pool)
[Diagram: the XW client and the XW coordinator (lri.fr) are connected through the Internet to the Grenoble Icluster (PBS), Madison, Wisconsin (Condor), the U-psud network, the LRI Condor pool, and other labs.]

Conclusion
– XtremWeb: a « desktop Grid » platform
  – Fault tolerance: XtremWeb is « connectionless » + « restartable »
  – Security: certificates + crypto + sandbox + …
– What we have learned so far with XtremWeb:
  – Deployment is critical
  – Once users understand the potential computational power, they rapidly ask for more resources!
– XtremWeb Auger: international Desktop GRID
  – Condor pools with XW as global infrastructure
  – Good performance (ratio 1:60, with several hosts smaller than the reference)
=> Scheduling is a weak point of XtremWeb <=
=> Strong need for result-browsing tools <=

Software
– XtremWeb:
  – Since 2001
  – Current version: 1.2.rc0