Slide 1: GaiaGrid – A Three-Year Experience
Salim Ansari, Toulouse, 20th October 2005
Slide 2: Why Grid?
- The GDAAS study had underestimated the computational power needed to run the Gaia Data Analysis prototype.
- The number of parallel activities spun out of control as algorithm providers began delivering algorithms that could not be implemented on the limited infrastructure dedicated to GDAAS.
- A collaborative environment was clearly needed.
Slide 3: Objectives
1. To increase computational power whenever and wherever needed, at low cost
2. To provide a framework for developing shell-task algorithms for Gaia
3. To establish a collaborative environment where the community may share and exchange results
Slide 4: Constraints
Motto: low cost, high return on investment
- Low-cost hardware budget: reuse of low-end PCs
- Small investment in industrial effort [0.5 FTE]
- System administration: 1 junior staff member plus maintenance [1 FTE]
Slide 5: Core vs. Shell Tasks
As a result of the GDAAS study, two categories of algorithms had been established:
Core tasks (centralised; act upon the totality of the data):
- Initial Data Treatment
- Global Iterative Solution
- Cross-correlations
Shell tasks (any data analysis involving remote expertise; act upon a portion of the data at a time):
- Classification
- Photometric analysis
- Spectroscopic analysis
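To make the distinction concrete, here is a minimal Python sketch of the core/shell split (my illustration only; the class names, the Dataset type and the partitioning scheme are assumptions, not GaiaGrid code):

```python
# Hypothetical sketch of the core/shell task split described above.
# CoreTask, ShellTask and Dataset are illustrative assumptions,
# not GaiaGrid APIs.
from dataclasses import dataclass
from typing import List


@dataclass
class Dataset:
    partitions: List[list]   # e.g. one partition per HTM cell


class CoreTask:
    """Centralised: acts upon the totality of the data."""
    def run(self, data: Dataset):
        whole = [obs for part in data.partitions for obs in part]
        return self.process(whole)

    def process(self, observations):
        raise NotImplementedError


class ShellTask:
    """Distributable: acts upon one portion of the data at a time,
    so each partition can be shipped to a remote Grid node."""
    def run(self, data: Dataset):
        return [self.process(part) for part in data.partitions]

    def process(self, partition):
        raise NotImplementedError
```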
Slide 6: Gaia Virtual Organisation (June 2005)
Slide 7: The Processing Scope
Source: Michael Perryman, GAIA-MP-009, 17 August 2004, Version 1.1

Task          Total processing power   Duration on the target 1.2-Teraflop '2012 machine'*
Core tasks    40 × 10^18 FLOPs         385 days of CPU time
Shell tasks   90 × 10^18 FLOPs         880 days of CPU time
TOTAL         ~10^21 FLOPs             (assuming a factor-10 uncertainty)

Top tasks:
- GIS processing: 125 days (CPU processing on the 2012 machine)
- First look: 125 days (assumed equal to GIS at present)
- Spectro PSF fitting: 71 days
- Variability period: 33 days
- Various DMS classes: 60 days
- DMS: ASM analysis
- Multiples: ASM

* Assuming a 40-GFlop machine today, extrapolated to 2012 with Moore's Law.
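The durations in the table follow directly from dividing the total FLOP counts by the sustained rate of the assumed 1.2-Teraflop machine. A quick check (my arithmetic, not part of the slide):

```python
# Sanity check of the slide's CPU-time figures: total FLOPs divided
# by the sustained rate of the assumed 1.2-Teraflop "2012 machine".
SECONDS_PER_DAY = 86_400
RATE = 1.2e12  # FLOP/s of the target 2012 machine

for name, flops in [("core tasks", 40e18), ("shell tasks", 90e18)]:
    days = flops / RATE / SECONDS_PER_DAY
    print(f"{name}: {days:.0f} days")
# Prints roughly 386 and 868 days; the slide quotes 385 and 880.
```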
Slide 8: The First Months
Setting up the hardware and nodes was easy and took two man-months. Globus was installed on:
- the ESTEC nodes
- ESRIN (already up and running)
- the CESCA node in Barcelona
- the ULB node in Brussels
- the ARI node in Heidelberg
The GridAssist tool was identified as a potential workflow tool.
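As a flavour of what "installing Globus on a node" enables, here is a hypothetical smoke test built around the Globus Toolkit's globus-job-run command for one-shot job submission (the host names are placeholders, and the wrapper is my sketch, not part of the GaiaGrid setup):

```python
# Minimal smoke test for newly installed Globus nodes: submit a
# trivial job to each gatekeeper and report whether it came back.
# Host names are placeholders.
import subprocess

NODES = ["grid01.example.esa.int", "grid02.example.esa.int"]

for node in NODES:
    try:
        out = subprocess.run(
            ["globus-job-run", node, "/bin/hostname"],
            capture_output=True, text=True, timeout=60, check=True,
        )
        print(f"{node}: OK ({out.stdout.strip()})")
    except (subprocess.SubprocessError, FileNotFoundError) as err:
        print(f"{node}: FAILED ({err})")
```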
Slide 9: Task Distribution on GaiaGrid
[Diagram: the GridAssist Controller connects the GDAAS DB, the Gaia Simulator and core processing (Initial Data Treatment) through a Data Access Layer, and dispatches shell tasks to Globus nodes at Barcelona, ESTEC, ULB and ESRIN.]
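Conceptually, the controller's job in this diagram is to assign each unit of shell-task work to one of the available Globus nodes. A toy round-robin version, purely illustrative (the slides do not describe GridAssist's actual scheduling logic):

```python
# Toy round-robin dispatcher illustrating what a workflow controller
# such as GridAssist does conceptually: assign each unit of shell-task
# work to one of the available Grid nodes. Not GridAssist's real logic.
from itertools import cycle

NODES = ["Barcelona", "ESTEC", "ULB", "ESRIN"]


def dispatch(work_units, nodes=NODES):
    """Return a list of (node, work_unit) assignments."""
    return list(zip(cycle(nodes), work_units))


# e.g. 10 HTM cells of binary-star work spread over the four nodes
print(dispatch([f"htm-cell-{i}" for i in range(10)]))
```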
Slide 10: Current Infrastructure
9 infrastructures in 7 countries (voluntary), 51 CPUs in total:
- ESTEC [14 CPUs] (SCI-CI), 1-Gigabit dedicated link to SURFnet
- ESAC [4 CPUs] (SCI-SD), 8-Mb link to RedIRIS
- ESRIN [16 CPUs] (EOP), 155-Mb link to GARR
- CESCA [5 CPUs] (Barcelona), RedIRIS connectivity
- ARI [2 CPUs] (Heidelberg), academic backbone
- ULB [1 CPU] (Brussels), academic backbone
- DutchSpace [7 CPUs] (Leiden), commercial link
- IoA [1 CPU] (Cambridge), academic backbone
- UGE [1 CPU] (Geneva), academic backbone
Data storage elements:
- CESCA [5 Terabytes]
- ESTEC [2 Terabytes]
- ESAC [up to 4 Terabytes]
The current infrastructure has been created on an experimental basis and should not yet be considered part of an operational environment.
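As a quick consistency check of the headline figure (my arithmetic, not from the slide), the per-site counts do add up to the quoted 51 CPUs across 9 infrastructures:

```python
# Per-site CPU counts from the slide; the sum matches the quoted total.
cpus = {"ESTEC": 14, "ESAC": 4, "ESRIN": 16, "CESCA": 5,
        "ARI": 2, "ULB": 1, "DutchSpace": 7, "IoA": 1, "UGE": 1}
assert sum(cpus.values()) == 51
print(sum(cpus.values()), "CPUs across", len(cpus), "infrastructures")
```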
Slide 11: Current Applications
- Gaia Simulator
- Astrometric binary star shell task
- Variable star analysis shell task
- RVS cross-correlation shell task
Slide 12: Global Gaia Data Processing
Slide 13: The GridAssist Client
[Screenshot: performance view of a Grid computation spanning Heidelberg, Rome, Leiden and Barcelona.]
Slide 14: The GridAssist Client
[Screenshot: a distributed Grid computation spanning Barcelona, Brussels and Noordwijk.]
Slide 15: Results
- The Gaia Simulator profited tremendously from GaiaGrid, which accelerated the simulations of the astrometric binary stars. These would otherwise have had to be scheduled on a single infrastructure at CESCA, which was running GDAAS tasks at the same time.
- The astrometric binary star analysis for a single HTM cell (383 systems) is down to 15 minutes (and falling) on 2 infrastructures, a roughly 12-fold speed-up over the single CPU in Brussels, which was taking 3 hours.
Slide 16: Possible Implementation: The Gaia Collaboration Environment
[Diagram: shell tasks (binary star analysis, variable star analysis, radial velocity cross-correlations, photometric analysis, classification) connected through GaiaLib and a core interface to the Gaia data and results.]
The Gaia community would develop, analyse and update the data transparently, without having any notion of where each component is running, or having to worry about CPU and storage limitations.
Slide 17: Security Issues
- All ESTEC and ESRIN Grid machines lie outside the ESA firewall.
- Security is controlled via ESA Grid certification.
- Currently no distinction is made between projects (e.g. GaiaGrid and PlanckGrid).
- The GridAssist tool provides basic functionality to distinguish an administrator (a person who may add or remove sources) from a workflow user.
Slide 18: Certification
- The Certification Authority for the ESA Grid currently lies with ESTEC (SCI-C).
- This is under review in light of higher-level discussions within the EIROforum Grid Group.
Slide 19: Future Activities
- The GaiaGrid environment is available to anyone wishing to experiment with the parallelisation and distribution of tasks.
- In the current Gaia Data Processing framework, the environment can only be used standalone.
- The possibility of using the Grid environment to also carry out some core tasks is being investigated.
- GaiaGrid can be considered the testbed for all algorithms under development.
Slide 20: Conclusions
- GaiaGrid has demonstrated that it is easy to set up a Grid environment.
- GaiaGrid has also demonstrated its collaborative capabilities by allowing results to be shared amongst multiple institutes.
- The deployment of the Gaia Simulator has led programmers to think more "portably".
Slide 21: Lessons Learned
- The development of Gaia algorithms involves a community of people dispersed across Europe. No single group should believe that it can implement all of these algorithms without proper support from the community.
- A sound collaboration environment is essential to ensure that everyone in the community has a common understanding of the problem.
- Processing is cheap and the technology is simple, but cumbersome to maintain: each shell task has to be installed on all the Grid machines used in any Virtual Organisation. There is no magic to Grid!
- The main hurdles in Grid involve security and certification: who should be allowed to run jobs on my machine(s)?
- Grid should always be considered "added value"; it should not fall within the scope of day-to-day operations like the Gaia data processing. (If it does, you have underestimated the effort of carrying out your project and should review your internal resources for the long term.)