
1 The CTA Computing Grid Project
Cecile Barbier, Nukri Komin, Sabine Elles, Giovanni Lamanna, LAPP, CNRS/IN2P3, Annecy-le-Vieux
EGI User Forum, Vilnius, 11 April 2011

2 CTA-CG: the CTA Computing Grid
LAPP, Annecy: Giovanni Lamanna, Nukri Komin, Cecile Barbier, Sabine Elles
LUPM, Montpellier: Georges Vasileiadis, Claudia Lavalley, Luisa Arrabito
Goal: bring CTA onto the Grid

3 CTA-CG: Aims
Provide the working environment, tools and services for all tasks assigned to the Data Management and Processing Centre: simulation, data processing, storage, offline analysis, user interface
Test Grid computing and the software around Grid computing
Estimate computing needs and requirements: requests at Lyon, close contact with DESY Zeuthen

4 Outline
CTA and its Data Management and Processing Centre
Current activities: massive Monte Carlo simulations, preparation of the meta-database
Short-term plan: bring the user onto the Grid and to the data
Ideas for the future data management and analysis pipeline
Note: CTA is in its preparatory phase; this is mostly work in progress and ideas

5 Current Cherenkov Telescopes

6 Cherenkov Telescope Array
A large array of 30-100 telescopes in 3 sizes
Preparatory Phase 2010-2013: ~100 institutes in 22 countries

7 CTA Operational Data Flow
The CTA Observatory's main logical units:
Science Operation Centre: organisation of observations
Array Operation Centre: the on-site service
Science Data Centre: software development, data analysis, data reduction, data archiving, data dissemination to observers
Total expected data volume from CTA: 1 to 10 (?) PB per year (the main data stream for permanent storage is of the order of 1 (10?) GB/s)
MC requirements: tens of CPU years, hundreds of TB
Existing ICT-based infrastructures, such as EGEE/EGI and GEANT, are potential solutions for providing the CTA observatory with the best use of e-infrastructures.

8 CTA Virtual Grid Organisation
Benefits of the EGEE/EGI Grid:
Institutes can easily provide computing power with minimal manpower, since the sites usually already support the LHC
The Grid can be managed centrally (e.g. for massive simulations)
Distributed, but transparent for all users (compare H.E.S.S.)
CTA Virtual Organisation: vo.cta.in2p3.fr (a French name, but open to everyone; renaming is almost impossible)
VO manager: G. Lamanna @ LAPP

9 CTA VO: Computing
14 sites in 5 countries provide access to their computing resources
3 big sites: CC-IN2P3 Lyon, DESY Zeuthen, Cyfronet (Poland)
GRIF: several sites of various sizes in and around Paris
Many small sites (~100 CPUs)
~30k logical CPUs, shared with other VOs; experience shows ~1000-2000 CPUs are available to CTA at any time

10 CTA VO: Computing Load
The load is not smooth, as there is only one simulation manager (Nukri Komin)
LAPP, CC-IN2P3 and DESY Zeuthen are among the biggest contributors

11 CTA VO: Storage
Each site provides from several 100 GB up to 10 TB of local disk space
Massive storage (several 100 TB): CC-IN2P3 Lyon (including tapes), DESY Zeuthen, Cyfronet
Massive storage is needed for large temporary files
Simulations: CORSIKA output is kept for reprocessing; one CORSIKA file is 20-30 GB for 100,000 proton showers

12 Grid Monte Carlo Production
First massive use of the Grid: MC simulations, about 55,000 good-quality runs
High requirements per run that only a few grid sites can handle: up to 4 GB RAM, 10 GB of local scratch disk space
Many problems have been solved; the next round will be much more efficient, with automated MC simulation production using the EasiJob tool developed at LAPP

13 Automated Simulation Production
[Diagram: the CTA-CG Grid Operation Centre, the grid tools and software developed for CTA-CG. A web interface is the interface with the community (in: configure and start a task; out: browse results and monitoring). GANGA creates the job from config files and scripts and handles job and grid control on the CTA VO. A central database holds the configuration, monitoring information and access to the data files, which are stored on grid SEs.]

14 Automated Simulation Production
EasiJob (Easy Integrated Job Submission), developed by S. Elles within the MUST framework
MUST = Mid-Range Data Storage and Computing Centre widely open to the Grid Infrastructure, at LAPP Annecy and the University of Savoie
More general than CTA: can be used for any software and any experiment
Based on GANGA (Gaudi/Athena aNd Grid Alliance), http://ganga.web.cern.ch/ganga/
GANGA is a Grid front-end in Python, developed for ATLAS and LHCb and used by many other experiments
It handles task configuration, job submission and monitoring, and file bookkeeping

15 EasiJob: Task Configuration
A task is described by a set of parameters with default values; for each parameter one defines whether it is browsable and how it is represented in the database
Each parameter has a keyword (e.g. #key1)
The job template is a set of files (the input sandbox) in which the keywords are replaced by the database values (a sketch of this substitution follows below)
Configured through the web interface; example: CORSIKA
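As an illustration of the template mechanism, the sketch below replaces #keywords in a job template with values taken from the task database. The parameter names, values and file names are hypothetical, not the actual EasiJob configuration.

```python
# Illustrative sketch of keyword substitution in a job template
# (hypothetical parameter names and files, not the actual EasiJob code).
from pathlib import Path

# Parameter values as they would come from the task database.
task_parameters = {
    "#primary": "proton",      # hypothetical browsable parameter
    "#nshowers": "100000",     # hypothetical parameter with a default value
    "#site_altitude": "2000",  # hypothetical parameter
}

def instantiate_template(template_path: str, output_path: str) -> None:
    """Replace every #keyword in the template file with its database value."""
    text = Path(template_path).read_text()
    for keyword, value in task_parameters.items():
        text = text.replace(keyword, value)
    Path(output_path).write_text(text)

# Example (file names are placeholders):
# instantiate_template("corsika_input.template", "corsika_input.card")
```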

16 EasiJob: Job Classes
Site classes are configured from requirements based on published GLUE parameters, e.g. GlueHostMainMemoryRAMSize > 2000
These requirements are interpreted differently at each site, and sites have different storage capacities
Job classes are therefore created so that large jobs go only to a subset of sites (see the sketch below)
Job/site matching is currently semi-manual, in close interaction with the local admins (in particular Lyon and Zeuthen)
Managed through the web interface
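A job class can be thought of as a named requirement expression that goes into the job description before submission. The sketch below builds a gLite JDL Requirements line from such a class; the class names and thresholds are illustrative, not the real EasiJob settings.

```python
# Sketch of how a job class could map to a gLite JDL requirement string
# (class names and thresholds are illustrative, not EasiJob's actual ones).
JOB_CLASSES = {
    # "large" jobs need the ~4 GB RAM quoted for the MC runs, so only a
    # subset of sites will match this expression.
    "large": "other.GlueHostMainMemoryRAMSize > 2000",
    # "standard" jobs can run on any site supporting the VO.
    "standard": "true",
}

def jdl_requirements(job_class: str) -> str:
    """Return the Requirements line to put into the job's JDL file."""
    return f"Requirements = {JOB_CLASSES[job_class]};"

# print(jdl_requirements("large"))
# -> Requirements = other.GlueHostMainMemoryRAMSize > 2000;
```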

17 EasiJob: Job Submission
Define the number of jobs for a task
Jobs are submitted automatically to the site with the minimum number of waiting jobs; submission is paused when too many jobs are pending (see the sketch below)
Status monitoring and re-submission of failed jobs
Keeps track of the produced files via their logical file names (LFNs) on the Grid, reported by an echo statement in the execution script
Monitoring on a web page
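A minimal sketch of that submission policy, assuming a hypothetical helper that returns the number of waiting jobs per site (in EasiJob this information presumably comes from GANGA's monitoring; the threshold below is invented for illustration):

```python
# Sketch of the submission policy described above: pick the site with the
# fewest waiting jobs and pause when too many jobs are already pending.
MAX_PENDING = 200  # illustrative threshold, not the real EasiJob setting

def choose_site(sites, get_pending_jobs):
    """Return the site to submit to, or None if submission should pause.

    get_pending_jobs(site) is a hypothetical callable returning the number
    of jobs currently waiting at that site.
    """
    pending = {site: get_pending_jobs(site) for site in sites}
    best_site = min(pending, key=pending.get)
    if pending[best_site] >= MAX_PENDING:
        return None  # too many jobs waiting everywhere: pause submission
    return best_site
```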

18 EasiJob: Status
Deployed in Annecy; will be used for the next simulations
Configuration and job submission are not open to the public, to avoid massive nonsense productions
User certificates need to be installed manually
Idea: provide "software as a service"

19 Bookkeeping
[Diagram: the same Grid Operation Centre scheme as on slide 13, now highlighting bookkeeping: the web interface, GANGA job and grid control on the CTA VO, and the central database giving access to the data files stored on grid SEs.]

20 Bookkeeping
Current simulation: not all output files are kept
The production parameters are set in the EasiJob database
An automatically generated web interface [C. Barbier] shows only the parameters defined as browsable, proposes only values which were actually produced, and returns a list of LFNs (logical file names); a toy version of this lookup follows below
This is the starting point for a more powerful meta-database
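The web interface essentially performs a parameter-based lookup that returns LFNs. A toy version of that lookup, with a hypothetical table layout, parameter names and LFN paths:

```python
# Toy version of the bookkeeping lookup: filter the production table by
# browsable parameters and return the matching LFNs. The table layout,
# parameter names and LFN paths are hypothetical.
productions = [
    {"primary": "proton", "zenith": 20, "lfn": "/vo.cta.in2p3.fr/MC/run000001.corsika"},
    {"primary": "gamma",  "zenith": 20, "lfn": "/vo.cta.in2p3.fr/MC/run000002.corsika"},
]

def find_lfns(**selection):
    """Return the LFNs of all productions matching the given parameters."""
    return [row["lfn"] for row in productions
            if all(row.get(key) == value for key, value in selection.items())]

# Example query, as the web interface would issue it:
# find_lfns(primary="proton", zenith=20)
```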

21 Bookkeeping
Future: a complicated data structure
We want to keep track of the files produced and their relations
Search for files using the production parameters
Find information on files even if the files themselves have been removed
[Diagram: file relations, with production file 1, production file 2, ... leading to a DST file, alongside the real data structure: raw data file, calibrated file, DST]

22 Meta Data Management [C. Lavalley, LUPM Montpellier]
Data: simulations, real data, ...
Meta data: information describing the data, e.g. logical and physical file names, production parameters, etc.
Meta information can be kept in several databases

23 Meta Data Management
AMI (ATLAS Metadata Interface), developed at LPSC Grenoble
Can interrogate other databases; information can be pushed with AMI clients
Web, Python, C++ and Java clients
Manages access rights: username/password, certificate, ...
We will deploy AMI for CTA (with LUPM and LPSC) for simulation bookkeeping and file search (a hypothetical query is sketched below), to be tested for future use in CTA
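Conceptually, a metadata query for simulation bookkeeping would look like the sketch below. The client class, method and field names are stand-ins invented for illustration, not the real AMI API; only the idea (query by production parameters, get dataset metadata back, with access control) is taken from the slide above.

```python
# Hypothetical sketch of an AMI-style metadata query for CTA simulations.
# MetadataClientStub is a stand-in, NOT the real AMI client API.
class MetadataClientStub:
    """Stand-in for an AMI-like client (web/Python/C++/Java clients exist)."""

    def __init__(self, auth: str) -> None:
        self.auth = auth  # e.g. "certificate" or "username/password"

    def search(self, **parameters):
        """Would send the query to the metadata server; here it returns nothing."""
        print(f"query with auth={self.auth}: {parameters}")
        return []

client = MetadataClientStub(auth="certificate")
for entry in client.search(entity="cta_simulation", primary="proton", zenith=20):
    print(entry.get("logicalFileName"))  # hypothetical field name
```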

24 Bring the User to the Data
You have a certificate? Then you can submit jobs to the EGI grid with glite-wms-job-submit, or use Ganga (http://ganga.web.cern.ch/ganga/), an easy-to-use Python front end (a minimal example follows below)
A Grid User Interface is needed: the certificate infrastructure and the software to download files and submit jobs
We are evaluating a way to make a Grid UI available: DIRAC, the Distributed Infrastructure with Remote Agent Control, http://dirac01.pic.es/DIRAC/
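For illustration, a trivial grid job in Ganga might look like the sketch below. The executable and arguments are placeholders, and it assumes the LCG backend that Ganga provided for the gLite/EGI grid; it has to be run inside a Ganga session (e.g. `ganga hello.py`), where Job, Executable and LCG are predefined.

```python
# Minimal Ganga sketch: submit a placeholder executable to the EGI grid.
# Run inside a Ganga session; Job, Executable and LCG come from the Ganga GPI.
j = Job()
j.name = 'cta_hello'
j.application = Executable(exe='/bin/echo', args=['Hello from the CTA VO'])
j.backend = LCG()   # gLite workload management backend (assumed available)
j.submit()

# Afterwards, inside the same Ganga session:
# jobs                # list all jobs and their status
# j.peek('stdout')    # inspect the output once the job has completed
```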

25 DIRAC
Initially developed for LHCb; now a generic version exists and is very easy to install
A front end to the Grid (and beyond)
Workload management with pilot jobs: pull mode, no jobs lost due to Grid problems, shorter waiting time before execution
Integrated Data Management System and integrated software management
Python and web interfaces for job submission (a sketch follows below)
LAPP, LUPM and PIC-IFAE Barcelona are handling setup and testing; it will be opened to the collaboration soon
We do not plan to use it for simulations
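A job submission through the DIRAC Python API might look like the following sketch. The executable is a placeholder, and method names may differ between DIRAC versions (e.g. submit vs. submitJob), so this is an outline rather than a definitive recipe.

```python
# Sketch of a job submission with the DIRAC Python API (details may differ
# between DIRAC versions; the executable is a placeholder).
from DIRAC.Core.Base import Script
Script.parseCommandLine()                 # initialise the DIRAC environment

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

job = Job()
job.setName('cta_test')
job.setExecutable('/bin/echo', arguments='Hello from DIRAC')
job.setCPUTime(3600)                      # requested CPU time in seconds

dirac = Dirac()
result = dirac.submitJob(job)             # older versions: dirac.submit(job)
print(result)                             # dict containing the job ID on success
```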

26 Analysis Chain
Telescopes → raw data (level 0) → calibrated camera images (level 1) → photon list (level 2) → sky maps, light curves, spectra (levels 3 and 4)
Levels 0 and 1 are internal data; level 2 and above are published data
High-level analysis with tools such as the Fermi Science Tools: software available for Linux, Mac and Windows

27 Data Rates
Raw data (level 0): some GB/s, 1-10 PB/year
Production during night time, max. 8-10 h per day, in a 29-day cycle with a peak at new moon
Reconstructed data (level 2, available to the public): about 10% of the raw data
Computing requirements: 1 h of raw data needs ~200 CPU-days today; based on H.E.S.S. Model++, 28 min of data from 4 telescopes needs 3 x 1 Ms [M. de Naurois] (a rough cross-check follows below)
Results (levels 3 and 4, available to the public): requirements still to be evaluated
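A back-of-the-envelope cross-check of these numbers. The assumed observation hours per year are an illustration derived from the 8-10 h/night duty cycle above, not an official CTA figure.

```python
# Back-of-the-envelope check of the figures quoted above.
GB = 1e9
raw_rate_gb_s = 1.0            # "some GB/s": lower end of the quoted range
obs_hours_per_year = 1000      # assumed, from ~8-10 h/night around new moon

raw_volume_pb = raw_rate_gb_s * GB * obs_hours_per_year * 3600 / 1e15
print(f"raw data per year: ~{raw_volume_pb:.1f} PB")   # ~3.6 PB, within 1-10 PB/year

# CPU estimate from the H.E.S.S. Model++ benchmark: 3 x 1 Ms for 28 min of data.
cpu_seconds_per_28min = 3e6
cpu_days_per_hour = cpu_seconds_per_28min * (60 / 28) / 86400
print(f"CPU per hour of 4-telescope data: ~{cpu_days_per_hour:.0f} CPU-days")  # ~74
# Scaling to a much larger array is what leads to the ~200 CPU-days per hour
# quoted above.
```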

28 Data Flow (a possible "Tier" view ...)
Tier 0: the telescopes and an on-site or nearby computing centre (remote site: local computing centre or fast internet link)
Tier 1: two or three powerful computing centres
Tier 2: computer clusters at participating institutes
Tier 3: local machines, the scientist's desktop

29 Data Source and Reconstruction
Tier 0: local data source (at or near the observatory)
- makes data available on a local storage element
- possible site for archiving
Tier 1: calibration and reconstruction sites
- at least 2 sites for redundancy
- guaranteed CPU time for calibration and reconstruction
- can handle peaks when the other site is down or during re-calibration
- requirements: disk space and computing power
Strong network between Tier 0 and Tier 1: most computation is done at Tier 1
Weak network connection: data reduction at Tier 0 (i.e. Tier 1 is at the site)

30 Data Analysis
Tier 2: Science Data Centre(s)
- (small) computing clusters at participating institutes
- data quality checks and first analysis
- provide preprocessed data and results
Tier 3: the scientist's computer
- provides individual computing and software, also for non-CTA scientists
- data access (data download from the nearest Tier 2)
- simple installation on all systems → possibly a virtual machine

31 Summary
CTA Computing Grid: several 1000 CPUs at 14 sites, currently used for massive simulations
Simulation tools and services for easy submission and monitoring (LAPP)
A meta-database will be set up for easy search and use (with LUPM)
Soon: tests of DIRAC for user analysis (with PIC)
Future: a Data Management and Processing Centre on distributed sites (Tiers 0, 1, (2, 3))
Disclaimer: the CTA data management system is still under study (nothing is decided yet!); the CTA Computing Grid is one approach under study

