Benefits of the MAGIC Grid
Status report of an EGEE generic application
Harald Kornmayer, Ariel Garcia (Forschungszentrum Karlsruhe)
Toni Coarasa (Max-Planck-Institut für Physik, München)
Ciro Bigongiari (INFN, Padua)
Esther Accion, Gonzalo Merino, Andreu Pacheco, Manuel Delfino (PIC, Barcelona)
Mirco Mazzucato (CNAF/INFN Bologna)
in cooperation with the MAGIC collaboration
EGAAP meeting, Athens, 21st April 2005 - Enabling Grids for E-sciencE

Outline
- Introduction: What kind of MAGIC?
- The idea of a MAGIC Grid
- Grid added value: expectations vs. reality?
- Data challenges
- Experience
- Conclusion and outlook

Introduction: The MAGIC Telescope
Ground-based Air Cherenkov Telescope
- Gamma rays: 30 GeV - TeV
- La Palma, Canary Islands (28° North, 18° West)
- 17 m diameter
- In operation since autumn 2003 (still in commissioning)
Collaborators: IFAE Barcelona, UAB Barcelona, Humboldt U. Berlin, UC Davis, U. Lodz, UC Madrid, MPI München, INFN / U. Padova, U. Potchefstroom, INFN / U. Siena, Tuorla Observatory, INFN / U. Udine, U. Würzburg, Yerevan Physics Inst., ETH Zürich
Physics goals: origin of VHE gamma rays
- Active Galactic Nuclei
- Supernova Remnants
- Unidentified EGRET sources
- Gamma Ray Bursts

Ground-based γ-ray astronomy (schematic)
- A gamma ray induces a particle shower at ~10 km altitude; the Cherenkov light cone (~1°) illuminates a pool of ~120 m on the ground (compare GLAST, with a collection area of ~1 m²).
- The image of the particle shower in the telescope camera is used to reconstruct arrival direction and energy and to reject the hadron background.

MAGIC – Why the Grid?
MAGIC is an international collaboration
- Partners are distributed all over Europe
- The amount of data can NOT be handled by one partner alone (up to 200 GB per night)
- Access to data and computing needs to become more efficient
- MAGIC will build a second telescope
Analysis is based on Monte Carlo simulations
- CORSIKA code, CPU consuming: one night of hadronic background needs days on 70 computers
- Lowering the threshold of the MAGIC telescope requires new methods based on MC simulations
- More CPU power needed!

Developments – Requirements
MAGIC needs:
- a lot of CPU to simulate the hadronic background and explore the energy range 10 GeV – 100 GeV
- a coordinated effort for the Monte Carlo production
- an easily accessible system (Where are the data from run_1002 and run_1003?)
- a scalable system (as MAGIC II will come in 2007)
- the possibility to access data from other experiments (HESS, Veritas, GLAST, PLANCK(?)) for multi-wavelength campaigns

The infrastructure idea
- Use three national Grid centres - CNAF, PIC, GridKa, all EGEE members - to run the central services
- Connect MAGIC resources to enable collaboration (get resources for free!)
- Two subsystems: MC (Monte Carlo) and Analysis
- Start with MC first!

Development – MC Workflow
User request: "I need 1.5 million hadronic showers with energy E, direction (theta, phi), ... as background sample for the observation of the Crab nebula."
1. Run the MAGIC Monte Carlo Simulation (MMCS) and register the output data
2. Simulate the telescope geometry with the reflector program for all interesting MMCS files and register the output data
3. Simulate the starlight background for a given position in the sky and register the output data
4. Simulate the response of the MAGIC camera for all interesting reflector files and register the output data
5. Merge the shower simulation and the starlight simulation and produce a Monte Carlo data sample
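For illustration only, the chain above can be written down as a small declarative table of stages with their dependencies. The stage names follow the slide; the file-naming pattern is a hypothetical placeholder, not the collaboration's actual convention.

    # Illustrative sketch of the MAGIC MC chain as data (names are placeholders).
    MC_CHAIN = [
        {"stage": "mmcs",      "desc": "air-shower simulation (CORSIKA-based MMCS)",
         "inputs": [],                      "output": "mmcs_cer%(run)06d"},
        {"stage": "reflector", "desc": "telescope geometry / mirror simulation",
         "inputs": ["mmcs"],                "output": "ref_%(run)06d"},
        {"stage": "starlight", "desc": "star-light background for the observed sky position",
         "inputs": [],                      "output": "star_%(run)06d"},
        {"stage": "camera",    "desc": "MAGIC camera response",
         "inputs": ["reflector"],           "output": "cam_%(run)06d"},
        {"stage": "merge",     "desc": "merge shower and star-light samples into the MC data set",
         "inputs": ["camera", "starlight"], "output": "mc_%(run)06d"},
    ]

    def files_for_run(run: int) -> dict:
        """Expand the per-stage output file names for one run number."""
        return {s["stage"]: s["output"] % {"run": run} for s in MC_CHAIN}

    if __name__ == "__main__":
        names = files_for_run(12345)
        for stage in MC_CHAIN:
            deps = ", ".join(stage["inputs"]) or "none"
            print(f'{stage["stage"]:10s} needs: {deps:22s} -> {names[stage["stage"]]}')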

Implementation
Three main components:
- Meta data base: bookkeeping of the requests, their jobs and the data
- Requestor: users define the parameters by inserting a request into the meta data base
- Executor: creates Grid jobs by checking the meta data base frequently (cron) and generating the input files
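A minimal sketch of how such an executor could look, assuming a simple jobs table and the EDG/LCG command-line submission of that era. The table layout, column names, JDL template and job-id parsing are illustrative assumptions, not the MAGIC prototype itself.

    # Executor sketch (assumed schema): a cron job that polls the meta data base
    # for new requests, writes one JDL file per job and submits it.
    import sqlite3
    import subprocess
    import textwrap

    JDL_TEMPLATE = textwrap.dedent("""\
        Executable    = "run_mmcs.sh";
        Arguments     = "{run}";
        StdOutput     = "mmcs_{run}.out";
        StdError      = "mmcs_{run}.err";
        InputSandbox  = {{"run_mmcs.sh", "mmcs_{run}.inp"}};
        OutputSandbox = {{"mmcs_{run}.out", "mmcs_{run}.err"}};
        """)

    def pending_requests(db):
        return db.execute("SELECT run FROM jobs WHERE status = 'NEW'").fetchall()

    def submit(db, run):
        jdl = "mmcs_%06d.jdl" % run
        with open(jdl, "w") as f:
            f.write(JDL_TEMPLATE.format(run="%06d" % run))
        # edg-job-submit was the submission command of the LCG-2 era middleware;
        # the job id is assumed to be on the last line of its output.
        out = subprocess.run(["edg-job-submit", "--vo", "magic", jdl],
                             capture_output=True, text=True).stdout
        jobid = out.strip().splitlines()[-1] if out.strip() else None
        db.execute("UPDATE jobs SET status = 'SUBMITTED', jobid = ? WHERE run = ?",
                   (jobid, run))
        db.commit()

    if __name__ == "__main__":          # run from cron, e.g. every 10 minutes
        db = sqlite3.connect("magic_mc.db")
        for (run,) in pending_requests(db):
            submit(db, run)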

Grid added value
Expectations vs. reality:
Collaboration (-)
- Complex software, a limited number of supported OSs and a limited number of batch systems make the integration of new sites of MAGIC collaborators difficult
- The final integration of one cluster (SUSE, SGE batch system, AFS, firewall) took too long (9 months)
Speed-up of MC production (+)
- The reliable infrastructure and the good support from many sites made that possible! Many thanks to sk, bg, pl, uk, gr, it, es, de, ...
- The service offered was good overall, with problems whenever new releases appeared (every time! :-( ) and with problems in reaching a sustainable configuration (for VO, replica service, ...)
- Central services run by EGEE were stable!

Grid added value II
Expectations vs. reality II:
Persistent storage (+)
- of Monte Carlo data: some problems during the first runs (too many small files on a tape system is equal to /dev/null) - we learnt that lesson!
- of observation data: the automated production transfer of real observation data from La Palma to PIC, Barcelona started in November; TB of real data are available on the Grid now
Improvements of data availability (?)
- Replica mechanisms need to be tested! Measurements needed in the future! Ongoing work!
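One common way to deal with the "too many small files on tape" lesson is to bundle the outputs of a run into a single archive before it is shipped to a tape-backed storage element. The sketch below is a generic illustration with hypothetical paths, not the procedure actually adopted by MAGIC.

    # Pack the many small per-run output files into one tarball before archiving,
    # so the tape system sees a few large files instead of thousands of tiny ones.
    import tarfile
    from pathlib import Path

    def bundle_run(run_dir: str, archive_dir: str) -> Path:
        src = Path(run_dir)
        archive = Path(archive_dir) / (src.name + ".tar.gz")
        with tarfile.open(archive, "w:gz") as tar:
            for f in sorted(src.glob("*")):
                tar.add(f, arcname=f.name)     # flat layout inside the archive
        return archive

    # Example (hypothetical paths):
    # bundle_run("/data/mc/run_012345", "/data/mc/to_tape")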

Grid added value III
Expectations vs. reality III:
Cost reduction (-)
- Additional implementations were necessary (-)
- MAGIC implemented its own prototype meta data base system
  – to monitor the status of the many jobs of a mass production
  – to check the "status" of a job (see later)
- MAGIC implemented its own rudimentary workflow system
  – nothing was available at the beginning
GGUS reduced the costs definitely (+)
- MAGIC Grid participants appreciated the support structure of the GGUS portal
Every new middleware release forced (-)
- a downtime of the system
- a customization of the system

Data challenges
Past experience: three MMCS data challenges
- Mar/Apr 2005: 10% failure
- July 2005: 3.9% failure
- Sept 2005: 3.4% failure
Improvements:
- underlying middleware
- operation of services
- many lessons learnt (data management, additional checks)
Last data challenge (December – today):
- Successful: data available, MMCS output registered on the Grid
- FAILED: 4567
- Done (Failed):
- Done (Success):
- Scheduled: 86
- Submitted: 9
- Aborted:
- Waiting: 473

Useless status of jobs
The data storage site is selected in the JDL:

    ...
    OutputData = {
      [
        OutputFile      = "data/cer012345";
        LogicalFileName = "lfn:mmcs_cer012345";
        StorageElement  = "castorgrid.pic.es";
      ],
      ...
    };

The WMS should register the file automatically on the Grid.
BUT: if the job fails (RLS service down, SE not available, ...), the WMS still reports the status as "Done (Successful)".
"Done (Successful)" has NO meaning for the output data specified in the JDL!
A more sophisticated system is necessary for a production system - we developed it on our own (as every VO?).
Can we get a WMS that takes data output into account?
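Because "Done (Success)" says nothing about whether the OutputData files were actually registered, a production system has to verify registration itself before it marks a job as finished. Below is a minimal sketch of such a check; the replica-listing command (lcg-lr), the status names and the jobs table are assumptions about the tooling of that era, not the MAGIC implementation.

    # After the WMS reports "Done (Success)", independently verify that the
    # expected logical file really has a replica before marking the job done.
    import sqlite3
    import subprocess

    def has_replica(lfn: str) -> bool:
        result = subprocess.run(["lcg-lr", "--vo", "magic", lfn],
                                capture_output=True, text=True)
        return result.returncode == 0 and result.stdout.strip() != ""

    def finalize(db, run: int, wms_status: str):
        lfn = "lfn:mmcs_cer%06d" % run            # naming as in the JDL above
        if wms_status == "Done (Success)" and has_replica(lfn):
            new_status = "OUTPUT_REGISTERED"
        else:
            new_status = "OUTPUT_MISSING"          # candidate for resubmission
        db.execute("UPDATE jobs SET status = ? WHERE run = ?", (new_status, run))
        db.commit()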

Missing VO support in WMS
Mass production is managed by one member of the VO: the VO production manager
- There should be no need to be a Grid expert!
- Every job is assigned to him exclusively:
    edg-job-submit --vo magic mmcs_….jdl
- NO other member of the VO can get information
  – about the status of the job (edg-job-status)
  – about the stdout/stderr of the job (edg-job-get-output)
The basic commands MUST have more VO support!

Meta data base
The output data files should be stored and registered on the Grid. But the files are only useful if "content describing" information can be attached to them!
"From storage to knowledge" - "from Grid to e-Science"
We implemented a "separate" meta data base that links this information to the file URI.
→ One extensible framework for replica and meta data services would be nice!
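As an illustration of what "content describing" information attached to a file URI can look like, here is a small, assumed table layout in SQLite; the actual MAGIC meta data base schema is not shown in the talk, so column names and units are hypothetical.

    # Assumed, simplified schema linking a Grid file (LFN) to the physics
    # metadata needed to find it again ("from storage to knowledge").
    import sqlite3

    SCHEMA = """
    CREATE TABLE IF NOT EXISTS mc_files (
        lfn          TEXT PRIMARY KEY,   -- e.g. lfn:mmcs_cer012345
        run          INTEGER,
        stage        TEXT,               -- mmcs / reflector / camera / merged
        particle     TEXT,               -- gamma, proton, ...
        energy_min   REAL,               -- GeV
        energy_max   REAL,               -- GeV
        theta_deg    REAL,
        phi_deg      REAL,
        produced_at  TEXT                -- ISO timestamp
    );
    """

    def find_files(db, particle, emin, emax):
        """All registered LFNs for a particle type overlapping an energy range."""
        return [row[0] for row in db.execute(
            "SELECT lfn FROM mc_files WHERE particle = ? "
            "AND energy_max >= ? AND energy_min <= ?", (particle, emin, emax))]

    if __name__ == "__main__":
        db = sqlite3.connect("magic_mc.db")
        db.executescript(SCHEMA)
        print(find_files(db, "proton", 10.0, 100.0))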

Workflows
The MAGIC Monte Carlo system is a good example of a scientific workflow: 1000 jobs can be started in parallel (embarrassingly parallel!).
MAGIC looked for a middleware tool which supports workflows
- using a standard workflow description
- with support for self-recovery of failed jobs: if 3% of jobs "fail", that is 30 out of 1000 - without this feature NO workflow will succeed!
There are tools around, but we need something like a "best practice guide" for one tool. We don't want to program it on our own on top of the meta data base!
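The self-recovery requirement can be illustrated with a very small driver loop: resubmit anything whose output is missing, up to a bounded number of attempts, and only declare the workflow finished when every job has either succeeded or been given up. This is a generic sketch under assumed status names, not a recommendation for a particular workflow tool.

    # Generic self-recovery sketch: with ~3% failures per pass, a bounded
    # resubmission loop is what lets a 1000-job workflow actually finish.
    import sqlite3

    MAX_ATTEMPTS = 5

    def recover(db):
        rows = db.execute(
            "SELECT run, attempts FROM jobs WHERE status = 'OUTPUT_MISSING'").fetchall()
        for run, attempts in rows:
            if attempts >= MAX_ATTEMPTS:
                db.execute("UPDATE jobs SET status = 'GIVEN_UP' WHERE run = ?", (run,))
            else:
                # back to NEW: the executor (see the sketch above) picks it up again
                db.execute(
                    "UPDATE jobs SET status = 'NEW', attempts = attempts + 1 WHERE run = ?",
                    (run,))
        db.commit()

    def workflow_done(db) -> bool:
        open_jobs = db.execute(
            "SELECT COUNT(*) FROM jobs WHERE status NOT IN "
            "('OUTPUT_REGISTERED', 'GIVEN_UP')").fetchone()[0]
        return open_jobs == 0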

Experience – reliability
2005: three different data challenges
- March/April: 10.4% failed jobs
- July: 3.8% failed jobs
- September: 3.1% failed jobs
→ The EGEE infrastructure became more reliable!
Mass production: started in December after a training of users at FZK
There is always a reason for failure - deployment is a challenge too:
- New Year
- Christmas in Spain
- LCG 2.7

MAGIC Grid is reality
- Production of MC using MAGIC Grid resources started in December!
  – We plan to ask (temporarily) for more CPUs for stress testing!
- The MAGIC collaboration will put their real data on the Grid
- The challenges for computing will increase with the second telescope
EGEE – MAGIC Grid

MAGIC Grid – future prospects
MAGIC is a good example
- to do e-Science
- to use the e-Infrastructure
- to exploit Grid technology
What about a "GRID" of different VHE gamma-ray observatories?
"Towards a virtual observatory for VHE γ-rays"
- HESS (EU/Africa)
- Veritas (US)
- MAGIC (EU)
- CANGAROO (AUS/JP)

Experience – InputProcOutput
How do we submit a job? → JDL
The JDL should specify
- where to get the input from: InputSandBox, InputData
- what to run: Executable
- where to store the output: OutputSandBox, OutputData
In practice:
- File on UI → InputSandBox (OK!)
- File to UI → OutputSandBox (OK!)
- File on Grid → InputData: no file transfer!
- File to Grid → OutputData
Answer from the experts: write a script that copies the file from an SE to the WN.
BUT: I don't want to implement a WORKAROUND for basic Grid functionality!
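The "workaround" the experts suggest - a job wrapper, shipped in the InputSandbox, that copies the input file from a storage element to the worker node before the real executable starts - might look like the sketch below. The copy command (lcg-cp), the file names and the mmcs executable call are assumptions about the data-management tools of that time, not part of the MAGIC software.

    # Job-wrapper sketch: stage the input file from the Grid to the worker node,
    # run the real executable, and fail loudly if the copy does not work.
    import os
    import subprocess
    import sys

    def stage_in(lfn: str, local: str):
        cmd = ["lcg-cp", "--vo", "magic", lfn, "file://" + os.path.abspath(local)]
        if subprocess.run(cmd).returncode != 0:
            sys.exit("stage-in failed for " + lfn)   # make the failure visible to the WMS

    if __name__ == "__main__":
        lfn = sys.argv[1]                             # e.g. lfn:mmcs_cer012345
        stage_in(lfn, "input.dat")
        rc = subprocess.run(["./mmcs", "input.dat"]).returncode   # hypothetical executable
        sys.exit(rc)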

Experience – Execution
Data challenge Grid-1: 12M hadron events, jobs needed; started March 2005, up to now ~4000 jobs
First tests: with manual GUI submission
Reasons for failure:
- network problems
- RB problems
- queue problems
A job is successful when its output file is registered at PIC.
Diagnostics: no tools found; complex and time consuming
→ use the meta data base, log the failure, resubmit and don't care
170/3780 jobs failed → 4.5% failure