1 The German HEP Community Grid. P.Malzacher@gsi.de, for the German HEP Community Grid. 27-March-2007, ISGC2007, Taipei. Agenda: D-Grid in context, HEP Community Grid, HEP-CG Work Packages, Summary.

2 ~10,000 scientists from 1,000 institutes in more than 100 countries investigate fundamental problems of particle physics with the help of huge accelerators.

3 Collision rate: 40 MHz; 200 events per second are stored for further processing. One event is ~1.6 MB, i.e. 320 MB/s; ~1000 events per file, i.e. 2 GB files, one 2 GB file every 5 seconds. The scale and the cost of LHC computing - data storage, simulation, reconstruction and analysis - require a distributed model. One interesting event (+ 30 background events): decay of a Higgs particle into four muons; all tracks with pt > 2 GeV; reconstructed tracks with pt > 25 GeV. Per experiment: ~10 PetaByte/year, ~10^9 events/year, ~10^3 batch and interactive users. Tier model: Tier 0 → 4-10 Tier 1 → 3-5 Tier 2 per Tier 1 → Tier 3, with 320 MB/s into the Grid.
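
A quick back-of-the-envelope check of the quoted rates (my addition, not part of the slide), assuming 200 stored events/s at ~1.6 MB each and ~1000 events per file:

#include <cstdio>

int main() {
    const double eventSizeMB   = 1.6;     // ~1.6 MB per stored event
    const double storedRateHz  = 200.0;   // events stored per second
    const double eventsPerFile = 1000.0;  // ~1000 events per file

    std::printf("throughput : %.0f MB/s\n", eventSizeMB * storedRateHz);         // ~320 MB/s
    std::printf("file size  : ~%.1f GB\n", eventSizeMB * eventsPerFile / 1024);  // ~1.6 GB, quoted as 2 GB files
    std::printf("file period: %.0f s\n",   eventsPerFile / storedRateHz);        // one file every ~5 s
    return 0;
}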

4 D-Grid in context: e-Science in Germany. Timeline 2000-2010 (today = 2007): EDG → EGEE → EGEE 2 → EGEE 3 (?); LCG R&D → WLCG ramp-up (Mar.-Sep. pp run, Oct. HI run); GridKa/GGUS; D-Grid Initiative: DGI → DGI 2, plus Community Grids (including HEP CG) and commercial uptake of services.

5 www.d-grid.de

6 e-Science = Grid Computing & Knowledge Management & e-Learning. Community Grids: AstroGrid, MediGrid, C3 Grid, HEP CG, InGrid, TextGrid; underneath, the D-Grid Integration Project provides the generic platform and generic Grid services. See the talk of Anette Weisbecker in Life Sciences I.

7 Sites: PC², RRZN, TUD, RZG, LRZ, RWTH, FZJ, FZK, FHG/ITWM, Uni-KA. D-Grid WPs: Middleware & Tools, Infrastructure, Network & Security, Management & Sustainability. Middleware: Globus 4.x, gLite (LCG), UNICORE, GAT and GridSphere. Data management: SRM/dCache, OGSA-DAI, metadata schemas. VO management: VOMS and Shibboleth. See the talks of Thomas Fieseler (Operation I) and Michael Rambadt (Middleware II).

8 LHC groups in Germany. ALICE: Darmstadt, Frankfurt, Heidelberg, Münster. ATLAS: Berlin, Bonn, Dortmund, Dresden, Freiburg, Gießen, Heidelberg, Mainz, Mannheim, München, Siegen, Wuppertal. CMS: Aachen, Hamburg, Karlsruhe. LHCb: Heidelberg, Dortmund.

9 German HEP institutes on the WLCG monitoring map. WLCG sites: Karlsruhe (GridKa & Uni), DESY, GSI, München, Aachen, Wuppertal, Münster, Dortmund, Freiburg.

10 HEP CG partners. Project partners: Uni Dortmund, TU Dresden, LMU München, Uni Siegen, Uni Wuppertal, DESY (Hamburg & Zeuthen), GSI. Via subcontract: Uni Freiburg, Konrad-Zuse-Zentrum Berlin. Unfunded: Uni Mainz, HU Berlin, MPI f. Physik München, LRZ München, Uni Karlsruhe, MPI Heidelberg, RZ Garching, John von Neumann Institut für Computing, FZ Karlsruhe.

11 Focus on tools to improve data analysis for HEP and Astroparticle Physics. Focus on gaps, do not reinvent the wheel. Data management: advanced scalable data management; job- and data co-scheduling; extendable metadata catalogues for Lattice QCD and Astroparticle Physics. Job monitoring and automated user support: information services; improved job-failure treatment; incremental results of distributed analysis. End-user data analysis tools: physics- and user-oriented job scheduling, workflows, automatic job scheduling. All development is based on LCG/EGEE software and will be kept compatible!

12 HEP CG WP1: Data Management. Coordination: P.Fuhrmann, DESY. Developing and supporting a scalable Storage Element based on Grid standards (DESY, Uni Dortmund, Uni Freiburg; unfunded: FZK). Combined job and data scheduling, accounting and monitoring of the data used (Uni Dortmund). Development of Grid-based, extendable metadata catalogues with semantic, world-wide access (DESY, ZIB; unfunded: Humboldt Uni Berlin, NIC).

13 Scalable Storage Element: dCache (dCache.ORG). The dCache project is funded by DESY, Fermilab, Open Science Grid and in part by the Nordic Data Grid Facility. HEP CG contributes professional product management: code versioning, packaging, user support and test suites. dCache scales from a minimal installation (only one host, ~10 TB, zero maintenance) up to thousands of pools, PB of disk storage and hundreds of file transfers per second, with not more than 2 FTEs.

14 dCache: The principle. Protocol engines (doors) on top of the dCache controller: streaming data via (gsi)FTP and http(g); POSIX-like I/O via xRoot and dCap; storage control via SRM; an information protocol. Below: managed disk storage and an HSM adapter to the backend tape storage.

15 dCache: The integration. In-site: the Compute Element accesses the Storage Element via dCap/rfio/root and finds it through the Information System. Out-site: wide-area transfers go via gsiFtp over FTS channels (File Transfer Service), negotiated through SRM (Storage Resource Manager protocol) across the firewall.
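
As an illustration of the in-site POSIX-like access path, a minimal sketch (not from the slides) of an analysis job opening a file through the dCap door with ROOT; the host name, path and tree name are placeholders:

#include "TFile.h"
#include "TTree.h"

void readFromDCache() {
   // ROOT resolves the dcap:// URL via its plugin mechanism and talks to
   // the dCap door of the dCache storage element.
   TFile *f = TFile::Open("dcap://dcache-door.example.org/pnfs/example.org/data/run123.root");
   if (!f || f->IsZombie()) return;

   TTree *tree = 0;
   f->GetObject("events", tree);   // "events" is a placeholder tree name
   if (tree) tree->Print();
   f->Close();
}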

16 CPU and data co-scheduling: online vs. nearline files, information about the time needed to bring a file online.

17 HEP CG WP2: Job Monitoring + User Support Tools. Coordination: P.Mättig, Uni Wuppertal. Development of a job information system (TU Dresden). Development of an expert system to classify job failures and to treat the most common errors automatically (Uni Wuppertal; unfunded: FZK). R&D on interactive job steering and access to temporary, incomplete analysis job results (Uni Siegen).

18 User-specific job and resource-usage monitoring.

19 Integration into GridSphere. Focus on the many-jobs scenario. Ease of use: the user should not need to know more than necessary, which should be almost nothing. From general to detailed views on jobs: information like status, resource usage by jobs, output, time lines etc. Interactivity: zooming in the display, clicking shows detailed information.

20 Development of an expert system to classify job failures and to treat the most common errors automatically. Motivation: thousands of jobs/day in the LHC Computing Grid (LCG); the job status at run-time is hidden from the user; manual error tracking is difficult and can take long; current monitoring is more resource- than user-oriented (GridICE, ...). Therefore: monitoring on script level → JEM; automation necessary → expert system.

21 JEM: Job Execution Monitor. On the gLite/LCG worker node: pre-execution test; supervision of commands (Bash, Python); status reports via R-GMA; visualisation via GridSphere; expert system for error classification. Integration into the ATLAS software environment; integration into GGUS. Post D-Grid I: automatic error correction, ...?
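
To make the expert-system idea concrete, a minimal, hypothetical sketch of rule-based failure classification: match known error patterns in the job log and map the most common ones to an automatic action. The patterns, failure classes and actions are illustrative only, not JEM's actual rule base:

#include <iostream>
#include <regex>
#include <string>
#include <vector>

struct Rule {
    std::regex pattern;   // what to look for in the job output
    std::string failure;  // error class reported to the user
    std::string action;   // suggested automatic treatment
};

std::string classify(const std::string &log, const std::vector<Rule> &rules) {
    for (const auto &r : rules)
        if (std::regex_search(log, r.pattern))
            return r.failure + " -> " + r.action;
    return "unknown failure -> forward to user support";
}

int main() {
    std::vector<Rule> rules = {
        {std::regex("No space left on device"), "full scratch disk",     "resubmit to another site"},
        {std::regex("[Pp]roxy.*expired"),       "expired Grid proxy",    "ask user to renew proxy"},
        {std::regex("command not found"),       "broken software setup", "rerun pre-execution test"},
    };
    std::cout << classify("bash: athena: command not found", rules) << std::endl;
    return 0;
}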

22 HEP CG WP3: Distributed Interactive Data Analysis. Coordination: P.Malzacher, GSI (LMU, GSI; unfunded: LRZ, MPI M, RZ Garching, Uni Karlsruhe, MPI Heidelberg). Optimize application-specific job scheduling; analyze and test the required software environment; job management and bookkeeping of distributed analysis; distribution of the analysis and summing up of the results. Interactive analysis: creation of a dedicated analysis cluster; dynamic partitioning of Grid analysis clusters.

23 Start with a gap analysis. LMU: investigating job-scheduler requirements for distributed and interactive analysis; the GANGA (ATLAS/LHCb) project shows good features for this task; used for MC production, reconstruction and analysis on LCG. GSI: analysis based on PROOF; investigating different versions of PROOF clusters; connect ROOT and gLite: TGlite. The abstract TGrid interface in ROOT (excerpt as on the slide, elisions kept):

class TGrid : public TObject {
public:
   ...
   virtual TGridResult *Query( ... );
   static TGrid *Connect(const char *grid, const char *uid = 0,
                         const char *pw = 0, ... );
   ...
   ClassDef(TGrid, 0)
};
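
For orientation, a minimal usage sketch (my addition) of this abstract interface; the connect string "glite://" and the catalogue path/pattern are placeholders that depend on the installed Grid plugin and the VO setup:

#include "TGrid.h"
#include "TGridResult.h"

void queryGrid() {
   // Connect() loads the plugin matching the URL scheme and returns a
   // concrete TGrid implementation (also reachable via the global gGrid).
   TGrid *grid = TGrid::Connect("glite://");
   if (!grid) return;

   // Query the file catalogue; the result is a list of matching files.
   TGridResult *res = grid->Query("/grid/myvo/data", "*.root");
   if (res) res->Print();
}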

24 GANGA, job split approach: query the catalog, split the input data files, submit the jobs (each running myAna.C) through the queue manager, collect the outputs from storage and merge them for the final analysis. Characteristics: "static" use of resources; jobs are frozen: 1 job per worker node; splitting at the beginning, merging at the end; limited monitoring (only at the end of each single job).
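
A minimal sketch (not Ganga itself, which is Python-based) of this split/merge pattern, assuming a fixed number of files per sub-job and ROOT's TFileMerger for the final merge; file names and the chunk size are placeholders:

#include <algorithm>
#include <string>
#include <vector>
#include "TFileMerger.h"

// Split the catalogue query result into chunks of `filesPerJob` files each.
std::vector<std::vector<std::string>> splitJobs(const std::vector<std::string> &files,
                                                size_t filesPerJob) {
    std::vector<std::vector<std::string>> jobs;
    for (size_t i = 0; i < files.size(); i += filesPerJob)
        jobs.emplace_back(files.begin() + i,
                          files.begin() + std::min(i + filesPerJob, files.size()));
    return jobs;
}

// After all sub-jobs have finished, merge their output files into the final result.
void mergeOutputs(const std::vector<std::string> &outputs, const char *finalFile) {
    TFileMerger merger;
    merger.OutputFile(finalFile);
    for (const auto &o : outputs)
        merger.AddFile(o.c_str());
    merger.Merge();
}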

25 The PROOF approach: the user sends a query (data file list, myAna.C) to the PROOF master, the scheduler distributes the work to the workers, which read the files from storage; the merged final outputs and feedback flow back to the user. Characteristics: the farm is perceived as an extension of the local PC; same macro and syntax as in a local session; more dynamic use of resources; real-time feedback; automated splitting and merging.
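
A minimal sketch (my addition) of how the same analysis macro runs locally or on a PROOF farm; the master host, tree name, file names and selector are placeholders:

#include "TChain.h"
#include "TProof.h"

void runAnalysis(bool useProof = true) {
   TChain chain("events");            // placeholder tree name
   chain.Add("data/run123.root");     // placeholder input files
   chain.Add("data/run124.root");

   if (useProof) {
      // Open a session on the PROOF master; the workers process the
      // chain in parallel and the results are merged automatically.
      TProof::Open("proof-master.example.org");
      chain.SetProof();
   }

   // Same call as in a purely local ROOT session.
   chain.Process("myAna.C+");
}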

26 Summary: Rather late compared to other national Grid initiatives, a German e-science program is well under way. It is built on top of three different middleware flavors: UNICORE, Globus 4 and gLite. The HEP-CG production environment is based on LCG/EGEE software. The HEP-CG focuses on gaps in three work packages: data management, automated user support and interactive analysis. Challenges for HEP: very heterogeneous disciplines and stakeholders; LCG/EGEE is not the basis for many of the other partners. More information: I showed only a few highlights; for more see http://www.d-grid.de

