1 Zhiling Chen (IPP-ETHZ), Doktorandenseminar, June 4th, 2009

2 Outline
 Intro of the CMS Computing Model
 Setup of the Swiss CMS Tier-3 at PSI
 Working on the Swiss CMS Tier-3
 Operational Experience

3 LCG Tier Organization
 T0 (CERN): filter farm; raw data custodial; prompt reconstruction
 7 T1s: raw data custodial (shared); re-reconstruction; skimming, calibration
 ~40 T2s: centrally scheduled MC production; analysis and MC simulation for all CMS users
 Many T3s at institutes: local institutes' users; final-stage analysis and MC simulation optimized for the users' analysis needs
 The Swiss Tier-2 serves ATLAS, CMS, LHCb, ...; the Tier-3 serves the Swiss CMS community

4 CMS Data Organization
 Physicist's view: event collections and datasets. A dataset is a set of event collections that would naturally be grouped for analysis. To process events one must find, transfer and access them.
 System view: files and file blocks. Files are grouped into blocks of reasonable size or logical content. Blocks are stored in the Grid; files are transferred and accessed in different storage systems, and replicas are managed.

5 CMS Data Organization: Data Management
CMS data management maps the physicist's view onto the system view:
 Find: the Data Bookkeeping System (DBS) answers "What data exist?" with standardized, queryable information on event data and the mapping from event collections to files/file blocks; the Data Location Service (DLS) answers "Where are data located?" by mapping file blocks to sites.
 Transfer: PhEDEx, the CMS data transfer and placement system.
 Access: LCG commands, SRM and POSIX-I/O.
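As an illustration of the access layer, the lines below sketch copying a single file out of a dCache storage element with the SRM tools. This is a hedged sketch: the endpoint, port and pnfs path are placeholders rather than the real T3 namespace, and option names can differ between client versions.

  # Copy one file from a dCache SE to local disk via SRM (placeholder paths).
  SRC="srm://t3se01.psi.ch:8443/srm/managerv2?SFN=/pnfs/psi.ch/cms/example/file.root"
  srmcp -2 "$SRC" file:////tmp/file.root            # dCache SRM client, SRM v2
  lcg-cp -b -D srmv2 "$SRC" file:///tmp/file.root   # the same copy with the LCG tools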

6 CMS Analysis Work Flow
Diagram: the Tier-3 local site (local cluster, storage element, local PhEDEx data transfer agents) interacts with the global CMS data management services (Data Bookkeeping System DBS, Data Location Service DLS, global PhEDEx data transfer agents and database, File Transfer Service) and with LHC Grid computing.
CRAB is a Python program that simplifies the creation and submission of CMS analysis jobs into a grid environment.
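Since CRAB hides the grid submission machinery behind a small set of commands, a hedged sketch of a typical analysis round with the CRAB2-era command-line client follows; the config file name and the exact option set are assumptions and depend on the installed CRAB version.

  # Typical CRAB round (CRAB2-era client; crab.cfg is a placeholder name).
  crab -create -cfg crab.cfg   # build the task from the config and the local CMSSW area
  crab -submit                 # submit the jobs to the grid or the local scheduler
  crab -status                 # poll the job states
  crab -getoutput              # retrieve the output of finished jobs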

7 Overview of Swiss CMS Tier-3
 For CMS members of ETHZ, the University of Zurich and PSI
 Located at PSI
 Tailored as closely as possible to the users' analysis needs
 Running in test mode since October 2008 and in production mode since November 2008
 30 registered physicist users
 Manager: Dr. Derek Feichtinger; Assistant: Zhiling Chen

8 Hardware of Swiss CMS Tier-3

Present computing power:
  No. of worker nodes: 8
  Processors: 2 x Xeon E5410
  Cores/node: 8
  Total cores: 64

Present storage:
  No. of file servers: 6
  Type: SUN X4500
  Space/node: 17.5 TB
  Total space: 107 TB

9 Layout of Swiss CMS Tier-3 at PSI
Diagram of the Tier-3 services and their interactions:
 User Interface: user login; submit/retrieve batch jobs and LCG jobs; accesses the home/software directories and the local SE
 Computing Element [Sun Grid Engine]: dispatches and collects batch jobs on the local cluster
 Storage Element (t3se01.psi.ch) [dCache admin, dcap, SRM, gridftp, resource info provider]: accessed locally via SRM, gridftp and dcap, and remotely by the LCG
 NFS server: home and shared software directories (CMSSW, CRAB, gLite)
 DB server [postgres, pnfs, dCache pnfs cell]
 CMS VoBox (PhEDEx): accesses the PhEDEx central DB and remote SEs
 Monitoring [ganglia collector, ganglia web front end]
Network connectivity: PSI has a 1 Gb/s uplink to CSCS.

10 Setup of Swiss CMS Tier-3
 User Interface (8 cores): t3ui01.psi.ch, a fully operational LCG UI. It enables users to:
   Log in from outside
   Submit and manage local jobs on the Tier-3 local cluster
   Interact with the LCG Grid: submit Grid jobs, access storage elements, etc.
   Interact with AFS, CVS, ...
   Test their jobs
 Local batch cluster (8 worker nodes x 8 cores):
   Batch system: Sun Grid Engine 6.1
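A minimal sketch of how a user would drive the local batch system from the UI is shown below, assuming a standard SGE setup; the script name, output files and the CMSSW configuration file are placeholders.

  #!/bin/bash
  # myjob.sh: minimal SGE job script (placeholder names)
  #$ -cwd                        # run in the submission directory
  #$ -o myjob.out -e myjob.err   # files for stdout/stderr
  cmsRun analysis_cfg.py         # placeholder CMSSW configuration

  # Then, from the UI shell:
  qsub myjob.sh                  # submit to the local cluster
  qstat                          # check the job's state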

11 Setup of Swiss CMS Tier-3 (cont.)
 Storage Element (SE): t3se01.psi.ch, a fully equipped LCG storage element running dCache. It allows users to:
   Access files from local jobs on the Tier-3 (dcap, srmcp, gridftp, etc.)
   Access files from other sites (srmcp, gridftp)
   Use extra space in addition to the space at the CSCS Tier-2
 NFS server (for small storage):
   Hosts users' home directories: analysis code, job output
   Hosts shared software: CMSSW, CRAB, gLite, ...
   Easy to access, but not suited to huge files
Note: if you need large storage space for a longer time, use the SE.
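For reads from local jobs the dcap door is usually the lightest option; below is a hedged sketch using dccp, where the port number and the pnfs path are placeholders for the site's actual namespace.

  # Copy a file from the local dCache SE inside a batch job (placeholder path and port).
  dccp dcap://t3se01.psi.ch:22125/pnfs/psi.ch/cms/example/file.root ./file.root
  # A CMSSW job can also open the file in place by putting the dcap URL
  # directly into its input file list instead of copying it first.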

12 Setup of Swiss CMS Tier-3 (cont.)
 CMS VoBox (PhEDEx):
   Users can order datasets to the Tier-3 SE
   Admins can manage datasets with PhEDEx
 Monitoring:
   Status of the batch system
   Accounting
   Worker node load
   Free storage space
   Network activity ...

13 Working on Swiss CMS Tier-3
Before submitting jobs: order the dataset
 Check which datasets are already stored at the Tier-3 via the DBS Data Discovery page
 If they are not yet at the Tier-3, order them to T3_CH_PSI through the PhEDEx central web page

14 Working on Swiss CMS Tier-3: submit and manage batch jobs
 CRAB (with our CRAB module for SGE):
   Simplifies the creation and submission of CMS analysis jobs
   Offers a consistent way to submit jobs to the Grid or to the Tier-3 local cluster
 Sun Grid Engine directly:
   More flexible, with more powerful controls: priority, job dependencies, ...
   Command line and GUI
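To illustrate the extra control native SGE submission offers, here is a hedged sketch of job dependencies and priorities; the job names, scripts and priority value are invented.

  # Chain two jobs and lower the priority of a third (invented names).
  qsub -N skim skim.sh                     # first job, named "skim"
  qsub -N ana -hold_jid skim analyze.sh    # starts only after "skim" has finished
  qsub -p -100 lowprio.sh                  # run a less urgent job at lower priority
  qmon &                                   # SGE's graphical front end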

15 Operational Experience
 User acceptance of the T3 services seems to be quite good
 Our CRAB SGE-scheduler module works well with the SGE batch system
 SGE provides a flexible and versatile way to submit and manage jobs on the Tier-3 local cluster
 Typical problems with "bad" jobs:
   CMSSW jobs that produce huge output files full of debug messages fill up the home directories quickly and can stall the cluster -> a quota is set for every user
   Jobs that issue too many parallel requests to the SE overload it and leave jobs waiting -> users should beware
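On the user side the two failure modes above translate into two simple habits; the lines below are a hedged sketch of them (the quota tool shown may differ depending on how the home-directory quotas are enforced).

  quota -s          # check your current home-directory quota and usage
  # When fetching many files from the SE, copy them one after another rather
  # than starting dozens of dccp/srmcp transfers in parallel.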

16 Upgrade Plan

Hardware upgrade:
  Phase          Year           CPU (kCINT2000)   Disk (TB)
  A (plan)       2008           180               75
  A (achieved)   2008           213.76            107.1
  B (plan)       end of 2009    500               250

Software upgrade:
 Regular upgrades of gLite and of the CMS software (CMSSW, CRAB, ...)
Upgrade under discussion: replace NFS with a parallel file system
 Better performance than NFS
 Better suited to working with large ROOT files

17 Documents and User Support
 Request an account: send email to cms-tier3@lists.psi.ch
 Users mailing list: cms-tier3-users@lists.psi.ch
 Swiss CMS Tier-3 wiki page: https://twiki.cscs.ch/twiki/bin/view/CmsTier3/WebHome

20 CMS Event Data Flow

Data formats:
  RAW  (~1.5 MB/event): detector data after online formatting; result of the HLT selections (~5 PB/year)
  RECO (~0.25 MB/event): CMSSW data format containing the relevant output of reconstruction (tracks, vertices, jets, electrons, muons, hits/clusters)
  AOD  (~0.05 MB/event): derived from the RECO information; a convenient, compact format with enough information about the event to support all typical usage patterns of a physics analysis

Event data flow (based on the hierarchy of computing tiers of the LHC Computing Grid):
 The online system sends O(10) streams (RAW) to the Tier-0, which keeps RAW and RECO on tape, runs the first-pass reconstruction and produces O(50) primary datasets
 RAW, RECO and AOD go to the Tier-1s, which keep them on tape and run scheduled data processing (skimming and reprocessing)
 RECO and AOD are distributed to the Tier-2s for analysis and MC simulation
 Tier-3s run analysis and MC simulation

