Slide 1: Computing Model
José M. Hernández, CIEMAT, Madrid
On behalf of the CMS Collaboration
XV International Conference on Computing in High Energy and Nuclear Physics (CHEP'06), Mumbai, 15 Feb 2006

Slide 2: Outline
- Architecture of the CMS distributed computing system
  - CMS computing services and workflows
  - Data and Workload Management systems
- Computing model realization
  - Production systems on the Grid
  - Data distribution, Monte Carlo production, data analysis
  - Development work
- Experience from past computing challenges
- Plans and schedule
- Summary

Slide 3: CMS Distributed Computing
- Distributed model for computing in CMS
  - Cope with the computing requirements for storage, processing and analysis of the data provided by the experiment
  - Computing resources are geographically distributed, interconnected via high-throughput networks and operated by means of Grid software
- Running expectations
  - Beam time: 2-3x10^6 s in 2007; 10^7 s in 2008, 2009 and 2010
  - Detector output rate: ~250 MB/s, giving ~2.5 PB of raw data in 2008 (rough check below)
  - Aggregate computing resources required
- CMS computing model document (CERN-LHCC-2004-035)
- CMS computing TDR released in June 2005 (CERN-LHCC-2005-023)
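A back-of-the-envelope check of the 2008 raw-data estimate, as a sketch; only the ~250 MB/s output rate and ~10^7 s of beam time quoted above are used.

    # Detector output rate x beam time ~ raw data volume per year
    rate_mb_per_s = 250        # ~250 MB/s detector output rate
    beam_time_s = 1e7          # ~10^7 s of beam time expected in 2008
    raw_pb = rate_mb_per_s * beam_time_s / 1e9       # MB -> PB (decimal: 1 PB = 1e9 MB)
    print(f"~{raw_pb:.1f} PB of raw data in 2008")   # ~2.5 PB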

Slide 4: Tiered Architecture
Tier-0:
- Accepts data from the DAQ
- Prompt reconstruction
- Data archiving and distribution to the Tier-1's
Tier-1's:
- Real data archiving
- Re-processing
- Skimming and other data-intensive analysis tasks
- Calibration
- MC data archiving
Tier-2's:
- User data analysis
- MC production
- Import skimmed datasets from Tier-1 and export MC data
- Calibration/alignment
CAF (CMS Analysis Facility at CERN):
- Access to the full raw dataset
- Focused on latency-critical detector, trigger and calibration analysis activities
- Provides some CMS central services (e.g. storing conditions and calibrations)

Slide 5: Computing Model
- Site activities and functionality largely predictable
  - Activities are driven by data location
- Organized mass processing and custodial storage at Tier-1's
- 'Chaotic' computing essentially restricted to data analysis at Tier-2's
- Resource evolution [figure: projected CPU, disk and tape capacities]

Slide 6: Guiding Principles
- Prioritization will be important
  - In 2007/08 the computing system efficiency may not be 100%
  - Cope with potential reconstruction backlogs without delaying critical data
  - Reserve the possibility of 'prompt calibration' using low-latency data
  - Also important after first reconstruction, and throughout the system, e.g. for data distribution and 'prompt' analysis
- Streaming
  - Classifying events early allows prioritization and data-access optimization, for example an 'express stream' of hot/calibration events
  - Propose O(50) 'primary datasets' and O(10) 'online streams': O(2 PB)/yr of raw data split into O(50) trigger-determined datasets of ~40 TB each (arithmetic sketched below)
- Baseline principles for 2008
  - Fast reconstruction code (reconstruct often)
  - Streamed primary datasets
  - Efficient workflow and bookkeeping systems
  - Distribution of RAW and RECO data together
  - Compact analysis data format, AOD (multiple distributed copies)
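A tiny check of the dataset-splitting figures above, as a sketch; only the O(2 PB)/yr and O(50) datasets quoted on the slide are used.

    # ~2 PB/yr of raw data split into ~50 trigger-determined primary datasets
    raw_tb_per_year = 2000       # ~2 PB expressed in TB
    n_primary_datasets = 50
    print(raw_tb_per_year / n_primary_datasets, "TB per primary dataset on average")   # ~40 TB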

Slide 7: Data Tiers and Data Volume for 2008
- RAW
  - Detector data + L1 and HLT results after online formatting
  - Includes factors for poor understanding of the detector, compression, etc.
  - 1.5 MB/evt at ~150 Hz; ~4.5 PB/year with two copies (volume check sketched below)
  - One copy at Tier-0 and one spread over the Tier-1's
- RECO
  - Reconstructed objects with their associated hits
  - 250 kB/evt; ~2.1 PB/year (incl. 3 reprocessing versions)
  - One copy spread over the Tier-1 centres (together with the associated RAW)
- AOD
  - The main analysis format; objects + minimal hit info
  - 50 kB/evt; ~2.6 PB/year; a whole copy at each Tier-1
  - Large fraction at Tier-2 centres
- Plus MC in a ~1:1 ratio with data
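A minimal sketch reproducing the RAW figure from the per-event size and rate above; it assumes the ~10^7 s of beam time from slide 3. The RECO and AOD totals additionally fold in reprocessing versions and multiple distributed copies, so they are not reproduced by this simple product.

    # RAW volume from event size, trigger rate and number of copies
    rate_hz = 150                 # ~150 Hz output rate
    beam_time_s = 1e7             # ~10^7 s of beam time (slide 3)
    raw_mb_per_event = 1.5        # 1.5 MB/evt
    copies = 2                    # one copy at Tier-0, one spread over the Tier-1's

    events_per_year = rate_hz * beam_time_s                       # ~1.5e9 events/year
    raw_pb = raw_mb_per_event * events_per_year * copies / 1e9    # MB -> PB
    print(f"RAW: ~{raw_pb:.1f} PB/year")                          # ~4.5 PB/year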

Slide 8: Resources and Data Flows in 2008 [data-flow figure]
Tier-0: 4.6 MSI2K, 0.4 PB disk, 4.9 PB tape, 5 Gbps WAN
- 225 MB/s (RAW) to tape archive
- 280 MB/s (RAW, RECO, AOD) exported to the Tier-1's (compared with the WAN capacities in the sketch below)
- Up to 1 GB/s served to worker nodes (AOD analysis, calibration)
Each Tier-1: 2.5 MSI2K, 0.8 PB disk, 2.2 PB tape, 10 Gbps WAN
- 40 MB/s (RAW, RECO, AOD) from the Tier-0
- 240 MB/s (skimmed AOD, some RAW+RECO) to its Tier-2's; 48 MB/s (MC) from them
- AOD exchange with the other Tier-1's
- 900 MB/s served to worker nodes (AOD skimming, data reprocessing)
Each Tier-2: 0.9 MSI2K, 0.2 PB disk, 1 Gbps WAN
- 60 MB/s (skimmed AOD, some RAW+RECO) from the Tier-1
- 12 MB/s (MC) to the Tier-1
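A small conversion sketch comparing some of the quoted transfer rates with the WAN capacities listed above; decimal units are assumed (1 Gbps = 125 MB/s), and the pairing of rates to links follows the listing above.

    # Convert MB/s transfer rates to Gbps and compare with the quoted WAN capacities
    def mb_per_s_to_gbps(rate_mb_per_s):
        return rate_mb_per_s * 8 / 1000            # decimal units

    print(mb_per_s_to_gbps(280))   # Tier-0 -> Tier-1's export: ~2.2 Gbps vs 5 Gbps Tier-0 WAN
    print(mb_per_s_to_gbps(240))   # Tier-1 -> its Tier-2's:    ~1.9 Gbps vs 10 Gbps Tier-1 WAN
    print(mb_per_s_to_gbps(60))    # Tier-1 -> one Tier-2:      ~0.5 Gbps vs 1 Gbps Tier-2 WAN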

Slide 9: Technical Baseline Principles
- Baseline system with minimal functionality for first physics
  - Use Grid services as much as possible, plus CMS-specific services
  - Keep it simple!
- Optimize for the common case
  - Optimize for read access (most data is write-once, read-many)
  - Optimize for organized bulk processing, but without limiting the single user
- Decouple parts of the system
  - Minimize job dependencies
  - Site-local information stays site-local
- Use explicit data placement
  - Data does not move around in response to job submission
  - All data is placed at a site through explicit CMS policy

Slide 10: WMS & DMS Services Overview (current WMS & DMS)
Data Management System:
- No global file replica catalogue
- Data Bookkeeping and Data Location Systems (RefDB + PubDB): what data exist and where they are located
- Local file catalogue
- Data access and storage: SRM and POSIX-IO-like
- Data transfer and placement system: PhEDEx
Workload Management System:
- Rely on the Grid Workload Management: reliability, performance, monitoring, resource management, priorities
- CMS-specific job submission, monitoring and bookkeeping tools

Slide 11: Work on the New DMS
- Migration from the current DMS (RefDB, PubDB, local XML POOL catalogues) to the new DMS (DBS, DLS, local trivial catalogue)
  - Track and replicate data with a granularity of file blocks (illustrated below), reducing the load on the catalogues
  - Current DMS kept for PTDR analyses (~until mid 2006)
- Data tracked and replicated by file blocks in the new DMS
  - Global- and local-scope DBS/DLS
- New DMS to be exercised with the new MC production system
- PhEDEx integration with FTS as the reliable file-transfer layer
  - PhEDEx takes care of large-scale dataset/fileblock replication, multi-hop routing following a transfer topology (Tier-0 → Tier-1's → Tier-2's), data pre-staging from tape, monitoring and bookkeeping, priorities and policy, etc.
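A conceptual sketch of block-level tracking; the names and structures below are hypothetical illustrations, not the real DBS/DLS schema or API. The point is that replica locations are recorded once per file block rather than once per file, which is what reduces the catalogue load.

    # Hypothetical data model, for illustration only (not the DBS/DLS interfaces)
    dataset_blocks = {                          # bookkeeping: which blocks make up a dataset
        "/ExampleDataset/Reco-v1": ["block#001", "block#002"],
    }
    block_files = {                             # bookkeeping: files inside each block
        "block#001": [f"evt_{i}.root" for i in range(500)],
        "block#002": [f"evt_{i}.root" for i in range(500, 1000)],
    }
    block_locations = {                         # location service: which sites host each block
        "block#001": ["Tier1_A", "Tier2_B"],    # hypothetical site names
        "block#002": ["Tier1_C"],
    }

    # Location entries scale with the number of block replicas, not with the number of files
    n_block_replicas = sum(len(sites) for sites in block_locations.values())
    n_files = sum(len(files) for files in block_files.values())
    print(f"{n_block_replicas} block-replica entries instead of O({n_files}) per-file entries")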

Slide 12: Production Systems
- Data Transfer and Placement System (Physics Experiment Data Export, PhEDEx)
  - In production for almost two years
  - Managing transfers of several TB/day; ~150 TB known to PhEDEx, ~350 TB replicated in total
  - Running at CERN, 7 Tier-1's and 15 Tier-2's
- Distributed data analysis: CMS Remote Analysis Builder (CRAB)
  - Tool for job preparation, submission and monitoring
  - ~60K analysis jobs/month
- Monte Carlo Production System (McRunjob)
  - ~10M events/month (~4 x 10K jobs), ~150M events in total
  - ~20% on OSG and ~15% on LCG; the rest in local-farm mode at big sites, although most production has run on the Grid in the last months

Slide 13: Work on the New MC Production System
- New MC production system being developed
  - Overcome current inefficiencies, introduce new capabilities and integrate with the new Event Data Model and DMS
  - Builds on the large experience gained running McRunjob on the Grid (designed for local farms and ported to the LCG last year)
  - Expected to be less manpower-consuming, with better handling of grid/site unreliability, better use of resources, automatic retries, better error reporting/handling, etc.
  - Better coupling with the data transfer system
- Job chaining, e.g. generation → simulation → digitization → reconstruction (sketched below)
  - Overcomes the bottleneck due to digitization with pile-up (I/O dominated) by chaining it with simulation (CPU dominated)
- Data merging, fileblock management
- Uses DBS/DLS
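A minimal sketch of the job-chaining idea, with hypothetical step functions rather than the real production-system interfaces: running the I/O-heavy digitization in the same job as the CPU-heavy simulation keeps intermediate files local to the worker node, so only the final output needs to be shipped out.

    # Hypothetical processing steps; each consumes the previous step's local output file
    def generate(n_events):      return f"gen_{n_events}.root"
    def simulate(gen_file):      return gen_file.replace("gen", "sim")    # CPU dominated
    def digitize(sim_file):      return sim_file.replace("sim", "digi")   # I/O dominated (pile-up)
    def reconstruct(digi_file):  return digi_file.replace("digi", "reco")

    def chained_job(n_events):
        """Run the whole chain inside one Grid job; only the final file is registered/transferred."""
        output = generate(n_events)
        for step in (simulate, digitize, reconstruct):
            output = step(output)
        return output

    print(chained_job(1000))   # -> "reco_1000.root"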

Slide 14: Experience from Computing Challenges
- It is crucial to test prototypes of Grid resources and services of increasing scale and complexity
  - Iterative process in the development of the CMS computing system
  - Scheduled CMS Computing Challenges and LCG Service Challenges
- CMS Data Challenge 2004
  - Tier-0 reconstruction at 25 Hz and data distribution to Tier-1 centres for real-time analysis using Grid interfaces
  - Put in place the CMS data transfer and placement system (PhEDEx)
  - First large-scale test of the Grid WMS (real-time analysis)
  - Problems identified: small file size (transfer, mass storage), slow central replica and metadata RLS catalogue, lack of a reliable file transfer system in LCG

Slide 15: LCG Service Challenge 3
- Computing integration test exercising the bulk data-processing part of the CMS computing model under realistic conditions
  - Test end-to-end systems of both CMS-specific and LCG services
  - Focused on validation of the data storage, transfer and data-serving infrastructure plus the required workload components for job submission
  - CERN + all 7 Tier-1 + 13 Tier-2 sites participated
- Goals
  - Data distribution Tier-0 → Tier-1's → Tier-2's
    - Throughput phase (July 2005): 280 TB, aggregate 200 MB/s sustained for days (duration checked below)
    - Service phase (Sep-Nov 2005): 290 TB, 10-20 MB/s to each Tier-1 on average over a month
  - Automatic data publishing, validation and analysis execution at Tier-1's and Tier-2's
    - 70K jobs run; 90% LCG efficiency but only 60% CMS efficiency
    - Up to 200 MB/s read data throughput from disk to CPU
- A lot of effort was spent on debugging and integration
  - Too many underlying Grid and CMS services were not sufficiently well prepared to be tested in a challenge environment; sites had not verified functionalities
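A quick check of the throughput-phase numbers above, as a sketch; it assumes the aggregate 200 MB/s rate applied continuously to the ~280 TB transferred.

    # How long does ~280 TB take at an aggregate 200 MB/s?
    volume_tb = 280
    rate_mb_per_s = 200
    days = volume_tb * 1e6 / rate_mb_per_s / 86400    # TB -> MB, then seconds -> days
    print(f"~{days:.0f} days of sustained transfer")  # ~16 days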

Slide 16: Planning and Schedule for 2006
- Roll out the new framework/event data model, new DMS and new MC production system by end of March/April
- MC production for SC4 by April/May using the new system
- LCG Service Challenge 4 starting in May/June
  - Training for CSA06
- CMS Computing, Software and Analysis Challenge (CSA06) starting in Sept/Oct
  - Computing systems commissioning
  - Validation of the new data-processing framework / new event data model
  - Demonstrate the computing system at a scale of 25% of that needed in 2008

Slide 17: Summary
- CMS has adopted a distributed computing model making use of Grid technologies
  - Steady increase in scale and complexity
- Major changes to the computing systems are under way (DMS, processing framework/EDM, MC production system)
- Major computing challenges ahead (SC4, CSA06)

