
Slide 1: GridPP3 (David Britton, 27/June/2006)

Slide 2: Life after GridPP2

We propose a 7-month transition period for GridPP2, followed by a three-year co-development programme with the LHC Computing Grid, the proposed European Grid Infrastructure (EGI), the particle physics experiments, and the institutes. The GridPP3 project, a continuation of GridPP, will deliver a full-scale Grid for exploitation to meet the reconstruction, simulation and analysis requirements of experiments across the particle physics programme.

Timeframe: GridPP2+ runs Sep 07 to Mar 08; GridPP3 runs Apr 08 to Mar 11.
Budget: has not been pre-specified. The input to the exploitation review was £36.6m for this period, which is clearly at (or above) the upper limit.

Slide 3: GridPP2+

In the 7-month period from Sep-07 to Apr-08 we propose (following the suggestion of the Oversight Committee) to continue the GridPP2 project largely as-is, primarily in order to:
1) Sort out issues with the timeframe for the PPRP process and post-extension in Sep 07.
2) Provide continuity of management and support over the expected start-up phase of the LHC.
3) Align future projects with financial years, with EGEE and a possible future EGI project, and with other grants in the UK.

The proposal is to continue all GridPP2 posts in this period except for the application posts (which have been applied for via the rolling-grant mechanism). We hope (need) to use the GridPP2+ period to install and commission a substantial pulse of hardware so as to be ready for the start of the LHC.

Slide 4: Earth, Wind, Water, Fire

Slide 5: Proto-GridPP3 PMB

Roles: CB Chair; Project Leader; Deputy Project Leader; Project Manager; Deployment Board Chair; Technical Coordinator; User Board Chair; LCG Liaison; CERN Liaison; EU Liaison; Budget Holder; Network Liaison; NGS Liaison; Production Manager; Outreach.

Names (in the order listed): Steve Lloyd or replacement; Dave Britton; (John Gordon); Sarah Pearce; Steve Lloyd; Tony Doyle; (John Gordon); (Tony Cass); (Robin Middleton); (Pete Clarke); (Jeremy Coles); Sarah Pearce; Dave Kelsey.

Slide 6: GridPP3 Deployment Board

In GridPP2, the Deployment Board is squeezed into a space already occupied by the Tier-2 Board, the D-TEAM, and the PMB. Many meetings have been "joint" with one of these other bodies, and its identity and function have become blurred.

In GridPP3, we propose a combined Tier-2 Board and Deployment Board with overall responsibility for deployment strategy to meet the needs of the experiments. In particular, this is a forum where providers and users formally meet. It deals with:
1) Issues raised by the Production Manager which require strategic input.
2) Issues raised by users concerning the service provision.
3) Issues to do with Tier-1 to Tier-2 relationships.
4) Issues to do with Tier-2 allocations, service levels, and performance.
5) Issues to do with collaboration with Grid Ireland and the NGS.

Slide 7: GridPP3 DB Membership

1) Chair
2) Production Manager
3) Technical Coordinator
4) The four Tier-2 Management Board chairs
5) Tier-1 Board Chair
6) ATLAS, CMS and LHCb representatives
7) User Board Chair
8) Grid Ireland representative
9) NGS representative
10) Technical people invited for specific issues

The above list gives ~13 core members, 5 of whom are probably on the PMB. This is a move away from the technical focus of the current DB: the board becomes a forum where the deployers meet each other and hear directly from the main users. The latter is designed to ensure buy-in by the users to strategic decisions.

Slide 8: LHC Hardware Requirements

GridPP Exploitation Review input: took the global hardware requirements and multiplied by the UK authorship fraction: ALICE 1%, ATLAS 10%, CMS 5%, LHCb 15%.

Using "authors" in the denominator is problematic when not all authors (globally) have an associated Tier-1: such an algorithm applied globally would not result in sufficient hardware. GridPP has asked the experiments for their requirements, and their input (relative to their global requirements) is: ALICE ~1.3%, ATLAS ~13.7%, CMS ~10.5%, LHCb ~16.8%.

[Formulas on the slide contrast: (Global Requirements) x (Global T1 author frac.), which amounts to only ~50% x (Global Requirements) in total, versus an equal split of (Global Requirements) / (Number of Tier-1s) per Tier-1 ~ UK authorship fraction??]
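
To make the shortfall concrete, here is a minimal Python sketch of the comparison. All numbers except the UK fractions quoted above are invented for illustration; the real Tier-1 countries and their authorship fractions are not given on the slide.

```python
# Hypothetical illustration of slide 8's point: allocating Tier-1 hardware by
# authorship fraction under-provisions globally, because not all authors have
# an associated Tier-1. All numbers except the UK fraction are invented.

GLOBAL_REQ_TB = 10_000  # assumed global disk requirement for one experiment

# Assumed authorship fractions of the countries that host a Tier-1; per the
# slide, these sum to only ~50% because many authors have no associated Tier-1.
t1_author_frac = {"UK": 0.10, "US": 0.20, "FR": 0.08, "DE": 0.07, "IT": 0.05}

provisioned = sum(f * GLOBAL_REQ_TB for f in t1_author_frac.values())
print(f"authorship algorithm: {provisioned:.0f} of {GLOBAL_REQ_TB} TB "
      f"({provisioned / GLOBAL_REQ_TB:.0%} of the requirement)")

# An equal split across the Tier-1s covers the requirement by construction.
n_tier1 = len(t1_author_frac)
print(f"equal split: {GLOBAL_REQ_TB / n_tier1:.0f} TB per Tier-1 "
      f"({1 / n_tier1:.0%} each)")
```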

Slide 9: Proposed Hardware

The input from the User Board was that the hardware requirements in the GridPP3 proposal should be: those defined by the LHC experiments; plus those defined by BaBar (historically well understood); plus a 5% provision for "other" experiments at the Tier-2s only.

Slide 10: Hardware Costs

Kryder's Law for disk cost; Moore's Law for CPU cost. Hardware costs are extrapolated from recent purchases. However, experience tells us that there are fluctuations associated with technology steps, so there is significant uncertainty in the integrated cost. The model must factor in:
- the operational life of equipment;
- known operational overheads;
- the lead time for delivery and deployment.
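
As a rough illustration of this kind of extrapolation, the sketch below assumes an exponential price decline with a fixed halving time. The halving time is an assumption (Kryder-like), not a proposal figure; the anchor price is the January 2007 disk price quoted on slide 14.

```python
from datetime import date

def unit_cost(anchor_cost: float, anchor: date, when: date,
              halving_years: float) -> float:
    """Extrapolated unit cost, assuming the price halves every halving_years."""
    years = (when - anchor).days / 365.25
    return anchor_cost * 0.5 ** (years / halving_years)

# Disk: anchor ~1.0 kGBP/TB in Jan 2007 (the price quoted on slide 14),
# with an assumed Kryder-like halving time of 1.5 years.
for year in (2008, 2009, 2010, 2011):
    cost = unit_cost(1.0, date(2007, 1, 1), date(year, 1, 1), halving_years=1.5)
    print(f"Jan {year}: ~{cost:.2f} kGBP/TB")
```

Technology steps (e.g. a delayed move to a new platter or process generation) show up as deviations from this smooth curve, which is why the integrated cost carries significant uncertainty.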

Slide 11: Hardware Costs: Tape

Slide 12: Tier-1 Hardware

Slide 13: Tier-2 Resources

In GridPP2 we paid for staff in return for the provision of hardware, which is not a sustainable model. We need a transition to a sustainable model that generates sufficient (but not excessive) hardware, and which institutes will buy into. Such a model should acknowledge that:
- we are building a Grid (not a computer centre);
- historically, Tier-2s have allowed us to lever resources and funding;
- Tier-2s are designed to provide different functions and different levels of service from the Tier-1;
- dual-funding opportunities may continue for a while;
- institutes may gain strategically by continuing to be part of the "world's largest Grid".

Slide 14: Tier-2 Hardware

Model (for the proposal) endorsed by the CB:
- GridPP funds ~15 FTE at the Tier-2s.
- Tier-2 hardware requirements are defined by the UB request.
- GridPP pays the cost of purchasing hardware to satisfy the following year's requirements at the current year's price, divided by the nominal hardware lifetime (~4 years for disk; ~5 years for CPU). E.g. 2253 TB of disk is required in 2008; in January 2007 this would cost ~£1.0k/TB, i.e. ~£2253k, so with a 4-year lifetime the 1-year "value" is 2253/4 ≈ £563k.

Note: this does not necessarily reimburse the full cost of the hardware, because in subsequent years the money GridPP pays depreciates with the falling cost of hardware, whereas the Tier-2s that actually made a purchase are locked into a cost determined by the purchase date. However, GridPP pays the cost up to 1 year before the actual purchase date, and institutes which already own resources can delay the spend further.
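
A minimal sketch of the payment formula as the slide describes it, reproducing the worked example. The ~30%/year price decline used to show the depreciation effect is an assumption, not a proposal figure.

```python
# Slide 14's payment model: GridPP pays (next year's requirement) x
# (current-year unit price) / (nominal hardware lifetime).

def annual_payment(required_tb: float, price_kgbp_per_tb: float,
                   lifetime_years: float) -> float:
    """One year's payment for disk under the proposed model, in kGBP."""
    return required_tb * price_kgbp_per_tb / lifetime_years

# The slide's worked example: 2253 TB needed in 2008, ~1.0 kGBP/TB in
# Jan 2007, 4-year disk lifetime -> ~563 kGBP.
print(annual_payment(2253, 1.0, 4))  # 563.25

# Depreciation effect: with an assumed ~30%/year price fall, later payments
# for the same capacity shrink, while a site that bought early is locked
# into the price at its purchase date.
price = 1.0
for year in range(2007, 2011):
    print(year, round(annual_payment(2253, price, 4), 1), "kGBP")
    price *= 0.7
```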

Slide 15: Tier-2 Resources

Sanity checks:

1) We can apply the model and compare the cost of hardware at the Tier-1 and the Tier-2s, integrated over the lifetime of the project:

                         Tier-1    Tier-2
  CPU  (k£/KSI2K-year)   0.070     0.045
  DISK (k£/TB-year)      0.144     0.109
  TAPE (k£/TB-year)      0.052     n/a

2) Total cost of ownership: we can compare the total cost of the Tier-2 facilities with the cost of placing the same hardware at the Tier-1 (assuming that doubling the Tier-1 hardware requires a 35% increase in staff). Including staff and hardware, the cost of the Tier-2 facilities is ~80% of the cost of an enlarged Tier-1.
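
A small sketch of sanity check (1), using the per-unit, per-year costs from the table above. The capacities and period are placeholders; the proposal's actual capacity profile is not reproduced here, and tape is omitted since only a Tier-1 figure is quoted.

```python
# Integrated hardware cost comparison using slide 15's unit costs (kGBP per
# unit-year). Capacities below are hypothetical placeholders.

COSTS = {
    "cpu_ksi2k": {"tier1": 0.070, "tier2": 0.045},
    "disk_tb":   {"tier1": 0.144, "tier2": 0.109},
}

def integrated_cost(site: str, cpu_ksi2k: float, disk_tb: float,
                    years: float) -> float:
    """Hardware cost (kGBP) of holding the given capacity for `years`."""
    return years * (COSTS["cpu_ksi2k"][site] * cpu_ksi2k +
                    COSTS["disk_tb"][site] * disk_tb)

# Hypothetical capacity held for the 3-year GridPP3 period:
cpu, disk, years = 3000, 2253, 3
for site in ("tier1", "tier2"):
    print(site, round(integrated_cost(site, cpu, disk, years)), "kGBP")
```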

Slide 16: Running Costs (work in progress)

Slide 17: Total Hardware Cost

This is in addition to ~£1.6m of GridPP2 money, which is likely to be problematic!

Slide 18: Tier-1 Service

"Tier1 Centres provide a distributed permanent back-up of the raw data, permanent storage and management of data needed during the analysis process, and offer a grid-enabled data service. They also perform data-intensive analysis and re-processing, and may undertake national or regional support tasks, as well as contribute to Grid Operations Services." [LCG MoU]

The exact role of the Tier-1 varies from experiment to experiment, and is provided in detail in the individual experiments' TDRs. Broadly, however, the Tier-1 will carry out the following tasks:
- acceptance of an agreed share of raw data from the Tier-0 Centre, keeping up with data acquisition;
- acceptance of an agreed share of first-pass reconstructed data from the Tier-0 Centre;
- acceptance of processed and simulated data from other centres of the WLCG;
- recording and archival storage of the accepted share of raw data (distributed back-up);
- recording and maintenance of processed and simulated data on permanent mass storage;
- provision of managed disk storage providing permanent and temporary data storage for files and databases;
- provision of access to the stored data by other centres of the WLCG;
- operation of a data-intensive analysis facility;
- provision of other services according to agreed experiment requirements;
- ensuring high-capacity network bandwidth and services for data exchange with the Tier-0 Centre, as part of an overall plan agreed amongst the experiments, Tier-1 and Tier-0 Centres;
- ensuring network bandwidth and services for data exchange with Tier-1 and Tier-2 Centres, as part of an overall plan agreed amongst the experiments, Tier-1 and Tier-2 Centres;
- administration of databases required by experiments at Tier-1 Centres.

All storage and computational services shall be "grid enabled" according to standards agreed between the LHC experiments and the regional centres.

Roles by tier:
- ALICE: Tier-0 first-pass scheduled reconstruction; Tier-1 reconstruction, on-demand analysis; Tier-2 central simulation, on-demand analysis.
- ATLAS: Tier-0 reconstruction; Tier-1 scheduled analysis/skimming, calibration; Tier-2 simulation, on-demand analysis, calibration.
- CMS: Tier-0 reconstruction; Tier-1 scheduled analysis/skimming; Tier-2 simulation, on-demand analysis, calibration.
- LHCb: Tier-0 reconstruction; Tier-1 on-demand analysis, scheduled skimming; Tier-2 simulation.

Slide 19: Tier-1 Service

Slide 20: Tier-1 Growth

                          Now      Start of GridPP3   End of GridPP3
  Spinning disks          ~2,000   ~10,000            ~20,000
  Yearly disk failures    30-45    200-300?           400-600?
  CPU systems             ~550     ~1,800             ~2,700
  Yearly system failures  35-40    120-130?           180-200?

To achieve the levels of service specified in the MoU, a multi-skilled incident-response unit (3 FTE) is proposed. This is intended to reduce the risk of over-provisioning other work areas to cope with long-term fluctuations in fault rate. These staff will expect that their primary daily role is dealing with whatever has gone wrong, and they will also provide the backbone of the primary callout team.
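
The projections above are roughly consistent with failures scaling linearly in component count. A minimal sketch of that scaling, anchored on the "Now" column (the per-unit rates derived here are implied by the table, not stated in it):

```python
# Yearly failures assumed to scale linearly with the number of components.

def projected_failures(units_now: int, failures_now: float,
                       units_future: int) -> float:
    """Scale an observed yearly failure count linearly with component count."""
    return (failures_now / units_now) * units_future

# Disks: ~2,000 now with 30-45 failures/year -> projection for 20,000 disks.
low, high = (projected_failures(2000, f, 20000) for f in (30, 45))
print(f"20,000 disks: ~{low:.0f}-{high:.0f} failures/year")  # ~300-450

# The slide quotes 400-600?, i.e. somewhat above this naive linear scaling,
# consistent with the stated uncertainty (hence the question marks).
```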

Slide 21: Tier-1 Staff

  Work Area               GridPP3 PPARC funding   CCLRC funding
  CPU                     2.0                     0.0
  Disk                    3.0                     0.0
  Tape Service (CASTOR)   2.0                     1.3
  Core Services           1.0                     0.5
  Operations              3.0                     1.0
  Incident Response Unit  3.0                     0.0
  Networking              0.0                     0.5
  Deployment              1.5                     0.0
  Experiment Support      1.5                     0.0
  Tier-1 Management       1.0                     0.3
  Totals                  18.0                    3.6

Slide 22: Tier-2 Service

The following services shall be provided by each of the Tier-2 Centres in respect of the LHC experiments that they serve, according to policies decided by these experiments:
- provision of managed disk storage providing permanent and/or temporary data storage for files and databases;
- operation of an end-user analysis facility;
- provision of other services, such as simulation, according to agreed experiment requirements;
- provision of network services for data exchange with Tier-1 Centres, as part of an overall plan agreed between the experiments and the Tier-1 Centres concerned.

All storage and computational services shall be "grid enabled" according to standards agreed between the LHC experiments and the regional centres.

Service levels:

  Service                     Max delay (prime time)   Max delay (other periods)   Average availability (annual)
  End-user analysis facility  2 hours                  72 hours                    95%
  Other services              12 hours                 72 hours                    95%

Slide 23: Tier-2 Staff

Slide 24: Grid Deployment Staff (Operations)

A team of 8: a Production Manager, 4 Tier-2 Coordinators, and 3 GOC staff. Their activities include:
- resource and deployment planning, and scheduling upgrades;
- installation and configuration of Grid middleware services;
- support of these Grid services;
- Grid operations;
- user support;
- system-manager support;
- monitoring, accounting and auditing;
- security (both operational and policy aspects);
- documentation;
- VO management and support.

Slide 25: Grid Support Staff

Slide 26: GridPP Staff Evolution

Slide 27: Dissemination

From the call: "4. The bid(s) should: a) show how developments build upon PPARC's existing investment in e-Science and IT, leverage investment by the e-Science Core Programme, and demonstrate close collaboration with other science and industry and with key international partners such as CERN. It is expected that a plan for collaboration with industry will be presented, or a justification if such a plan is not appropriate."

For the exploitation review it was assumed that dissemination would be absorbed by PPARC; that is unlikely at this point! Presently we have effectively 1.5 FTE working on dissemination alone (Sarah Pearce plus an events officer). We want to maintain a significant dissemination activity (an insurance policy), so adding in industrial liaison suggests maintaining the level at 1.5 FTE.

Slide 28: Full Proposal

This compares with the exploitation-review input of £36,643k, which included £1,800k of running costs.

Slide 29: GridPP3 Balance

Slide 30: Status

- The GridPP3 proposal is being drafted (deadline July 13th).
- It is currently being run past the CB (by email) and the OC (on Friday).
- We request the hardware defined by the experiments.
- We request the (minimum) staff we think are required.
- Expect some iteration!

