Run II Review Closeout 15 Sept., 2005 FNAL

Thanks!
–…all the hard work from the reviewees
–And all the speakers
–…hospitality of our hosts
Good progress since the last review. Things ARE working.

Data Handling
See Word doc…

Remote Analysis/Production (1)
The committee commends D0 for their extensive use of offsite resources during their data reprocessing using SAMGrid, and for promoting its development. This was viewed as a great success.
We commend CDF for their use of offsite resources for their Monte Carlo production.
The committee commends CDF for new developments utilizing existing standard grid tools (Condor glide-in) to extend their functional environment to remote resources. The reliance on external connectivity of worker nodes will be a serious limitation on their access to available resources; we encourage them to develop the infrastructure to eliminate this impediment (a minimal pilot-style sketch follows below).
Whatever method is chosen for grid job submission, CDF needs a data model with a data manager (SAM, for example) that is tightly integrated with managed storage (storage elements).
We are concerned that the manpower devoted to the Condor glide-in development effort is insufficient.
The committee would like CDF to better define their model for offsite computing (analysis and central tasks). Specifically, how are they planning to utilize grid resources to meet their needs? A better understanding of the CDF analysis model may produce economies of resources, both on-site and off-site.
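To make concrete why the worker-node connectivity point matters, here is a minimal, hypothetical Python sketch of the pilot (glide-in-style) pattern: once a pilot lands on a remote batch slot it pulls work from a central queue. Every name and URL below is invented for illustration; this is not the CDF glide-in implementation.

```python
# Hypothetical pilot-job sketch: once this script starts on a remote worker
# node (e.g. launched by a glide-in), it repeatedly pulls work from a central
# queue and runs it locally.  Note the pull model: it only works if the worker
# node has *outbound* network access to the central service -- exactly the
# external-connectivity requirement discussed in the review.
import json
import subprocess
import time
import urllib.request

QUEUE_URL = "https://head-node.example.gov/next-task"   # hypothetical central task service


def fetch_task():
    """Ask the central queue for the next task; return None when nothing is available."""
    try:
        with urllib.request.urlopen(QUEUE_URL, timeout=30) as resp:
            payload = json.load(resp)
            return payload or None
    except OSError:
        # No outbound connectivity (or service down): the pilot can do nothing useful.
        return None


def run_task(task):
    """Run the task command locally and report success/failure."""
    result = subprocess.run(task["command"], shell=True)
    return result.returncode == 0


if __name__ == "__main__":
    idle_polls = 0
    while idle_polls < 10:              # give up after ~10 consecutive empty polls
        task = fetch_task()
        if task is None:
            idle_polls += 1
            time.sleep(60)
            continue
        idle_polls = 0
        ok = run_task(task)
        print(f"task {task.get('id')} finished, success={ok}")
```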

Remote Analysis/Production (2)
The CD is to be strongly commended for their long-term vision of common solutions for computing and storage infrastructure. FermiGrid seemed like a great idea last year, and now it seems like a requirement. It is also leveraged as a useful testbed for experiment-based infrastructure development.
The progress in making SAMGrid more robust and scalable is duly noted. We strongly encourage the continued evolution of SAMGrid toward OSG/LCG interoperability.
The committee is concerned about the “heavyweight”, intrusive nature of the SAMGrid installation. We would like to see SAMGrid evolve to using only standards-based interfaces available on vanilla grid resources. We are encouraged that the SAMGrid team wants to integrate managed storage (such as SRM); a stage-in sketch follows below.
The committee feels that investment of personnel resources in continued SAMGrid development, and investment in FermiGrid infrastructure, will lead to efficiencies in the management, access, and use of future global resources.
Both experiments indicated a future need to pursue user analysis offsite. We recommend that each experiment try to assess the obstacles they face in obtaining all of the offsite resources they project will be necessary, and work with the CD to develop a detailed plan to overcome these obstacles.
As LHC computing needs ramp up after the start of LHC data-taking, the computing and storage resources that have been “free” for use in Run II computing may become scarce. We encourage each of the experiments to engage now in a discussion with the offsite computing sources they have identified, in order to negotiate the level of future access to these resources and the interfaces that will be required. These agreements should be formalized as MOUs.
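As an illustration of what "tightly integrated with managed storage" could look like at the job level, the following sketch stages an input file through an SRM-style copy client before the job runs, instead of assuming unmanaged local disk. The client invocation, endpoint and paths are assumptions for illustration only, not the actual SAMGrid/SRM integration.

```python
# Minimal sketch of a job stage-in via managed storage (an SRM-style client).
# The client name (srmcp), endpoint and file paths are placeholders; a real
# integration would also handle retries, space reservation and cleanup.
import subprocess
import sys

SRM_SOURCE = "srm://srm.example.gov:8443/pnfs/example/path/input.root"  # hypothetical
LOCAL_COPY = "file:////tmp/input.root"


def stage_in(source: str, destination: str) -> None:
    """Copy one file from the storage element to local scratch via the SRM client."""
    # Basic source/destination copy invocation; assumed for illustration.
    subprocess.run(["srmcp", source, destination], check=True)


if __name__ == "__main__":
    try:
        stage_in(SRM_SOURCE, LOCAL_COPY)
    except subprocess.CalledProcessError as exc:
        sys.exit(f"stage-in failed: {exc}")
    print("input staged; analysis job can start")
```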

Networking
The plan to get to the MAN is good; it needs to be implemented ASAP.
CDF has run into network bottlenecks, so the network should be closely monitored (see the monitoring sketch below).
Can the Starlight network be exploited to improve Run II remote production?

DB Support
The new five-year contract with Oracle seems to solve the immediate problems. Exploring different options was a good plan, but now does not seem to be needed.
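A minimal sketch of the kind of lightweight monitoring meant here, assuming Linux worker or gateway nodes: sample the kernel byte counters twice and report per-interface throughput. The sampling interval and output format are arbitrary choices.

```python
# Sample /proc/net/dev twice and report the transfer rate per interface,
# as a rough stand-in for continuous network monitoring.
import time


def read_byte_counters(path="/proc/net/dev"):
    """Return {interface: (rx_bytes, tx_bytes)} from the Linux counters file."""
    counters = {}
    with open(path) as f:
        for line in f.readlines()[2:]:          # skip the two header lines
            iface, data = line.split(":", 1)
            fields = data.split()
            counters[iface.strip()] = (int(fields[0]), int(fields[8]))
    return counters


if __name__ == "__main__":
    interval = 10                                # seconds between samples
    before = read_byte_counters()
    time.sleep(interval)
    after = read_byte_counters()
    for iface, (rx1, tx1) in after.items():
        rx0, tx0 = before.get(iface, (rx1, tx1))
        rx_mbps = (rx1 - rx0) * 8 / interval / 1e6
        tx_mbps = (tx1 - tx0) * 8 / interval / 1e6
        print(f"{iface:>8}: in {rx_mbps:8.1f} Mb/s   out {tx_mbps:8.1f} Mb/s")
```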

Infrastructure
CD should investigate how to migrate from RH7.3 immediately.
CD should investigate (build or buy) monitoring tools to help the experiments better predict the needed disk-to-tape ratio, retirement rate, caching strategies, etc. (a cache-ratio estimate is sketched below).
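One way such a tool could inform the disk-to-tape ratio, sketched under simple assumptions (uniform 1 GB files, LRU caching, a synthetic access log): replay recorded file accesses through a cache model and see how the hit rate, i.e. reads served from disk rather than tape, grows with the size of the disk cache.

```python
# Replay a file-access log through a simple LRU cache model to estimate how
# the disk hit rate varies with cache size.  File sizes and the access
# pattern below are illustrative assumptions.
from collections import OrderedDict


def hit_rate(accesses, cache_size_gb, file_size_gb=1.0):
    """Fraction of accesses served from an LRU disk cache of the given size."""
    cache = OrderedDict()               # file name -> size, in LRU order
    used = 0.0
    hits = 0
    for name in accesses:
        if name in cache:
            hits += 1
            cache.move_to_end(name)     # refresh LRU position
            continue
        # Miss: stage from tape, evicting least-recently-used files if needed.
        while used + file_size_gb > cache_size_gb and cache:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[name] = file_size_gb
        used += file_size_gb
    return hits / len(accesses) if accesses else 0.0


if __name__ == "__main__":
    # Toy access pattern: a small hot set read repeatedly plus a cold scan.
    log = [f"hot{i % 50}" for i in range(5000)] + [f"cold{i}" for i in range(2000)]
    for size in (10, 50, 200, 1000):    # candidate disk cache sizes in GB
        print(f"cache {size:5d} GB -> hit rate {hit_rate(log, size):.2%}")
```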

Planning/Management/Funding (1)
The CDF hardware resource ramp-up profile (CPU, disk, tape) needs more scrutiny.
–Last-minute changes at this review give the impression that further refinement is needed to get an accurate assessment of CPU and disk needs.
The CDF strategy of waiting until disks die may cause more disruption than a planned retirement.
The CDF model relies on large disks (in 2008, for example). They should consider the risks of this strategy (loss of I/O throughput capability, a large loss if one disk dies, etc.); a rough estimate is sketched below.
The experiments need to secure remote resources (via MOUs), especially in light of the LHC ramp-up.
CD should adequately prepare for the space, power and cooling requirements implied by the experiments' needs. The experiments should not ignore these cost implications.
–Space seems under control (based on our tour!)
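A back-of-the-envelope sketch of the large-disk trade-off noted above, with purely illustrative numbers (pool size, failure rate and per-spindle throughput are assumptions, not CDF figures): for a fixed total capacity, fewer, larger disks mean fewer failures per year but more data exposed per failure and less aggregate I/O.

```python
# Large-disk trade-off for a fixed-size analysis pool (all numbers assumed).
TOTAL_TB = 200.0              # total analysis disk pool
ANNUAL_FAILURE_RATE = 0.03    # per-disk annual failure probability
PER_DISK_MB_S = 60.0          # sustained throughput per spindle


def profile(disk_size_tb: float):
    n_disks = TOTAL_TB / disk_size_tb
    failures_per_year = n_disks * ANNUAL_FAILURE_RATE
    tb_lost_per_failure = disk_size_tb            # whole disk's contents if unprotected
    aggregate_gb_s = n_disks * PER_DISK_MB_S / 1000.0
    return n_disks, failures_per_year, tb_lost_per_failure, aggregate_gb_s


if __name__ == "__main__":
    print(f"{'disk size':>10} {'# disks':>8} {'fail/yr':>8} {'TB/failure':>11} {'agg GB/s':>9}")
    for size_tb in (0.25, 0.5, 1.0, 2.0):
        n, fails, loss, bw = profile(size_tb)
        print(f"{size_tb:>9.2f}T {n:8.0f} {fails:8.1f} {loss:11.2f} {bw:9.1f}")
```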

Planning/Management/Funding (2)
D0 has significantly sped up their reconstruction software in the past year. Given the cost of the hardware needed in the coming years, it is imperative that emphasis be placed on further optimization in this regard (see the scaling sketch below).
Fallback plan for CDF if the dCache-based analysis disk pool is not workable:
–We heard that the current fallback plan is simply to rely on the future availability of larger disks. The collaboration should indicate a stronger commitment to making dCache work. If that really fails, other alternatives such as xrootd need to be considered.
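The cost argument can be made explicit with a simple scaling sketch; the event counts, farm efficiency and per-event times below are placeholders, not D0's actual numbers. The point is only that the CPU requirement scales linearly with reconstruction time per event.

```python
# How many CPUs a (re)processing campaign needs as a function of the
# reconstruction time per event.  All inputs are illustrative assumptions.
EVENTS_PER_YEAR = 2.0e9            # events to (re)process per year (assumed)
WALL_SECONDS_PER_YEAR = 3.15e7     # seconds in a year
FARM_EFFICIENCY = 0.7              # fraction of wall time doing useful work (assumed)


def cpus_needed(seconds_per_event: float) -> float:
    cpu_seconds = EVENTS_PER_YEAR * seconds_per_event
    return cpu_seconds / (WALL_SECONDS_PER_YEAR * FARM_EFFICIENCY)


if __name__ == "__main__":
    for sec_per_event in (10.0, 5.0, 2.5):   # e.g. before/after successive optimizations
        print(f"{sec_per_event:4.1f} s/event -> ~{cpus_needed(sec_per_event):6.0f} CPUs")
```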

Planning/Management/Funding (3)
Personnel/Manpower
–Optimize the use of available manpower within CD and CDF/D0 for day-to-day tasks.
–Take steps to reduce vulnerability due to the limited number of suitably qualified experts for “service” tasks (for example, D0 code management and database support).
–We urge careful review by the collaborations of the Run II task force recommendations on computing.
It was noted that CDF needs more “day-to-day operational support” for SAM/dCache. This should be investigated: automate and improve wherever possible, including re-evaluating the impact of the large number of small files (a small-file survey sketch follows below).
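A small survey script of the sort that could support the small-file re-evaluation, assuming a POSIX-visible sample area and an arbitrary 100 MB threshold; it reports how many files (and what share of the bytes) are small enough to be candidates for merging before they hit tape or dCache.

```python
# Walk a sample data area and report the fraction of files below a size
# threshold.  The path and the 100 MB threshold are placeholder assumptions.
import os
import sys

SMALL_FILE_MB = 100     # threshold below which a file counts as "small"


def survey(root: str):
    small, total, small_bytes, total_bytes = 0, 0, 0, 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue
            total += 1
            total_bytes += size
            if size < SMALL_FILE_MB * 1024 * 1024:
                small += 1
                small_bytes += size
    return small, total, small_bytes, total_bytes


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    small, total, small_b, total_b = survey(root)
    if total:
        byte_share = small_b / total_b if total_b else 0.0
        print(f"{small}/{total} files ({small / total:.1%}) are under {SMALL_FILE_MB} MB,")
        print(f"holding {byte_share:.1%} of the bytes; candidates for merging before storage.")
    else:
        print("no files found under", root)
```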