
LArIAT FY15 and FY16 Computing Needs – Hans Wenzel* (taking over from Michael Kirby) – Scientific Computing Portfolio Management Team (SC-PMT) Review, 4th March 2015. *With input from: Michael Kirby, Bill Badgett, Andrzej Szelc, Kurt Biery, Jennifer Raaf, Flavio Cavanna, Jason St. John, Elena Gramellini, Irene Nutini, Randy Johnson, …

Outline
– Introduction – The experiment: science outlook, status and schedule
– Storage and processing estimates
– Scientific Service Requests for LArIAT
– TSW/EOP Status and Plans
– Future Directions (Challenges and R&D)

Slides taken from Flavio Cavanna’s Northwestern Seminar

Slides taken from Flavio Cavanna’s Northwestern Seminar

Slides taken from Flavio Cavanna’s Northwestern Seminar

Science outlook (cont.)
– Calorimeter R&D: study nuclear interactions and the role of the protons produced, for hadronic calorimetry -> improved calorimetric resolution; what we learn here can be applied to future calorimeter concepts.
– The LAr TPC provides a very detailed snapshot of the event and will provide data to validate Geant4/GENIE -> good synergy (we have already found “features” in Geant 4.9…).

Slides taken from Flavio Cavanna’s Northwestern Seminar

Starting ASAP – what does this mean?
– Today at 11 am (March 4): Operational Readiness Clearance (ORC) to get beam in MC7 for commissioning the aerogel Cherenkov counters and to finalize DAQ commissioning.
– Filter regeneration and filling: last two weeks of March.
– LArTPC data-taking: April -> Exciting!!
– See more pictures at:

FY15 Data Storage and Processing estimates
– The tertiary beam delivers a 4.2 sec spill every minute.
– Raw data from the LAr TPC are not zero-suppressed -> ~150 MB per spill and ~1.9 MB per TPC readout; anticipate 100 TPC readouts for every spill.
– FY15 initial run: 3 months, 24x7 with 3 shifts -> ~130 k spills; ~130 k spills x 150 MB/spill ≈ 20 TB.
– Expect that simple gzip delivers 6x compression -> raw-data total for FY15 is 3.3 TB on permanent storage.
– Estimate that reconstruction combined with zero removal leads to 4 TB of reconstructed data on permanent storage.
FY15 Detector Data Volume = 7.3 TB
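As a cross-check of the beam-data arithmetic above, here is a minimal Python sketch; the ~130 k spill count (one spill per minute for ~3 months) and the 6x gzip factor are the assumptions stated on the slide, not measured numbers.

```python
# FY15 beam-data volume estimate; all inputs are the slide's assumptions.
spills = 3 * 30 * 24 * 60        # ~3 months at one spill per minute -> 129,600 spills
raw_mb_per_spill = 150.0         # non-zero-suppressed LAr TPC data per spill
gzip_factor = 6.0                # assumed compression from simple gzip
reco_tb = 4.0                    # reconstructed + zero-removed output (slide estimate)

raw_tb = spills * raw_mb_per_spill / 1e6   # MB -> TB: ~19.4 TB
compressed_tb = raw_tb / gzip_factor       # ~3.2 TB (the slide rounds to 3.3 TB)
total_tb = compressed_tb + reco_tb         # ~7.3 TB detector data volume

print(f"{spills:,} spills -> raw {raw_tb:.1f} TB, "
      f"compressed {compressed_tb:.1f} TB, total {total_tb:.1f} TB")
```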

FY15 Data Storage and Processing estimates
– Cosmic trigger (being set up now): planned to be active for 30 sec between spills.
– Estimated trigger rate: < 1 Hz (may be only 4-5 evts/minute) -> up to ~43 k cosmic triggers per day.
– 3.9 million triggers x 1.9 MB per TPC readout = 7.4 TB -> 7.4/6 = 1.2 TB of compressed data in 3 months.
– Estimate that reconstruction combined with zero removal leads to 2.7 TB on permanent storage.
FY15 Cosmic Data Volume = 2.7 TB
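The cosmic-trigger volume follows the same pattern; the sketch below assumes the 1 Hz upper-limit rate and ~90 live days quoted above.

```python
# FY15 cosmic-data volume estimate; all inputs are the slide's assumptions.
live_sec_per_minute = 30         # cosmic trigger active for 30 s between spills
rate_hz = 1.0                    # upper-limit trigger rate (< 1 Hz)
days = 90                        # ~3 months of running
mb_per_readout = 1.9             # per TPC readout
gzip_factor = 6.0

triggers_per_day = live_sec_per_minute * rate_hz * 60 * 24   # ~43,200 per day
triggers = triggers_per_day * days                           # ~3.9 million
raw_tb = triggers * mb_per_readout / 1e6                     # ~7.4 TB uncompressed
compressed_tb = raw_tb / gzip_factor                         # ~1.2 TB after compression

print(f"{triggers/1e6:.1f} M triggers -> raw {raw_tb:.1f} TB, compressed {compressed_tb:.1f} TB")
```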

FY15 Data Storage and Processing estimates
– FY15 spills x 100 TPC readouts per spill -> 12 M data evts.
– Estimate that we need 10x the number of data events in simulation -> 120 M simulated events of varying configurations.
– 4 MB/evt x 120 M evts -> 480 TB; again, zero removal and compression -> 80 TB.
FY15 Simulation Data Volume = 80 TB
FY15 Detector Data Volume = 10 TB
Request FY15 Total Data Volume = 90 TB
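The simulation volume can be checked the same way; the 10x multiplier, 4 MB/event size, and 6x reduction from zero removal plus compression are the slide's assumptions.

```python
# FY15 simulation volume estimate; all inputs are the slide's assumptions.
data_events = 12e6               # beam-data events expected in FY15
sim_multiplier = 10              # want 10x the data statistics in simulation
mb_per_event = 4.0               # simulated event size before zero removal
reduction = 6.0                  # zero removal + compression factor

sim_events = data_events * sim_multiplier        # 120 M simulated events
raw_tb = sim_events * mb_per_event / 1e6         # 480 TB before reduction
stored_tb = raw_tb / reduction                   # 80 TB on permanent storage

print(f"{sim_events/1e6:.0f} M events -> {raw_tb:.0f} TB raw, {stored_tb:.0f} TB stored")
```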

FY15 Data Storage and Processing estimates
– Reconstruction of 12 M data events: estimate ~20 sec/evt (µBooNE: ~30 sec/evt) -> 12 M evts x 20 sec/evt -> 67 K CPU hrs; anticipate processing the data 3 times during FY15.
– Simulation and reconstruction of 120 M simulated events: estimate 6 sec/evt for simulation and digitization -> 120 M evts x 6 sec/evt -> 0.2 M CPU hrs; processing (reconstruction) 3 times at 20 sec/evt -> 2 M CPU hrs.
– Analysis of data on a similar scale to all reprocessing -> 2 M CPU hrs.
FY15 Data Processing = 201 K CPU hrs
FY15 Sim Processing = 4.2 M CPU hrs
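The CPU-hour totals follow from the per-event times quoted above; a short sketch:

```python
# FY15 CPU-hour estimate; per-event times are the slide's assumptions.
data_events = 12e6
sim_events = 120e6
reco_sec_per_evt = 20.0          # reconstruction (µBooNE is quoted at ~30 s/evt)
sim_sec_per_evt = 6.0            # simulation + digitization
passes = 3                       # anticipated number of (re)processing passes

data_cpu_hrs = data_events * reco_sec_per_evt * passes / 3600     # ~200 K (67 K per pass)
sim_gen_hrs = sim_events * sim_sec_per_evt / 3600                 # ~0.2 M
sim_reco_hrs = sim_events * reco_sec_per_evt * passes / 3600      # ~2 M
analysis_hrs = 2e6                                                # ~ same scale as reprocessing

print(f"data processing: {data_cpu_hrs/1e3:.0f} K CPU hrs, "
      f"sim + analysis: {(sim_gen_hrs + sim_reco_hrs + analysis_hrs)/1e6:.1f} M CPU hrs")
```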

FY16 Data Storage and Processing estimates
Note: we might reap a huge benefit in CPU time from optimizing the code!!!
– FY16: plan on running for 6 months after the shutdown.
– No change in readout or processing anticipated -> double the data and simulation volume, double the processing.
FY16 Simulation Data Volume = 160 TB
FY16 Detector Data Volume = 20 TB
Request FY16 Total Data Volume = 180 TB
FY16 Data Processing = 400 K CPU hrs
FY16 Sim Processing = 8.4 M CPU hrs
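Because FY16 is planned as six months of running with unchanged readout and processing, the FY16 request is simply double the FY15 figures; a trivial sketch (note the slide rounds 2 x 201 K to 400 K CPU hrs):

```python
# FY16 request = 2x FY15 (six months of running instead of three).
fy15 = {"detector_tb": 10, "sim_tb": 80, "data_cpu_khrs": 201, "sim_cpu_mhrs": 4.2}
fy16 = {key: 2 * value for key, value in fy15.items()}
print(fy16)  # {'detector_tb': 20, 'sim_tb': 160, 'data_cpu_khrs': 402, 'sim_cpu_mhrs': 8.4}
```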

Scientific Service Requests for LArIAT
– DAQ – active development using artdaq; request continued support from the RSE (Real-Time Systems Engineering) group (Kurt Biery)
– FermiGrid – in production (currently 25 slots)
– FermiCloud – no current need, but we know how to utilize it
– Gratia – in production
– JobSub – in production
– FIFEMon – in production
– OSG-enabled – effectively ready with the infrastructure; waiting for analyzers to adapt scripts and workflows
– Amazon – no plans
– dCache/enstore – currently finalizing; scratch is in production; tape storage requested; no large request for more BlueArc

Slides taken from Flavio Cavanna’s Northwestern Seminar

Scientific Service Requests for LArIAT
We use the standard SCD tools:
– IFDH/GridFTP – in production
– SAM Web – in production
– FTS – installed; we are working to get it running
– Software framework (LArSoft) – in production -> we request help with profiling and optimizing the code (started) and with being able to quickly update Geant4 versions
– Continuous Integration – planning
– GENIE/Geant4 – Scientific Computing Simulation Group (Daniel Elvira): assistance with selecting physics lists, a specialized physics list, selecting GENIE/Geant4 versions, and validation -> continued effort

Scientific Service Requests for LArIAT
Custom databases:
– Configuration DB: LArIAT will want to retrieve data & MC files based on individual DAQ config parameter values. SCD to host, serve, back up, and maintain this database (24x7 uptime & support). CD is helping us access these database tables through SAM, and we are beginning to organize the access of these tables by the analysis programs. Expect ~100 entries/week during data taking. The _dev configuration table in the database has been up and running for two weeks now; the table in the _prd database has been created, and data can be inserted when we start saving data.
– SAM DB: the SAM DB instance has been set up. We expect one entry each minute.
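To illustrate the file-retrieval use case described above (once the configuration tables are wired into SAM), here is a minimal sketch using the samweb_client Python API. The experiment name is real, but the metadata dimension name lariat_config.secondary_momentum is a purely hypothetical placeholder for whatever configuration parameters end up being registered in SAM.

```python
# Hypothetical sketch: look up LArIAT files in SAM by a DAQ-configuration value.
# The dimension name "lariat_config.secondary_momentum" is a placeholder, not an
# existing schema; only experiment="lariat" and the samweb calls are real.
from samweb_client import SAMWebClient

samweb = SAMWebClient(experiment="lariat")

# e.g. raw files taken with a (hypothetical) 64 GeV/c secondary-beam setting
dims = "data_tier raw and lariat_config.secondary_momentum 64"
files = samweb.listFiles(dims)

for name in files[:10]:
    meta = samweb.getMetadata(name)
    print(name, meta.get("file_size"))
```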

Scientific Service Requests for LArIAT
– Production operations – no plans now, but we would like them
– CVMFS – have an active ticket to deploy a repository at Fermilab -> moving along
– Interactive computing: GPCF (General Physics Computing Facility) VMs lariatgpvm01-03 with 2/4/4 CPUs and 6/12/12 GB of memory
– Experiment control room – at Meson Center, fully operational
– Redmine – in production
– CVS/Git – in production
– ECL (Electronic Logbook) – in production

Scientific Service Requests for LArIAT
– UPS/UPD – in production
– DocDB – in production
– Video conferencing – in production (ReadyTalk)
– Federated data management / high-throughput analysis facilities – no plans

Did you meet your FY14 Scientific Goals?
– Based on last year’s portfolio review and actions taken by SCD – did you accomplish what you said you would?
– There wasn’t a portfolio review…

TSW/EOP Status and Plans
– Are your TSWs signed and up to date?
  – The TSW for FTBF is signed and up to date; the current agreement with Roser was only for 5 TB of storage.
– If not, do they need revision?
  – Yes! Art Kreymer provided a TSW template -> we are reviewing it right now.

Future Directions (Challenges and R&D)
– Will your SOPs (Standard Operating Procedures) change significantly in the future (new phase of the experiment, new running conditions, etc.)?
  – FY17 will involve LArIAT Phase-II, if funded.
– Are future R&D projects in the pipeline? Yes.
– Are additional resources going to be required to meet them? Yes.

Smooth ride from here on!

Backup

LArIAT Spreadsheet (1/15/15) – Experiment: LArIAT
Columns: Service | Service Offering | Offering details | Definition of row or column | See notes below | Threshold for reporting (if below threshold, then enter "Yes" if small request) | Past / prior usage (e.g. previous 3 months, to be supplied) | Current/recent usage (allocation and/or utilization of resources, to be supplied) | "Immediate" (needs at this moment; if different from current level then requires immediate reallocation of resources) | Thru end FY15, "this year" (expectation of needs thru remainder of FY15; if different from current level then requires gradual reallocation of resources) | Thru end FY16, "next year" (if different from FY15 level then requires purchases or reallocation in FY15) | Thru end FY17, "out year" (if different from FY16 level then requires purchases or reallocation in FY16) | Comments: use footnotes if necessary
SCIENTIFIC COMPUTING SYSTEMS
Batch Worker Nodes | FermiGrid yearly integral | # CPU-hours | 1100, | (assume all CPUs equivalent)
FermiGrid peak integral | # CPU-hours | 2any
FermiGrid peak duration | # of hours | 2any
FermiGrid peak count | # of peak periods | 2any20 33
OSG opportunistic yearly integral | # CPU-hours | 1100,00000
OSG opportunistic peak integral | # CPU-hours | 2any00
OSG opportunistic peak duration | # of hours | 2any00
OSG opportunistic peak count | # of peak periods | 2any00
External dedicated yearly integral | # CPU-hours | 3100,00000
Large Memory or Multi-CPU | Describe needs in Comments | any00 00
Server & Storage Support | Static Interactive Service | # of Static Services (e.g. VMs) | 4any33 34
Other Static Services | # of Static Services | 4any11 11
Dynamic Services, average | # of Dynamic Services
Dynamic Services, peak | # of Dynamic Services
cvmfs Service | Repository (Yes or No) | Yes
Build & Release Service | Use facility (Yes or No) | 6Yes
Database Service | Specify type(s), numbers | 7any22 22
Other Disk Service (specify) | Servers with attached disk | 8any00 00
SCIENTIFIC DATA STORAGE & ACCESS
dCache | Shared RW | Cache disk storage (TB) | 120Yes 44
Shared RW lifetime | Cache disk desired lifetime (days) | 11030
Shared Scratch | Cache disk storage (TB) | 120Yes 20
Shared Scratch lifetime | Cache disk desired lifetime (days) | 11030
Dedicated Write | Cache disk storage (TB) | 1any05 55
enstore | New/additional capacity | Tape media (TB)
NETWORKED STORAGE
NAS/BlueArc | *-app | Dedicated NAS (TB) | 1any22 33
*-data | Dedicated NAS (TB) | 1any
*-prod | Dedicated NAS (TB) | 1any00 00
*-ana | Dedicated NAS (TB) | 1any00 00
NETWORK SERVICES
Physical Infrastructure | DAQ LAN bandwidth | Dedicated for DAQ | any | 10GE?
LAN bandwidth | Specific to experiment | any | 10GE?
WAN Infrastructure | DAQ WAN bandwidth | Dedicated for DAQ | any | ?
WAN bandwidth | Specific to experiment | average > 2 Gb/s | ?
Dedicated WAN circuits | Dedicated for experiment | any | ?

Columns: Service | Service Offering | Offering details | Definition of row or column | Past (e.g. previous 3 months, to be supplied) | Current/Recent (allocation and/or utilization of resources, to be supplied)
SCIENTIFIC COMPUTING SYSTEMS
Batch Worker Nodes | FermiGrid yearly integral | # CPU-hours | (assume all CPUs equivalent)
FermiGrid peak integral | # CPU-hours
FermiGrid peak duration | # of hours | 10020
FermiGrid peak count | # of peak periods | 20
OSG opportunistic yearly integral | # CPU-hours | 00
OSG opportunistic peak integral | # CPU-hours | 00
OSG opportunistic peak duration | # of hours | 00
OSG opportunistic peak count | # of peak periods | 00
External dedicated yearly integral | # CPU-hours | 00
Large Memory or Multi-CPU | Describe needs in Comments | 00
Server & Storage Support | Static Interactive Service | # of Static Services (e.g. VMs) | 33
Other Static Services | # of Static Services | 11
Dynamic Services, average | # of Dynamic Services | 00
Dynamic Services, peak | # of Dynamic Services | 00
cvmfs Service | Repository (Yes or No) | Yes
Build & Release Service | Use facility (Yes or No) | Yes
Database Service | Specify type(s), numbers | 22
Other Disk Service (specify) | Servers with attached disk | 00
SCIENTIFIC DATA STORAGE & ACCESS
dCache | Shared RW | Cache disk storage (TB) | Yes
Shared RW lifetime | Cache disk desired lifetime (days) | 30
Shared Scratch | Cache disk storage (TB) | Yes
Shared Scratch lifetime | Cache disk desired lifetime (days) | 30
Dedicated Write | Cache disk storage (TB) | 05
enstore | New/additional capacity | Tape media (TB) | 05
NETWORKED STORAGE
NAS/BlueArc | *-app | Dedicated NAS (TB) | 22
*-data | Dedicated NAS (TB) | 88
*-prod | Dedicated NAS (TB) | 00
*-ana | Dedicated NAS (TB) | 00
NETWORK SERVICES
Physical Infrastructure | DAQ LAN bandwidth | Dedicated for DAQ | ?
LAN bandwidth | Specific to experiment | ?
WAN Infrastructure | DAQ WAN bandwidth | Dedicated for DAQ | ?
WAN bandwidth | Specific to experiment | ?
Dedicated WAN circuits | Dedicated for experiment | ?