DUNE Software and Computing News and Announcements
Tom Junk
DUNE Software and Computing General Meeting
February 2, 2016

New Web Sites
 dune-data.fnal.gov
   Monte Carlo: Challenge 5.0 and future MC; MC samples and tiers
   Data files from the 35-ton prototype: a file list, automatically updated from the file transfer script, and samweb usage tips that tell you how to access files (a samweb sketch follows below)
 dune-young.hep.net
   Content copied from lbne-young.hep.net (still not up to date)
 lbne-dqm.fnal.gov
   Online and nearline monitoring for the 35-ton prototype
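For orientation, a minimal samweb sketch follows; the dimension values and file name are placeholders, not actual 35-ton entries, and the tips page on dune-data.fnal.gov has the authoritative recipe:

    # set up the DUNE software environment first (e.g. via the setup_dune.sh script), then:
    export SAM_EXPERIMENT=dune
    samweb list-files "data_tier raw and run_number 12345" | head    # find files matching a query
    samweb get-metadata lbne_r012345_sr01.root                       # inspect one file's metadata
    samweb locate-file lbne_r012345_sr01.root                        # where the file lives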

New Build Node: dunebuild01.fnal.gov
 16 cores! (AMD Opteron 6320), 32 GB of RAM, 5 GB of swap
 To be used for building code only (we'll watch for misuse)
 mrb i -j16 now gives you a big boost in speed (rough recipe sketched below)
 dCache disks are not mounted; /dune/data and /dune/data2, however, are still mounted
 /build/ has 2.8 TB in it; it is not yet clear how to use this effectively. Let Tom know if you need something different on it.
 16 cores was chosen based on Lynn Garren's build speed test: with builds using BlueArc (/dune/app), more than 16 cores gives diminishing returns in speed due to disk I/O bottlenecks, and machines with more than 16 cores are even less available than the one we got.
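Roughly, a build session on dunebuild01 looks like the sketch below; the setup script location, version, and ups qualifiers are assumptions, so check the current DUNE setup instructions for the exact ones:

    source /grid/fermiapp/products/dune/setup_dune.sh      # assumed setup script location
    cd /dune/app/users/$USER/mydev                         # a development area on BlueArc
    mrb newDev -v <larsoft_version> -q e9:prof             # version and qualifiers are illustrative
    source localProducts_*/setup
    mrb g dunetpc                                          # check out the code
    cd $MRB_BUILDDIR
    mrbsetenv
    mrb i -j16                                             # build with all 16 cores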

New Redmine Sites
 dunebsm: Exotic Physics with DUNE
 dunefgt: Fine-Grained Tracker
 dunelbl: Long-Baseline Physics WG
 dunendk: Nucleon Decay
 HighLAND Analysis Tool
 WA105 Dual-Phase protoDUNE

CILogon Certificates
 Replacing OSG Grid certificates: DUNE VO user entries with OSG Grid certificates are now given entries for CILogon certificates.
 Current OSG Grid certificates remain valid until their expiration; there is no need to hurry to get a replacement CILogon certificate, but the next time yours is refreshed there will be a new procedure.
 Eileen and Anne have contacted certificate users of the docdb's with instructions for obtaining and using CILogon certificates with the docdb's.
 CILogon will replace KCA certificates too.
   The jobsub client called kx509 to generate short-lived certificates using the user's Kerberos ticket.
   Other uses, like SAM, required the user to execute kx509 or get-cert.sh (which calls kx509) to get a certificate (current recipe sketched below).
   Jobsub use of CILogon is "to be transparent to the users".
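For reference, the current hands-on certificate recipe (before jobsub makes CILogon transparent) looks roughly like this; the VOMS group/role string is an assumption and may differ for your use case:

    kinit $USER@FNAL.GOV                                            # get a Kerberos ticket
    kx509                                                           # short-lived x509 certificate from the ticket
    voms-proxy-init -rfc -noregen -voms dune:/dune/Role=Analysis    # add DUNE VO attributes (string assumed)
    voms-proxy-info -all                                            # check what you got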

AFS at Fermilab is being shut down Feb. 25, 2016
 Web sites at /afs/fnal.gov/files/expww are migrated to the NFS storage area /web/sites/, available on FNALU and dunegpvm01 (but not the other dunegpvm's).
 Home areas in /afs/fnal.gov/home/room[1,2,3]/username are being replaced with other networked storage.
 I was never fond of our AFS home areas anyhow:
   Very small quotas in the home area: 500 MB (!)
   The authentication token, which expires after 26 hours, has caused user confusion.
   It has its own syntax for managing things. Want to know your quota? fs lq (see below).
   Not available on grid workers (and we wouldn't want that anyhow for the replacement).
   Backups are in /afs/fnal.gov/files/backup/home.
   Documentation (becoming irrelevant).
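While AFS is still around, the quota and token commands mentioned above are used like this (a minimal illustration):

    fs lq ~          # show the AFS quota on your home area
    tokens           # list your AFS tokens and when they expire (the 26-hour limit)
    aklog            # refresh the AFS token from your Kerberos ticket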

New Home Areas and Web sites
 We used to have personal "professional" web areas, in ~/public_html/index.html for example, accessed over the web. Directory listings over HTTP are disabled without a special Service Desk request.
 Now there are NFS web areas: dunegpvm*:/publicweb/<x>/<username>, where <x> is the first letter of your user ID (= Kerberos principal). An example of populating one follows below.
 Backups are in /publicweb/.snapshot in case you accidentally delete something.
 Home area snapshots and backups in the post-AFS era are still to be defined and documented.
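As an illustration only (the username "jdoe" is hypothetical; substitute your own, and the single-letter subdirectory to match):

    cp -r ~/public_html/* /publicweb/j/jdoe/     # hypothetical user "jdoe"
    ls /publicweb/.snapshot                      # browse snapshots in case of accidental deletion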

lbnegpvm*.fnal.gov → dunegpvm*.fnal.gov
 Users were in the lbne group; active or recently active users have been given new accounts in the dune group.
 New dunegpvm11 spun up with the new group and new user list.
 No /lbne/data, /lbne/data2, or /lbne/app mounts on the new dune machines; the same areas are mounted under /dune.
 Still have /pnfs/lbne mounted (needed, as some files are accessible only that way). Same with /scratch/lbne.
 Current status: lbnegpvm06 through lbnegpvm10 have been migrated to dunegpvm machines, and dunegpvm11 has been given back. lbnegpvm01 through lbnegpvm05 (with dunegpvm convenience names) are being converted as I write this. We are finding missing things (like dCache mounts) and iterating with the Service Desk.

BlueArc Dismount on Grid Workers
 This affects us in particular!
   /lbne/data and /lbne/data2 are not mounted on the dunegpvm06-10 machines, but are still mounted on grid worker nodes.
   /dune/data and /dune/data2 are not mounted on grid worker nodes (!). These mount points were made after the decision to migrate away from BlueArc on the grid was taken.
 Two ways to store your data (the dCache option is sketched below):
   ifdh cp it to dCache: /pnfs/dune/persistent/users and /pnfs/dune/scratch/users. Ask about tape-backed space! (We prefer SAM, so the files won't get lost.)
   ifdh cp the files to BlueArc (many people still do this). This too will be disabled: end-of-2016 shutdown!
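A minimal sketch of the dCache option, with illustrative file names; substitute your username, since $USER may not be set to it on a worker node:

    # inside a grid job script, after producing myoutput.root:
    ifdh cp myoutput.root /pnfs/dune/scratch/users/<username>/myoutput.root
    # or, for output you want to keep longer:
    ifdh cp myoutput.root /pnfs/dune/persistent/users/<username>/myoutput.root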

Metadata Changes
 Existing data tiers: raw, simulated, detector-simulated, full-reconstructed
 New data tier: sliced (query sketch below)
 The slicer/stitcher input source only works on raw data: there is a limited number of data products it has to know how to slice and stitch.
 A new problem: the slicer/stitcher reformats events based on a software trigger definition. Do we need to store which trigger definition was used in the metadata? Tack it on the end of the detector type string?
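For example, files can be counted or inspected by tier with SAM dimension queries like these (the file name is a placeholder):

    samweb list-files --summary "data_tier raw"         # file count and total size for a tier
    samweb list-files --summary "data_tier sliced"
    samweb get-metadata lbne_r012345_sr01_sliced.root   # check which tier a given file carries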

A Good Run List Proposal
 So far only the 35-ton has data and thus needs a good-run list. One person's bad data is another person's good data.
 Alex Himmel suggested it would make SAM dataset queries simpler if good-run status were part of the metadata (see the sketch below).
 We can request a new good-run metadata field: an arbitrary string, so we can encode various kinds of goodness or badness.
 CDF had good-run lists that were distributed as ROOT trees and text files. It didn't make sense to limit public datasets to a particular good-run set, because runs would be re-classified and it takes a long time to reprocess everything.
 The good-run list needs curation. Who decides? A shift tool? A Data Quality Team is needed to make judgments.
 For the 35-ton, we probably want analyzers to be tightly coupled to the data taking: label special data runs for special analyses, and record run numbers and ranges that are intended for subsequent analyses.
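To make the idea concrete: if such a field existed, a dataset definition could select on it directly. The field name DUNE.good_run and its value below are hypothetical; this is only the proposal, not an existing field:

    samweb create-definition goodruns_35t_raw \
        "data_tier raw and DUNE.good_run good_for_tracking"   # field name and value are hypothetical
    samweb list-definition-files goodruns_35t_raw | head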

FIFE News
 Summer 2016 FIFE Workshop during the week of June 20.
 Fermilab GPGrid new features: partitionable slots, and priority queueing instead of quotas (see the CS Liaison slides, CSLiaison_01_13_16.pdf).
 Job efficiency links

Job Resource Limits Enforced on FNAL GPGrid
 Last year the grid was more forgiving about going over:
   time limits (not CPU time; wall-clock time is what counts)
   virtual memory size
   disk space used
 But now these limits are enforced. See the documentation page for examples of how to ask for resources (a sketch follows below) and links to more documentation.
 What happens if your job goes over a limit? It doesn't get killed, but rather gets Held. To find out what went wrong: jobsub_q --held --user=<username>
 You can use fifemon.fnal.gov to monitor how many jobs you have in each state.
 Policy may be different on non-FNAL OSG sites.
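As an illustration of asking for resources explicitly (the values and the script path are examples, not recommendations; see the documentation page for the current flags):

    jobsub_submit -G dune --memory=2000MB --disk=10GB --expected-lifetime=8h \
        --resource-provides=usage_model=DEDICATED,OPPORTUNISTIC \
        file:///dune/app/users/<username>/myjob.sh
    jobsub_q --held --user=<username>      # see which jobs were Held and why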

Very minor...
 Users in the LBNE VO are getting e-mails saying that their AUP (Acceptable Use Policy) signatures are expiring (they last 1 year). Users can ignore these and use the DUNE VO instead.

/dune/app
 Filled up briefly yesterday.

Reminder: DAQ Workshop at CERN
 Dates: in February, at CERN.
 DAQ hardware, software, and offline computing infrastructure.
 Ask Maxine about site access for non-CERN people.