Offline Discussion
M. Moulson, 22 October 2004
Datarec status
Reprocessing plans
MC status
MC development plans
Linux
Operational issues
Priorities
AFS/disk space

2  Datarec DBV-20
DC geometry updated
  Global shift: Δy = 550 μm, Δz = 1080 μm
  Implemented in datarec for Run > …
  Thickness of DC wall not changed (75 μm)
Modifications to DC timing calibrations
  Independence from EmC timing calibrations
Modifications to event classification (EvCl)
  New KSTAG algorithm (K_S tagged by vertex in DC)
  Bunch spacing by run number in T0_FIND step 1 for ksl: … ns for 2004 data (also for MC, some 2000 runs)
Boost values
  Runs not reconstructed without BMOM v.3 in HepDB
  p_x values from BMOM(3) now used in all EvCl routines for Run > 31690

3  Datarec operations
Runs … (29 Apr) to … (21 Oct, 00:00)
  413 pb−1 to disk with tag OK
  394 pb−1 with tag = 100 (no problems)
  388 pb−1 with full calibrations
  371 pb−1 reconstructed (96%)
  247 pb−1 DSTs (except K+K−)
fsun03-fsun10 decommissioned 11 Oct
  Necessary for installation of new tape library
  datarec submission moved from fsun03 to fibm35
  DST submission moved from fsun04 to fibm…
150 keV offset in √s discovered!

4  150 keV offset in √s
Discovered while investigating ~100 keV discrepancies between physmon and datarec
The +150 keV adjustment to the fit value of √s was not implemented:
  in physmon
  in datarec when the final BVLAB √s values were written to HepDB
Plan of action:
1. New Bhabha histogram for the physmon fit, taken from data
2. Sync the datarec fit with physmon
3. Fix the BVLAB fit before the final 2004 values are computed
4. Update values in DB records
   histogram_history and HepDB BMOM are currently from the BVLAB scan, need to add 150 keV
   Update of HepDB technically difficult, need a solution

5  Reprocessing plans
Issues of compatibility with MC:
  DC geometry, T0_FIND modifications by run number
  DC timing modifications do not impact the MC chain
  Additions to event classification would require new MCDSTs only
  In principle possible to use run number ranges to fix p_x values for backwards compatibility (see the sketch below)
Use batch queues?
  Main advantage: increased stability
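A minimal sketch of how run-number-range keyed corrections could be organized for backwards compatibility; the range boundaries other than 31690 (quoted on slide 2), the field names, and the dispatch scheme are illustrative assumptions, not the actual KLOE constants.

# Hypothetical sketch: select the p_x/boost handling by run-number range,
# so reprocessed data and older MC stay mutually compatible.
# Only the 31690 boundary comes from the slides; everything else is a placeholder.

CORRECTIONS = [
    # (first_run, last_run, px_source)
    (0,     31690,  "legacy"),    # older runs: keep the original p_x handling
    (31691, 10**9,  "BMOM(3)"),   # newer runs: take p_x from BMOM(3)
]

def px_source_for_run(run_number: int) -> str:
    """Return which p_x source applies to a given run."""
    for first, last, source in CORRECTIONS:
        if first <= run_number <= last:
            return source
    raise ValueError(f"no correction range covers run {run_number}")

if __name__ == "__main__":
    for run in (25000, 32000):
        print(run, px_source_for_run(run))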

6  Further datarec modifications
Modification of inner DC wall thickness (75 μm)
  Implement by run number
Cut DC hits with drift times > 2.5 μs
  Suggested by P. de Simone in May to reduce the fraction of split tracks
Others?

7  MC production status
[Table: for each MC production (Program), the number of events (10^6), LSF time (B80 days), and output size (TB): e+e− processes including ISR-only radiative production, radiative φ decays, ee → eeee, φ → all, φ → all (21 pb−1 scan), φ → K_S K_L, φ → K+K−, and the total; the φ → K_S K_L rare production is estimated at 320 B80 days and 1.7 TB]

8  Generation of rare K_S K_L events
Rare K_S channels (including K_S → 3π0) and rare K_L channels, including direct-emission (DE) radiative decays
Peak cross section: 7.5 nb
  Approx. 2× the sum of BRs for the rare K_L channels
In each event, either the K_S or the K_L decays to a rare mode
  Random selection (sketch below)
  Scale factor of 20 applies to K_L
  For K_S, the scale factor is ~100
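A minimal sketch of the selection logic described above, assuming a simple weighted random choice; the channel names and relative weights are placeholders, not the actual generator tables, and only the scale factors 20 and ~100 come from the slide.

import random

# Hypothetical sketch of the rare-event generation logic: in each event
# exactly one of the two kaons (K_S or K_L) is forced to a rare channel,
# chosen at random, with enhancement ("scale") factors of 20 for K_L and
# ~100 for K_S rare modes. Channel lists and weights are illustrative only.

RARE_KS = {"KS -> 3pi0": 1.0}                                  # relative weights, not real BRs
RARE_KL = {"KL -> pi+pi-": 1.0, "KL -> pi+pi-gamma (DE)": 0.5}

SCALE_KL = 20
SCALE_KS = 100

def pick_rare_decay(rng: random.Random) -> tuple[str, str]:
    """Choose which kaon decays rarely and which channel it takes."""
    # The probability of forcing the K_L (vs the K_S) is taken proportional
    # to the scaled total weight of its rare channels.
    w_kl = SCALE_KL * sum(RARE_KL.values())
    w_ks = SCALE_KS * sum(RARE_KS.values())
    if rng.random() < w_kl / (w_kl + w_ks):
        table, kaon = RARE_KL, "K_L"
    else:
        table, kaon = RARE_KS, "K_S"
    channels, weights = zip(*table.items())
    return kaon, rng.choices(channels, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(12345)
    for _ in range(5):
        print(pick_rare_decay(rng))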

9  MC development plans
Beam pipe geometry for 2004 data (Bloise)
LSB insertion code (Moulson)
Fix … generator (Nguyen, Bini)
Improve MC-data consistency in tracking resolution (Spadaro, others)
  MC has better core resolution and smaller tails than data in the E_miss − p_miss distribution in the background for the K_S → πeν analysis
  Improving the agreement would greatly help for precision studies involving signal fits, spectra, etc.
  Need to look systematically at other topologies/variables
  Need more people involved

10  Linux software for KLOE analysis
P. Valente had completed an earlier port based on free software
  VAST F90-to-C preprocessor
  Clunky to build and maintain
M. Matsyuk has completed a KLOE port based on the Intel Fortran compiler for Linux
  Individual, non-commercial license is free
  libkcp code compiles with zero difficulty
Reconsider issues related to maintenance of the KLOE software for Linux

11  Linux usage in KLOE analysis
Most users currently process YBOS DSTs into Ntuples on the farm machines and transfer the Ntuples to their PCs
AFS does not handle random-access data well
  i.e., writing CWNs (column-wise Ntuples) as analysis output
  Multiple jobs on a single farm node stress the AFS cache
Farm CPU (somewhat) limited
AFS disk space perennially at a premium
KLOE software needs are minimal for most analysis jobs
  YBOS to Ntuple: no DC reconstruction, etc.
Analysis jobs on user PCs accessing DSTs via KID and writing Ntuples locally should be quite fast
Continuing interest on the part of remote users

12  KLOE software on Linux: Issues
1. Linux machines at LNF for hosting/compilation
   3 of 4 Linux machines in the Computer Center are down, including klinux (mounts /kloe/soft, used by P. Valente for the VAST build)
2. KLOE code distribution
   User PCs do not mount /kloe/soft
   Move /kloe/soft to network-accessible storage?
   Use CVS for distribution? An elegant solution, but users must periodically update…
3. Individual users must install the Intel compiler
4. KID
   Has been built for Linux in the past
5. Priority/manpower

13  Operational issues
Offline expert training
  1-2 day training course for all experts
  General update
PC backup system
  Commercial tape backup system available to users to back up individual PCs

14  Priorities and deadlines
In order of priority, for discussion:
1. Complete MC production: K_S K_L rare
2. Reprocessing
3. MC diagnostic work
4. Other MC development work
5. Linux
Deadlines?

15  Disk resources
Current recalled areas:
  Production: 0.7 TB
  User recalls: 2.1 TB
  DST cache: 12.9 TB (10.2 TB added in April)
2001-2002 total DSTs: 7.4 TB; total MCDSTs: 7.0 TB
  2004 DST volume scales with the integrated luminosity
3.2 TB added to the AFS cell
  Not yet assigned to analysis groups
2.0 TB available but not yet installed
  Reserved for testing new network-accessible storage solutions

16  Limitations of AFS
Initial problems with random-access files blocking AFS on farm machines resolved
Nevertheless, AFS has some intrinsic limitations:
  Volume sizes of at most 100 GB
    Already pushed to the limit (max spec is 8 GB!)
  Cache must be much larger than the AFS-directed data volume for all jobs on a farm machine (rough sizing sketch below)
    Problem is characteristic of random-access files (CWNs)
    Current cache size: 3.5 GB on each farm machine
    More than sufficient for a single job
    Possible problems with 4 big jobs/machine
  Enlarging cache sizes requires purchase of more local disk for the farm machines
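A rough sizing sketch for the cache concern above; the 3.5 GB cache size is from the slide, while the per-job output sizes and the headroom factor are hypothetical illustrations.

# Illustration of the AFS cache concern: the client cache should comfortably
# exceed the AFS-directed data touched by all jobs running on the node.
# The 3.5 GB cache size is from the slide; per-job sizes below are examples.

CACHE_GB = 3.5

def cache_ok(job_output_gb: list[float], headroom: float = 2.0) -> bool:
    """True if the cache is at least `headroom` times the total CWN output."""
    return CACHE_GB >= headroom * sum(job_output_gb)

print(cache_ok([1.0]))                 # a single 1 GB job: fine
print(cache_ok([1.0, 1.0, 1.0, 1.0]))  # four 1 GB jobs: the cache is stressed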

17  Network storage: Future solutions
Possible alternatives to AFS:
1. NFS v4
   Kerberos authentication: use klog as with AFS
   Smaller data transfers, expect fewer problems with random-access files
2. Storage Area Network (SAN) filesystem
   Currently under consideration as a Grid solution
   Works only with Fibre Channel (FC) interfaces
   FC to SCSI/IP interface implemented in hardware/software
   Availability expected in 2005
Migration away from AFS probable within ~6 months
2 TB allocated to tests of new network storage solutions
Current AFS system will remain as the interim solution

18  Current AFS allocations
Volume    Space (GB)   Working group
cpwrk     195          Neutral K
kaon      170          Neutral K
kwrk      200          Charged K
phidec    400          Radiative
ecl       149
mc        90
recwrk    30
trg       100
trk

19  A fair proposal?
Each of the 3 physics WGs gets 1400 GB total
  Total disk space (incl. already installed) divided equally
  Physics WGs are similar in size and diversity of analyses
  WGs can make intelligent use of the space, e.g., some degree of Ntuple sharing is already present
Substantial increases for everyone anyway

20 Additional information

21  Offline CPU/disk resources for 2003
Available hardware:
  23 IBM B80 servers: 92 CPUs
  10 Sun E450 servers: 18 B80 CPU-equivalents
  6.5 TB NFS-mounted recall disk cache
  Easy to reallocate between production and analysis
Allocation of resources in 2003:
  64 to 76 CPUs on IBM B80 servers for production
  800 GB of disk cache for I/O staging
  Remainder of resources open to users for analysis

22  Analysis environment for 2003
Production of histograms/Ntuples on the analysis farm:
  4 to 7 IBM B80 servers + 2 Sun E450 servers
  DSTs latent on 5.7 TB recall disk cache
  Output to 2.3 TB AFS cell accessed by user PCs
Analysis example (worked through in the sketch below): 440M K_S K_L events, 1.4 TB of DSTs
  6 days elapsed for 6 simultaneous batch processes
  Output on the order of … GB
Final-stage analysis on user PC/Linux systems
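A small back-of-the-envelope check of the analysis example above, treating the quoted numbers as exact; the derived per-process rate and aggregate read rate are illustrative quantities, not figures from the slide.

# Back-of-the-envelope throughput for the 2003 analysis example:
# 440M K_S K_L events (1.4 TB of DSTs) processed in 6 elapsed days
# by 6 simultaneous batch processes.

events    = 440e6    # total DST events analyzed
dst_tb    = 1.4      # total DST volume in TB
days      = 6        # elapsed wall-clock time
processes = 6        # simultaneous batch processes

seconds = days * 86400
total_rate = events / seconds                 # events/s, all processes together
per_process_rate = total_rate / processes     # events/s per batch process
data_rate_mb_s = dst_tb * 1e6 / seconds       # aggregate DST read rate in MB/s

print(f"aggregate rate : {total_rate:,.0f} events/s")
print(f"per process    : {per_process_rate:,.0f} events/s")
print(f"DST read rate  : {data_rate_mb_s:.1f} MB/s")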

23  CPU power requirements for 2004
[Plot: B80 CPUs needed to follow acquisition (recon, DST, MC) as a function of input rate (kHz) and average L (10^30 cm−2 s−1), compared with the 76-CPU offline farm]

24  CPU/disk upgrades for 2004
Additional servers for the offline farm:
  10 IBM p630 servers, each with 4 POWER CPUs
  Adds more than 80 B80 CPU equivalents to the offline farm
Additional 20 TB of disk space
  To be added to the DST cache and AFS cell
More resources already allocated to users
  8 IBM B80 servers now available for analysis
  Can maintain this allocation during 2004 data taking
Ordered, expected to be on-line by January

25  Installed tape storage capacity
IBM 3494 tape library:
  12 Magstar 3590 drives, 14 MB/s read/write
  60 GB/cartridge (upgraded from 40 GB this year)
  5200 cartridges (5400 slots)
  Dual active accessors
  Managed by Tivoli Storage Manager
Maximum capacity: 312 TB (5200 cartridges)
Currently in use: 185 TB

26  Tape storage requirements for 2004
[Table: estimated stored volume by type (GB/pb−1, incl. streaming mods) and projected tape library usage (TB), today and after +780 pb−1, broken down into free, raw, recon, DST, and MC]

27  Tape storage for 2004
Additional IBM 3494 tape library
  6 Magstar 3592 drives: 300 GB/cartridge, 40 MB/s
  Initially 1000 cartridges (300 TB)
  Slots for 3600 cartridges (1080 TB)
  Remotely accessed via FC/SAN interface
Definitive solution for KLOE storage needs
Bando di gara (call for tenders) submitted to the Gazzetta Ufficiale
  Reasonably expect 6 months to delivery
  Current space sufficient for a few months of new data

28  Machine background filter for 2004
Background filter (FILFO) last tuned on data
  5% inefficiency for φ events, varies with background level
  Mainly traceable to a cut that eliminates degraded Bhabhas
Removal of this cut:
  Reduces the inefficiency to 1%
  Increases stream volume by 5-10%
  Increases CPU time by 10-15%
New downscale policy for the bias-study sample (see the sketch below):
  A fraction of events is not subject to the veto and is written to the streams
Need to produce a bias-study sample for data
  To be implemented as reprocessing of a data subset with the new downscale policy
  Will allow additional studies of FILFO efficiency and cuts
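A minimal sketch of a downscale (prescale) policy of the kind described above: a fixed fraction of events bypasses the FILFO veto and is kept with a flag, so the filter's efficiency and bias can be studied on the stream output. The prescale factor, flag names, and veto probability are hypothetical.

import random

# Hypothetical sketch of a downscale policy for the FILFO bias-study sample:
# 1 event in PRESCALE is exempted from the background veto and kept with a
# flag. The factor and field names are illustrative only.

PRESCALE = 100  # keep 1 in 100 events regardless of the FILFO decision

def stream_decision(event_id: int, filfo_veto: bool) -> tuple[bool, bool]:
    """Return (write_to_stream, bias_study_flag) for one event."""
    bias_sample = (event_id % PRESCALE == 0)   # deterministic downscale
    if bias_sample:
        return True, True                      # kept even if FILFO vetoes it
    return (not filfo_veto), False             # normal filtering path

if __name__ == "__main__":
    rng = random.Random(0)
    kept = sum(stream_decision(i, rng.random() < 0.55)[0] for i in range(10000))
    print(f"kept {kept} of 10000 simulated events")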

29  Other offline modifications for 2004
Modifications to physics streaming:
  Bhabha stream: keep only a subset of radiative events
    Reduces Bhabha stream volume by a factor of 4
    Reduces overall stream volume by >40%
  K_S K_L stream: clean up the choice of tags to retain
    Reduces K_S K_L stream volume by 35%
  K+K− stream: new tag using dE/dx
    Fully incorporate the dE/dx code into reconstruction
    Eliminate older tags, which will reduce the stream volume
Random trigger as a source of MC background:
  … Hz of random triggers synched with the beam crossing allows background simulation for L up to 2 × 10^… cm−2 s−1

30  KLOE computing resources
[System diagram: IBM 3494 tape library (2 robots, TSM, 324 TB, 12 Magstar E1A drives at 14 MB/s each); managed disk space (0.8 TB SSA for offline staging, 6.5 TB + 2.2 TB SSA and FC latent disk cache); offline farm (19 IBM B80 4×POWER + Sun E450 4×UltraSPARC-II 400); AFS cell (2 IBM H70 4×RS64-III with SSA and FC disk); online farm (7 IBM H50 4×PPC604e with SSA disk); analysis farm (4 IBM B80 4×POWER + Sun E450 4×UltraSPARC-II 400); file servers (2 IBM H80 6×RS64-III 500); DB2 server (IBM F50 4×PPC604e 166); CISCO Catalyst 6000 switch, NFS and AFS, 100 Mbps and 1 Gbps links]

31  CPU estimate: details
Extrapolated from 2002 data with some MC input
2002: <L> = 36 μb−1/s, <T3> = 1560 Hz
  345 Hz φ + Bhabha
  680 Hz unvetoed cosmic rays
  535 Hz background
2004: <L> = 100 μb−1/s (assumed), <T3> = 2175 Hz
  960 Hz φ + Bhabha
  680 Hz unvetoed cosmic rays
  535 Hz background (assumed constant)
From MC:
  σ(φ) = 3.1 μb (assumed)
  φ + Bhabha trigger: σ = 9.6 μb
  φ + Bhabha after FILFO: σ = 8.9 μb
  CPU(φ + Bhabha) = 61 ms avg.
CPU time calculation (cross-checked in the sketch below):
  4.25 ms to process any event
  + … ms for 60% of background events
  + 61 ms for 93% of φ + Bhabha events
  2002: 19.6 ms/evt overall (OK)
  2004: 31.3 ms/evt overall (±10%)
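A small numerical cross-check of the extrapolation above: it recomputes the φ + Bhabha trigger rates from L × σ and the overall ms/event averages. The per-event CPU cost for background events is not quoted on the slide, so the sketch back-solves it from the 2002 average; treat that value as derived, not official.

# Cross-check of the CPU-per-event extrapolation (values from the slide,
# except t_bkg, which is back-solved from the quoted 2002 average).

sigma_phi_bhabha = 9.6      # μb, trigger cross section for φ + Bhabha
t_base   = 4.25             # ms, spent on every event
t_phibha = 61.0             # ms, reconstruction of a φ + Bhabha event
f_phibha = 0.93             # fraction of φ + Bhabha events fully processed
f_bkg    = 0.60             # fraction of background events fully processed

def avg_ms(lum, r_total, r_bkg, t_bkg):
    """Average CPU time per triggered event (ms)."""
    r_phibha = lum * sigma_phi_bhabha          # Hz, since μb−1/s × μb = Hz
    return (t_base
            + t_phibha * f_phibha * r_phibha / r_total
            + t_bkg    * f_bkg    * r_bkg    / r_total)

# 2002: <L> = 36 μb−1/s, <T3> = 1560 Hz, 535 Hz background, 19.6 ms/evt quoted.
# Back-solve the per-event background reconstruction time from that average:
t_bkg = (19.6 - avg_ms(36, 1560, 535, 0.0)) / (f_bkg * 535 / 1560)
print(f"implied t_bkg       = {t_bkg:.1f} ms")

# 2004 (assumed): <L> = 100 μb−1/s, <T3> = 2175 Hz, 535 Hz background.
print(f"2004 average        = {avg_ms(100, 2175, 535, t_bkg):.1f} ms/evt (slide: 31.3)")
print(f"phi+Bhabha rate 2004 = {100 * sigma_phi_bhabha:.0f} Hz (slide: 960)")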

32  Tape space estimate: details
2001: 274 GB/pb−1; 2002: 118 GB/pb−1
  Highly dependent on luminosity
2004: estimate a priori (recomputed in the sketch below)
Assume: … KB/evt raw event size, the same for all events
  (has varied very little with background over KLOE history)
Assume: <L> = 100 μb−1/s
1 pb−1 = 10^4 s:
  25.0 GB for 9.6M physics events
  31.7 GB for 12.2M background events (1215 Hz of background for 10^4 s)
  56.7 GB/pb−1 total
[Table: stream volumes in GB/pb−1, raw and recon, including the effects of streaming changes: K+K− 11.6, K_S K_L 3.3, radiative 6.4, Bhabha, other 0.8; totals 98 and 49]
MC: assumes 1.7M evt/pb−1 produced, φ → all (1:5) and φ → K_S K_L (1:1)
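A quick recomputation of the raw-volume estimate above; the per-event raw size is not quoted on the slide, so the sketch derives it from the 25.0 GB / 9.6M-event figure and should be read as an implied value, not an official constant.

# Recompute the 2004 raw tape-volume estimate (slide numbers in comments).

lum_rate   = 100.0      # μb−1/s, assumed average luminosity
seconds    = 1e4        # 1 pb−1 = 10^4 s at this luminosity
sigma_trig = 9.6        # μb, physics (φ + Bhabha) trigger cross section
bkg_rate   = 1215.0     # Hz of background + unvetoed cosmic rays

phys_events = lum_rate * sigma_trig * seconds    # 9.6e6, as on the slide
bkg_events  = bkg_rate * seconds                 # 1.215e7, i.e. ~12.2M

# Implied raw event size, derived from the quoted 25.0 GB for 9.6M events:
event_size_kb = 25.0e9 / phys_events / 1e3       # ~2.6 kB/event

raw_gb_per_pb = (phys_events + bkg_events) * event_size_kb * 1e3 / 1e9
print(f"implied raw event size : {event_size_kb:.1f} kB")
print(f"raw volume per pb−1    : {raw_gb_per_pb:.1f} GB (slide: 56.7)")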