Slide 1 – CDF needs at Tier 1 ("Necessità CDF al Tier1")
Stefano Belforte – INFN Trieste, 3 Apr 2002
- Many details are in the slides for (your) future reference; the talk will move faster.
- Not really clear what I have to say today (newcomer).
- I will try to address both the motivations (Comitato di Gestione) and the hardware specifics (Comitato Tecnico).
- More info on my web page and the links therein.

Slide 2 – A bit of history
We thought about a central analysis center in Italy in '99 and rejected it:
- No INFN site volunteered.
- Worried about manpower needs.
- Worried about guaranteeing a consistent software environment, database export, network speed.
- Leave hardware and system management to professionals, focus on data analysis.
CSN1 approved a plan to:
- Analyze secondary data sets at FNAL.
- Copy ntuples to INFN sites in Italy.
- Spend up to 1 M$ on computers at FNAL (~150 K$ assigned and spent so far).

Slide 3 – What's new
- Tier1 is coming along.
- Software distribution proved to work very well.
- Distributed DB access is a global CDF need and will have to be solved anyhow.
- Networks grew much more than expected.
- GRIDs are coming and many CDF collaborators are pursuing them.
- The FNAL facility will not grow to cover the full needs (especially interactive).
- Mini-farms at INFN sites will not grow beyond <10 nodes.

Slide 4 – CDF needs from Tier1
We want to move the computing of the CDF Italian group from Fermilab to Italy:
- Do not bring bulk data reconstruction to Italy.
- Not a fixed share of overall CDF needs.
- This is a common trend among other CDF collaborators.
Look ahead at the next 8 years of CDF Run 2 data analysis:
- Exploit the fast EU network vs. the slow transatlantic link.
- CDF is moving toward a GRID architecture of Linux clusters.
- CDF Italy participates in the DataTAG test.
Looking forward to using GRID tools for:
- Job submission etc. now.
- Data management later.

Slide 5 – Time frame
- Tevatron Run 2a: …
- Tevatron Run 2b: …
- Data analysis is a continuous process, from this year until after LHC starts; a 2-year overlap is likely.
CDF data analysis:
- Until 2002: data at FNAL, copy ntuples to INFN sites.
- 2003: transition.
- 2004+: data in Italy, copy ntuples to INFN sites.
Monte Carlo (largest uncertainty):
- Large scale productions (1K …) → FNAL.
- Small/private/fast (< 100 …) → Tier1.

Slide 6 – Medium term (Run 2a) needs for Tier1
Limited to Run 2a, i.e. through 2005. Goals:
- Eliminate all Italian hardware at FNAL except the "X-term" desktops; at the Tier1 instead:
  - replicate secondary data sets
  - create/store tertiary data sets
  - create/store ntuples
  - allow interactive analysis of tertiary data sets and ntuples
- Eliminate all non-desktop hardware from INFN sites:
  - no "CDF" environment outside the Tier1
  - only limited PAW/ROOT at local sites, on small final samples, for "publication refinement"
- Share resources with other CDF groups, especially in Europe:
  - collaboration started with the UK + Spain, Germany to join later this year

Slide 7 – Data (orders of magnitude) for Run 2a
- 1 PB (primary = everything) at FNAL (care of Fermilab)
- 100 TB (secondary, part) at FNAL (care of Fermilab)
- 10 TB (secondary, part) at FNAL (INFN funded)
- 10 TB (secondary, copy) at the Tier1 (growing to replace the previous item)
- 1 TB (tertiary) at the Tier1
- 1 TB (ntuples) at the Tier1
By physics data set (i.e. selected process, e.g. W+jets): cross section = 5 nb → 10^7 events
- secondary = 1 TB (100 KB/event) → a few data sets at the Tier1
- tertiary = 100 GB (select 1/10) → all "interesting" data sets at the Tier1
- ntuple = 1 GB (1/10 selection * 10 KB/event, or all events * 1 KB/event, or …) x "a few" x N_users → all at the Tier1
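As a cross-check of these orders of magnitude, here is a minimal back-of-the-envelope sketch in Python. The ~2 fb^-1 integrated luminosity is my assumption (it is what makes 5 nb correspond to 10^7 events); the per-event sizes and selection fractions are the ones quoted above.

```python
# Back-of-the-envelope sizes for one physics data set (e.g. W+jets),
# using the per-event sizes and selection fractions quoted on this slide.
# The integrated luminosity is an assumption (chosen so that 5 nb -> 1e7 events).

cross_section_pb = 5.0 * 1000.0     # 5 nb expressed in pb
int_luminosity_pb = 2000.0          # assumed ~2 fb^-1 for Run 2a

n_events = cross_section_pb * int_luminosity_pb        # ~1e7 events

secondary_gb = n_events * 100e3 / 1e9                  # 100 KB/event -> ~1000 GB (1 TB)
n_tertiary   = n_events / 10                           # keep 1/10 of the events
tertiary_gb  = n_tertiary * 100e3 / 1e9                # -> ~100 GB
ntuple_gb    = n_tertiary * 1e3 / 1e9                  # 1 KB/event on the tertiary -> ~1 GB

print(f"events:    {n_events:.1e}")
print(f"secondary: {secondary_gb:.0f} GB (~1 TB)")
print(f"tertiary:  {tertiary_gb:.0f} GB")
print(f"ntuple:    {ntuple_gb:.0f} GB (per selection, x 'a few' x N_users)")
```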

Slide 8 – Overall hardware needs
Data storage ( … ):
- 10 TB + 10 TB/year for secondary/tertiary data
- 3 TB + 1 TB/year for interactive work
Analysis CPU:
- 10 "1 GHz" CPUs per TB of data (from a 2001 benchmark)
Interactive CPU/disk:
- 2 "up-to-date" CPUs per user x 40 users
- 300 GB per user x 40 users (growing with technology)
- Size this by comparison with the resources available to US students at Fermilab (typical university-owned PCs in offices: 5~7 K$ per desk every 3 years)
Monte Carlo CPU (Gen+Sim):
- ~40 "up-to-date" CPUs (possibly an underestimate)
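Purely as an illustration of the arithmetic behind these figures, a small sketch adding them up year by year. The assumptions that the 10-CPU-per-TB rule applies to the whole disk-resident sample and that the horizon is 2002-2005 are mine, not stated on the slide.

```python
# Totals implied by the per-unit figures on this slide.
# Assumptions (mine): the 10-CPU-per-TB rule is applied to the whole
# disk-resident secondary/tertiary sample, and year 1 corresponds to 2002.

for year in range(1, 5):                         # Run 2a horizon, 2002..2005
    data_tb = 10 + 10 * year                     # 10 TB + 10 TB/year (secondary/tertiary)
    interactive_tb = 3 + 1 * year                # 3 TB + 1 TB/year (interactive)
    analysis_cpu = 10 * data_tb                  # 10 "1 GHz" CPUs per TB of data
    print(f"year {year}: {data_tb} TB data, {interactive_tb} TB interactive, "
          f"{analysis_cpu} analysis CPUs")

users = 40
print(f"interactive CPUs: {2 * users}")          # 2 up-to-date CPUs per user
print(f"user disk:        {0.3 * users:.0f} TB") # 300 GB per user
print("Monte Carlo CPUs: ~40 (Gen+Sim, possibly an underestimate)")
```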

Slide 9 – How/when to get started
Work is resource-limited now (in spite of the low luminosity):
- Need (order of) 2 TB disk, 10 analysis CPUs, 10 interactive CPUs as of "yesterday".
Expect some relief from the new CAF at FNAL by summer, but:
- Slow turn-on curve for effective usage.
- Saturation by starved users.
- Summer conference rush.
Better to start moving some activity already:
- Learning process.
- Feed in experience from FNAL.
- Test/learn/develop new tools.
Release early – release often – listen to users. Start with an important but not critical activity.

Slide 10 – Short term tasks (next 12 months)
Monte Carlo (H → tau-mu, top → 6 jets):
- Using the GRID framework.
- 1 week x 20 PCs every 1~2 months.
- Output = ntuples = O(10 GB) saved "at home"; temporary disk store, tape store welcome.
Secondary data sets (no backup, maybe no NFS – rootd; see the sketch after this list):
- Non-GRID copy to begin with.
- Goal: test the tools on a small-scale (1/10) real exercise:
  - hadronic multijet b-tagged data set for t → 6j and H → bbar
  - 200 GB + 50 GB/month starting "now", 4-10 CPUs
- Add larger data once "it works": B → hadrons tertiary data sets, 2~3 times larger.
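The "(rootd)" remark above refers to serving the secondary data sets through ROOT's remote file daemon rather than via NFS mounts. A minimal sketch of what client access would look like, assuming a rootd server at the Tier1; the host, file path and tree name are hypothetical, and PyROOT is used here only for brevity.

```python
# Open a data file over rootd instead of reading it from an NFS mount.
# Host name, path and tree name below are hypothetical examples.
import ROOT

# root:// URLs are dispatched to ROOT's network file classes, so no local mount is needed
f = ROOT.TFile.Open("root://tier1.example.org//cdf/secondary/btag_multijet_001.root")
if not f or f.IsZombie():
    raise RuntimeError("could not open remote file over rootd")

tree = f.Get("analysis_tree")        # hypothetical tree name
print("entries:", tree.GetEntries())
f.Close()
```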

Slide 11 – Another short term task (2002)
Interactive facility (not absolutely needed, but much welcomed), sized for 20 users:
- A single place for code development/debugging.
- Share tertiary data sets and ntuples, and in general all sorts of small, unstructured data sets.
- Avoid replication, avoid wasting resources.
- Resource sharing allows a faster startup for newcomers.
Already a significant impact with:
- 500 GB + 100 GB/month of user data (RAID).
- 4 CPUs + 2 CPUs/month.
- Interactive access (non-GRID).
- Common home & code areas (10 GB/user x 20 users = 200 GB), backed up; disk quotas, NFS mounts.

Slide 12 – Summary of short term desiderata
Simulation:
- 20 nodes x 10 days every 1~2 months.
- Starting May~June (try the DataTAG testbed in April).
Data analysis:
- 200 GB growing to 2 TB by Christmas.
- 1 processor per 100 GB.
- Starting June~July.
Interactive:
- 500 GB + 10 processors.
- Doubling by Christmas.
- Starting "now".
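Put together, a quick sketch of what these endpoints imply by Christmas; how the ramp is spread over the intervening months is not specified above, so that part is left out.

```python
# Scale implied by the short-term desiderata above (endpoints only;
# the month-by-month ramp is not specified on the slide).

# Data analysis: 1 processor per 100 GB, from 200 GB now to 2 TB by Christmas
for disk_gb in (200, 2000):
    print(f"data analysis: {disk_gb:>4} GB -> {disk_gb // 100:>2} processors")

# Interactive: 500 GB + 10 processors now, doubling by Christmas
print("interactive:   500 GB / 10 CPUs now -> 1000 GB / 20 CPUs by Christmas")

# Simulation: 20 nodes x 10 days every 1~2 months
node_days = 20 * 10
print(f"simulation:    {node_days} node-days per campaign "
      f"(~{node_days // 2}-{node_days} node-days/month)")
```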

Slide 13 – We would like to know the user configuration
Batch? Which? How?
- Queues, priorities, provisions for privileged users.
- Automated job replication? E.g. now at FNAL one submits one executable on 100 files, which is automatically split into 100 jobs potentially running in parallel (see the sketch below).
- Do you have your own to test? Import ours from FNAL?
Tape storage:
- Which DB? HSM?
Interactive?
- May want to copy from here to FNAL.
- How to deal with many competing users?
Should/could we join a "user committee/community"?
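To make the "automated job replication" question concrete, a rough sketch of the pattern described above: one executable run over N input files becomes N independent batch jobs. The submission command and its options are placeholders for whatever the Tier1 batch system actually provides, not a real CNAF interface.

```python
# Sketch of "submit one exe on 100 files, automatically split into 100 jobs".
# The submit_cmd below is a placeholder for the Tier1 batch submission
# command (LSF bsub, PBS qsub, ...); it is not a real CNAF interface.
import subprocess
import sys

def submit_split(exe, input_files, submit_cmd=("bsub", "-q", "cdf")):
    """Submit one independent batch job per input file."""
    for i, path in enumerate(input_files):
        job = list(submit_cmd) + [exe, path, f"output_{i:03d}.root"]
        subprocess.run(job, check=True)   # one job per file, all can run in parallel

if __name__ == "__main__":
    # usage: python split_submit.py myanalysis.exe filelist.txt
    exe, filelist = sys.argv[1], sys.argv[2]
    with open(filelist) as fh:
        files = [line.strip() for line in fh if line.strip()]
    submit_split(exe, files)
    print(f"submitted {len(files)} jobs")
```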