Evolving ATLAS Computing Model and Requirements
Michael Ernst, BNL, with slides from Borut Kersevan and Karsten Koeneke
U.S. ATLAS Distributed Facilities Meeting

1 Evolving ATLAS Computing Model and Requirements. Michael Ernst, BNL, with slides from Borut Kersevan and Karsten Koeneke. U.S. ATLAS Distributed Facilities Meeting, UCSC, November 13, 2012.

2

3 ATLAS Computing Mar-Aug 2012

4 Computing Resource Usage in 2012

5 Current Resource Usage

6 Resource Usage at Tier-2s

7

8 Contributions by Country (Production and Analysis). Includes beyond-pledge resources.

9 Contribution by Job Type (Production and Analysis). US: 22% of available CPU used for analysis; 77% of analysis done at Tier-2s.

10 Contribution to Simulation (Aug-Oct). Average number of fully utilized cores per site: 3843, 1867, 1762, 1067, 896.

11 Contribution to Pile-up (Aug-Oct). Average number of fully utilized cores per site: 1818, 374, 1128, 526.

12 Contribution to Analysis (Aug-Oct). Average number of fully utilized cores per site: 342, 1024, 590, 720, 395.

13 Contribution to Reconstruction (Aug-Oct). Average number of fully utilized cores per site: 512, 108, 122, 112.
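The "average number of fully utilized cores" metric quoted on the contribution slides is simply total consumed CPU time divided by the length of the accounting window. A minimal sketch; the 3843-core figure and the 92-day Aug-Oct window come from the slides, while the helper name and everything else is illustrative:

```python
def avg_utilized_cores(cpu_seconds: float, window_seconds: float) -> float:
    """Average number of fully utilized cores over an accounting window.

    A site that consumed N core-days of CPU during an N-day window was,
    on average, keeping one core fully busy; scale accordingly.
    """
    return cpu_seconds / window_seconds

# Illustration: 92 days (Aug-Oct) of accounting at the top simulation site.
window = 92 * 24 * 3600                      # accounting window in seconds
cpu_used = 3843 * window                     # CPU-seconds consumed
print(avg_utilized_cores(cpu_used, window))  # 3843.0
```

The same division also works in reverse: multiplying the quoted core counts by the window length recovers the raw CPU-time totals behind each slide.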

14 Balancing Resources across the Tier-1 and Tier-2s for cost/benefit optimization. E. Lancon (ICB Chair) at the October ICB Meeting.

15 Resource Development

16 Evolution and Prediction of Price/Performance of CPU Servers (B. Panzer, CERN). In the US we have observed prices going up slightly since 2011; Moore's law hasn't helped improve the price/performance ratio. Future?
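Why a stalled price/performance curve hurts so much: resource planning assumes a constant budget buys exponentially more capacity every year. A toy projection; the 25%/year improvement rate, the 4-year horizon, and the budget units are illustrative assumptions, not CERN or ATLAS figures:

```python
def capacity_bought(budget_per_year: float, start_price_per_unit: float,
                    annual_improvement: float, years: int) -> float:
    """Total capacity a constant yearly budget buys if the price per
    unit of capacity falls by `annual_improvement` each year."""
    total = 0.0
    price = start_price_per_unit
    for _ in range(years):
        total += budget_per_year / price
        price *= (1.0 - annual_improvement)   # cheaper next year
    return total

# 100 budget units/year for 4 years, starting at 1 unit of capacity
# per budget unit:
improving = capacity_bought(100.0, 1.0, 0.25, 4)  # ~648 units
flat = capacity_bought(100.0, 1.0, 0.0, 4)        # 400 units
```

With a 25%/year gain the same money buys roughly 60% more capacity over four years than in the flat-price case, which is the gap that opens up when Moore's law stops translating into cheaper servers.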

17 Multi-core vs. Many-core
- A typical modern compute server has tens of cores
- The number of cores in commodity machines grows arithmetically
- The number of cores in the enterprise space still grows geometrically
- The number of cores in our datacenters grows somewhere between the two, and is expected to slow down in the long run
- Many-core is not multi-core
- Memory hierarchy issues are showing up: cache coherency, NUMA
- Memory bandwidth and I/O paths may be constraining
- Multiprocess is a convenient model, but it is neither sustainable nor scalable
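The sustainability concern in the last bullet is largely about memory: one independent process per core multiplies the whole per-process footprint by the core count, while data that could be shared (geometry, field maps, conditions) only needs to live once per node. A back-of-the-envelope model; all sizes and the 32-core node are illustrative assumptions, not measured ATLAS job footprints:

```python
def node_memory_gb(cores: int, shared_gb: float, private_gb: float,
                   multiprocess: bool) -> float:
    """Memory needed to run one job slot per core.

    multiprocess=True: every process carries its own copy of the
    shareable data, so it counts once per core.
    multiprocess=False: threads (or forked workers relying on
    copy-on-write) keep a single shared copy per node.
    """
    if multiprocess:
        return cores * (shared_gb + private_gb)
    return shared_gb + cores * private_gb

# Illustrative numbers: 32-core node, 1.5 GB of shareable data,
# 0.5 GB of truly private state per worker.
print(node_memory_gb(32, 1.5, 0.5, multiprocess=True))   # 64.0
print(node_memory_gb(32, 1.5, 0.5, multiprocess=False))  # 17.5
```

Under these assumptions the per-node requirement grows linearly with core count in the multiprocess case but stays dominated by the small private term when the large data is shared, which is why the model stops scaling as core counts climb.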

18 Evolution and Prediction of Price for Disk Space (B. Panzer, CERN). Disk prices are ~1.5x compared to the 2010 predictions.

19 Medium-term hardware trends
- Pricing follows market pressure, not technology
- I/O, disk and memory are not progressing at the same rate as compute power
- The bulk of improvements in x86 still comes from Moore's Law
- Enterprise and HPC-targeted developments, where cost-effective, trickle down to our datacenter environment
- Heterogeneous architectures: cross-platform, cross-socket, hybrid CPUs, accelerators, throughput vs. classic computing

20 Non-Intel Hardware
- GPUs: NVIDIA working hard but process technology lagging; P2P communication improved; software getting better; against MIC, Tesla might no longer be competitive
- ARM: slow penetration of the server space; 64-bit instruction set defined (you can buy 32-bit CPUs today); software improvements make ARM look like a viable option
- AMD: lagging behind; recent experiments not compelling
- FPGA: still too far off for mainstream acceleration; software issues
- Upcoming: low-power micro servers (192 cores, 1 GB/core, $35k)

21 First Projections

22 Computing Requirements vs LHC Bunch Spacing

23 Computing Requirements vs LHC Bunch Spacing (S. McMahon)

24 Resource requests rising after LS1

25 Computing Model Changes

26

27 Offline Core Analysis Simulation

28

29

30

31

32

33

34 Summary
- The Facilities have reliably delivered in all areas according to our obligations, and in many areas beyond
- The overall system, comprising facility hardware and services and the ATLAS software, needs to evolve to improve efficiency and to cope with sharply growing requirements after LS1
- Resources were used more effectively with "life without ESDs", PD2P, and a reduced number of dataset replicas, but the potential for further significant gains is shrinking
- A combined effort, driven by analysis and software experts, is needed to get ATLAS Computing prepared for the challenges ahead
- Convinced the LHC machine will deliver …
- LS1 is around the corner, but my impression from the last SW&C week is that there is not much activity in the SW area to address these issues