1
LHC experimental data: From today’s Data Challenges to the promise of tomorrow B. Panzer – CERN/IT, F. Rademakers – CERN/EP, P. Vande Vyvre - CERN/EP Academic Training CERN
2
Computing Infrastructure and Technology Day 2 Academic Training CERN 12-16 May 2003 Bernd Panzer-Steindel CERN-IT
3
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 3 Outline
- tasks, requirements, boundary conditions
- component technologies
- building farms and the fabric
- into the future
4
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 4 Questions
Before building a computing infrastructure some questions need to be answered:
- what are the tasks?
- what is the dataflow?
- what are the requirements?
- what are the boundary conditions?
5
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 5 Experiment dataflow (diagram): Data Acquisition, High Level Trigger (selection, reconstruction), Raw Data, Event reconstruction, Event Summary Data, Event Simulation, Processed Data, Physics analysis, Interactive physics analysis
6
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 6 Tasks (diagram): detector channel digitization, Level 1 and Level 2 trigger, event building, High Level Trigger (online data processing); data storage and data calibration; offline data reprocessing, offline data analysis, interactive data analysis and visualization; simulated data production (Monte Carlo); leading to the physics result
7
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 7 Dataflow Examples (diagram): tape server, disk server and CPU server layers; flows for DAQ / Central Data Recording, online processing, online filtering, re-processing, MC production + pileup, and analysis, with rates of 1 GB/s, 2 GB/s, 5 GB/s, 50 GB/s and 100 GB/s; CPU intensive scenario for 2008
8
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 8 Requirements and Boundaries (I)
- the HEP applications depend mainly on integer processor performance and much less on floating-point performance -> determines the choice of processor type and the benchmark reference
- a large amount of processing and storage is needed, but the optimization is for aggregate performance, not for single tasks, and the events are independent units -> many components, moderate demands on each single component, coarse-grained parallelism (see the sketch below)
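Since events are independent units, the workload parallelizes trivially at the event level. A minimal sketch of this coarse-grained parallelism in Python; the reconstruct function and the event list are hypothetical placeholders, not taken from the slides:

```
from multiprocessing import Pool

def reconstruct(event):
    # Placeholder for per-event reconstruction; each event is processed
    # independently of all others, so no communication is needed.
    return sum(event) / len(event)   # dummy "result"

if __name__ == "__main__":
    events = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    # Coarse-grained parallelism: farm events out to worker processes;
    # only the aggregate throughput matters, not the speed of a single task.
    with Pool(processes=4) as pool:
        results = pool.map(reconstruct, events)
    print(results)
```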
9
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 9 Requirements and Boundaries (II)
- the major boundary condition is cost: staying within the budget envelope while getting the maximum amount of resources -> commodity equipment, best price/performance
- best price/performance ≠ cheapest! reliability, functionality and performance have to be taken into account together == total cost of ownership
- basic infrastructure and environment: availability of space, cooling and electricity
10
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 10 Component technologies: processor, disk, tape, network, and packaging issues
11
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 11 Coupling of building blocks (diagram): increasing level of complexity, with physical and logical coupling of hardware and software at each level
- CPU, disk -> operating system, driver
- PC (motherboard, backplane, bus, integrating devices: memory, power supply, controller, ...); storage tray, NAS server, SAN element
- cluster, coupled by the network (Ethernet, fibre channel, Myrinet, ...) via hubs, switches, routers -> batch system, load balancing, control software, hierarchical storage systems
- world wide cluster, coupled by the wide area network -> Grid middleware
12
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 12 Processors
- focus on integer price/performance (SI2000)
- PC mass market: INTEL and AMD; the price/performance optimum changes frequently between the two
- weak point of AMD: heat protection, heat production
- current CERN strategy is to use INTEL processors
13
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 13 Price/performance evolution
14
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 14 Industry now tries to fulfill Moore’s Law
15
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 15 Processor packaging
- the best price/performance per node comes today with dual processors and desk-side cases; processors are only 25-30% of the box costs, the rest is mainboard, memory, power supply, case, disk
- today a typical configuration is: 2 x 2.4 GHz PIV processors, 1 GB memory, 80 GB disk, fast ethernet
- this is about two 'versions' behind == 2.8 GHz and 3 GHz are available but do not give a good price/performance value
- one has to add 10% of the box costs for infrastructure (racks, cabling, network, control system); a cost sketch follows below
- 1U rack-mounted case versus desk-side case: thin units can be up to 30% more expensive (trade-off against cooling and space)
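As a rough illustration of the price/performance bookkeeping described above, a hedged back-of-the-envelope sketch; the box price and the SI2000 rating per processor are assumed placeholder values, only the dual-processor configuration and the 10% infrastructure surcharge come from the slide:

```
# Hypothetical numbers for illustration only.
box_cost_chf   = 2500.0                     # assumed price of one dual-CPU "desktop+" box
infrastructure = 0.10 * box_cost_chf        # +10% for racks, cabling, network, control system
si2000_per_cpu = 900.0                      # assumed SI2000 rating of one 2.4 GHz processor
cpus_per_box   = 2

total_cost   = box_cost_chf + infrastructure
total_si2000 = cpus_per_box * si2000_per_cpu
print(f"cost per SI2000: {total_cost / total_si2000:.2f} CHF")
# Note: the processors themselves are only ~25-30% of the box cost; the rest
# is mainboard, memory, power supply, case and disk.
```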
16
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 16 SPACE (photos): computer center and experiment control room
17
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 17 Problems
- seeing effects of market saturation for desktops, and the market is moving in the laptop direction; we are currently using “desktop+” machines, and server CPUs are more expensive to use
- Moore’s Second Law: the cost of a fabrication facility increases at an even greater rate than the transistor density (which doubles every 18 months); current fabrication plants cost ~2.5 billion $ (INTEL profit in 2002: 3.2 billion $)
- heat dissipation: currently heat production increases linearly with performance; terahertz transistors (2005 - ) should reduce leakage currents
- power-saving processors, BUT be careful when comparing effective performance: measures for mobile computing do not help in the case of 100% CPU utilization in 24*7 operation
18
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 18 Processor power consumption and heat production (charts)
19
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 19 Basic infrastructure
- electricity and cooling: large investments are necessary, with a long planning and implementation period
- we use today about 700 kW in the center; the upgrade to 2.5 MW has started, i.e. 2.5 MW for electricity + 2.5 MW for cooling
- this needs extra buildings, will take several years and costs up to 8 million SFr
- this infrastructure does not evolve linearly but in larger step functions
- it is much more complicated for the experimental areas with their space and access limitations
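A small sketch of the kind of estimate behind the 2.5 MW figure; the per-node power consumption is an assumed value, and the node count is only the order of magnitude quoted later for 2008:

```
# Assumed average consumption of one dual-processor node, including PSU losses.
watts_per_node = 250.0
nodes          = 3000          # order of magnitude of CPU nodes foreseen for 2008

it_load_mw = nodes * watts_per_node / 1e6
cooling_mw = it_load_mw        # roughly the same amount again is needed for cooling
print(f"IT load: {it_load_mw:.2f} MW, total with cooling: {it_load_mw + cooling_mw:.2f} MW")
# Disk and tape servers, network equipment and overheads come on top of this.
```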
20
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 20 Disk storage
- density is improving every year (doubling every ~14 months)
- single-stream speed (sequential I/O) is increasing considerably (up to 100 MB/s)
- transactions per second (random I/O, access time) show very little improvement (factor 2 in 4 years, from 8 ms to 4 ms); data rates drop considerably when moving from sequential to random I/O (see the sketch below)
- online/offline processing works with sequential streams; analysis uses random access patterns, and multiple, parallel sequential streams =~ random access
- disks come in different 'flavours' depending on the connection type to the host: the same hardware with different electronics (SCSI, IDE, fiber channel) and different quality selection criteria, e.g. MTBF (Mean Time Between Failures); mass market == lower values
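The drop from sequential to random I/O quoted above can be estimated from the access time alone; in this sketch the 100 MB/s streaming rate and the ~4 ms access time come from the slide, while the 64 KB read size is an assumption:

```
seq_rate_mb_s = 100.0    # sequential streaming rate quoted above
access_time_s = 0.004    # ~4 ms average access time (seek + rotational latency)
read_size_kb  = 64.0     # assumed size of one random read

transfer_time_s = (read_size_kb / 1024.0) / seq_rate_mb_s
random_rate     = (read_size_kb / 1024.0) / (access_time_s + transfer_time_s)
print(f"random I/O: ~{random_rate:.1f} MB/s versus {seq_rate_mb_s:.0f} MB/s sequential")
# The access time dominates, so random access delivers only a small
# fraction of the sequential streaming rate.
```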
21
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 21 Disk performance
22
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 22 Price/performance evolution
23
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 23 Storage density evolution
24
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 24 Storage packaging
- NAS (Network Attached Storage): 10-12 IDE disks attached to a RAID controller inside a modified PC with a larger housing, connected to the network with gigabit ethernet; good experience with this approach, current practice
- alternatives: SAN (Storage Area Network), based on disks directly attached to a fiber channel network; iSCSI (SCSI commands via IP), disk trays with an iSCSI controller attached to ethernet -> R&D, evaluations
- are there advantages of SAN versus NAS which would justify the factor 2-4 higher costs? not only the 'pure' cost per GB of storage counts, but also throughput, reliability, manageability and redundancy
25
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 25 For disk servers the coupling of disks, processor, memory and network (plus LINUX) defines the performance: PCI 120 – 500 MB/s, PCI-X 1 – 8 GB/s
26
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 26 Tape storage
- not a mass market; the market is aimed at backup (write once - read never), whereas we need high throughput and reliability under constant read/write stress
- need automated, reliable access to a large amount of data -> large robotic installations; the major players are IBM and StorageTek (STK)
- improvements are slow, not comparable with processors or disks; current generation: 30 MB/s tape drives with 200 GB cartridges (see the arithmetic below)
- disk and tape storage prices are getting closer: factor 2-3 difference
- two types of read/write technology: helical scan ("video recorder", complicated mechanics) and linear scan ("audio recorder", simpler, lower density); linear is preferred, after some bad experience with helical scan
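Some simple arithmetic on the current-generation drives (30 MB/s, 200 GB cartridges) illustrates the throughput constraint; the 1 GB/s target recording rate is an assumed example value, not a figure from the slide:

```
drive_rate_mb_s = 30.0
cartridge_gb    = 200.0

# Time to fill one cartridge at full streaming speed.
fill_time_h = cartridge_gb * 1024.0 / drive_rate_mb_s / 3600.0
print(f"filling one cartridge takes ~{fill_time_h:.1f} h")

# Drives needed to sustain a given aggregate recording rate (assumed target).
target_gb_s = 1.0
drives = target_gb_s * 1024.0 / drive_rate_mb_s
print(f"sustaining {target_gb_s} GB/s needs ~{drives:.0f} drives writing in parallel")
```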
27
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 27 Network
- commodity Ethernet: 10 / 100 / 1000 / 10000 Mbit/s, sufficient in the offline world and even partly in the online world (HLT); Level 1 triggers need lower latency times -> special networks
- cluster interconnects: Myrinet 1, 2, 10 Gbit/s; GSN 6.4 Gbit/s; infiniband 2.5 Gbit/s * 4 (12)
- storage network: fiber channel 1 Gbit/s, 2 Gbit/s
- these offer very high performance with low latency and a small processor 'footprint', but have a small market and are expensive
28
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 28 “Exotic” technology trends
- nano technology (carbon nanotubes); molecular computing (kilohertz plastic processors, single-molecule switches); biological computing (DNA computing); quantum computing (quantum dots, ion traps, few qbits only) -> very interesting and fast progress in the last years, but far away from any commodity production
- less fancy: game machines (X-Box, GameCube, Playstation 2); advantage: large market (>10 billion $), cheap high-power nodes; disadvantage: little memory, limited networking capabilities
- graphics cards offer several times the raw power of normal CPUs, but are not easy to use in our environment
29
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 29 Technology evolution exponential growth rates everywhere
30
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 30 Building farms and the fabric
31
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 31 Building the Farm
- Processors: “desktop+” node == CPU server
- CPU server + larger case + 6*2 disks == Disk server
- CPU server + Fiber Channel interface + tape drive == Tape server
32
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 32 Software ‘glue’
- management of the basic hardware and software: installation, configuration and monitoring system (from the European Data Grid project)
- management of the processor computing resources: batch system (LSF from Platform Computing)
- management of the storage (disk and tape): CASTOR (the CERN-developed Hierarchical Storage Management system)
33
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 33 Generic model of a Fabric (diagram): application servers, disk servers and tape servers coupled by the network, with a connection to the external network
34
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 34 Today’s schematic network topology (diagram): CPU servers on Fast Ethernet (100 Mbit/s), disk servers and tape servers on Gigabit Ethernet (1000 Mbit/s), a backbone of multiple Gigabit Ethernet links (20 * 1000 Mbit/s), and a Gigabit Ethernet connection to the WAN
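A quick check of the aggregate backbone capacity implied by this topology, using the link count and speed from the slide:

```
backbone_links = 20
link_mbit_s    = 1000.0

aggregate_gbit_s = backbone_links * link_mbit_s / 1000.0
aggregate_gb_s   = aggregate_gbit_s / 8.0          # convert bits to bytes
print(f"backbone: {aggregate_gbit_s:.0f} Gbit/s = {aggregate_gb_s:.1f} GB/s")
```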
35
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 35 LCG Testbed Structure (diagram): 100 CPU servers on GE and 300 on FE, 100 disk servers on GE (~50 TB), 20 tape servers on GE; backbone routers connected by 3 GB and 8 GB lines; server groups (64 + 36 disk servers, 20 tape servers, 100 GE CPU servers, 200 + 100 FE CPU servers) attached by 1 GB lines (GE = Gigabit Ethernet, FE = Fast Ethernet)
36
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 36 Computer center today
- benchmark, performance and testbed clusters (LCG prototype resources): computing data challenges, technology challenges, online tests, EDG testbeds, preparations for the LCG-1 production system, complexity tests; 500 CPU servers, 100 disk servers, ~390000 SI2000, ~50 TB
- main fabric cluster (Lxbatch/Lxplus resources): physics production for all experiments, requests are made in units of SI2000; 1000 CPU servers, 160 disk servers, ~950000 SI2000, ~100 TB
- 50 tape drives (30 MB/s, 200 GB cartridges), 10 silos with 6000 slots each == 12 PB capacity (see the check below)
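The tape figures quoted above can be cross-checked in a couple of lines (all numbers from the slide; 1 PB is taken as 10^6 GB here):

```
silos          = 10
slots_per_silo = 6000
cartridge_gb   = 200.0
tape_drives    = 50
drive_mb_s     = 30.0

capacity_pb   = silos * slots_per_silo * cartridge_gb / 1e6
aggregate_gbs = tape_drives * drive_mb_s / 1024.0
print(f"tape capacity: {capacity_pb:.0f} PB")
print(f"aggregate tape drive bandwidth: {aggregate_gbs:.1f} GB/s")
```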
37
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 37 General Fabric Layout (diagram)
- main fabric cluster: 2-3 hardware generations, 2-3 OS/software versions, 4 experiment environments
- certification cluster: the main cluster ‘en miniature’
- R&D cluster: new architecture and hardware
- benchmark and performance cluster: current architecture and hardware
- development cluster, GRID testbeds
- service control and management (e.g. stager, HSM, LSF master, repositories, GRID services, CA, etc.)
- flow of new software and new hardware (purchase): old -> current -> new
38
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 38 View of different Fabric areas (diagram): infrastructure (electricity, cooling, space); network; batch system (LSF, CPU servers); storage system (AFS, CASTOR, disk servers); purchase, hardware selection, resource planning; installation, configuration + monitoring, fault tolerance; prototype and testbeds; benchmarks, R&D, architecture; automation, operation, control; coupling of components through hardware and software; GRID services!?
39
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 39 Into the future
40
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 40 Considerations
- the current state of performance, functionality and reliability is good, and technology developments still look promising
- more of the same for the future!?!?
- how can we be sure that we are following the right path? how do we adapt to changes?
41
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 41 Strategy
- continue and expand the current system, BUT do in parallel:
- R&D activities: SAN versus NAS, iSCSI, IA64 processors, ....
- technology evaluations: infiniband clusters, new filesystem technologies, .....
- Data Challenges to test scalability at larger scales, “bring the system to its limit and beyond”; we are already very successful with this approach, especially with the “beyond” part (Friday’s talk)
- watch the market trends carefully
42
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 42 CERN computer center 2008
- hierarchical Ethernet network, tree topology (280 GB/s)
- ~8000 mirrored disks (4 PB)
- ~3000 dual-CPU nodes (20 million SI2000)
- ~170 tape drives (4 GB/s)
- ~25 PB tape storage
- the CMS High Level Trigger alone will consist of about 1000 nodes with 10 million SI2000!!
- all numbers hold only IF the exponential growth rate continues!
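A hedged sketch of the “if exponential growth continues” extrapolation behind these 2008 numbers, starting from the 2003 capacities quoted on the “Computer center today” slide; the doubling time is an assumption chosen for illustration, not a figure from the slides:

```
# 2003 starting points taken from the "Computer center today" slide.
cpu_2003_si2000 = 950_000 + 390_000   # main fabric cluster + LCG prototype
disk_2003_tb    = 100 + 50

years = 5                             # 2003 -> 2008

def grow(value, doubling_time_years, years):
    """Exponential growth with a given doubling time."""
    return value * 2 ** (years / doubling_time_years)

# Assumed doubling time of ~1.2 years for both CPU and disk capacity.
print(f"CPU  2008: ~{grow(cpu_2003_si2000, 1.2, years)/1e6:.0f} million SI2000 (slide quotes 20 million)")
print(f"disk 2008: ~{grow(disk_2003_tb, 1.2, years)/1000:.1f} PB (slide quotes 4 PB)")
```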
43
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 43 Tomorrow’s schematic network topology (diagram): CPU servers on Gigabit Ethernet (1000 Mbit/s), disk servers and tape servers on 10 Gigabit Ethernet (10000 Mbit/s), a backbone of multiple 10 Gigabit Ethernet links (200 * 10000 Mbit/s), and a 10 Gigabit Ethernet connection to the WAN
44
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 44 Summary
- quite confident in the technological evolution
- quite confident in the current architecture
- LHC computing is not a question of pure technology, but of the efficient coupling of components, hardware and software
- commodity is a must for cost efficiency
- boundary conditions are important; market developments can have large effects
45
CERN Academic Training 12-16 May 2003 Bernd Panzer-Steindel CERN-IT 45 Tomorrow
Day 1 (Pierre VANDE VYVRE) – Outline, main concepts – Requirements of LHC experiments – Data Challenges
Day 2 (Bernd PANZER) – Computing infrastructure – Technology trends
Day 3 (Pierre VANDE VYVRE) – Data acquisition
Day 4 (Fons RADEMAKERS) – Simulation, Reconstruction and analysis
Day 5 (Bernd PANZER) – Computing Data Challenges – Physics Data Challenges – Evolution