
1999 Summer Student Lectures: Computing at CERN. Lecture 2 — Looking at Data. Tony Cass.


Slide 1: 1999 Summer Student Lectures. Computing at CERN, Lecture 2 — Looking at Data. Tony Cass, Tony.Cass@cern.ch

Slide 2: Data and Computation for Physics Analysis. [Diagram: raw data flows from the detector through the event filter (selection & reconstruction) and event reconstruction to event summary data; event simulation produces equivalent processed data; batch physics analysis extracts analysis objects by physics topic, which feed interactive physics analysis.]

Slide 3: Central Data Recording
 CDR marks the boundary between the experiment and the central computing facilities.
 It is a loose boundary which depends on an experiment’s approach to data collection and analysis.
 CDR developments are also affected by
– network developments, and
– event complexity.
[Diagram: the detector produces raw data, which passes through the event filter (selection & reconstruction).]

Slide 4: Monte Carlo Simulation
 From a physics standpoint, simulation is needed to study
– detector response,
– signal vs. background, and
– sensitivity to physics parameter variations.
 From a computing standpoint, simulation
– is CPU intensive, but
– has low I/O requirements.
Simulation farms are therefore good testbeds for new technology: CSF for Unix, and now PCSF for PCs and Windows/NT. (See the sketch below.)
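The CPU-heavy, I/O-light profile is easy to picture with a toy sketch (not from the lecture): each simulated event costs many random samples and floating-point operations, yet only a tiny summary record is written out. All names and numbers here are illustrative.

```python
import math
import random
import struct

def simulate_event(n_particles=1000):
    """Toy detector response: lots of computation per event."""
    total_energy = 0.0
    for _ in range(n_particles):
        e = random.expovariate(1.0)        # CPU work: many samples per event
        total_energy += e * math.tanh(e)   # stand-in for response modelling
    return total_energy

# Low I/O: each fully simulated event yields only a few bytes of output.
with open("simulated_events.dat", "wb") as out:
    for event_id in range(10_000):
        out.write(struct.pack("<id", event_id, simulate_event()))
```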

Slide 5: Data Reconstruction
 The event reconstruction stage turns detector information into physics information about events. This involves
– complex processing
» i.e. lots of CPU capacity,
– reading all raw data
» i.e. lots of input, possibly read from tape, and
– writing processed events
» i.e. lots of output, which must be written to permanent storage.
[Diagram: raw data passes through event reconstruction to become event summary data.] (A sketch of this shape of job follows.)
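A minimal sketch of that read-everything, compute-hard, write-summaries shape, assuming invented file layouts (fixed-size raw records, one text line per summary); it is an illustration, not any experiment's real reconstruction code.

```python
def read_raw_events(path):
    """Yield raw events one at a time: heavy input, often staged from tape."""
    with open(path, "rb") as f:
        while chunk := f.read(4096):       # assume fixed-size raw records
            yield chunk

def reconstruct(raw_event):
    """CPU-intensive step: turn detector signals into physics quantities."""
    n_hits = sum(raw_event)                # toy stand-in for tracking, fitting, ...
    return {"n_hits": n_hits}

def run(raw_path, esd_path):
    """Write one small event-summary record per raw event."""
    with open(esd_path, "w") as esd:
        for event in read_raw_events(raw_path):
            esd.write(f"{reconstruct(event)['n_hits']}\n")
```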

Slide 6: Batch Physics Analysis
 Physics analysis teams scan over all events to find those that are interesting to them.
– Potentially enormous input
» at least the data from the current year.
– CPU requirements are high.
– Output is “small”
» O(10²) MB
– but there are many different teams, and the output must be stored for future studies
» large disk pools are needed.
[Diagram: event summary data feeds batch physics analysis, which produces analysis objects (extracted by physics topic).] (A sketch of such a scan follows.)
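A hedged sketch of the scan-and-select pattern, reusing the invented one-line-per-event summary format from the reconstruction sketch above; the selection cut is arbitrary.

```python
def passes_selection(event):
    """One team's physics selection; the cut here is purely illustrative."""
    return event["n_hits"] > 50

def scan(esd_path, out_path):
    """Scan every event summary, keep only the interesting ones."""
    kept = 0
    with open(esd_path) as esd, open(out_path, "w") as out:
        for line in esd:                   # enormous input: every event
            if passes_selection({"n_hits": int(line)}):
                out.write(line)            # small output: selected events only
                kept += 1
    return kept
```

Many teams run different selections over the same input, which is why each team's small output has to stay on disk for later study.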

Slide 7: Symmetric MultiProcessor Model. [Diagram: a single large SMP machine connects the experiment, tape storage and terabytes of disk.]

Slide 8: Scalable Model (SP2/CS2). [Diagram: the experiment, tape storage and terabytes of disk attached to a scalable multi-node system.]

Slide 9: Distributed Computing Model. [Diagram: the experiment, tape storage, disk servers and CPU servers interconnected by a switch.]

Slide 10: Today’s CORE Computing Systems (1998 figures).

Slide 11: Today’s CORE Computing Systems. [Diagram of the services on the CERN network:]
– Home directories & registry: 32 IBM, DEC and SUN servers.
– Central Data Services: shared disk servers (2 TeraBytes of disk on 10 SGI, DEC and IBM servers) and shared tape servers (4 tape robots; 90 tape drives: Redwood, 9840, DLT, IBM 3590/3490/3480, EXABYTE, DAT, Sony D1).
– CORE Physics Services: SHIFT data-intensive services with 200 computers, 550 processors (DEC, H-P, IBM, SGI, SUN, PC) and 25 TeraBytes of embedded disk; data recording, event filter and CPU farms for NA45, NA48 and COMPASS.
– Interactive services: DXPLUS, HPPLUS, RSPLUS, LXPLUS and WGS clusters, 70 systems (HP, SUN, IBM, DEC, Linux), plus consoles & monitors.
– RSBATCH public batch service: 32 PowerPC 604 processors.
– NAP accelerator simulation service: a 10-CPU DEC 8400 and 10 DEC workstations.
– Simulation facility: CSF with 25 H-P PA-RISC servers; PCSF for PCs & NT with 10 PentiumPro and 25 Pentium II systems.
– PC farms: 60 dual-processor PCs.
– PaRC engineering cluster: 13 DEC and 3 IBM workstations.

Slide 12: Interactive Physics Analysis
 Interactive systems are needed to enable physicists to develop and test programs before running lengthy batch jobs.
– Physicists also
» visualise event data and histograms,
» prepare papers, and
» send email.
 Most physicists use workstations, either private systems or central systems accessed via an X terminal or PC.
 We need an environment that provides access to specialist physics facilities as well as to general interactive services.

Slide 13: Unix based Interactive Architecture. [Diagram: X terminals, PCs and private workstations on the CERN internal network connect to the PLUS work-group server clusters, which have optimized access to CORE services, a general staged data pool, AFS home directory services, ASIS replicated AFS binary servers, central services (mail, news, ccdb, etc.), backup & archive, reference environments and X-terminal support.]

Slide 14: PC based Interactive Architecture. [Diagram only.]

Slide 15: Event Displays. Event displays, such as this ALEPH display, help physicists to understand what is happening in a detector. A Web based event display, WIRED, was developed for DELPHI and is now used elsewhere. Clever processing of events can also highlight certain features, as in the V-plot views of ALEPH TPC data. [Figures: standard X-Y view and V-plot view.]

Slide 16: Data Analysis Work. By selecting a dE/dx vs. p region on this scatter plot, a physicist can choose tracks created by a particular type of particle. Most of the time, though, physicists study event distributions rather than individual events. RICH detectors provide better particle identification, however: this plot shows that the LHCb RICH detectors can distinguish pions from kaons efficiently over a wide momentum range. Using RICH information greatly improves the signal/noise ratio in invariant mass plots. (A sketch of such a selection follows.)
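A minimal sketch of the selection idea, assuming a rectangular region of the (p, dE/dx) plane and an invented track format; real selections use curved bands following each particle species, and nothing here is taken from ALEPH or LHCb code.

```python
def in_selection(track, p_range=(0.5, 2.0), dedx_range=(1.8, 2.6)):
    """True if the track lies in the chosen (p, dE/dx) region (illustrative cuts)."""
    return (p_range[0] <= track["p"] <= p_range[1]
            and dedx_range[0] <= track["dedx"] <= dedx_range[1])

tracks = [
    {"p": 1.2, "dedx": 2.1},   # inside the region: kept
    {"p": 3.5, "dedx": 1.1},   # outside: rejected
]
selected = [t for t in tracks if in_selection(t)]
print(len(selected))           # -> 1
```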

Slide 17: CERN’s Network Connections. [Diagram: CERN’s external links, among them RENATER, C-IXP, IN2P3, SWITCH, TEN-155 (the Trans-European Network at 155 Mb/s), C&W (US), WHO and ATM test beds, with bandwidths ranging from 2 Mb/s to 155 Mb/s; the legend classes each link as national research network, mission-oriented, public, test or commercial.]

Slide 18: CERN’s Network Traffic, May - June 1999. [Diagram: link bandwidths and measured incoming/outgoing data rates for the C&W (US), RENATER, TEN-155, IN2P3 and SWITCH links. CERN totals: 4.5 Mb/s out, 3.7 Mb/s in, i.e. roughly 1 TB/month in each direction. Rules of thumb: 1 TB/month = 3.86 Mb/s; 1 Mb/s = 10 GB/day.] (The arithmetic behind the rules of thumb is checked below.)
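The slide's conversion factors are easy to check; a quick sketch, assuming decimal units and a 30-day month. The 3.86 Mb/s figure comes out if one counts roughly 10 bits on the wire per byte of data (allowing for protocol overhead); a bare 8 bits/byte gives about 3.1 Mb/s.

```python
bytes_per_tb = 1e12
seconds_per_month = 30 * 24 * 3600

print(bytes_per_tb * 8 / seconds_per_month / 1e6)    # ~3.09 Mb/s at 8 bits/byte
print(bytes_per_tb * 10 / seconds_per_month / 1e6)   # ~3.86 Mb/s at 10 bits/byte,
                                                     # matching the slide

# 1 Mb/s sustained for a day, at 8 bits/byte:
print(1e6 / 8 * 24 * 3600 / 1e9)                     # ~10.8 GB, i.e. roughly
                                                     # 10 GB/day
```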

Slide 19: Outgoing Traffic by Protocol, May 31st - June 6th 1999. [Bar chart: gigabytes transferred (0-350 GB scale) per protocol: ftp, www, X, afs, intr, fio, mail, news, other and total, each split by destination: Europe, USA, elsewhere.]

Slide 20: Incoming Traffic by Protocol, May 31st - June 6th 1999. [Bar chart: gigabytes transferred (0-350 GB scale) per protocol: ftp, www, X, afs, intr, fio, mail, news, other and total, each split by origin: Europe, USA, elsewhere.]

Slide 21: European & US Traffic Growth, Feb ’97 - Jun ’98 (1998 figures). [Chart: US and EU traffic over time; the start of the TEN-34 connection is marked.]

Slide 22: European & US Traffic Growth, Feb ’98 - Jun ’99. [Chart: US and EU traffic over time.]

Slide 23: Traffic Growth, Jun 98 - May/Jun 99. [Charts: growth factors by protocol (ftp, www, X, afs, intr, fio, mail, news, other, total) for total, outgoing and incoming traffic, each split into EU, US and other; growth factors range up to roughly 8.]

Slide 24: Round Trip Times and Packet Loss Rates (1998 figures)
[Charts: round trip times for packets to SLAC, reaching 5 seconds(!), and packet loss rates to/from the US on the CERN link.]
– Traffic to, e.g., SLAC passes over other links within the US, and these may also lose packets.
– The figures are measured with ping: a packet must arrive and be echoed back; if it is lost, it yields no round trip time value. (A sketch of such a measurement follows.)
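A hedged sketch of measuring round trip time and loss in the same spirit as ping. Real ping uses ICMP echo, which needs raw sockets (root privileges); here a TCP connection attempt stands in as the probe, and the host name is just an example.

```python
import socket
import time

def probe(host, port=80, timeout=5.0, count=10):
    """Send `count` probes; lost probes give no RTT, only a higher loss rate."""
    rtts, lost = [], 0
    for _ in range(count):
        start = time.monotonic()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                rtts.append(time.monotonic() - start)
        except OSError:
            lost += 1          # no echo: counts towards loss, yields no RTT
    return rtts, lost / count

rtts, loss = probe("www.slac.stanford.edu")
if rtts:
    print(f"median RTT: {sorted(rtts)[len(rtts) // 2] * 1000:.0f} ms")
print(f"loss rate: {loss:.0%}")
```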

Slide 25: Looking at Data — Summary
 Physics experiments generate data!
– and physicists need to simulate real data to model physics processes and to understand their detectors.
 Physics data must be processed, stored and manipulated.
 [Central] computing facilities for physicists must be designed to take into account the needs of the data processing stages,
– from generation through reconstruction to analysis.
 Physicists also need to
– communicate with outside laboratories and institutes, and to
– have access to general interactive services.

