1
W.Murray PPD 1 Bill Murray RAL, CCLRC w.murray@rl.ac.uk RAL, 5th February 2007 Particle Physics Why do we want all those computers? Reconstruction of events? How do we measure things? Why all the Monte Carlo?
2
W.Murray PPD 2 Who am I? Read physics at Oxford (1984) Followed by a particle physics Ph. D. in Cambridge (1987) Been at RAL ever since Now largely working on ATLAS Preparing physics analysis The search for the Higgs Boson
3
W.Murray PPD 3 Ideas Data volumes Signal and background separation Monte Carlo – why? Differences from astrophysics, biology, etc.
4
W.Murray PPD 4 Typical process (ATLAS) Proton beams circulate the LHC in 3564 bunches Bunches collide at fixed points around the ring Every 25ns (40MHz) Each time bunches cross, some protons are broken up in 'inelastic collisions' These are called 'events' The protons have 7000GeV (~7000 proton masses) Enough energy to make new particles in the collision We want to analyse events to extract properties, generally of the new particles
5
W.Murray PPD 5 1 of 730 detectors built here: 12cm by 6cm 1536 readout strips 12 chips read at 40MHz Built with minimum mass Precise to a few μm
6
W.Murray PPD 6 The Semiconductor Tracker (SCT) 4 barrels: the last is being inserted into the other 3 in the picture; 1 is a UK task Each has ~600 detector modules mounted The detectors are from the previous slide
7
W.Murray PPD 7 Debris of a proton collision? This is a simulated proton-proton collision Pieces fly everywhere A single event can look complicated We will get 23 at once.
8
W.Murray PPD 8 ATLAS event size Total data size 1.5MBytes At 40MHz that is ~60TB/second (a petabyte every ~17 seconds) We will record 200Hz, i.e. 0.0005%, or a mere 3PB/year
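A quick sanity check of those rates as a short back-of-envelope calculation in Python; the event size and rates are from the slide, while the 10^7 live seconds per year is my own assumption.

```python
# Back-of-envelope ATLAS data rates (event size and rates taken from the slide).
event_size_bytes = 1.5e6       # ~1.5 MB per event
collision_rate_hz = 40e6       # 40 MHz bunch-crossing rate
recorded_rate_hz = 200         # events actually written to storage
live_seconds_per_year = 1e7    # assumed LHC "live" time per year

raw_rate = event_size_bytes * collision_rate_hz                      # before the trigger
stored = event_size_bytes * recorded_rate_hz * live_seconds_per_year

print(f"raw rate      : {raw_rate / 1e12:.0f} TB/s")                 # ~60 TB/s
print(f"kept fraction : {recorded_rate_hz / collision_rate_hz:.4%}") # ~0.0005%
print(f"stored / year : {stored / 1e15:.1f} PB")                     # ~3 PB
```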
9
W.Murray PPD 9 LHC Possible runs? Don't shoot me...just random guesses PB limited by bandwidth, not luminosity!
10
W.Murray PPD 10 Event Reconstruction Two main detector components Tracking Measure particles' positions without disturbing them Follow their flight path Extrapolate to where they came from Calorimetry Measure their energy Stop the particle in the process Don't miss anything!
11
W.Murray PPD 11 The challenges in tracking Pattern recognition Can we work out where the particles went? And reconstruct their position, momentum etc. Occupancy: What if two particles hit 1 detector? Alignment Where are the detector components? Essential for the first problem
12
W.Murray PPD 12 ATLAS pattern recognition We do not detect particle tracks We detect energy deposited at points A segment of the ATLAS silicon tracker Each dot says a particle went here Some are noise Need to do pattern recognition ('Straw tubes' not shown)
13
W.Murray PPD 13 Part of an event Detectors measure 3D So this presentation is misleading Can look at rz or xy Still a mess
14
W.Murray PPD 14 Now do track finding Coloured lines indicate reconstructed particles Curvature shows low momentum Still many unused hits
15
W.Murray PPD 15 Pattern Recognition A combinatorics problem Find the tracks of charged particles in a known magnetic field These follow a helical path Circle in xy, ≈ straight line in rz 5 parameters (d_0, z_0, φ, θ, p_T) 3 points define a circle in the xy plane 2 would give the angle in the rz plane Just need to try all combinations! Not possible – clever algorithms to reduce CPU
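To make the "3 points define a circle" step concrete, here is a minimal hypothetical sketch (not ATLAS code): it computes the circumcircle of three hits in the xy plane and converts the radius to a transverse momentum with the standard pT ≈ 0.3·B·R rule. The hit coordinates and the 2 T field are invented for illustration.

```python
import numpy as np

def circle_through_3_points(p1, p2, p3):
    """Circumcircle of three (x, y) hits: returns centre and radius."""
    ax, ay = p1
    bx, by = p2
    cx, cy = p3
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        raise ValueError("hits are collinear (very high pT, radius -> infinity)")
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), np.hypot(ax - ux, ay - uy)

# Three hits (in metres) from a toy low-momentum track in a 2 T solenoid field.
centre, radius = circle_through_3_points((0.05, 0.001), (0.30, 0.020), (0.55, 0.060))
pt_gev = 0.3 * 2.0 * radius            # pT [GeV] ~ 0.3 * B[T] * R[m]
print(f"radius = {radius:.2f} m, pT = {pt_gev:.1f} GeV")
```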
16
W.Murray PPD 16 Track fitting We found the hits left by the track Need to extract the 5 parameters Fit for them Complex in detail Need to allow for scattering in measuring material And energy loss as particle progresses And reject possible incorrect points. Occupancy sometimes an issue: What if EVERY detector element records a hit?
17
W.Murray PPD 17 Alignment We need to know where the detectors are Precisions of order 10μm in a detector 50m long Build as well as possible Measure where you built it – and how gravity bent it Still not good enough Add laser calibration systems Measure length as detector creaks Track changes But they typically measure support structures, not sensitive elements Ultimately check: particle track self-consistency.
18
W.Murray PPD 18 Alignment via tracks If detector elements misplaced, tracks will appear distorted Adjust positions to achieve good helices Also use known quantities. e.g. Z mass is known Find pairs of tracks from Z's Constrain alignment to give correct mass etc. CDF example True mass is 91.187GeV Obtained 91.184 – not bad!
19
W.Murray PPD 19 Alignment issues To align ATLAS silicon Some 6000 modules Each has position & angle to be fixed 6 parameters Makes for ~36,000 parameters Just invert a 36,000 × 36,000 matrix We have never yet managed this (to my knowledge) But it will take hours to days
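As an illustration of why that matrix solve is the bottleneck, here is a toy timing exercise (not the real alignment code): time a small dense solve with NumPy and extrapolate with the O(n^3) scaling to the ~36,000 alignment parameters. The matrix here is random and only stands in for the real one.

```python
import time
import numpy as np

# Solving a dense n x n system scales like n^3: time a small case and extrapolate.
n_small = 2000
A = np.random.rand(n_small, n_small) + n_small * np.eye(n_small)  # well-conditioned stand-in
b = np.random.rand(n_small)

t0 = time.time()
np.linalg.solve(A, b)
t_small = time.time() - t0

n_full = 36_000                                  # ~6000 modules x 6 parameters each
t_full = t_small * (n_full / n_small) ** 3       # crude O(n^3) extrapolation
print(f"{n_small}x{n_small} solve took {t_small:.2f} s -> "
      f"a {n_full}x{n_full} solve would take roughly {t_full / 3600:.1f} CPU hours")
```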
20
W.Murray PPD 20 Calorimetry Calibration Energy measurement needs calibration too Known quantities help e.g. Z mass, measured via Z→ee in the calorimeter Or compare with momentum in the tracker: Here electron energy is shown, divided by momentum Also need to simulate this (red line; CDF plot)
21
W.Murray PPD 21 Circularity We need alignment parameters in order to reconstruct tracks We need tracks to obtain alignment parameters But we need alignment parameters to reconstruct tracks “So 'twas on the Monday morning that the gas-man came to call” We will have to iterate: Reconstruct data as it is recorded Produce improved alignment parameters Reconstruct the data again later.
22
W.Murray PPD 22 Making Measurements We have calibrated our detectors Reconstructed tracks, energy Now...what can we measure/discover?
23
W.Murray PPD 23 Analyse Events? That is: measure properties Mass, strengths of forces, charges, particle spins, decay rates, etc. Virtually all done by reducing data to a histogram Comparing with a simulated histogram Adjusting parameters of the simulated histogram until it matches the observed one Declaring the parameter values 'measured' Why not just count, e.g. the number of W bosons? Efficiency and background!
24
W.Murray PPD 24 Efficiency You have to define your signal somehow: e.g. for the W boson. It decays in ~10^-26 seconds. Let's look for the decay to muon plus neutrino. Need: A muon measured With momentum over some threshold A neutrino not seen (momentum does not add up) These are called 'cuts' What fraction of my signal do I see? You need to know the efficiency of the cuts They will not be 100%... partly because of the need to remove background
25
W.Murray PPD 25 Background All processes other than the one we want e.g. for W → μν: A b quark decaying to μν A jet which is mistaken for a μ, and the mistake generates apparent momentum imbalance These look somewhat like our signal (not very) Some of them will pass our selection cuts We need to know how many σ is the probability of making a W (the cross-section); L is the proton collision rate (the luminosity) To extract σ = (N_observed − N_background) / (L · ε_W) we need N_background and ε_W
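Putting that counting relation into numbers, a minimal sketch; every number here (counts, efficiency, luminosity) is invented purely to show how σ is extracted.

```python
# sigma = (N_observed - N_background) / (L * efficiency), with illustrative numbers only.
n_observed   = 12_450     # events passing the W -> mu nu selection
n_background = 1_050      # estimated background passing the same cuts
efficiency_w = 0.75       # fraction of true W -> mu nu events passing the cuts
luminosity   = 2.0e3      # integrated luminosity in pb^-1 (assumed)

sigma = (n_observed - n_background) / (efficiency_w * luminosity)     # pb
stat_error = n_observed ** 0.5 / (efficiency_w * luminosity)          # Poisson error on the count
print(f"sigma(W -> mu nu) = {sigma:.2f} +/- {stat_error:.2f} pb (stat. only)")
```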
26
W.Murray PPD 26 Why is Particle Physics unusual? We have a model! There is a theory to be tested which completely specifies the expected (average) properties of the data Many branches of science do not have that It means we can be (excessively) quantitative Or make optimal use of our data
27
W.Murray PPD 27 W mass measurement Backgrounds very low But need to be known Agreement of data with fit looks great But it has to be! Mass extracted by varying the model, minimising χ² (Plots: CDF muons, CDF electrons)
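A toy version of "varying the model, minimising χ²", with made-up resolution and pseudo-data rather than CDF's: slide the mean of a Gaussian template across a histogram until the χ² is smallest.

```python
import numpy as np

rng = np.random.default_rng(1)
true_mass, resolution = 80.4, 2.5                 # GeV; illustrative values only
data, edges = np.histogram(rng.normal(true_mass, resolution, 20_000),
                           bins=60, range=(70.0, 90.0))
centres = 0.5 * (edges[:-1] + edges[1:])

def chi2(mass_hypothesis):
    # Gaussian template with the mass floated, normalised to the data.
    template = np.exp(-0.5 * ((centres - mass_hypothesis) / resolution) ** 2)
    template *= data.sum() / template.sum()
    return np.sum((data - template) ** 2 / np.maximum(data, 1.0))   # Poisson variance per bin

scan = np.arange(79.0, 82.0, 0.01)
best = scan[np.argmin([chi2(m) for m in scan])]
print(f"fitted mass = {best:.2f} GeV (input was {true_mass} GeV)")
```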
28
W.Murray PPD 28 Example: τ pairs in CDF Find τ particle pairs Calculate mass of parent Assuming they have one Compare data with simulated backgrounds Good signal for Z to ττ Is that all?
29
W.Murray PPD 29 Example: Φ in CDF The 'Φ' is a sort of Higgs boson Might exist Find 'tau' particle pairs Calculate mass of parent Assuming they have one Compare data with simulated backgrounds
30
W.Murray PPD 30 Example: Φ in CDF Now compare with expectation if the 'Φ' exists with mass 160GeV Could be a signal! Not clear yet.
31
W.Murray PPD 31 How will CDF decide? Collecting more data Excess seen probably just statistical fluctuation Could just go away If not, then want to study it Plot excess against other quantities Needs many more events Test background estimation Estimating the known background rate is crucial Are they all understood? Need to study the background process in some other place (where definitely no signal) Validate/tune simulation using that Then simulate the background here The best simulations are Monte Carlo
32
W.Murray PPD 32 What is Monte Carlo? The standard model of particle physics is quantum mechanical Quantum Mechanics is inherently random It tells you the probability of all possible outcomes of an experiment But not which one will happen When proton beams collide we want to simulate what will happen The Monte Carlo technique mimics real life, by rolling dice to decide which possibility will happen
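The dice-rolling idea in a few lines of hypothetical Python: pick one outcome per event according to its quantum-mechanical probability. The branching fractions below are approximate W-boson values, used only as an example.

```python
import random

# Each simulated W boson "rolls dice" to choose a decay channel with the right probability.
channels = {"e nu": 0.108, "mu nu": 0.108, "tau nu": 0.108, "hadrons": 0.676}

def simulate_decay():
    return random.choices(list(channels), weights=list(channels.values()))[0]

decays = [simulate_decay() for _ in range(100_000)]
for name, prob in channels.items():
    print(f"{name:8s} expected {prob:.3f}  simulated {decays.count(name) / len(decays):.3f}")
```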
33
W.Murray PPD 33 Geant 4 Geant is a package for 'Geometry and Tracking' You describe the detector geometry e.g. ATLAS You tell it the particles produced in the centre e.g. A proton, or a whole debris of a pp collision It will simulate particle progress Out through the silicon detectors Does it scatter? How much? To the calorimeter How does it deposit its energy Does any energy leak out further? It also simulates the detector response Needs detailed models of geometry, materials, interactions etc.
34
W.Murray PPD 34 Geant 4 example Scattering of low-energy muons by a thin steel target Data falls by orders of magnitude Geant not perfect So problem reported
35
W.Murray PPD 35 Monte Carlo is for integration Your efficiency for a certain type of event 'Certain type' must be defined The cuts to pass Monte Carlo will tell you what fraction of your signal and background pass those cuts It is a very good way to integrate. Many, many details can be included Basic cuts Resolution of detectors Unusual behaviour of particles (decay, interaction) Detector defects Very hard to write an integral for the above Would have a huge number of dimensions
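A minimal sketch of "Monte Carlo as integration": generate toy events, apply the cuts, and the efficiency is simply the fraction that survives. The distributions and cut values are invented; the point is that arbitrarily complicated cuts cost nothing extra.

```python
import numpy as np

rng = np.random.default_rng(42)
n_events = 100_000

# Toy simulated W -> mu nu events (shapes invented for illustration).
muon_pt    = rng.exponential(scale=25.0, size=n_events) + 5.0   # GeV
muon_eta   = rng.uniform(-3.0, 3.0, size=n_events)
missing_et = rng.exponential(scale=30.0, size=n_events)         # GeV

passed = (muon_pt > 20.0) & (np.abs(muon_eta) < 2.5) & (missing_et > 25.0)
efficiency = passed.mean()
error = np.sqrt(efficiency * (1.0 - efficiency) / n_events)     # binomial error on the estimate
print(f"cut efficiency = {efficiency:.3f} +/- {error:.3f}")
```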
36
W.Murray PPD 36 Monte Carlo is slow Simulating one pion can take 500 CPU seconds Typical event more likely 30 minutes And the LHC will spit out 1,000,000,000 per second. Um.
37
W.Murray PPD 37 Simulation of a H ee event in ATLAS Geant 4 model of ATLAS
38
W.Murray PPD 38 What to simulate? We want to compare data with simulation The data was expensive We want to make best use of it We do not want to be limited by the statistics of the simulation We would like to simulate say 10 times the data In order to have smaller errors Both for the signal we are interested in And for all the backgrounds. BUT
39
W.Murray PPD 39 Background? LHC backgrounds! “Every event at a lepton collider is physics; every event at a hadron collider is background” – Sam Ting
40
W.Murray PPD 40 So what should we simulate? For my current analysis 'ttH' ~1000 signal events per year 1000 × 10^10 background events Or 0.5 × 10^13 CPU hours That is half a billion good CPU years Perhaps if every computer on earth were used I could generate the background for my analysis
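The slide's arithmetic, spelled out (using the ~30 CPU minutes per event from the earlier slide):

```python
n_background_events = 1000 * 10**10         # 10^13 background events
cpu_hours = n_background_events * 0.5       # ~30 CPU minutes per fully simulated event
cpu_years = cpu_hours / (24 * 365)
print(f"{cpu_hours:.1e} CPU hours ~ {cpu_years:.1e} CPU years")   # about half a billion
```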
41
W.Murray PPD 41 This is ridiculous The above is not possible And we are not that stupid We can pre-select events, Adjust physics generator so it only generates things ATLAS might trigger Reject many of those at level of simulated protons and electrons i.e. before running Geant The slow step But these are approximations which need to be validated
42
W.Murray PPD 42 What do we want to achieve? We aim to make optimal use of the data collected Data are expensive: Use powerful techniques Data processing is also expensive: Mathematical perfection is not the only criterion Systematic errors may well dominate We need to be able to justify our results.
43
W.Murray PPD 43 Searching for a new particle Have we found a new signal?
44
W.Murray PPD 44 Signal Recognition Consider separating a dataset into 2 classes Call them background and signal A simple cut is not optimal (Figure: background and signal regions in an n-dimensional space of observables, e.g. E_T^miss, number of leptons)
45
W.Murray PPD 45 The right answer II What is optimal? Maybe something like this might be... (Figure: background and signal regions with a candidate selection boundary)
46
W.Murray PPD 46 The right answer III For a given efficiency, we want to minimize background Leading to the Likelihood Ratio, L_s/L_b ● Sort by the signal to background ratio around each event ● Accept all areas with s/b above some threshold
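A hypothetical one-dimensional sketch of the likelihood-ratio rule (the real problem is many-dimensional and the densities come from simulation, as the next slides discuss): rank events by L_s/L_b and keep everything above a threshold.

```python
import numpy as np
from scipy.stats import norm

# Toy 1-D densities standing in for the signal and background likelihoods.
signal_pdf     = norm(loc=1.5, scale=1.0)
background_pdf = norm(loc=0.0, scale=1.0)

def likelihood_ratio(x):
    return signal_pdf.pdf(x) / background_pdf.pdf(x)

# Candidate events: an (unknown to us) mixture of signal and background.
rng = np.random.default_rng(0)
events = np.concatenate([rng.normal(1.5, 1.0, 200), rng.normal(0.0, 1.0, 800)])

threshold = 1.0                                 # tune for the desired efficiency/purity
selected = events[likelihood_ratio(events) > threshold]
print(f"kept {selected.size} of {events.size} candidates")
```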
47
W.Murray PPD 47 Determination of s, b densities We may know matrix elements Not for e.g. a b-tag But anyway there are detector effects Usually taken from simulation
48
W.Murray PPD 48 Using MC to calculate density Brute force: Divide our n-D space into hypercubes with m divisions of each axis m^n elements, need 100·m^n events for a 10% estimate. e.g. 1,000,000,000 for 7 dimensions and 10 bins in each This assumed a uniform density – actually need far more The purpose was to separate different distributions
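The same counting as a two-line calculation (the "100 events per bin for a 10% estimate" rule of thumb is the slide's):

```python
bins_per_axis = 10
for dims in (2, 4, 7):
    cells = bins_per_axis ** dims
    print(f"{dims} dimensions: {cells:,} cells -> ~{100 * cells:,} MC events needed")
```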
49
W.Murray PPD 49 Better likelihood estimation Clever binning Starts to lead to tree techniques Kernel density estimators Size of kernel grows with dimensions Edges are an issue Ignore correlations in variables Very commonly done ‘I used likelihood’ Pretend measured=true, correct later Again using Monte Carlo to correct
50
W.Murray PPD 50 Alternative approaches Neural nets Frequently used in HEP, good for high-dimensions Support vector machines Computationally easier than kernel Decision trees Boosted or not?
51
W.Murray PPD 51 How to calculate densities
52
W.Murray PPD 52 Kernel Likelihoods ● Directly estimate Probability Density Function of distributions based upon training sample events. ● Some kernel, usually Gaussian, smears the sample ● increases widths ● Width of kernel must be optimised ● Fully optimal if infinite MC statistics ● Then we can use narrow kernels
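A minimal 1-D kernel density estimate, as a sketch of the idea (real uses are multi-dimensional and the kernel width h has to be optimised, as the following slides stress): replace each training event by a Gaussian of width h and average.

```python
import numpy as np

def kde(x, sample, h):
    """Gaussian kernel density estimate of `sample`, evaluated at points `x`."""
    kernels = np.exp(-0.5 * ((x - sample[:, None]) / h) ** 2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=0)

sample = np.array([0.3, 0.9, 1.1, 1.8, 2.4])    # cf. the "5 events, 1D" example that follows
x = np.linspace(-1.0, 4.0, 6)
for h in (0.1, 0.5, 1.0):                       # too narrow, about right, too wide
    print(f"h={h}: ", np.round(kde(x, sample, h), 3))
```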
53
W.Murray PPD 53 Smearing 5 events, 1D The kernel width is crucial Problem much worse in higher dimensions
54
W.Murray PPD 54 Varying event nos. More events always helps
55
W.Murray PPD 55 Kernel Likelihoods: nDim ● Size and aspect ratio hard to optimize ● Watch kernel size dependence on stats. ● Kernel size must grow with dimensions; ● Lose precision if unnecessary dimensions added ● Need to choose which variables to use ● Big storage/computational requirements
56
W.Murray PPD 56 Summary of classification: Looking for needles in haystacks – the Higgs particle ‘Optimal’ statistics have poor scaling: likelihood techniques ~N^3 For large data sets the main errors are not statistical As data and computers grow with Moore's Law, we can only keep up with N log N A way out? Discard the notion of optimal (data is fuzzy, answers are approximate) Don't assume infinite computational resources or memory Requires a combination of statistics & computer science
57
W.Murray PPD 57 Conclusions The LHC will start soon Producing a torrent of data The conclusion of the 43-year-old hunt for the Higgs boson is close But it will take a lot of analysis to get there The data volumes & CPU times mean all our ingenuity will be taxed to deal with them But any exciting new discoveries would still require using all that data – and wanting more.
58
W.Murray PPD 58 H → ZZ → l+l- l+l- Golden channel for m_H > 140GeV/c² Above ~200GeV, two real Z's Good mass resolution, trigger Backgrounds: Irreducible QCD ZZ → llll Reducible Zbb, tt Multivariate (p_T, η) methods for low m_H ATLAS toroids help