PHYSTAT 05, P. Catastini

Bias-Free Estimation in Multicomponent Maximum Likelihood Fits with Component-Dependent Templates

Pierluigi Catastini, I.N.F.N. - Pisa and Siena University
Giovanni Punzi, S.N.S. and I.N.F.N. - Pisa

Problem

Suppose we have a sample of particles produced in our experiment by a certain physics process. We know that the sample is a mixture of different particle types, for example pions, protons and kaons, but the proportion of each type is completely unknown. Our experiment is also equipped with some kind of Particle IDentification (PID) device, which measures a quantity related to the particle type. We want to measure the fraction of each particle type: $f_\pi$, $f_P$, $f_K$.

A “Real Life” Problem*

* At least for a physicist…

Measuring particle-type fractions is common in particle physics, e.g. understanding the particles produced during the fragmentation of B mesons (flavor tagging), or separating different particle decays. The PID information usually comes from the energy loss of charged particles in gas (dE/dx), a Time of Flight measurement, Cherenkov light… The solution is obtained by performing an unbinned Maximum Likelihood fit. But remember: the mean of the PID observable strongly depends on the particle momentum (an additional observable, known event by event): Component-Dependent Templates!

[Figure: curves labelled Electrons, Protons, Muons, Kaons, Pions]

Please write the Likelihood!

Unfortunately, the Likelihood is not simply

\[ L(f_\pi, f_P, f_K) = \prod_i \sum_j f_j \, P(\mathrm{pid}_i \mid \mathrm{type}_j, \mathrm{Mom}_i) \qquad \text{(WRONG!)} \]

Using the expression above, you may get strongly biased results if the additional observables have different distributions for the different components [1]. The reason for the failure is, quoting from [1]:

“Whenever the templates used in a multi-component fit depend on additional observables, one should always use the correct, complete Likelihood expression, including the explicit distributions of all observables for all classes of events”

In our problem, this means that we need to include the momentum distribution of each particle type (they are almost always different in practice).

[1] physics/0401045 (G. Punzi, PHYSTAT03)

Writing the Likelihood…

The Particle IDentification information is represented by an observable called pid. Given $f_\pi + f_P + f_K = 1$ and $j$ = Pion, Proton, Kaon, we then write the likelihood as:

\[
\begin{aligned}
L(f_\pi, f_P, f_K) &= \prod_i \Big[ f_\pi \, P(\mathrm{pid}_i, \mathrm{Mom}_i \mid \pi) + f_P \, P(\mathrm{pid}_i, \mathrm{Mom}_i \mid P) + (1 - f_\pi - f_P) \, P(\mathrm{pid}_i, \mathrm{Mom}_i \mid K) \Big] \\
&= \prod_i \sum_j f_j \, P(\mathrm{pid}_i \mid \mathrm{Mom}_i, \mathrm{type}_j) \, P(\mathrm{Mom}_i \mid \mathrm{type}_j)
\end{aligned}
\]
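A short worked step, added here as a sketch, shows why the incomplete expression is harmless only in a special case. Assume, hypothetically, that every particle type had the same momentum distribution, $P(\mathrm{Mom} \mid \mathrm{type}_j) = P(\mathrm{Mom})$ for all $j$. The complete likelihood would then factorize:

```latex
\[
L = \prod_i \sum_j f_j \, P(\mathrm{pid}_i \mid \mathrm{Mom}_i, \mathrm{type}_j) \, P(\mathrm{Mom}_i)
  = \Big[ \prod_i P(\mathrm{Mom}_i) \Big] \;
    \prod_i \sum_j f_j \, P(\mathrm{pid}_i \mid \mathrm{Mom}_i, \mathrm{type}_j)
\]
```

The first factor does not depend on the fractions, so in that special case the complete and the incomplete expressions peak at the same $f_j$. When the momentum distributions differ, $P(\mathrm{Mom}_i \mid \mathrm{type}_j)$ stays inside the sum over $j$ and cannot be dropped.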

A toy study of the “Real Life” Problem

We generate a sample with a known particle-type composition, as follows:

Type      f      μ (GeV/c)   σ (GeV/c)
Pion      0.50   1.00        0.25
Proton    0.15   1.25        0.25
Kaon      0.35   1.50        0.25

Momenta are distributed according to a Gaussian $N(\mu, \sigma)$; this is $P(\mathrm{Mom}_i \mid \mathrm{type}_j)$.

The PID variable is distributed according to a typical resolution function (i.e. the template used in the fit), defined as $\mathrm{PID}_{mes} - \mathrm{PID}_{exp}(\mathrm{mom})$; this is $P(\mathrm{pid}_i \mid \mathrm{Mom}_i, \mathrm{type}_j)$.

[Figure: momentum distributions; x axis: Momentum (GeV/c)]

Result of the Fits

With the complete Likelihood: [Figures: fit results for Pions and Protons] OK!

If we did not take the momentum distributions into account: [Figures: fit results for Pions and Protons] Bias!
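The comparison can be tried in miniature with the sketch below. It uses the fractions and Gaussian momentum parameters quoted for the toy, but everything about the PID template is an assumption made for illustration: the slides do not give its functional form, so a time-of-flight-like expectation with Gaussian resolution is used, and `PATH_OVER_C`, `PID_SIGMA` and the helper names are hypothetical. The slides also do not say which minimizer was used; here the fractions are obtained with a plain EM iteration, which climbs the same likelihood when the templates are fixed.

```python
import math
import random

TYPES = ("pion", "proton", "kaon")
# Toy settings quoted on the slides.
FRACTION = {"pion": 0.50, "proton": 0.15, "kaon": 0.35}
MOM_MEAN = {"pion": 1.00, "proton": 1.25, "kaon": 1.50}  # GeV/c
MOM_SIGMA = 0.25  # GeV/c, same width for every type

# Hypothetical PID template (NOT from the slides): a time-of-flight-like
# observable with Gaussian resolution around a momentum-dependent mean.
MASS = {"pion": 0.1396, "proton": 0.9383, "kaon": 0.4937}  # GeV/c^2
PATH_OVER_C = 4.67  # ns, assumed flight path divided by c
PID_SIGMA = 0.10    # ns, assumed resolution


def gauss_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def pid_expected(mom, kind):
    """Assumed expected PID value for a given momentum and particle type."""
    return PATH_OVER_C * math.sqrt(1.0 + (MASS[kind] / mom) ** 2)


def generate(n_events, rng):
    """Return a list of (mom, pid) pairs; the true type is thrown away."""
    events = []
    weights = [FRACTION[t] for t in TYPES]
    for _ in range(n_events):
        kind = rng.choices(TYPES, weights=weights)[0]
        mom = rng.gauss(MOM_MEAN[kind], MOM_SIGMA)
        while mom <= 0.0:  # redraw unphysical momenta (at least 4 sigma away)
            mom = rng.gauss(MOM_MEAN[kind], MOM_SIGMA)
        events.append((mom, rng.gauss(pid_expected(mom, kind), PID_SIGMA)))
    return events


def fit_fractions(events, use_momentum_pdf, n_iter=100):
    """EM estimate of the fractions for fixed templates.

    use_momentum_pdf=True  -> complete likelihood, P(pid|Mom,type) P(Mom|type)
    use_momentum_pdf=False -> incomplete likelihood, P(pid|Mom,type) only
    """
    # The per-event, per-type densities do not depend on the fractions,
    # so they are computed once.
    dens = []
    for mom, pid in events:
        row = []
        for t in TYPES:
            p = gauss_pdf(pid, pid_expected(mom, t), PID_SIGMA)
            if use_momentum_pdf:
                p *= gauss_pdf(mom, MOM_MEAN[t], MOM_SIGMA)
            row.append(p)
        dens.append(row)

    f = [1.0 / len(TYPES)] * len(TYPES)
    for _ in range(n_iter):
        sums = [0.0] * len(TYPES)
        for row in dens:
            w = [fj * pj for fj, pj in zip(f, row)]
            total = sum(w)
            for j, wj in enumerate(w):
                sums[j] += wj / total  # posterior probability of type j
        f = [s / len(dens) for s in sums]
    return dict(zip(TYPES, f))


rng = random.Random(2005)
events = generate(3000, rng)
complete = fit_fractions(events, use_momentum_pdf=True)
incomplete = fit_fractions(events, use_momentum_pdf=False)
print("complete  :", complete)
print("incomplete:", incomplete)
```

How far the incomplete fit moves away from the generated fractions depends on the assumed PID separation and on how different the momentum distributions are, so the numbers printed here should not be read as a reproduction of the plots on the slide.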

Often in “Real Life”…

Writing the complete likelihood with the distributions of all observables is almost straightforward, provided you can easily obtain a parameterization of those distributions. Often, however, we have poor information about them (barely acceptable, and only after very hard work!), and sometimes they may even be completely unknown. Suppose, for example, that the goal of the particle-type fit of the previous slides is to estimate the fractions of particles produced during heavy-quark fragmentation… Great! We have no idea about the functional form of each particle type's momentum distribution. How can we write the correct Likelihood?

A solution

No functional form is known to parameterize the missing P(Mom | type). We use a general functional form, a series expansion:

\[ P(\mathrm{Mom} \mid \mathrm{type}_j) = \sum_m a_{mj} \, F_m(\mathrm{Mom}) \]

with the $a_{mj}$ free parameters of the fit. We decided to use orthogonal polynomials, among them:

Legendre polynomials $P_i$ on $[-1, 1]$
Chebyshev polynomials of the first kind $T_i$ on $[-1, 1]$
Chebyshev polynomials of the second kind $U_i$ on $[-1, 1]$
Laguerre polynomials $L_i$ on $[0, +\infty)$
Hermite polynomials $H_i$ on $(-\infty, +\infty)$

Terms from the 0th to the 6th were used.
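As an illustration of what such a series looks like in practice, here is a minimal sketch for the Legendre case. The slides do not say how the momentum is mapped onto $[-1, 1]$ or how the series is normalized, so both choices below are assumptions: the momentum range `[lo, hi]` is mapped linearly, and the zeroth coefficient is fixed to 1/2 so that the density integrates to one (the integral of $P_m$ over $[-1, 1]$ vanishes for $m \ge 1$). The function names are hypothetical.

```python
def legendre_values(x, order):
    """Return [P_0(x), ..., P_order(x)] from Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


def series_pdf(mom, coeffs, lo, hi):
    """Density in momentum built from a Legendre series on [lo, hi].

    coeffs holds a_1 ... a_M (the free parameters); a_0 is fixed to 1/2,
    an assumed normalization convention, so the integral over [lo, hi] is 1.
    """
    x = 2.0 * (mom - lo) / (hi - lo) - 1.0  # linear map of [lo, hi] onto [-1, 1]
    p = legendre_values(x, len(coeffs))
    shape = 0.5 + sum(a * pm for a, pm in zip(coeffs, p[1:]))
    return shape * 2.0 / (hi - lo)          # Jacobian dx/dMom


# Six free coefficients plus the fixed a_0 give terms 0 to 6.
density = series_pdf(1.2, [0.1, -0.2, 0.05, 0.0, 0.0, 0.0], lo=0.0, hi=3.0)
```

A truncated series is not guaranteed to stay non-negative, so a fit using it has to keep the density positive at every observed momentum; otherwise the logarithm of the likelihood is undefined.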

Our toy

Replacing the exact distribution $N(\mu, \sigma)$ with $\sum_m a_{mj} F_m(\mathrm{Mom})$ for each particle type, we fitted our toy sample again. The bias is again corrected!

[Figures: fit results for Pions: OK! Protons: OK!]

Some Comments

Of course we are happy: although we did not know P(Mom | type) a priori, we were able to avoid the bias. Please notice that the resolution on the parameters is not degraded much! Just 7 terms of the series expansion were used, not so many.

[Figures: projections of $P(\mathrm{Mom} \mid \mathrm{type}_j) = \sum_m a_{mj} F_m(\mathrm{Mom})$ for Pions, Protons and Kaons; x axis: Momentum (GeV/c)]

Another Complication

Suppose our PID information is obtained from a Time Of Flight (TOF) measurement. The expected TOF is a function of 2 observables:

\[ \mathrm{TOF}_{exp} = \frac{\mathrm{arclength}}{c} \sqrt{1 + \frac{m_j^2}{\mathrm{Mom}^2}} \]

This means that our pdf is (after having verified that the correlation between arclength and momentum is almost zero):

\[ P(\mathrm{Mom}, \mathrm{Arc}, \mathrm{pid} \mid \mathrm{type}_j) \approx P(\mathrm{pid} \mid \mathrm{Mom}, \mathrm{Arc}, \mathrm{type}_j) \, P(\mathrm{Mom} \mid \mathrm{type}_j) \, P(\mathrm{Arc} \mid \mathrm{type}_j) \]

Both $P(\mathrm{Mom} \mid \mathrm{type}_j)$ and $P(\mathrm{Arc} \mid \mathrm{type}_j)$ are unknown! We want to apply the same series-expansion technique to both momentum and arclength.
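The expected time of flight is simple enough to write down directly. The sketch below assumes particular units (centimetres, nanoseconds, GeV/c and GeV/c^2), which the slides do not specify; the function name is hypothetical.

```python
import math

C_CM_PER_NS = 29.9792458  # speed of light in cm/ns


def tof_expected(arclength_cm, mom, mass):
    """Expected time of flight in ns.

    arclength_cm: path length in cm; mom in GeV/c; mass in GeV/c^2.
    The square root is 1/beta = E/p for a particle of the given mass.
    """
    return arclength_cm / C_CM_PER_NS * math.sqrt(1.0 + (mass / mom) ** 2)


# At fixed momentum and path length, heavier hypotheses arrive later.
for name, mass in (("pion", 0.1396), ("kaon", 0.4937), ("proton", 0.9383)):
    print(name, tof_expected(140.0, 1.0, mass))
```

Because the expectation depends on both arclength and momentum, the template for each mass hypothesis changes event by event with two observables instead of one.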

Back to our toy

\[ L(f_j, a_{mj}, b_{lj}) = \prod_i \sum_j f_j \, P(\mathrm{pid}_i \mid \mathrm{Mom}_i, \mathrm{Arc}_i, \mathrm{type}_j) \Big[ \sum_m a_{mj} F_m(\mathrm{Mom}_i) \Big] \Big[ \sum_l b_{lj} F_l(\mathrm{Arc}_i) \Big] \]

Again we used 7 terms for the momentum series expansion.
We used 3 terms for the arclength series expansion.
Fractions, pid and momentum variables were generated as before.
The arclength is distributed according to a Gaussian $N(\mu, \sigma)$ with $\mu_\pi = \mu_K = \mu_P$ and $\sigma_\pi = \sigma_K = \sigma_P$: the same distribution for all particle types, but in principle you don't know that!
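As a bookkeeping aid, the size of this fit can be estimated under one assumption that the slide does not state: that each series is normalized by fixing its zeroth coefficient, which for Legendre polynomials leaves one fewer free coefficient than the number of terms. With three particle types and the constraint that the fractions sum to one, the count would be:

```latex
\[
N_{\mathrm{par}}
  = \underbrace{(3 - 1)}_{f_j}
  + \underbrace{3 \times (7 - 1)}_{a_{mj}}
  + \underbrace{3 \times (3 - 1)}_{b_{lj}}
  = 2 + 18 + 6 = 26
\]
```

A different normalization convention would change this number, so it should be read as an order-of-magnitude guide to the size of the fit rather than as the authors' parameter count.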

Results

[Figures: fit results for Pions: OK! Protons: OK!]

[Figures: projections of $P(\mathrm{Arc} \mid \mathrm{type}_j) = \sum_l b_{lj} F_l(\mathrm{Arc})$ for Pions, Protons and Kaons]

Conclusions

We faced a common problem in particle physics where an incomplete Likelihood expression causes a detectable bias. We cured it! The problem has the added complication that the distribution of an observable is unknown. We solved it, removing the bias in the fit results, by using a series expansion as a parameterization of the unknown distribution (with the coefficients as free parameters determined by the fit). We even treated the case where two observables have unknown distributions: we used two different series expansions to parameterize those distributions and again avoided the bias.
