A Toolkit for the modeling of Multi-parametric fit problems

A Toolkit for the modeling of Multi-parametric fit problems
Luca Lista INFN Napoli

Motivation Initially developed while rewriting a fortran fitter for BaBar analysis Simultaneous estimate of: B(B J/) / B(B J/K) direct CP asymmetry More control on the code was needed to justify a bias appeared in the original fitter As much as possible of the code has to be under control and testable separately

Requirements Provide Tools for modeling parametric fit problems
Unbinned Maximum Likelihood (UML[*]) fit of: PDF parameters Yields of different sub-samples Both, mixed 2 fits Toy Monte Carlo to study the fit properties Fitted parameter distributions Pulls, Bias, Confidence level of fit results [*] not Unified Modeling Language …  …

Design issues Trying to optimize as much as possible the PDF code
Gets called a large number of times Yes, it can be done in C++: Addressed with Template Metaprogramming No need to use virtual functions The underlying minimization engine is Minuit, as always Wrapped in different flavours (ROOT, …)

PDF interface class PdfFlat { public: typedef double type; enum { variables = 1 }; PdfFlat( double a, double b ) : min( a ), max( b ) { } double operator()( type * v ) const type x = *v; return ( x < min || x > max ? 0 : 1 / ( max - min ) ); } double min, max; }; class PdfPoissonian { public: typedef int type; enum { variables = 1 }; PdfPoissonian( double m ) : mean( m ) { } double operator()( type * v ) const type n = *v; return ( exp( - mean ) * pow( mean, n ) / TMath::Factorial( n ) ); } double mean; }; Variable type Returns P(x) Variable set Returns dP(x)/dx The user can define its own pdfs with the above interface

Random number generators
template< class Pdf, class Generator = RootRandom > class RandomGenerator { public: typedef typename Pdf::type type; RandomGenerator( const Pdf& pdf ); void generate( type * v ) const; }; template< class Generator > class RandomGenerator< PdfFlat, Generator > { public: typedef PdfFlat Pdf; typedef typename Pdf::type type; RandomGenerator( const Pdf& pdf ) : _min( pdf.min ), _max( pdf.max ) { } void generate( double * v ) const v[ 0 ] = Generator::shootFlat( _min, _max ); } private: const double& _min, &_max; }; Partial specialization Generic Random engine: Root, CLHEP, … The user can define its own generators with the preferred method

Combining PDFs n1 + n2 variables Product of pdfs
template<class Pdf1, class Pdf2> class PdfIndependent2 { public: typedef typename Pdf1::type type; enum { variables = Pdf1::variables + Pdf2::variables }; PdfIndependent2( const Pdf1& pdf1, const Pdf2& pdf2 ) : _pdf1( pdf1 ), _pdf2( pdf2 ) { } double operator()( double * val ) const { return _pdf1( val ) * _pdf2( val + Pdf1::variables ); } private: const Pdf1 &_pdf1; const Pdf2 &_pdf2; }; template<class Pdf1, class Pdf2, class Pdf3> class PdfIndependent3 { ... }; template<class Pdf1, class Pdf2, class Pdf3, class Pdf4> class PdfIndependent4 { ... }; n1 + n2 variables Product of pdfs

Transformation of variables
template<class Pdf, class Transformation> class PdfTransformed { public: typedef typename Pdf::type type; enum { variables = Pdf::variables }; PdfTransformed( const Pdf& pdf, const Transformation& trans ) : _pdf( pdf ), _trans( trans ) { } double operator()( double * val ) const double x[ variables ]; copy( val, val + variables, x ); _trans( x ); return _pdf( x ); } private: const Pdf &_pdf; Transformation _trans; }; The Jacobian must be 1! typedef PdfIndependent2<PdfGaussian, PdfGaussian> PdfBasic; typedef PdfTransformed<PdfBasic, Rotation2D> Pdf; PdfGaussian g1( 0, 0.1 ), g2( 0, 1 ); Pdf pdf( PdfBasic( g1, g2 ), Rotation2D( M_PI / 4 ) ); Client code example

A data sample to be fitted
template< int n, class type = double > class Sample { public: typedef std::vector<type *> container; typedef typename container::size_type size_type; typedef typename container::iterator iterator; typedef typename container::const_iterator const_iterator; ~Sample() { for( iterator i = begin(); i != end(); i ++ ) delete [] *i; } size_type size() const { return _v.size(); } const type* operator[]( int i ) const { return _v[ i ]; } iterator begin() { return _v.begin(); } iterator end() { return _v.end(); } const_iterator begin() const { return _v.begin(); } const_iterator end() const { return _v.end(); } type * extend() { _v.push_back( new type[ n ] ); return _v.back(); } private: container _v; }; Fixed number of variables Basically, a vector of double*

UML PDF Parameter fit Fixed PDF for MC generation Variable PDF
const int sig = 100; double mean = 0, sigma = 1; PdfConstant p( sig ); // alternative: PdfPoissonian PdfGaussian q( mean, sigma ); Experiment<PdfConstant, PdfGaussian> experiment( p, q ); PdfGaussian pdf( mean, sigma ); Likelihood<PdfGaussian> like( pdf ); UMLParameterFitter<Likelihood<PdfGaussian> > fitter( like ); fitter.addParameter( "mean", & pdf.mean ); fitter.addParameter( "sigma", & pdf.sigma ); for ( int i = 0; i < 5000; i++ ) { Sample< 1 > sample; experiment.generate( sample ); double par[ 2 ] = { mean, sigma }, err[ 2 ] = { 1, 1 }, logLike; logLike = fitter.fit( sample.begin(), sample.end(), par, err ); double pullm = ( par[ 0 ] - mean ) / err[ 0 ]; double pulls = ( par[ 1 ] - sigma ) / err[ 1 ]; } Fixed PDF for MC generation Variable PDF For fitting Parameters linked to the fitter

Parameter fit Results (Pulls)
There is a bias (as expected): 2 = 1/ni(xi-)2  1/n-1i(xi-)2

UML Yield fit In 2 dimensions: Flat background Gaussian signal
const int sig = 10, bkg = 5; typedef PdfPoissonian Fluctuation; // alternative: PdfConstant Fluctuation fluctuationSig( sig ), fluctuationBkg( bkg ); typedef PdfIndependent2< PdfGaussian, PdfGaussian > PdfSig; typedef PdfIndependent2< PdfFlat, PdfFlat > PdfBkg; PdfSig pdfSig( PdfGaussian( 0, 1 ), PdfGaussian( 0, 0.5 ) ); PdfBkg pdfBkg( PdfFlat( -5, 5 ), PdfFlat( -5, 5 ) ); typedef Experiment< Fluctuation, PdfSig > ToySig; typedef Experiment< Fluctuation, PdfBkg > ToyBkg; ToySig toySig( fluctuationSig, pdfSig ); ToyBkg toyBkg( fluctuationBkg, pdfBkg ); Experiment2< ToySig, ToyBkg > toy( toySig, toyBkg ); typedef ExtendedLikelihood2< PdfSig, PdfBkg > Likelihood; Likelihood like( pdfSig, pdfBkg ); UMLYieldFitter< Likelihood > fitter( like ); for ( int i = 0; i < 5000; i++ ) { Sample< 2 > sample; toy.generate( sample ); double s[] = { sig, bkg }, err[] = { 1, 1 }; double logLike = fitter.fit( sample.begin(), sample.end(), s, err ); double pull1 = ( s[0] - sig ) / err[0] ), pull2 = ( ( s[1] - bkg ) / err[1] ); } In 2 dimensions: Flat background Gaussian signal

Yield fit Results (Pulls)
<b> = 5 Discrete structure because of low statistics Poisson fluctuation

Combined Yield and parameter fit
const int sig = 10, bkg = 5; typedef PdfPoissonian Fluctuation; Fluctuation fluctuationSig( sig ), fluctuationBkg( bkg ); typedef PdfIndependent2< PdfGaussian, PdfGaussian > PdfSig; typedef PdfIndependent2< PdfFlat, PdfFlat > PdfBkg; PdfGaussian g1( 0, 1 ), g2( 0, 0.5 ); PdfFlat f1( -5, 5 ), f2( -5, 5 ); PdfSig pdfSig( g1, g2 ); PdfBkg pdfBkg( f1, f2 ); typedef Experiment<Fluctuation, PdfSig> ToySig; PdfBkg> ToyBkg; ToySig toySig( fluctuationSig, pdfSig ); ToyBkg toyBkg( fluctuationBkg, pdfBkg ); Experiment2<ToySig, ToyBkg> toy( toySig, toyBkg ); typedef ExtendedLikelihood2<PdfSig, PdfBkg> Likelihood; PdfGaussian G1( 0, 1 ); PdfSig pdfSig1( G1, g2 ); Likelihood like( pdfSig1, pdfBkg ); UMLYieldAndParameterFitter<Likelihood> fitter( like ); fitter.addParameter( "mean", & G1.mean ); double pull1, pull2, pull3; for ( int i = 0; i < 5000; i++ ) { Sample< 2 > sample; toy.generate( sample ); double s[] = { sig, bkg, 0 }; double err[] = { 1, 1, 1 }; double logLike = fitter.fit( sample.begin(), sample.end(), s, err ); pull1 = ( s[ 0 ] - sig ) / err[ 0 ]; pull2 = ( s[ 1 ] - bkg ) / err[ 1 ]; pull3 = ( s[ 2 ] - 0 ) / err[ 2 ]; }

Combined fit Results (Pulls)

Support for 2 fit Still no support for Correlated errors! class Line
{ public: Line( double A, double B ) : a( A ), b( B ) { } double operator()( double v ) const { return a + b * v; } double a, b; }; Line line( 0, 1 ); Chi2<Line> chi2line( line, partition ); Chi2Fitter<Chi2<Line> > fitter1( chi2line ); fitter1.addParameter( "a", &line.a ); fitter1.addParameter( "b", &line.b ); Parabola para( 0, 1, 0 ); Chi2<Parabola> chi2para( para, partition ); Chi2Fitter<Chi2<Parabola> > fitter2( chi2para ); fitter2.addParameter( "a", &para.a ); fitter2.addParameter( "b", &para.b ); fitter2.addParameter( "c", &para.c ); // ... } Still no support for Correlated errors!

Application to B(B J/) / B(B J/K)
Four variables: B reconstructed mass Beam - B energy in the  mass hypothesis Beam – B energy in the K mass hypothesis B meson charge Two samples: J/ , J/ ee Simultaneous fit of: Total yield of B J/, B J/K and background Charge asymmetry Resolution and energy shitfs separately for J/ , J/ ee

B(B J/) / B(B J/K), cont.
“Peculiar” distribution of kinematical variables Non trivial variable transformation to factorize the pdfs Different samples factorize w.r.t. different variable combinations Real experience: Code much more manageable and under control Different components are testable separately Pdf, random generation Different approaches to the fits Getting confidence in the fit results require massive testing

Model for independent PDFs
EK D D E

Dealing with kinematical pre-selection
-120 MeV < E, EK < 120MeV A B D C A B D C The area is preserved after the trasformation

Extracting the signal B J/K B J/ Background Likelihood
J/y  ee events Background Likelihood projection J/y  mm events

Possible improvement Tools for upper limit extraction based on Toy Monte Carlo Adaptable code available (from B  analysis) Support for 2 fit with full covariance matrix Provide more “standard” pdfs and random generators Exponential, Argus, Crystal ball, … Generic Hit-or-miss generator, etc. Managing singular PDF Delta-Dirac components Factorize out dependence on ROOT, CLHEP, etc. Done for random generators Can be improved for Minuit interface More thoughts about the interface Is passing a double* as set of PDF variables suitable?

A Toolkit for the modeling of Multi-parametric fit problems

Similar presentations

Presentation on theme: "A Toolkit for the modeling of Multi-parametric fit problems"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

A Toolkit for the modeling of Multi-parametric fit problems

Similar presentations

Presentation on theme: "A Toolkit for the modeling of Multi-parametric fit problems"— Presentation transcript:

Similar presentations

About project

Feedback