1
Statistical Tools: A Few Comments
Harrison B. Prosper
Florida State University
PHYSTAT Workshop 2004, 1-2 March 2004
2
Outline
- Issues
- Wish List
- Example
- Summary
3
Statistical Tools: Issues
Some difficulties with tools used in HEP:
- Difficult to express ideas cleanly and clearly
- Tools scattered over different (typically monolithic) programs
- Interface between heterogeneous data formats and disparate tools is a headache
- Histograms are tightly coupled to their viewers
- Algebra of histograms is relatively crude
- Inadequate support for systematic study of ensembles
4
Issues – II
In a systematic statistical study one may wish to:
- Generate different ensembles of observations, possibly with conditioning, and study various statistical properties (bias, variance, coverage, etc.)
- Assess robustness with respect to prior densities and likelihoods
- Study different confidence limit procedures
- Study different optimization criteria
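To make the first item concrete, here is a minimal sketch of an ensemble study, assuming a toy model that is not from the talk: each pseudo-experiment is a set of Gaussian measurements with known width, the estimator is the sample mean, and the interval is the mean plus or minus one standard error. The names `generate_ensemble` and `study` are hypothetical.

```python
import math
import random
import statistics


def generate_ensemble(n_experiments, n_obs, mu_true, sigma, rng):
    """Return a list of pseudo-experiments, each a list of n_obs Gaussian draws."""
    return [[rng.gauss(mu_true, sigma) for _ in range(n_obs)]
            for _ in range(n_experiments)]


def study(ensemble, mu_true, sigma):
    """Bias and variance of the sample mean, and coverage of mean +/- sigma/sqrt(n)."""
    estimates = [statistics.fmean(exp) for exp in ensemble]
    half_width = sigma / math.sqrt(len(ensemble[0]))
    covered = sum(abs(est - mu_true) <= half_width for est in estimates)
    return {
        "bias": statistics.fmean(estimates) - mu_true,
        "variance": statistics.pvariance(estimates),
        "coverage": covered / len(estimates),
    }


rng = random.Random(2004)  # fixed seed so the study is reproducible
ensemble = generate_ensemble(5000, 10, mu_true=3.0, sigma=1.0, rng=rng)
result = study(ensemble, mu_true=3.0, sigma=1.0)

# Conditioning: keep only pseudo-experiments whose first observation
# fluctuated above the true mean, then repeat the same study.
sub_ensemble = [exp for exp in ensemble if exp[0] > 3.0]
sub_result = study(sub_ensemble, mu_true=3.0, sigma=1.0)
```

In this toy setup the full ensemble should show negligible bias and coverage near 68%, while the conditioned sub-ensemble shows an upward bias. Keeping the generator separate from the query function is one way to let the same ensemble be reused for several studies.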
5
Issues – III
One may wish to study:
- Type I and type II error rates
- Consistency: both convergence to, and rate of convergence to, the true answer as sample size increases
- Probability densities p(z) given underlying distributions p(x)
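As an illustration of the first item, the sketch below estimates type I and type II error rates by simulation. The test itself is an assumption chosen for simplicity, not something taken from the talk: unit-width Gaussian data, null hypothesis mu = 0, alternative mu = 1, and a one-sided cut on the sample mean at a nominal 5% level. The function names are hypothetical.

```python
import math
import random


def rejects_null(sample, threshold):
    """One-sided test: reject H0 when the sample mean exceeds the threshold."""
    return sum(sample) / len(sample) > threshold


def rejection_rate(mu, n_obs, threshold, n_trials, rng):
    """Fraction of pseudo-experiments drawn with true mean mu that reject H0."""
    rejected = 0
    for _ in range(n_trials):
        sample = [rng.gauss(mu, 1.0) for _ in range(n_obs)]
        rejected += rejects_null(sample, threshold)
    return rejected / n_trials


rng = random.Random(12345)
n_obs = 9
threshold = 1.645 / math.sqrt(n_obs)  # nominal 5% one-sided test of H0: mu = 0

# Type I: rejecting H0 when it is true (data generated with mu = 0).
type_1 = rejection_rate(0.0, n_obs, threshold, 20000, rng)
# Type II: failing to reject H0 when the alternative mu = 1 is true.
type_2 = 1.0 - rejection_rate(1.0, n_obs, threshold, 20000, rng)
```

Rerunning `rejection_rate` for increasing `n_obs` would give a rough handle on the consistency question in the second item, since the type II rate should fall as the sample grows.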
6
Wish List
Decoupling
- Statistical tool separate from, and independent of, the environment in which it might be used
- However, provide bindings for different environments/languages (R, Root, Python, Java, etc.)
Modularity
- Each statistical tool encapsulates a single coherent statistical idea
- Avoid monoliths
Histograms
- Histogram and histogram viewers independent of each other (a sensible idea from Marc Paterno!)
- Elegant algebra of histograms: h = a*h1 + b*h2/h3, etc.
- Powerful, intuitive tools for multi-dimensional data exploration
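The histogram algebra wished for above could look roughly like the following sketch. The `Histogram` class is hypothetical and deliberately holds bin contents only, with no drawing code, so that a viewer would be a separate component. Uncertainty propagation is left out, and the treatment of empty denominator bins is an assumption.

```python
class Histogram:
    """Bin contents only; no drawing code, so any viewer can consume it."""

    def __init__(self, contents):
        self.contents = [float(c) for c in contents]

    def _combine(self, other, op):
        # Bin-by-bin operation with another histogram, or with a scalar.
        if isinstance(other, Histogram):
            if len(other.contents) != len(self.contents):
                raise ValueError("histograms must have the same binning")
            return Histogram(op(x, y) for x, y in zip(self.contents, other.contents))
        return Histogram(op(x, float(other)) for x in self.contents)

    def __add__(self, other):
        return self._combine(other, lambda x, y: x + y)

    def __mul__(self, other):
        return self._combine(other, lambda x, y: x * y)

    __rmul__ = __mul__  # allows scalar * histogram

    def __truediv__(self, other):
        # Assumption: an empty denominator bin yields 0.0 rather than an error.
        return self._combine(other, lambda x, y: x / y if y != 0 else 0.0)


h1 = Histogram([1, 2, 3])
h2 = Histogram([4, 6, 8])
h3 = Histogram([2, 3, 0])
a, b = 2.0, 0.5
h = a * h1 + b * h2 / h3  # the expression from the slide, evaluated bin by bin
```

With these inputs `h.contents` is `[3.0, 5.0, 6.0]`; the last bin gets no contribution from `h2/h3` because its denominator is empty.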
7
Wish List – II
Likelihoods
- Flexible method for reporting them; maybe as swarms of points generated via MCMC?
Frequency Methods
- Flexible ensemble generator, which allows easily extracted sub-ensembles
- Flexible query of ensembles (to get coverage, error rates, variances, bias, etc.)
Bayesian Methods
- Flexible robustness studies (prior family, likelihood family, etc.)
- Multi-dimensional integration (adaptive and Markov chain MC)
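One possible reading of "swarms of points generated via MCMC" is sketched below with a random-walk Metropolis sampler. The model is an assumption for illustration: a single Poisson counting experiment with signal mean s and known background b, sampled as a function of s (which amounts to a flat prior on s >= 0). The function names and all numbers are hypothetical.

```python
import math
import random


def log_likelihood(s, n_obs, b):
    """Poisson log-likelihood (up to a constant) for n_obs events with mean s + b."""
    mean = s + b
    if s < 0 or mean <= 0:
        return -math.inf
    return n_obs * math.log(mean) - mean


def metropolis_swarm(n_obs, b, n_steps, step, rng, start=1.0):
    """Random-walk Metropolis; returns the visited values of s."""
    s = start
    log_l = log_likelihood(s, n_obs, b)
    swarm = []
    for _ in range(n_steps):
        proposal = s + rng.gauss(0.0, step)
        log_l_new = log_likelihood(proposal, n_obs, b)
        # Accept with probability min(1, L_new / L_old); u lies in (0, 1].
        u = 1.0 - rng.random()
        if math.log(u) < log_l_new - log_l:
            s, log_l = proposal, log_l_new
        swarm.append(s)
    return swarm


rng = random.Random(7)
swarm = metropolis_swarm(n_obs=5, b=0.0, n_steps=40000, step=2.0, rng=rng)
points = swarm[1000:]  # drop an initial burn-in stretch
swarm_mean = sum(points) / len(points)
```

For n_obs = 5 and b = 0 the target is a Gamma(6, 1) density, so the swarm mean should come out near 6. A swarm like this can be published as a plain list of numbers and re-weighted or histogrammed by whoever receives it.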
8
Example: A Current Statistical Problem
From the DØ Single Top Group
- Set a limit on the cross section σ(p + pbar → t + X), given a histogram for each of:
  - 4 signal channels: tq(EC), tqb(EC), tq(CC), tqb(CC)
  - 4 background sources per signal channel: QCD, ttbar(l+jets), ttbar(ll), W+Jets
- Some histograms are weighted, some unweighted
- We would like to study different limit procedures, including Bayesian, and study their frequency properties
- Currently using ad hoc and rather inflexible pieces of homegrown C++!
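One of the limit procedures mentioned, a Bayesian upper limit, can be sketched for a simplified version of this problem. This is not the DØ group's method. It assumes each channel (or histogram bin) is an independent Poisson count with mean sigma * a_i + b_i, where a_i is a hypothetical effective luminosity (acceptance times luminosity) and b_i is the summed background. It also assumes a flat prior on sigma and no systematic uncertainties. All input numbers are invented.

```python
import math


def log_likelihood(sigma, counts, effective_lumis, backgrounds):
    """Sum of Poisson log-likelihoods over channels; mean_i = sigma*a_i + b_i."""
    total = 0.0
    for n, a, b in zip(counts, effective_lumis, backgrounds):
        mean = sigma * a + b
        if mean <= 0.0:
            if n > 0:
                return -math.inf  # events observed where none are expected
            continue              # n = 0 and mean = 0 contributes nothing
        total += n * math.log(mean) - mean
    return total


def upper_limit(counts, effective_lumis, backgrounds, cl=0.95,
                sigma_max=20.0, n_grid=20000):
    """Flat-prior Bayesian upper limit on sigma from a grid posterior."""
    step = sigma_max / n_grid
    grid = [(i + 0.5) * step for i in range(n_grid)]  # midpoint rule
    log_l = [log_likelihood(s, counts, effective_lumis, backgrounds) for s in grid]
    peak = max(log_l)
    weights = [math.exp(v - peak) for v in log_l]  # rescaled to avoid underflow
    total = sum(weights)
    running = 0.0
    for s, w in zip(grid, weights):
        running += w
        if running >= cl * total:
            return s + 0.5 * step  # upper edge of the grid cell
    return sigma_max


# Hypothetical inputs for four channels (not real DØ numbers).
counts = [3, 1, 4, 2]
effective_lumis = [0.5, 0.3, 0.8, 0.4]
backgrounds = [2.5, 1.0, 3.5, 1.5]
limit_95 = upper_limit(counts, effective_lumis, backgrounds)
limit_90 = upper_limit(counts, effective_lumis, backgrounds, cl=0.90)

# Sanity check: one channel, zero events, no background gives -ln(0.05).
check = upper_limit([0], [1.0], [0.0])
```

The single-channel check has the closed-form answer -ln(0.05), about 3.0, which gives a quick validation of the grid integration. Studying the frequency properties of this procedure would mean running `upper_limit` over an ensemble of simulated counts and recording how often the limit lies above the true cross section.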
9
Summary
The Good
- Lots of statistical tools already exist
- A lot more needed – opportunity for creativity!
The Bad
- Use of current tools, however, often requires familiarity with several frameworks/languages
The Ugly
- Lack of a simple, but powerful, language for expression of statistical ideas
- Rapid “what if” analyses done with C++. This is crazy! I don’t want to think about pointers and de-referencing when I’m trying to think about mathematics.