Sept. 12-15, 2005, M. Block, Phystat 05, Oxford. PHYSTAT 05 - Oxford, 12th - 15th September 2005: Statistical Problems in Particle Physics, Astrophysics and Cosmology.

Presentation transcript:

PHYSTAT 05 - Oxford, 12th - 15th September 2005. Statistical Problems in Particle Physics, Astrophysics and Cosmology. “Sifting data in the real world”, Martin Block, Northwestern University

“Sifting Data in the Real World”, M. Block, arXiv:physics/ (2005). “Fishing” for Data

Generalization of the Maximum Likelihood Function

Hence, minimize $\sum_i \rho(z)$, or equivalently, we minimize $\chi^2 \equiv \sum_i \Delta\chi^2_i$
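The standard argument behind this step can be sketched as follows. The notation is assumed here (data $(x_i, y_i)$ with errors $\sigma_i$, model $y(x;\alpha)$, normalized residual $z_i$), since the slide's own formula is not part of the transcript.

```latex
% Assumed notation: data (x_i, y_i) with errors \sigma_i, model y(x;\alpha).
z_i \equiv \frac{y_i - y(x_i;\alpha)}{\sigma_i}, \qquad
\mathcal{P} \propto \prod_{i=1}^{N} e^{-\rho(z_i)}
\;\Longrightarrow\;
-\ln \mathcal{P} = \sum_{i=1}^{N} \rho(z_i) + \text{const}.

% Gaussian case: maximizing the likelihood is the usual chi^2 minimization.
\rho(z_i) = \tfrac{1}{2} z_i^2
\;\Longrightarrow\;
\sum_i \rho(z_i) = \tfrac{1}{2} \sum_i \Delta\chi^2_i = \tfrac{1}{2}\,\chi^2,
\qquad \Delta\chi^2_i \equiv z_i^2 .
```

Maximizing the likelihood is therefore the same as minimizing $\sum_i \rho(z_i)$, and the Gaussian choice of $\rho$ reduces this to the conventional $\chi^2$ fit.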

Problem with Gaussian Fit when there are Outliers

Robust Feature: $w(z) \propto 1/\Delta\chi^2_i$ for large $\Delta\chi^2_i$

Lorentzian Fit used in “Sieve” Algorithm
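A sketch of why a Lorentzian objective is robust, assuming it has the form $\Lambda_0^2 = \sum_i \ln(1 + \gamma\,\Delta\chi^2_i)$ with a normalization constant $\gamma$. This form is an assumption, because the slide's own equation is not part of the transcript. Differentiating with respect to a fit parameter $\alpha$ shows that each point enters like a $\chi^2$ term multiplied by a weight $w_i$:

```latex
% Assumed objective: \Lambda_0^2 = \sum_i \ln(1 + \gamma \Delta\chi^2_i)
\frac{\partial \Lambda_0^2}{\partial \alpha}
  = \sum_i \frac{\gamma}{1 + \gamma\,\Delta\chi^2_i}\,
    \frac{\partial \Delta\chi^2_i}{\partial \alpha},
\qquad
w_i \equiv \frac{\gamma}{1 + \gamma\,\Delta\chi^2_i}
\;\longrightarrow\; \frac{1}{\Delta\chi^2_i}
\quad \text{for } \gamma\,\Delta\chi^2_i \gg 1 .
```

Points far from the curve therefore contribute almost nothing to the gradient, whereas in a conventional $\chi^2$ fit every point carries the same unit weight.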

Why choose the normalization constant $\gamma = 0.179$ in the Lorentzian $\Lambda_0^2$? Computer simulations show that the choice $\gamma = 0.179$ tunes the Lorentzian so that minimizing $\Lambda_0^2$, using data that are gaussianly distributed, gives the same central values and approximately the same errors for the parameters as a conventional $\chi^2$ fit to the same data. If there are no outliers, it gives the same answers as a $\chi^2$ fit. Hence, using the tuned Lorentzian $\Lambda_0^2$, much like following the Hippocratic oath, does “no harm”.
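The “no harm” property can be checked on a toy case. The sketch below is not the author's simulation: it assumes the objective has the form $\sum_i \ln(1 + \gamma\,\Delta\chi^2_i)$, fits a constant to outlier-free Gaussian data, and minimizes by iteratively reweighted averaging (a technique swapped in here for simplicity). The data set, seed and function names are hypothetical.

```python
import math
import random

GAMMA = 0.179  # normalization constant quoted on the slide


def lorentzian_objective(mu, y, sigma):
    """Assumed form: sum_i ln(1 + GAMMA * dchi2_i) for a constant model y = mu."""
    return sum(math.log(1.0 + GAMMA * ((yi - mu) / sigma) ** 2) for yi in y)


def lorentzian_constant_fit(y, sigma, n_iter=100):
    """Minimize the assumed Lorentzian objective by iteratively reweighted averaging."""
    mu = sum(y) / len(y)  # start from the conventional chi^2 answer
    for _ in range(n_iter):
        # Each point is down-weighted by 1 / (1 + GAMMA * dchi2_i).
        w = [1.0 / (1.0 + GAMMA * ((yi - mu) / sigma) ** 2) for yi in y]
        mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return mu


# Hypothetical toy data: 400 Gaussian points about y = 10, no outliers.
rng = random.Random(1)
sigma = 1.0
y = [rng.gauss(10.0, sigma) for _ in range(400)]

mu_chi2 = sum(y) / len(y)  # chi^2 fit of a constant with equal errors
mu_lorentz = lorentzian_constant_fit(y, sigma)

print(f"chi^2 fit: {mu_chi2:.3f}   Lorentzian fit: {mu_lorentz:.3f}")
```

With no outliers the two estimates should agree to well within the statistical error of the mean, $\sigma/\sqrt{N} = 0.05$ in this toy case.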

“Sieve” Algorithm: SUMMARY

All cross section data for $E_{\rm cms} > 6$ GeV, pp and $\bar{p}p$, from Particle Data Group

All $\rho$ data (Real/Imaginary of forward scattering amplitude), for $E_{\rm cms} > 6$ GeV, pp and $\bar{p}p$, from Particle Data Group

Fitting the “Sieved” pp and $\pi p$ data with analytic amplitudes. We use real analytic amplitudes that saturate the Froissart bound with the term $\ln^2(\nu/m)$, where $\nu$ is the laboratory energy and $m$ is the proton (pion) mass. We simultaneously fit the cross section $\sigma$ and $\rho$ (the ratio of the real to the imaginary portion of the forward scattering amplitude).

Only 3 free parameters. However, only 2, $c_1$ and $c_2$, are needed in cross section fits!

Cross section model fits for $E_{\rm cms} > 6$ GeV, anchored at 4 GeV, pp and $\bar{p}p$, after applying “Sieve” algorithm to Real World data

$\rho$-value fits for $E_{\rm cms} > 6$ GeV, anchored at 4 GeV, pp and $\bar{p}p$, after applying “Sieve” algorithm

What the “Sieve” algorithm accomplished for the pp and $\bar{p}p$ data. Before imposing the “Sieve” algorithm: $\chi^2$/d.f. = 5.7 for 209 degrees of freedom; total $\chi^2$ = … After imposing the “Sieve” algorithm: renormalized $\chi^2$/d.f. = 1.09 for 184 degrees of freedom, for the $\Delta\chi^2_i > 6$ cut; total $\chi^2$ = … Probability of fit ~0.2. The 25 rejected points contributed 981 to the total $\chi^2$, an average $\Delta\chi^2_i$ of ~39 per point. Similar results were found when fitting $\pi^+p$ and $\pi^-p$ data from the Particle Data Group (not shown due to lack of time!)
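The bookkeeping on this slide can be cross-checked with a few lines of arithmetic. This is only a consistency check on the quoted numbers; it assumes the number of fitted parameters is the same before and after the cut, so the drop in degrees of freedom equals the number of rejected points. The variable names are hypothetical.

```python
# Numbers quoted on the slide.
dof_before = 209      # degrees of freedom before the Sieve
dof_after = 184       # degrees of freedom after the Delta chi^2_i > 6 cut
chi2_rejected = 981.0  # chi^2 contributed by the rejected points

# Assumption: same number of fit parameters before and after the cut.
n_rejected = dof_before - dof_after
avg_dchi2_rejected = chi2_rejected / n_rejected

print(f"rejected points: {n_rejected}")
print(f"average Delta chi^2_i per rejected point: {avg_dchi2_rejected:.1f}")
```

The result reproduces the 25 rejected points and the average $\Delta\chi^2_i$ of about 39, far above the cut value of 6.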

Cross section and $\rho$-value predictions for pp and $\bar{p}p$. The errors are due to the statistical uncertainties in the fitted parameters. Labels: LHC prediction; cosmic ray prediction.

100 data points, gaussianly distributed about the straight line $y = 1 - 2x$; 20 noise points, randomly distributed, with $\Delta\chi^2_i > 6$. After the $\Delta\chi^2_i > 6$ cut: best fit is y = … x; $R\,\chi^2_{\min}/\nu = 1.01$; the fit to all data has $\chi^2_{\min}/\nu = 4.8$
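A minimal sketch of a sieve-style procedure on a toy data set like the one described here: a robust Lorentzian fit, a $\Delta\chi^2_i$ cut, then a conventional $\chi^2$ fit to the surviving points. It is not the author's code. The objective form $\sum_i \ln(1 + \gamma\,\Delta\chi^2_i)$, the iteratively reweighted least-squares minimizer, the noise model (offsets of 3 to 10 $\sigma$ with random sign), the seed and all function names are assumptions made for illustration, and the renormalization factor $R$ is not applied.

```python
import random

GAMMA = 0.179  # Lorentzian normalization quoted in the talk
CUT = 6.0      # Delta chi^2_i cut used on this slide


def line_fit(x, y, w):
    """Closed-form weighted least squares for y = a + b*x with weights w."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    return (sxx * sy - sx * sxy) / det, (sw * sxy - sx * sy) / det


def dchi2(x, y, sigma, a, b):
    """Per-point Delta chi^2_i for the line y = a + b*x (common error sigma)."""
    return [((yi - a - b * xi) / sigma) ** 2 for xi, yi in zip(x, y)]


def robust_line_fit(x, y, sigma, n_iter=100):
    """Assumed Lorentzian fit, minimized by iteratively reweighted least squares."""
    a, b = line_fit(x, y, [1.0] * len(x))  # start from the plain chi^2 fit
    for _ in range(n_iter):
        w = [1.0 / (1.0 + GAMMA * d) for d in dchi2(x, y, sigma, a, b)]
        a, b = line_fit(x, y, w)
    return a, b


# Hypothetical toy data: 100 Gaussian points about y = 1 - 2x, plus 20 noise points.
rng = random.Random(5)
sigma = 1.0
x = [rng.uniform(0.0, 10.0) for _ in range(100)]
y = [1.0 - 2.0 * xi + rng.gauss(0.0, sigma) for xi in x]
for _ in range(20):
    xi = rng.uniform(0.0, 10.0)
    offset = rng.choice([-1.0, 1.0]) * rng.uniform(3.0, 10.0) * sigma
    x.append(xi)
    y.append(1.0 - 2.0 * xi + offset)

# Step 1: robust fit. Step 2: drop points with Delta chi^2_i above the cut.
a0, b0 = robust_line_fit(x, y, sigma)
kept = [i for i, d in enumerate(dchi2(x, y, sigma, a0, b0)) if d <= CUT]
xk = [x[i] for i in kept]
yk = [y[i] for i in kept]

# Step 3: conventional chi^2 fit to the sifted data.
a, b = line_fit(xk, yk, [1.0] * len(kept))
nu = len(kept) - 2
chi2_min = sum(dchi2(xk, yk, sigma, a, b))

# For comparison: conventional chi^2 fit to all 120 points.
a_all, b_all = line_fit(x, y, [1.0] * len(x))
chi2_all_per_dof = sum(dchi2(x, y, sigma, a_all, b_all)) / (len(x) - 2)

print(f"kept {len(kept)} of {len(x)} points; sifted fit: y = {a:.2f} + ({b:.2f}) x")
print(f"chi2_min/nu after cut (not renormalized): {chi2_min / nu:.2f}")
print(f"chi2_min/nu for all data: {chi2_all_per_dof:.2f}")
```

Because the cut also removes the Gaussian tails of the good data, the unrenormalized $\chi^2_{\min}/\nu$ of the sifted sample is expected to fall somewhat below 1, which is what the factor $R$ on the slide corrects for.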

100 data points, gaussianly distributed about the constant $y = 10$; 40 noise points, randomly distributed, with $\Delta\chi^2_i > 4$. After the $\Delta\chi^2_i > 4$ cut: best fit is $y = 9.98$; $R\,\chi^2_{\min}/\nu = 1.09$; the fit to all data has $\chi^2_{\min}/\nu = 4.39$.

Lessons learned from computer studies of a straight line and a constant model, where $\sigma$ is the parameter error found in the $\chi^2$ fit

$\chi^2_{\rm renorm} = \chi^2_{\rm obs} / R^{-1}$; $\sigma_{\rm renorm} = r_{\chi^2}\,\sigma_{\rm obs}$, where $\sigma$ is the parameter error
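One plausible reading of the factor $R^{-1}$ is the expected $\Delta\chi^2_i$ per point for Gaussian data after the tails beyond the cut have been removed, that is, the variance of a unit Gaussian truncated at $z^2 \le \Delta\chi^2_{\max}$. This identification is an assumption, not something stated in the transcript, and the function name `r_inverse` is hypothetical. The factor $r_{\chi^2}$ for the parameter errors is not modeled here.

```python
import math


def r_inverse(cut):
    """Mean of z^2 for a unit Gaussian truncated to z^2 <= cut (assumed meaning of R^-1)."""
    a = math.sqrt(cut)
    kept_fraction = math.erf(a / math.sqrt(2.0))  # P(|z| <= a)
    tail_term = a * math.sqrt(2.0 / math.pi) * math.exp(-cut / 2.0)
    return 1.0 - tail_term / kept_fraction


for cut in (9.0, 6.0, 4.0):
    r_inv = r_inverse(cut)
    print(f"cut = {cut:.0f}:  R^-1 = {r_inv:.4f},  R = {1.0 / r_inv:.4f}")


def renormalize_chi2(chi2_obs, cut):
    """chi2_renorm = chi2_obs / R^-1, following the relation on the slide."""
    return chi2_obs / r_inverse(cut)
```

Under this reading the correction grows as the cut tightens, since a tighter cut removes more of the Gaussian tails and lowers the observed $\chi^2$ of the sifted sample further.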

What happens when we try to separate two similar distributions? 100 data points, gaussianly distributed about the parabola $y = 1 + 2x + 0.5x^2$; 35 noise points, randomly distributed about the nearby parabola $y = 12 + 2x + 0.2x^2$. We have 13 “inliers”. After the $\Delta\chi^2_i > 6$ cut, 113 points are kept; the best fit is y = … x + 0.48x^2. BONUS: the algorithm also seems to work reasonably well in separating two similar distributions!

$\log^2(\nu/m_p)$ fit compared to $\log(\nu/m_p)$ fit: all known n-n data

$\gamma p$ $\log^2(\nu/m)$ fit, compared to the $\pi p$ even amplitude fit. M. Block and F. Halzen, Phys. Rev. D 70 (2004)

$\chi^2_{\rm renorm} = \chi^2_{\rm obs} / R^{-1}$; $\sigma_{\rm renorm} = r_{\chi^2}\,\sigma_{\rm obs}$, where $\sigma$ is the parameter error

All cross section data for $E_{\rm cms} > 6$ GeV, $\pi^+p$ and $\pi^-p$, from Particle Data Group

All $\rho$ data (Real/Imaginary of forward scattering amplitude), for $E_{\rm cms} > 6$ GeV, $\pi^+p$ and $\pi^-p$, from Particle Data Group

Cross section model fits for $E_{\rm cms} > 6$ GeV, anchored at 2.6 GeV, $\pi^+p$ and $\pi^-p$, after applying “Sieve” algorithm to Real World data

$\rho$-value fits for $E_{\rm cms} > 6$ GeV, anchored at 2.6 GeV, $\pi^+p$ and $\pi^-p$, after applying “Sieve” algorithm