Bill Atwood, July, 2003 GLAST 1 A GLAST Analysis Agenda Overarching Approach & Strategy Flattening Analysis Variables Classification Tree Primer Sorting.

Slides:



Advertisements
Similar presentations
STAR Status of J/  Trigger Simulations for d+Au Running Trigger Board Meeting Dec5, 2002 MC & TU.
Advertisements

CPSC 502, Lecture 15Slide 1 Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 15 Nov, 1, 2011 Slide credit: C. Conati, S.
Inpainting Assigment – Tips and Hints Outline how to design a good test plan selection of dimensions to test along selection of values for each dimension.
Bill Atwood, Dec GLAST 1 GLAST Energy or Humpty-Dumpty’s Revenge A Statement of the Problem Divide and Conquer strategy Easiest: Leakage (Depth)
The loss function, the normal equation,
Bill Atwood, August, 2003 GLAST 1 Covariance & GLAST Agenda Review of Covariance Application to GLAST Kalman Covariance Present Status.
Iterative Optimization of Hierarchical Clusterings Doug Fisher Department of Computer Science, Vanderbilt University Journal of Artificial Intelligence.
Assessment and Proposal W. B. Atwood UCSC, April 6, 2005.
Off-axis Simulations Peter Litchfield, Minnesota  What has been simulated?  Will the experiment work?  Can we choose a technology based on simulations?
T. Burnett: IRF status 30-Jan-061 DC2 IRF Status Meeting agenda, references at:
A Search for Point Sources of High Energy Neutrinos with AMANDA-B10 Scott Young, for the AMANDA collaboration UC-Irvine PhD Thesis:
MC Study of B°  S Jianchun Wang Syracuse University BTeV meeting 06/27/01.
Analysis Meeting 31Oct 05T. Burnett1 Classification, etc. status at UW Implement “nested trees” Generate “boosted” trees for all of the Atwood categories.
Hardware Failure Impacts Exercises S. Ritz. Issues Recent results from design reliability analysis (system engineering, Thurston et al.) Probabilities.
Bill Atwood, Core Meeting, 9-Oct GLAS T 1 Finding and Fitting A Recast of Traditional GLAST Finding: Combo A Recast of the Kalman Filter Setting.
Bill Atwood, SCIPP/UCSC, Dec., 2003 GLAST 1 DC1 Instrument Response Agenda Where are we? V3R3P7 Classification Trees Covariance Scaled PSF Pair Energies.
Bill Atwood, SCIPP/UCSC, Nov, 2003 GLAST 1 Update on Rome Results Agenda A eff Fall-off past 10 GeV V3R3P7 Classification Trees Covariance Scaled PSFs.
Bill Atwood, SCIPP/UCSC, August, 2005 GLAST 1 Energy Method Selections Data Set: All Gamma (GR-HEAD1.615) Parametric Tracker Last Layer Profile 4 Methods.
Bill Atwood, SCIPP/UCSC, Dec., 2003 GLAST ALL GAMMAs using the New Recons First runs of ALL GAMMA – from glast-ts and SLAC Unix farm glast-ts sample: no.
Onboard Filter Update Performance after updated cuts David Wren 26 January 2004.
C&A 8 May 06 1 Point Source Localization: Optimizing the CTBCORE cut Toby Burnett University of Washington.
Source detection at Saclay Look for a fast method to find sources over the whole sky Provide list of positions, allowing to run maximum likelihood locally.
Bill Atwood, Rome, Sept, 2003 GLAST 1 Instrument Response Studies Agenda Overarching Approach & Strategy Classification Trees Sorting out Energies PSF.
Surplus Hits Ratio, etc. Leon R. C&A Meeting &
Bill Atwood, Nov. 2002GLAST 1 Classification PSF Analysis A New Analysis Tool: Insightful Miner Classification Trees From Cuts Classification Trees: Recasting.
1 Using Insightful Miner Trees for Glast Analysis Toby Burnett Analysis Meeting 2 June 2003.
Bill Atwood - July 2002 GLAS T 1 A Kalman Filter for GLAST What is a Kalman Filter and how does it work. Overview of Implementation in GLAST Validation.
Bill Atwood, SCIPP/UCSC, DC2 Closeout, May 31, 2006 GLAST 1 Bill Atwood & Friends.
MC Study on B°  J/  ° With J/      °     Jianchun Wang Syracuse University BTeV meeting 03/04/01.
Classification and Regression Trees for Glast Analysis: to IM or not to IM? Toby Burnett Data Challenge Meeting 15 June 2003.
Bill Atwood, SCIPP/UCSC, Oct, 2005 GLAST 1 Back Ground Rejection Status for DC2 a la Ins. Miner Analysis.
Bill Atwood, SCIPP/UCSC, Jan., 2006 GLAST 1 The 3 rd Pass Back Rejection Analysis using V7R3P4 (repo) 1)Training sample: Bkg: Runs (SLAC) v7r3p4.
Tracker Reconstruction SoftwarePerformance Review, Oct 16, 2002 Summary of Core “Performance Review” for TkrRecon How do we know the Tracking is working?
Ensemble Learning (2), Tree and Forest
Improved results for a memory allocation problem Rob van Stee University of Karlsruhe Germany Leah Epstein University of Haifa Israel WADS 2007 WAOA 2007.
30 Ge & Si Crystals Arranged in verticals stacks of 6 called “towers” Shielding composed of lead, poly, and a muon veto not described. 7.6 cm diameter.
EE513 Audio Signals and Systems Statistical Pattern Classification Kevin D. Donohue Electrical and Computer Engineering University of Kentucky.
Decision Trees Advanced Statistical Methods in NLP Ling572 January 10, 2012.
Chapter 9 – Classification and Regression Trees
DC2 at GSFC - 28 Jun 05 - T. Burnett 1 DC2 C++ decision trees Toby Burnett Frank Golf Quick review of classification (or decision) trees Training and testing.
Pass 6: GR V13R9 Muons, All Gammas, & Backgrounds Muons, All Gammas, & Backgrounds Data Sets: Muons: allMuon-GR-v13r9 (14- Jan-2008) AG: allGamma-GR-v13r9.
A statistical test for point source searches - Aart Heijboer - AWG - Cern june 2002 A statistical test for point source searches Aart Heijboer contents:
Point Source Search with 2007 & 2008 data Claudio Bogazzi AWG videconference 03 / 09 / 2010.
Training of Boosted DecisionTrees Helge Voss (MPI–K, Heidelberg) MVA Workshop, CERN, July 10, 2009.
Bill Atwood, SCIPP/UCSC, May, 2005 GLAST 1 A Parametric Energy Recon for GLAST A 3 rd attempt at Energy Reconstruction Keep in mind: 1)The large phase-space.
1 Universidad de Buenos Aires Maestría en Data Mining y Knowledge Discovery Aprendizaje Automático 5-Inducción de árboles de decisión (2/2) Eduardo Poggi.
Decision Trees Binary output – easily extendible to multiple output classes. Takes a set of attributes for a given situation or object and outputs a yes/no.
Feb. 7, 2007First GLAST symposium1 Measuring the PSF and the energy resolution with the GLAST-LAT Calibration Unit Ph. Bruel on behalf of the beam test.
Radiation Detection and Measurement, JU, 1st Semester, (Saed Dababneh). 1 Radioactive decay is a random process. Fluctuations. Characterization.
Resolution and radiative corrections A first order estimate for pbar p  e + e - T. H. IPN Orsay 05/10/2011 GDR PH-QCD meeting on « The nucleon structure.
Preliminary results for the BR(K S  M. Martini and S. Miscetti.
ANOVA, Regression and Multiple Regression March
Optimization of Analysis Cuts for Oscillation Parameters Andrew Culling, Cambridge University HEP Group.
Beam Extrapolation Fit Peter Litchfield  An update on the method I described at the September meeting  Objective;  To fit all data, nc and cc combined,
Event Analysis for the Gamma-ray Large Area Space Telescope Robin Morris, RIACS Johann Cohen-Tanugi SLAC.
00 Cooler CSB Direct or Extra Photons in d+d  0 Andrew Bacher for the CSB Cooler Collaboration ECT Trento, June 2005.
GLAST/LAT analysis group 31Jan05 - T. Burnett 1 Creating Decision Trees for GLAST analysis: A new C++-based procedure Toby Burnett Frank Golf.
CSC321: Introduction to Neural Networks and Machine Learning Lecture 15: Mixtures of Experts Geoffrey Hinton.
1 HBD Update Itzhak Tserruya DC meeting, May 7, 2008 May7, 2008.
January 21, 2005Peter Rovegno and David Williams, Milagro Collaboration Meeting 1 Background Rejection using Angle Fit Quality Source analysis has an implicit.
Tree and Forest Classification and Regression Tree Bagging of trees Boosting trees Random Forest.
Gamma-ray Large Area Space Telescope ACD Final Performance
Comparison of GAMMA-400 and Fermi-LAT telescopes
GLAST Large Area Telescope:
Issues in Decision-Tree Learning Avoiding overfitting through pruning
EE513 Audio Signals and Systems
EVENT PROJECTION Minzhao Liu, 2018
Overview of Beam Test Benoit, Ronaldo and Eduardo Nov 8 , 2005 thanks to Steve and Bill for helping to consolidate all the information.
F test for Lack of Fit The lack of fit test..
Presentation transcript:

Bill Atwood, July, 2003 GLAST 1 A GLAST Analysis Agenda Overarching Approach & Strategy Flattening Analysis Variables Classification Tree Primer Sorting out Energies PSF Analysis Background Rejection Assessment

Bill Atwood, July, 2003 GLAST 2 Strategy Terminology and GLAST Phase space: Light Gathering Power: A eff x  GLAST S.R.: 8000 cm 2 x 2.0 str = cm 2 -str Goal: cm 2 x 2.4 str = cm 2 -str Triggerable: x.65 x 2.4str = cm 2 -str EGRET: ~ 1000 cm 2 x.6 str = 600 cm 2 -str Input Data: All Gamma: 18 MeV – 18 GeV into 6 m 2 x 2  str (= 37.7 m 2 -str) Energy Spectrum: 1/E (Flat in Log(E)) “Pre-Cuts”: AcdActiveDist 0 Background: Generic On-Orbit Mix - same A eff x  Variables: To cover GLAST Phase space – make variables independent of Energy and cos(  ) Alternative: make analysis “cuts” energy and angle dependent Key Methodology: Classification Trees

Bill Atwood, July, 2003 GLAST 3 Strategy 2 Game Plan: 1) Flatten important variables used in the analysis 2) Use CT technology to determine events with “well measured” energies 3) Use CT technology to determine events with “well measured” directions 4) Filter background events (BGE’s) and  ’s through the above CT scripts and form training and testing samples for background rejection 5) Use CT technology to separate  ’s from BGE’s

Bill Atwood, July, 2003 GLAST 4 Flattening the Variables Many analysis variables vary (albeit) slowly with energy and cos(  ). Assume averages can be modeled by Least Squares Fit to 2 nd order. First do log(E) dependence:

Bill Atwood, July, 2003 GLAST 5 Flatten 2 Next do cos(  ): Variables which have been “flattened” include: Tkr1Chisq Tkr2Chisq EvtTkrComptonRatio EvtCalXtalTrunc Tkr1FirstChisq Tkr2FirstChisq EvtCalTLRatio EvtCalTrackDoca Tkr1Qual Tkr2Qual EvtCalXtalRatio EvtCalTrackSep EvtVtxEAngle EvtVtxDoca EvtVtxHeadSep

Bill Atwood, July, 2003 GLAST 6 Classification Tree Primer Origin: Social Sciences How a CT works is simple: A series of “cuts” parse the data into a “tree” like structure where final nodes (leaves) are “pure” How the Cuts are determined is harder: (Called Partitioning) Total Likelihood for a tree is: where p ik are the probabilities and n ik are the number of events. For each node define a deviance Splitting node i into two smaller nodes s & t Results in a reduction in deviances given by where

Bill Atwood, July, 2003 GLAST 7 Tree Primer (2) The probabilities are not know a priori so the event counts in the training sample are used. Example: From this the value of a split can be determined by Note that splitting nodes with large numbers of events is favored. Splitting of each node continues until change in deviance is too small or the number of events in the node has fallen below a minimum. Tree construction is a “look one step ahead” process – it does not necessarily find the ultimate optimal tree. Trees readily adapt to the “training” data if the event count in the leaves or the deviance reduction at each split is allowed to be too small.

Bill Atwood, July, 2003 GLAST 8 Sorting Out the Energies Energy Types: 1) No CAL Events: < 5 MeV OR < 2 r.l. in CsI 2) Low CAL Events: < 100 MeV 3)High CAL Events: > 100 MeV Percentages: 46% 13% 41% Good Energy Definition Model: (Maps energy errors onto a common scale. Example: for  Energy =.1 (GLAST Nominal)  E 100 MeV = 12 MeV &  E 1000 MeV = 84 MeV) No CAL: -.4 <  E/E < 1.5 (-60% + 150%) Low/Hi CAL: -.5 <  E/E <.5 (+- 50%) Break Down of Energy Classes

Bill Atwood, July, 2003 GLAST 9 Energy Classes NoCal CalLow

Bill Atwood, July, 2003 GLAST 10 Energy Classes Energy Class Break Down Prob. > 50% 100 MeV 1 GeV 10 GeV CalHigh

Bill Atwood, July, 2003 GLAST 11 Energy Summary Event Loss = 15% A eff x  2.74 m 2 -str A eff x  2.33  m 2 -str No Probability Cut Prob. >.50 Low & High CAL Classes

Bill Atwood, July, 2003 GLAST 12 Energy Summary (cont’) Energy is “FLAT” in dimensions of  and E. Remaining A eff x  2.33  m 2 -str Remaining “Bad Energy”: 6.3% Remaining “Good Energy”: 84% 4 – 6  events Fraction ~.7 x Horizontal Events - Not so easy to remove at this stage. Note: This is where they are generated – NOT where they are reconstructed

Bill Atwood, July, 2003 GLAST 13 PSF Analysis Goals: Separate well measured events from poor ones Maintain the highest A eff x  Provide a “tune-able” handle to improve resolution allowing for flexibility in applications to science topics

Bill Atwood, July, 2003 GLAST 14 PSF Classes Conversion Location: Thick & Thin First hit occurs in Thin radiator section Thin First hit occurs in Thick radiator section Thick Analysis Type: VTX & 1Trk > 50% of Events have a “VTX” solution (VTX solution 2 tracks combined to give  direction) VTX Solution not always better than the “Best Track Solution” Types sorted out via a Classification Tree 4 PSF Classes x 3 Energy Classes

Bill Atwood, July, 2003 GLAST 15 The VTX Decision

Bill Atwood, July, 2003 GLAST 16 VTX Thin Clip Bad Events using a CT Predict how “good” using a Regression Tree This process is repeated for the 4 Tracking Event Classes

Bill Atwood, July, 2003 GLAST 17 PSF Results – Thin Radiator NO CUTS PSF 95 /PSF 68 = 3.2 A eff x  1.01 m 2 -str Tails Clipped PSF 95 /PSF 68 = 2.8 A eff x .95 m 2 -str PSF 95 /PSF 68 = 2.9 A eff x .84 m 2 -str PSF 95 /PSF 68 = 2.4 A eff x .51 m 2 -str  core < 1.3  core <.75

Bill Atwood, July, 2003 GLAST 18 Thin Radiator PSF 2 Cos(  ) Dependence Cuts: 1) Tails Clipped 2)  core < 1.3

Bill Atwood, July, 2003 GLAST 19 PSF Results – Thick Radiator PSF 95 /PSF 68 = 2.6 A eff x .80 m 2 -str  core < 1.3 Thick Radiator Events: Expect 1)Similar to Thin yes  A eff ~95% A eff (Thin) 76% 3)~2 x worse PSF 2.1 x At high energy PSF thick PSF thin Multiple Scattering becomes less important then measurement errors.

Bill Atwood, July, 2003 GLAST 20 PSF Results – What Remains A eff x  distributions approximately the same SR case (  core < 1.3): A eff x .80 m 2 -str +.84 m 2 -str = 1.64 m 2 -str Ratio of Integral log(E) plots to flat (as generated) distribution: ~ 1.8 Hence Asymptotic A eff x  2.94 m 2 -str (lots of light gathering power left) Thick Thin

Bill Atwood, July, 2003 GLAST 21 Background Rejection Goal: remove most of the BGE’s while preserving the  signal Problem: Large imbalance between #BGE’s and #  ’ s. CT’s need sufficient #’s of events to establish unbiased model trees. Show Stopper: 11 th hour discovery of problems in ACD Sim & Analysis AcdTileCount = 0 BGE Data Set No Side Tiles Fired! The events pour in! Also there’s trouble with Top Tiles as well! (Blue ~ 1/pixel, Brown ~ 50/pixel)

Bill Atwood, July, 2003 GLAST 22 Forge Ahead! (DTFSA) Step 1: Events are first processed in the PSF Analysis script. a) Good Energy Prob. >.50 b) Determine Event Classes c) Compute CT’s for PSF Analysis d) No cuts on PSF - goodness The Formal portion of the talk is now ended! What lies ahead is presented to show the direction which is being pursued. All the quantitative results are given as illustrative only! IN SHORT: QUOTE NOTHING FROM THIS!

Bill Atwood, July, 2003 GLAST 23 Breakdown after PSF Processing Total: BGE Accounting: 1)2x10 6 generated 2)8.5x10 5 Triggered 3)8.9x10 4 Post-Prunning 4)12.4x10 3 Post Energy Selection Survival Factors* Event Class Factor Thin-VTX 6x10 -4 Thick-VTX 8x10 -4 Thin-1Tkr 3x10 -2 Thick-1Tkr 2x10 -2 *Factors relative to Triggered and are corrected for relative Signal Fractions (For SR case – factors 2x smaller) At this point  events have lost 4% due to ACD cuts & 15% due to energy cut Note: the disparity among the Event Classes Losses: minimal and the VTX Event Classes already have S/N ~ 1 : 1 ! BGE’s  ’s Total: 1.000

Bill Atwood, July, 2003 GLAST 24 BGE Rejection CT’s Step 2: Mixes of BGE’s and  ’s are formed a) Training Sample – 50:50 BGE:   (Split the BGE sample 50:50 Training/Testing)  - Leaves only ~ 6500 of each type  - Statistics allow for only shallow CT’s  - For demonstration – Lump Thick & Thin Event Classes Together b) Test Sample - 80:1 BGE:  (relative to “as-generated” totals) The available statistics don’t even allow for this! - Leaves only ~ 500  ’ s (after SR Case PSF Cuts ~ 400  ’ s) Caveat: What ratio of events should the train sample have? - Need sufficient numbers of both classes to establish patterns - At “real” analysis ratios – the CT splitting mechanism work poorly. Deviance per split will be too small. - Trial & Error shows that ratio needs to be within a factor of 2.

Bill Atwood, July, 2003 GLAST 25 BGE Rejection CT’s: VTX Events Note the sparse stats For VTX Events The CT gives > 10x more Rejection Limited Rejection due to low stats The usual suspects! (PLUS 1 –Can you find it?) BGEs ss The Tree Probabilities

Bill Atwood, July, 2003 GLAST 26 BGE Rejection CT’s: 1Trk Events The Tree Probabilities BGEs ss Stats large enough to grow a moderate size tree For 1Tkr Events The CT gives > 10x more Rejection Would do better if Thick and Thin were done separately

Bill Atwood, July, 2003 GLAST 27 Background Rejection Summary  Event Probabilities  Events BG Events Event Classes Event Types

Bill Atwood, July, 2003 GLAST 28 BGE Rejection Summary 2 VTX Events (undifferentiated w.r.t. Thin/Thick) 1) Remaining background: 3% (But recall test sample is only 80:1) 2) Good Event Loss: 17% 3) BGE Reduction Factor: 16x (post SR Case selection) 4) Further progress stop for lack of statistics (there were 3 BGE’s events left) 1Tkr Events (undifferentiated w.r.t. Thin/Thick) 1) Remaining background: 32% (No there yet!) 2) Good Event Loss: 3% 3) BGE Reduction factor: 60x (post SR case selection 4) Further progress limited by state of present software This exercise is an example of what will happen to the science if we lose two sides of the ACD and put a big hole in the top of it as well!