1 Using Insightful Miner Trees for Glast Analysis Toby Burnett Analysis Meeting 2 June 2003.

Presentation transcript:

1 Using Insightful Miner Trees for Glast Analysis Toby Burnett Analysis Meeting 2 June 2003

2 The problem
Bill is using IM classification and regression tree analysis to achieve very good PSF results
IM is proprietary, and very expensive

3 Bill’s IM worksheet (PSFAnalysis_14)
[Figure labels: Training region; Analyze results; Input tuple]

4 The Trees: calculate 4 values with 11 nodes
Good calorimeter measurement [1 node]
Vertex vs. 1 track (thin and thick) [2 nodes]
Core vs. tail (thin/thick and vtx/1 trk) [4 nodes]
Prediction of recon direction error [4 nodes]
Example: A Good CAL/Bad CAL prediction node
CalTwrEdge =26.58, CalTwrEdge 3,611.48, CalTrackDoca>3.96, CalXtalRatio 1.76

5 Bill’s result*
100 MeV, with tail cuts and best estimate
* Flawed by G4 problems

6 A Solution
IM saves its results as XML files, which are easy to interpret.
A new package, “classification”, defines a class classification::Tree that does the following:
–accepts a “lookup” object to obtain a pointer to the value associated with each named quantity
–parses the XML file, creating a tree structure for each prediction tree found
–for a given event, returns a value from each tree
Merit creates and fills the new tuple variables in a new class, ClassificationTree, which:
–duplicates the logic defining the 4 categories
–evaluates each of the 4 variables
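The parse-then-evaluate flow on this slide can be illustrated with a minimal Python sketch. It is not the classification::Tree code, which is C++ and is not shown in the slides. The XML layout (`trees`, `tree`, `node`, `leaf`, `var`, `cut`, `value`), the tree name, the thresholds, the leaf values and the "go left if value < cut" rule are all assumptions made for illustration; the real Insightful Miner schema may look quite different. Only the variable names CalTwrEdge and CalTrackDoca come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: NOT the real Insightful Miner schema.
# Thresholds, leaf values and the tree name are invented for illustration.
EXAMPLE_XML = """
<trees>
  <tree name="goodCalProb">
    <node var="CalTwrEdge" cut="25.0">
      <leaf value="0.9"/>
      <node var="CalTrackDoca" cut="4.0">
        <leaf value="0.6"/>
        <leaf value="0.2"/>
      </node>
    </node>
  </tree>
</trees>
"""


def parse_trees(xml_text):
    """Return {tree name: root node element} for every <tree> found."""
    root = ET.fromstring(xml_text)
    return {tree.get("name"): tree[0] for tree in root.findall("tree")}


def evaluate(node, lookup):
    """Descend to a leaf and return its value.

    `lookup` maps a variable name to its value for the current event.
    Assumed rule: take the first child if value < cut, else the second.
    """
    while node.tag == "node":
        value = lookup(node.get("var"))
        node = node[0] if value < float(node.get("cut")) else node[1]
    return float(node.get("value"))


# One event, represented as a plain dict standing in for the tuple row.
event = {"CalTwrEdge": 30.0, "CalTrackDoca": 2.5}
trees = parse_trees(EXAMPLE_XML)
results = {name: evaluate(root, event.__getitem__) for name, root in trees.items()}
print(results)
```

Passing the lookup as a callable keeps the tree independent of how the event data is stored, which appears to be the purpose of the “lookup” object described on the slide.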

7 Current Procedure
Bill releases an IM file.
I strip it down, removing nodes not required for analysis
–size reduced by 1/2, to 500 kB.
Rename it, and check it in to CVS as classification/xml/PSF_Analysis.xml
Create a tuple with merit, containing the new tuple quantities
Feed that tuple to this IM worksheet, which writes a new tuple with both versions of the same variables

8 Results: the good The comparisons were with generated 100 MeV normal The vertex classification (used to select vertex vs. 1 Track direction estimate) is perfect, as is the core vs. tail

9 Results: the bad
The “regression tree” that predicts the PSF error gives results with two populations!
The agreement is rather poor for the “thin vertex” category; otherwise perfect.
An explanation: Bill generated two different trees from different data sets, of 1000 and 243 events. (The latter has only two nodes and can only generate 3 values.)
–The merit evaluation uses only the first tree
–The IM evaluation uses an average of the two trees.
–Note that there are three branches.
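A toy Python sketch of why the two evaluations disagree: one scheme returns the first tree's prediction, the other averages two trees. Both trees, their cut values and their outputs are invented stand-ins; the only features taken from the slide are that the second tree has two decision nodes (hence three possible values) and that IM averages while merit uses the first tree. An unweighted mean is assumed; the slides do not say how IM weights the trees.

```python
def tree_a(x):
    """Stand-in for the larger tree (invented cuts and leaf values)."""
    if x < 1.0:
        return 0.10
    if x < 2.0:
        return 0.20
    if x < 3.0:
        return 0.30
    return 0.40


def tree_b(x):
    """Stand-in for the small tree: two decision nodes, three leaf values."""
    if x < 1.5:
        return 0.15
    if x < 2.5:
        return 0.25
    return 0.35


def first_tree_only(x):
    """What the merit evaluation is described as doing."""
    return tree_a(x)


def averaged(x):
    """What the IM evaluation is described as doing (unweighted mean assumed)."""
    return 0.5 * (tree_a(x) + tree_b(x))


xs = [0.5, 1.2, 1.8, 2.2, 2.8, 3.5]
for x in xs:
    print(f"x={x:3.1f}  first tree={first_tree_only(x):.3f}  average={averaged(x):.3f}")

# Events where the two schemes give different predictions.
mismatches = [x for x in xs if abs(first_tree_only(x) - averaged(x)) > 1e-12]
```

In this toy case every event is shifted by the second tree's contribution, so a scatter plot of one scheme against the other cannot fall on the diagonal. Either fix mentioned later in the talk (training one tree, or averaging in both places) removes the discrepancy.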

10 Results: the ugly
This is the comparison of the prediction for good energy measurement.
Again, Bill created two trees, which are apparently being averaged.

11 Observations, plans?
Two possibilities to fix the “disagreement”
–Bill: train only one tree
–me: average all the trees
Using IM to train the classification or regression trees
–The current procedure is exploratory
–If we decide to use these trees in the final analysis, they must be trained systematically
–Another possibility (idea from Tracy): use the classification/regression analysis in S-PLUS, which manages tree objects.

MeV analysis w/ merit analysis Example only: G4 5.0 is too flawed to take seriously Tail cuts are clearly effective