Presentation transcript:

Bill Atwood, Nov. 2002, GLAST, Slide 1
Classification PSF Analysis
- A New Analysis Tool: Insightful Miner
- Classification Trees From Cuts
- Classification Trees: Recasting of the GLAST PSF Analysis
- Energy Dependencies
- Present Status of GLAST PSFs

Bill Atwood, Nov. 2002, GLAST, Slide 2
A Data Mining Tool: An Insightful Miner Analysis Program!

Bill Atwood, Nov. 2002, GLAST, Slide 3
Miner Details
What is a Data Miner?
- A graphical user programming environment
- An ensemble of data manipulation tools
- A set of data modeling tools
- A "widget" scripting language
- An interface to databases
Why use a Data Miner?
- Fast and easy prototyping of analyses
- Encourages "exploration"
- Allows a more "global" view of the analysis
[Figure: an example Miner workflow with INPUT and OUTPUT nodes, a properties browser to set parameters, and a traditional "cut".]

Bill Atwood, Nov. 2002, GLAST, Slide 4
Classification Trees
Given a categorical variable, split the data (root into Branch 1 and Branch 2) using the "best" independent continuous variable.
Example: VTX.Type = 1 if the "vertex" direction is best, 2 if the "best-track" direction is best.
Use "entropy" to decide which independent variable to use:
Entropy_i = - Σ_k p_ik log(p_ik), where k runs over categories and i labels the i-th node. (There are other criteria.)
Continue the process, treating each branch as a new "root." Terminate according to the statistics in the last node and/or the change in entropy.
Example: a classification tree from Miner.
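The entropy-driven split described above can be sketched in a few lines of generic Python. This is an illustration of the technique only, not the Insightful Miner implementation; the function names (`entropy`, `best_threshold`) are mine:

```python
import math

def entropy(counts):
    """Shannon entropy of one node, summed over categories k."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def best_threshold(values, labels):
    """Scan candidate thresholds on one continuous variable and return
    (threshold, weighted entropy) for the split that most reduces the
    category entropy of the two daughter nodes."""
    cats = sorted(set(labels))
    best = (None, entropy([labels.count(c) for c in cats]))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if not left or not right:
            continue  # degenerate split, skip
        e = (len(left) * entropy([left.count(c) for c in cats]) +
             len(right) * entropy([right.count(c) for c in cats])) / len(labels)
        if e < best[1]:
            best = (t, e)
    return best
```

A tree builder would apply `best_threshold` to every candidate variable at each node, pick the winner, and recurse on each branch until the stopping criteria (node statistics, entropy change) are met.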

Bill Atwood, Nov. 2002, GLAST, Slide 5
Classification Trees
Why use Classification Trees?
1. Simplicity of method: recursive application of a decision-making rule
2. Easily captures non-linear behavior in predictors, as well as interactions among them
3. Not limited to just 2 categories
There are numerous texts on this subject.
In the following analysis, Classification Trees will be used to:
- Separate out the good "vertex" events
- Predict how "good" an event really is

Bill Atwood, Nov. 2002, GLAST, Slide 6
GLAST PSF Analysis
This portion of the code:
- Reads in the data
- Culls out bad data
- Adds new columns for analysis
- Makes global cuts (ACD.DOCA > 350 & Energy > .5*MC.Energy)
- Splits the data into 2 pieces: thin radiators vs. thick radiators (TKR.1.z0 > 250)
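The preprocessing steps above can be sketched with plain Python dicts. The column names (ACD.DOCA, MC.Energy, TKR.1.z0) and cut values come from the slide; everything else, including the assumption that TKR.1.z0 > 250 selects the thin-radiator section, is illustrative:

```python
def preprocess(events):
    """Apply the slide's global cut, then split into thin/thick samples.
    Each event is a dict keyed by the slide's column names."""
    kept = []
    for ev in events:
        # Global cut from the slide: ACD.DOCA > 350 & Energy > .5*MC.Energy
        if ev["ACD.DOCA"] > 350 and ev["Energy"] > 0.5 * ev["MC.Energy"]:
            kept.append(ev)
    # Thin/thick radiator split on TKR.1.z0 > 250
    # (assumption: the high-z side is the thin sample)
    thin = [ev for ev in kept if ev["TKR.1.z0"] > 250]
    thick = [ev for ev in kept if ev["TKR.1.z0"] <= 250]
    return thin, thick
```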

Bill Atwood, Nov. 2002, GLAST, Slide 7
The VTX Classification Tree
[Figure annotations: relative amounts of categories; relative amount of data.]

Bill Atwood, Nov. 2002, GLAST, Slide 8
CPA: To Vertex or not to Vertex?
The probability is not continuous; it is essentially binned by the finite number of leaves (ending nodes).
There is a "gap" at 0.5; use that to determine which solution to use.

Bill Atwood, Nov. 2002, GLAST, Slide 9
Do the Vertex Split!
The data are now divided into 2 subsets according to the probability that the 2-track ("vertex") solution is best: one branch uses the 2-track solution, the other the 1-track solution. No data have been eliminated; failed vertex solutions are tried again as 1-track events.
[Workflow annotations: the predictor created by the Classification Tree; rename the probability column from the "Thin" split.]
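The routing step above amounts to a single threshold on the tree's predicted probability, placed in the gap at 0.5 noted on the previous slide. A minimal sketch, with a hypothetical column name (`VTX.Prob`) standing in for the renamed probability column:

```python
def choose_solution(events, threshold=0.5):
    """Route each event to the 2-track ("vertex") or 1-track solution,
    cutting in the probability gap at 0.5. No event is discarded."""
    vertex, one_track = [], []
    for ev in events:
        # "VTX.Prob" is an illustrative name for the tree's output column
        (vertex if ev["VTX.Prob"] > threshold else one_track).append(ev)
    return vertex, one_track
```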

Bill Atwood, Nov. 2002, GLAST, Slide 10
Bin the PSF
[Figure annotations: the continuous variable is binned into a categorical variable.]
Target class: Class #1, the multiple-scattering-limited (MS) PSF bin.
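Turning the continuous PSF error into the categorical target the tree needs is a simple binning operation. The bin edges and the 1-based class numbering below are illustrative assumptions; the slide specifies only that Class #1 is the multiple-scattering-limited bin:

```python
import bisect

def bin_psf(err, edges):
    """Map a continuous PSF error to a 1-based class label.
    `edges` are ascending bin boundaries (assumed values, for
    illustration); errors below the first edge fall in Class 1."""
    return bisect.bisect_right(edges, err) + 1
```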

Bill Atwood, Nov. 2002, GLAST, Slide 11
2-Track Classification Tree

Bill Atwood, Nov. 2002, GLAST, Slide 12
1-Track Classification Tree

Bill Atwood, Nov. 2002, GLAST, Slide 13
Combining Results

Bill Atwood, Nov. 2002, GLAST, Slide 14
Example PSFs at FoM Max
- 100 MeV: PSF-68 = 2.7°, 95/68 = …
- … MeV: PSF-68 = 0.35°, 95/68 = …
- … MeV: PSF-68 = 0.1°, 95/68 = 2.9
(Some energies and ratios were lost in transcription.)

Bill Atwood, Nov. 2002, GLAST, Slide 15
Before and After Trees
Using Classification Trees: PSF = 2.1°, 95%/68% = 2.34, Aeff = 1387 cm².

Bill Atwood, Nov. 2002, GLAST, Slide 16
Before and After Trees
[Figure axes: 95/68 ratio; Aeff.]
Best results are obtained using the "cuts" to achieve a good PSF.
Using Classification Trees: PSF = 2.1°, 95%/68% = 2.34, Aeff = 1387 cm².