NATIONAL ENERGY RESEARCH SCIENTIFIC COMPUTING CENTER. Charles Leggett. A Lightweight Histogram Interface Layer. CHEP 2000, Session F (F320), Thursday.

Presentation transcript:

NATIONAL ENERGY RESEARCH SCIENTIFIC COMPUTING CENTER
Charles Leggett
A Lightweight Histogram Interface Layer
CHEP 2000, Session F (F320), Thursday Feb 10

Introduction
- Current histogramming software packages, such as PAW, ROOT, and JAS, have enormous functionality.
- They are no longer simply histogramming packages: they have added data analysis and visualization features.
- The tight integration between these features has made it difficult to separate the statistical data gathering feature from the analysis and graphical presentation features.
- This results in significant overhead if only the histogramming aspect is needed.

Introduction (cont)
- Many histogramming packages are wedded to a specific i/o format.
- Very few translation programs exist to convert between the various formats.
- This makes it very hard to use analysis and visualization tools that are not part of the package used to generate the histogram.
- Users have very little freedom to choose the package best suited to their needs, or the one they are most familiar with.

Why an "Interface Layer"
- Since it is format independent and has no i/o (file or visual) requirements, it is not wedded to a specific part of the analysis procedure.
- It can sit between components, such as between the data acquisition component and the analysis component, offering the ability to use various formats in different applications.

Design Requirements
- Platform and i/o format independent
- Lightweight: low overhead, minimal non-histogram features
- Possibility to histogram any data type
- Ability to use within an analysis schema, as an interface between different components, or as a standalone utility
- Ability to use as a translator between various i/o formats
- i/o formats user extensible
- Easy implementation by user

Required Qualities of a Histogram
- A collection of statistical data related to a particular process.
- Should not contain any information unrelated to the statistical data, such as colour, fitting parameters, line width, cuts, etc.
- Number of bins + overflow/underflow
- Bin edges
- Entries per bin + associated errors
- Identification information, such as an ID or name
- Total storage for n bins: n+3 + 2n

Minimal Set of Useful Methods
- weighted entries
- reset()
- bin contents, errors, centers, edges
- bin numbers <-> bin edges/centers
- simple operations: =, +, -
- mean(), rms()
- min(), max()
- rebin(), resize()
- change title
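To make the list above concrete, here is a minimal sketch of a fixed-width histogram offering a few of these methods (weighted fill, reset, bin contents/errors/centers, mean, rms, and addition). The class name `SimpleHist`, its method names, and its storage layout are assumptions made for illustration; they are not taken from the package described in the talk. As in most histogram packages, `rms()` here means the spread about the mean.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal 1D histogram with fixed-width bins.
// Slot 0 holds underflow, slots 1..n the bins, slot n+1 overflow.
class SimpleHist {
public:
    SimpleHist(double lo, double hi, std::size_t nbins)
        : lo_(lo), hi_(hi), sumw_(nbins + 2, 0.0), sumw2_(nbins + 2, 0.0) {
        assert(nbins > 0 && hi > lo);
    }

    // Weighted entry; the default weight of 1 gives a plain fill.
    void fill(double x, double w = 1.0) {
        const std::size_t i = slot(x);
        sumw_[i] += w;
        sumw2_[i] += w * w;
        if (i >= 1 && i <= nbins()) {  // moments use in-range entries only
            sw_ += w;
            swx_ += w * x;
            swx2_ += w * x * x;
        }
    }

    void reset() {
        sumw_.assign(sumw_.size(), 0.0);
        sumw2_.assign(sumw2_.size(), 0.0);
        sw_ = swx_ = swx2_ = 0.0;
    }

    std::size_t nbins() const { return sumw_.size() - 2; }
    double content(std::size_t i) const { return sumw_.at(i); }
    double error(std::size_t i) const { return std::sqrt(sumw2_.at(i)); }
    // Center of bin i, for i in 1..nbins().
    double center(std::size_t i) const {
        return lo_ + (static_cast<double>(i) - 0.5) * width();
    }

    double mean() const { return sw_ != 0.0 ? swx_ / sw_ : 0.0; }
    double rms() const {
        if (sw_ == 0.0) return 0.0;
        const double var = swx2_ / sw_ - mean() * mean();
        return var > 0.0 ? std::sqrt(var) : 0.0;
    }

    // Bin-by-bin sum; both histograms must share the same binning.
    SimpleHist& operator+=(const SimpleHist& o) {
        assert(lo_ == o.lo_ && hi_ == o.hi_ && sumw_.size() == o.sumw_.size());
        for (std::size_t i = 0; i < sumw_.size(); ++i) {
            sumw_[i] += o.sumw_[i];
            sumw2_[i] += o.sumw2_[i];
        }
        sw_ += o.sw_;
        swx_ += o.swx_;
        swx2_ += o.swx2_;
        return *this;
    }

private:
    double width() const { return (hi_ - lo_) / static_cast<double>(nbins()); }

    std::size_t slot(double x) const {
        if (!(x >= lo_)) return 0;         // underflow (also catches NaN)
        if (x >= hi_) return nbins() + 1;  // overflow
        const std::size_t i = static_cast<std::size_t>((x - lo_) / width());
        return (i < nbins() ? i : nbins() - 1) + 1;  // guard against rounding at the top edge
    }

    double lo_, hi_;
    std::vector<double> sumw_, sumw2_;
    double sw_ = 0.0, swx_ = 0.0, swx2_ = 0.0;
};
```

Keeping running sums for the moments means mean() and rms() are computed from the filled values rather than from bin centers, at a cost of three extra numbers per histogram.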

What Gets Histogrammed
- Normally we histogram ints and floats.
- What about entire objects?
- To histogram an object, we have to define which aspect of the object is used to order the histogram.
- This ordering could be provided every time a histogram is filled, but it is nicer to associate an ordering mechanism with the histogram itself.
- Define a function which provides this ordering, and give a pointer to it to the histogram object.
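The idea of attaching an ordering function to the histogram can be sketched as follows. The names `SetQuantFunction` and `MuonQuantFunction` echo the usage slide of this talk; everything else (the `ObjectHist` class, the fields of `Muon`, the choice of squared transverse momentum as the ordering, and the handling of out-of-range values) is a hypothetical stand-in, not the package's actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical object type; only the fields used below are defined.
struct Muon {
    float px, py, pz;
};

// Quantization function: reduces an object to the number that orders it.
// Squared transverse momentum is chosen purely as an example.
float MuonQuantFunction(const Muon& m) { return m.px * m.px + m.py * m.py; }

template <class T>
class ObjectHist {
public:
    using QuantFunction = float (*)(const T&);

    ObjectHist(float lo, float hi, std::size_t nbins)
        : lo_(lo), hi_(hi), counts_(nbins, 0) {}

    // The histogram keeps the pointer, so callers just pass objects to Fill.
    void SetQuantFunction(QuantFunction f) { quant_ = f; }

    // Returns false if no ordering is set or the value is out of range.
    bool Fill(const T& obj) {
        if (quant_ == nullptr) return false;
        const float x = quant_(obj);
        if (!(x >= lo_) || x >= hi_) return false;
        std::size_t i = static_cast<std::size_t>(
            (x - lo_) / (hi_ - lo_) * static_cast<float>(counts_.size()));
        if (i >= counts_.size()) i = counts_.size() - 1;
        ++counts_[i];
        return true;
    }

    unsigned long Content(std::size_t i) const { return counts_.at(i); }

private:
    float lo_, hi_;
    std::vector<unsigned long> counts_;
    QuantFunction quant_ = nullptr;
};
```

With this arrangement the filling code never needs to know which aspect of the object is being binned; swapping the ordering is a single call to `SetQuantFunction`.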

Types of Histograms
- BINNED
  - bin edges defined when created
  - either fixed or variable width
- UNBINNED
  - only for very small data samples
  - can be converted to BINNED
- AUTO-BINNED
  - starts off as UNBINNED, automatically converted to BINNED after a set number of entries
  - conversion routines calculate bin edges with either fixed width, or to maximize occupancy in each bin
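The AUTO-BINNED behaviour can be illustrated with a small sketch: entries are buffered until a threshold is reached, bin edges are then computed from the buffered sample, and the buffer is replayed into the new bins. All names here (`FixedWidthEdges`, `EqualOccupancyEdges`, `AutoHist`) are hypothetical, and using sample quantiles as edges is only one plausible reading of "maximize occupancy in each bin"; the talk does not specify the algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Edges for nbins equal-width bins spanning the sample range.
std::vector<double> FixedWidthEdges(const std::vector<double>& sample, std::size_t nbins) {
    assert(!sample.empty() && nbins > 0);
    const auto mm = std::minmax_element(sample.begin(), sample.end());
    const double lo = *mm.first, hi = *mm.second;
    std::vector<double> edges(nbins + 1);
    for (std::size_t i = 0; i <= nbins; ++i)
        edges[i] = lo + (hi - lo) * static_cast<double>(i) / static_cast<double>(nbins);
    return edges;
}

// Edges taken at sample quantiles, so bins hold roughly equal numbers of entries.
std::vector<double> EqualOccupancyEdges(std::vector<double> sample, std::size_t nbins) {
    assert(!sample.empty() && nbins > 0);
    std::sort(sample.begin(), sample.end());
    std::vector<double> edges(nbins + 1);
    for (std::size_t i = 0; i <= nbins; ++i)
        edges[i] = sample[i * (sample.size() - 1) / nbins];
    return edges;
}

// Starts unbinned; converts itself to binned after `threshold` entries.
class AutoHist {
public:
    AutoHist(std::size_t threshold, std::size_t nbins)
        : threshold_(threshold), nbins_(nbins) {
        assert(threshold > 0 && nbins > 0);
    }

    void Fill(double x) {
        if (binned_) { Count(x); return; }
        buffer_.push_back(x);
        if (buffer_.size() >= threshold_) Convert();
    }

    bool IsBinned() const { return binned_; }
    const std::vector<double>& Edges() const { return edges_; }
    const std::vector<std::size_t>& Counts() const { return counts_; }

private:
    void Convert() {
        edges_ = EqualOccupancyEdges(buffer_, nbins_);
        counts_.assign(nbins_, 0);
        binned_ = true;
        for (double v : buffer_) Count(v);
        buffer_.clear();
    }

    void Count(double x) {
        // Entries outside the converted range are dropped in this sketch.
        if (!(x >= edges_.front()) || x > edges_.back()) return;
        const auto it = std::upper_bound(edges_.begin(), edges_.end(), x);
        const std::size_t bin = (it == edges_.end())
            ? nbins_ - 1
            : static_cast<std::size_t>(it - edges_.begin()) - 1;
        ++counts_[bin];
    }

    std::size_t threshold_, nbins_;
    bool binned_ = false;
    std::vector<double> buffer_, edges_;
    std::vector<std::size_t> counts_;
};
```

One consequence of this design is that the range is frozen at conversion time, so the threshold should be large enough for the buffered sample to be representative.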

Use Overview
- Input: User Object, via a user-defined quantization function
- Book as: Binned, Unbinned, Auto
- Basic operations: Fill, Weighted Fill, Add, Subtract, ..., Resize, Rebin, Convert Type, etc.
- Output: hbook/PAW, ROOT, JAS, text, User Defined
- Continued Analysis

Internal Storage
- If memory utilization is very tight, the user may want to limit the precision of the statistical data.
- User can choose between 4 and 8 byte internal record keeping for:
  - bin contents
  - bin errors
  - number of entries
  - number of equivalent entries

Memory Usage
- Dynamic memory allocation is neat, but implementation (often) sucks. There will always be an overhead to using it.
- Pre-allocate memory: fairly easy to do with a BINNED histogram.
- Limit use of dynamic structures.
- Only run into trouble if a histogram needs to be re-sized or re-binned after it has been created.
- UNBINNED histograms can either pre-allocate memory, or dynamically allocate on the fly.
- Total overhead per histogram: 80 bytes.
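The two options for UNBINNED histograms can be sketched with a small container that reserves its memory once and either refuses further entries or falls back to dynamic growth. The class `UnbinnedStore` and its interface are assumptions for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical entry store for an unbinned histogram.
class UnbinnedStore {
public:
    UnbinnedStore(std::size_t capacity, bool allowGrowth)
        : capacity_(capacity), allowGrowth_(allowGrowth) {
        data_.reserve(capacity);  // single up-front allocation
    }

    // Returns false when the pre-allocated space is full and growth is disabled.
    bool Add(double x) {
        if (data_.size() >= capacity_ && !allowGrowth_) return false;
        data_.push_back(x);  // reallocates only once the reserved capacity is exceeded
        return true;
    }

    std::size_t Size() const { return data_.size(); }

private:
    std::size_t capacity_;
    bool allowGrowth_;
    std::vector<double> data_;
};
```

With growth disabled, no allocation happens after construction, which is the property of interest for memory-restricted tasks.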

Implementation Details
- The requirement to be able to histogram objects has a serious implication: use of templates.
- The histogram object becomes a templated object whose parameters are the type of object to be histogrammed and the type of internal record-keeping data: Histogram<object type, record-keeping type>
- For UNBINNED histograms, STL vectors are used if dynamic memory management is chosen.
- Similar syntax for 2D histograms.

Usage
- Simple histogram of floats, fixed bin width:
    Histogram<> h1(-10.,10.,100);
    h1.Fill(X);
- Histogram of ints, variable bin width, double precision:
    Histogram<int,double> h2(Xedge);
- Histogram of Muon object, automatically binned to maximize occupancy:
    float MuonQuantFunction(const Muon &M) { /* returns the value used to order the Muon */ }
    Histogram<Muon> h3(AUTOBINNED);
    h3.SetQuantFunction( MuonQuantFunction );

I/O
- File manager class used to read and write histograms from/to disk in a variety of formats.
- Internal histograms are only converted to a particular format when they are written.
- File manager can easily be extended to encompass new file formats.
- Current formats:
  - ASCII flat file
  - HBOOK
  - ROOT
  - XDR / DSL
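One common way to get a file manager that is "easily extended" is a registry of format writers behind a small interface, so that adding a format means registering one more class. The sketch below shows that pattern with an ASCII writer only. The names `HistData`, `FormatWriter`, `AsciiWriter`, and `FileManager::Register`, and the flat-file layout, are hypothetical; the talk does not show the real class. Output goes to a `std::ostream` so the same code can target a file or an in-memory buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>  // std::ostringstream, for in-memory targets
#include <string>
#include <utility>
#include <vector>

// Hypothetical format-neutral snapshot of a histogram.
struct HistData {
    std::string name;
    std::vector<double> edges;     // nbins + 1 values
    std::vector<double> contents;  // nbins values
};

// Interface each output format implements.
class FormatWriter {
public:
    virtual ~FormatWriter() = default;
    virtual void Write(const HistData& h, std::ostream& out) const = 0;
};

// Flat text: a header line, then "low-edge high-edge content" per bin.
class AsciiWriter : public FormatWriter {
public:
    void Write(const HistData& h, std::ostream& out) const override {
        assert(h.edges.size() == h.contents.size() + 1);
        out << "# " << h.name << '\n';
        for (std::size_t i = 0; i < h.contents.size(); ++i)
            out << h.edges[i] << ' ' << h.edges[i + 1] << ' ' << h.contents[i] << '\n';
    }
};

class FileManager {
public:
    // New formats are added by registering another writer under a name.
    void Register(const std::string& format, std::unique_ptr<FormatWriter> writer) {
        writers_[format] = std::move(writer);
    }

    // Conversion happens only here, at write time. Returns false for unknown formats.
    bool Write(const std::string& format, const HistData& h, std::ostream& out) const {
        const auto it = writers_.find(format);
        if (it == writers_.end()) return false;
        it->second->Write(h, out);
        return true;
    }

private:
    std::map<std::string, std::unique_ptr<FormatWriter>> writers_;
};
```

Writers for binary formats such as HBOOK or ROOT would need those libraries and are deliberately left out of this sketch.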

Ntuples
- ntuples are trickier than histograms, as there are several different types (column-wise vs. row-wise, ROOT trees, etc).
- For the moment, they have been implemented in the most trivial way: arrays/vectors of structs.
    struct S { float E; int np; Muon M; };
    ntuple<S> nt;
    S s;
    s.E = .... ;
    nt.Fill(s);
- Simple accessor methods also provided.
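A vector-of-structs ntuple with simple accessors might look like the sketch below. The `Ntuple` class, its `At` and `Column` accessors, and the single-field `Muon` are hypothetical; only the shape of `S` and the `Fill` call follow the slide.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the types on the slide.
struct Muon { float pt; };
struct S { float E; int np; Muon M; };

// Row-wise ntuple: each Fill appends one struct.
template <class Row>
class Ntuple {
public:
    void Fill(const Row& r) { rows_.push_back(r); }
    std::size_t Size() const { return rows_.size(); }

    // Row accessor; throws std::out_of_range on a bad index.
    const Row& At(std::size_t i) const { return rows_.at(i); }

    // Column accessor: copies one member out of every row.
    template <class Field>
    std::vector<Field> Column(Field Row::*member) const {
        std::vector<Field> out;
        out.reserve(rows_.size());
        for (const Row& r : rows_) out.push_back(r.*member);
        return out;
    }

private:
    std::vector<Row> rows_;
};
```

For example, `nt.Column(&S::np)` returns the `np` values of all rows as a `std::vector<int>`, which gives column-style access on top of row-wise storage at the cost of a copy.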

Additional Functionality
- Even though no complex functions are provided within the package, users may find it necessary to create them as needed.
- Library functions can easily be added to provide user-specific histogram/ntuple operations.
- For instance, if a user needs to perform a double Gaussian fit to a histogram, it is very easy to add this function in an external library, declared as a friend.
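The friend mechanism mentioned above can be sketched as follows. A double Gaussian fit is too long for a short example, so a much simpler routine, `PeakPosition`, stands in for it; that function, the `Hist` class, and its members are all hypothetical. The point is only that the histogram class names the external function as a friend, and the function body lives outside the class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal histogram that exposes nothing but Fill.
class Hist {
public:
    Hist(double lo, double hi, std::size_t nbins)
        : lo_(lo), hi_(hi), counts_(nbins, 0.0) {
        assert(nbins > 0 && hi > lo);
    }

    void Fill(double x, double w = 1.0) {
        if (!(x >= lo_) || x >= hi_) return;  // out-of-range entries are dropped in this sketch
        std::size_t i = static_cast<std::size_t>(
            (x - lo_) / (hi_ - lo_) * static_cast<double>(counts_.size()));
        if (i >= counts_.size()) i = counts_.size() - 1;
        counts_[i] += w;
    }

    // Grants one external routine direct access to the bin data.
    friend double PeakPosition(const Hist& h);

private:
    double lo_, hi_;
    std::vector<double> counts_;
};

// Would live in a separate, user-maintained library.
// Returns the center of the most populated bin (first one on ties).
double PeakPosition(const Hist& h) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < h.counts_.size(); ++i)
        if (h.counts_[i] > h.counts_[best]) best = i;
    const double width = (h.hi_ - h.lo_) / static_cast<double>(h.counts_.size());
    return h.lo_ + (static_cast<double>(best) + 0.5) * width;
}
```

The cost of this approach is that each friend must be named in the class declaration, so the core header changes whenever a new external routine needs private access.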

Additions in the Pipeline
- Ability to use shared memory
- Extend i/o format to include JAS
- Internal conversion to ROOT/HBOOK/JAS
- Profile histograms
- Further support for ntuples
- Adhere to AIDA interface

Pipedreams
- Create an adaptor to a memory-resident histogram object to allow multi-format access.
- The basic histogram object sits in memory and presents different representations of itself to various components, e.g. it looks like an HBOOK histogram to minuit, and like a ROOT histogram to a ROOT-specific process.
- If modifications are made to the histogram by other applications, it can re-synchronize and update itself.

Conclusions
- Makes a clean break between statistical data gathering and the analysis and visualization tasks.
- Enables histogramming of complex types.
- Simple and small implementation that is well suited to memory-restricted tasks, such as online data taking.
- Provides the user with the freedom to choose from a wide variety of analysis and visualization tools.
- Easily extensible, whether to new i/o formats or specific analysis functions.