EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks EasyGrid: a job submission system for distributed.

Slides:



Advertisements
Similar presentations
Annalisa Mastroserio ALICE meeting Italia – Vietri Sul Mare Φ meson detection with HMPID Annalisa Mastroserio University of Bari, INFN (Bari) Outline Importance.
Advertisements

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks MyProxy and EGEE Ludek Matyska and Daniel.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Why Grids Matter to Europe Bob Jones EGEE.
S.L.LloydATSE e-Science Visit April 2004Slide 1 GridPP – A UK Computing Grid for Particle Physics GridPP 19 UK Universities, CCLRC (RAL & Daresbury) and.
1 ALICE Grid Status David Evans The University of Birmingham GridPP 14 th Collaboration Meeting Birmingham 6-7 Sept 2005.
GridPP July 2003Stefan StonjekSlide 1 SAM middleware components Stefan Stonjek University of Oxford 7 th GridPP Meeting 02 nd July 2003 Oxford.
James Cunha Enabling Grid Computer for HEP Babar Team at University of Manchester Resources:
1 ALICE Grid Status David Evans The University of Birmingham GridPP 16 th Collaboration Meeting QMUL June 2006.
The Quantum Chromodynamics Grid James Perry, Andrew Jackson, Matthew Egbert, Stephen Booth, Lorna Smith EPCC, The University Of Edinburgh.
OptorSim: A Replica Optimisation Simulator for the EU DataGrid W. H. Bell, D. G. Cameron, R. Carvajal, A. P. Millar, C.Nicholson, K. Stockinger, F. Zini.
Your university or experiment logo here BaBar Status Report Chris Brew GridPP16 QMUL 28/06/2006.
B A B AR and the GRID Roger Barlow for Fergus Wilson GridPP 13 5 th July 2005, Durham.
Particle physics – the computing challenge CERN Large Hadron Collider –2007 –the worlds most powerful particle accelerator –10 petabytes (10 million billion.
Fighting Malaria With The Grid. Computing on The Grid The Internet allows users to share information across vast geographical distances. Using similar.
31/03/00 CMS(UK)Glenn Patrick What is the CMS(UK) Data Model? Assume that CMS software is available at every UK institute connected by some infrastructure.
Experimental Particle Physics PHYS6011 Joel Goldstein, RAL 1.Introduction & Accelerators 2.Particle Interactions and Detectors (2) 3.Collider Experiments.
Computing for LHC Dr. Wolfgang von Rüden, CERN, Geneva ISEF students visit CERN, 28 th June - 1 st July 2009.
15 May 2006Collaboration Board GridPP3 Planning Executive Summary Steve Lloyd.
EasyGrid: the job submission system that works! James Cunha Werner GridPP18 Meeting – University of Glasgow.
Hardeep Bansil Thomas McLaughlan University of Birmingham MINERVA Z Mass Exercise.
EGEE-II INFSO-RI Enabling Grids for E-sciencE The gLite middleware distribution OSG Consortium Meeting Seattle,
Recent Results on Radiative Kaon decays from NA48 and NA48/2. Silvia Goy López (for the NA48 and NA48/2 collaborations) Universitá degli Studi di Torino.
Investigations of Semileptonic Kaon Decays at the NA48 Еxperiment Milena Dyulendarova (University of Sofia “St. Kliment Ohridski”) for NA48 Collaboration.
Grid in action: from EasyGrid to LCG testbed and gridification techniques. James Cunha Werner University of Manchester Christmas Meeting
14 Sept 2004 D.Dedovich Tau041 Measurement of Tau hadronic branching ratios in DELPHI experiment at LEP Dima Dedovich (Dubna) DELPHI Collaboration E.Phys.J.
Particle Identification in the NA48 Experiment Using Neural Networks L. Litov University of Sofia.
Search for B     with SemiExclusive reconstruction C.Cartaro, G. De Nardo, F. Fabozzi, L. Lista Università & INFN - Sezione di Napoli.
Τ ± → π ± π + π - π 0 ν τ decays at BaBar Tim West, Jong Yi, Roger Barlow The University of Manchester Carsten Hast, SLAC IOP HEP meeting Warwick, 12 th.
Study of e + e  collisions with a hard initial state photon at BaBar Michel Davier (LAL-Orsay) for the BaBar collaboration TM.
Measurement of the Branching fraction B( B  D* l ) C. Borean, G. Della Ricca G. De Nardo, D. Monorchio M. Rotondo Riunione Gruppo I – Napoli 19 Dicembre.
Exploiting the Grid to Simulate and Design the LHCb Experiment K Harrison 1, N Brook 2, G Patrick 3, E van Herwijnen 4, on behalf of the LHCb Grid Group.
Test Of Distributed Data Quality Monitoring Of CMS Tracker Dataset H->ZZ->2e2mu with PileUp - 10,000 events ( ~ 50,000 hits for events) The monitoring.
James Cunha Job Submission for Babar Analysis James Werner Resources:
Chinorat Kobdaj SPC May  What is heavy ion physics?  What is ALICE?
EasyGrid Job Submission System and Gridification Techniques James Cunha Werner Christmas Meeting University of Manchester.
BaBar Grid Computing Eleonora Luppi INFN and University of Ferrara - Italy.
1 Introduction to Dijet Resonance Search Exercise John Paul Chou, Eva Halkiadakis, Robert Harris, Kalanand Mishra and Jason St. John CMS Data Analysis.
Analysis of E852 Data at Indiana University Ryan Mitchell PWA Workshop Pittsburgh, PA February 2006.
Irakli Chakaberia Final Examination April 28, 2014.
8th November 2002Tim Adye1 BaBar Grid Tim Adye Particle Physics Department Rutherford Appleton Laboratory PP Grid Team Coseners House 8 th November 2002.
Nick Brook Current status Future Collaboration Plans Future UK plans.
ATLAS and GridPP GridPP Collaboration Meeting, Edinburgh, 5 th November 2001 RWL Jones, Lancaster University.
GridPP18 Glasgow Mar 07 DØ – SAMGrid Where’ve we come from, and where are we going? Evolution of a ‘long’ established plan Gavin Davies Imperial College.
November SC06 Tampa F.Fanzago CRAB a user-friendly tool for CMS distributed analysis Federica Fanzago INFN-PADOVA for CRAB team.
SiD performance for the DBD Jan Strube CERN. Overview Software Preparation (CERN, SLAC) Machine Environment (CERN, SLAC) Tracking Performance (C. Grefe)
ATLAS is a general-purpose particle physics experiment which will study topics including the origin of mass, the processes that allowed an excess of matter.
BaBar and the Grid Roger Barlow Dave Bailey, Chris Brew, Giuliano Castelli, James Werner, Fergus Wilson and Will Roethel GridPP18 Glasgow March 20 th 2007.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Abel Carrión Ignacio Blanquer Vicente Hernández.
AI in HEP: Can “Evolvable Discriminate Function” discern Neutral Pions and Higgs from background? James Cunha Werner Christmas Meeting 2006 – University.
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
JPS 2003 in Sendai Measurement of spectral function in the decay 1. Motivation ~ Muon Anomalous Magnetic Moment ~ 2. Event selection 3. mass.
Mitchell Naisbit University of Manchester A study of the decay using the BaBar detector Mitchell Naisbit – Elba.
Distributed Physics Analysis Past, Present, and Future Kaushik De University of Texas at Arlington (ATLAS & D0 Collaborations) ICHEP’06, Moscow July 29,
Mitchell Naisbit Manchester Jan2004 spectral function → a μ, α 1. Motivation Dominant uncertainty comes from experimentally determined hadronic loops:
Joint Institute for Nuclear Research Synthesis of the simulation and monitoring processes for the data storage and big data processing development in physical.
SAM architecture EGEE 07 Service Availability Monitor for the LHC experiments Simone Campana, Alessandro Di Girolamo, Nicolò Magini, Patricia Mendez Lorenzo,
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
Introduction to Particle Physics II Sinéad Farrington 19 th February 2015.
Grid development at University of Manchester Hardware architecture: - 1 Computer Element and 10 Work nodes Software architecture: - EasyGrid to submit.
BaBar & Grid Eleonora Luppi for the BaBarGrid Group TB GRID Bologna 15 febbraio 2005.
Eleonora Luppi INFN and University of Ferrara - Italy
EasyGrid: a job submission system for distributed analysis using grid
Search For Pentaquark Q+ At HERMES
Plans for checking hadronic energy
Gridifying the LHCb Monte Carlo production system
Study of e+e collisions with a hard initial state photon at BaBar
M. Kezunovic (P.I.) S. S. Luo D. Ristanovic Texas A&M University
Measurement of b-jet Shapes at CDF
The LHCb Computing Data Challenge DC06
Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EasyGrid: a job submission system for distributed analysis using grid James Cunha Werner

Enabling Grids for E-sciencE EGEE-II INFSO-RI Develop grid software for BaBar experiment at University of Manchester BaBar is a high-energy physics experiment running since 1999 at Stanford University/SLAC to throw light on how the matter-antimatter symmetric Big Bang can have given rise to todays matter-dominated universe. BaBar analysis was a conventional centralized software (850 packages). The project goal was to study grid performance and develop gridification algorithms – 5 papers published and 20 international talks.

Enabling Grids for E-sciencE EGEE-II INFSO-RI TauUser data: 18,000 files each user has thousands of different results 500,000,000 events raw data 800,000,000 simulated Monte Carlo events Raw data: 1,000,000 files / 20,000 categories 4,000,000,000 events raw data 4,000,000,000 simulated Monte Carlo events Massive computational resources are required. Grid computing is a strong candidate to provide them! Challenge: data distributed analysis…

Enabling Grids for E-sciencE EGEE-II INFSO-RI Complex data management: Distributed datasets around the world and several other support databases (conditions, configuration, bookkeeping metadata, and parameters). Distributed and heterogeneous hardware platform around the world (standards). Users do not have grid skills.Their interests were high energy physics, not grid. Reliability/performance should be at least the same as SLAC. Users have a fixed time to do their research, they will use the more efficient resource. Main issues…

Enabling Grids for E-sciencE EGEE-II INFSO-RI LCG Grid Software Grid middleware developed by CERN / Switzerland and GridPP/UK. Homogeneous common ground in a heterogeneous platform. –User interface –Information system –Resource broker –Computer elements –Worker node –Storage Element Integration can be difficult for outsider users!

Enabling Grids for E-sciencE EGEE-II INFSO-RI LCG around the world

Enabling Grids for E-sciencE EGEE-II INFSO-RI EasyGrid: Job Submission system for grid It is an intermediate layer between Grid middleware and users software. It integrates data, parameters, software, and grid middleware doing all submission and management of several users software copies to grid. Performs DATA and TASK parallelism in grid. Web page: Paper:

Enabling Grids for E-sciencE EGEE-II INFSO-RI User software User computer Datasets Grid enabled software User software + Gridification algorithms Workload management Data Management Performance analysis Grid resources Gridification Process: from conventional to grid computing. Data Gridification Functional Gridification EasyGrid Job Submission system Submit jobs Manage datasets Recover results Recover reports > BetaMiniApp Tau11-Run3.tcl > Easygrid BetaMiniApp Tau11-Run3 See for more information File name

Enabling Grids for E-sciencE EGEE-II INFSO-RI Job submission block diagram

Enabling Grids for E-sciencE EGEE-II INFSO-RI Execution diagram

Enabling Grids for E-sciencE EGEE-II INFSO-RI Data parallelism in Grid Each data file will be read by each copy of the binary code in parallel. EasyGrid Tasks: –Copy binary code at closest storage elements. –Set environment in each worker node. –Start the binary code. –Recover results in users directory. –Provide information in case software fails. –Tools for data management and replication.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Data gridification in action

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Data gridification benchmarks

Enabling Grids for E-sciencE EGEE-II INFSO-RI Pions Kaons Monte Carlo Simulation Real data Particle identification Energy x Momentum for Tau 1N dataset. It contains 18,700,000 events. See

Enabling Grids for E-sciencE EGEE-II INFSO-RI Neutral pion decays BbkDatasetTcl selected 482,303,947 events in dataset Tau11-Run[1,2,3,4]- OnPeak-R14. Using easymoncar 4,890,000 events were simulated using Monte Carlo. Grid platform was used to run in parallel every data file selected by BbkDatasetTcl. Run3 run at Manchester and Run1,2,4 at RAL. Processing performance was 70,000 events per hour. See

Enabling Grids for E-sciencE EGEE-II INFSO-RI Parameters from Breit-Wigner mass distribution are: resonant mass 770 MeV, width 160 MeV and normalisation 4,500,000. Rho 770 reconstruction from hadronic tau decay

Enabling Grids for E-sciencE EGEE-II INFSO-RI Search for anti deuteron The first task is to find where deuterons (and anti-deuterons) strapes will be in de/dx by momentum biparametric plots. The strapes correspond to Pions, kaons,protons and deuterons respectively. The anti-matter plot almost does not have anti-deuteron events. There were 800 jobs searching in 2 million events each. See

Enabling Grids for E-sciencE EGEE-II INFSO-RI

Enabling Grids for E-sciencE EGEE-II INFSO-RI NP hard optimization using Genetic Algorithms Job Shop Scheduling optimization using an always feasible map with genetic algorithm. 161 data tests running in GA and MC.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Source Dr Marta Tavera Source Dr Mitchell Naisbit Some results from HEP users…

Enabling Grids for E-sciencE EGEE-II INFSO-RI Task parallelism in grid One master binary code (or client) requesting services and managing load flow. EasyGrid Tasks: –Set a task queue. –Search information system for services published in grid. –Establish sections in each worker node. –Start services and initialize software. –Send data for processing in each server. –Manages processing and re-submit in case of fail. –Manages notification and recover results in master.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Task gridification in action

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Task gridification benchmark

Enabling Grids for E-sciencE EGEE-II INFSO-RI Neutral Pion discrimination Neutral Pions decays into 2 Gammas, detected by BaBars Electromagnetic Calorimeter. Two background gammas could have neutral pion invariant mass just by chance. How to discriminate them using artificial intelligence ???

Enabling Grids for E-sciencE EGEE-II INFSO-RI Discriminate Functions Mathematical model obtained with GP maps the variables hyperspace to a real value through the discriminator function, an algebraic function of kinematics variables. Applying the discriminator to a given pair of gammas: – if the discriminate value is bigger than zero, the pair of gammas is deemed to come from pion decay. – Otherwise, the pair is deemed to come from another (background) source. Paper: Poster:

Enabling Grids for E-sciencE EGEE-II INFSO-RI Methodology MC data Training data Test data Discriminate function Raw data Select Real / background events MC data 1. Obtaining Discriminate Function (DF): 2. Test DF accuracy: 3. Selecting events for superposition: GP

Enabling Grids for E-sciencE EGEE-II INFSO-RI Selection criteria: 0 – red 1 - green Training data: 2 red(0) 2 green(1) Test data: 3 red(0) 3 green(1) DF

Enabling Grids for E-sciencE EGEE-II INFSO-RI Running Genetic Programming with Grid computing Reverse Polish Notation The population size is 500 individuals; Crossover and mutation probabilities are 60% and 20% respectively. Every generation, 20 best individuals are copied as they are (without crossover and mutation) and half population is generated randomly and replace the worse individuals. Algebraic operators have been used with kinematics data. The service we have distributed in grid was fitness evaluation, in parallel by many WN. 482,303,947 BaBars detector events and 20,489,668 MC events

Enabling Grids for E-sciencE EGEE-II INFSO-RI Training GP to obtain NPDF Monte Carlo (MC) generators integrates particle decays models with detectors system transfer function. MC events contain all information from each track particle and gamma radiation, which allows select high purity training dataset (96%+). Events with real neutral pion were selected and marked as 1. Events without real pions into MC truth and invariant mass reconstruction in the same region of real neutral pions where also selected and marked as 0.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Energy cuts all gammas without energy cut (60,000 real and background records for training, and 60,000 real and background for test), more energetic than 30 MeV electronics noise threshold (32,000 real and background records for training and test), more energetic than 50 MeV (15,000 real and background records for training and test), more energetic than 30MeV, lateral moment between 0.0 and 0.8, and have hit more than one crystal in the electromagnetic calorimeter - the conventional cut for neutral pion(16,000 real and background records for training and test).

Enabling Grids for E-sciencE EGEE-II INFSO-RI α: Sensitivity or efficiency. -β: specificity or purity. -γ: accuracy. NPDF Final results

Enabling Grids for E-sciencE EGEE-II INFSO-RI Neutral Pion Energy Distribution Cumulative plot of energy distribution for 1, 2, 3 and 4 neutral pion decays using all gammas NPDF. Contamination effect can be seen from MC energy distribution. The agreement between Monte Carlo and experimental data is conclusive about methods convergence and accuracy.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Hadronic tau decays results

Enabling Grids for E-sciencE EGEE-II INFSO-RI Summary Available since GridPP11 - September/2004 : Several benchmarks with BaBar experiment data : Data Gridification: –Particle identification: –Neutral pion decays: –Search for anti deuteron: Functional gridification: –Evolutionary neutral pion discriminate function: Documentation (main web page): html files and 327 complementary files 60 CPUs production and 10 CPUs development farms running independently without any problem between November/2005 and September /2006.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Dissemination 20 international events: 5 refereed papers Int. Conferences. GridPP stand at IoP2006 and IoP2007. Contributions at GridPP web pages.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Further development in LHC: Higgs to +0j H +0j

Enabling Grids for E-sciencE EGEE-II INFSO-RI Conclusion EasyGrid is a framework for distributed analysis that works very well providing task and functional gridification capabilities. Genetic programming approach obtains neutral pion discriminate function to discern between background and real neutral pion particles. Background can produce a critical influence in systematic errors and constrain qualitative analysis. Results from hadronic tau decays analyzed in this paper showed genetic programming discriminate function has an important role in background reduction, improving analysis quality. The use of NPDF will allow the study of observable and check with values obtained from theoretical Standard Model, from a sample of events with high purity.