UK Tony Doyle - University of Glasgow. title.open(); revolution {execute}; LHC Computing Challenge: Methodology? Hierarchical Information in a Global Grid.

Presentation transcript:

UK Tony Doyle - University of Glasgow. title.open(); revolution {execute}; LHC Computing Challenge: Methodology? Hierarchical Information in a Global Grid. Supernet. Aspiration? HIGGS. DataGRID-UK. Aspiration? ALL data-intensive computation. Teamwork.

UK Tony Doyle - University of Glasgow Outline: Starting Point; The LHC Computing Challenge; Data Hierarchy; DataGRID; Analysis Architectures; GRID Data Management; Industrial Partnership; Regional Centres; Today's World; Tomorrow's World; Summary.

UK Tony Doyle - University of Glasgow Starting Point

UK Tony Doyle - University of Glasgow Starting Point: Current technology would not be able to scale to such data volumes, which is where the teams at Glasgow and Edinburgh Universities come in. The funding awarded will enable the scientists to prototype a Scottish Computing Centre which could develop the computing technology and infrastructure needed to cope with the high levels of data produced in Geneva, allowing the data to be processed, transported, stored and mined. Once scaled down, the data will be distributed for analysis by thousands of scientists around the world. The project will involve participation from Glasgow University's Physics & Astronomy and Computing Science departments, Edinburgh University's Physics & Astronomy department and the Edinburgh Parallel Computing Centre, and is funded by the Scottish Higher Education Funding Council's (SHEFC) Joint Research Equipment Initiative. It is hoped that the computing technology developed during the project will have wider applications in the future, with possible uses in astronomy, computing science and genomics, as well as providing generic technology and software for the next-generation Internet.

UK Tony Doyle - University of Glasgow The LHC Computing Challenge. [Images: detector for the ALICE experiment; detector for the LHCb experiment.]

UK Tony Doyle - University of Glasgow A Physics Event: the gated electronics response from a proton-proton collision. Raw data: hit addresses, digitally converted charges and times. Each event is marked by a unique code: proton bunch crossing number, RF bucket and event number. Events are collected, processed, analysed and archived. A variety of data objects become associated as the event migrates through the analysis chain: it may be reprocessed, selected for various analyses, and replicated to various locations.
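As a rough illustration only (not the actual ATLAS event model), the unique event code and raw-data content described above can be pictured as a small record keyed by bunch crossing and event number; every field and class name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass(frozen=True)
class EventID:
    """Unique code attached to each gated readout (illustrative)."""
    bunch_crossing: int   # proton bunch crossing number
    rf_bucket: int        # RF bucket within the crossing
    event_number: int     # sequential event number

@dataclass
class RawEvent:
    """Raw data: hit addresses with digitised charges and times."""
    event_id: EventID
    hits: List[Tuple[int, int, int]] = field(default_factory=list)  # (address, adc, tdc)

    def add_hit(self, address: int, adc: int, tdc: int) -> None:
        self.hits.append((address, adc, tdc))

# One collision record about to migrate into the analysis chain
evt = RawEvent(EventID(bunch_crossing=123456, rf_bucket=3, event_number=42))
evt.add_hit(address=0x1A2B, adc=517, tdc=88)
print(evt.event_id, len(evt.hits), "hits")
```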

UK Tony Doyle - University of Glasgow LHC Computing Model: hierarchical, distributed tiers; the GRID ties the distributed resources together. [Diagram: Tier-0 (CERN), Tier-1 (RAL), Tier-2 (ScotGRID) and the universities, connected by dedicated or QoS network links.]

UK Tony Doyle - University of Glasgow Data Structure (coordination required at collaboration and group levels). [Diagram: the Trigger System and Data Acquisition (Level 3 trigger) produce Raw Data and Trigger Tags; Reconstruction, using Calibration Data and Run Conditions, produces Event Summary Data (ESD) and Event Tags. On the simulation side, Physics Models feed Monte Carlo Truth Data, Detector Simulation produces MC Raw Data, and Reconstruction yields MC Event Summary Data and MC Event Tags.]

UK Tony Doyle - University of Glasgow Physics Analysis. [Diagram: Event Tags drive Event Selection over ESD (data or Monte Carlo); Analysis and Skims, using Calibration Data, produce Analysis Object Data (AOD), from which Physics Objects are derived for Physics Analysis. Raw Data and ESD sit at Tier 0/1 (collaboration-wide), AOD at Tier 2 (analysis groups), Physics Objects at Tiers 3/4 (individual physicists); the chain is annotated "increasing data flow".]

UK Tony Doyle - University of Glasgow ATLAS Parameters. Running conditions at startup: raw event size ~2 MB (recently revised upwards); a 2.7x10^9 event sample gives 5.4 PB/year before data processing; reconstructed events and Monte Carlo data add ~9 PB/year (2 PB of disk); CPU ~2M SpecInt95. CERN alone can handle only about 1/3 of these resources.
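The headline storage number follows from simple arithmetic; the short sketch below just reproduces the multiplication using the figures quoted on the slide (decimal units).

```python
# Raw-data volume implied by the quoted ATLAS startup parameters
raw_event_size_mb = 2.0      # ~2 MB per raw event
events_per_year = 2.7e9      # 2.7 x 10^9 event sample

raw_pb_per_year = raw_event_size_mb * events_per_year / 1e9   # MB -> PB (10^9 MB per PB)
print(f"Raw data: {raw_pb_per_year:.1f} PB/year")             # ~5.4 PB/year

# CERN alone can handle only ~1/3 of the CPU requirement
total_cpu_si95 = 2e6
print(f"CPU needed outside CERN: ~{total_cpu_si95 * 2 / 3:.0f} SpecInt95")
```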

UK Tony Doyle - University of Glasgow Data Hierarchy: RAW, ESD, AOD, TAG.
- RAW (~2 MB/event): recorded by the DAQ; triggered events; detector digitisation.
- ESD, Event Summary Data (~100 kB/event): reconstructed, pseudo-physical information: clusters, track candidates (electrons, muons), etc.
- AOD, Analysis Object Data (~10 kB/event): selected physical information: transverse momentum, association of particles, jets, (best) identification of particles, physical information for relevant objects.
- TAG (~1 kB/event): analysis information relevant for fast event selection.
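A back-of-the-envelope sketch of what those per-event sizes mean for total volume, using the figures from the slide (decimal units; purely illustrative).

```python
# Per-event sizes from the data hierarchy (bytes, decimal units)
TIER_SIZES = {
    "RAW": 2_000_000,   # ~2 MB/event
    "ESD":   100_000,   # ~100 kB/event
    "AOD":    10_000,   # ~10 kB/event
    "TAG":     1_000,   # ~1 kB/event
}

def yearly_volume_pb(n_events: float) -> dict:
    """Total volume per data tier in petabytes for a given event sample."""
    return {tier: n_events * size / 1e15 for tier, size in TIER_SIZES.items()}

for tier, pb in yearly_volume_pb(2.7e9).items():
    print(f"{tier}: {pb:.4f} PB")   # RAW ~5.4 PB; TAG only ~2.7 TB
```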

UK Tony Doyle - University of Glasgow Testbed Database. Object model: ATLAS simulated raw events. Classes: PEvent, PEventObjVector, PEventObj, PSiDetector, PSiDigit, PMDT_Detector, PMDT_Digit, PCaloRegion, PCaloDigit, PTruthVertex, PTruthTrack. [Diagram: a System DB references Raw Data DB1, Raw Data DB2, ...; the Event Container holds PEvent objects (#1, #2, ...), each with a PEventObjVector pointing into Raw Data Containers of PSiDetector/PSiDigit, PTRT_Detector/PTRT_Digit, PMDT_Detector/PMDT_Digit, PCaloRegion/PCaloDigit and PTruthVertex/PTruthTrack objects.]
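A minimal sketch of how the event container and per-detector digit objects listed above might hang together; the class names are taken from the slide, but the structure and attributes are guesses for illustration only, not the actual testbed schema.

```python
class PEventObj:
    """Base class for per-detector raw-data objects (illustrative)."""

class PSiDigit(PEventObj):
    def __init__(self, strip, charge):
        self.strip, self.charge = strip, charge

class PMDT_Digit(PEventObj):
    def __init__(self, tube, drift_time):
        self.tube, self.drift_time = tube, drift_time

class PEvent:
    """One simulated raw event: an object vector referencing raw-data containers."""
    def __init__(self, number):
        self.number = number
        self.obj_vector = []          # plays the role of PEventObjVector

    def add(self, obj: PEventObj):
        self.obj_vector.append(obj)

# The "event container" is then just an ordered collection of PEvent objects
event_container = [PEvent(1), PEvent(2)]
event_container[0].add(PSiDigit(strip=1042, charge=23))
event_container[0].add(PMDT_Digit(tube=77, drift_time=412))
print(len(event_container[0].obj_vector), "objects in event 1")
```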

UK Tony Doyle - University of Glasgow LHC Computing Challenge. One bunch crossing every 25 ns; 100 triggers per second; each event is ~1 MB. [Diagram of the tier model: Online System; Offline Farm (~20 TIPS); CERN Computer Centre (>20 TIPS) as Tier 0; Tier 1: RAL, US, French and Italian regional centres; Tier 2: centres such as ScotGRID++ (~1 TIPS); Tier 3: institute servers (~0.25 TIPS); Tier 4: physicists' workstations. Data rates: ~PB/s from the detector into the online system, ~100 MB/s into the offline farm, ~Gbit/s (or air freight) to the regional centres, ~Gbit/s to Tier 2, Mbit/s to workstations. Physicists work on analysis channels: each institute has ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server (physics data cache). 1 TIPS = 25,000 SpecInt95; a PC (1999) is ~15 SpecInt95.]
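Using the conversion given on the slide (1 TIPS = 25,000 SpecInt95, a 1999-era PC at ~15 SpecInt95), a quick sketch of what the quoted tier capacities mean in commodity PCs; the capacities listed are the ones on the slide, the PC counts are just arithmetic.

```python
SI95_PER_TIPS = 25_000      # 1 TIPS = 25,000 SpecInt95
SI95_PER_PC_1999 = 15       # a 1999-era PC is ~15 SpecInt95

def pcs_needed(tips: float) -> int:
    """Number of 1999-era PCs roughly equivalent to a given TIPS capacity."""
    return round(tips * SI95_PER_TIPS / SI95_PER_PC_1999)

for name, tips in [("Offline farm", 20), ("CERN centre", 20),
                   ("Tier-2 (ScotGRID++)", 1), ("Institute server", 0.25)]:
    print(f"{name:20s} ~{tips:6.2f} TIPS ~ {pcs_needed(tips):6d} PCs")
```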

UK Tony Doyle - University of Glasgow Database Access Benchmark. Many applications require database functionality; MySQL is the currently favoured HEP database application (used e.g. in BaBar and ZEUS software). Tests of the MySQL database daemon: the basic 'crash-me' suite and associated tests, plus access times for basic insert, modify, delete and update operations. On a 256 MB, 800 MHz Red Hat 6.2 Linux box: 350k data insert operations took 149 seconds; 10k query operations took 97 seconds.
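The flavour of such a timing test can be reproduced in a few lines; the sketch below uses the standard-library sqlite3 module in place of the MySQL daemon, so the absolute numbers are not comparable to the 149 s / 97 s figures on the slide, and the table layout is made up for illustration.

```python
import sqlite3, time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, address INTEGER, charge REAL)")

# Timed bulk insert (the slide's test used 350k rows against MySQL)
n_rows = 350_000
t0 = time.time()
conn.executemany("INSERT INTO hits (address, charge) VALUES (?, ?)",
                 ((i % 4096, i * 0.1) for i in range(n_rows)))
conn.commit()
print(f"{n_rows} inserts: {time.time() - t0:.1f} s")

# Timed simple queries by primary key (the slide's test used 10k queries)
t0 = time.time()
for i in range(10_000):
    conn.execute("SELECT charge FROM hits WHERE id = ?", (i + 1,)).fetchone()
print(f"10k queries: {time.time() - t0:.1f} s")
```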

UK Tony Doyle - University of Glasgow CPU Intensive Applications. Numerically intensive simulations with minimal input and output data: ATLAS Monte Carlo (gg -> H -> bb), 228 sec and 3.5 MB per event on an 800 MHz Linux box. Standalone physics applications: 1. simulation of neutron/photon/electron interactions for 3D detector design; 2. NLO QCD physics simulation. Compiler tests, speed in MFlops: Fortran (g77) 27; C (gcc) 43; Java (jdk) 41.
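For orientation, here is a tiny timing harness in the same spirit as the compiler test; this is an illustrative Python loop, not the Fortran/C/Java benchmark behind the MFlops figures on the slide, so its result is dominated by interpreter overhead.

```python
import time

def mflops(n: int = 5_000_000) -> float:
    """Rough floating-point rate from a simple multiply-add loop (illustrative)."""
    x = 1.0
    t0 = time.time()
    for _ in range(n):
        x = x * 1.0000001 + 1e-9     # two floating-point operations per iteration
    dt = time.time() - t0
    return 2 * n / dt / 1e6

print(f"~{mflops():.0f} MFlops (interpreter overhead dominates)")
```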

UK Tony Doyle - University of Glasgow Network Monitoring. Prototype tools: Java Analysis Studio over TCP/IP; scalable architecture. [Screenshots: instantaneous CPU usage; individual node information.]

UK Tony Doyle - University of Glasgow Analysis Architecture: the Gaudi framework, developed by LHCb and adopted by ATLAS (as Athena). [Diagram: an Application Manager coordinates Algorithms, which read and write event data through the Event Data Service and a Transient Event Store, detector data through the Detector Data Service and a Transient Detector Store, and histograms through the Histogram Service and a Transient Histogram Store; Converters and Persistency Services move objects between the transient stores and data files; supporting components include the Message Service, JobOptions Service, Particle Properties Service and other services.]
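To make the picture concrete, here is a rough Python sketch of the Gaudi pattern of an algorithm talking to a transient event store; it is illustrative only and does not use the real (C++) Gaudi/Athena API, and all names below are invented.

```python
class TransientEventStore:
    """Holds event data objects for the duration of one event (illustrative)."""
    def __init__(self):
        self._store = {}
    def record(self, key, obj):
        self._store[key] = obj
    def retrieve(self, key):
        return self._store[key]

class Algorithm:
    """Gaudi-style algorithm interface: initialize / execute / finalize."""
    def __init__(self, name, event_store):
        self.name, self.evt = name, event_store
    def initialize(self): pass
    def execute(self): raise NotImplementedError
    def finalize(self): pass

class TrackCounter(Algorithm):
    def execute(self):
        tracks = self.evt.retrieve("ESD/Tracks")        # read from the transient store
        self.evt.record("AOD/NTracks", len(tracks))     # write a derived object back

# An application manager would drive this loop over events; here is the bare idea:
evt = TransientEventStore()
evt.record("ESD/Tracks", ["trk1", "trk2", "trk3"])
alg = TrackCounter("TrackCounter", evt)
alg.initialize(); alg.execute(); alg.finalize()
print(evt.retrieve("AOD/NTracks"))   # -> 3
```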

UK Tony Doyle - University of Glasgow GRID Services. Grid services: resource discovery, scheduling, security, monitoring, data access, policy. Athena/Gaudi services: application manager, job options service, event persistency service, detector persistency, histogram service, user interfaces, visualisation. Database: event model, object federations. Extensible interfaces and protocols are being specified and developed. Tools: 1. UML; 2. Java. Protocols (the DataGRID toolkit): 1. XML; 2. MySQL; 3. LDAP.
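As an example of the LDAP protocol mentioned above, a hedged sketch of how a grid information service might be queried with the python-ldap package (assumed installed); the host name, port, base DN, object class and attribute names are all placeholders, not a real information index.

```python
import ldap  # python-ldap package (assumed installed)

# Hypothetical grid information server; URI, DN and filter are placeholders
server = ldap.initialize("ldap://giis.example.ac.uk:2135")

base_dn = "o=grid"
search_filter = "(objectclass=GridComputeElement)"    # illustrative filter
attributes = ["hn", "freenodes", "totalnodes"]        # illustrative attributes

try:
    # search_s returns a list of (distinguished name, attribute dict) pairs
    for dn, attrs in server.search_s(base_dn, ldap.SCOPE_SUBTREE,
                                     search_filter, attributes):
        print(dn, attrs)
except ldap.LDAPError as err:
    print("Information service query failed:", err)
```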

UK Tony Doyle - University of Glasgow Virtual Data Scenario. Example analysis scenario: a physicist issues a query from Athena for a Monte Carlo dataset. Issues: how expressive is this query? What is its nature (declarative)? Creating new queries and a query language; the algorithms are already available in local shared libraries. An Athena service consults an ATLAS Virtual Data Catalog. Consider the possibilities: the TAG file exists on the local machine (e.g. Glasgow): analyse it. The ESD file exists in a remote store (e.g. Edinburgh): access the relevant event files, then analyse them. The RAW file no longer exists (e.g. at RAL): regenerate, re-reconstruct and re-analyse. This is GRID data management.
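The decision logic of the scenario can be written down directly; the catalogue class and helper functions below are entirely hypothetical stubs, meant only to show the fall-through from TAG to ESD to regeneration.

```python
class VirtualDataCatalog:
    """Toy catalogue recording which derived datasets currently exist (hypothetical)."""
    def __init__(self, tag=None, esd=None, recipe=None):
        self.tag, self.esd, self.recipe = tag, esd, recipe

# Stand-in helpers so the sketch runs; real implementations would call grid services
def run_analysis(data):      return f"analysed {data}"
def fetch_events(location):  return f"events from {location}"
def regenerate(recipe):      return f"raw data regenerated via {recipe}"
def reconstruct(raw):        return f"esd from {raw}"

def analyse(catalog: VirtualDataCatalog) -> str:
    """Use the cheapest derivation that still exists; regenerate as a last resort."""
    if catalog.tag:                      # TAG file on the local machine (e.g. Glasgow)
        return run_analysis(catalog.tag)
    if catalog.esd:                      # ESD file in a remote store (e.g. Edinburgh)
        return run_analysis(fetch_events(catalog.esd))
    # RAW file no longer exists (e.g. RAL): regenerate, re-reconstruct, re-analyse
    return run_analysis(reconstruct(regenerate(catalog.recipe)))

print(analyse(VirtualDataCatalog(esd="store://edinburgh/mc/H_bb.esd")))
```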

UK Tony Doyle - University of Glasgow Globus

UK Globus Data GRID Tool Kit

UK Tony Doyle - University of Glasgow GRID Data Management. Goal: develop middleware infrastructure to manage petabyte-scale data. [Diagram: a secure region containing high-level services, medium-level services and core services.] The service levels are reasonably well defined; the next step is to identify key areas within the software structure.

UK Tony Doyle - University of Glasgow Identifying Key Areas (RAL). Five areas for development:
- Data Accessor: hides specific storage-system requirements (Mass Storage Management group); see the sketch below.
- Replication: improves access by wide-area caching; the Globus toolkit offers sockets and a communication library, Nexus.
- Meta Data Management: data catalogues, monitoring information (e.g. access patterns), grid configuration information, policies; MySQL over the Lightweight Directory Access Protocol (LDAP) is being investigated.
- Security: ensuring consistent levels of security for data and metadata.
- Query Optimisation: cost minimisation based on response time and throughput (Monitoring Services group).
Identifiable UK contributions (RAL).
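As a sketch of the Data Accessor idea, the interface below hides whether a file lives on local disk or in a mass store behind one staging call; the class names and back-ends are hypothetical illustrations, not DataGrid middleware.

```python
from abc import ABC, abstractmethod
import shutil

class DataAccessor(ABC):
    """Hides the specific storage system behind a single staging interface."""
    @abstractmethod
    def stage_to_local(self, logical_name: str, dest: str) -> str: ...

class LocalDiskAccessor(DataAccessor):
    def __init__(self, root):
        self.root = root
    def stage_to_local(self, logical_name, dest):
        # Plain filesystem copy when the data are already on local disk
        return shutil.copy(f"{self.root}/{logical_name}", dest)

class MassStoreAccessor(DataAccessor):
    """Placeholder for a tape or mass-storage back-end staged via a cache."""
    def __init__(self, stager):
        self.stager = stager
    def stage_to_local(self, logical_name, dest):
        return self.stager.stage(logical_name, dest)   # delegate to the storage system

def open_dataset(accessor: DataAccessor, logical_name: str):
    """Client code never needs to know which storage system sits behind the accessor."""
    path = accessor.stage_to_local(logical_name, "/tmp")
    return open(path, "rb")
```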

UK Tony Doyle - University of Glasgow AstroGrid work packages (emphasis on high-level GUIs etc.):
WP1 Project Management
WP2 Requirements Analysis: existing functionality and future requirements; community consultation
WP3 System Architectures: benchmark and implement
WP4 Grid-Enable Current Packages: implement and test performance
WP5 Database Systems: requirements analysis and implementation; scalable federation tools
WP6 Data Mining Algorithms: requirements analysis, development and implementation
WP7 Browser Applications: requirements analysis and software development
WP8 Visualisation: concepts and requirements analysis, software development
WP9 Information Discovery: concepts and requirements analysis, software development
WP10 Federation of Key Current Datasets: e.g. SuperCOSMOS, INT-WFS, 2MASS, FIRST, 2dF
WP11 Federation of Next-Generation Optical-IR Datasets: esp. Sloan, WFCAM
WP12 Federation of High Energy Astrophysics Datasets: esp. Chandra, XMM
WP13 Federation of Space Plasma and Solar Datasets: esp. SOHO, Cluster, IMAGE
WP14 Collaborative Development of VISTA, VST and TERAPIX Pipelines
WP15 Collaboration Programme with International Partners
WP16 Collaboration Programme with Other Disciplines
DataGrid work packages with UK contacts (emphasis on low-level services etc.):
WP1 Grid Workload Management, A. Martin, QMW (0.5)
WP2 Grid Data Management, A. Doyle, Glasgow (1.5)
WP3 Grid Monitoring Services, R. Middleton, RAL (1.8)
WP4 Fabric Management, A. Sansum, RAL (0.5)
WP5 Mass Storage Management, J. Gordon, RAL (1.5)
WP6 Integration Testbed, D. Newbold, Bristol (3.0)
WP7 Network Services, P. Clarke, PPNCG/UCL (2.0)
WP8 HEP Applications, N/A (?) (4.0)
WP9 EO Science Applications (c/o R. Middleton, RAL) (0.0)
WP10 Biology Applications (c/o P. Jeffreys, RAL) (0.1)
WP11 Dissemination, P. Jeffreys, RAL (0.1)
WP12 Project Management, R. Middleton, RAL (0.5)
[Diagram annotations: Replication; Fragmentation.]

UK Tony Doyle - University of Glasgow GRID Culture. Testbed = learning by example + cloning. SRIF expansion = expansion of open-source ideas.

UK Tony Doyle - University of Glasgow Partnership Important. Mission: to accelerate the exploitation of simulation by industry, commerce and academia. 45 staff, £2.5M turnover, externally funded. Solve business problems, not sell technology.

UK Tony Doyle - University of Glasgow Industrial Partnership. [Diagram: a ping service and ping monitor spanning WAN and LAN.] Adoption of open industry standards and OO methods. Industry + Research Council; inspiration: data-intensive computation.

UK Tony Doyle - University of Glasgow Regional Centres. SRIF infrastructure: grid data management, security, monitoring, networking. Local perspective: consolidate research computing. Global perspective: a very basic grid skeleton; a regional expertise model? Optimisation of the number of nodes: 4-5? Relative size depends on funding dynamics.

UK Tony Doyle - University of Glasgow Today's World. [Diagram of today's partners, including Istituto Trentino di Cultura, Helsinki Institute of Physics, the Science Research Council and SARA.]

UK Tony Doyle - University of Glasgow Tomorrow's World. [Diagram of the expanded partner structure: nodes labelled CO, CR2-CR6 and AC7-AC21, with Istituto Trentino di Cultura, Helsinki Institute of Physics, the Science Research Council and SARA attached to the CR nodes.]

UK Tony Doyle - University of Glasgow Summary. General engagement (£ = OK). Mutual interest (the ScotGRID example). Emphasis on: DataGrid core development (e.g. Grid Data Management), with a CERN lead plus a unique UK identity; extension of the open-source idea; Grid culture = academia + industry; a multidisciplinary approach = university + regional basis; use of existing structures (e.g. EPCC, RAL); hardware infrastructure via SRIF + industrial sponsorship. Now → LHC: grid data management, security, monitoring, networking. [Images: detector for the ALICE experiment; detector for the LHCb experiment; ScotGRID.]