CMS on the Grid: Toward a Fully Distributed Physics Analysis. Vincenzo Innocente, CERN/EP. Beauty 2002.

Presentation transcript:

Slide 1: CMS on the Grid
Vincenzo Innocente, CERN/EP
Toward a fully distributed Physics Analysis

Slide 2: Computing Architecture: Challenges at LHC
- Bigger experiment, higher rate, more data
- Larger and more dispersed user community performing non-trivial queries against a large event store
- Make best use of new IT technologies
- Increased demand for both flexibility and coherence:
  - ability to plug in new algorithms
  - ability to run the same algorithms in multiple environments
  - guarantees of quality and reproducibility
  - high performance, user-friendliness

Slide 3: Challenges: Complexity
- Detector: ~2 orders of magnitude more channels than today
- Triggers must correctly choose only 1 event in every 400,000
- Level 2 & 3 triggers are software-based (must be of the highest quality)
- Computing resources will not be available in a single location

Slide 4: Challenges: Geographical Spread
- 1700 physicists, 150 institutes, 32 countries
- CERN member states: 55%; non-member states (NMS): 45%
- Major challenges associated with:
  - communication and collaboration at a distance
  - distributed computing resources
  - remote software development and physics analysis

Slide 5: b Physics: a Challenge for CMS Computing
- A large distributed effort already today: ~150 physicists in the CMS heavy-flavor group, >40 institutions involved
- Requires precise and specialized algorithms for vertex reconstruction and particle identification
- Most CMS triggered events include B particles
- High-level software triggers select exclusive channels in events triggered in hardware using inclusive conditions
- Challenges:
  - allow remote physicists to access detailed event information
  - migrate reconstruction and selection algorithms effectively to the High Level Trigger

Slide 6: CMS Experiment Data Analysis [data-flow diagram]
The diagram connects the Event Filter and Object Formatter, Detector Control, Online Monitoring (environmental data store), quasi-online Reconstruction, the Simulation store, Data Quality and Calibrations, Group Analysis and on-demand User Analysis. Each component requests parts of events from, and stores reconstructed objects and calibrations in, a Persistent Object Store Manager backed by a Database Management System; the end product is the physics paper.

Slide 7: Analysis Environments
- Real-time event filtering and monitoring: data-driven pipeline; high reliability; pre-emptive
- Simulation, reconstruction and event classification: massively parallel batch-sequential processing; excellent error recovery and rollback mechanisms; excellent scheduling and bookkeeping systems
- Interactive statistical analysis: rapid application development environment; excellent visualization and browsing tools; human-"readable" navigation

Slide 8: Analysis Model
Hierarchy of processes (experiment, analysis groups, individuals):
- Reconstruction: experiment-wide activity (10^9 events); re-processing 3 times per year, driven by new detector calibrations or new understanding; ~3000 SI95 s/event (Monte Carlo production: ~5000 SI95 s/event); of order 1 job over the year plus 3 re-processing jobs per year
- Selection: activity of ~20 analysis groups (10^9 -> 10^7 events); iterative selection, once per month; trigger-based and physics-based refinements; ~25 SI95 s/event, ~20 jobs per month
- Analysis: ~25 individuals per group (10^6-10^8 events); different physics cuts and MC comparison, ~1 pass per day; algorithms applied to data to get results; ~10 SI95 s/event, ~500 jobs per day
(1 GHz CPU ~ 50 SI95)
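To make the scale concrete, here is a back-of-the-envelope calculation using only the figures quoted on this slide, plus (as an assumption) the 10,000-CPU fleet quoted on the next slide:

```python
# Back-of-the-envelope estimate of one experiment-wide reconstruction pass,
# using the figures quoted on the slide (3000 SI95*s/event, 1 GHz ~ 50 SI95).
EVENTS_PER_PASS = 1e9            # experiment-wide activity
SI95_SEC_PER_EVENT = 3000.0      # reconstruction cost per event
SI95_PER_1GHZ_CPU = 50.0         # 1 GHz CPU ~ 50 SI95 (from the slide)

cpu_seconds = EVENTS_PER_PASS * SI95_SEC_PER_EVENT / SI95_PER_1GHZ_CPU
cpu_years = cpu_seconds / (365.0 * 24 * 3600)
print("CPU time per pass: %.1e s at 1 GHz (~%.0f CPU-years)" % (cpu_seconds, cpu_years))

# Assumed fleet: the 10,000 CPUs of the 2007 baseline on the next slide.
N_CPUS = 10000
print("Wall time on %d CPUs: ~%.0f days" % (N_CPUS, cpu_seconds / N_CPUS / 86400.0))
```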

Slide 9: Data Handling Baseline
CMS computing in year 2007:
- Data model: typical objects of 1 kB - 1 MB
- 3 PB of storage space, 10,000 CPUs
- 31 sites: 1 Tier-0 + 5 Tier-1 + 25 Tier-2, all over the world
- I/O rates, disk -> CPU: 10,000 MB/s aggregate, average 1 MB/s per CPU
  - RAW -> ESD generation: ~0.2 MB/s I/O per CPU
  - ESD -> AOD generation: ~5 MB/s I/O per CPU
  - AOD analysis into histograms: ~0.2 MB/s I/O per CPU
  - DPD generation from AOD and ESD: ~10 MB/s I/O per CPU
- Wide-area I/O capacity: of order 700 MB/s aggregate over all payload intercontinental TCP/IP streams
- This implies a system with heavy reliance on access to site-local (cached) data: a Data Grid
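A quick check of why the slide concludes that heavy reliance on site-local cached data is unavoidable, using only the numbers above:

```python
# Aggregate disk->CPU bandwidth vs. wide-area capacity, from the slide's figures.
n_cpus = 10000
avg_io_per_cpu_mb_s = 1.0        # average disk->CPU I/O per CPU
wan_capacity_mb_s = 700.0        # aggregate intercontinental payload capacity

local_io_mb_s = n_cpus * avg_io_per_cpu_mb_s   # = 10,000 MB/s, as quoted
print("Aggregate local I/O: %.0f MB/s" % local_io_mb_s)
print("Wide-area capacity:  %.0f MB/s" % wan_capacity_mb_s)
print("Only ~%.0f%% of the I/O could be served over the WAN" %
      (100.0 * wan_capacity_mb_s / local_io_mb_s))
```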

Slide 10: Prototype Computing Installation (T0/T1) [figure]

Slide 11: Three Computing Environments, Different Challenges
- Centralized quasi-online processing: keep up with the rate; validate and distribute data efficiently
- Distributed organized processing: automation
- Interactive chaotic analysis: efficient access to data and "metadata"; management of "private" data; rapid application development

Slide 12: Migration
- Today's Nobel prize becomes tomorrow's trigger (and the day after's background)
- Boundaries between running environments are fuzzy:
  - "physics analysis" algorithms should migrate up to the online to make the trigger more selective
  - robust batch systems should be made available for physics analysis of large data samples
  - the results of offline calibrations should be fed back to the online to make the trigger more efficient

Slide 13: The Final Challenge: A Coherent Analysis Environment
- Beyond the interactive analysis tool (user point of view): data analysis & presentation: n-tuples, histograms, fitting, plotting, ...
- A great range of other activities with fuzzy boundaries (developer point of view):
  - batch and interactive work, from "pointy-clicky" to Emacs-like power tools to scripting
  - setting up configuration management tools, application frameworks and reconstruction packages
  - data store operations: replicating entire data stores; copying runs, events, event parts between stores; not just copying but also doing something more complicated (filtering, reconstruction, analysis, ...)
  - browsing data stores down to object detail level
  - 2D and 3D visualisation
  - moving code across final analysis, reconstruction and triggers
- Today this involves (too) many tools

Slide 14: Requirements on Data Processing
- High efficiency: processing-site hardware optimization; processing-site software optimization (job structure depends very much on the hardware setup)
- Data quality assurance: data validation; data history (job bookkeeping)
- Automation: input data discovery; crash recovery
- Resource monitoring: identify bottlenecks and fragile components

Slide 15: Analysis Part
- Physics data analysis will be done by hundreds of users
- The analysis system is connected to the same catalogs: maintain a global view of all data
- Big analysis jobs can use the production job-handling mechanisms
- Analysis services based on tags

Slide 16: [annotated screenshot of an interactive analysis session]
- Lizard Qt plotter showing an ANAPHE histogram, extended with pointers to CMS events
- Emacs used to edit a CMS C++ plugin that creates and fills the histograms
- OpenInventor-based display of the selected event
- Python shell with Lizard & CMS modules

Slide 17: Varied Components and Data Flows, One Portal [architecture diagram]
- Production system and data repositories (Tier 0/1/2), ORCA analysis farm(s) (or a distributed "farm" using grid queues), RDBMS-based data warehouse(s), PIAF/PROOF-type analysis farm(s) (Tier 1/2), and the user's local disk (Tier 3/4/5)
- TAG and AOD extraction/conversion/transport services, data-extraction web service(s) and query web service(s) connect them
- Clients: a local analysis tool (Lizard/ROOT/...), a tool plugin module, or a web browser
- Flows shown: production data flow, user TAGs/AODs data flow, and physics query flow

Slide 18: CMS TODAY: Home-Made Tools
- Data production and analysis exercises; granularity (data product): the dataset (a simulated physics channel)
- Development and deployment of a distributed data processing system (hardware & software)
- Test and integration of Grid middleware prototypes
- R&D on distributed interactive analysis

Slide 19: Current CMS Production [processing-chain diagram]
- Pythia generation -> HEPEVT ntuples
- CMSIM (GEANT3) simulation -> Zebra (FZ) files with hits; OSCAR/COBRA (GEANT4) as the alternative simulation
- ORCA/COBRA ooHit formatter -> hits in the Objectivity database
- ORCA/COBRA digitization (merging signal and pile-up) -> Objectivity database
- ORCA user analysis -> ntuples or ROOT files; IGUANA interactive analysis

Slide 20: CMS Production Stream

Step | Task           | Application | Input   | Output             | Non-standard resource requirements
1    | Generation     | Pythia      | None    | Ntuple             | static link; geometry files; storage
2    | Simulation     | CMSIM       | Ntuple  | FZ file            | (as for step 1)
3    | Hit formatting | ORCA H.F.   | FZ file | DB                 | shared libs; full CMS environment; storage
4    | Digitization   | ORCA Digi.  | DB      | DB                 | (as for step 3)
5    | User analysis  | ORCA User   | DB      | Ntuple or ROOT file | shared libs; full CMS environment; distributed input

Slide 21: CMS Distributed Production Tools
- RefDB: production flow manager; web portal with a MySQL backend
- IMPALA (Intelligent Monte Carlo Production Local Actuator): job scheduler; "to-do" discovery, job decomposition, script assembly from templates; error recovery and re-submission (a sketch of the decomposition step follows slide 22 below)
- BOSS (Batch Object Submission System): job control, monitoring and tracking; an envelope script filters the job's output stream and logs it into a MySQL backend (see the sketch below)
- DAR: distribution of software in binary form (shared libraries and binaries)
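A minimal sketch of what a BOSS-style envelope script does is given below. The "key=value" output convention, the table schema and the use of SQLite in place of the MySQL backend are illustrative assumptions, not BOSS's actual interfaces.

```python
import sqlite3
import subprocess
import sys

def run_wrapped(job_id, cmd, db_path="boss_standin.db"):
    """Run a job, filter its output stream for 'key=value' lines and log them."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS job_info (job_id INT, key TEXT, value TEXT)")

    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            universal_newlines=True)
    for line in proc.stdout:
        sys.stdout.write(line)                 # pass the stream through untouched
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            db.execute("INSERT INTO job_info VALUES (?, ?, ?)",
                       (job_id, key.strip(), value.strip()))
    rc = proc.wait()
    db.execute("INSERT INTO job_info VALUES (?, 'exit_code', ?)", (job_id, str(rc)))
    db.commit()
    db.close()
    return rc

if __name__ == "__main__":
    # Wrap a trivial stand-in "job" that reports how many events it processed.
    run_wrapped(42, [sys.executable, "-c", "print('events_processed=1000')"])
```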

Slide 22: Current Data Processing [workflow diagram]
- A request such as "Produce events for dataset mu_MB2mu_pt4" is entered in the production RefDB through the production interface
- The production manager distributes tasks to the Regional Centers (RCs)
- At each RC, IMPALA decomposes the assignment into job scripts and submits them to the RC farm; BOSS records job information in its database and IMPALA monitors progress
- Job output goes to farm storage; a summary file is returned for the request, and data location is tracked through the production DB
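The job-decomposition step referred to above can be illustrated by the sketch below; the template variables, the job size and the run_cmsim_chain command are invented for illustration and are not IMPALA's real templates.

```python
TEMPLATE = """#!/bin/sh
# auto-generated job script (illustrative template, not IMPALA's)
DATASET={dataset}
RUN={run}
FIRST_EVENT={first}
N_EVENTS={n}
run_cmsim_chain --dataset $DATASET --run $RUN --skip $FIRST_EVENT --events $N_EVENTS
"""

def decompose(dataset, total_events, events_per_job=500, first_run=1):
    """Split a production request into per-job scripts, IMPALA-style."""
    scripts = []
    run = first_run
    for first in range(0, total_events, events_per_job):
        n = min(events_per_job, total_events - first)
        scripts.append(TEMPLATE.format(dataset=dataset, run=run, first=first, n=n))
        run += 1
    return scripts

if __name__ == "__main__":
    jobs = decompose("mu_MB2mu_pt4", total_events=2000)
    print("%d job scripts generated; first one:\n%s" % (len(jobs), jobs[0]))
```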

Slide 23: Production 2002, Complexity
- Number of Regional Centers: 11
- Number of computing centers: 21
- Number of CPUs: ~1000 (largest local center: 176 CPUs)
- Production passes per dataset (including analysis-group processing done by production): 6-8
- Number of files: ~11,000
- Data size (not including FZ files from simulation): 17 TB
- File transfer by GDMP and by Perl scripts over scp/bbcp: 7 TB toward Tier-1s, 4 TB toward Tier-2s

Slide 24: Spring02 CPU Resources [pie chart]
- Shares: Wisconsin 18%, INFN 18%, CERN 15%, IN2P3 10%, Moscow 10%, FNAL 8%, RAL 6%, IC 6%, UFL 5%, Caltech 4%, UCSD 3%, Bristol 3%, HIP 1%
- About 700 active CPUs, plus 400 CPUs to come

Slide 25: INFN-Legnaro Tier-2 Prototype [hardware layout diagram]
- Network: 32-port Fast Ethernet switches uplinked to a Gigabit Ethernet (1000BT) switch; WAN link 34 Mbps in 2002
- Computational nodes (Nx): dual PIII 1 GHz, 512 MB RAM, 3 x 75 GB EIDE disks plus 1 x 20 GB for the OS
- Disk server nodes (Sx): dual PIII 1 GHz, dual PCI (33 MHz/32-bit and 66 MHz/64-bit), 512 MB RAM, 3 x 75 GB EIDE RAID 0-5 disks (expandable up to 10), 1 x 20 GB OS disk
- 2002 scale: N nodes totalling 70 CPUs, 3500 SI95 and 8 TB, plus S servers totalling 1100 SI95; expandable up to 190 nodes

Slide 26: ORCA DB Structure [diagram]
- One CMSIM job produces an FZ file (~2 GB/file) that is oo-formatted into multiple databases: MC info containers, Calo/Muon hits and Tracker hits ooHit DBs
- MC info from N runs (Run1, Run2, Run3, ...) is concatenated; multiple sets of ooHits are concatenated into a single DB file
- Typical per-event sizes quoted: ~300 kB/event, ~200 kB/event, ~100 kB/event, and a few kB/event for the MC info
- Physical and logical DB structures diverge...

Slide 27: Production Center Setup
- The most critical task is digitization:
  - 300 kB per pile-up event, 200 pile-up events per signal event -> 60 MB of pile-up data per signal event
  - 10 s to digitize one full event on a 1 GHz CPU -> 6 MB/s per CPU (12 MB/s per dual-processor client)
  - up to ~5 clients per pile-up server (~60 MB/s on its Gigabit network card)
  - fast disk access needed on the pile-up server
[diagram: pile-up DB on a pile-up server feeding ~5 dual-processor clients at ~12 MB/s each, ~60 MB/s total]
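The bandwidth figures follow directly from the quoted event sizes and digitization time; a short check:

```python
# Pile-up digitization bandwidth, from the slide's figures.
kb_per_pileup_event = 300.0
pileup_events_per_signal = 200
sec_to_digitize_event = 10.0       # on a 1 GHz CPU
clients_per_server = 5

mb_per_signal_event = kb_per_pileup_event * pileup_events_per_signal / 1000.0
mb_s_per_cpu = mb_per_signal_event / sec_to_digitize_event
print("Pile-up data per signal event: %.0f MB" % mb_per_signal_event)        # ~60 MB
print("I/O per CPU: %.0f MB/s (dual-CPU client: %.0f MB/s)" %
      (mb_s_per_cpu, 2 * mb_s_per_cpu))                                      # 6 / 12 MB/s
print("Per pile-up server (%d dual clients): ~%.0f MB/s" %
      (clients_per_server, clients_per_server * 2 * mb_s_per_cpu))           # ~60 MB/s
```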

Slide 28: CMS TOMORROW: Transition to Grid Middleware
- Use Virtual Data tools for workflow management at the dataset level
- Use the Grid security infrastructure & workload manager
- Deploy a Grid-enabled portal to interactive analysis
- Global monitoring of Grid performance and quality of service
- CMS Grid workshop at CERN, 11-14/6/2002

Slide 29: Toward ONE Grid
- Build a unique CMS-Grid framework (EU+US); EU and US grids are not interoperable today: wait for help from DataTAG, iVDGL and GLUE
- Work in parallel in EU and US
- Main US activities: MOP, Virtual Data System, interactive analysis
- Main EU activities: integration of IMPALA with EDG WP1+WP2 software; batch analysis: user job submission & analysis farm

Slide 30: PPDG MOP System
- PPDG developed the MOP system: it allows CMS production jobs to be submitted from a central location, run at remote locations, and return their results
- Relies on GDMP for replication, Globus GRAM, Condor-G and local queuing systems for job scheduling, and IMPALA for job specification
- Being deployed in the USCMS testbed
- Proposed as the basis for the next CMS-wide production infrastructure
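For illustration only, a Condor-G ("globus" universe) submit description of the kind such a system hands to Condor-G might be generated as sketched below; the gatekeeper host, jobmanager type and file names are placeholders, not taken from MOP.

```python
# Generate an illustrative Condor-G ("globus" universe) submit description
# for one IMPALA-produced job script. Hostnames and file names are made up.
SUBMIT_TEMPLATE = """universe        = globus
globusscheduler = {gatekeeper}/jobmanager-{scheduler}
executable      = {script}
output          = {script}.out
error           = {script}.err
log             = {script}.log
queue
"""

def write_submit_file(script, gatekeeper="tier2.example.edu", scheduler="condor"):
    path = script + ".sub"
    with open(path, "w") as f:
        f.write(SUBMIT_TEMPLATE.format(gatekeeper=gatekeeper,
                                       scheduler=scheduler, script=script))
    return path   # one would then run: condor_submit <path>

if __name__ == "__main__":
    print(open(write_submit_file("impala_job_0001.sh")).read())
```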

Slide 31: Prototype VDG System (production) [architecture diagram]
- The user submits a request, resolved against the RefDB (metadata catalog), the virtual data catalog and the materialized data catalog
- An abstract planner and a concrete planner (MOP / WP1) produce a plan that the executor (MOP / WP1) carries out
- Storage resources and replica management: replica catalog, GDMP, local grid storage, Objectivity
- Compute resources run CMKIN, CMSIM and ORCA/COBRA through wrapper scripts, controlled by BOSS with its local tracking DB

Slide 32: Prototype VDG System (production), continued
Same diagram as the previous slide, with EDG-WP1 named explicitly and a legend marking which components already exist, which are implemented using MOP, and which have no code yet.

Slide 33: IMPALA/BOSS Integration with EDG [diagram]
- In the user environment (UI), CMKIN, IMPALA and DOLLY prepare the jobs; BOSS records them in its MySQL DB, using production requests from the RefDB at CERN
- Jobs are submitted through the EDG Resource Broker (EDG-RB) to a Computing Element (CE), whose batch manager runs the job executer on worker nodes (WN1 ... WNn) sharing data over NFS

Slide 34: Globally Scalable Monitoring Service [architecture diagram]
- Farm monitors gather data by push & pull: rsh & ssh, existing scripts, SNMP
- An RC monitor service and the farm monitors register with lookup services; clients (or other services) locate them through lookup-service discovery
- A proxy layer provides a component factory, GUI marshaling, code transport and RMI data access
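The real service is built on Java technologies (RMI and lookup/discovery), but the register-then-push pattern it describes can be illustrated with a small, self-contained XML-RPC sketch; the method names, port and measured quantities below are invented for this illustration.

```python
# Minimal illustration of the register-then-push monitoring pattern using
# XML-RPC in place of the Java/RMI machinery of the real system.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

REGISTRY = {}   # lookup-service state: farm name -> latest measurements

def register(farm):
    REGISTRY[farm] = {}
    return True

def update(farm, values):
    REGISTRY[farm] = values        # push from a farm monitor
    return True

def lookup(farm):
    return REGISTRY.get(farm, {})  # pull by a client or another service

def run_lookup_service(port=8900):
    server = SimpleXMLRPCServer(("localhost", port), logRequests=False)
    for fn in (register, update, lookup):
        server.register_function(fn)
    threading.Thread(target=server.serve_forever, daemon=True).start()

if __name__ == "__main__":
    run_lookup_service()
    monitor = ServerProxy("http://localhost:8900")   # a farm monitor...
    monitor.register("legnaro-t2")                   # ...registers itself...
    monitor.update("legnaro-t2", {"cpus_busy": 64, "load": 0.91})   # ...and pushes
    client = ServerProxy("http://localhost:8900")    # a client discovers and pulls
    print(client.lookup("legnaro-t2"))
```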

Slide 35: Optimisation of "Tag" Databases
- Tags (n-tuples) are small (~kB) summary objects for each event
- Crucial for fast selection of interesting event subsets; this will be an intensive activity
- Past work concentrated on three main areas:
  - development of Objectivity-based tags integrated with the CMS "COBRA" framework and Lizard
  - investigations of tag bitmap indexing to speed up queries
  - comparisons of OO and traditional databases (SQL Server, Oracle 9i, PostgreSQL) as efficient stores for tags
- New work concentrates on tag-based analysis services
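As a toy illustration of tag-based selection, the sketch below builds a small tag table and runs a selection query; SQLite stands in for the stores compared on the slide, and the tag attributes are invented.

```python
import random
import sqlite3

# Build a small tag "database": one ~kB summary row per event (attributes invented).
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE tags (
    event_id INTEGER PRIMARY KEY, run INTEGER,
    n_muons INTEGER, max_muon_pt REAL, missing_et REAL)""")
rng = random.Random(42)
rows = [(i, 1 + i // 1000, rng.randint(0, 4), rng.expovariate(1 / 15.0),
         rng.expovariate(1 / 20.0)) for i in range(100000)]
db.executemany("INSERT INTO tags VALUES (?, ?, ?, ?, ?)", rows)
db.execute("CREATE INDEX idx_sel ON tags (n_muons, max_muon_pt)")

# Fast selection of an interesting subset, e.g. dimuon candidates with a hard muon.
selected = db.execute("""SELECT event_id FROM tags
                         WHERE n_muons >= 2 AND max_muon_pt > 20.0""").fetchall()
print("selected %d of %d events" % (len(selected), 100000))
# The resulting event_id list would then be used to navigate back to the full event data.
```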

Slide 36: CLARENS: a Portal to the Grid
- Grid-enables the working environment for physicists' data analysis
- Clarens consists of a server communicating with various clients via the commodity XML-RPC protocol (over http/https); this ensures implementation independence
- The server will provide a remote API to Grid tools:
  - the Virtual Data Toolkit: object collection access
  - data movement between Tier centres using GSI-FTP
  - CMS analysis software (ORCA/COBRA)
  - security services provided by the Grid (GSI)
- No Globus is needed on the client side, only a certificate
- The current prototype is running on the Caltech proto-Tier-2
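Because the protocol is commodity XML-RPC over http/https, a client needs little more than a standard library; the sketch below shows the general calling pattern only. The server URL and the catalog/file method names are placeholders, not the actual Clarens API.

```python
from xmlrpc.client import ServerProxy

# Placeholder endpoint; a real deployment would use the Tier-2 host and a
# Grid-certificate-authenticated https connection.
server = ServerProxy("https://clarens.example.org:8443/clarens")

# Against a live server one would then issue ordinary XML-RPC calls: standard
# introspection plus whatever remote API the portal exposes (the method names
# below are invented for illustration):
#   print(server.system.listMethods())
#   collections = server.catalog.list_collections("jpsi_mumu")
#   server.file.request_transfer("lfn:/store/tags/jpsi.db", "caltech-proto-t2")
```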

Slide 37: Clarens Architecture
- A common protocol is spoken by all types of clients to all types of services: implement a service once for all clients; implement client access to a service once per client type, using a protocol already implemented for "all" languages (C++, Java, Fortran, etc.)
- The common protocol is XML-RPC, with SOAP close to working; CORBA is doable but would require a different server above Clarens (it uses IIOP, not HTTP)
- Clarens handles authentication using Grid certificates, connection management, data serialization and, optionally, encryption
- The implementation uses stable, well-known server infrastructure (Apache) that has been debugged and audited over a long period by many people
- The Clarens layer itself is implemented in Python, but can be reimplemented in C++ should performance be inadequate
- More information, along with a web-based demo, is available online

Slide 38: Grid-enabled Analysis
- Sub-event components map to Grid data products
- Balance the load between network and CPU
- The complete data and software base is "virtually" available at the physicist's desktop

Slide 39: Evolution of Computing in CMS
- Ramp up the production systems (30%, +30%, +40% of the cost in successive years)
- Match the available computing power to the LHC luminosity:
  - first phase: ...M reconstructed events/month, 100M re-reconstructed events/month, 30k ev/s analysis
  - later phase: ...M reconstructed events/month, 200M re-reconstructed events/month, 50k ev/s analysis
  (the leading figures for the reconstruction rates were lost in the transcript)
- Old schedule shown; the new one is stretched by 15 more months

Slide 40: Grid-enabled Analysis [layered diagram]
- Consistent user interface: federation wizards, detector/event display, data browser, analysis job wizards, generic analysis tools
- Coherent set of basic tools and mechanisms: ORCA, FAMOS, OSCAR, COBRA, POMtools, CMStools
- Software development and installation on top of the GRID: distributed data store & computing infrastructure

Slide 41: Simulation, Reconstruction & Analysis Software System [layered diagram]
- Physics modules: reconstruction algorithms, data monitoring, event filter, physics analysis; calibration objects, event objects, configuration objects
- Specific frameworks sit on a generic application framework, with adapters and extensions to the basic services
- Basic services: ODBMS, Geant3/4, CLHEP, PAW replacement, C++ standard library, extension toolkit
- Grid extension: Grid-aware data products and a Grid-enabled application framework, uploadable on the Grid

Slide 42: Reconstruction on Demand [object diagram]
- The analysis asks the Event for high-level objects (tracks T1 and T2, calorimeter clusters CaloCl); each is produced on request by its reconstructor (Rec T1, Rec T2, Rec CaloCl, Rec Hits) starting from hits and detector elements
- Example use: compare the results of two different track reconstruction algorithms (T1 vs. T2)
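The "on demand" behaviour sketched here is essentially lazy evaluation with caching; a minimal illustration follows (the class and product names are invented, not COBRA's).

```python
class Event:
    """Reconstructed products are computed only when first requested, then cached."""
    def __init__(self, hits):
        self.hits = hits
        self._reconstructors = {}   # product name -> callable producing it
        self._cache = {}

    def register(self, name, reconstructor):
        self._reconstructors[name] = reconstructor

    def get(self, name):
        if name not in self._cache:                   # reconstruct on demand
            self._cache[name] = self._reconstructors[name](self)
        return self._cache[name]

# Two alternative track reconstructors (trivial stand-ins for T1 and T2):
def reco_tracks_t1(event):
    return ["T1-track(%s)" % h for h in event.hits if h % 2 == 0]

def reco_tracks_t2(event):
    return ["T2-track(%s)" % h for h in event.hits if h % 3 == 0]

if __name__ == "__main__":
    ev = Event(hits=range(10))
    ev.register("TracksT1", reco_tracks_t1)
    ev.register("TracksT2", reco_tracks_t2)
    # Analysis code comparing the two algorithms, triggering reconstruction on demand:
    print(len(ev.get("TracksT1")), "vs", len(ev.get("TracksT2")))
```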

Slide 43: Conclusions
- CMS considers the Grid the enabling technology for the effective deployment of a coherent and consistent data processing environment; this is the only basis for an efficient physics analysis program at the LHC
- The "Spring 2002" production has just finished successfully; distributed analysis has started
- Making use of Grid middleware is the next milestone
- CMS is engaged in an active development, test and deployment program of all the software and hardware components that will constitute the future LHC Grid