NA4 Applications F.Harris(Oxford/CERN) NA4/HEP coordinator

Slides:



Advertisements
Similar presentations
31/03/00 CMS(UK)Glenn Patrick What is the CMS(UK) Data Model? Assume that CMS software is available at every UK institute connected by some infrastructure.
Advertisements

An overview of the EGEE project Bob Jones EGEE Technical Director DTI International Technology Service-GlobalWatch Mission CERN – June 2004.
EGEE is a project funded by the European Union under contract IST GENIUS and GILDA Roberto Barbera EGEE NA4 Generic Applications coordinator.
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Application Use Cases NIKHEF, Amsterdam, December 12, 13.
Enabling, facilitating and delivering quality training in the UK and Internationally The challenge of grid training and education David Fergusson, Deputy.
EGEE (EU IST project) – Networking Activities 4: biomedical applications NA4 Biomedical Activities Johan Montagnat First EGEE Meeting Cork,
GRACE Project IST EGAAP meeting – Den Haag, 25/11/2004 Giuseppe Sisto – Telecom Italia Lab.
EGEE is a project funded by the European Union under contract IST Generic Applications: strategy, organization, tools and manpower Roberto.
BaBar Grid Computing Eleonora Luppi INFN and University of Ferrara - Italy.
A short introduction to the Worldwide LHC Computing Grid Maarten Litmaath (CERN)
ATLAS and GridPP GridPP Collaboration Meeting, Edinburgh, 5 th November 2001 RWL Jones, Lancaster University.
EGEE is proposed as a project funded by the European Union under contract IST EU eInfrastructure project initiatives FP6-EGEE Fabrizio Gagliardi.
INFSO-RI Enabling Grids for E-sciencE EGEE and Industry Bob Jones EGEE-II Project Director Final EGEE Review CERN, May 2006.
Responsibilities of ROC and CIC in EGEE infrastructure A.Kryukov, SINP MSU, CIC Manager Yu.Lazin, IHEP, ROC Manager
Bob Jones Technical Director CERN - August 2003 EGEE is proposed as a project to be funded by the European Union under contract IST
SA1/SA2 meeting 28 November The status of EGEE project and next steps Bob Jones EGEE Technical Director EGEE is proposed as.
EGEE is a project funded by the European Union under contract IST HEP Use Cases for Grid Computing J. A. Templon Undecided (NIKHEF) Grid Tutorial,
EGEE is a project funded by the European Union under contract IST NA4 Applications F.Harris(Oxford/CERN) NA4/HEP coordinator
Induction: Adding new applications to EGEE infrastructure –April 26-28, Adding new applications to EGEE infrastructure Roberto Barbera EGEE is.
Les Les Robertson LCG Project Leader High Energy Physics using a worldwide computing grid Torino December 2005.
EGEE is a project funded by the European Union under contract IST Presentation of NA4 Generic Applications Roberto Barbera NA4 Generic Applications.
NA4 all activity meeting V. Breton on behalf of NA4.
HEP Applications within EGEE - an overview and relations with GAG F Harris (Oxford/CERN) EGEE is proposed as a project funded by the European Union under.
EGEE is a project funded by the European Union under contract IST Generic Applications in EGEE-NA4 Roberto Barbera NA4 Generic Applications.
INFSO-RI Enabling Grids for E-sciencE All activity meeting Vincent Breton On behalf of NA4.
Guy Wormser IN2P3/CNRS, EGEE Applications Manager September 2003 EGEE is proposed as a project funded by the European Union under contract IST
INFSO-RI Enabling Grids for E-sciencE The EGEE Project Owen Appleton EGEE Dissemination Officer CERN, Switzerland Danish Grid Forum.
EGEE is a project funded by the European Union under contract IST Roles & Responsibilities Ian Bird SA1 Manager Cork Meeting, April 2004.
EGEE Project Review Fabrizio Gagliardi EDG-7 30 September 2003 EGEE is proposed as a project funded by the European Union under contract IST
NA4 all activity meeting V. Breton on behalf of NA4.
EGEE is a project funded by the European Union under contract IST EGEE Networking Activities Mike Mineter Training team National e-Science.
Organisation of HEP Applications within NA4 F Harris (Oxford/CERN) EGEE is proposed as a project funded by the European Union under contract IST
INFSO-RI Enabling Grids for E-sciencE NA4 status and issues.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Training in EGEE-II Mike Mineter (Some slides from Brendan Hamill)
EGEE is a project funded by the European Union under contract IST GENIUS and GILDA: a status report Roberto Barbera NA4 Generic Applications.
EGEE is a project funded by the European Union under contract IST Generic Applications Requirements Roberto Barbera NA4 Generic Applications.
EGEE is a project funded by the European Union under contract IST GENIUS and GILDA Guy Warner NeSC Training Team Induction to Grid Computing.
INFSO-RI Enabling Grids for E-sciencE EGEE is a project funded by the European Union under contract IST Report from.
EGEE is a project funded by the European Union under contract IST NA4/NA2 activities Roberto Barbera TB INFN Grid,
1-2 March 2006 P. Capiluppi INFN Tier1 for the LHC Experiments: ALICE, ATLAS, CMS, LHCb.
EGEE is a project funded by the European Union under contract IST NA activities Roberto Barbera Technical Board INFN Grid, Bologna,
ScotGRID is the Scottish prototype Tier 2 Centre for LHCb and ATLAS computing resources. It uses a novel distributed architecture and cutting-edge technology,
Bob Jones EGEE Technical Director
Eleonora Luppi INFN and University of Ferrara - Italy
Database Replication and Monitoring
EGEE Middleware Activities Overview
EGEE is a project funded by the European Union
GILDA t-Infrastructure
JRA3 Introduction Åke Edlund EGEE Security Head
The LHC Computing Grid Visit of Mtro. Enrique Agüera Ibañez
SA1 Execution Plan Status and Issues
Moving the LHCb Monte Carlo production system to the GRID
EO Applications Parallel Session
Ian Bird GDB Meeting CERN 9 September 2003
GGF OGSA-WG, Data Use Cases Peter Kunszt Middleware Activity, Data Management Cluster EGEE is a project funded by the European.
Data Challenge with the Grid in ATLAS
EGEE and Induction Mike Mineter NeSC Training Team
Guy Wormser IN2P3/CNRS , EGEE Applications Manager
EGEE support for HEP and other applications
Network Requirements Javier Orellana
Simulation use cases for T2 in ALICE
US ATLAS Physics & Computing
Collaboration Board Meeting
Overview of the EGEE project and the gLite middleware
Future EU Grid Projects
Gridifying the LHCb Monte Carlo production system
ATLAS DC2 & Continuous production
LHCb thinking on Regional Centres and Related activities (GRIDs)
GRIF : an EGEE site in Paris Region
Presentation transcript:

NA4 Applications F.Harris(Oxford/CERN) NA4/HEP coordinator www.eu-egee.org EGEE is a project funded by the European Union under contract IST-2003-508833

Talk Outline The basic goals of NA4 The organisation The participants and their roles The flavour of the work for the NA4 sub-groups Biomed HEP ‘Generic’ applications Testing Industry Forum Milestones and deliverables Relations with other EGEE activities Concluding comments EGEE Induction May 17-19 - 2

NA4: Identification and support of early-user and established applications on the EGEE infrastructure To identify through the dissemination partners and a well defined integration process a portfolio of early user applications from a broad range of application sectors from academia, industry and commerce. To support development and production use of all of these applications on the EGEE infrastructure and thereby establish a strong user base on which to build a broad EGEE user community. To initially focus on two well-defined application areas – Particle Physics and Life sciences, while developing a process for supporting other application areas EGEE Induction May 17-19 - 3

NA4 organisational structure EGEE Induction May 17-19 - 4

NA4 technical organization LCG EGEE PEB NA4 AWG (V. Breton) Test team R. Météry Eric Fede HEP F. Harris M. Lamanna Biomed J. Montagnat C. Blanchet Generic R. Barbera ARDA Data challenges Biomed technical team Generic technical team EGEE Induction May 17-19 - 5

Roles and staffing Role NA3 Liaison 0,5 Italy Generic app (coord) 2 Federation Role FTE Funded Unfunded CERN HEP Applic ations (coord.) 4 4 (9) UK+Ireland NA3 Liaison 0,5 Italy Generic app (coord) 2 France General coord., BioMed, Test team, Industry diss. 7 Northern Europe Generic applications 1 Germany + Switzerland Central Europe South West Europe BioMed Russia HEP, BioMed 3 Totals 21,5 EGEE Induction May 17-19 - 6

Biomedical Requirements Complex data requirements Heterogeneous data formats (genomics, proteomics, image formats) Frequent data updates Complex data sets (medical records) Security/privacy constraints Long term archiving requirements Complex processing requirements Bioinformatics: gene/proteome databases distributions Medical applications (screening, epidemiology...): image databases distribution Parallel algorithms for medical image processing, simulation, etc Interactive application (human supervision or simulation) EGEE Induction May 17-19 - 7

BLAST: Bioinformatics on the EDG testbed BLAST is the first step for analysing new sequences: to compare DNA or protein sequences to other ones stored in personal or public databases. Ideal as a grid application. Requires resources to store databases and run algorithms Can compare one or several sequence against a database in parallel Large user community EGEE Induction May 17-19 - 8

BLAST gridification Computing element Input file UI DB DB DB Computing dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf Seq1 > dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbdfndfjvbndfbnbnfbjnbjxbnxbjk:nxbf Seq2 > dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbdfndfjvbndfbnbnfbjnbjxbnxbjk:nxbf Seqn > dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbdfndfjvbndfbnbnfbjnbjxbnxbjk:nxbf DB BLAST dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf DB BLAST Seq1 > dcscdssdcsdcdsc bscdsbcbjbfvbfvbvfbvbvbhvbhsvbhdvbhfdbvfd Seq2 > bvdfvfdvhbdfvb bhvdsvbhvbhdvrefghefgdscgdfgcsdycgdkcsqkc … Seqn > bvdfvfdvhbdfvb bhvdsvbhvbhdvrefghefgdscgdfgcsdycgdkcsqkchdsqhfduhdhdhqedezhhezldhezhfehflezfzejfv UI dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf DB BLAST dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf DB BLAST dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf RESULT dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbfvbfvbvfbvbvbhvbhsvbhdvbhfdbvfdbvdfvfdvhbdfvbhdbhvdsvbhvbhdvrefghefgdscgdfgcsdycgdkcsqkcqhdsqhfduhdhdhqedezhdhezldhezhfehflezfzeflehfhezfhehfezhflezhflhfhfelhfehflzlhfzdjazslzdhfhfdfezhfehfizhflqfhduhsdslchlkchudcscscdscdscdscsddzdzeqvnvqvnq! Vqlvkndlkvnldwdfbwdfbdbd wdfbfbndblnblkdnblkdbdfbwfdbfn Computing element dedzedzdzedezdzecdscsdcscdssdcsdcdscbscdsbcbjbf EGEE Induction May 17-19 - 9

Monte carlo simulation for radiotherapy planning GATE Retrieving of root output files from CEs Copy the medical image from the SE to the CE Storage Element Computing Submission of jdls to the CEs CCIN2P3 Storage Element Computing RAL Scanner slices: DICOM format Storage Element Computing User interface NIKHEF Concatenation Anonymisation Database Image: text file Storage Element Computing MARSEILLE Binary file: Image.raw Size 19M EGEE Induction May 17-19 - 10

LHC Experiments ATLAS CMS Storage – Raw recording rate 0.1 – 1 GByte/s Accumulating at 5-8 PetaByte/year 10 PetaByte of disk Processing – 200,000 of today’s fastest PCs ALICE LHCb LHCb EGEE Induction May 17-19 - 11

HEP Data Handling and Computation event filter (selection & reconstruction) detector processed data event summary data raw data batch physics analysis event reprocessing Raw data are staged to disk and archived on tape Data for subsequent processing (ESD, AOD) are cached on disk Production analysis typically involves skims and archiving according to event type for ease of replication and caching (department servers and desktop) Few cycles re-reconstruction – most data/cpu intensive Several cycles on production analysis Very many cycles on end user analysis – least data/cpu intensive analysis objects (extracted by physics topic) event simulation interactive physics analysis EGEE Induction May 17-19 - 12

Data Hierarchy “RAW, ESD, AOD, TAG” RAW Triggered events ESD AOD TAG recorded by DAQ ~2 MB/event Detector digitisation ESD Reconstructed information Pseudo-physical information: Clusters, track candidates ~100 kB/event Physical information: Transverse momentum, Association of particles, jets, (best) id of particles, AOD Analysis information ~10 kB/event TAG Classification information Relevant information for fast event selection ~1 kB/event EGEE Induction May 17-19 - 13

HEP requirements Data requirements Processing requirements Huge quantities of data( many Petabytes) Data is of characteristic WORM (write once – read many times) Data structured to allow data mining Long time archiving, and need data copying around the world Processing requirements Processing breaks down into 2 classes – production and ‘chaotic Production done regularly with sets of ~ 10**9 events Individual analyses done randomly on sets of maybe 10**7 events – many hundreds of such users Processing is highly parallel at ‘event’ level with directed graph like seqences in the processing Interactive work very important for analysis- ability to save sessions and know origins of data (provenance) Need global access to experiment databases for constants, running conditions etc. EGEE Induction May 17-19 - 14

Characteristics of CMS Data Challenge DC04 Pre-Challenge Production Uses OCTOPUS After 8 months of continuous running: 750,000 jobs 3,500 KSI2000 months 700,000 files 80 TB of data With OSCAR (Geant 4) 16 Mevts in 6 months Data Challenge Run the full data reconstruction and distribution chain at 25 Hz Achieved only for a limited amount of time: 2,200 jobs/day (about 500 CPU’s) running at Tier-0 4 MB/s produced and distributed to each Tier-1 0.4 files/s registered to RLS (with POOL metadata) NOSCAR as function of time NDST as function of time 25 Hz 15 Mevts/week EGEE Induction May 17-19 - 15

ARDA Coordination and forum activities ALICE Distr. Analysis ATLAS Distr. Analysis CMS Distr. Analysis LHCb Distr. Analysis EGEE NA4 Application identification and support GAE ARDA Collaboration Coordination Integration Specification Priorities Planning Experience  Use Cases PROOF LCG-GAG Grid Application Group SEAL POOL Resource Providers Community EGEE middleware EGEE Induction May 17-19 - 16

NA4 Generic Applications This is a key activity in the process of getting new scientific and industrial communities interested and committed to use the continental grid infrastructure built by the EGEE Project. GENIUS is a well established tool which will be fundamental in the process of interfacing new applications with the EGEE middleware hiding its complex internals to non-experts users from new communities. GILDA is a complete suite of grid elements (test-bed, CA, VO, monitoring system, web portal) and applications fully dedicated to dissemination purposes. This could also represent the ideal grid testbed where to start the porting of new generic applications. GILDA is the dissemination tool which will be used by NA3 during courses and tutorials so the important aspect of induction of the grid paradigm to new communities is also covered. It is now important to have the first meeting of the EGAAP board and define the first Generic Applications to be interfaced. EGEE Induction May 17-19 - 17

The GILDA home page (http://gilda.ct.infn.it) EGEE Induction May 17-19 - 18

The Generic Application questionnaire (contribution to MNA4.1) Questionnaire to get information and first requirements from new communities interested in using the EGEE Infrastructure (http://alipc1.ct.infn.it/grid/egee/na4/questionnaire/na4-genapp-questionnaire.doc) Feed-backs received so far (http://alipc1.ct.infn.it/grid/egee/na4/questionnaire): Astrophysics (EVO and Planck satellite) Earth Observation (ozone maps, seismology, climate) Digital Libraries (DILIGENT Project) Grid Search Engines (GRACE Project) Industrial applications (SIMDAT Project) Interest also from Computational Chemistry (Italy and Czech Republic), Civil Engineering (Spain), and Geophysics (Switzerland and France) communities EGEE Induction May 17-19 - 19

EGEE Industry Forum Objectives The main role of the Industry Forum in the Enabling Grids for E-Science in Europe (EGEE) project is to raise awareness of the project amongst industry and to encourage businesses to participate in the project. This will be achieved by making direct contact with industry, liaising between the project partners and industry in order to ensure businesses get what they need in terms of standards and functionality and ensuring that the project benefits from the practical experience of businesses. The members of the EGEE Industry Forum are companies of all sizes who have their core business or a part of their business within Europe. The Industry Forum will be managed by a steering group consisting of EGEE project partners and representatives from business http://public.eu-egee.org/industry-forum/information EGEE Induction May 17-19 - 20

NA4 Testing Group Three types of tests will be developed: (based on user requirements and on the experience gathered by the ongoing activities like LHC DCs and ARDA prototyping) Tests of service availability: This set of tests will check the EGEE services availability. All the services providing by the GRID should be tested : Job submission and management, files management, Information service, ... Tests of functionality: To verify that the functionalities required are available , usable and complete; for example, file creation, moving and deletion, information publication, errors recovery. Tests to measure performances: Their goal is to characterize the testbed from the end users/application perspective. Part of them will be time measurements ( time to submit X job, time to replicateY files,...), others will address scalability measurements ( how many jobs can be accepted by service Z, files limits size,...) while others will be more abstract (information availability, errors message access,...). This work should be done in close collaboration with ARDA , JRA1 and SA1 EGEE Induction May 17-19 - 21

Milestones for NA4 applications MNA4.1 M6 First applications migrated to the EGEE infrastructure HEP data challenges for 4 LHC experiments and for D0 Biomedicine – GATE simulation in nuclear medicine + others Plus the first ‘generic’ applications MNA4.2 M12 First external review of Applications Identification and Support with feedback MNA4.3 M24 Second external review of Applications Identification and Support with feedback EGEE Induction May 17-19 - 22

Deliverables for NA4 applications DNA4.1 M3 Definition of Common Application Interface (specially important for new applications…probably based on GENIUS flavoured work) DNA4.2 M6 Target Application Sector Strategy document (work for getting new applications onto EGEE) DNA4.3 M9 EGEE Application Migration Progress report (revision M15 and M21) All applications will include evaluations on production and pre-production services of LCG (current and ‘new’ middleware) DNA4.4 M24 Final Report of Application Identification and Support Activity EGEE Induction May 17-19 - 23

NA4 relations with other EGEE activities and other bodies (1) SA1 grid operations How to get new VOs onto LCG from different domains? How to integrate new resources(sites) into LCG coming from different application areas? Rationalistion of test procedures Working with national agencies (e.g. GridPP Application monitoring) NA3 training Estimating requirements for courses Design and implementation of courses JRA1 middleware All applications input requirements and monitor their satisfaction with feedback to middleware (process goes through the PTF-Project Technical Forum) JRA2 quality assurance NA4 have a representative on this group to define process for monitoring quality of EGEE services EGEE Induction May 17-19 - 24

NA4 relations with other EGEE activities and other bodies (2) JRA3 security Security of medical (and other application) data Security for sites SA2,JRA4 networking Global HEP requirements through LCG Biomed and other applications must similrly give global needs NA4 will give individual application use cases especially where problems have been encountered LCG NA4/HEP are presented on the LCG/GAG(grid Applications Group) This is HEP source of requirements and giving feedback to middleware on ‘customer satisfaction’. Some GAG people are on the PTF. EGEE Induction May 17-19 - 25

Conclusions NA4 Web site http://egee-na4.ct.infn.it NA4 is up and running now HEP is using LCG-2 for data challenges ARDA is well under way and waiting for first new middleware prototype Biomedicine has applications ready to go onto LCG-2 and pre-production services Generic group is very active with GILDA and excellent relations with NA3 Testing group has active dialogue with JRA1 and ARDA for rationalising testing effort Industry forum has developed links with several companies (see EGEE Cork presentations) NA4 open meeting Jul 14-16 at Catania with emphasis on inter-activity dialogue (with middleware,operations,security,networking) NA4 Web site http://egee-na4.ct.infn.it EGEE Induction May 17-19 - 26