HEPCAL, PPDG CS11 & the GAE workshop


HEPCAL, PPDG CS11 & the GAE workshop
Ruth Pordes, Fermilab, presenting (as usual) the work of many others.
The HEPCALs: documenting Use Cases. A forum for coming to a common understanding and for generating/checking Grid middleware requirements across the 4 LHC experiments. Federico chairs the committees and Jeff Templon is the chief editor.
HEPCAL - Summer '02
HEPCAL-Prime - HEPCAL updated - Spring '03
HEPCAL-II - Analysis Use Cases - Phase 1 June '03; Phase 2 Nov '03

HEPCAL - its usefulness
Discussion and comments following its release stimulated Test Case implementations for EDG. It was useful in identifying holes and in thinking through the details of end-to-end functionality, and it helped solidify how to move forward to the joint "GLUE" testing project. The joint response from the US and EU Grid middleware projects improved understanding of the boundaries between VDT and EDG components and of the ability to move to a common underlying infrastructure, and gave a better appreciation of the components in LCG, EDG and VDT. It is a good reference for glossary and definitions. Willingness to make regular updates to this document will contribute to its usefulness -> HEPCAL-Prime.

HEPCAL aims to give input/guidance to Software in the "Grid Domain".

HEPCAL-Prime - its relevance
It gives agreed-upon definitions and scope for many concepts. These may be wrong, but there is plenty of text to critique, an active mail list for discussions, and a recognised forum for consensus and decisions. E.g. catalogues and datasets: "A catalogue is a collection of data that is updateable and transactional. A dataset is a read-only collection of data. A special case of the dataset is the Virtual Dataset." There is a long discussion of datasets etc. We expect the Grid to assign a unique job identifier to each Job, and all Jobs are classified into two categories, "Organized" or "Chaotic" (one reading of these definitions is sketched below).
Some significant areas of requirements and use are only superficially addressed, e.g.:
- system-wide issues: architecture, requirements, operations
- security: VO, authorization mechanisms
- treatment of failures and faults
- long transactions and persistent state
Are the fundamental assumptions and scope correct or agreed to? Mostly FILEs, LDN and GUID; all events part of a tree. The concept that a "user" is often an "Agent" or a "Role"-based capability came late, and there are gaps because of this.
http://cern.ch/fca/HEPCAL-prime.doc
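The data-model vocabulary above can be made concrete in a few lines. This is a minimal sketch only, assuming nothing beyond the definitions quoted on the slide; HEPCAL-Prime defines no API, and all class and field names here are invented for illustration.

```python
# Illustrative only: class and field names are hypothetical, chosen to mirror
# the HEPCAL-Prime definitions of catalogue, dataset, LDN/GUID and job identity.
import uuid
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Dataset:
    """A read-only collection of data, named by a logical dataset name (LDN)
    and identified by a globally unique identifier (GUID)."""
    ldn: str
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    file_guids: tuple = ()   # HEPCAL deals mostly with FILEs as dataset members

@dataclass
class Catalogue:
    """An updateable, transactional collection of data (here just a mapping)."""
    entries: dict = field(default_factory=dict)

    def register(self, ds: Dataset) -> None:
        # a real catalogue would make this update transactional
        self.entries[ds.ldn] = ds

@dataclass
class Job:
    job_id: str = field(default_factory=lambda: str(uuid.uuid4()))  # Grid-assigned
    category: str = "Organized"   # HEPCAL-Prime classifies Jobs as Organized or Chaotic
```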

HEPCAL-Prime has added the first Performance Requirements.

HEPCAL-II scope and status
The goal is to provide Use Cases describing Analysis such that Requirements can be synthesized and a Software Architecture and Design can be started. The first phase is "over": the document is to be delivered to the SC2 at the end of this month. It is not clear that this is sufficient for the new RTAG. It is really only a first pass at bringing the committees' thinking forward to address the differences and similarities between Analysis and Production Processing. At the moment there seem to be a couple of concepts that people agree are different (a sketch of the first follows below):
- The Input Data that is needed may not be known until the job is run (job executions are preceded by Queries to define the input data).
- User Interaction may be required and will have a wide range of "response" needs.
System concepts like planning, prioritization and VO management are not included.
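The "query precedes execution" point can be illustrated with a short sketch. This is an assumption-laden illustration, not HEPCAL-II text: the catalogue and submission functions are local stubs standing in for Grid middleware, and all names are invented.

```python
# Hypothetical sketch of analysis-style submission, where the input data is not
# known until a metadata query resolves it; every name below is illustrative.

FAKE_METADATA = {
    "dimuon": ["lfn:dataset.dimuon.001", "lfn:dataset.dimuon.002"],
    "dielectron": ["lfn:dataset.dielectron.001"],
}

def query_metadata_catalog(selection: str) -> list:
    """Stand-in for a metadata-catalogue query resolving a selection to datasets."""
    return FAKE_METADATA.get(selection, [])

def submit_job(executable: str, inputs: list) -> dict:
    """Stand-in for concrete-job submission; returns a fake job description."""
    return {"executable": executable, "inputs": inputs, "status": "submitted"}

def run_analysis(selection: str, executable: str) -> dict:
    # Unlike production processing, the concrete job can only be defined after
    # the query has determined which datasets it will read.
    datasets = query_metadata_catalog(selection)
    if not datasets:
        raise RuntimeError("query returned no datasets for selection %r" % selection)
    return submit_job(executable, inputs=datasets)

print(run_analysis("dimuon", "my_analysis.exe"))
```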

Still simple models of end-to-end Analysis steps.

Performance Requirements: [This section needs considerable reworking; still looking for brilliant ideas.]
Each experiment is expected to have about 10-15 physics analysis groups, each with probably 10-20 active people extracting the data from the earlier scenarios... For the later stages, the produced data may not necessarily be registered on the Grid. In addition, about 30(?) people per subdetector in each experiment (a total of 300-500? per experiment) are expected to access the data for detector studies and/or calibration purposes. So a total of 400-600 people in each experiment is expected to do the extraction of (possibly private) results. This number is representative; depending on the stage of the experiment the profile might be quite different.
Is there a common data handling layer that is external to the application, with middleware and/or external-to-middleware components? There is still no assumption on this - is it time to make a decision? Query handlers as an LCG common project? Collaborating with PPDG?
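A quick back-of-the-envelope check of the numbers quoted above, under the slide's own rough assumptions (10-15 groups of 10-20 people, 300-500 detector/calibration users per experiment); the quoted "representative" 400-600 figure lies inside the bracket this gives.

```python
# Rough arithmetic only; all counts are the slide's estimates, not measurements.
analysis_low,  analysis_high = 10 * 10, 15 * 20   # 10-15 groups x 10-20 people -> 100-300
detector_low,  detector_high = 300, 500            # ~30 people per subdetector  -> 300-500
print(analysis_low + detector_low, analysis_high + detector_high)
# -> 400 800; the slide's representative 400-600 sits within this range
```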

The Arrow of "increasing interactivity"
The horizontal axis can be divided into general regions based largely on human time-scales:
< 1 sec: Instantaneous. The user's attention is continually focused upon the job.
< 1 min: Fast. Time spent waiting for a response or results is short enough that the user will not start another task in the interim.
< 1 hour: Slow. The user will likely devote attention to another task while waiting for the response/results, but will return to the task in the same working day.
> 1 day: Glacial. The user will likely release and forget, and will return to the task after an extended period or only upon notification that it has completed.
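These regions can be expressed as a trivial classification. The sketch below is purely illustrative: the function name is invented, and since the slide leaves the one-hour-to-one-day range unlabelled, the sketch lumps it in with "slow".

```python
# Minimal illustration of the human time-scale regions above; not from any spec.
def interactivity_region(wait_seconds: float) -> str:
    if wait_seconds < 1:
        return "instantaneous"   # user's attention stays on the job
    if wait_seconds < 60:
        return "fast"            # user waits rather than switching tasks
    if wait_seconds < 24 * 3600:
        return "slow"            # user switches tasks; slide leaves 1 h - 1 day implicit
    return "glacial"             # user releases and forgets until notified

print(interactivity_region(0.2), interactivity_region(30),
      interactivity_region(2 * 3600), interactivity_region(3 * 24 * 3600))
```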

1.1.1 Persistent interactive environment
For each analysis session the user should be able to assign a name (in the user's private namespace) to which he/she can subsequently refer in order to:
- get additional information about analysis status, estimated time to completion, ...
- find and retrieve partial results of his/her analysis
- re-establish the complete analysis environment at a later stage
- ...
One possible shape for such a named session is sketched below.
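A minimal sketch of what a named, persistent analysis session could look like, assuming an in-memory registry in place of the user's private namespace on the Grid; the class and method names are invented for illustration and are not part of HEPCAL-II.

```python
# Hypothetical interface for the named analysis session described above.
class AnalysisSession:
    _registry = {}   # stand-in for the user's private namespace

    def __init__(self, name: str):
        self.name = name
        self.status = "created"
        self.partial_results = []
        AnalysisSession._registry[name] = self

    def info(self) -> dict:
        """Analysis status, estimated time to completion, etc."""
        return {"name": self.name, "status": self.status, "eta_seconds": None}

    def retrieve_partial_results(self) -> list:
        """Find and retrieve partial results of the analysis."""
        return list(self.partial_results)

    @classmethod
    def reattach(cls, name: str) -> "AnalysisSession":
        """Re-establish the complete analysis environment at a later stage."""
        return cls._registry[name]

# Usage: create a named session now, re-attach to it later by name.
AnalysisSession("higgs-scan-2003")
print(AnalysisSession.reattach("higgs-scan-2003").info())
```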

PPDG CS-11

PPDG CS-11: "Interactive" Physics Analysis on a Grid
A cross-experiment working group to discuss common requirements and interfaces, and a forum to share information about the many parallel implementations and prototyping efforts needed to gain understanding. Its aims are to:
- extract the common requirements that such applications make on the grid, to influence grid middleware to contain the necessary hooks
- evaluate existing interfaces and services, and propose extensions / describe new interfaces as needed
Particularly strong participation has come from analysis tool makers in the US: JAS, Caigee, ROOT.

PPDG Analysis Tools Work
Not yet focused on a common development effort; it is still a "working group" for PPDG Year 3, and is expected to be a focus of Years 4 & 5. People in PPDG are encouraging us to make it a strong focus - should it become a development -> production effort sooner? However, PPDG must avoid landing in today's situation with Replica Management systems, i.e. 6 different implementations IN PRODUCTION.

..CS-11 service names to date..
- Submit Abstract Job
- Submit Concrete Job
- Control Concrete Job
- Status of Concrete Job (Status is an exposed interface to every service)
- Concrete Job Capabilities
- Sub-Job Management / Partition Job
- Estimate Performance
- Move Data
- Copy Data
- Query Dataset Catalog
- Manage Dataset Catalog
- Manage Data Replication
- Access Metadata Catalog
- Discover Resource
- Reserve Resource
- Matchmaker
- Manage Storage
- Login/Logout
- Install Software
One possible reading of this list as a set of service interfaces is sketched after it.
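Since CS-11 has so far agreed only on service names, not on signatures, the sketch below is purely illustrative: it groups a few of the names into abstract interfaces with invented method signatures, just to show one possible reading.

```python
# Illustrative grouping of some CS-11 service names into abstract interfaces;
# the signatures are invented, not an agreed CS-11 (or Grid middleware) API.
from abc import ABC, abstractmethod

class GridService(ABC):
    @abstractmethod
    def status(self) -> dict:
        """Per the list above, Status is an exposed interface to every service."""

class JobManagement(GridService):
    @abstractmethod
    def submit_abstract_job(self, description: dict) -> str: ...
    @abstractmethod
    def submit_concrete_job(self, description: dict) -> str: ...
    @abstractmethod
    def control_concrete_job(self, job_id: str, action: str) -> None: ...

class DataManagement(GridService):
    @abstractmethod
    def move_data(self, source: str, destination: str) -> None: ...
    @abstractmethod
    def query_dataset_catalog(self, query: str) -> list: ...
    @abstractmethod
    def access_metadata_catalog(self, query: str) -> list: ...
```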