1
HEPCAL, PPDG CS11 & the GAE workshop
Ruth Pordes, Fermilab, presenting (as usual) the work of many others.
HEPCALs: Documenting Use Cases.
A forum for coming to a common understanding and for generating and checking Grid middleware requirements across the 4 LHC experiments.
The chair of the committees is Federico, and Jeff Templon is the chief editor.
- HEPCAL: Summer ’02
- HEPCAL-Prime (HEPCAL updated): Spring ’03
- HEPCAL-II (Analysis Use Cases): Phase 1 June ’03; Phase 2 Nov ’03
2
HEPCAL - its usefulness -
- Discussion and comments following the release stimulated Test Case implementations for EDG.
- Useful in identifying holes and in thinking through the details of end-to-end functionality.
- Helped to solidify how to move forward to a joint “GLUE” testing project.
- The joint response from US and EU Grid Middleware projects helped with:
  - understanding of the boundaries between VDT and EDG components
  - the ability to move to a common underlying infrastructure
  - a better appreciation of the components in LCG, EDG and VDT
- Good reference for glossary and definitions.
- Willingness to have regular updates to this document will contribute to its usefulness -> HEPCAL-Prime
23 June 2003 R. Pordes, GAE workshop
3
HEPCAL aims to give input/guidance to Software in the “Grid Domain”
4
HEPCAL-Prime - its relevance
Gives agreed-upon definitions and scope of many concepts. These may be wrong, but there is plenty of text to critique, an active mailing list for discussions, and a recognised forum for consensus and decision.
E.g. “catalogues and datasets. A catalogue is a collection of data that is updateable and transactional. A dataset is a read-only collection of data. A special case of the dataset is the Virtual Dataset”. Long discussion of datasets etc.
We expect the Grid to assign a unique job identifier to each Job.
Classify all Jobs into 2 categories: “Organized” or “Chaotic”.
Some significant areas of Requirements and Use are only superficially addressed, e.g.:
- System-wide issues: Architecture, Requirements, Operations
- Security: VO, Authorization mechanisms
- Treatment of failures and faults
- Long transactions and persistent state
Are the fundamental assumptions and scope correct or agreed to?
- Mostly FILEs
- LDN and GUID
- All events part of a tree
The concept that the “user” is often an “Agent” or a “Role”-based capability came late, and there are gaps due to this.
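The quoted definitions can be made concrete with a small sketch. It is only an illustration of the distinctions drawn on the slide, not HEPCAL's specification: the class names, the `transaction()` helper and the use of a UUID as a stand-in for the Grid-assigned job identifier are all assumptions made for this example.

```python
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class JobCategory(Enum):
    """The two job categories named on the slide."""
    ORGANIZED = "organized"
    CHAOTIC = "chaotic"


@dataclass(frozen=True)
class Dataset:
    """A read-only collection of data (frozen, so it cannot be modified)."""
    name: str
    items: Tuple[str, ...]


@dataclass(frozen=True)
class VirtualDataset(Dataset):
    """Special case of a dataset; its details are not given on the slide."""


class Catalogue:
    """An updateable, transactional collection (hypothetical minimal version)."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    @contextmanager
    def transaction(self):
        # Work on a copy; commit only if the block finishes without an error.
        staged = dict(self._entries)
        yield staged
        self._entries = staged

    def get(self, key: str) -> Optional[str]:
        return self._entries.get(key)


@dataclass(frozen=True)
class Job:
    """A job with a unique identifier (UUID used here as a stand-in)."""
    category: JobCategory
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)
```

An update inside `with catalogue.transaction() as tx:` becomes visible only if the block completes, while any attempt to assign to a field of a `Dataset` raises an error.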
5
HEPCAL-prime has added first Performance Requirements
6
HEPCAL-II scope and status
The goal is to provide Use Cases describing Analysis such that Requirements can be synthesized and a Software Architecture and Design started.
The first phase is “over”: the document is to be delivered to the SC2 at the end of this month. It is not clear that this is sufficient for the new RTAG.
Really only a first pass at bringing the thinking of the people on the committees forward, to approach the differences and similarities between Analysis and Production Processing.
At the moment there seem to be a couple of concepts that people agree are different:
- The Input Data that is needed may not be known until the job is run (job executions are preceded by Queries to define the input data).
- User Interaction may be required and will have a wide range of “response” needs.
System concepts like planning, prioritization and VO management are not included.
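The first difference, that a query precedes execution in order to define the input data, can be sketched as follows. This is a toy illustration only: the in-memory catalogue, the metadata layout and the function names `resolve_input` and `run_analysis` are hypothetical and do not come from HEPCAL-II.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, object]


def resolve_input(catalogue: Dict[str, Metadata],
                  query: Callable[[Metadata], bool]) -> List[str]:
    """Run the query first: the input set is only known once it has been evaluated."""
    return sorted(name for name, meta in catalogue.items() if query(meta))


def run_analysis(catalogue: Dict[str, Metadata],
                 query: Callable[[Metadata], bool],
                 analyse: Callable[[str], object]) -> List[object]:
    """Query step followed by the execution step over whatever the query selected."""
    inputs = resolve_input(catalogue, query)
    return [analyse(name) for name in inputs]
```

In a production job the list of inputs would typically be fixed at submission; here it is the result of the query step.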
7
Still simple models of end to end Analysis steps
8
Performance Requirements: [ This section needs considerable reworking, still looking for brilliant ideas. ]
It is expected to have about […] physics analysis groups in each experiment with probably […] active people in each extracting the data from the earlier scenarios... For the later stages ..the produced data may not necessarily be registered on the Grid. In addition, it is expected to have about 30(?) people per subdetector in each experiment (total of 3-500? per experiment) accessing the data for detector studies and/or calibration purposes. So a total of […] people in each experiment is expected to do the extraction of (possibly private) results. This number is representative; depending on the stage of the experiment the profile might be quite different.
Is there a common data handling layer that is external to the application and has middleware and/or external-to-middleware components? Still no assumption on this. Is it time to make a decision?
Query handlers as an LCG common project? Collaborating with PPDG?
9
The Arrow of “increasing interactivity”
The horizontal axis can be divided into general regions based largely on human time-scales:
- < 1 sec: Instantaneous. The user's attention is continually focused upon the job.
- < 1 min: Fast. Time periods spent waiting for a response or results are short enough that the user will not start another task in the interim.
- < 1 hour: Slow. The user will likely devote attention to another task while waiting for a response or results, but will return to the task in the same working day.
- > 1 day: Glacial. The user will likely release and forget, returning to the task after an extended period or only upon notification that the task has completed.
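The time-scale regions above map directly onto a small classifier. This is a minimal sketch: the names `Interactivity` and `classify_response_time` are invented for the example, and because the slide leaves the range between one hour and one day unassigned, the sketch reports it as `UNSPECIFIED` rather than guessing.

```python
from enum import Enum


class Interactivity(Enum):
    INSTANTANEOUS = "instantaneous"
    FAST = "fast"
    SLOW = "slow"
    GLACIAL = "glacial"
    UNSPECIFIED = "unspecified"  # not on the slide: the 1 hour to 1 day gap


def classify_response_time(seconds: float) -> Interactivity:
    """Map a response time in seconds onto the regions listed on the slide."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    if seconds < 1:
        return Interactivity.INSTANTANEOUS
    if seconds < 60:
        return Interactivity.FAST
    if seconds < 3600:
        return Interactivity.SLOW
    if seconds > 86400:
        return Interactivity.GLACIAL
    # Between one hour and one day the slide assigns no region.
    return Interactivity.UNSPECIFIED
```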
10
1.1.1 Persistent interactive environment
For each analysis session the user should be able to assign a name (in the user’s private namespace) to which he/she can subsequently refer in order to:
- get additional information about analysis status, estimated time to completion, …
- find and retrieve partial results of his/her analysis
- re-establish the complete analysis environment at a later stage
- ….
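One way to picture the named-session requirement is a registry keyed by user and session name, so that each user has a private namespace. This is a hypothetical sketch: `AnalysisSession`, `SessionRegistry` and all of their fields are assumptions for illustration, and a real system would keep this state persistently rather than in memory.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class AnalysisSession:
    """State a user may want to come back to later (illustrative fields)."""
    status: str = "running"
    estimated_seconds_left: Optional[float] = None
    partial_results: List[Any] = field(default_factory=list)
    environment: Dict[str, str] = field(default_factory=dict)


class SessionRegistry:
    """Maps (user, session name) to a session, giving each user a private namespace."""

    def __init__(self) -> None:
        self._sessions: Dict[Tuple[str, str], AnalysisSession] = {}

    def assign_name(self, user: str, name: str, session: AnalysisSession) -> None:
        key = (user, name)
        if key in self._sessions:
            raise KeyError(f"{user!r} already has a session named {name!r}")
        self._sessions[key] = session

    def lookup(self, user: str, name: str) -> AnalysisSession:
        # Raises KeyError if this user has no session under that name.
        return self._sessions[(user, name)]
```

Two users can both call a session "higgs-scan" without colliding, and each can later look up status, partial results and the saved environment under that name.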
11
PPDG CS-11
12
PPDG CS-11 “Interactive” Physics Analysis on a Grid
Cross-Experiment Working Group to discuss common requirements and interfaces.
- Forum to bring together information about the many needed parallel implementations and prototyping, to gain understanding
- Extract the common requirements that such applications make on the grid, to influence grid middleware to contain the necessary hooks
- Evaluate existing interfaces and services; propose extensions / describe new interfaces as needed
Particularly strong participation has come from analysis tool makers in the US: JAS, Caigee, ROOT.
13
PPDG Analysis Tools Work
Not yet focused on a common development effort. Still a “working group” for PPDG Year 3. Expect it to be a focus of Years 4 & 5.
People in PPDG are encouraging us to make it a strong focus: a development -> production effort sooner?
However, PPDG must avoid landing in today's situation for Replica Management systems, i.e. 6 different implementations IN PRODUCTION.
14
..CS-11 service names to date..
- Submit Abstract Job
- Submit Concrete Job
- Control Concrete Job
- Status of Concrete Job (Status is an exposed interface to every service)
- Concrete Job Capabilities
- Sub-Job Management / Partition Job
- Estimate Performance
- Move Data
- Copy Data
- Query Dataset Catalog
- Manage Dataset Catalog
- Manage Data Replication
- Access Metadata Catalog
- Discover Resource
- Reserve Resource
- Matchmaker
- Manage Storage
- Login/Logout
- Install Software
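The note that Status is an exposed interface to every service suggests a shared base interface. The sketch below shows one way that could look; it is not the CS-11 interface definition. The class names, method signatures and the set of control actions ("suspend", "resume", "cancel") are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Service(ABC):
    """Base for every service: each one must expose a status interface."""

    @abstractmethod
    def status(self) -> str:
        ...


class ConcreteJobService(Service):
    """Toy stand-in for the Submit / Control / Status of Concrete Job entries."""

    # Hypothetical control actions and the job state each one leads to.
    _ACTIONS = {"suspend": "suspended", "resume": "running", "cancel": "cancelled"}

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self, job_id: str) -> None:
        self._jobs[job_id] = "submitted"

    def control(self, job_id: str, action: str) -> None:
        if job_id not in self._jobs:
            raise KeyError(job_id)
        if action not in self._ACTIONS:
            raise ValueError(f"unknown action: {action}")
        self._jobs[job_id] = self._ACTIONS[action]

    def job_status(self, job_id: str) -> str:
        return self._jobs[job_id]

    def status(self) -> str:
        # Service-level status, as opposed to the status of one job.
        return f"{len(self._jobs)} job(s) known"
```

Because `status()` is abstract on the base class, a service that omits it cannot be instantiated, which is one way to enforce the rule across all the services listed.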