EGEE Catalogs Peter Kunszt EGEE Data Management Middleware Service Grids NeSC, 22-23 July 2004 EGEE is a project funded by the.

Presentation transcript:

EGEE Catalogs Peter Kunszt EGEE Data Management Middleware Service Grids NeSC, July 2004 EGEE is a project funded by the European Union under contract IST

Service Grids – NeSC, July 22-23, 2004

High-level strategy for middleware

EGEE Middleware
  - To re-engineer generic middleware packages
  - Incorporating experience from EDG, VDT, LCG, AliEn (product from CERN Alice experiment) and others
  - Architected for scale and performance requirements of LCG and other applications
EGEE design team formed early to develop architecture
  - Architecture: Fast prototyping approach
  - Short update cycles to give applications the chance to influence and give feedback

Guiding principles

Lightweight (existing) services
  - Easily and quickly deployable
  - Re-use as much as possible
Interoperability
  - Allow for multiple implementations
Resilience and Fault Tolerance
Service oriented approach
  - Follow WSRF standardization
  - No mature WSRF implementations exist to date, hence: start with plain WS – WSRF compliance is not an immediate goal
  - Aim for WS-I compliance
Co-existence with deployed infrastructure
  - Co-existence (and convergence) with existing grid infrastructures (e.g. LCG2) are essential for the EGEE Grid service

[Figure: diagram with the labels EDG, VDT, ..., LCG, EGEE, ..., AliEn]

High-level functional decomposition

Starting point was the ARDA roadmap document
  - Focus is upon interfaces that can be composed into useful services

EGEE Data functional interfaces

File Catalog
  - Management of the logical namespace
Replica Catalog
  - Tracking of file replicas
Metadata Catalog
  - Application specific metadata
      - In particular, metadata used to select logical files
Combined Catalog
  - Added functionality by orchestration of the 3 catalogs (providing transaction safety)
Storage Element
  - Where the files get stored
  - SRM interface (see GGF GSM-WG)
      - Manage a Storage Resource
      - Space reservation
      - Put and retrieve files using various protocols
  - Posix-like File I/O
      - Most posix-compliant feature support
      - Abstraction over existing MSS IO mechanisms
Data Management:
File Transfer Service
  - Reliable transfer of files between two sites
File Placement Service
  - Transfer and register files
  - Orchestrate File Transfer and Data Catalog services
Data Scheduling Service
  - Event-based data transfer, using File Placement Service
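The "transfer and register" role of the File Placement Service can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `place_file`, the `transfer` callable, the dictionary standing in for the Replica Catalog, and the retry loop used to mimic a "reliable" transfer are hypothetical and are not the EGEE interfaces.

```python
def place_file(transfer, replica_catalog, guid, source_surl, dest_surl, retries=3):
    """Copy a replica, then register it (hypothetical sketch, not the EGEE API).

    `transfer` is any callable (source_surl, dest_surl) that raises OSError on
    failure; `replica_catalog` is a plain dict mapping GUID -> list of SURLs.
    """
    last_error = None
    for _ in range(retries):
        try:
            transfer(source_surl, dest_surl)
            break  # transfer succeeded, stop retrying
        except OSError as exc:
            last_error = exc
    else:
        # Every attempt failed: nothing is registered in the catalog.
        raise RuntimeError(f"transfer failed after {retries} attempts") from last_error

    # Register the new replica only after the copy is known to exist.
    replica_catalog.setdefault(guid, []).append(dest_surl)
    return dest_surl
```

Registering only after a successful transfer keeps the catalog from pointing at replicas that were never written, which is one plausible reading of why the slide pairs the two steps in a single service.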

Services: EGEE Catalogs

File Catalog
  - Management of the logical file namespace
Replica Catalog
  - Tracking of file replicas
Metadata Catalog
  - Application specific metadata; in particular, metadata used to select logical files
Combined Catalog
  - Added functionality by orchestration of the 3 catalogs (providing transaction safety)

Files and Catalogs

[Figure: diagram relating the File Catalog, Replica Catalog and Metadata Catalog; labels: LFN, GUID, Master SURL, SURL, Metadata]
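The relationship between the three catalogs can be sketched as a toy in-memory data model. This is an illustration under stated assumptions, not the EGEE implementation: the `Catalogs` class, its method names, and the choice to key replicas and metadata by GUID are hypothetical, and the `srm://` URLs in the usage note are placeholders.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Catalogs:
    """Toy stand-in for the three catalogs (hypothetical, not the EGEE API)."""
    file_catalog: dict = field(default_factory=dict)      # LFN  -> GUID
    replica_catalog: dict = field(default_factory=dict)   # GUID -> SURLs, master first
    metadata_catalog: dict = field(default_factory=dict)  # GUID -> {attribute: value}

    def register(self, lfn, master_surl, metadata=None):
        """Create a GUID for a new logical file and record its master replica."""
        guid = str(uuid.uuid4())
        self.file_catalog[lfn] = guid
        self.replica_catalog[guid] = [master_surl]
        self.metadata_catalog[guid] = dict(metadata or {})
        return guid

    def add_replica(self, lfn, surl):
        """Track one more physical copy of an already registered logical file."""
        self.replica_catalog[self.file_catalog[lfn]].append(surl)

    def replicas(self, lfn):
        """Resolve LFN -> GUID -> list of SURLs."""
        return list(self.replica_catalog[self.file_catalog[lfn]])

    def select(self, **attrs):
        """Return the LFNs whose metadata matches every given attribute."""
        return sorted(
            lfn
            for lfn, guid in self.file_catalog.items()
            if all(self.metadata_catalog[guid].get(k) == v for k, v in attrs.items())
        )
```

A caller would register a file once with `register("/grid/demo/run1.dat", "srm://se1.example.org/run1.dat", {"run": 1})`, add further copies with `add_replica`, and then use `select(run=1)` to pick logical files by metadata before resolving them to SURLs with `replicas`.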

Services: EGEE Catalogs

SOA: WS-I
Implementation status: prototype

Service Operations

File Catalog operations
  - Directory operations
  - Directory permissions
  - Symbolic links
  - List, find
  - +Bulk ops (upload)
Replica Catalog operations
  - GUID mappings to SURLs
  - ACLs
  - File ‘stat’ like metadata
  - +Bulk ops (upload, delete)
Metadata Catalog operations
  - Query, returning list of LFN/GUID
  - Set metadata based on LFN/GUID
  - Query metadata of LFN/GUID
Combined interface
  - List based on LFN (including replicas, metadata)
  - Add entries just based on LFN (auto entry of GUID, SURL)
  - Permissions based on LFNs
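One way to read "add entries just based on LFN" together with the transaction safety attributed to the combined catalog is an operation that writes to two catalogs and undoes the first write if the second fails. The sketch below is only that reading: `add_entry`, `ReplicaCatalogError`, the dictionaries standing in for the catalogs, and the `srm://` check used to provoke a failure are all hypothetical, and the caller still supplies the SURL here rather than having it generated.

```python
import uuid


class ReplicaCatalogError(Exception):
    """Raised when the (simulated) replica catalog rejects an entry."""


def add_entry(file_catalog, replica_catalog, lfn, surl):
    """Add LFN -> GUID and GUID -> [SURL] as one all-or-nothing step.

    Both catalogs are plain dicts standing in for remote services.
    """
    if lfn in file_catalog:
        raise KeyError(f"LFN already registered: {lfn}")

    guid = str(uuid.uuid4())      # GUID is generated for the caller
    file_catalog[lfn] = guid      # step 1: file catalog entry
    try:
        # step 2: replica catalog entry; the scheme check simulates a failure
        if not surl.startswith("srm://"):
            raise ReplicaCatalogError(f"not an SRM URL: {surl}")
        replica_catalog[guid] = [surl]
    except ReplicaCatalogError:
        del file_catalog[lfn]     # compensate: undo step 1
        raise
    return guid
```

After a failed call neither catalog contains a trace of the entry, so a client that only ever uses the combined interface never sees an LFN without replicas.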

What do you use to build your service? (i.e. How ‘standard’ is your service?)

Widely Implemented Standard Specification (1pt)
  - All services are described through WSDL, WS-I compliant (nontrivial!)
  - X509 extensions used for authorization (VOMS)
Implemented draft Spec (2pt)
  - GSS/GSI for delegation
Implemented draft specification (3pt)
  - --
Implemented proposal (4pt)
  - --
Non-implemented proposal (5pt)
  - --
Concept (6pt)
  - --
TOTAL: 4
Will use: messaging (JMS) first (+?), WS-Transactions, WS-Notification when they are available with lower rankings
Security: Delegation portType (proposal) (+4?)

Service Dependencies

What else does your service depend on (i.e. external dependencies)?
  - RDBMS: need a JNDI connector. Can be anything beneath that, in principle. Implementation currently exists for Oracle, MySQL.
  - Logging: log4j
What does your implementation depend on?
  - Tomcat 4 or 5
  - Java 1.4
  - Axis 1.1
  - Security libs GSI (using CoG + GSI security libs)

AAA & Security

What authentication mechanism do you use?
  - https: SSL/TLS / TrustManager + CoG
What authorisation mechanism do you use?
  - GSI Delegation -- Working on Delegation portType
  - VOMS/AuthzManager
  - Work ongoing on restricted delegation
What accounting mechanism do you use?
  - Logging, R-GMA (see Abdeslem’s talk)
Does service interaction need to be encrypted?
  - No. Still waiting on detailed requirements from users on whether they really need this
If these are not used now, will they be in the future?
  - Plugin-based extensibility planned. GSI over https used today. Extensions should be able to speak whatever people need (WS-Security in particular)

Exploiting the Service Architecture

What features from your ‘plumbing’ do you use in your service?
  - Factory port: no
  - Factory pattern: no
  - Logging: yes
  - Event notification: not yet
  - Meta-data: yes
  - Registry discovery/advertisement: yes
  - Other OGSI/WSRF/WS/WS-GAF characteristics? No, but interested in
      - Messaging
      - Distributed Transactions, Sessions
      - Notifications and Eventing

Service Activity

Multiple interaction or single user?
  - Multiple
Throughput (1 per day or 100 per second?)
  - Many per second.
Typical data volume moved in
  - Lots of small simultaneous single operations
  - Bulk operations with O(>10000) entries, up to O(10^6)
Typical data volume moved out
  - Lots of small single ops (queries for ACL, lookups)
  - Bulk listings used by scheduler

Service Failure

Required Reliability
  - Support for many bulk operation failure policies
      - Fail on any
      - Try as many as possible
      - [implies Policy management]
      - [implies notification for asynch ops]
  - Atomic operation failures ‘straightforward’
Required Persistence
  - State persistence not required, try to be atomic
Required Availability
  - Deployment choice.
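The two bulk failure policies named on this slide can be contrasted in a short Python sketch. The names `FailurePolicy` and `bulk_apply` are hypothetical, and the sketch makes one simplifying assumption that the slide does not state: under "fail on any" it simply stops at the first error and does not roll back entries that already succeeded.

```python
from enum import Enum


class FailurePolicy(Enum):
    FAIL_ON_ANY = "fail on any"
    TRY_ALL = "try as many as possible"


def bulk_apply(operation, entries, policy):
    """Apply `operation` to each entry and report what happened.

    Returns (succeeded, failed), where `failed` holds (entry, error message)
    pairs. With FAIL_ON_ANY the loop stops at the first failure; with TRY_ALL
    it continues and collects every failure for the caller to inspect.
    """
    succeeded, failed = [], []
    for entry in entries:
        try:
            operation(entry)
            succeeded.append(entry)
        except Exception as exc:
            failed.append((entry, str(exc)))
            if policy is FailurePolicy.FAIL_ON_ANY:
                break  # remaining entries are not attempted
    return succeeded, failed
```

Returning the list of failures rather than raising is what makes the "try as many as possible" policy usable for asynchronous bulk uploads, where the client needs a report it can be notified about later.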

Required Service Management

Remote access to:
  - Performance
  - Progress
  - Diagnostic and repair interfaces