The EU DataGrid Architecture The European DataGrid Project Team


1 The EU DataGrid Architecture The European DataGrid Project Team http://www.eu-datagrid.org Peter.Kunszt@cern.ch

2 The EDG Architecture Tutorial: Contents
- Middleware architecture overview
- EDG structure
  - Job scheduling
  - Fabric management
  - Data Management
  - Monitoring
  - Storage
  - Networking
- Summary

3 EDG middleware architecture: Globus hourglass
- Current EDG architectural functional blocks:
  - Basic Services (authentication, authorization, Replica Catalog, secure file transfer, Info Providers) rely on Globus 2.0 (GSI, GRIS/GIIS, GRAM, MDS)
[Diagram labels (layer stack): OS & Net services; Basic Services (GLOBUS 2.0); High level GRID middleware (GRID middleware); LHC VO common application layer, Other apps; Specific application layer: ALICE, ATLAS, CMS, LHCb, Other apps]

4 DataGrid Architecture
[Diagram: layered architecture, from Local Computing through the Grid layers down to the Grid Fabric]
- Local Computing: Local Application, Local Database
- Grid Application Layer: Data Management, Job Management, Metadata Management, Object to File Mapping
- Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
- Underlying Grid Services: Computing Element Services, Authorization Authentication & Accounting, Replica Catalog, Storage Element Services, Database Services
- Fabric services (Grid Fabric): Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management
- Also shown: Logging & Book-keeping

5 EDG middleware architecture: EDG interfaces
[Diagram: the DataGrid layered architecture of slide 4, surrounded by the external entities it interfaces with: Scientists, Application Developers, System Managers, Certificate Authorities, User Accounts, Computing Elements, Batch Systems, Operating System, File Systems, Storage Elements, Mass Storage Systems (HPSS, Castor)]

6 EDG middleware architecture: The Workload Management System (WP1)
- WP1 is responsible for the Workload Management System (WMS). The WMS currently consists of the following parts:
  - User Interface (UI): the user's access point to the GRID (using JDL)
  - Resource Broker (RB): the broker of GRID resources; performs matchmaking
  - Job Submission System (JSS): Condor-G; interfaces to batch systems
  - Information Index (II): an LDAP server used as a filter to select resources
  - Logging and Bookkeeping services (LB): MySQL databases that store job information
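The matchmaking role of the Resource Broker can be illustrated with a minimal Python sketch. This is not the EDG implementation (the RB is written in C++ and takes its input from JDL files and the Information Index); the class names, fields and the ranking policy below are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ComputingElement:
    """A resource as it might be advertised to the broker (hypothetical fields)."""
    name: str
    free_cpus: int
    max_wall_minutes: int
    close_storage: Tuple[str, ...] = ()


@dataclass
class Job:
    """Requirements a user might state in a job description (hypothetical fields)."""
    min_cpus: int
    wall_minutes: int
    input_se: Optional[str] = None


def matchmake(job: Job, resources: List[ComputingElement]) -> List[str]:
    """Return the names of suitable resources, best match first."""
    # Step 1: drop resources that cannot satisfy the stated requirements.
    suitable = [
        ce for ce in resources
        if ce.free_cpus >= job.min_cpus and ce.max_wall_minutes >= job.wall_minutes
    ]

    # Step 2: rank the rest. Assumed policy: prefer resources close to the
    # input data, then resources with more free CPUs.
    def rank(ce: ComputingElement):
        near_data = job.input_se is not None and job.input_se in ce.close_storage
        return (near_data, ce.free_cpus)

    return [ce.name for ce in sorted(suitable, key=rank, reverse=True)]
```

The two-step shape (filter on requirements, then rank) is the general idea of matchmaking; a real broker would also consult the Replica Catalog and network monitoring information listed among the WMS interfaces.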

7 WP1: Work Load Management
- Components: Job Description Language, Resource Broker, Job Submission Service, Information Index, User Interface, Logging & Bookkeeping Service
- Implementation:
  - UI: Python (LB client: C++)
  - RB: C++
  - JSS: C++, Python
  - II: LDAP server
  - LB: MySQL, C++
  - Input/Output Sandboxes: GridFTP
- WMS main interfaces:
  - Globus Gatekeeper
  - WP2 Replica Catalog APIs
  - WP3 Information Systems
  - WP7 network monitoring info providers
  - End User (using JDL files, on the UI)
[Diagram: the DataGrid layered architecture of slide 4, repeated as background]

8 EDG middleware architecture: WP1 (WMS)
[Diagram only; no text on this slide]

9 EDG middleware architecture: WP2 (Data Management)
- WP2 is responsible for Data Management, which includes file and replica management, metadata access and data security. WP2 components:
  - Replica Manager: the main manager for triggering replication across the GRID, including replica optimization and interfacing to the replica catalog service
  - Replica Catalog: a GRID service used to resolve Logical File Names into a set of corresponding Physical File Names (Globus Replica Catalog)
  - GDMP: the GRID Data Mirroring Package, used to create replicas of files of any type across the GRID Storage Elements in a synchronized way, automatically updating the replica catalog
  - Spitfire: provides a Grid-enabled middleware service for access to relational databases; it consists of the Spitfire Server module and the Spitfire Client libraries and command-line executables

10-18 File Management (progressive build over nine slides; slides 17 and 18 are identical)
[Diagram: Site A with Storage Element A and Site B with Storage Element B, connected by File Transfer; file labels shown: File B, File A, File X, File Y and File B, File A, File C, File D]
Services added one per slide around the basic file transfer:
- Replica Catalog: map logical to site files
- Replica Selection: get the 'best' file
- Pre-/Post-processing: prepare files for transfer; validate files after transfer
- Replication Automation: data source subscription
- Load balancing: replicate based on usage
- Replica Manager: 'atomic' replication operation; single client interface; orchestrator
- Metadata: LFN metadata; transaction information; access patterns

19 Current State
- File Transfer: GridFTP (deployed)
  - Close collaboration with Globus
  - NetLogger (Brian Tierney and John Bresnahan)
- Replication: GDMP (deployed)
  - Wrapper around the Globus ReplicaCatalog
  - All functionality in one integrated package
  - Uses Globus 2
  - Uses GridFTP for transferring files
- Replication: edg-replica-manager (deployed)
- Replication: Replica Location Service, Giggle (in testing)
  - Distributed Replica Catalog
- Replication: Replica Manager, Reptor (in testing)
- Optimization: Replica Selection, OptorSim (in simulation)
- Metadata Storage: SQL Database Service, Spitfire (deployed)
  - Servlets on HTTP(S) with XML (XSQL)
  - GSI-enabled access + extensions
- GSI interface to CASTOR (delivered)

20 WP2: Data Management
- Deployed components:
  - GridFTP
  - Replica Manager: edg-replica-manager
  - Replica Catalog: globus-replica-catalog
  - GDMP
  - Spitfire
- Implementation:
  - RM: C++ classes (under development)
  - RC: Globus Replica Catalog wrapper
  - GDMP: C++
  - Spitfire: Java, Web Services
- WP2 main interfaces:
  - The GRID Storage Element
  - WP1 Resource Broker APIs
  - WP3 GRID Info services
  - WP7 network monitoring info providers
  - End User (using GDMP)
[Diagram: the DataGrid layered architecture of slide 4, repeated as background]

21 Example of Data Management by LHCb
- Copy data file to storage element:
    globus-url-copy file:///${chemin}/L69999 gsiftp://lxshare0219.cern.ch/flatfiles/SE1/lhcb/L69999
- Register stored data in the catalog:
    /opt/globus/bin/globus-job-run lxshare0219.cern.ch /bin/bash -c "export GDMP_CONFIG_FILE=/opt/edg/lhcb/etc/gdmp.conf;/opt/edg/bin/gdmp_register_local_file -d /flatfiles/SE1/lhcb"
- Publish catalog:
    /opt/globus/bin/globus-job-run lxshare0219.cern.ch /bin/bash -c "export GDMP_CONFIG_FILE=/opt/edg/lhcb/etc/gdmp.conf; /opt/edg/bin/gdmp_publish_catalogue -n"
- Copy output to MSS:
    rfcp L1600061 /castor/cern.ch/lhcb/mc/L1600061

22 The Replica Manager APIs
[Diagram: a Client talks to a Replica Manager; labels shown: Replica Optimiser, Replica Manager, Replica Catalogue, SE, CE (the Replica Optimiser, Replica Manager, SE and CE appear twice). Legend: physical file transfer; communication]

23 The Replica Manager APIs
- RM.copy(PhysicalFileName source, PhysicalFileName destination, String protocol): Status
  - allows for third-party transfer
  - transfer between:
    - two StorageElements, or
    - a ComputingElement and a StorageElement
  - space management policies are under development

24 The Replica Manager APIs
- RM.add/deletePhysicalFileName(LogicalFileName lfn, PhysicalFileName pfn)
  - Replica Catalogue operations only; no file transfer
- RM.copyAndAddPhysicalFile(PhysicalFileName source, PhysicalFileName destination, LogicalFileName lfn, String protocol): Status
  - third-party transfer, but files can only be registered in the Replica Catalogue if the destination PFN contains a valid SE (i.e. the SE needs to be registered in the RC)
- RM.deletePhysicalFile(LogicalFileName lfn, PhysicalFileName pfn)
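The split between catalogue-only operations and the combined copy-and-register call can be sketched in Python as follows. This is an illustrative model, not the EDG Replica Manager (which the slides describe as C++ classes under development): the class layout, the snake_case method names, the `transfer` callback and the choice to check the destination SE before transferring are all assumptions.

```python
from urllib.parse import urlparse


class ReplicaCatalog:
    """In-memory stand-in for a replica catalogue: LFN -> set of PFNs."""

    def __init__(self, registered_ses):
        self.registered_ses = set(registered_ses)  # hostnames of known SEs
        self._replicas = {}

    def add_physical_file_name(self, lfn, pfn):
        # Catalogue operation only: no file is moved.
        self._replicas.setdefault(lfn, set()).add(pfn)

    def delete_physical_file_name(self, lfn, pfn):
        # Catalogue operation only: the physical file is left untouched.
        self._replicas.get(lfn, set()).discard(pfn)

    def lookup(self, lfn):
        return sorted(self._replicas.get(lfn, ()))


class ReplicaManager:
    def __init__(self, catalog, transfer):
        self.catalog = catalog
        # transfer is a callable(source, destination, protocol) -> bool,
        # standing in for a real file mover such as GridFTP.
        self.transfer = transfer

    def copy(self, source, destination, protocol="gsiftp"):
        return self.transfer(source, destination, protocol)

    def copy_and_add_physical_file(self, source, destination, lfn, protocol="gsiftp"):
        # The destination PFN must name an SE that the catalogue knows about.
        se_host = urlparse(destination).hostname
        if se_host not in self.catalog.registered_ses:
            raise ValueError("destination SE is not registered: %s" % se_host)
        ok = self.copy(source, destination, protocol)
        if ok:
            # Register the new replica only after a successful transfer.
            self.catalog.add_physical_file_name(lfn, destination)
        return ok
```

Registering only after a successful transfer is one way to approximate the 'atomic' replication operation mentioned for the Replica Manager; a production system would also need to roll back a transferred file whose registration fails.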

25 WP2 next generation Replication Services
[Diagram: a Client in front of the Replica Manager, which coordinates these services: Replica Metadata, Replica Location, File Transfer, Optimization, Transaction, Consistency, Preprocessing, Postprocessing, Subscription. Implementation names shown: Reptor, Giggle, RepMeC, Optor, GDMP]

26 Replication Services Architecture
[Diagram labels: User Interface; Resource Broker; Replica Metadata Catalog; Replica Location Index (three instances); two sites, each with a Site Replica Manager, Optimiser, Pre-/Post-processing, Local Replica Catalog, Storage Element and Computing Element; APIs shown on one Replica Manager: Core API, Optimisation API, Processing API]

27 Metadata Management and Security
Project Spitfire
- 'Simple' Grid Persistency
  - Grid Metadata
  - Application Metadata
  - Unified Grid-enabled front end to relational databases
- Metadata Replication and Consistency
- Publish information on the metadata service
Secure Grid Services
- Grid authentication, authorization and access control mechanisms enabled in Spitfire
- Modular design, reusable by other Grid Services

28 Spitfire Architecture
- Atomic RDBMS is always consistent
- No local replication of data
- Role-based authorization
- XSQL Servlet as one access mode for 'simple' web access
- Web/Grid Services Paradigm
  - SOAP interfaces
  - JDBC interface to RDBMS
- Pluggability and extensibility
[Diagram labels: databases Oracle, DB2, PostGres, MySQL; Oracle Layer, DB2 Layer, PG Layer, My Layer; Local Spitfire Layer; Connecting Layer; Global Spitfire Layer; SOAP]

29 WP3: GRID monitoring and Info Providers
- WP3's task is to provide information about:
  - The Grid itself: this includes information about resources (ComputingElements, StorageElements and the Network), for which the Globus MDS is a common solution, and job status information (as implemented by WP1's Logging and Bookkeeping).
  - Grid applications: this is information published by user jobs. It is used for performance monitoring.

30 WP3: GRID monitoring and Info Providers
- Main WP3 components:
  - MDS v2.1: the Globus Monitoring and Discovery Services, based on Soft State Registration protocols and LDAP aggregate directory services
  - Ftree: EDG-developed directory service based on OpenLDAP plus caching, built to address shortcomings in MDS v1 and to optimize data access performance
  - R-GMA: Relational GMA, an implementation of the Grid Monitoring Architecture (Consumers, Producers and Directory Services; GGF) that makes information from producers available to consumers as relations (tables). It also uses relations to handle the registration of producers. R-GMA is consistent with GMA principles.
  - GRM / PROVE: application monitoring and visualization tools of the P-GRADE graphical parallel programming environment, modified for application monitoring in the DataGrid. The instrumentation library of GRM is generalized for a flexible trace event specification. The components of GRM will be connected to R-GMA using its Producer and Consumer APIs.

31 R-GMA
- Uses the GMA from GGF
- A relational implementation
- Applied to both information and monitoring
- Creates the impression that you have one RDBMS per VO
[Diagram: Producer, Consumer and Registry; labelled interactions: subscribe, lookup]

32 Relational Approach
- Producers
  - announce: SQL "CREATE TABLE"
  - publish: SQL "INSERT"
- Consumers
  - collect: SQL "SELECT"
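The three verbs map directly onto ordinary SQL, which the following Python sketch shows using the standard-library sqlite3 module as a stand-in. R-GMA itself is not SQLite: producers and consumers talk to servlets, and the data stay distributed. The function names are illustrative, and the cpuLoad table and its rows are borrowed from the CPULoad example used later in the deck.

```python
import sqlite3


def announce(conn):
    """Producer announces the table it will publish into (CREATE TABLE)."""
    conn.execute(
        "CREATE TABLE cpuLoad ("
        "country TEXT, site TEXT, facility TEXT, load REAL, timestamp TEXT)"
    )


def publish(conn, country, site, facility, load, timestamp):
    """Producer publishes one measurement (INSERT)."""
    conn.execute(
        "INSERT INTO cpuLoad VALUES (?, ?, ?, ?, ?)",
        (country, site, facility, load, timestamp),
    )


def collect(conn, country):
    """Consumer collects the measurements it is interested in (SELECT)."""
    cursor = conn.execute(
        "SELECT site, facility, load FROM cpuLoad WHERE country = ? ORDER BY load",
        (country,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")  # a single local database, for illustration only
announce(conn)
publish(conn, "UK", "RAL", "CDF", 0.3, "19055711022002")
publish(conn, "UK", "RAL", "ATLAS", 1.6, "19055611022002")
publish(conn, "CH", "CERN", "CDF", 0.6, "19055511022002")
print(collect(conn, "UK"))
```

A single in-memory database is the simplification here: it plays the role of the "one RDBMS per VO" that R-GMA presents to its users, without modelling how that impression is produced.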

33 R-GMA
- API to Servlet communication
  - http(s) in
  - XML back
[Diagram labels: Sensor Code, Producer API, Application Code, Consumer API, Producer Servlet, Consumer Servlet, Registry API (twice), Registry Servlet, Schema API, Schema Servlet]

34 Schema & Contributions

CPULoad (Global Schema)
Country  Site  Facility  Load  Timestamp
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
CH       CERN  ALICE     0.9   19055611022002
CH       CERN  CDF       0.6   19055511022002

CPULoad (Producer 1)
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2)
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002

CPULoad (Producer 3)
CH       CERN  ATLAS     1.6   19055611022002
CH       CERN  CDF       0.6   19055511022002

35 Contributions are Views

CPULoad (Producer 1): SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'RAL'
UK       RAL   CDF       0.3   19055711022002
UK       RAL   ATLAS     1.6   19055611022002

CPULoad (Producer 2): SELECT * FROM cpuLoad WHERE country = 'UK' AND site = 'GLA'
UK       GLA   CDF       0.4   19055811022002
UK       GLA   ALICE     0.5   19055611022002
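The idea that each producer contributes a view of one global table can be modelled in a few lines of Python. This is a conceptual sketch only: the Producer and Registry classes and the select helper are hypothetical names, not the R-GMA APIs, and the query language is reduced to equality conditions.

```python
class Producer:
    """Holds the rows matching a fixed predicate, e.g. country='UK', site='RAL'."""

    def __init__(self, **view):
        self.view = view  # the WHERE clause this producer's contribution satisfies
        self.rows = []

    def insert(self, facility, load, timestamp):
        # Every published row carries the producer's fixed column values.
        self.rows.append(dict(self.view, facility=facility, load=load, timestamp=timestamp))


class Registry:
    """Knows which producers exist and what view each one contributes."""

    def __init__(self):
        self.producers = []

    def register(self, producer):
        self.producers.append(producer)

    def lookup(self, **where):
        # A producer is relevant unless its view contradicts the query.
        return [
            p for p in self.producers
            if all(p.view.get(col, val) == val for col, val in where.items())
        ]


def select(registry, **where):
    """Consumer side: answer a query over the global table from relevant producers."""
    rows = []
    for producer in registry.lookup(**where):
        rows.extend(r for r in producer.rows if all(r[c] == v for c, v in where.items()))
    return rows


registry = Registry()
ral = Producer(country="UK", site="RAL")
gla = Producer(country="UK", site="GLA")
registry.register(ral)
registry.register(gla)
ral.insert("CDF", 0.3, "19055711022002")
ral.insert("ATLAS", 1.6, "19055611022002")
gla.insert("CDF", 0.4, "19055811022002")
gla.insert("ALICE", 0.5, "19055611022002")
```

With this setup, `select(registry, country="UK", site="GLA")` consults only the GLA producer, while `select(registry, facility="CDF")` has to ask both, because neither view constrains the facility column.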

36 WP3: GRID Monitoring
- Components: MDS / FTree, R-GMA, GRM/PROVE
- Implementation:
  - MDS: LDAP, Globus GRIS, GIIS
  - FTree: OpenLDAP, caching
  - R-GMA: Java, C++, MySQL, Tomcat
  - GRM / PROVE: P-GRADE
- WP3 main interfaces:
  - WP1 Resource Broker (InfoIndex)
  - WP2 RM optimizer
  - all GRID services producing info (SE, CE, ...)
  - WP7 network monitoring
[Diagram: the DataGrid layered architecture of slide 4, repeated as background]

37 EDG middleware architecture: WP4: Fabric Management Components
- WP4 is responsible for delivering a computing fabric comprising all the tools needed to manage a center providing grid services on clusters of thousands of nodes. The computing fabric is called the Computing Element in EDG.
- User Job Control and Management (Grid and local jobs) on fabric batch and/or interactive CPU services
  - Gridification: Grid interface to fabric resources
  - Resource Management: manages underlying batch services
- Automated System Administration for Computing Fabric Elements. These subsystems are reserved for system administrators and operators performing system maintenance
  - Configuration Management
  - Installation Management
  - Fabric Monitoring

38-43 WP4 Architecture logical overview (progressive build over six slides)
[Diagram: Farm A (LSF) and Farm B (PBS) with mass storage and disk pools, used by Grid Users and Local Users. WP4 subsystems: Fabric Gridification, Resource Management, Installation & Node Mgmt, Configuration Management, Monitoring & Fault Tolerance. Other WPs: Grid Info Services (WP3), Resource Broker (WP1), Data Mgmt (WP2), Grid Data Storage (WP5)]
Slides 39 to 43 each annotate one subsystem:
- Fabric Gridification (slide 39)
  - Interface between Grid-wide services and the local fabric
  - Provides local authentication, authorization and mapping of grid credentials
- Resource Management (slide 40)
  - Provides transparent access (both job and admin) to different cluster batch systems
  - Enhanced capabilities (extended scheduling policies, advanced reservation, local accounting)
- Installation & Node Mgmt (slide 41)
  - Provides the tools to install and manage all software running on the fabric nodes
  - Agent to install, upgrade, remove and configure software packages on the nodes
  - Bootstrap services and software repositories
- Configuration Management (slide 42)
  - Provides central storage and management of all fabric configuration information
  - Compiles HLD templates to LLD node profiles
  - Central DB and set of protocols and APIs to store and retrieve information
- Monitoring & Fault Tolerance (slide 43)
  - Provides the tools for gathering monitoring information on fabric nodes
  - Central measurement repository stores all monitoring information
  - Fault tolerance correlation engines detect failures and trigger recovery actions

44-49 User job management (Grid and local) (progressive build over six slides)
[Diagram: Farm A (LSF) and Farm B (PBS) with mass storage and disk pools, Grid User and Local User. WP4 subsystems shown: Monitoring, Fabric Gridification, Resource Management. Other WPs: Grid Info Services (WP3), Resource Broker (WP1), Data Mgmt (WP2), Grid Data Storage (WP5)]
Annotations added one slide at a time:
- Slide 45: submit job
- Slide 46: publish resource and accounting information
- Slide 47: optimized selection of site
- Slide 48: authorize; map grid to local credentials
- Slide 49: select an optimal batch queue and submit; return job status and output

50-58 Automated management of large clusters (progressive build over nine slides)
[Diagram: Farm A (LSF) and Farm B (PBS). WP4 subsystems: Installation & Node Mgmt, Configuration Management, Monitoring & Fault Tolerance, Resource Management. Arrow legend: Information, Invocation. Other WPs shown alongside]
Annotations added one slide at a time:
- Slide 51: node malfunction detected
- Slide 52: remove node from queue; wait for running jobs(?)
- Slide 53: update configuration templates
- Slide 54: trigger repair
- Slide 55: repair (e.g. restart, reboot, reconfigure, ...)
- Slide 56: node OK detected
- Slide 57: put node back in queue
- Slide 58: Automation
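The recovery cycle walked through on these slides can be written down as a small Python sketch. It is a toy model under stated assumptions: `recover_node`, the `repair` and `is_healthy` callbacks and the set used as a batch queue are hypothetical, and the "wait for running jobs" step, which the slides themselves mark with a question mark, is left out.

```python
def recover_node(node, queue, repair, is_healthy):
    """Run one recovery cycle for a node and return the steps taken, in order.

    queue      -- set of node names currently accepting batch jobs
    repair     -- callable(node) performing the repair action
    is_healthy -- callable(node) -> bool, standing in for fabric monitoring
    """
    steps = ["node malfunction detected"]

    # Take the node out of the batch queue so no new jobs land on it.
    queue.discard(node)
    steps.append("remove node from queue")

    # Placeholder: a real system would change the node's configuration here.
    steps.append("update configuration templates")

    steps.append("trigger repair")
    repair(node)  # e.g. restart, reboot or reconfigure
    steps.append("repair")

    # Only a node that monitoring reports as healthy goes back into service.
    if is_healthy(node):
        steps.append("node OK detected")
        queue.add(node)
        steps.append("put node back in queue")
    return steps


health = {"node07": False}
batch_queue = {"node06", "node07"}


def reboot(node):
    # Pretend the reboot fixes the problem.
    health[node] = True


for step in recover_node("node07", batch_queue, reboot, lambda n: health[n]):
    print(step)
```

Keeping the health check separate from the repair action means a failed repair leaves the node out of the queue rather than returning a broken node to service.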

59 LCFG (Local ConFiGuration system)
- Widely used fabric tool whose purpose is to handle automated installation and configuration in a very diverse and evolving environment
- Mechanism:
  - Abstract configuration parameters are stored in a central repository located on the LCFG server.
  - Scripts on the host machine (LCFG client) read these configuration parameters and either generate traditional configuration files or directly manipulate various services.
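The mechanism can be illustrated with a short Python sketch in which a client-side script turns abstract parameters into a traditional configuration file. It is not LCFG code: the repository layout, the parameter names (`ntp.servers`, `ntp.driftfile`), the host name and both functions are invented for the example, and the repository is a plain dictionary rather than a server.

```python
# Hypothetical central repository: host name -> abstract configuration parameters.
REPOSITORY = {
    "node01.example.org": {
        "ntp.servers": ["ntp1.example.org", "ntp2.example.org"],
        "ntp.driftfile": "/var/lib/ntp/drift",
    },
}


def fetch_profile(repository, host):
    """Client side: read this host's parameters from the central repository."""
    return repository[host]


def render_ntp_conf(profile):
    """Generate a traditional ntp.conf from the abstract parameters."""
    lines = ["server {}".format(server) for server in profile["ntp.servers"]]
    lines.append("driftfile {}".format(profile["ntp.driftfile"]))
    return "\n".join(lines) + "\n"


profile = fetch_profile(REPOSITORY, "node01.example.org")
print(render_ntp_conf(profile), end="")
```

The point of the split is that the repository holds only abstract values, so the same parameters could drive a different generator, or be applied to a running service directly, without changing what is stored centrally.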

60 WP4: Fabric Management
- Components: LCFG; Fabric Monitoring; PBS & LSF info providers; Image installation; Config. Cache Mgr
- Implementation:
  - LCFG: C++, XML, HTTP
- WP4 main interfaces:
  - WP1 Resource Broker (InfoIndex)
  - WP2 Data management
  - WP5 Storage Element
  - WP3 GRID Info Services
[Diagram: the DataGrid layered architecture of slide 4, repeated as background]

61 WP5: Mass Storage Management
- WP5 delivers the Grid interface to Storage.
- Its service, the Storage Element (SE), interfaces to underlying Mass Storage Systems or simple storage services.

62 The SE architecture
[Diagram: the Storage Element in three layers, with its clients above]
- Clients: RB, JSS, RM, GDMP, InfoServices (WP3), user applications running on CEs, CLIs
- Top layer: Interface 1, Interface 2, Interface 3
- Core: Message Queue, Session Manager, System Log, House Keeping, MetaData
- Bottom layer: MSS Interfaces (two), connecting to MSS1 and MSS2

63 SE Interactions
[Diagram: Client, SE, Replica Manager/Catalog and Storage, connected by arrows numbered 1 to 6]
1. The client asks a catalog to provide the location of a file
2. The catalog responds with the name of an SE
3. The client asks the SE for the file
4. The SE asks the storage system to provide the file
5. The storage system sends the file to the client through the SE, or
6. directly
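The numbered interaction can be traced with a minimal Python sketch. All class and function names are hypothetical, the "transfer" is just a dictionary lookup, and only the path through the SE (step 5) is modelled; the direct path of step 6 is omitted.

```python
class Catalog:
    """Maps a file name to the name of the SE that holds it."""

    def __init__(self, locations):
        self.locations = locations

    def locate(self, name):
        return self.locations[name]


class StorageSystem:
    """Stand-in for a mass storage system or a simple disk store."""

    def __init__(self, files):
        self.files = files

    def provide(self, name):
        return self.files[name]


class StorageElement:
    def __init__(self, storage):
        self.storage = storage

    def get(self, name, trace):
        trace.append("4: SE asks the storage system to provide the file")
        data = self.storage.provide(name)
        trace.append("5: storage system sends the file to the client through the SE")
        return data


def fetch(name, catalog, storage_elements):
    """Client side: locate the file, then request it from the named SE."""
    trace = ["1: client asks the catalog for the location of the file"]
    se_name = catalog.locate(name)
    trace.append("2: catalog responds with the name of an SE: " + se_name)
    trace.append("3: client asks the SE for the file")
    data = storage_elements[se_name].get(name, trace)
    return data, trace


catalog = Catalog({"run42.dat": "se-cern"})
elements = {"se-cern": StorageElement(StorageSystem({"run42.dat": b"events"}))}
data, trace = fetch("run42.dat", catalog, elements)
print("\n".join(trace))
```

The client never addresses the storage system itself in this path, which is what lets one SE interface front different back ends.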

64 WP5: Mass Storage Management
- Achievements
  - Definition of Architecture and Design for the DataGrid Storage Element
  - Collaboration with Globus on GridFTP/RFIO
  - Collaboration with PPDG on control API
  - Staging from/to CASTOR at CERN successfully implemented and tested
  - Successfully interfaced to GDMP
- Supported Storage Systems:
  - UNIX disk systems
  - HPSS (High Performance Storage System)
  - CASTOR (through RFIO)
  - GridFTP servers
  - DMF
  - Enstore
- WP5 (SE) main interfaces:
  - WP1 Resource Broker & JSS
  - WP2 RM, RC
  - WP7 for GridFTP monitoring
  - WP3 GRID Info Services
[Diagram: the DataGrid layered architecture of slide 4, repeated as background]

65 WP6: TestBed Integration and demonstrators
- WP6 goals: the EDG testbed
  - Integration of EDG software releases (currently 1.2) and deployment across the EDG testbed: the integration team
  - Working implementation of multiple VOs and basic security infrastructure
  - Definition of acceptable usage contracts and creation of the Certification Authorities group
  - Setup of the Authorization Working Group to manage authorization policies on the testbed
- Components:
  - Support for test-VO, mkgridmap tools
  - Globus packaging & EDG config
  - Build tools, CVS central software repository
  - End-user documents
[Diagram: the DataGrid layered architecture of slide 4, repeated as background]

66 Further Information
- DataGrid Dx.2 Deliverables: x=1..5
- DataGrid D12.4 Deliverable

