13th EELA Tutorial, La Antigua, 18-19, October E-infrastructure shared between Europe and Latin America FP6−2004−Infrastructures−6-SSA gLite Overview Richard Miguel San Martin SENAMHI - PERU La Antigua, 18 – 19 October 2007
Disclaimer
This presentation is based on materials provided and authorized by the EGEE project and EELA project. It is freely available to download and use.
Overview
–Introduction
–Grid Concepts
–gLite
–Architecture by services
–Architecture by machines
–Life of Jobs
Introduction
gLite is a complex system composed of various packages installed on different machines. The packages interact with each other, and each plays a different role in the chain of grid activity.
Because of its modularity and scalability, it can be deployed and configured in a wide variety of ways. The last link of its chain is a Local Batch System such as PBS/TORQUE-MAUI, LSF or Condor.
Job Workflow in gLite
[Figure: job workflow diagram. Components: UI, Resource Broker, Job Submission Service, Computing Element, Storage Element, Information Service, LFC Catalog, Logging & Book-keeping, Author. & Authen. Labelled flows: voms-proxy-init, JDL, Job Submit Event, Input “sandbox”, Expanded JDL, Input “sandbox” + Broker Info, Globus RSL, SE & CE info, DataSets info, Publish, Job Query, Job Status, Output “sandbox”.]
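The JDL that the UI hands over in this workflow is a plain list of attribute = value statements. As a minimal sketch, the Python helper below renders such a description from a dictionary. The names `Expr` and `render_jdl` are hypothetical helpers written for this example, not part of gLite; the attribute names are common JDL attributes, and the file names are made-up examples.

```python
class Expr:
    """Marks a value as a raw JDL expression that must not be quoted."""
    def __init__(self, text):
        self.text = text


def _fmt(value):
    # Expressions are emitted as-is, lists become {a, b}, strings are quoted.
    if isinstance(value, Expr):
        return value.text
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(_fmt(v) for v in value) + "}"
    if isinstance(value, str):
        return '"' + value + '"'
    return str(value)


def render_jdl(attrs):
    """Render a dict as JDL text, one 'Name = value;' statement per line."""
    return "\n".join(f"{name} = {_fmt(value)};" for name, value in attrs.items())


# Hypothetical job: run /bin/hostname and bring back its output files.
job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
    "Rank": Expr("-other.GlueCEStateEstimatedResponseTime"),
}
jdl_text = render_jdl(job)
print(jdl_text)
```

The resulting text would be saved to a file on the UI and submitted after `voms-proxy-init`; the submission command itself is outside this sketch.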
gLite
gLite is the next generation middleware for grid computing. It was born from the collaborative efforts of academic and industrial research centers as part of the EGEE Project.
The gLite Grid services follow a Service Oriented Architecture
–facilitates interoperability among Grid services
–allows easier compliance with upcoming standards
The architecture is not bound to specific implementations
–services are expected to work together
–services can be deployed and used independently
The gLite service decomposition has been largely influenced by the work performed in the LCG project
Building on GSI
Build on the Grid Security Infrastructure to create services that include:
–Job submission: run a job on a remote computer
–Information services: so I know which computer to use
–File transfer: so large data files can be transferred
–Replica management: so I can have multiple copies of a file “close” to the computers where I want to run jobs
Production grids are (currently) based on the Globus Toolkit release 2
Globus Alliance:
gLite - Middleware
Many VOs need sharing of resources through services
–Accessing
–Allocating
–Monitoring
–Accounting
gLite – Lightweight Middleware for Grid Computing
gLite – Service Decomposition
5 High level services + CLI & API
gLite – Security Services
gLite – Security Services
Authentication
Authentication is based on X.509 PKI
–Certificate Authorities (CA) issue (long lived) certificates identifying individuals (much like a passport)
–Commonly used in web browsers to authenticate to sites
–Trust between CAs and sites is established (offline)
In order to reduce vulnerability, user identification on the Grid is done by using (short lived) proxies of their certificates
Proxies can
–Be delegated to a service such that it can act on the user’s behalf
–Include additional attributes (like VO information via the VO Membership Service VOMS)
–Be stored in an external proxy store (MyProxy)
–Be renewed (in case they are about to expire)
gLite – Security Services
Authentication - Authorization
Authentication
–User receives certificate signed by CA
–Connects to “UI” by ssh
–Downloads certificate
–Single logon to Grid: create proxy, then the Grid Security Infrastructure identifies the user to other machines
Authorisation
–User joins Virtual Organisation
–VO negotiates access to Grid nodes and resources
–Authorisation tested by CE
–gridmapfile maps user to local account
[Figure labels: UI, AUP, VO mgr, Personal/once, VO database, GSI, VO service]
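The last authorisation step, mapping a certificate subject to a local account, can be sketched in a few lines of Python. This assumes the usual grid-mapfile layout of one quoted subject DN followed by an account name per line; the DNs and account names below are invented for illustration, and `parse_gridmap` is a hypothetical helper, not a gLite tool.

```python
import shlex


def parse_gridmap(text):
    """Return {certificate subject DN: local account} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex keeps the quoted DN (which contains spaces) as one token.
        # A line without exactly two tokens raises ValueError in this sketch.
        dn, account = shlex.split(line)
        mapping[dn] = account
    return mapping


# Hypothetical entries, for illustration only.
SAMPLE = '''
# subject DN                       local account
"/C=PE/O=EXAMPLE/CN=Ana Perez" eela001
"/C=ES/O=EXAMPLE/CN=Luis Gomez" eela002
'''

accounts = parse_gridmap(SAMPLE)
print(accounts["/C=PE/O=EXAMPLE/CN=Ana Perez"])
```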
gLite – Security Services
Auditing, Delegation, Sandboxing
–Auditing: monitoring and post-mortem analysis of security related events. In computational grids it goes hand in hand with accounting. Who did what? Where and when?
–Delegation: privileges are delegated to other entities by means of Proxy Certificates. This is the mechanism most widely adopted by Grid communities. (Also: Single Sign-On, Dynamic entity identification).
–Sandboxing: Grid applications need their assigned resources to be isolated transparently by the security services: AuthN and AuthZ. (Virtualisation).
gLite – Grid Access
Two possibilities: APIs and CLI. The use of web-services allows the automatic generation of APIs.
gLite – Information and Monitoring Services
Information services are a vital low-level component of Grids.
gLite – Information and Monitoring Services
Basic info and monitoring services (RGMA)
Information is provided by a Publish and Consume mechanism. It appears as a single federated database that can be queried with SQL. Each VO has a VDB.
–Schema: contains tables (GLUE)
–Registry: list of available sources of information (Mediation)
–Producers: sources of information (Primary, Secondary, On-demand)
–Consumers: make queries against tables (Continuous, Latest, History)
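The publish and consume idea can be illustrated with an in-memory SQLite database standing in for the federated view. This is only a sketch of the pattern, not the R-GMA API: the table `ce_status`, its columns and the host names are assumptions made up for the example and do not follow the GLUE schema.

```python
import sqlite3

# One in-memory table plays the role of the virtual database.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE ce_status (ce TEXT, free_cpus INTEGER, measured_at INTEGER)"
)


def publish(ce, free_cpus, measured_at):
    """Producer side: insert one tuple into the table."""
    db.execute(
        "INSERT INTO ce_status VALUES (?, ?, ?)", (ce, free_cpus, measured_at)
    )


def latest():
    """Consumer 'Latest' query: the most recent tuple for each CE."""
    return db.execute(
        "SELECT s.ce, s.free_cpus FROM ce_status AS s "
        "WHERE s.measured_at = "
        "(SELECT MAX(t.measured_at) FROM ce_status AS t WHERE t.ce = s.ce) "
        "ORDER BY s.ce"
    ).fetchall()


def history(ce):
    """Consumer 'History' query: every tuple published for one CE, oldest first."""
    return db.execute(
        "SELECT free_cpus FROM ce_status WHERE ce = ? ORDER BY measured_at",
        (ce,),
    ).fetchall()


publish("ce01.example.org", 4, 100)
publish("ce02.example.org", 0, 100)
publish("ce01.example.org", 7, 160)
print(latest())
print(history("ce01.example.org"))
```

A Continuous query, which streams tuples to the consumer as they are published, has no direct SQLite equivalent and is left out of the sketch.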
gLite – Information and Monitoring Services
Job Monitoring, Service Discovery, Network Performance Monitoring
–Job Monitoring: Java logging service, log4j, Apache/Chainsaw (for other languages).
–Service Discovery: locates suitable services for both users and services.
–Network Performance Monitoring: many network monitoring frameworks exist. Aim: provide a standard interface to those frameworks.
gLite – Job Management Services
gLite – Job Management Services
Accounting
–Accumulates information about resource usage by users or groups of users (VOs).
–Information on Grid Services/Resources needs sensors (Resource Metering, Metering Abstraction Layer, Usage Records).
–Records are collected by the Accounting System (Queries: Users, Groups, Resource)
–Grid services should register themselves with a pricing service when accounting is used for billing purposes.
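The queries by user or by group amount to summing usage records over one of their fields. The sketch below shows that aggregation in Python; the record layout (`user`, `vo`, `cpu_hours`) and the function name `cpu_hours_by` are assumptions chosen for the example, not the format used by the gLite accounting system.

```python
from collections import defaultdict


def cpu_hours_by(records, key):
    """Sum the cpu_hours of the usage records, grouped by the given field."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cpu_hours"]
    return dict(totals)


# Hypothetical usage records as a sensor might report them.
records = [
    {"user": "ana", "vo": "eela", "cpu_hours": 1.5},
    {"user": "luis", "vo": "eela", "cpu_hours": 2.5},
    {"user": "ana", "vo": "biomed", "cpu_hours": 3.0},
]

by_vo = cpu_hours_by(records, "vo")
by_user = cpu_hours_by(records, "user")
print(by_vo)
print(by_user)
```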
gLite – Job Management Services
Computing Element
–Service that represents a computing resource and is responsible for job management (submission, control, etc.)
–A CE refers to a set or cluster of computational resources (WNs) managed by an LRMS, to which jobs matching user requests are dispatched.
–Two job submission models (according to user requests and site policies): PUSH (jobs are pushed to the CE queue) and PULL (jobs come from the WMS when the CE queue is empty)
–The CE is responsible for collecting accounting information.
Computing Element (CE)
[Figure legend]
–CEA: Computing Element Acceptance
–JC: Job Controller
–MON: Monitoring
–LRMS: Local Resource Management System
gLite – Job Management Services
Workload Management
The WMS is the set of middleware components responsible for the distribution and management of jobs across Grid resources.
Two core components of the WMS:
–WM: accepts and satisfies requests for job management. Matchmaking is the process of assigning the best available resource to a job.
–L&B: keeps track of job execution in terms of events (Submitted, Running, Done, ...)
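Matchmaking can be pictured as a filter followed by a ranking. The Python sketch below is a deliberately simplified illustration, not the WMS algorithm: the resource fields (`vos`, `free_cpus`, `est_response_s`), the CE names and the rule "lowest estimated response time wins" are assumptions made for the example.

```python
def matchmake(job, resources):
    """Return the name of the best matching CE, or None if nothing matches."""
    # Step 1: keep only the CEs that satisfy the job requirements.
    candidates = [
        r for r in resources
        if job["vo"] in r["vos"] and r["free_cpus"] >= job["cpus"]
    ]
    if not candidates:
        return None
    # Step 2: rank the survivors; here the shortest estimated wait is best.
    best = min(candidates, key=lambda r: r["est_response_s"])
    return best["name"]


# Hypothetical snapshot of what the information system might report.
resources = [
    {"name": "ce01", "vos": {"eela"}, "free_cpus": 8, "est_response_s": 120},
    {"name": "ce02", "vos": {"eela", "biomed"}, "free_cpus": 2, "est_response_s": 30},
    {"name": "ce03", "vos": {"biomed"}, "free_cpus": 16, "est_response_s": 5},
]

small_job = {"vo": "eela", "cpus": 1}
big_job = {"vo": "eela", "cpus": 4}
print(matchmake(small_job, resources))
print(matchmake(big_job, resources))
```

In this example ce03 is never chosen for an eela job even though it has the shortest wait, because it fails the requirement filter before ranking takes place.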
gLite – Job Management Services
Job Provenance, Package Manager
–Job Provenance (JP): keeps track of submitted jobs for long periods (months, years).
–Package Manager: helper service to automate installing, configuring, updating and removing software components. (RPM, dpkg/APT, Portage, …)
gLite – Data Services
gLite – Data Services
Storage Element
The services needed are at least:
–Storage back-end (Drivers and Hardware)
–SRM Interface (Storage Specific)
–Transfer service (GridFTP)
–Native POSIX-like file I/O API (gLite-I/O)
–Auxiliary Accounting and Logging services
gLite – Data Services
Data Movement
–Data Scheduler (DS): keeps track of user/service transfer requests
–File Transfer/Placement Service (FTS/FPS)
–Transfer Queue (Table)
–Transfer Agent (Network)
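The split between a transfer queue (a table of requests) and a transfer agent (the component that touches the network) can be sketched as follows. This is a toy model under stated assumptions: the state names, the row fields and the `run_agent` function are invented for illustration, and the network copy is replaced by a stand-in function.

```python
def run_agent(queue, copy):
    """Process each submitted row once; copy(source, dest) does the transfer."""
    for row in queue:
        if row["state"] != "Submitted":
            continue  # already handled by an earlier pass
        row["state"] = "Active"
        try:
            copy(row["source"], row["dest"])
            row["state"] = "Done"
        except OSError as err:
            # Record the failure so the requester can query it later.
            row["state"] = "Failed"
            row["reason"] = str(err)


def fake_copy(source, dest):
    """Stand-in for the network transfer; fails for one made-up file name."""
    if source.endswith("missing"):
        raise OSError("source file not found")


# Hypothetical queue contents; the URLs are placeholders.
queue = [
    {"source": "srm://se01.example.org/data/a",
     "dest": "srm://se02.example.org/data/a", "state": "Submitted"},
    {"source": "srm://se01.example.org/data/missing",
     "dest": "srm://se02.example.org/data/missing", "state": "Submitted"},
]

run_agent(queue, fake_copy)
print([row["state"] for row in queue])
```

Keeping the queue as plain data means a request's status can be queried without involving the agent, which is the point of separating the two.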
gLite – Helper Services
–Configuration and Instrumentation Service: queries service state.
–Agreement Service: implements a communication protocol for SLAs.
–Bandwidth Allocation & Reservation service (BAR): controls, balances and manages network flows.
VOMS
Virtual Organization Membership Service
–Multiple VOs
–Multiple roles in a VO
  Compatible X.509 extensions
  Signed by the VOMS server
–Web admin interface
–Supports MyProxy
–Resource providers grant access to VOs or roles
–Sites map VO members/roles to a local auth mechanism (Unix user accounts)
  Allows for local policy
MyProxy
–Allows longer lived jobs / increases security
  The WMS renews the proxy
  Users should not produce long-lived proxies
–Allows for secure user mobility
  Users do not need to copy globus-keys around
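The renewal decision itself is simple: renew when the remaining proxy lifetime falls below some safety margin. The Python sketch below shows only that check. The function name `needs_renewal` and the 30 minute default margin are assumptions for illustration; the actual renewal policy of the WMS is not described in these slides.

```python
from datetime import datetime, timedelta


def needs_renewal(expires_at, now, margin=timedelta(minutes=30)):
    """True when the proxy expires within the margin (or has already expired)."""
    return expires_at - now <= margin


# Hypothetical timestamps for a job running during the tutorial.
now = datetime(2007, 10, 18, 12, 0)
nearly_expired = datetime(2007, 10, 18, 12, 20)  # 20 minutes left
still_fresh = datetime(2007, 10, 19, 0, 0)       # 12 hours left

print(needs_renewal(nearly_expired, now))
print(needs_renewal(still_fresh, now))
```

Because a renewing service can run this check repeatedly for the life of the job, users can keep each proxy short lived and still run jobs that outlast it.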
Architecture
–User Interface (UI)
–Workload management system (WMS)
–Logging and bookkeeping service (LB)
–Virtual Organization Membership Service (VOMS)
–Information system (BDII, RGMA?), monitoring (MON)
–Computing element (CE) and worker nodes (WN)
–Storage element (SE) and File catalogue (LFC)
Components
–The User Interface (UI) is the package on the user's machine. It is the submission entry point of the system, and it is considered a part of the WMS.
–The Workload Management System (WMS) is a set of components whose task is to match the resources requested by the user's job with the ones available on the Computing Elements, in order to find the machine where the job will be executed.
–The Computing Element (CE) is the entry point to the Local Batch System of the resources. It can be an executing machine itself or simply the entry point for the local cluster managed by a batch system (PBS, LSF, Condor).
–The Worker Nodes (WN) are the machines on which the job is actually executed. They are linked with the CE through a local batch system, to which the jobs are ultimately submitted.
Components
–The Information System and Monitoring (IS and MON) keep data about the available resources and the status of the system.
–The Logging and bookkeeping service (LB) keeps track of the events that happen to the jobs (and can be very useful to us bench makers!).
–The Virtual Organization Membership Service (VOMS) handles Authentication and Authorization.
–The Storage element (SE) and File catalogue (LFC) manage large file transfers and make the files that jobs need easier to reach.
Life of a Job
Questions