Slide 1: The LHC Grid
Peter Kacsuk, MTA SZTAKI
EGEE is funded by the European Union under contract IST-2003-508833.
Slide 2: Acknowledgement
This tutorial is based on the work of many people:
- Fabrizio Gagliardi, Flavia Donno and Peter Kunszt (CERN)
- the EDG developer team
- the EDG training team
- the NeSC training team
- the SZTAKI training team
Slide 3: What is the LHC Grid?
- LHC stands for Large Hadron Collider, to be built by CERN: http://lhc-new-homepage.web.cern.ch/lhc-new-homepage/
- The LHC will be put into operation in 2007, with many experiments collecting 5-6 PetaBytes of data per year
- The LHC Grid was built by CERN to provide the storage and computing capacity needed to process this huge data set
- The current version of the LHC Grid is called LCG-2
- It was built on software developed by the European DataGrid (EDG) project and by the US GriPhyN project
- LCG-2 is now the first EGEE infrastructure
Slide 4: What is the LHC Grid?
The first EGEE infrastructure is the largest functioning Grid:
- more than 70 sites
- over 5,000 CPUs
- 4,000 TB of storage
- 5,000 jobs running simultaneously
Slide 5: What is EGEE? (I)
EGEE (Enabling Grids for E-science in Europe) is a seamless Grid infrastructure for the support of scientific research, which:
- integrates current national, regional and thematic Grid efforts, especially in HEP (High Energy Physics)
- provides researchers in academia and industry with round-the-clock access to major computing resources, independent of geographic location
[Diagram: the layering of applications, Grid infrastructure and the GEANT network.]
Slide 6: What is EGEE? (II)
- 70 leading institutions in 27 countries, federated in regional Grids
- 32 M euros of EU funding (2004-5), O(100 M) total budget
- Aiming for a combined capacity of over 20,000 CPUs (the largest international Grid infrastructure ever assembled)
- ~300 dedicated staff
Slide 7: EGEE Community
Slide 8: EGEE infrastructure
- Access to networking services provided by GEANT and the NRENs
- Production Service: in place (based on HEP LCG-2) for production applications; it MUST run reliably, so it runs only proven, stable, debugged middleware and services; new sites will continue to be added in the EGEE federations
- Pre-production Service: for middleware re-engineering
- Certification and Training/Demo testbeds
Slide 9: What do we expect from the Grid?
- Access to a world-wide virtual computing laboratory with almost infinite resources
- The possibility to organize distributed scientific communities into VOs (Virtual Organizations)
- Transparent access to distributed data and easy workload management
- Easy-to-use application interfaces
Slide 10: What are the characteristics of a Grid system?
- Numerous resources
- Ownership by mutually distrustful organizations and individuals
- Potentially faulty resources
- Different security requirements and policies
- Heterogeneous resources
- Geographic separation
- Different resource management policies
- Connection by heterogeneous, multi-level networks
Slide 11: The LCG-2 Architecture (layered diagram)
- Local Computing: Local Application, Local Database
- Grid Application Layer: Job Management, Data Management, Metadata Management
- Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
- Underlying Grid Services: Computing Element Services, Storage Element Services, Authorization, Authentication & Accounting, Replica Catalog, Database Services, Logging & Bookkeeping
- Grid Fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management
Slide 12: Main Logical Machine Types (Services) in LCG-2
- User Interface (UI)
- Information Service (IS)
- Computing Element (CE): frontend node and Worker Nodes (WN)
- Storage Element (SE)
- Replica Catalog (RC, RLS)
- Resource Broker (RB)
Slide 13: User Interface
- The initial point of access to the LCG-2 Grid is the User Interface (UI): a machine where LCG users have a personal account and where the user's certificate is installed
- The UI is the gateway to Grid services
- It provides a command-line interface for the basic Grid operations:
  - submit a job for execution on a Computing Element
  - list all the resources suitable to execute a given job
  - replicate and copy files
  - cancel one or more jobs
  - retrieve the output of one or more finished jobs
  - show the status of one or more submitted jobs
- One or more UIs are available at each site that is part of the LCG-2 Grid
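These operations were exposed on the UI as the edg-job-* command-line tools. Purely as a sketch, here is how one might script them from Python; the JDL file name is hypothetical, and the exact option syntax and output format varied by release, so the naive output parsing below is an assumption.

```python
import subprocess

def run(cmd):
    """Run a UI command and return its stdout, raising on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

jdl = "hostname.jdl"                    # hypothetical job description file
out = run(["edg-job-submit", jdl])      # prints the assigned job identifier
job_id = out.strip().splitlines()[-1]   # naive parse: assume the ID is on the last line

print(run(["edg-job-status", job_id]))  # ask the LB for the job's current state
run(["edg-job-get-output", job_id])     # fetch the Output Sandbox once the job is DONE
```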
Slide 14: Main Logical Machine Types (Services) in LCG-2
(Section divider; same list as Slide 12.)
Slide 15: Computing Element (CE)
- A CE is defined as a Grid batch queue, identified by a pair: <hostname>:<port>/<batch queue name>
- Several queues defined for the same hostname are considered different CEs. For example:
  - adc0015.cern.ch:2119/jobmanager-lcgpbs-long
  - adc0015.cern.ch:2119/jobmanager-lcgpbs-short
- A Computing Element is built on a homogeneous farm of computing nodes (called Worker Nodes)
- One node acts as a Grid Gate (GG), or front-end to the Grid, and runs:
  - a Globus gatekeeper
  - the Globus GRAM (Globus Resource Allocation Manager)
  - the master server of a Local Resource Management System (LRMS), which can be PBS, LSF or Condor
  - a local Logging and Bookkeeping server
- Each LCG-2 site runs at least one CE with a farm of WNs behind it
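A small self-contained sketch (not LCG-2 code) that splits a CE identifier of this form into its components, using the example identifiers above:

```python
def parse_ce_id(ce_id):
    """Split <host>:<port>/jobmanager-<lrms>-<queue> into its parts."""
    hostport, jobmanager = ce_id.split("/", 1)
    host, port = hostport.split(":")
    # e.g. "jobmanager-lcgpbs-long" -> service "jobmanager", LRMS "lcgpbs", queue "long"
    service, lrms, queue = jobmanager.split("-", 2)
    return {"host": host, "port": int(port), "lrms": lrms, "queue": queue}

print(parse_ce_id("adc0015.cern.ch:2119/jobmanager-lcgpbs-long"))
# {'host': 'adc0015.cern.ch', 'port': 2119, 'lrms': 'lcgpbs', 'queue': 'long'}
```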
Slide 16: Computing Element (diagram)
[Diagram: a Grid Gate node (gatekeeper, information service) in front of a batch server and Worker Nodes of mixed hardware (e.g. PIII/0.5 GB and PIV/2 GB Linux hosts); in the example, the red queue is assigned to two hosts.]
- A Computing Element is the entry point into a queue of a batch system
- The information associated with a Computing Element is limited to information relevant to the queue
- Resource details relate to the underlying system
Slide 17: Main Logical Machine Types (Services) in LCG-2
(Section divider; same list as Slide 12.)
Slide 18: Storage Element (SE)
- A Storage Element (SE) provides uniform access and services to large storage spaces
- Each site includes at least one SE
- SEs use two protocols:
  - GSIFTP for file transfer
  - Remote File Input/Output (RFIO) for file access
Slide 19: Storage Resource Management (SRM)
- Data are stored on disk pool servers or Mass Storage Systems
- Storage resource management needs to take into account:
  - transparent access to files (migration to/from the disk pool)
  - space reservation
  - file status notification
  - lifetime management
- The SRM (Storage Resource Manager) takes care of all these details
- The SRM is a Grid service that handles the interaction with local storage and provides a Grid interface to the outside world
Slide 20: Storage Resource Management
- Support for local policy:
  - each storage resource can be managed independently
  - internal priorities are not sacrificed by data movement between Grid agents
  - disk and tape resources are presented as a single element
- Reservation on demand and advance reservation:
  - space can be reserved for registering a new file
  - usage of the storage system can be planned
- File status and estimates for planning:
  - provides information on file status
  - provides estimates on space availability/usage
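The behaviour listed above (space reservation, file status, lifetime management) can be made concrete with a toy model. This is not the real SRM interface, just a minimal sketch of the semantics for a single storage pool; all names are invented.

```python
import time

class ToySRM:
    """Toy model of SRM-style reservation, status and lifetime handling."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.reserved = 0
        self.files = {}   # name -> {"size": ..., "expires": ..., "status": ...}

    def reserve(self, size):
        """Reserve space before registering a new file (reservation on demand)."""
        if self.reserved + size > self.capacity:
            return False
        self.reserved += size
        return True

    def put(self, name, size, lifetime_s):
        """Register a file with a lifetime; migration to tape would flip the status."""
        self.files[name] = {"size": size,
                            "expires": time.time() + lifetime_s,
                            "status": "ONLINE"}

    def status(self, name):
        f = self.files[name]
        return "EXPIRED" if time.time() > f["expires"] else f["status"]

srm = ToySRM(capacity_bytes=10**9)
if srm.reserve(10**6):
    srm.put("file.dat", 10**6, lifetime_s=3600)
print(srm.status("file.dat"))   # -> ONLINE
```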
Slide 21: A Simple Configuration (diagram)
[Diagram: a User Interface, a Resource Broker, a Replica Catalog and an Information Service serving two sites; each Computing Element has a "CLOSE" Storage Element.]
Slide 22: LCG-2 local configuration - SZTAKI's LCG-2 system
- Grid Gate: n31.hpcc.sztaki.hu (512 MB, Intel Pentium 4, 2.53 GHz), running the User Interface, Computing Element, Storage Element (69 GB), Resource Broker and Replica Manager
- n27.hpcc.sztaki.hu (128 MB, dual Intel Pentium III, 2 x 500 MHz): Worker Node #1
- n28.hpcc.sztaki.hu: Worker Node #2 (default)
Slide 23: Main Logical Machine Types (Services) in LCG-2
(Section divider; same list as Slide 12.)
Slide 24: Information System (IS)
- The Information System (IS) provides information about the LCG-2 Grid resources and their status
- The current IS is based on LDAP: a directory service infrastructure, i.e. a specialized database optimized for reading, browsing and searching information
- The LDAP schema used in LCG-2 implements the GLUE (Grid Laboratory for a Uniform Environment) schema
Slide 25: How to store Information?
- The LDAP information model is based on entries. An entry usually describes an object such as a person, a computer, a server, and so on
- Each entry contains one or more attributes that describe it; each attribute has a type and one or more values
- Each entry has a name, called a Distinguished Name (DN), that uniquely identifies it; a DN is formed by a sequence of attributes and values
- Example: the DN of a particular CE entry combines an attribute identifying the site (site_ID=cern) and an attribute identifying the CE (CE_ID=lxn1102.cern.ch), so the complete DN is: CE_ID=lxn1102.cern.ch,site_ID=cern
Slide 26: The Directory Information Tree
- Based on their DNs, entries can be arranged into a hierarchical, tree-like structure
- This tree of directory entries is called the Directory Information Tree (DIT)
Slide 27: Information System (IS)
The IS is a hierarchical system with three levels, from the bottom up:
- GRIS (Grid Resource Information Server) level: the CE and SE level
- GIIS (Grid Index Information Server) level: the site level
- a top, centralized level: the Grid level
The Globus Monitoring and Discovery Service (MDS) mechanism has been adopted at the GRIS level; the other two levels use the Berkeley DB Information Index (BDII) mechanism.
Slide 28: The LCG-2 hierarchical Information System (diagram)
Legend:
- BDII: Berkeley DB Information Index
- GIIS: Grid Index Information Server
- GRIS: Grid Resource Information Server
- CE: Computing Element
- SE: Storage Element
Slide 29: How to collect and store information?
- All services are allowed to enter information into the IS
- The BDII at the top level queries every GIIS every 2 minutes and acts as a cache, storing information about the Grid status in its LDAP database
- The BDII at the GIIS level collects information from every GRIS every 2 minutes and acts as a cache, storing information about the site status in its LDAP database
- The GRIS updates information according to the MDS protocol
Slide 30: How to obtain Information?
- All users can browse the catalogues
- To obtain the information, the client should:
  - ask the BDII for the available GIIS/GRIS servers, then
  - query a GIIS/GRIS directly, or
  - use the BDII cache
- The IS scales to ~1000 sites (MDS alone scales much less well: ~100)
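Since the IS is plain LDAP, it can be queried with any LDAP client. A hedged sketch with the Python ldap3 library follows; the host name is a placeholder, and while LCG-2 BDIIs conventionally listened on port 2170 under the base "o=grid" with GLUE-schema attributes such as GlueCEUniqueID, you should check the values at your own site.

```python
from ldap3 import Server, Connection, ALL

server = Server("bdii.example.org", port=2170, get_info=ALL)  # hypothetical BDII host
conn = Connection(server, auto_bind=True)                     # anonymous bind

# Ask for all Computing Elements published under the GLUE schema.
conn.search(search_base="o=grid",
            search_filter="(objectClass=GlueCE)",
            attributes=["GlueCEUniqueID", "GlueCEStateFreeCPUs"])
for entry in conn.entries:
    print(entry.GlueCEUniqueID, entry.GlueCEStateFreeCPUs)
```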
Slide 31: Main Logical Machine Types (Services) in LCG-2
(Section divider; same list as Slide 12.)
Slide 32: Data Management
- The Data Management services are provided by the Replica Management System (RMS) of EDG and by the LCG Data Management client tools
- In LCG, data files are replicated, on a temporary basis, to many different sites, depending on where the data is needed
- Users and applications do not need to know where the data is located: they use logical file names, and the Data Management services are responsible for locating and accessing the data
Slide 33: File Management Motivation (diagram)
[Diagram: two sites (A and B), each with a Storage Element holding files; the Replica Catalog maps logical names to site files, and file transfers move replicas between the SEs.]
- Replica Catalog: maps logical file names to site files
- Replica Manager: 'atomic' replication operations, a single client interface, orchestration
- Pre-/post-processing: prepare files for transfer, validate files after transfer
- Replica Selection: get the 'best' replica
- Replication Automation: data-source subscription, load balancing (replicate based on usage)
- Metadata: LFN metadata, transaction information, access patterns
Slide 34: Data Management Tools
Tools are needed for:
- locating data
- copying data
- managing and replicating data
- metadata management
In LCG-2 these are provided by:
- the Replica Manager (RM)
- the Replica Location Service (RLS)
- the Replica Metadata Catalog (RMC)
Slide 35: Replication Services - Basic Functionality
[Diagram: the Replica Manager coordinating the Replica Location Service, the Replica Metadata Catalog and Storage Elements.]
- Files have replicas stored at many Grid sites on Storage Elements
- Each file has a unique Grid ID (GUID)
- The locations corresponding to a GUID are kept in the Replica Location Service
- Users may assign aliases to the GUIDs; these are kept in the Replica Metadata Catalog
- The Replica Manager provides atomicity for file operations, assuring consistency of SE and catalog contents
Slide 36: Interactions with other Grid components
[Diagram: the Replica Manager interacting with the Information Service, the Resource Broker, the Virtual Organization Membership Service, Storage Elements, and the User Interface or Worker Node.]
- Applications and users interface to data through the Replica Manager, either directly or through the Resource Broker
- Management calls should never go directly to the SRM
Slide 37: Simplified interaction between the Replica Manager and the Storage Resource Manager
1. The client asks a catalog to provide the location of a file
2. The catalog responds with the name of an SRM
3. The client asks the SRM for the file
4. The SRM asks the storage system to provide the file
5. The storage system sends the file to the client through the SRM, or
6. directly
Slide 38: Replica Manager (RM)
- The RM provides high-level data management on the Grid, taking care of:
  - the location of data
  - the replication of data
  - efficient access to data
- It hides the SRM (Storage Resource Manager): users cannot access the SRM directly, only through the RM
- It coordinates the use of the Replica Location Service and the Replica Metadata Catalog
Slide 39: Interaction of the Replica Manager (RM) with other Grid services
- The RM presents a single interface to the user and to other services
- Some of the RM functionalities have been replaced by a new, faster interface: the LCG Data Management client tools
Slide 40: File References and Replica Catalogs
Files in the Grid are referenced by different names:
- Grid Unique IDentifier (GUID)
- Logical File Name (LFN)
- Storage URL (SURL)
- Transport URL (TURL)
The GUID and LFN refer to files, not replicas, and say nothing about locations; the SURLs and TURLs give information about where a physical replica is located.
Legend: RMC = Replica Metadata Catalog, LRC = Local Replica Catalog.
Slide 41: Abstract file names
GUID:
- A file can always be identified by its GUID
- The GUID is assigned at data registration time
- GUIDs are based on the UUID standard, which guarantees unique IDs
- A GUID is of the form guid:<unique_id>
- All the replicas of a file share the same GUID
LFN:
- To locate a Grid-accessible file, a human user will normally use an LFN
- LFNs are human-readable strings, allocated by the user as GUID aliases
- An LFN is of the form lfn:<alias>
Slide 42: Physical file names
SURL:
- Used by the RMS to find where a replica is physically stored, and by the SE to locate the file
- SURLs are of the form sfn://<SE_hostname>/<local_path>, where <local_path> is used internally by the SE to locate the file
TURL:
- A TURL gives the information necessary to retrieve a physical replica, including hostname, path, protocol and port (as in any conventional URL)
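Because SURLs and TURLs follow ordinary URL syntax, the standard library can pull them apart. A purely illustrative sketch; the hostnames and paths are made up.

```python
from urllib.parse import urlparse

surl = "sfn://se01.example.org/data/run1234/events.dat"
turl = "gsiftp://se01.example.org:2811/data/run1234/events.dat"

for url in (surl, turl):
    u = urlparse(url)
    # scheme tells us the protocol; host/port/path locate the replica
    print(u.scheme, u.hostname, u.port, u.path)
# sfn se01.example.org None /data/run1234/events.dat
# gsiftp se01.example.org 2811 /data/run1234/events.dat
```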
Slide 43: Replica Location Service (RLS)
- The RLS maintains information about the physical location of replicas (the mapping to GUIDs)
- It is composed of several Local Replica Catalogs (LRCs), each holding the replica information for a single VO
Slide 44: Replica Metadata Catalog (RMC)
- The RMC stores the mapping between GUIDs and their aliases (LFNs)
- It also maintains other metadata (sizes, dates, ownerships, ...)
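A toy in-memory model (not the real services) of how the RMC and RLS cooperate: the RMC maps LFN aliases to GUIDs, and the RLS maps GUIDs to SURLs. The example names are invented; the UUID-based GUID form follows Slide 41.

```python
import uuid

rmc = {}   # LFN alias -> GUID      (Replica Metadata Catalog role)
rls = {}   # GUID -> list of SURLs  (Replica Location Service role)

def register(lfn, surl):
    """Register a new file: mint a GUID, record the alias and first replica."""
    guid = "guid:" + str(uuid.uuid4())
    rmc[lfn] = guid
    rls[guid] = [surl]
    return guid

def add_replica(guid, surl):
    rls[guid].append(surl)

def list_replicas(lfn):
    """Resolve an LFN through the RMC, then look up its replicas in the RLS."""
    return rls[rmc[lfn]]

g = register("lfn:higgs-candidates", "sfn://se01.example.org/data/higgs.dat")
add_replica(g, "sfn://se02.example.org/data/higgs.dat")
print(list_replicas("lfn:higgs-candidates"))
```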
Slide 45: User Interfaces for Data Management
- Users are mainly expected to use the interface of the Replica Manager client:
  - management commands
  - catalog commands
  - file transfer commands
- The RLS and RMC services provide additional user interfaces:
  - mainly for additional catalog operations
  - plus additional server administration commands
  - these should mainly be used by administrators
  - they can also be used to check the availability of a service
Slide 46: The Replica Manager Interface - Management Commands
- copyAndRegisterFile (args: source, dest, lfn, protocol, streams): copy a file into grid-aware storage and register the copy in the Replica Catalog as an atomic operation.
- replicateFile (args: source/lfn, dest, protocol, streams): replicate a file between grid-aware stores and register the replica in the Replica Catalog as an atomic operation.
- deleteFile (args: source/seHost, all): delete a file from storage and unregister it.
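The atomicity promise behind copyAndRegisterFile can be sketched in a few lines: either the copy and the catalog registration both happen, or neither does. The copy_to_se, register and delete_from_se helpers here are hypothetical stand-ins, not real RM calls.

```python
def copy_and_register(source, dest_se, lfn, copy_to_se, register, delete_from_se):
    """Copy a file and register it, rolling back the copy if registration fails."""
    surl = copy_to_se(source, dest_se)   # step 1: physical copy onto the SE
    try:
        register(lfn, surl)              # step 2: catalog registration
    except Exception:
        delete_from_se(surl)             # roll back step 1 so no orphan replica remains
        raise
    return surl
```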
Slide 47: The Replica Manager Interface - Catalog Commands (1)
- registerFile (args: source, lfn): register a file in the Replica Catalog that is already stored on a Storage Element.
- unregisterFile (args: source, guid): unregister a file from the Replica Catalog.
- listReplicas (args: lfn/surl/guid): list all replicas of a file.
- registerGUID (args: surl, guid): register an SURL with a known GUID in the Replica Catalog.
- listGUID (args: lfn/surl): print the GUID associated with an LFN or SURL.
Slide 48: The Replica Manager Interface - Catalog Commands (2)
- addAlias (args: guid, lfn): add a new alias-to-GUID mapping.
- removeAlias (args: guid, lfn): remove an alias LFN from a known GUID.
- printInfo (): print the information needed by the Replica Manager to the screen or to a file.
- getVersion (): get the version of the Replica Manager client.
Slide 49: The Replica Manager Interface - File Transfer Commands
- copyFile (args: source, dest): copy a file to a non-grid destination.
- listDirectory (args: dir): list the directory contents on an SRM or a GridFTP server.
Slide 50: Main Logical Machine Types (Services) in LCG-2
(Section divider; same list as Slide 12.)
Slide 51: Job Management
- The user interacts with the Grid via a Workload Management System (WMS)
- The goal of the WMS is distributed scheduling and resource management in a Grid environment
- It allows Grid users:
  - to submit their jobs
  - to execute them on the "best" resources (the WMS tries to optimize the usage of resources)
  - to get information about their status
  - to retrieve their output
Slide 52: WMS Components
The WMS is currently composed of the following parts:
1. Workload Manager: the core component of the system
2. Match-Maker (also called Resource Broker): finds the best resource matching the requirements of a job (the match-making process)
3. Job Adapter: prepares the environment for the job and its final description before passing it to the Job Control Service
4. Job Control Service (JCS): performs the actual job management operations (job submission, removal, ...)
5. Logging and Bookkeeping services (LB): store job information, available for users to query
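The match-making step can be illustrated with a toy loop (not the real WMS): filter CEs by a job's requirements, then rank the survivors, mirroring the Requirements/Rank split of JDL. The CE records and job attributes are invented.

```python
# Invented CE records, loosely modelled on what the IS publishes.
ces = [
    {"id": "ce1.example.org:2119/jobmanager-lcgpbs-short",
     "free_cpus": 12, "max_cputime_min": 60},
    {"id": "ce2.example.org:2119/jobmanager-lcgpbs-long",
     "free_cpus": 3, "max_cputime_min": 2880},
]

job = {"cputime_min": 600}   # the job needs at least 600 CPU-minutes

# Requirements act as a boolean filter; Rank picks among the acceptable CEs.
acceptable = [ce for ce in ces if ce["max_cputime_min"] >= job["cputime_min"]]
best = max(acceptable, key=lambda ce: ce["free_cpus"])
print(best["id"])   # -> the "long" queue, the only one that fits the job
```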
Slide 53: Job Preparation - Let's think the way the Grid thinks!
The information to be specified:
- job characteristics
- requirements and preferences for the computing system
- software dependencies
- the job's data requirements
This information is specified using a Job Description Language (JDL).
Slide 54: Job Flow
a. The user logs in to the UI machine and creates a proxy certificate, which authenticates her in every secure interaction and has a limited lifetime.
Slide 55: Job Flow
b. The user submits the job from the UI to the WMS. The user can specify in the job description file one or more files to be copied from the UI to the RB node; this set of files is called the Input Sandbox. The event is logged in the LB, and the status of the job is SUBMITTED.
Slide 56: Job Flow
c. The WMS, in particular its Match-Maker component, looks for the best available CE to execute the job. The Match-Maker interrogates the BDII to query the status of CEs and SEs, and the RLS to find the location of data. The event is logged in the LB, and the status of the job is WAITING.
Slide 57: Job Flow
d. The WMS Job Adapter prepares the job for submission, creating a wrapper script that is passed, together with other parameters, to the JCS for submission to the selected CE. The event is logged in the LB, and the status of the job is READY.
Slide 58: Job Flow
e. The Globus Gatekeeper on the CE receives the request and sends the job for execution to the LRMS (e.g. PBS, LSF or Condor). The event is logged in the LB, and the status of the job is SCHEDULED.
Slide 59: Job Flow
f. The LRMS handles the job execution on the available local farm worker nodes. The user's files are copied from the RB to the WN where the job is executed. The event is logged in the LB, and the status of the job is RUNNING.
Slide 60: Job Flow
g. While the job runs, Grid files can be accessed on a (close) SE using either the RFIO protocol or local access, if the files have been copied to the WN local filesystem. So that the job can find out which is the close SE, and what the result of the match-making process was, the WMS produces a file with this information and ships it together with the job to the WN. This is known as the .BrokerInfo file. Information can be retrieved from it using the BrokerInfo CLI or the API library.
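The real .BrokerInfo file is a ClassAd document read via the BrokerInfo CLI or API. Purely for illustration, here is a parser for a simplified "key = value" rendering of the two pieces of information mentioned above; the format and the key names are hypothetical.

```python
SAMPLE = """\
CloseSE = se01.example.org
ChosenCE = ce1.example.org:2119/jobmanager-lcgpbs-short
"""

def parse_brokerinfo(text):
    """Collect key = value pairs from a simplified, made-up rendering."""
    info = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            info[key.strip()] = value.strip()
    return info

info = parse_brokerinfo(SAMPLE)
print(info["CloseSE"])   # the SE the job should prefer for file access
```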
Slide 61: Job Flow
h. The job can produce new output data, which can be uploaded to the Grid and made available to other Grid users. This is achieved with the Data Management tools: uploading a file to the Grid means copying it to a Storage Element and registering its location, metadata and attributes in the RMS. During job execution, data files can also be replicated between two SEs, again using the Data Management tools.
Slide 62: Job Flow
i. If the job reaches the end without errors, the output (not large data files, just the small output files the user listed in the so-called Output Sandbox) is transferred back to the RB node. The event is logged in the LB, and the status of the job is DONE.
Slide 63: Job Flow
j. At this point the user can retrieve the output of the job from the UI, using the WMS CLI or API. The event is logged in the LB, and the status of the job is CLEARED.
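The life cycle of steps a-j can be restated compactly as a toy state machine, with each transition "logged" the way the LB service records events; this is a sketch of the happy path only (no ABORTED branch).

```python
TRANSITIONS = {
    "SUBMITTED": "WAITING",    # accepted by the WMS
    "WAITING":   "READY",      # match-making done, CE chosen
    "READY":     "SCHEDULED",  # handed to the LRMS queue on the CE
    "SCHEDULED": "RUNNING",    # executing on a Worker Node
    "RUNNING":   "DONE",       # finished, Output Sandbox back on the RB
    "DONE":      "CLEARED",    # output retrieved by the user
}

state, log = "SUBMITTED", []
while state in TRANSITIONS:
    log.append(f"LB event: {state} -> {TRANSITIONS[state]}")
    state = TRANSITIONS[state]
print("\n".join(log))
```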
Slide 64: Job Flow - Status and Errors
- Status queries from the UI machine: job status queries are addressed to the LB database; resource status queries are addressed to the BDII
- If the site where the job is running goes down, the job is automatically resubmitted to another CE that is analogous to the previous one with respect to the requirements the user asked for
- If resubmission is disabled, the job is marked as aborted
- Users can find out what happened by simply querying the LB service
Slide 65: How do I log in to the Grid?
The distribution of resources makes secure access a basic requirement:
- secure communication
- security across organisational boundaries
- single "sign-on" for Grid users
Two basic concepts:
- Authentication: Who am I? ("Equivalent" to a passport, ID card, etc.)
- Authorisation: What can I do? (Certain permissions, duties, etc.)
Slide 66: Security in the Grid
In industry, several security standards exist:
- Public Key Infrastructure (PKI): PKI keys, SPKI keys (focused on authorisation rather than certificates), RSA
- Secure Socket Layer (SSL), SSH keys
- Kerberos
A common security standard is needed for Grid services, since the standards above do not meet all Grid requirements (e.g. delegation, single sign-on). The Grid community mainly uses X.509 PKI for the Internet, which is well established and widely used (also for the web, e-mail, etc.).
Slide 67: PKI - Basic overview
Public Key Infrastructure (also called asymmetric cryptography) has one primary advantage: distributing public keys is generally easier than securely distributing secret keys, as symmetric ciphers require.
- Each entity holds a public key e and a private key d, which define an encryption transformation E_e and a decryption transformation D_d
- Entity B (Bob), wishing to send a message m to Entity A (Alice), uses A's public key to compute the ciphertext c = E_e(m)
- Entity A applies the decryption transformation with her own private key to recover the message: m = D_d(c)
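The c = E_e(m), m = D_d(c) exchange above, rendered with the Python "cryptography" package. This is a sketch of the raw asymmetric primitive only; real Grid security wraps it in GSI, certificates and TLS.

```python
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# Alice's key pair: e is the public key, d the private key.
d = rsa.generate_private_key(public_exponent=65537, key_size=2048)
e = d.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

m = b"meet at the gatekeeper"
c = e.encrypt(m, oaep)           # Bob computes c = E_e(m) with Alice's public key
assert d.decrypt(c, oaep) == m   # Alice recovers m = D_d(c) with her private key
```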
Slide 68: Digital Certificates
How can B be sure that A's public key is really A's public key, and not someone else's?
- A third party guarantees the correspondence between the public key and the owner's identity by signing a document that contains both: the Digital Certificate
- Both A and B must trust this third party
Two trust models exist:
- X.509: hierarchical organization
- PGP: "web of trust"
Slide 69: Involved entities (diagram)
[Diagram: a User holding a public/private key pair and a certificate, a Certificate Authority (CA), and a Resource (a site offering services).]
Slide 70: Certificate Request
1. The user generates a public/private key pair; the private key is kept encrypted on the local disk.
2. The user sends the public key to the CA as a certificate request, along with proof of identity.
3. The CA confirms the identity, signs the certificate (the signed public key) and sends it back to the user.
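The request step, sketched with the Python "cryptography" package: generate a key pair, keep the private key encrypted on disk, and produce a signed certificate-signing request for the CA. The subject fields and passphrase are placeholders.

```python
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.COUNTRY_NAME, "HU"),
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, "Example Grid Org"),
        x509.NameAttribute(NameOID.COMMON_NAME, "Jane Grid User"),
    ]))
    .sign(key, hashes.SHA256())   # the request is signed with the private key
)

# The private key stays on local disk, encrypted with a passphrase.
pem_key = key.private_bytes(
    serialization.Encoding.PEM,
    serialization.PrivateFormat.TraditionalOpenSSL,
    serialization.BestAvailableEncryption(b"a strong passphrase"),
)
pem_csr = csr.public_bytes(serialization.Encoding.PEM)   # this goes to the CA
```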
Slide 71: Grid Security Infrastructure (GSI)
- The Globus Toolkit proposed and implements the Grid Security Infrastructure (GSI): protocols and APIs that address Grid security needs
- GSI protocols extend standard public key protocols
  - Standards: X.509 and SSL/TLS
  - Extensions: X.509 Proxy Certificates (single sign-on) and delegation
- A Proxy Certificate is a short-term, restricted certificate derived from a long-term X.509 certificate
  - signed by the normal end-entity certificate, or by another proxy
  - allows a process to act on behalf of a user
  - not encrypted, and thus needs to be securely managed by the file system
Slide 72: Delegation
- Proxy creation can be recursive: each delegation creates a new private key and a new X.509 proxy certificate, signed by the delegating credential
- Delegation allows a remote process to act on behalf of the user and avoids sending passwords or private keys across the network
- The proxy may be a "restricted proxy": a proxy with a reduced set of privileges (e.g. one that cannot submit jobs)
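A sketch of the delegation idea with the "cryptography" package: a short-lived "proxy" certificate over a fresh key pair, signed by the delegating credential rather than a CA. Real GSI proxies add the X.509 proxy extensions of RFC 3820; this only shows the signing chain, and all names are placeholders.

```python
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Jane Grid User")])

proxy_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # fresh key
now = datetime.datetime.now(datetime.timezone.utc)
proxy_cert = (
    x509.CertificateBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME,
                                                "Jane Grid User/proxy")]))
    .issuer_name(user_name)                  # issued by the user, not a CA
    .public_key(proxy_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(hours=12))   # short lifetime
    .sign(user_key, hashes.SHA256())         # signed by the delegating key
)
```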
Slide 73: Virtual organisations
- An EGEE user must belong to a Virtual Organisation (VO)
- A VO:
  - controls access to specified CEs and SEs
  - usually comprises geographically distributed people
  - requires the ability to know who has done what, and who will not be allowed to do it again... security
- Current VOs: HEP communities, biology, astronomy, ...
- VOMS provides enhanced flexibility in VO management
Slide 74: EGEE Pilot Applications (I)
High Energy Physics:
- has been running large distributed computing systems for many years
- now focuses on computing for the LHC, hence LCG (the LHC Computing Grid project)
- several current HEP experiments use grid technology (BaBar, CDF, etc.)
- the LHC experiments are currently executing large-scale data challenges
Slide 75: EGEE Pilot Applications (II)
Biomedicine:
- bioinformatics (distribution of gene/proteome databases)
- medical applications (screening, epidemiology, distribution of image databases, etc.)
- interactive applications (human supervision or simulation)
- BioMed applications are deployed and are expected to run their first jobs on LCG-2 by September
Slide 76: Who else will benefit from EGEE?
- EGEE Generic Applications Advisory Panel: 4 applications presented; 3 of them (computational chemistry, earth science, astro-particle physics) recommended for deployment with an allocation of NA4 resources
- EU projects GRACE, Mammogrid and Diligent are asking for support
- Expressions of interest: Planck/Gaia (astro-particle), SimDat (drug discovery)
Slide 77: How to access EGEE (I)
0. Review the information provided on the EGEE website: www.eu-egee.org
1. Establish contact with the EGEE applications group, led by Vincent Breton (breton@clermont.in2p3.fr)
2. Provide information by completing a questionnaire describing your application
3. Applications are selected based on scientific criteria, Grid added value, the effort involved in deployment, resources consumed/contributed, etc.
Slide 78: How to access EGEE (II)
4. Follow a training session
5. Migrate the application to the EGEE infrastructure with the support of EGEE technical experts
6. Initial deployment for testing purposes
7. Production usage (contribute computing resources for heavy production demands)
Slide 79: SZTAKI's role in EGEE
- Education Centre of the Central European Region:
  - developing and providing various training courses
  - providing user support for applications
  - see the web page: www.lpds.sztaki.hu/egee
- Leader of the middleware deployment effort in the SEE-GRID project: creating a regional EGEE Grid for South-East Europe
- Promoting the establishment of the Hungarian EGEE Grid together with the other members of MGKK: creating a national EGEE Grid for Hungary
- Extending the EGEE Grid with new layers:
  - the Mercury monitor: www.lpds.sztaki.hu/mercury
  - the P-GRADE portal: www.lpds.sztaki.hu/pgportal
Slide 80: Conclusions
- EGEE is the first attempt to build a worldwide Grid infrastructure for data-intensive applications from many scientific domains
- A large-scale production grid service, LCG-2, is already deployed and is being used for HEP and BioMed applications
- Resources and user groups will rapidly expand during the course of the project
- A process has been established for migrating new applications to the EGEE infrastructure
- A training programme has been established, with a number of events already held
- Prototype "next generation" grid middleware is being tested now
- See the on-line broker demo at: http://www.hep.ph.ic.ac.uk/~mp801/applet/