
Amsterdam, 28 June 2006. DEISA UNICORE tutorial: UNICORE and the DEISA supercomputing grid. Jules Wolfrat




1 Amsterdam, 28 June 2006, DEISA UNICORE tutorial. UNICORE and the DEISA supercomputing grid. Jules Wolfrat, wolfrat@sara.nl

2 Outline
DEISA overview
UNICORE history
UNICORE architecture
Demo?

3 THE DEISA SUPERCOMPUTING GRID
[Diagram: GEANT network linking the AIX distributed super-cluster, vector systems (NEC, …), and Linux systems (SGI, IBM, …)]

4 DEISA objectives
To enable Europe’s terascale science by the integration of Europe’s most powerful supercomputing systems.
Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success.
DEISA is a European Supercomputing Service built on top of existing national services. This service is based on the deployment and operation of a persistent, production quality, distributed supercomputing environment with continental scope.
The integration of national facilities and services, together with innovative operational models, is expected to add substantial value to existing infrastructures.
Main focus is High Performance Computing (HPC).

5 Participating Sites
BSC: Barcelona Supercomputing Centre, Spain
CINECA: Consorzio Interuniversitario per il Calcolo Automatico, Italy
CSC: Finnish Information Technology Centre for Science, Finland
EPCC/HPCx: University of Edinburgh and CCLRC, UK
ECMWF: European Centre for Medium-Range Weather Forecasts, UK (int)
FZJ: Research Centre Juelich, Germany
HLRS: High Performance Computing Centre Stuttgart, Germany
IDRIS: Institut du Développement et des Ressources en Informatique Scientifique (CNRS), France
LRZ: Leibniz Rechenzentrum Munich, Germany
RZG: Rechenzentrum Garching of the Max Planck Society, Germany
SARA: Dutch National High Performance Computing and Networking centre, The Netherlands

6 The DEISA supercomputing environment (21,900 processors and 145 Tf in 2006, more than 190 Tf in 2007)
IBM AIX super-cluster
–FZJ-Jülich, 1312 processors, 8.9 teraflops peak
–RZG Garching, 748 processors, 3.8 teraflops peak
–IDRIS, 1024 processors, 6.7 teraflops peak
–CINECA, 512 processors, 2.6 teraflops peak
–CSC, 512 processors, 2.6 teraflops peak
–ECMWF, 2 systems of 2276 processors each, 33 teraflops peak
–HPCx, 1536 processors, 11 teraflops peak
BSC, IBM PowerPC Linux system (MareNostrum), 4864 processors, 40 teraflops peak
SARA, SGI ALTIX Linux system, 416 processors, 2.2 teraflops peak
LRZ, Linux cluster (2.7 teraflops) moving to SGI ALTIX system (5120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)
HLRS, NEC SX8 vector system, 576 processors, 12.7 teraflops peak
Systems interconnected with a dedicated 1 Gb/s network, currently being upgraded to 10 Gb/s, provided by GEANT and NRENs

7 The technology cycle
[Diagram: cycle linking technology providers, R&D projects, and DEISA strategic and technological management, through service definitions, technology specifications, technology watch, and technology pull]
WAN GPFS (IBM): completed
Multi-cluster batch processing (IBM): completed
GPFS for non-IBM systems (IBM): ongoing
Co-scheduling (Platform): in preparation

8 How is DEISA enhancing HPC services in Europe?
Running larger parallel applications at individual sites, through a cooperative reorganization of the global computational workload across the whole infrastructure, or through the job migration service inside the AIX super-cluster.
Enabling workflow applications with UNICORE (complex applications that are pipelined over several computing platforms).
Enabling coupled multiphysics Grid applications (when it makes sense).
Providing a global data management service whose primary objectives are:
–Integrating distributed data with distributed computing platforms
–Enabling efficient, high performance access to remote datasets (with Global File Systems and striped GridFTP). We believe that this service is critical for the operation of (possible) future European petascale systems
–Integrating hierarchical storage management and databases in the supercomputing Grid
Deploying portals as a way to hide complex environments from new user communities, and to interoperate with other existing grid infrastructures.

9 Basic Services: Global File Systems
Global file systems are sophisticated software environments, necessary to provide a single system image of a clustered computing platform. They provide global data management.
Data in the GFS is “symmetric” with respect to all computing nodes.
[Diagram: HPC system at site A, with nodes, network, and disk space]

10 The DEISA integration concept
Global distributed GPFS file system with continental scope.
The global resource pool is dynamic: nodes can enter and leave the pool without disrupting the national services.
[Diagram: Sites A, B, C and D joined by a network interconnect with reserved bandwidth]

11 DEISA Global File System integration in 2006 (based on IBM’s GPFS)
HPC Common Global File System: similar architectures / operating systems, high bandwidth (10 Gbit/s)
High Performance Common Global File System: various architectures / operating systems, high bandwidth (up to 10 Gbit/s)
[Diagram: AIX IBM domain with CINECA (IT), FZJ (DE), ECMWF (UK), IDRIS (FR), RZG (DE), and CSC (FI); Linux Power-PC at BSC (ES); Linux SGI at SARA (NL) and LRZ (DE)]

12 Demonstration of transparent data access in a heterogeneous configuration
(1) A 64-processor job is running at SARA (SGI Altix system)
(2) The input data for this run are read from the Linux GPFS at SARA
(3) The output data are written to the BSC GPFS system in Spain
(4) Visualization on the RZG system reads the output data produced by the application from the BSC GPFS
[Diagram: Global File System spanning RZG (DE), BSC (ES), and SARA (NL)]

13 American and European supercomputing infrastructures linked: bridging communities with scalable, wide-area global file systems
Global File System interoperability demo during the Supercomputing Conference 2005 in Seattle
DEISA sites: Amsterdam, NL; Orsay, FR; Garching, DE; Bologna, IT; Jülich, DE
TeraGrid sites: Argonne, IL; Bloomington, IN; Urbana-Champaign, IL; San Diego, CA

14 Basic services: workflow simulations using UNICORE
UNICORE supports complex simulations that are pipelined over several heterogeneous platforms (workflows).
UNICORE handles a workflow as a single job and transparently moves output and input data along the pipeline.
UNICORE clients that monitor the application can run on laptops.
UNICORE has a user-friendly graphical interface.
DEISA has developed a command line interface for UNICORE.
The UNICORE infrastructure, including all sites, has full production status. It has proven to be very stable during the last few months.

15 Other basic services
Job migration inside the AIX super-cluster. Based on LoadLeveler Multi-Cluster, it allows system administrators to reroute jobs to other sites in a way that is transparent to end users. It is used to move simple jobs of “implicit users” away from a site to make room for a bigger application. Full production status.
Co-allocation. We are starting to prepare a first-generation co-allocation service on the full heterogeneous infrastructure, using LSF Multi-cluster. Important for coupled Grid applications and for data movement. Service in development phase; prototype expected in 6–9 months.
Remote I/O using Global File Systems and fast data transfers. See the next slide.
Integrating hierarchical data management and databases in the supercomputing Grid. In progress.

16 Accessing remote data: high performance remote I/O and file transfer
Remote I/O with global file systems implicitly moves data across platforms (in production today).
DEISA will also deploy explicit high performance data movers, using GridFTP.
[Diagram: GridFTP; co-scheduled, parallel data mover tasks; data repository]

17 Summary
DEISA provides an integrated supercomputing environment, with efficient data sharing through high performance global file systems. This is highly transparent to end users.
DEISA enables job migration across sites (also transparent to end users).
Exceptional resources for very demanding applications are made available by the operation of the global resource pool. We are load balancing computational workload at a European scale.
Huge, demanding applications can be run “as such”.
Support of Grid applications (which are distributed by design).
With this operational model, the DEISA super-cluster is not very different from a “true” monolithic European supercomputer (which must be partitioned in any case for fault tolerance and QoS).
The main difference comes from the coexistence of several independent administration domains. This requires, as in TeraGrid, coordinated production environments.

18 UNICORE: UNiform Interface to COmputer Resources
Following material thanks to the UNICORE team

19 Highlights
Excellent workflow support
Transparent data staging / transfer
Multi-site, multi-step jobs: heterogeneous meta-computing
Uniform user authentication and security mechanisms
Each site maintains full control over its resources
UNICORE Client offers
–Uniform GUI for job creation and monitoring
–Easy integration of applications through plugins

20 History I
Development started in 1997
Projects UNICORE and UNICORE Plus
–Funded by the German Ministry of Education and Research (until 12/2002)

21 History II
Developments in EC funded projects:
EUROGRID (11/2000 – 01/2004)
–IST-1999-20247
–Resource broker, standards-based file transfer (GridFTP)
–Biomolecular simulations, weather prediction, coupled CAE simulations, structural analysis
GRIP (01/2002 – 02/2004)
–IST-2000-32257
–Interoperability between UNICORE and Globus (integration of Globus-maintained resources as target systems in UNICORE)
OpenMolGRID (09/2002 – 11/2004)
–IST-2001-37238
–Use UNICORE for molecular engineering
–Focus on scientific workflows

22 History III
Collaborators:
–Intel GmbH (formerly Pallas GmbH)
–Fujitsu Laboratories of Europe (formerly fecit)
–Forschungszentrum Jülich
–Deutscher Wetterdienst
–Genias, RUS, RUKA, LRZ, PC², ZHR, ZIB
–CNRS-IDRIS (F), CSCS (CH), GIE EADS CCR (F), ICM (PL), Parallab (N), Soton (UK), UoM (UK), ANL (US), UT (EE), UU (UK), ComGenex (HU), Negri (I)

23 Features
–Intuitive GUI with single sign-on
–X.509 certificates for AA and job/data signing
–Only one open port in the firewall required
–Workflow engine for complex multi-site, multi-step workflows and job monitoring
–Extensible application support
–Secure data transfer integrated
–Resource management
–Easy installation and configuration of client and server components
–Full control of resources remains with the site
–Production quality, …

24 Software Status I
Current version: 5.6 (Client) / 4.6 (Server)
User Client is platform independent (Java)
Servers (Unix)
Target systems (Unix)
–“no batch”
–T3E, SP3, VPP, hpcLine, SR 8000, SX-5, PC-Clusters, …, Globus 2.x as targets
–NQS, LL, LSF, PBS, CCS, SGE, ...

25 Software Status II
UNICORE is available at SourceForge as open source under the BSD license: http://unicore.sourceforge.net
UNICORE Forum e.V.: http://www.unicore.org
A public test system for testing (standard) client functions is available

26 Deployment
At all project partner sites
DEISA sites (IDRIS, CINECA, RZ Garching, ...)
Naregi project (Japan)
…

27 UNICORE Architecture
[Diagram: UNICORE Client and the UNICORE Grid]

28 ARCHITECTURE
[Diagram: the Client connects over SSL, through an optional firewall, to the Gateway of a Usite (authentication). Behind a further optional firewall, each Vsite runs an NJS with its UUDB (authorization) and IDB (incarnation), and a TSI in front of the local RMS and disc. Jobs are abstract up to the NJS and non-abstract from the TSI onwards. Multi-site jobs pass between Gateways.]

29 Client
Job Preparation
Job Monitoring
Workflow Management
Usites
Vsites

30 UNICORE Server
Gateway
Network Job Supervisor
–Configuration
–UNICORE User Data Base
Target System Interface
A demo package containing preconfigured components is available at sourceforge.net/projects/unicore

31 Server Components
[Diagram: Gateway, Network Job Supervisor (with conf and UUDB, the UNICORE User DB), and Target System Interface]

32 Server Prerequisites
Gateway and NJS:
–Java ≥ 1.4.2
–X.509 certificates for Gateway and NJS
–Signer certificate(s)
TSI:
–Perl (≥ 5.004)

33 Gateway
Entry point of a UNICORE Site
Accepts SSL connections from Clients and NJSs
Accepts valid certificates from all signers known to it (authentication)
Talks UNICORE Protocol Layer (UPL) on connections to the outside world
Sends/receives AJOs to/from the NJSs

34 Gateway connections
[Diagram: Gateway connected to the Network Job Supervisor (with conf and UUDB, the UNICORE User DB); the Gateway reads a connections file and gateway.properties]
gateway.properties:
    gw.gateway_host_name=
    gw.port=
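The slide names two keys in gateway.properties, gw.gateway_host_name and gw.port, but leaves their values empty. As a minimal sketch of how such key=value settings could be read, the Python below parses a properties-style text. The host name and port are placeholders chosen for illustration, not DEISA or UNICORE defaults, and the parser handles only the simple key=value form.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Placeholder values: the slide leaves both settings empty.
EXAMPLE = """
# gateway.properties (illustrative values only)
gw.gateway_host_name=gateway.example.org
gw.port=4433
"""

gateway = parse_properties(EXAMPLE)
gateway_port = int(gateway["gw.port"])
```

The port is converted to an integer at the point of use, so a malformed value fails early rather than when a connection is attempted.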

35 Network Job Supervisor (NJS)
UNICORE scheduler
Receives/sends AJOs from/to local Gateway
Translates AJO into batch job for target
Maps the user’s Ulogin to Xlogin
Sends sub-AJOs to corresponding Gateway according to dependencies
Polls for status and output of sub-AJOs
Sends batch jobs and requests to TSI
Polls TSI for job status and output

36 NJS connections
[Diagram: NJS (with conf and UUDB, the UNICORE User DB) connected to the Gateway (via the connections file), the TSI, and the Admin interface]
njs.properties:
    njs.gateway=
    njs.vsite_name=
    njs.gateway_port=
    njs.admin_port=

37 Incarnation Data Base
Static definitions and translation table; contains definitions for
GENERAL properties (file spaces, descriptions, …)
EXECUTION_TSI (host + ports, resources, batch queues, …)
STORAGE_TSI (for file transfers and management)
RUN (translation rules for target)
IMPORT, EXPORT, CLEANUP, LIST_DIRECTORY, RENAME, COPY_FILE, DELETE_FILE, CHANGE_PERMISSIONS
FORTRAN, LINK
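The idea of incarnation, turning an abstract action into a concrete command for one target system via a static translation table, can be sketched in a few lines of Python. This is an illustration only: the table layout, the template placeholders, and the qsub-style flags are assumptions, and the real IDB syntax is not shown on the slide.

```python
from string import Template

# Hypothetical translation table standing in for IDB entries.
# Keys reuse action names from the slide; the templates are invented.
INCARNATION_TABLE = {
    "RUN": Template("$submit_cmd -q $queue -n $nodes -t $time $script"),
    "COPY_FILE": Template("cp $source $target"),
}


def incarnate(action, **params):
    """Turn an abstract action plus parameters into a concrete command line."""
    template = INCARNATION_TABLE[action]
    # substitute() raises KeyError if a placeholder has no value,
    # so an incomplete request never yields a half-filled command.
    return template.substitute(**params)


command = incarnate("RUN", submit_cmd="qsub", queue="batch",
                    nodes=4, time=3600, script="job.sh")
```

Because the table is data rather than code, the same abstract request can be incarnated differently at each Vsite by swapping the table.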

38 UNICORE User Data Base
Management of Ulogin – Xlogin mapping information
The NJS accesses this information
The basic version maps one certificate to exactly one Xlogin
The NJS-to-UUDB interface is defined so that it can be adapted to site-specific user databases (e.g. LDAP)
http://www.unicore.org/downloads.htm (contributions) offers an alternative UUDB in which certificate/project-ID pairs are mapped to Xlogins
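The basic mapping rule, one certificate to exactly one Xlogin, can be illustrated with a small in-memory table. This Python sketch is not the UUDB implementation: the class name, the use of a distinguished-name string as the key, and the sample entries are all assumptions made for the example.

```python
class UserDatabase:
    """Toy Ulogin-to-Xlogin map: one certificate maps to exactly one Xlogin."""

    def __init__(self):
        self._xlogin_by_cert = {}

    def add(self, cert_dn, xlogin):
        """Register a mapping; refuse a second Xlogin for the same certificate."""
        if cert_dn in self._xlogin_by_cert:
            raise ValueError(f"certificate already mapped: {cert_dn}")
        self._xlogin_by_cert[cert_dn] = xlogin

    def lookup(self, cert_dn):
        """Return the Xlogin for a certificate, or None if it is unknown."""
        return self._xlogin_by_cert.get(cert_dn)


# Hypothetical entry: the DN and the login name are invented.
uudb = UserDatabase()
uudb.add("CN=Alice Example,O=Example Org", "alice01")
xlogin = uudb.lookup("CN=Alice Example,O=Example Org")
```

A lookup that returns None corresponds to a user whose certificate is not stored in the database and who therefore has no account to run under.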

39 NJS connections
[Diagram: NJS (with conf and UUDB, the UNICORE User DB) connected to the Gateway (via the connections file), the TSI (via the SOURCE entry in njs.idb), and the Admin interface]
njs.idb:
    SOURCE
njs.properties:
    njs.gateway=
    njs.vsite_name=
    njs.gateway_port=
    njs.admin_port=

40 Target System Interface
Interface to target operating and batch system
Perl scripts and modules
Needs root privileges to act on behalf of the user (uses setreuid)
Provides interface to local system for
–Job submission
–Status query, job monitoring
–File handling
–…

41 Example: Submit.pm
    $jobname = $2 if $1 eq "JOBNAME";
    $outcome_dir = $2 if $1 eq "OUTCOME_DIR";
    $uspace_dir = $2 if $1 eq "USPACE_DIR";
    $time = $2 if $1 eq "TIME";
    $memory = $2 if $1 eq "MEMORY";
    $nodes = $2 if $1 eq "NODES";
    …
    $memory = "-lM $memory"."Mb";
    my $command = "$main::submit_cmd $queue $nodes $email $memory $time $jobname $stdout_loc $stderr_loc $Submit::tsi_unique_file_name";
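For readers less familiar with Perl, the same two steps (pick up keyword/value pairs, then assemble a submit command) are sketched below in Python. Several details are assumptions: the slide does not show the pattern that sets $1 and $2, so a plain "KEY value" line layout is assumed here, "qsub" stands in for the unspecified $main::submit_cmd, and only the memory option is reproduced.

```python
import re

# Assumed directive layout: a keyword, whitespace, then the value.
DIRECTIVE = re.compile(r"^(\w+)\s+(.*)$")


def parse_directives(lines):
    """Collect KEY -> value pairs from directive lines."""
    settings = {}
    for line in lines:
        match = DIRECTIVE.match(line.strip())
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def build_submit_command(submit_cmd, settings, script):
    """Assemble a submit command; -lM mirrors the slide's memory option."""
    parts = [submit_cmd]
    if "MEMORY" in settings:
        parts.append(f"-lM {settings['MEMORY']}Mb")
    parts.append(script)
    return " ".join(parts)


settings = parse_directives(["JOBNAME demo", "MEMORY 512", "NODES 4"])
command = build_submit_command("qsub", settings, "job.sh")
```

Options are added only when the corresponding directive is present, which avoids emitting an empty flag for a resource the job did not request.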

42 TSI connections
[Diagram: NJS (njs.properties, njs.idb SOURCE entry, conf and UUDB) connected to the TSI (tsi.properties)]
tsi.properties:
    $main::njs_machine = shift || "NJS host";
    $main::njs_port = shift || "port1";
    $main::my_port = shift || "port2";

43 Overview: Server connections
Client to Gateway: SSL; configured by gw.port and the Client’s Usite list
Gateway to NJS: SSL or plain socket; configured by njs.gateway_port and the Gateway connections file
NJS to TSI: plain sockets; configured by the idb SOURCE entry (host p1 p2) and the tsi script
Admin to NJS: njs.admin_port

44 Firewall Issues
Client – Gateway
–Internet
–Allow connections to the Gateway for the https protocol on the port the Gateway is listening on
–Client side has to allow outgoing traffic on any port
Gateway – NJS
–Intranet
–All connections from the Gateway to the NJS system and the NJS’s Gateway port
NJS – TSI
–Intranet
–All connections from the NJS to the TSI system and the TSI’s NJS port
–All connections from the TSI to the NJS system and the NJS’s TSI port
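The firewall requirements on this slide amount to a short allow-list of directed connections. The Python sketch below encodes that list as data and checks a connection against it. All port numbers are placeholders; the actual values come from each site's gateway, NJS, and TSI configuration.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    source: str
    destination: str
    port: int
    zone: str


# Placeholder ports, chosen only so the example runs.
RULES = [
    Rule("client", "gateway", 4433, "internet"),  # port the Gateway listens on
    Rule("gateway", "njs", 1128, "intranet"),     # the NJS's Gateway port
    Rule("njs", "tsi", 4434, "intranet"),         # the TSI's NJS port
    Rule("tsi", "njs", 7654, "intranet"),         # the NJS's TSI port
]


def is_allowed(source, destination, port):
    """Return True if some rule permits this directed connection."""
    return any(
        rule.source == source
        and rule.destination == destination
        and rule.port == port
        for rule in RULES
    )
```

Note that the NJS and TSI need a rule in each direction, while a Client never connects to an NJS directly, so no such rule appears.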

45 Current Trends and a look into the UNICORE future...
Web services for interoperability
“Open up” the architecture...
but keep the UNICORE strengths
–abstraction and virtualisation
–workflows
–easy application integration

46 Acronyms I
Ujob: UNICORE Job
Nspace: File space on the computer where the Client runs
Xspace: File space at the Vsite
Uspace: Temporary file space for Ujob at Vsite
Xlogin: Unix login at Vsite
Ulogin: UNICORE login, X.509 certificate
Vsite: Computing resource, target system
Usite: Site providing UNICORE services

47 Acronyms II
Incarnation: Translation of AJO into batch job using IDB
AJO: Abstract Job Object
UPL: UNICORE Protocol Layer (Client – Gateway)
JMC: Job Monitor Controller, part of Client
JPA: Job Preparation Agent, part of Client
TSI: Target System Interface
UUDB: UNICORE User Data Base
IDB: Incarnation Data Base
NJS: Network Job Supervisor

48 [Diagram, title partly lost ("… meets …"): for each site (FZJ, CNE, RZG, IDR, CSC, SARA, BSC, LRZ), the site’s users, a DEISA gateway in the DMZ, and an NJS in the intranet]

49 UNICORE Security
Security model based on an X.509 public key infrastructure
Credential consists of a public and a private key
No userid and password authentication
Password-protected keystore
Single sign-on
UNICORE accepts the following private key formats:
–RSA (PKCS#12), e.g. OpenSSL 0.9.7x
–Java keystore (JKS), SUN Java
Certificates provided e.g. by DFN CA
Two server-side security entities:
–Gateway: authentication
–NJS: authorisation

50 UNICORE Security: Client
Access to password-protected keystore
Encrypted keystore contains all imported certificate(s) and the user’s private key(s)
The UNICORE Keystore editor allows the user to
–Generate an X.509 certificate request
–Import/export .p12 or .jks keystores
–Import public keys
The user has to import (at least) three certificates into the Client
–Plugin signer’s certificate (public key)
–Gateway signer’s certificate (public key)
–User’s signed public key

51 UNICORE Security: Gateway
Gateway authenticates the user
The following checks are performed on certificates presented by a client
–Certificate is issued by one of the trusted CAs (e.g. DFN-CA)
–Certificate is within its validity period
–Certificate has not been revoked (if checking of Certificate Revocation Lists (CRLs) is activated)
Gateway accepts only SSL connections from Clients and other NJSs
–SSL handshake
Optional SSL connection between Gateway and NJS
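The three certificate checks listed on this slide can be written down as one decision function. The Python sketch below works on a simplified certificate record whose fields are assumptions for the example. It deliberately omits signature verification, which a real SSL handshake performs, so it illustrates the order of the checks rather than a usable validator.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Certificate:
    """Simplified stand-in for an X.509 certificate (fields are assumptions)."""
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    serial: int


def authenticate(cert, trusted_issuers, revoked_serials, now, check_crl=True):
    """Return (accepted, reason) following the three checks on the slide."""
    if cert.issuer not in trusted_issuers:
        return False, "untrusted issuer"
    if not (cert.not_before <= now <= cert.not_after):
        return False, "outside validity period"
    # The revocation check is optional, as on the slide.
    if check_crl and cert.serial in revoked_serials:
        return False, "revoked"
    return True, "ok"
```

Passing the current time in as an argument, instead of reading the clock inside the function, keeps the check deterministic and easy to test.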

52 Behind the scenes: Authentication
[Diagram: Client and Gateway establish an SSL connection. The Client sends the user certificate and the Gateway sends the gateway certificate. The Gateway asks "Trust user certificate issuer?" and the Client asks "Trust gateway certificate issuer?"]

53 UNICORE Security: NJS
NJS authorises the user
Accesses the UNICORE User Data Base (UUDB)
–Maps the user’s certificate to their Xlogin on the target system
Only users presenting certificates stored in the UUDB can connect to the target system
NJS authorises other NJSs
Explicit UUDB entry

54 Behind the scenes: Authorisation
[Diagram: a typical UNICORE user’s Client sends the AJO with the user certificate through the Gateway to the NJS. The NJS consults the UUDB, a table pairing Certificates 1–5 with Logins A–E, to obtain the user login, and works with the IDB and TSI]

55 UNICORE Job
Job contains
–Sub-jobs and tasks
–Dependency information (without dependencies, all tasks of a job are executed in “parallel”)
–Workflow: doN, loops, if-then-else
–Target system location
Tasks are translated into batch jobs for the destination system by the servers (NJSs)

56 Abstract Job Object (AJO)
Abstract, target system independent representation of a job
Specifies actions to be performed by UNICORE
–Execute task
–File transfer task
–Control task
Contains dependency graph
Contains resource requests (nodes, memory, time, ...)
Contains data set descriptions for data to be streamed
Realised as Java classes
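A dependency graph of this kind determines which tasks may run at the same time: a task becomes ready once everything it depends on has finished, and tasks without mutual dependencies can run in parallel. The Python sketch below groups a hypothetical job into such waves using the standard library's graphlib module (Python 3.9 or later). The task names are invented, and the AJO itself is realised as Java classes, so this only illustrates the scheduling idea.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies):
    """Group tasks into waves; every task in a wave has all dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical job: each task maps to the set of tasks it depends on.
job = {
    "import_input": set(),
    "simulate": {"import_input"},
    "visualise": {"simulate"},
    "archive": {"simulate"},
}
waves = execution_waves(job)
# waves == [["import_input"], ["simulate"], ["archive", "visualise"]]
```

With no dependencies at all, every task lands in the first wave, which matches the slide's remark that such tasks are executed in "parallel".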

57 @
Open source under BSD license
Supported by FZJ
–Integration of own results and results from other projects
–Release management
–Problem tracking
–CVS, mailing lists
–Documentation
–Assistance
Viable basis for many projects
–DEISA, UniGrids, NaReGI, …
http://unicore.sourceforge.net

