
1 Grid Interoperability and Data Management KEK-CRC & CC-IN2P3 Yonny CARDENAS JFPPL09 Workshop, Tsukuba, May 2009

2 Content
Objective
Activities last year
Grid Interoperability
- SAGA-NAREGI adaptor
- JSAGA
Grid Data Management
- RNS
- iRODS
Conclusions

3 Members
Japan: S. Kawabata, T. Sasaki, G. Iwai, K. Murakami, Y. Iida
France: D. Boutigny, S. Reynaud, F. Hernandez, J.Y. Nief, Y. Cardenas

4 Grid Interoperability
SAGA-NAREGI Adaptor
JSAGA

5 Objective
Several Grid middleware stacks and infrastructures are being developed in various countries:
- NAREGI in Japan
- gLite / EGEE in Europe
- VDT / Open Science Grid in the US
- etc.
Researchers need to collaborate and share global resources and information even if different middleware and infrastructures are used.

6 Activities last year (I)
Workshop in Tsukuba in November 2008: 6 CC-IN2P3 staff members visited KEK (3 days)
Interoperability
- SAGA-NAREGI Adaptor development (KEK)
  Released; only the job adaptor is available currently
  SAGA-PBS released as well
- JSAGA development (CC-IN2P3)
  Plug-in for NAREGI beta-2 job submission
  Plug-ins for CREAM, WMS, SRM, iRODS
  Plug-in for VOMS-MyProxy requested by Yoshiyuki Watase (KEK)

7 Activities last year (II)
Data Management
- RNS development (KEK)
  KEK and U. Tsukuba implementation (in progress)
  Integration with LFC (in planning)
- iRODS development
  Transfer and scaling tests (KEK - CC-IN2P3, KEK - J-PARC)
  Started operation at KEK in March 2009
  Data management system for J-PARC projects (in progress)
  HPSS driver (in progress)
  Load-balancing micro-services

8 Activities last year (III)
Interoperability
- SAGA-NAREGI Adaptor
- JSAGA
Presented at OGF25 / EGEE User Forum, Catania, Italy, March 2009
Data Management
- iRODS
iRODS International Multidisciplinary Workshop at CC-IN2P3, Lyon, February 2009

9 Grid
Grid is a new-generation information utility for global distributed computing. Grid technology allows resources to be shared in order to access, process and store huge quantities of data. Grid middleware is the software that permits interaction between heterogeneous infrastructures (software and hardware).

10 gLite
Developed by the EGEE project since 2002
Strong connection with CERN
Core computing component of the LCG experiments

11 NAREGI
Grid middleware for research and industrial applications
More focused on the computing grid, linking supercomputer centers for coupled simulation of multi-scale physics
Supports heterogeneous computer architectures (vector, super-parallel and clusters)

12 NAREGI architecture

13 Grid Interoperability
Why is it necessary?
"Grid" is the name of the concept; the real infrastructure is realized by middleware: gLite in Europe, OSG and TeraGrid in the US, NAREGI in Japan, ChinaGrid in China, and so on.
Different middleware is used in different areas of research.
Relying on national infrastructure helps to save human and technical resources.
High Energy Physics is not the only program where the Grid is deployed: CC-IN2P3 and KEK support a wide area of physical science.

14 Grid Deployment at KEK
Middleware/Experiment matrix (gLite, NAREGI, Gfarm, SRB, iRODS):
- Belle: Using / Planning / Using
- Atlas: Using
- Radio therapy: Using / Developing / Planning
- ILC: Using / Planning
- J-PARC: Planning / Testing
- Super-Belle: to be decided by 2010
▸ Most experiments and federations commonly use gLite as the Grid middleware.
▸ NAREGI middleware is being deployed as the general-purpose e-science infrastructure in Japan
  ▹ Difficulties: e.g. human costs, time differences
  ▹ Interoperation between both middlewares is mandatory for us (next few slides)
    ▪ To provide higher availability and reliability, and to keep production quality
(From "Current Status and Recent Activities on Grid at KEK", Go Iwai, KEK/CRC)

15 Issues with Multi-Middleware Apps
▸ For site admins:
  ▹ Dedicated hardware (LRMS, OS) is deployed for each middleware
▸ For end users:
  ▹ Ordinarily, the same application must be developed separately for each middleware to be enabled on the Grid
  ▹ They have to know which middleware they are using
[Diagram: dedicated hardware (CPUs, storage) deployed per middleware (gLite, NAREGI, SRB, iRODS); users must be aware of the underlying middleware layer and deployed hardware]
(From "Current Status and Recent Activities on Grid at KEK", Go Iwai, KEK/CRC)

16 Interoperability between EGEE and NAREGI
Two possible approaches:
1. Implement the GIN (Grid Interoperability Now) layer in NAREGI
   - Defined by the GIN group of the OGF
   - Short-term solution in order to get the interoperability now!
   - Pragmatic approach
2. Work with longer-term standards defined within the OGF
   - Based on SAGA (Simple API for Grid Applications): "Instead of interfacing directly to Grid services, applications can access basic Grid capabilities with a simple, consistent and stable API"
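To make the SAGA idea concrete, here is a minimal sketch of job submission through the standard SAGA Java binding (the API that implementations such as JSAGA expose). The service URL, host name and job attributes are placeholders, and exact factory/class names may differ between binding versions.

import org.ogf.saga.job.Job;
import org.ogf.saga.job.JobDescription;
import org.ogf.saga.job.JobFactory;
import org.ogf.saga.job.JobService;
import org.ogf.saga.session.Session;
import org.ogf.saga.session.SessionFactory;
import org.ogf.saga.url.URL;
import org.ogf.saga.url.URLFactory;

public class SagaSubmitSketch {
    public static void main(String[] args) throws Exception {
        // Default session: picks up the security contexts configured for this installation.
        Session session = SessionFactory.createSession(true);

        // The URL scheme selects the middleware adaptor; the host is a placeholder.
        URL service = URLFactory.createURL("wms://broker.example.org:7443/glite_wms_wmproxy_server");
        JobService js = JobFactory.createJobService(session, service);

        // Describe the job once, independently of the target middleware.
        JobDescription jd = JobFactory.createJobDescription();
        jd.setAttribute(JobDescription.EXECUTABLE, "/bin/hostname");
        jd.setAttribute(JobDescription.OUTPUT, "out.txt");

        Job job = js.createJob(jd);
        job.run();        // submit
        job.waitFor();    // block until the job finishes
        System.out.println("Job finished in state: " + job.getState());
    }
}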

17 GIN (Grid Interoperability Now)
Pushed by KEK, NAREGI has made considerable efforts to implement the GIN layer: "Trying to identify islands of interoperation between production grids and grow those islands."
This has been deployed at KEK and is under testing.

18 SAGA-NAREGI Adaptor
[Diagram: applications call the SAGA engine, which dispatches through SAGA adaptors to gLite, NAREGI, SRB, iRODS, cloud resources and local LRMSs (LSF/PBS/SGE/...); a GIN/PGI multi-middleware layer fair-shares resources among the middlewares]
▸ We need to operate multiple Grid middlewares at the same time.
  ▹ Resource sharing among them is mandatory
    ▪ We are also contributing to GIN
▸ Virtualization of Grid middleware is our wish
  ▹ The best scenario for application developers; today's topic is SAGA-NAREGI
(From "Current Status and Recent Activities on Grid at KEK", Go Iwai, KEK/CRC)

19 SAGA-NAREGI Adaptor
[Diagram: applications and services use the SAGA engine through its C++ interface and Python binding; SAGA adaptors connect to gLite, NAREGI, SRB, iRODS, cloud resources and local LRMSs (LSF/PBS/SGE/...); an RNS file-catalogue service based on the OGF standard sits alongside]
1. Middleware-transparent layer (GIN/PGI: multi-middleware layer)
2. Middleware-independent services
(From "Current Status and Recent Activities on Grid at KEK", Go Iwai, KEK/CRC)

20 SAGA-NAREGI Adaptor
Current status: SAGA-NAREGI is ready for use
- Only the job adaptor is available currently
- SAGA-PBS is now being developed and will be released soon
Next step: more application-wise development
- Based on SAGA-NAREGI
- First practical examples: offline analysis for the Belle experiment, RT (radiotherapy) simulation

21 JSAGA
This approach is being developed at CC-IN2P3 (Sylvain Reynaud).
Objective: describe your job once, submit it everywhere!
- Implements standard specifications: SAGA (OGF), JSDL
- Provides a high-level abstraction layer with no sacrifice of efficiency or scalability, thanks to its design (definition of a plug-in interface)
- Uses grid infrastructures as they are (i.e. no prerequisites)
- Hides the heterogeneity of middlewares and of grid infrastructures
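A minimal sketch of the "describe once, submit everywhere" idea using the SAGA Java API that JSAGA implements: one JobDescription is reused against several job services, and the URL scheme selects the plug-in. The scheme strings and hosts below are illustrative assumptions, not the exact identifiers registered by the JSAGA plug-ins.

import org.ogf.saga.job.*;
import org.ogf.saga.session.Session;
import org.ogf.saga.session.SessionFactory;
import org.ogf.saga.url.URL;
import org.ogf.saga.url.URLFactory;

public class DescribeOnceSubmitEverywhere {
    public static void main(String[] args) throws Exception {
        Session session = SessionFactory.createSession(true);

        // One middleware-neutral description...
        JobDescription jd = JobFactory.createJobDescription();
        jd.setAttribute(JobDescription.EXECUTABLE, "/bin/hostname");
        jd.setAttribute(JobDescription.OUTPUT, "hostname.out");

        // ...submitted to several backends; the URL scheme picks the plug-in.
        String[] services = {
            "wms://glite-broker.example.org:7443/glite_wms_wmproxy_server",
            "naregi://naregi-ss.example.jp:8080/"
        };
        for (String s : services) {
            URL url = URLFactory.createURL(s);
            JobService js = JobFactory.createJobService(session, url);
            Job job = js.createJob(jd);
            job.run();
            System.out.println("Submitted to " + s + ", job id: " + job.getAttribute(Job.JOBID));
        }
    }
}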

22 JSAGA: hiding middleware and infrastructure heterogeneity
[Diagram: one JSAGA job description is expanded into a job staging graph; gLite plug-ins generate JDL and delegate to the WMS and LCG-CEs, with input data on SRM, while Globus plug-ins generate RSL for WS-GRAM and stage files over GridFTP behind the firewall; JSAGA handles selection and file staging, hiding infrastructure heterogeneity (e.g. EGEE, OSG, DEISA) and middleware heterogeneity (e.g. gLite, Globus, Unicore)]

23 JSAGA
- Ready-to-use software, adapted to the targeted scientific field
- Hides heterogeneity between grid infrastructures
- Hides heterogeneity between middlewares
- As many interfaces as ways to implement each functionality
- As many interfaces as used technologies
[Diagram: layers from applications and JSAGA job collections (end user), through the JSAGA core engine exposing the SAGA API (application developer), down to the plug-ins (plug-in developer)]

24 [Diagram]

25 JSAGA plug-ins for NAREGI
Preliminary plug-in for the Super Scheduler
- Not all features
- NAREGI beta-2 only
- Simple individual jobs: submit, monitor, cancel
- Supports the VOMS security context type
Plug-ins for CREAM, WMS, SRM, iRODS
Plug-in for VOMS-MyProxy requested by Yoshiyuki Watase (KEK)
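A hedged sketch of the three operations the NAREGI plug-in supports (submit, monitor, cancel) with a VOMS security context attached to the session, again written against the standard SAGA Java API. The "naregi" URL scheme, host, proxy path and job parameters are placeholders.

import org.ogf.saga.context.Context;
import org.ogf.saga.context.ContextFactory;
import org.ogf.saga.job.Job;
import org.ogf.saga.job.JobDescription;
import org.ogf.saga.job.JobFactory;
import org.ogf.saga.job.JobService;
import org.ogf.saga.session.Session;
import org.ogf.saga.session.SessionFactory;
import org.ogf.saga.task.State;
import org.ogf.saga.url.URL;
import org.ogf.saga.url.URLFactory;

public class NaregiSubmitMonitorCancel {
    public static void main(String[] args) throws Exception {
        // Attach a VOMS security context to the session (values are placeholders).
        Session session = SessionFactory.createSession(false);
        Context voms = ContextFactory.createContext("VOMS");
        voms.setAttribute(Context.USERPROXY, "/tmp/x509up_u1000");
        session.addContext(voms);

        // The "naregi" scheme here stands in for the JSAGA NAREGI plug-in.
        URL ss = URLFactory.createURL("naregi://naregi-ss.example.jp:8080/");
        JobService js = JobFactory.createJobService(session, ss);

        JobDescription jd = JobFactory.createJobDescription();
        jd.setAttribute(JobDescription.EXECUTABLE, "/bin/sleep");
        jd.setVectorAttribute(JobDescription.ARGUMENTS, new String[] { "600" });

        Job job = js.createJob(jd);
        job.run();    // submit

        // Monitor for a while, then cancel (the three supported operations).
        for (int i = 0; i < 10 && job.getState() == State.RUNNING; i++) {
            Thread.sleep(5000);
            System.out.println("state = " + job.getState());
        }
        job.cancel();
    }
}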

26 Grid Data Management
RNS - Resource Namespace Service
iRODS - integrated Rule-Oriented Data System

27 RNS - Resource Namespace Service
A middleware-independent file catalogue is strongly desirable to operate multiple middlewares and share data.
- Robustness and scalability are issues
It is already standardized at the OGF.
Two independent implementations are ongoing:
- U. of Tsukuba
- University of Virginia
KEK is working with U. of Tsukuba and has requested NAREGI to support RNS.
U. of Virginia is developing an LFC interface to RNS.
An LFC interface to iRODS will be a subject of the FJPPL collaboration.

28 RNS: Resource Namespace Service
▸ Hierarchical namespace management that provides name-to-resource mapping
▸ Basic namespace components:
  ▹ Virtual directory
    ▪ Non-leaf node in the hierarchical namespace tree
  ▹ Junction
    ▪ Name-to-resource mapping that interconnects a reference to any existing resource into the hierarchical namespace
[Diagram: an example namespace (/ILC with EU, AP and US branches, data and mc directories at sites such as IN2P3 and DESY) whose leaf entries (file1, file2, file3) are junctions pointing to endpoint references EPR1, EPR2, ...]
Specification: http://www.ogf.org/documents/GFD.101.pdf
(From "Current Status and Recent Activities on Grid at KEK", Go Iwai, KEK/CRC)
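To illustrate the two namespace components (virtual directories and junctions), here is a toy in-memory model in Java. It only mirrors the data model sketched on the slide; it is not the RNS web-service interface defined in GFD.101.

import java.util.LinkedHashMap;
import java.util.Map;

public class RnsModelSketch {

    interface Entry {}

    // A junction binds a name to an existing resource via its endpoint reference (EPR).
    record Junction(String endpointReference) implements Entry {}

    // A virtual directory is a non-leaf node holding named children.
    static class VirtualDirectory implements Entry {
        final Map<String, Entry> children = new LinkedHashMap<>();

        VirtualDirectory mkdir(String name) {
            VirtualDirectory d = new VirtualDirectory();
            children.put(name, d);
            return d;
        }

        void link(String name, String epr) {
            children.put(name, new Junction(epr));
        }
    }

    public static void main(String[] args) {
        // Mirror the example tree from the slide: /ILC/EU/data/{file1,file2}
        VirtualDirectory root = new VirtualDirectory();
        VirtualDirectory data = root.mkdir("ILC").mkdir("EU").mkdir("data");
        data.link("file1", "EPR1");   // junction: name -> endpoint reference
        data.link("file2", "EPR2");
        System.out.println(data.children);   // {file1=Junction[endpointReference=EPR1], ...}
    }
}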

29 iRODS
The successor of SRB
- SRB is used for Belle at KEK
Developed by an international collaboration
- CC-IN2P3 and KEK are members of the collaboration
Rules concept:
- Customized access rights to the system: e.g. disallow file removal from a particular directory, even by the owner
- Security and integrity checks of the data: automatic checksum launched in the background; on-the-fly anonymization of the files even if it has not been done by the client
- Metadata registration
- Customized transfer parameters: number of streams, stream size, TCP window as a function of the client or server IP

30 iRODS @ KEK
Rule concept for J-PARC:
- Data migration (testing): ingest data and replicate it onto the KEK HPSS, then purge from J-PARC after a certain period
- Data registration (done): automated registration of files created outside of iRODS
- Customized access rights (under discussion): set write permission on the resource for each experiment group; decide the resource automatically from among the resource group according to the size of the file
- ...
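As a rough client-side illustration of the migration flow (ingest, replicate onto the KEK HPSS resource, purge the J-PARC replica after a retention period), the sketch below drives standard icommands from Java. The production setup relies on server-side iRODS rules and micro-services; the zone, resource names and paths here are placeholders.

import java.io.IOException;

public class MigrationSketch {

    static void run(String... cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        if (p.waitFor() != 0) {
            throw new IOException("command failed: " + String.join(" ", cmd));
        }
    }

    public static void main(String[] args) throws Exception {
        String obj = "/KEKZone/home/jparc/run001.dat";

        // 1. Ingest the file onto the J-PARC staging resource.
        run("iput", "-R", "jparcResc", "run001.dat", obj);

        // 2. Replicate the object onto the KEK HPSS-backed resource.
        run("irepl", "-R", "kekHpssResc", obj);

        // 3. After the retention period, trim the J-PARC replica,
        //    keeping one copy (the HPSS replica).
        run("itrim", "-S", "jparcResc", "-N", "1", obj);
    }
}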

31 Transfer test (preliminary)
Data transfer between KEK and J-PARC (not tuned yet)
[Diagram: client at J-PARC, iRODS server and HPSS at KEK, plus an ssh/ftp work server]
- iRODS: iput 43 MB/s, iget 40 MB/s
- scp via the ssh work server: 24 MB/s and 4 MB/s
- Kerberos-ftp: put 8 MB/s, get 22 MB/s

32 Data transfer performance
The performance test was done by Yoshimi Iida while she was at CC-IN2P3.
Transfer speeds using iRODS and bbftp were measured over 12 hours
- iput/iget and bbcp are the names of the corresponding commands
- Both support parallel transfer
iRODS transfer commands work better on a congested network.

33 From KEK to Lyon
1 GB data transfers
- TCP window size: 4 MB
- Number of parallel streams: 16
bbcp often failed to connect.

34 From Lyon to KEK
1 GB data transfers
- TCP window size: 4 MB
- Number of parallel streams: 16
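For reference, a sketch of how the two tools compared here might be invoked with these parameters. The option letters (-N for iput/iget transfer threads, -s and -w for bbcp streams and window size) reflect common usage and should be checked against the installed versions; hosts and file names are placeholders.

import java.io.IOException;

public class TransferTestSketch {

    static void run(String... cmd) throws IOException, InterruptedException {
        System.out.println("$ " + String.join(" ", cmd));
        new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // iRODS: 1 GB file with 16 parallel transfer threads.
        run("iput", "-N", "16", "test-1GB.dat", "/KEKZone/home/iida/test-1GB.dat");
        run("iget", "-N", "16", "/KEKZone/home/iida/test-1GB.dat", "copy-1GB.dat");

        // bbcp: 16 streams, 4 MB window.
        run("bbcp", "-s", "16", "-w", "4M", "test-1GB.dat",
            "user@ccirods.example.in2p3.fr:/scratch/test-1GB.dat");
    }
}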

35 iRODS transfer performance
Results were very promising, but we still need to understand a couple of issues (asymmetry of the performance between the two directions, KEK -> Lyon and Lyon -> KEK).
- Also observed with other protocols.
Stress tests of the iCAT (database) with millions of entries in the catalog
- Running several kinds of queries
- Results are stable and good for iRODS
- Goal: go up to 10 million files

36 Scaling test of iRODS @ KEK
iRODS environment
- iRODS 2.0
- Separate DB server (iCAT) from the iRODS server
- Local disk used as the iRODS resource
Data used
- Same directory
- 1000 files of 100 bytes each
Measurement
- One process uses 5 kinds of icommands: ireg, ils, icp, iget and irm
- Measure the performance of each directory operation
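A minimal sketch of the measurement loop, timing each of the five icommands against one collection. The zone, resource name, vault path and file count are placeholders (the real test used 1000 files of 100 bytes per collection).

import java.io.IOException;

public class IcatScalingSketch {

    static long timeCommand(String... cmd) throws IOException, InterruptedException {
        long t0 = System.nanoTime();
        new ProcessBuilder(cmd).inheritIO().start().waitFor();
        return (System.nanoTime() - t0) / 1_000_000;   // elapsed milliseconds
    }

    public static void main(String[] args) throws Exception {
        String coll = "/Zone/home/rods/coll-1";     // collection under test
        String vault = "/irodsVault/coll-1";        // physical directory on the resource

        int files = 10;                             // 1000 in the real test
        long reg = 0, cp = 0, get = 0, rm = 0;

        for (int i = 1; i <= files; i++) {
            String obj = coll + "/file-" + i;
            reg += timeCommand("ireg", "-R", "demoResc", vault + "/file-" + i, obj);
            cp  += timeCommand("icp", obj, obj + ".copy");
            get += timeCommand("iget", "-f", obj, "/tmp/file-" + i);
            rm  += timeCommand("irm", "-f", obj + ".copy");
        }
        long ls = timeCommand("ils", coll);         // list the whole collection once

        System.out.printf("ireg %d ms, icp %d ms, iget %d ms, irm %d ms, ils %d ms%n",
                          reg, cp, get, rm, ls);
    }
}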

37 Results of the scaling tests @ KEK
Test layouts:
- All collections created under the home collection /Zone/home/rods: coll-1 ~ coll-1000
- A nested collection created every 50 collections:
  /Zone/home/rods/nest-0: coll-1 ~ coll-50
  /Zone/home/rods/nest-0/nest-1: coll-51 ~ coll-100
  and so on

38 Conclusion
CC-IN2P3 and KEK are working jointly on grid interoperability and grid data management.
The exchanges have been useful to improve the development, deployment and validation of common work in these fields.
This work has been presented to the grid community.
FJPPL has permitted the exchange of ideas and experiences beyond the grid.
We wish to maintain contact and collaboration among FJPPL people.

