COMP_3: Grid Interoperability and Data Management. CC-IN2P3 and KEK Computing Research Center. FJPPL, Annecy, June 15, 2010
Members (2010). Japan (KEK): M. Nozaki, T. Sasaki, Y. Watase, G. Iwai, Y. Kawai, S. Yashiro, Y. Iida. France (CC-IN2P3): D. Boutigny, G. Rahal, S. Reynaud, F. Hernandez, J.Y. Nief, Y. Cardenas, P. Calvat. FJPPL 2010
Activities in Cooperative development of SAGA and iRODS – Please see the following slides. Workshop at Lyon on February 17 – Status report from each side – SAGA and iRODS development discussion – Three Japanese members visited Lyon (FJPPL + other KEK budget); one had to cancel the trip because of a suspected swine flu infection.
Common concerns. GRID interoperability – How do we build a worldwide distributed computing infrastructure for HEP and related fields? Different middleware is deployed and operated in different regions. Data handling in smaller experiments – Should be simple, but efficient enough.
SAGA: Virtualization of Grid/cloud resources and Grid interoperability
International collaboration and computing resources. e-Science infrastructures are developed and operated independently and are not compatible with each other. Japan: NAREGI. United States: Globus and VDT (OSG). Europe: gLite (EGEE), ARC, UNICORE. How can they share resources? How can they develop software together? Local computing resources. Grid interoperability and SAGA will be the key.
SAGA: Simple API for Grid Applications – An API that provides a single way to access distributed computing infrastructure such as clouds, Grids, local batch schedulers and independent local machines. The API definition itself is language independent – This is a technology for worldwide collaborations such as Belle-II or ILC, where different institutes depend on different technologies – There are implementations in two languages. Java – JSAGA: CC-IN2P3 – Java SAGA. C++ (SAGA-C++): KEK and others – Python and C language bindings are also available.
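The idea of one entry point in front of several back ends can be illustrated with a minimal sketch. This is not the SAGA API: the class names, the `submit` function and the `fork`/`pbs` URL schemes are hypothetical and chosen only to show how an adaptor could be selected from a service URL.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class JobDescription:
    """Hypothetical, middleware-neutral description of a job."""
    executable: str
    arguments: tuple = ()


class ForkAdaptor:
    """Hypothetical adaptor for an independent local machine."""

    def submit(self, host, job):
        return f"fork on {host}: {job.executable} {' '.join(job.arguments)}".strip()


class BatchAdaptor:
    """Hypothetical adaptor for a local batch scheduler."""

    def submit(self, host, job):
        return f"batch queue at {host}: {job.executable} {' '.join(job.arguments)}".strip()


# Registry mapping a URL scheme to the adaptor that handles it (schemes are illustrative).
ADAPTORS = {"fork": ForkAdaptor(), "pbs": BatchAdaptor()}


def submit(service_url, job):
    """Single entry point: pick the adaptor from the URL scheme and delegate to it."""
    parsed = urlparse(service_url)
    try:
        adaptor = ADAPTORS[parsed.scheme]
    except KeyError:
        raise ValueError(f"no adaptor registered for scheme {parsed.scheme!r}")
    return adaptor.submit(parsed.hostname or "localhost", job)


if __name__ == "__main__":
    job = JobDescription("/bin/echo", ("hello",))
    print(submit("fork://localhost", job))
    print(submit("pbs://batch.example.org", job))
```

The user code stays the same for every back end; only the service URL changes, and supporting a new middleware means registering one more adaptor.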
The aim of the project: Exchange knowledge and information. Converge the two implementations in the future.
Converging JSAGA and SAGA-C++ [Diagram labels: SAGA-C++, Java SAGA, JSAGA, Java GAT; SAGA Java Binding, SAGA C Binding, PySAGA (the most used SAGA Python Binding; Boost-based implementation?), JySAGA; C, Python, Jython; a user application, another user application.]
Converging JSAGA and SAGA-C++ [Diagram labels: SAGA-C++, Java SAGA, JSAGA, Java GAT; SAGA Java Binding, SAGA C Binding, PySAGA (the most used SAGA Python Binding), JySAGA; Boost-based implementation, JPySAGA; C, Python, Jython; a user application, another user application.]
Converging all SAGA implementations [Diagram labels: SAGA-C++, Java SAGA, JSAGA, Java GAT; SAGA Java Binding, SAGA C Binding, JySAGA, common SAGA Python Binding (PySAGA ?); Boost-based implementation, JPySAGA; C, Python, Jython; a user application, another user application.]
JPySAGA. Developed by J. Devemy (CC-IN2P3) – based on Compatible with reference implementations of… – Python (CPython) – SAGA (Python binding of SAGA-C++). First release available for download – namespace, file system and replica functional packages only – execution management functional package will come soon… – Will be used by to integrate JSAGA into –. (Distributed Infrastructure with Remote Agent Control)
Summary of KEK activities related to SAGA. This activity is part of the RENKEI project – RENKEI: Resource Linkage for e-Science – funded by MEXT during JFY. Job adaptors for NAREGI, PBSPro and Torque have been implemented. File adaptors for NAREGI (Gfarm v1 and v2) have also been implemented. File adaptors for RNS and iRODS are under development. Service Discovery for NAREGI will be implemented.
Unified GRID Interface (UGI), RENKEI-KEK. [Diagram: a Python interface (Unified GRID Interface) and SAGA-C++, with SAGA file adaptors and SAGA job adaptors for RNS, iRODS, gLite, NAREGI, PBSPro/Torque, LSF, cloud and Globus.] Goal: hide the differences of the underlying middleware from users. A single command set will work for everything. OGF standards.
Summary of CC-IN2P3 activities related to SAGA. Latest developments: JSAGA plug-ins for – gLite-LFC, by – Globus GK with Condor for OSG, by – SSH with offline monitoring, by. JSAGA core engine – many improvements (scalability, features…). Next developments: JSAGA plug-ins for – ARC (NorduGrid) – DIET (Décrypthon) – Grid Engine (next batch system at CC-IN2P3). Service Discovery API (SAGA extension). GridRPC SAGA package – needed for DIET.
iRODS: Data handling for small-size projects
What is iRODS? iRODS is the successor of SRB – Data management software with a metadata catalogue and rule-based data management – Considered a data Grid solution – The project is led by Prof. Reagan Moore of the University of North Carolina.
iRODS service at KEK. iRODS server × 4: IBM x3650, QX5460 (4 core), memory 8 GB, HDD 293.6 GB + 600 GB, RHEL 5.2, iRODS 2.1, HPSS-VFS client, GPFS client. DB server (iCAT), Postgres server: IBM x3650, QX5460 (4 core), memory 8 GB, HDD 293.6 GB, RHEL 5.2, Postgres. HPSS (tape library), accessed through HPSS-VFS: TS3500, HPSS p, 3 PB maximum (3000 vols), 10 TB cache disk, 10 tape drives, 5 movers, 2 VFS servers.
Client tools: i-commands, JUX (GUI application), Davis (Web application)
Client tools: JUX (Java Universal eXplorer). Works on Linux, Windows and Mac. Looks like Windows Explorer: users can visually check the file structure and copy files by drag and drop. Cannot recognize replicated files. Cannot handle Japanese characters.
Client tools: Davis (a WebDAV gateway to iRODS/SRB). Runs Jetty and Apache on the iRODS server. Useful for a small laboratory in a university: no special software is needed on the client side, and only the HTTPS port is used. Cannot upload or download several files at the same time. Does not support parallel transfer.
KEK wiki page: a wiki page in Japanese for end users – what iRODS is – how to use it at KEK – how to install it – how to write rules – how to write micro-services (MS) – …
MLF: Materials and Life Science Experimental Facility. [Diagram: raw data and simulated data flowing to storage, 20~50 TB/year for each group.]
Use case scenario: data preservation and distribution for MLF groups. Raw data is used once; after processing, it is moved to KEK storage. Simulated data can be accessed by collaborators. Data is replicated between J-PARC and KEK, deleted from J-PARC after a certain period, and kept forever at KEK.
From J-PARC to Collaborators. [Diagram labels: J-PARC (Tokai), KEK (Tsukuba), Collaborators (Internet); data server, iRODS servers, iCAT, storage, HPSS; iRODS client, web client, HPSS client (?).]
Rules and Micro-services. Main rules: 1. All created data should be replicated to the KEKCC storage 10 minutes later. 2. All raw data older than 1 week should be removed from the J-PARC storage, after checking that its replica exists in the KEKCC storage. 3. All simulated data should be removed in the same way, but the retention period can be set by each research group.
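The three rules can be read as one decision per file. The sketch below is plain Python, not iRODS rule language; the function name `next_action`, its parameters and the returned labels are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

REPLICATION_DELAY = timedelta(minutes=10)   # rule 1
DEFAULT_RETENTION = timedelta(weeks=1)      # rule 2; rule 3 lets each group override it


def next_action(created, now, has_kek_replica, retention=DEFAULT_RETENTION):
    """Return 'wait', 'replicate', 'keep' or 'delete' for one file stored at J-PARC."""
    age = now - created
    if not has_kek_replica:
        # Rule 1: replicate once the file is 10 minutes old.
        # Rules 2 and 3: a file without a confirmed replica is never deleted.
        return "replicate" if age >= REPLICATION_DELAY else "wait"
    # The replica exists, so deletion depends only on the retention period.
    return "delete" if age > retention else "keep"


if __name__ == "__main__":
    created = datetime(2010, 6, 1, 12, 0)
    print(next_action(created, created + timedelta(days=8), has_kek_replica=True))
```

Checking the replica before anything else means an old file that somehow missed replication is replicated first and only removed on a later pass.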
Rules and Micro-services. Created a new micro-service to detect files that match a specified age; implemented by Adil Hasan at the University of Liverpool. Other experiments use different rules: send files for successful runs only; check file sizes and age.
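Selecting files by age and size can be sketched outside iRODS with the Python standard library. This is not the micro-service itself; the function name `files_older_than` and its parameters are hypothetical, and it scans a local directory rather than an iRODS collection.

```python
import os
import time


def files_older_than(directory, min_age_seconds, min_size_bytes=0, now=None):
    """List regular files in `directory` that are old enough and large enough."""
    now = time.time() if now is None else now
    selected = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue  # skip sub-directories and special files
        info = entry.stat()
        old_enough = now - info.st_mtime >= min_age_seconds
        large_enough = info.st_size >= min_size_bytes
        if old_enough and large_enough:
            selected.append(entry.path)
    return sorted(selected)


if __name__ == "__main__":
    one_week = 7 * 24 * 3600
    for path in files_older_than(".", one_week):
        print(path)
```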
Client speed performance: data transfer between KEK and J-PARC. iput: 43 MB/s; iget: 40 MB/s; pftp put: 26 MB/s; pftp get: 33 MB/s; scp: 24 MB/s; scp: 4 MB/s. [Diagram labels: J-PARC, KEK, iRODS server, iRODS, ssh, HPSS, work server.]
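To put these rates in perspective, the short calculation below estimates how long one group's yearly volume would take to move. It assumes a sustained rate, decimal units (1 TB = 1,000,000 MB) and the upper end of the 20~50 TB/year quoted for MLF groups; the function name is hypothetical.

```python
def transfer_days(volume_tb, rate_mb_per_s):
    """Days needed to move `volume_tb` terabytes at a sustained `rate_mb_per_s`."""
    seconds = volume_tb * 1_000_000 / rate_mb_per_s  # decimal units: 1 TB = 1e6 MB
    return seconds / 86_400                          # seconds per day


if __name__ == "__main__":
    for label, rate in [("iput", 43), ("pftp put", 26)]:
        print(f"{label}: {transfer_days(50, rate):.1f} days for 50 TB")
```

Under these assumptions, 50 TB takes roughly two weeks at the iput rate and roughly three weeks at the pftp put rate, so either tool can keep up with one group's yearly production.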
New server setup: parallel iRODS servers. Before March: one iRODS instance running on two machines (active and standby). Now: a separate iRODS instance running on each machine (each backing up the other). One iRODS instance runs per experiment, in order to use a different HPSS writing user for each experiment group and to shield each group from congestion caused by other groups. [Diagram: iRODS servers hosting iRODS-A, iRODS-B, iRODS-C and iRODS-D.]
iRODS CC-IN2P3 In production since early servers: –3 iCAT servers (metacatalog): Linux SL4, Linux SL5 –6 data servers (200 TB): Sun Thor x4540, Solaris 10. Metacatalog on a dedicated Oracle 11g cluster. HPSS interface: rfio server (using universal MSS driver). Use of fuse-iRODS: –For Fedora-Commons. –For legacy web applications. TSM: backup of some stored data. Monitoring and restart of the services fully automated (crontab + Nagios + SMURF). Automatic weekly reindexing of the iCAT databases. Accounting: daily report on our web site.
iRODS usage: prospects. Starting: –Neuroscience: ~60 TB. –IMXGAM: ~15 TB (X and gamma ray imagery). –dChooz (neutrino experiment): ~15 TB / year. Coming soon: LSST (astro): –For the IN2P3 electronic test-bed: ~10 TB. –For the DC3b data challenge: 100 TB? Considering a replacement for the lightweight transfer tool (bbftp). Communities: high energy physics, astrophysics, biology, biomedical, arts and humanities.
iRODS contributions. Scripts: –Test of icommands functionalities. icommand: –iscan (release 2.3): admin command. Micro-services: –Access control: flexible firewall. –Msi to tar/untar files and register them in iRODS. –Msi to set ACLs on objects/collections. Universal Mass Storage driver. Miscellaneous (related to the Resource Monitoring System): –Choose best resource based on the load. –Automatic setup of status for a server (up or down).
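The "choose best resource based on the load" idea, combined with the up/down status, can be sketched as follows. This is an illustration only, not the contributed code: the data layout (a dict with `status` and `load` keys) and the function name are assumptions.

```python
def choose_resource(resources):
    """Pick the least-loaded resource among those marked 'up'.

    `resources` maps a resource name to a dict with the hypothetical keys
    'status' ('up' or 'down') and 'load' (0.0 = idle, 1.0 = saturated).
    """
    candidates = {
        name: info["load"]
        for name, info in resources.items()
        if info["status"] == "up"
    }
    if not candidates:
        raise RuntimeError("no resource is up")
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    monitored = {
        "disk1": {"status": "up", "load": 0.7},
        "disk2": {"status": "down", "load": 0.1},
        "disk3": {"status": "up", "load": 0.3},
    }
    print(choose_resource(monitored))
```

A server marked down is excluded before loads are compared, so an idle but unreachable server is never selected.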
JUX: Java Universal eXplorer. Provides a single GUI for accessing data on the GRID. JUX tries to be intuitive and easy to use for non-expert users: –uses context menus, drag-and-drop… –close to widely used explorers (e.g. Windows Explorer). Written in Java by Pascal Calvat. Based on the JSAGA API developed at CC-IN2P3 by Sylvain Reynaud. JSAGA provides the data management layer: –Protocols: srb, irods, gsiftp, srm, http, file, sftp, zip… –SRB and iRODS plugins use Jargon. –A plugin for a new protocol can be added easily. JSAGA provides security mechanisms: –Globus proxy, VOMS proxy, Login/Password, X509
JUX: Java Universal eXplorer Download:
iRODS overall assessment. iRODS is becoming more and more popular in the IN2P3 community and beyond. Very flexible, large number of functionalities. Can be interfaced with many different technologies (no limit): –Cloud, Mass Storage, web services, databases, …. Able to answer a vast range of needs of our user community. Lots of projects = lots of work for us! Goal for this year: ~ x00 TB (guess: > 300 TB). Should reach PB scale very quickly.
FJKPPL? CC-IN2P3, the KISTI Supercomputing Center and the KEK Computing Research Center have agreed to build a three-site collaboration – We share common interests in Grid computing – We will discuss what we will do together. The same effort is under way in BIO_1 as well.
SUMMARY
Summary. CC-IN2P3 and KEK-CRC are working on common problems in Grid computing, mostly independently but in an interactive and complementary way – SAGA as the solution for Grid interoperability – iRODS as the solution for data management in smaller projects. Long-term collaboration has a benefit – For KEK, CC-IN2P3 is a very strong partner that provides useful software tools.