1 Kihyeon Cho & Soonwook Hwang (KISTI) Super Belle with FKPPL VO & AMGA Data Handling.

2 Contents  KISTI Supercomputing Center  FKPPL VO Farm  Super Belle Data Handling  Summary

3 FKPPL VO Grid Testbed

4 Goal  Background  Collaborative work between KISTI and CC-IN2P3 in the area of Grid computing under the framework of FKPPL  Objective  (short-term) to provide a Grid testbed to the e-Science summer school participants in order to keep drawing their attention to Grid computing and e-Science by allowing them to submit jobs and access data on the Grid  (long-term) to support the other FKPPL projects by providing a production-level Grid testbed for the development and deployment of their applications on the Grid  Target Users  FKPPL members  2008 Seoul e-Science summer school Participants


6 VO Registration Detail  Official VO Name   Description  VO dedicated to joint research projects of the FKPPL (France Korea Particle Physics Laboratory), under a scientific research programme in the fields of high energy physics (notably LHC and ILC) and e-Science, including bioinformatics and related technologies  Information about the VO 

7 FKPPL VO Usage  Application porting support on FKPPL VO  Geant4  Detector Simulation Toolkit  Working with National Cancer Center  WISDOM  MD part of the WISDOM drug discovery pipeline  Working with the WISDOM Team  Support for FKPPL Member Projects  Grid Testbed for e-Science School  Seoul e-Science summer school

8 How to access resources in FKPPL VO Testbed  Get your certificate issued by KISTI CA  uest.php  Apply for FKPPL VO membership   Get a user account on the UI node for FKPPL VO  Send an email to the system administrator at

9 User Support  FKPPL VO Wiki site   User Accounts on UI machine  17 User accounts have been created  FKPPL VO Registration  4 users have been registered as of now

10 Contact Information  Soonwook Hwang (KISTI), Dominique Boutigny (CC-IN2P3)  Responsible persons  Sunil Ahn (KISTI), Yonny Cardenas (CC-IN2P3)  Technical contact persons  Namgyu Kim  Site administrator  Sehoon Lee  User support

11 Configuration (Monday, December 2, 2008, Yonny CARDENAS)  KISTI site: VOMS, WMS, CE+WN*, UI, Wiki  CC-IN2P3 site: CE+WN, SE, LFC  * Infrastructure installation in progress (a cluster with 128 cores has been purchased)

12 Configuration  VO Registration procedure  VO  VO manager: Sunil Ahn  Status: Active

13 Status (Operational Services)  KISTI site: VOMS OK, WMS OK, CE OK, Wiki OK, SE OK, WN* in progress  CC-IN2P3 site: SE OK, dCache/SRM OK, WN OK, CE OK, LFC OK

14 Available Services: Job Submission  Available since October 1, 2008  Resource allocation: 5 million hours CPU SI2K  CC-IN2P3  Job monitoring  Quality of Service  Operation team

15 Available Services: Data storage  dCache SE/SRM  System for storing and retrieving data, distributed among a large number of heterogeneous server nodes.  Implements the SRM v2.2 interface required by EGEE/LCG  Resource allocation: 0.5 terabytes

16 Available Services: Data storage  AFS (Andrew File System)  Network file system for personal and group files, experiment software, system tools (compilers, libraries, ...)  Indirect use (jobs)  Resource allocation: 2 gigabytes

17 Available Services: Data storage  LFC (LCG File Catalog)  Maintains mappings between logical file names (LFN) and SRM file identifiers.  Supports references to SRM files in several storage elements.
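The mapping described on this slide can be pictured with a small in-memory model. This is only a sketch of the idea (one logical file name pointing at several replicas), not the LFC client API; the class, the LFN, and the storage host names are all hypothetical.

```python
from collections import defaultdict


class FileCatalog:
    """Toy stand-in for a file catalogue: one LFN maps to many replica identifiers."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, lfn, surl):
        """Record that the file known as `lfn` has a copy at `surl`."""
        if surl not in self._replicas[lfn]:
            self._replicas[lfn].append(surl)

    def replicas(self, lfn):
        """Return every registered replica of `lfn` (empty list if unknown)."""
        return list(self._replicas.get(lfn, []))


catalog = FileCatalog()
# Hypothetical names: neither the LFN nor the storage hosts come from the slides.
catalog.register("/grid/example-vo/run1/hits.root", "srm://se-a.example.org/data/0001")
catalog.register("/grid/example-vo/run1/hits.root", "srm://se-b.example.org/data/9f3c")
print(catalog.replicas("/grid/example-vo/run1/hits.root"))
```

A job only needs the logical name; the catalogue decides which storage element copies exist.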

18 Utilisation - Services: Job Submission  October: 34 jobs for 150 hours CPU SI2K  November: 1690 jobs for 48250 hours CPU SI2K
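As a rough reading of these figures, the average CPU time per job and the share of the allocation consumed can be worked out as below. The job counts and hours are the ones on this slide, and the 5 million hour allocation is the one quoted on the job submission slide; the variable and function names are illustrative.

```python
# Figures from the slides: (number of jobs, CPU hours in SI2K units) per month.
ALLOCATION_HOURS = 5_000_000
usage = {"October": (34, 150), "November": (1690, 48250)}


def mean_hours_per_job(jobs, cpu_hours):
    """Average CPU hours consumed by one job."""
    return cpu_hours / jobs


for month, (jobs, hours) in usage.items():
    print(f"{month}: {mean_hours_per_job(jobs, hours):.2f} CPU hours per job")

used_hours = sum(hours for _, hours in usage.values())
print(f"Used {used_hours / ALLOCATION_HOURS:.2%} of the allocation")
```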

19 Utilisation - Services: Data Storage  7193 files using 60 GB of space  440 GB available

20 User Support  FKPPL VO Wiki site   User Accounts on UI  20 user accounts have been created  FKPPL VO Membership Registration  7 users have been registered for FKPPL VO membership

21 FKPPL VO Usage  Deployment of Geant4 applications on FKPPL VO  Detector Simulation Toolkit  Working with Jungwook Shin at National Cancer Center  Grid Interoperability Testbed

22 Geant4 Application: GTR2_com  Application name: GTR2_com (Geant4 application for proton therapy simulation software developed by NCC)  GTR2: Gantry Treatment Room #2; com: commissioning (the GTR2 simulation code is currently in the commissioning phase)  Libraries: Geant4, ROOT (as the simulation output library)  Workflow: user macro -> GTR2_com -> output  The input to GTR2_com is the nozzle configuration. GTR2_com reads a macro file that specifies this configuration and writes the dose distribution produced by the final proton beam as a 3D histogram in a ROOT file.  Example user macro:
    /user/io/OpenFile root B6_1_1_0.root
    /GTR2/SNT/type 250
    /GTR2/SNT/aperture/rectangle open
    #Geant4 kernel initialize
    /run/initialize
    /GTR2/FS/lollipops 9 5
    /GTR2/SS/select 3
    /GTR2/RM/track 5
    /GTR2/RM/angle 80.26
    /GTR2/VC/setVxVy cm 14.2 15.2
    /beam/particle proton
    /beam/energy E MeV 181.8 1.2
    /beam/geometry mm 3 5
    /beam/emittance G mm 1.5
    /beam/current n 3000000
    #SOBP
    /beam/bcm TR2_B6_1 164
    /beam/juseyo
    /user/io/CloseFile
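When many GTR2_com jobs are submitted at once, each job needs its own macro so that the output ROOT files do not collide. The sketch below shows one way this could be done: the macro commands are copied from the slide and treated as opaque text, and only the output file name is templated. The `job_NNNN` naming scheme and the helper names are assumptions, not something the slides describe.

```python
# Macro commands copied from the slide; only the output file name is templated.
MACRO_TEMPLATE = """\
/user/io/OpenFile root {output_file}
/GTR2/SNT/type 250
/GTR2/SNT/aperture/rectangle open
#Geant4 kernel initialize
/run/initialize
/GTR2/FS/lollipops 9 5
/GTR2/SS/select 3
/GTR2/RM/track 5
/GTR2/RM/angle 80.26
/GTR2/VC/setVxVy cm 14.2 15.2
/beam/particle proton
/beam/energy E MeV 181.8 1.2
/beam/geometry mm 3 5
/beam/emittance G mm 1.5
/beam/current n 3000000
#SOBP
/beam/bcm TR2_B6_1 164
/beam/juseyo
/user/io/CloseFile
"""


def make_macro(output_file):
    """Return the macro text with `output_file` as the ROOT output name."""
    return MACRO_TEMPLATE.format(output_file=output_file)


def make_job_macros(n_jobs, prefix="job"):
    """Return {macro file name: macro text}, one per job (hypothetical naming scheme)."""
    return {
        f"{prefix}_{i:04d}.mac": make_macro(f"{prefix}_{i:04d}.root")
        for i in range(n_jobs)
    }


macros = make_job_macros(3)
for name in sorted(macros):
    print(name, "->", macros[name].splitlines()[0])
```

Writing each entry of `macros` to disk would give one input file per grid job.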

23 Distribution of the completion time of 1000 GTR2_com jobs on FKPPL VO  The 1000 GTR2_com jobs were submitted at around 18:05

24 BC408 MC study  Purpose: an accurate simulation study will help to design and construct a dosimetry device utilizing the BC408 scintillator

25 BC408 MC on FKPPL  Resolution: 2 mm in X, Y and 1 mm thickness  For 1 file, ~3.5 hours on a FKPPL WN  589 files successfully generated out of 99 x 7 = 693 jobs  The parametric job was submitted immediately after initializing the proxy
    Trial #   Total jobs   Completed   Errors
    1         99           65          34
    2         99           66          33
    3         99           99          0
    4         99           98          1
    5         99           84          15
    6         99           80          19
    7         99           97          2
    Total     693          589         104
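The totals on this slide can be cross-checked with a few lines of arithmetic. The per-trial counts below are read from the slide's table as (trial, total jobs, completed, errors); the values for trials 3 and 4 are the ones consistent with the stated totals of 693, 589, and 104.

```python
# (trial, total jobs, completed, errors) as read from the slide's table.
trials = [
    (1, 99, 65, 34),
    (2, 99, 66, 33),
    (3, 99, 99, 0),
    (4, 99, 98, 1),
    (5, 99, 84, 15),
    (6, 99, 80, 19),
    (7, 99, 97, 2),
]

total_jobs = sum(total for _, total, _, _ in trials)
completed = sum(done for _, _, done, _ in trials)
errors = sum(err for _, _, _, err in trials)
success_rate = completed / total_jobs

for trial, total, done, _ in trials:
    print(f"trial {trial}: {done / total:.1%} completed")
print(f"overall: {completed}/{total_jobs} = {success_rate:.1%}")
```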

26 Super Belle Data Handling

27 Super Belle Computing  Conveners: T. Hara, T. Kuhr  Distributed Computing (Martin Seviour)  Data Handling (Kihyeon Cho)  Data Base (Vacant)

28 Super Belle Data Handling  Data Handling depends on distributed computing.  Cloud Computing?  Grid farm?  Data Handling Suggestions (2/17)  SAM (Sequential Access through Metadata)  CDF by Thomas Kuhr  AMGA  KISTI by Soonwook Hwang

29 EGEE (Enabling Grids for E-SciencE)  The largest multi-disciplinary grid infrastructure in the world  Objectives  Build large-scale, production-quality grid infrastructure for e-Science  Available to scientists 24/7  EGEE grid infrastructure  300 sites in 50 countries  80,000 CPU cores  20 PBytes  10,000 users

30 Overview of AMGA (1/2)  Metadata is data about data  AMGA provides:  Access to Metadata for files stored on the Grid  A simplified general access to relational data stored in database systems.  2004 – the ARDA project evaluated existing Metadata Services from HEP experiments  AMI (ATLAS), RefDB (CMS), Alien Metadata Catalogue (ALICE)  Similar goals, similar concepts  Each designed for a particular application domain  Reuse outside intended domain difficult  Several technical limitations: large answers, scalability, speed, lack of flexibility  ARDA proposed an interface for Metadata access on the GRID  Based on requirements of LHC experiments  But generic - not bound to a particular application domain  Designed jointly with the gLite/EGEE team

31 Overview of AMGA (2/2)  What is AMGA? (ARDA Metadata Grid Application)  Began as a prototype to evaluate the Metadata Interface  Evaluated by the community since the beginning:  Matured quickly thanks to user feedback  Now part of gLite middleware: EGEE's gLite 3.1 MW  Requirements from HEP community  Millions of files, 6000+ users, 200+ computing centres  Mainly (read-only) file metadata  Main concerns: scalability, performance, fault-tolerance, support for hierarchical collections  Requirements from Biomed community  Smaller scale than HEP  Main concerns: security  ARDA Project (A Realisation of Distributed Analysis for LHC)

32 Metadata user requirements  I want to  store some information about files  in a structured way  query a system about that information  keep information about jobs  I want my jobs to have read/write access to that information  have easy access to structured data using the grid proxy certificate  NOT use a database

33 Metadata Concepts  Schema (table, think directory)  Has hierarchical name and list of attributes /prod/events  Attributes (columns)  Have name and storage type  Interface handles all types as strings  Entry (row)  Live in a schema, assign values to attributes  Collections  A set of entries associated with schema  Query  SELECT... WHERE... clause in SQL-like or SQL query language

34 AMGA data organization  Relational schema compared with the AMGA hierarchy
    Relational schema:
      TABLE: HOSPITAL   #name: PATIENTS, DOCTORS   #type: people_group
      TABLE: PATIENTS   #name    sickness   age
                        john     malaria    68
                        george   otitis     84
    AMGA (hierarchy):
      /HOSPITAL/            Collection
      /HOSPITAL/PATIENTS/   Schema/Directory, with entries john and george
      /HOSPITAL/DOCTORS/    Schema/Directory
      Attributes of entry george: sickness = otitis, age = 84
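The schema, attribute, entry, and query concepts from the last two slides can be illustrated with a toy in-memory model using the patient data shown here. This is a sketch of the data model only, not the AMGA client API; the `Directory` class, its methods, and the storage type labels are hypothetical.

```python
class Directory:
    """Toy model of a metadata schema: a hierarchical path plus typed attributes."""

    def __init__(self, path, attributes):
        self.path = path
        self.attributes = dict(attributes)  # attribute name -> storage type label
        self.entries = {}                   # entry name -> {attribute: value}

    def add_entry(self, name, **values):
        """Add an entry (row); values are kept as strings, as the slides describe."""
        unknown = set(values) - set(self.attributes)
        if unknown:
            raise KeyError(f"attributes not in schema: {sorted(unknown)}")
        self.entries[name] = {key: str(value) for key, value in values.items()}

    def select(self, attribute, where):
        """Return (entry name, attribute value) for entries matching `where`."""
        return [
            (name, entry[attribute])
            for name, entry in self.entries.items()
            if where(entry)
        ]


# Storage type labels are illustrative; the patient data comes from the slide.
patients = Directory("/HOSPITAL/PATIENTS", {"sickness": "varchar", "age": "int"})
patients.add_entry("john", sickness="malaria", age=68)
patients.add_entry("george", sickness="otitis", age=84)

# Rough equivalent of: SELECT sickness WHERE age > 70
over_70 = patients.select("sickness", lambda entry: int(entry["age"]) > 70)
print(over_70)
```

Because values are stored as strings, the predicate converts `age` back to an integer before comparing.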

35 Importing existing data  Suppose that you have the data  A reasonable question would be:  Can I use my existing database data??  The answer is YES  Importing data to AMGA  Pretty simple  Connect a database to AMGA  Execute the import command  import table directory  Ready to go!
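To make the idea of importing concrete, the sketch below copies the rows of an existing relational table into a directory-like structure keyed by entry name. It uses an in-memory SQLite database purely for illustration and does not use AMGA's own import command; the `import_table` helper, the table contents, and the target path are assumptions.

```python
import sqlite3


def import_table(conn, table, key_column):
    """Copy every row of `table` into a dict keyed by the value of `key_column`."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    entries = {}
    for row in cursor:
        record = dict(zip(columns, row))
        name = str(record.pop(key_column))
        # Values are stored as strings, mirroring a string-based interface.
        entries[name] = {key: str(value) for key, value in record.items()}
    return entries


# Stand-in for a pre-existing database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, sickness TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [("john", "malaria", 68), ("george", "otitis", 84)],
)

# The table becomes the content of one directory in the hierarchy.
directory = {"/HOSPITAL/PATIENTS": import_table(conn, "patients", "name")}
print(directory)
```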

36 AMGA Use Cases

37 AMGA Features in One Slide  Official gLite middleware component for metadata catalogue  Metadata is relationally structured  Schema (aka table, think directory): has hierarchical name and list of attributes, e.g. /prod/events  Attributes (aka columns): have name and storage type; interface handles types as strings  Entry (aka row): lives in a schema, assigns values to attributes  Query: SELECT... WHERE... clause in SQL  Fine-grained access control (ACL) support  Table level and entry level  Tight integration into the Virtual Organization Management System (VOMS)  X509 Grid certificate  Native SQL support in AMGA 1.9  Direct DB access to existing databases on the Grid via SQL  OGF WS-DAIR compatible interface support in AMGA 2.0  Uniform interface to heterogeneous database backends  Oracle, PostgreSQL, MySQL, etc.  Support for many programming APIs  Diverse user community requested/provided  e.g., C/C++, Java, Python, Perl, PHP  Replication support  Full replication, partial replication, federation  Support for “Import of pre-existing databases”

38 AMGA Website

39 To do  Prototype of AMGA  On the Belle platform, we will use AMGA (Namkyu Kim and Dr. Junghyun Kim)  Belle platform => flag  On the Super Belle platform (Grid or Cloud computing), we will use AMGA.

