Setting up a Pan-European Datagrid using QCDgrid technology. Chris Johnson, James Perry, Lorna Smith and Jean-Christophe Desplat, EPCC, The University of Edinburgh.


1 Setting up a Pan-European Datagrid using QCDgrid technology Chris Johnson, James Perry, Lorna Smith and Jean-Christophe Desplat EPCC, The University Of Edinburgh The ENACTS “Demonstrator”

2 This Talk A summary of the title: –ENACTS Demonstrator –Pan-European Datagrid –QCDgrid.

3 ENACTS European Network for Advanced Computing Technology for Science. EC-funded project with 14 members. Started in 2000. Attempt to ensure that Europe did not lag behind the US in grid technology. ENACTS originally consisted of many reports with little technical work.

4 14 ENACTS partners Please see: www.enacts.org

5 ENACTS Demonstrator: Partners involved EPCC, Edinburgh, UK –Chris Johnson –Jean-Christophe Desplat –James Perry Parallab, Bergen, Norway –Jacko Koster –Jan-Frode Myklebust –Csaba Anderlik TCD, Dublin, Ireland –Geoff Bradley –Bob Crosbie.

6 ENACTS Demonstrator… Objective –“To enable the formation of a pan-European HPC metacentre…”. The Demonstrator is part of Phase II of the activity and its specific objective is –“to draw together the results from all of the Phase I technology studies and evaluate their practical consequences for operating a pan-European metacentre and constructing a best-practice model for collaborative working amongst individual facilities”.

7 ENACTS Demonstrator ENACTS Phase I –consisted mainly of reports ENACTS Phase II –contained the Demonstrator activity Phase I identified technologies such as –Globus, replica management, LDAP database and XML metadata. All these technologies are inherent in the QCDgrid system.

8 metacentre A “virtual organisation” with data described by metadata. –Users submit data from any site –The data is stored on “the grid” –All data is stored reliably –All data is easy to retrieve.

9 Our Demonstrator Set up QCDgrid across the 3 sites to create our metacentre. Use a genuine scientific scenario. Use an XML schema for metadata. Ensure the data is portable between the systems involved.

10 Summary so far… ENACTS demonstrator project is an EC-funded project involving 3 partners attempting to set up a pan-European metadata centre using QCDgrid technology.

11 What is QCDgrid? It’s not QCD-specific!! QCDgrid was written to manage the QCD data belonging to the UK QCD community (UKQCD) –2 previous All-Hands talks (James Perry, EPCC). The original grid consisted of 6 geographically dispersed sites. Around 5 terabytes of data. The amount of data is expected to grow dramatically when QCDOC comes online later in 2004.

12 What is QCDgrid? QCDgrid is a layer of software written on top of the Globus Toolkit. –Uses security infrastructure and basic grid operations such as data transfer –also uses more advanced features such as the replica catalogue.

13 How does QCDgrid work? A control thread runs on one storage element –constantly scans grid –ensures all storage elements are working –ensures all files are stored in at least 2 suitable locations. When a new file is added it is rapidly replicated across grid onto 2 or more geographically separate sites.
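The replication rule on this slide can be sketched in a few lines of Python. This is only an illustration, not QCDgrid's own code: the `Element` record, the `plan_replicas` function and the in-memory catalogue are hypothetical stand-ins, whereas the real control thread works through the Globus replica catalogue.

```python
from dataclasses import dataclass

MIN_COPIES = 2  # the slide's rule: every file in at least 2 suitable locations


@dataclass(frozen=True)
class Element:
    """A storage element (hypothetical record, not a QCDgrid type)."""
    name: str
    site: str
    alive: bool = True


def plan_replicas(catalogue, elements, min_copies=MIN_COPIES):
    """Return {filename: [element names to copy to]} for under-replicated files.

    catalogue maps a filename to the set of element names holding a copy.
    Only copies on live elements count, and new copies go to sites that do
    not already hold the file, so replicas stay geographically separate.
    Files with no live copy are skipped: there is nothing to copy from.
    """
    by_name = {e.name: e for e in elements}
    plan = {}
    for filename, holders in sorted(catalogue.items()):
        live = [by_name[h] for h in holders if h in by_name and by_name[h].alive]
        if not live:
            continue
        used_sites = {e.site for e in live}
        targets = []
        for candidate in elements:
            if len(used_sites) >= min_copies:
                break
            if candidate.alive and candidate.site not in used_sites:
                targets.append(candidate.name)
                used_sites.add(candidate.site)
        if targets:
            plan[filename] = targets
    return plan
```

A control loop would call `plan_replicas` on each scan and start a transfer for every entry in the returned plan. Counting distinct sites rather than raw copies is a design choice made here to reflect the "geographically separate" requirement.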

14 QCDgrid: Dealing with node loss If a storage element is lost unexpectedly, all files that were held on the failed system are replicated elsewhere. QCDgrid can cope with the loss of an entire site. If the control node is lost – control reverts to a secondary node.
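The recovery behaviour described on this slide can be illustrated with a small sketch. It assumes a simple in-memory catalogue and an ordered list of control nodes. The `handle_loss` function and its arguments are hypothetical and do not come from the QCDgrid code base.

```python
def handle_loss(catalogue, lost, control_nodes):
    """React to the loss of one or more storage elements.

    catalogue     maps filename -> set of element names holding a copy
    lost          set of element names that have failed
    control_nodes ordered list: primary control node first, then backups

    Returns (surviving, to_replicate, unrecoverable, control) where
    surviving is the catalogue with lost elements removed, to_replicate
    lists files that lost a copy but still have one to copy from,
    unrecoverable lists files with no copy left, and control is the first
    control node that is still available (None if all are lost).
    """
    surviving = {f: holders - lost for f, holders in catalogue.items()}
    to_replicate = sorted(
        f for f, h in surviving.items() if h and h != catalogue[f]
    )
    unrecoverable = sorted(f for f, h in surviving.items() if not h)
    control = next((n for n in control_nodes if n not in lost), None)
    return surviving, to_replicate, unrecoverable, control
```

With every file held on at least two separate sites, `unrecoverable` should stay empty after the loss of any single site, which is the property the slide claims.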

15 How is QCDgrid used? Assuming QCDgrid is set up and the control thread and metadata database are running… and that each user has a valid certificate… User submits the usual initialisation commands –grid-proxy-init –source the correct set up files –sets a few paths, classpaths, etc.

16 Submitting files Command line User submits a file (datafile.dat) –put-file-on-qcdgrid datafile.dat AND an accompanying metadata file which describes the above data file –put-file-on-qcdgrid datafile.xml –exist:/db>put datafile.xml
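As a rough aid, the command-line steps from this slide and the previous one can be collected into a checklist for a given data file. The sketch below runs nothing: it only builds the command strings as they appear on the slides, in the order listed. The `submission_steps` helper is hypothetical, and it assumes the metadata file shares the data file's base name, as in the `datafile.dat` / `datafile.xml` example.

```python
from pathlib import Path


def submission_steps(data_file):
    """Return the slides' submission steps for one data file as strings.

    Nothing is executed; the strings mirror the commands shown on the
    slides so they can be reviewed or logged before being typed by hand.
    """
    data = Path(data_file)
    metadata = data.with_suffix(".xml")  # assumption: metadata shares the stem
    return [
        "grid-proxy-init",
        f"put-file-on-qcdgrid {data.name}",
        f"put-file-on-qcdgrid {metadata.name}",
        f"exist:/db>put {metadata.name}",  # typed at the eXist client prompt
    ]
```

For example, `submission_steps("datafile.dat")` reproduces the four lines a user would work through for the slide's example file.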

17 Submitting files (better still…) Using the GUI User runs the Java GUI and submits both data file and metadata file at the same time. The metadata file has a tag for the name of the corresponding data file –marries the two –every data file should have an associated metadata file.
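The link between a metadata file and its data file can be illustrated with a short sketch. The tag name `dataFile` and both helper functions are assumptions made for this example, since the slides do not show the actual QCDgrid schema.

```python
import xml.etree.ElementTree as ET

DATA_FILE_TAG = "dataFile"  # hypothetical tag name; the real schema may differ


def linked_data_file(metadata_xml):
    """Return the data file name recorded in a metadata document, or None."""
    root = ET.fromstring(metadata_xml)
    node = root.find(f".//{DATA_FILE_TAG}")
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


def files_without_metadata(data_files, metadata_docs):
    """List data files that no metadata document refers to."""
    described = {linked_data_file(doc) for doc in metadata_docs}
    return sorted(set(data_files) - described)
```

A check like `files_without_metadata` is one way to enforce the rule that every data file should have an associated metadata file before anything is submitted.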

18 Metadata Browser Can submit, search and retrieve data using this Java browser.

19 Metadata/Datagrid Integration QCDgrid software deals with storage and replication of data. eXist database deals with cataloging of data using metadata. All can be controlled using command line or GUI.

20 More on QCDgrid Other commands available with QCDgrid –qcdgrid-list lists all files on grid. –get-file-from-qcdgrid retrieves files from grid. –i-like-this-file attempts to store a file local to the user. There are also several commands for administering nodes, etc.

21 QCDgrid Sites ? QCDgrid ENACTS ?

22 Our deployment of QCDgrid UK (all using e-science CA) -> Europe (all using different CAs). Moving from a homogeneous Linux environment to a mixed one (Linux/Solaris). Moving from Globus Toolkit (GT) 2.0 -> GT 2.4.

23 How difficult was it? Certificates –Some certificate issuers took several weeks to issue certificates. –Different policies on issuing certificates, e.g. non-human users (project accounts). –Not too many difficulties using multiple certificates.

24 How difficult was it? Moving to a heterogeneous environment. –Installing Globus 2.x is difficult on Solaris – led to the Solaris node being unable to submit data. A few minor problems getting system-specific functions to work (e.g. df command). Usual minor compilation issues – did require gcc compiler.

25 How difficult was it? Globus –This presented the biggest difficulty! –Installation difficulties and firewall issues: it took several months before a “helloworld” job would run from any site to any other. –Migrating from GT 2.0 -> GT 2.4: major difficulties! Had to rewrite the replica schema and remove some error-handling functionality.

26 Users – Scientific scenario Appealed to QCD –Given more time we could have found a different discipline. Two users from TCD as well as those involved in the project itself. Code used was MILC code. Monte-Carlo simulation to investigate string-breaking. Three Monte-Carlo chains, one on each node.

27 User feedback Generally impressed with functionality. Some frustration in getting certificates. Difficulty persuading users to use metadata, although agreed it was useful. Would make more use of file-sharing using such a system. Liked machine-independent data. Wanted grid to do job submission.

28 Conclusions We have created a pan-European datagrid (metacentre) using QCDgrid technology. The system works well… …Globus is the limiting factor. Users were impressed with the system in use. The system was not tested with many users but we can see no reason why it would not scale to many users/nodes if Globus allows.

29 Acknowledgements James Perry. Jean-Christophe Desplat, Jacko Koster, Jan-Frode Myklebust, Csaba Anderlik, Geoff Bradley, Bob Crosbie. Craig McNeile and Bálint Joó. Mike Peardon and Jimmy Juge.

30 References ENACTS –http://www.enacts.org QCDgrid –http://www.gridpp.ac.uk/qcdgrid –code: http://forge.nesc.ac.uk/projects/qcdgrid MILC code –http://physics.indiana.edu/~sg/milc.html

