The HADDOCK WeNMR portal: From gLite to DIRAC submission in three hours
DIRAC workshop @ EGI-CF 2014, Helsinki, May 21st, 2014
Alexandre M.J.J. Bonvin
Project coordinator, Bijvoet Center for Biomolecular Research
Faculty of Science, Utrecht University, the Netherlands
a.m.j.j.bonvin@uu.nl

The WeNMR VRC

A worldwide e-Infrastructure for NMR, SAXS and structural biology (www.wenmr.eu)
WeNMR VRC (May 2014):
- One of the largest (by number of users) VOs in the life sciences
- > 620 VO registered users (36% outside the EU)
- ~ 1200 VRC members (> 60% outside the EU)
- ~ 100,000 CPU cores
- > 2000 SI2K CPU years
- > 5M jobs
- User-friendly access to the Grid via web portals

A glimpse of the WeNMR services portfolio

Distribution of resources
Currently (May 2014): ~100,000 CPU cores

WeNMR VRC users distribution (May 2014)
Over 1200 VRC and 620 VO members (64% / 36% outside Europe)!

The HADDOCK portal

HADDOCK web portal

Molecular Docking
The HADDOCK score consists of different energy terms, of which EAIR is the most important one. In addition, the buried surface area (BSA), the desolvation energy and, if available, extra experimental restraint terms are added. In it0, EvdW carries little weight because of the rigid-body stage; in the water refinement stage, the electrostatics are turned off.
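For illustration, the score can be summarised as a weighted sum of these terms. The weights below are symbolic placeholders: their actual values are stage-dependent (it0, it1, water) and are not given on this slide.

    \text{HADDOCK score} = w_{\text{vdw}} E_{\text{vdw}} + w_{\text{elec}} E_{\text{elec}} + w_{\text{desolv}} E_{\text{desolv}} + w_{\text{BSA}}\,\text{BSA} + w_{\text{AIR}} E_{\text{AIR}}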

HADDOCK server users distribution
Over 4000 registered users (all now have access to the grid-enabled server via a robot certificate)

What is happening behind the scenes?

Grid daemons beyond a portal: data / job transfer
- User level – data transfer via the web interface:
  - data input: ~10 kB to 10 MB
  - data output: ~10 MB to a few GB
- Local cluster – data/job transfer:
  - job submission: ~10 to 50 single jobs
  - data: a few GB
- Grid level – data/job transfer (UI <-> WMS <-> CE <-> WN):
  - job submission: ~250 to 2500 single jobs
  - data transfer: ~1 to 10 MB
  - output: ~1-20 MB per job
- Some portals make use of storage elements for the data

HADDOCK Grid submission
Each job in the pool directory consists of:
- a shell script to be executed on the WN
- a compressed tar archive containing the data/directory structure for the job
- a file specifying the location where the results should be copied
- a JDL script with requirements/ranks for selecting suitable sites, e.g.:

    Requirements = ( Member("OSG_VERSION", other.GlueHostApplicationSoftwareRunTimeEnvironment)
                     || ( other.GlueCEPolicyMaxCPUTime < 720
                          && other.GlueCEPolicyMaxCPUTime > 110
                          && Member("VO-enmr.eu-CNS1.2", other.GlueHostApplicationSoftwareRunTimeEnvironment) ) );
    Rank = ( other.GlueCEStateFreeJobSlots > 0
             ? other.GlueCEStateFreeJobSlots
             : ( -other.GlueCEStateWaitingJobs * 4 / ( other.GlueCEStateRunningJobs + 1 ) ) - 1 );

The software is deployed remotely and the CE is tagged with a software tag (for HADDOCK: VO-enmr.eu-CNS1.2).
Submission is via a robot proxy: fully automated, no user/operator intervention (a sketch of such a submission loop is given after this slide).
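The following is a minimal sketch, not the actual WeNMR portal code, of a daemon that walks a pool directory and submits each prepared job bundle with the gLite WMS command-line tools. The pool path and file names (POOL_DIR, job.jdl, jobids.txt) are assumptions for illustration only.

    import glob
    import os
    import subprocess

    POOL_DIR = "/srv/haddock/pool"  # hypothetical pool directory

    def submit_pending_jobs():
        """Submit every job bundle in the pool that contains a JDL file."""
        for jdl in glob.glob(os.path.join(POOL_DIR, "*", "job.jdl")):
            job_dir = os.path.dirname(jdl)
            # glite-wms-job-submit -a delegates a proxy automatically;
            # -o appends the returned job id to a file used later for status polling.
            result = subprocess.run(
                ["glite-wms-job-submit", "-a",
                 "-o", os.path.join(job_dir, "jobids.txt"), jdl],
                capture_output=True, text=True)
            if result.returncode != 0:
                print("submission failed for", jdl, result.stderr)

    if __name__ == "__main__":
        submit_pending_jobs()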

HADDOCK goes DIRAC
DIRAC submission enabled at minimum cost, in one afternoon, thanks to the help of Ricardo and Andrei:
- clone of the HADDOCK server on a different machine
- no root access required, no EMI software installation required
- minimal changes to our submission and polling scripts
- requirements and ranking no longer needed, only CPUTime:

    JobName       = "dirac-xxx";
    CPUTime       = 100000;
    Executable    = "dirac-xxx.sh";
    StdOutput     = "dirac-xxx.out";
    StdError      = "dirac-xxx.err";
    InputSandbox  = {"dirac-xxx.sh", "dirac-xxx.tar.gz"};
    OutputSandbox = {"dirac-xxx.out", "dirac-xxx.err", "dirac-xxx-result.tar.gz"};

Very efficient submission (~2 s per job, without changing our submission mechanism) and high job throughput (an equivalent job description using the DIRAC Python API is sketched after this slide).
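For comparison, the same job could also be described programmatically with the DIRAC Python API instead of a JDL file. This is only an illustrative sketch, not the portal's actual submission code; the file names simply mirror the JDL above.

    # Sketch: the dirac-xxx job expressed through the DIRAC Python API.
    from DIRAC.Core.Base import Script
    Script.parseCommandLine(ignoreErrors=True)  # initialise the DIRAC environment
    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    job = Job()
    job.setName("dirac-xxx")
    job.setCPUTime(100000)
    job.setExecutable("dirac-xxx.sh")
    job.setInputSandbox(["dirac-xxx.sh", "dirac-xxx.tar.gz"])
    job.setOutputSandbox(["dirac-xxx.out", "dirac-xxx.err", "dirac-xxx-result.tar.gz"])

    result = Dirac().submitJob(job)
    print(result)  # {'OK': True, 'Value': <job id>} on success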

Some statistics

Some statistics: HADDOCK server grid jobs (2014 total ~2.5M, of which ~1.35M via DIRAC)

Conclusions
- Successful and smooth porting of the HADDOCK portal to DIRAC4EGI (initially tested on DIRAC France Grilles)
- Higher performance/reliability than regular gLite-based submission
- Currently almost all HADDOCK grid-enabled portals are redirected to DIRAC

Acknowledgments
- Arne Visscher
- DIRAC4EGI VT: Ricardo Graciani, Andrei Tsaregorodtsev, ...
- HADDOCK Inc.: Gydo van Zundert, Charleen Don, Adrien Melquiond, Ezgi Karaca, Marc van Dijk, Joao Rodrigues, Mikael Trellet, ..., Koen Visscher, Manisha Anandbahadoer, Christophe Schmitz, Panos Kastritis, Jeff Grinstead
- VICI, NCF (BigGrid), BioNMR, WeNMR, DDSG

The End
Thank you for your attention!
HADDOCK online:
http://haddock.science.uu.nl
http://nmr.chem.uu.nl/haddock
http://www.wenmr.eu