SARA Reken- en Netwerkdiensten ToPoS: High-Throughput Parallel Processing Pipelines on the Grid Pieter van Beek SARA Computing and Networking Services.

Presentation transcript:

SARA Reken- en Netwerkdiensten ToPoS: High-Throughput Parallel Processing Pipelines on the Grid Pieter van Beek SARA Computing and Networking Services High Performance Computing and Visualization e-Science Support ToPoS | 23 October 2008

SARA Computing & Networking Services
- SARA hosts the Dutch (NCF) supercomputer
- SARA/UvA have set up the E-bioLab
- CAVE used for 3D visualisation
- SARA provides infrastructure and support for academic projects and institutions
- Tier-1 site for the LHC at CERN

SARA e-science support
- E-science support assists scientists in using high-performance computational infrastructure
- Daily work ranges from programming software tools to consultancy in e-science projects
- A team of 9 persons in 2008
- Pieter van Beek: TOPOS developer

Main Grid projects
- BioAssist - NBIC (Netherlands BioInformatics Centre): bioinformatics
- Life Science Grid: Grid for life sciences
- Big Grid: Grid infrastructure and e-science
- LOFAR: satellite dishes in Netherlands and Germany

Life Science Grid
- Various clusters in The Netherlands for bioinformatics computing
- Clusters can be used separately or through the gLite middleware
- Open to all life scientists in the Netherlands
- Support on various levels provided by SARA

User experiences with gLite
- Overhead for starting jobs is considerable
- Determining the best chunk size is difficult
  - Too small -> large overhead
  - Too large -> timeouts and throughput problems
- Resource brokering is far from optimal
- Jobs often fail, and users create their own tools for administrative tasks

Resource Brokering
- Submitted jobs are sent to a CE (Computing Element) immediately
- When another CE becomes available later, the submitted jobs do not use it automatically

Failing Jobs (1)
Common experiences:
- Sorry, an Incomprehensible Error occurred
- Your VOMS Credential has expired
- What Job?
- Success! (but there’s no output)
- Failure! (but it ran just fine)
- Out of Wall-time (but no CPU-time?)
A lot of “monitoring and resubmission” software is created again and again by many users.

Failing Jobs (2)
A real-world example:
- 27,000 jobs
- Duration: approx. 4 hrs per job
- Approx. 280 WNs (worker nodes)
- Theoretical duration: 16 days
- But with a success rate of 70% …
  - Approx. 9 resubmissions
  - “Practical” duration: >2 months
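The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes the 4 hours are per job, that all 280 worker nodes are busy the whole time, and that each submission round fails independently at the stated 30% rate; the variable names are illustrative only. It reproduces the 16-day ideal duration and shows that about nine submission rounds are needed before the expected number of unfinished jobs drops below one. It does not model queueing or monitoring delays, so it does not try to reproduce the "more than 2 months" figure.

```python
# Back-of-the-envelope check of the slide's numbers (assumptions noted inline).
jobs = 27_000
hours_per_job = 4        # assumed to be the duration of a single job
worker_nodes = 280       # assumed to be fully occupied throughout
success_rate = 0.70

# Ideal case: every job succeeds on the first attempt.
ideal_hours = jobs * hours_per_job / worker_nodes
ideal_days = ideal_hours / 24

# With failures: each round leaves 30% of the submitted jobs unfinished.
remaining = float(jobs)
rounds = 0
while remaining >= 1:
    remaining *= 1 - success_rate
    rounds += 1

print(f"ideal duration: {ideal_days:.1f} days")   # ideal duration: 16.1 days
print(f"submission rounds needed: {rounds}")      # submission rounds needed: 9
```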

Pilot Jobs
[Figure: “normal” jobs compared with pilot jobs]

Simplest possible solution: Topos I
- An online counter, like a “page views” counter
- Numbers are “leased” for some period
- Leases must be renewed
- Interfaced with HTTP (REST web service)
- Can be used with any HTTP client (wget, browsers)
- As little security as possible
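The leasing idea can be illustrated with a small in-memory model. This is a minimal sketch of a counter with leases, not the ToPoS implementation: the class name `LeaseCounter`, its methods, and the injectable `clock` are all assumptions made for illustration, and the real service exposes this behaviour over HTTP rather than as a Python object.

```python
import time


class LeaseCounter:
    """Hypothetical model: hands out numbers; an expired lease is handed out again."""

    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock
        self.next_number = 0
        self.leases = {}  # number -> time at which its lease expires

    def acquire(self):
        """Return an expired number if there is one, otherwise a fresh one."""
        now = self.clock()
        expired = sorted(n for n, expiry in self.leases.items() if expiry <= now)
        if expired:
            number = expired[0]
        else:
            number = self.next_number
            self.next_number += 1
        self.leases[number] = now + self.lease_seconds
        return number

    def renew(self, number):
        """Extend the lease; a worker calls this while it is still busy."""
        if number not in self.leases:
            raise KeyError(number)
        self.leases[number] = self.clock() + self.lease_seconds

    def finish(self, number):
        """Mark the work for this number as done; it is never handed out again."""
        del self.leases[number]


# Usage with a fake clock so the example does not need to sleep.
now = [0.0]
counter = LeaseCounter(lease_seconds=10, clock=lambda: now[0])
first = counter.acquire()    # a worker takes number 0 and then crashes
second = counter.acquire()   # another worker takes number 1
now[0] = 5.0
counter.renew(second)        # the healthy worker keeps its lease alive
now[0] = 11.0
retry = counter.acquire()    # number 0 has expired and is handed out again
```

The point of the design is that a crashed worker needs no cleanup: its number simply becomes available again once the lease runs out.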

Pilot job flow
[Flowchart; boxes: Submit pilot job with token; Running pilot job; Get unused token; Affirm token use; Execute token task; Finished? (yes / no); Delete token]
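One plausible reading of this flowchart is the loop sketched below. The ordering of the steps is an assumption, since the transcript only preserves the box labels, and the token pool is modelled as a plain dictionary rather than a remote server. The function name `run_pilot`, the state strings, and the `max_tasks` safety limit are hypothetical.

```python
def run_pilot(pool, execute, max_tasks=100):
    """Process tokens until none are unused or max_tasks attempts were made.

    pool    -- dict mapping token -> "unused" or "in use" (stand-in for the server)
    execute -- callable(token) -> bool, True if the token's task succeeded
    """
    finished = []
    for _ in range(max_tasks):
        # Get unused token.
        token = next((t for t, state in pool.items() if state == "unused"), None)
        if token is None:
            break  # nothing left to do, so the pilot job exits
        pool[token] = "in use"  # affirm token use
        try:
            succeeded = execute(token)  # execute token task
        except Exception:
            succeeded = False
        if succeeded:
            del pool[token]  # finished: delete token
            finished.append(token)
        else:
            pool[token] = "unused"  # not finished: the token can be retried
    return finished


# Usage: task "b" fails on its first attempt and is retried automatically.
attempts = {}

def flaky_task(token):
    attempts[token] = attempts.get(token, 0) + 1
    return not (token == "b" and attempts[token] == 1)

pool = {"a": "unused", "b": "unused", "c": "unused"}
done = run_pilot(pool, flaky_task)
```

Because a token is only deleted after its task succeeds, a failed task stays in the pool and is picked up again without any user-written resubmission logic.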

Advantages
- Simple design and use
  - Using HTTP REST
- Automatic resubmissions
- Less overhead for large numbers of jobs
  - One pilot job can execute several tasks in sequence
- Improved scheduling
- Easy job administration by querying the Token Pool Server
  - Progress
  - Failure rate

Topos I screenshots
[Screenshots; slide dated 14 November 2008]

User experiences
- First users are biologists
  - A large number of sequence alignments
  - Weeks of execution time
  - Originally a high failure rate
- TOPOS improved this situation considerably
- Easily scripted using cURL
- Progress could be monitored in a web browser
  - …without a Grid certificate

Topos 2.x
- Interfaced by WebDAV instead of plain HTTP
- Tokens are files, i.e. they have
  - identity
  - content
  - MIME type
  - properties
- Token pools are directories
- Tokens can be moved between directories
  - Allows users to build pipelines and workflows (high-level colored Petri nets)
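The pipeline idea can be sketched on a local file system, where directories stand in for token pools and a file rename stands in for moving a token between WebDAV collections. This is an illustration under those assumptions only: the pool names, the `run_stage` helper, and the two toy processing steps are hypothetical and are not part of ToPoS.

```python
import tempfile
from pathlib import Path


def run_stage(src: Path, dst: Path, transform) -> int:
    """Process every token file in pool `src` and move it to pool `dst`."""
    moved = 0
    for token in sorted(src.iterdir()):
        token.write_text(transform(token.read_text()))  # update the token content
        token.rename(dst / token.name)                  # move it to the next pool
        moved += 1
    return moved


with tempfile.TemporaryDirectory() as root:
    # Three pools form a two-stage pipeline: raw -> aligned -> done.
    raw, aligned, done = (Path(root) / name for name in ("raw", "aligned", "done"))
    for pool in (raw, aligned, done):
        pool.mkdir()
    (raw / "token-1").write_text("acgt")

    run_stage(raw, aligned, str.upper)           # stage 1: toy processing step
    run_stage(aligned, done, lambda s: s[::-1])  # stage 2: another toy step
    result = (done / "token-1").read_text()      # "TGCA"
```

Each stage only needs to know its input and output pool, so stages can run independently and a workflow emerges from which pool a token currently sits in.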

Topos 2 screenshot
[Screenshot]

“Portfolio”
- SciaGrid
  - Collaboration between SRON, KNMI, NIKHEF and SARA
  - Website where users can select
    - satellite data (Sciamachy)
    - data processors
- Arnold Kuzniar and Jack Leunissen (WUR)
  - BLAST protein sequence alignment
- Bas Dutilh (CMBI)
  - HAMMER sequence alignment (?)
- Jan Bot (TUD)

Future directions
- Documentation
- Atom/RSS instead of WebDAV
- Back to numbers instead of files
- TODO
