EPCC Sun Data and Compute Grids
NeSC Review, 18 March 2004
Geoff Cawood, Terry Sloan
Edinburgh Parallel Computing Centre (EPCC)
Telephone: +44 131 650 5155

Overview
– Description and Aims
– Project Status
– Technical Achievements
– Dissemination/Exploitation
– Future Plans

Description and Aims

Project Goal
“Develop a fully Globus-enabled compute and data scheduler based around Grid Engine, Globus and a wide variety of data technologies”
Partners
– Sun Microsystems
– National e-Science Centre, represented by EPCC
Timescales
– 23 (+2) months duration; the 2-month extension was due to project staff involvement in ODD-Genes
– Start Feb 2002, end Feb 2004
Background
– Grid Engine is an open source distributed resource management (DRM) system
– Globus integration enables sharing of resources amongst collaborating enterprises

Project Scenario
If enterprises A and B could expose some of their machines to each other across the internet…
– Both A and B could enjoy throughput efficiency improvements
– Large gains when one enterprise is busy and the other is idle
[Diagram: users at sites A and B each submit jobs (a, b, c, d and e, f, g, h) to their local Grid Engine; each site's Grid Engine can run the other site's jobs]

Functional Aims
What does the project goal mean in practice? Five key functional aims were identified, derived from questioning existing Grid Engine users during the Requirements workpackage:
1. Job scheduling across Globus to remote Grid Engines
2. File transfer between local client site and remote jobs
3. File transfer between any site and remote jobs
4. Allow 'datagrid aware' jobs to work remotely
5. Data-aware job scheduling

Project Status

Workpackages
WP 1: Analysis of existing Grid components
– WP 1.1: UML analysis of core Globus 2.0
– WP 1.2: UML analysis of Grid Engine
– WP 1.3: UML analysis of other Globus 2.0 components
– WP 1.4: Globus Toolkit V3.0 investigations
– WP 1.5: Data technologies investigations
WP 2: Requirements Capture & Analysis
WP 3: Prototype Development
WP 4: Hierarchical Scheduler Design
WP 5: Hierarchical Scheduler Development

Deliverables
All WPs are finished. Deliverables are available from the project public web site, or from the Grid Engine community web site (for software).
WP 1: Analysis of existing Grid components (FINISHED)
– D1.1 Analysis of Globus Toolkit V2.0
– D1.2 Grid Engine UML Analysis
– D1.3 Globus Toolkit 2.0 GRAM Client API Functions
– D1.4 Globus 3.0 Features and Use
– D1.5.2 Datagrids In Practice
– D1.5.3 GridFTP
– D1.5.4 OGSA-DAI
– D1.5.5 Storage Resource Broker (SRB)
WP 2: Requirements Capture & Analysis (FINISHED)
– D2.1 Use Cases and Requirements
– D2.2 Questionnaire Report
WP 3: Prototype Development (FINISHED)
– D3.1 Prototype Development: Requirements
– D3.2 Prototype Development: Design
– D3.3 Prototype Development: Test Plan
– D3.4 Prototype Development: TOG Software
– D3.6 Prototype Development: How-To
WP 4: Hierarchical Scheduler Design (FINISHED)
– D4.1 JOSH Functional Specification
– D4.2 JOSH Systems Design
WP 5: Hierarchical Scheduler Development (FINISHED)
– JOSH User Guide
– JOSH Software
– JOSH Client Install Guide
– JOSH Server Install Guide
– JOSH Known Problems & Solutions

Technical Achievements "From Sun's perspective, the SunDCG project has been tremendously successful. Together, EPCC and Sun have produced very high quality software and documents, providing real added value to Sun's Grid Engine suite and addressing some of the key issues in robust and usable Grid middleware." Fritz Ferstl, Sun Microsystems

TOG (Transfer-queue Over Globus)
– WP 3 deliverable: a prototype compute scheduler
– Integrates Grid Engine (GE) and Globus 2.2.x/2.4 as a software library
– Supplies GE execution methods (starter method etc.) to implement a 'transfer queue' which sends jobs over Globus to a remote GE
– GE complexes are used for configuration
– Globus GSI for security, GRAM for interaction with the remote GE
– GASS for small data transfers, GridFTP for large datasets
– Written in Java; Globus functionality is accessed through the Java CoG kit
[Diagram: jobs submitted to a transfer queue in the Grid Engine at Site A are forwarded over Globus 2.2.x to the Grid Engine at Site B]
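To make the transfer-queue mechanism concrete, here is a minimal sketch, in the spirit of TOG but not the actual TOG source, of forwarding a job to a remote Grid Engine through pre-WS GRAM using the Java CoG kit. The gatekeeper contact string, job manager name and RSL below are illustrative assumptions.

    import org.globus.gram.GramJob;
    import org.globus.gram.GramJobListener;

    public class TransferQueueSketch {
        public static void main(String[] args) throws Exception {
            // GRAM contact for a gatekeeper fronting the remote Grid Engine;
            // the job manager name is site-specific (assumption for illustration)
            String contact = "remote.example.org/jobmanager-sge";

            // RSL describing the job; TOG derives the real job description
            // from the Grid Engine job script and queue configuration
            String rsl = "&(executable=/home/user/job.sh)(stdout=job.out)";

            GramJob job = new GramJob(rsl);
            job.addListener(new GramJobListener() {
                public void statusChanged(GramJob j) {
                    // getStatus() returns a GRAM status code, e.g. GramJob.STATUS_DONE
                    System.out.println("GRAM status: " + j.getStatus());
                }
            });

            // Submits over GSI-authenticated GRAM; requires a valid proxy
            // certificate (e.g. created beforehand with grid-proxy-init)
            job.request(contact);
        }
    }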

TOG Software Functionality
Supported:
1. Job scheduling across Globus to remote Grid Engines
2. File transfer between local client site and remote jobs
  ● Special comments added to the job script specify the set of files to transfer between the local and remote sites
4. Allow 'datagrid aware' jobs to work remotely
  ● Use of Globus GRAM ensures a proxy certificate is present in the remote environment
Absent:
3. File transfer between any site and remote jobs
  ● Files are transferred between the remote site and the local site only
5. Data-aware job scheduling
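As an illustration of the kind of staging behind aim 2, here is a minimal sketch assuming the CoG kit's GridFTP client API; hosts and paths are placeholders, and in TOG itself the transfers are driven by the special job-script comments rather than hand-written code like this.

    import java.io.File;
    import org.globus.ftp.GridFTPClient;
    import org.globus.ftp.Session;

    public class StageFilesSketch {
        public static void main(String[] args) throws Exception {
            // Connect to the remote site's GridFTP server (2811 is the usual port)
            GridFTPClient ftp = new GridFTPClient("remote.example.org", 2811);
            ftp.authenticate(null);           // null: use the default GSI proxy credential
            ftp.setType(Session.TYPE_IMAGE);  // binary transfer mode

            // Stage an input file to the remote site before the job runs ...
            ftp.put(new File("input.dat"), "/remote/scratch/input.dat", false);
            // ... and fetch the results back to the local client site afterwards
            ftp.get("/remote/scratch/output.dat", new File("output.dat"));

            ftp.close();
        }
    }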

TOG Software
Pros
– Simple approach
– Usability
  ● Existing Grid Engine interface
  ● Only addition is a Globus certificate for authentication/authorisation
– Remote administrators still have full control over their resources
Cons
– Low quality scheduling decisions
  ● State of the remote resource is unknown – is it fully loaded?
  ● Ignores data transfer costs
– Scales poorly: one local transfer queue for each remote queue
– Manual set-up
  ● Configuring the transfer queue with the same properties as the remote queue
– A Java virtual machine invocation per job submission

JOSH (JOb Scheduling Hierarchically)
– WP 5 deliverable: a compute/data scheduler
– Developed to address the shortcomings of TOG
– Incorporates Globus 3 and grid services
– Adds a new 'hierarchical' scheduler above Grid Engine: hiersched submit_ge
  ● Takes a GE job script as input (embellished with data requirements)
  ● Queries grid services at each compute site to find the best match, then submits the job
[Diagram: a user's job spec goes through the hiersched user interface to the hierarchical scheduler, which talks to the grid service layer above the Grid Engine at each compute site and to the input and output data sites]
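The "query each site, pick the best match" step can be pictured with a short sketch. The SiteService interface and its scoring method are hypothetical stand-ins for JOSH's real grid service layer and its use of Grid Engine's 'can run' check; the actual design is documented in D4.1/D4.2.

    import java.util.List;

    interface SiteService {
        boolean canRun(String jobSpec);  // delegate to Grid Engine's own 'can run' check
        double score(String jobSpec);    // hypothetical: load plus estimated data transfer cost
        void submit(String jobSpec);
    }

    public class HierSchedSketch {
        // Pick the site that can run the job and scores best, then submit there
        static void submitGe(String jobSpec, List<SiteService> sites) {
            SiteService best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (SiteService site : sites) {
                if (!site.canRun(jobSpec)) continue;  // skip unsuitable sites
                double s = site.score(jobSpec);
                if (s > bestScore) {
                    bestScore = s;
                    best = site;
                }
            }
            if (best == null) {
                throw new IllegalStateException("no suitable compute site found");
            }
            best.submit(jobSpec);
        }
    }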

JOSH
Pros
– Satisfies the 5 functionality goals
– Fulfils the project goal
– Remote administrators still have full control over their GEs
– Makes use of existing GE functionality, e.g. 'can run'
Cons
– Latency in decision making
– Not so much 'scheduling' as 'choosing'
– A Grid Engine specific solution

Dissemination/Exploitation

Presentations Ernst & Young, WestInfo Services, Strategy & Performance Associates, SingTel Optus, Executive Briefing Centre, Curtin Business School, Curtin University of Technology, Perth Australia, February 24 th, 26 th, Curtin Business School Information Systems Seminar, Curtin University of Technology, Perth, Australia, February 20 th 2004 GlobusWORLD 2004, San Francisco, USA, January 22 nd, 2004 White Rose Grid, EPCC Sun Data & Compute Grids, UCL Workshop, York University, November 11 th, 2003 Sun HPC Consortium, Phoenix, USA, November 2003 Open Issues in Grid Scheduling, National e-Science Centre, Edinburgh, UK, October 21st nd Grid Engine Workshop, Regensburg, Germany, September SunLabs Europe, Edinburgh, September 1 st, 2003 Sun HPC Consortium, Grid and Portal Computing SIG, Heidelberg, Germany, June 21st 2003 Resource Management and Scheduling for the Grid, National e-Science Centre, Edinburgh, UK, February 13th 2003 Sun HPC Consortium, Grid and Portal Computing SIG, Baltimore, USA, November 15th 2002 EPCC Sun Data and Compute Grids / White Rose Computational Grid Meeting, EPCC, Edinburgh, UK, November 7th 2002 Sun HPC Consortium, Grid and Portal Computing SIG, Glasgow, UK, July 18th 02 Grid Engine Workshop, Regensburg, Germany, April

Software Take-up
Transfer-queue Over Globus (TOG) take-up includes:
– ODD-Genes: uses SunDCG TOG and OGSA-DAI to demonstrate a scientific use for the grid (bioinformatics); presented at:
  ● UK All Hands Meeting 2003 in Sept 2003
  ● Supercomputing 2003 in Nov 2003, on the Sun, UK e-Science and Globus Alliance booths
  ● Poster/demo at Globusworld 2
  ● Numerous visitors to Edinburgh University
– INWA: uses SunDCG TOG, OGSA-DAI and the FirstDIG browser to demonstrate data mining of commercial bank and telco data over the grid, with Curtin Business School, Perth, Australia
– Liverpool University's ULGrid: uses SunDCG TOG to enable users to access resources from various departments
– Raytheon Inc. (USA): uses SunDCG TOG in grid evaluations
– Sun Singapore

Software Take-up
Job Scheduling Hierarchically (JOSH) known interest includes:
– White Rose Grid
– Raytheon Inc.
– Academic Technology Services at UCLA
– School of Pharmaceutical Sciences at the University of Nottingham
– Texas Advanced Computing Center
– Forecast Systems Laboratory of NOAA

Downloads
– 10,300 document downloads between Feb 27th 2003 and Feb 26th 2004
– No specific figures on TOG/JOSH software downloads: the software is hosted at the Grid Engine community web site, where download figures are not available
– BUT from the EPCC web site:
  ● > 400 downloads of the TOG Requirements document
  ● > 400 downloads of the JOSH Functional Specification
  ● > 300 downloads of the JOSH Systems Design
  ● The JOSH documents have only been available since *Feb 3rd 2004*
– The Community Scheduler Framework does not have data-aware scheduling; Platform have asked if they could get the JOSH algorithms included
– So LOTS of interest in JOSH

Future Plans

– Effort budget ran out in February 2004
– Sun will integrate TOG/JOSH into the Grid Engine source from March 2004
– Open source development via the Grid Engine community web site
– If funds are made available:
  ● WS-RF update
  ● Access to other DRM systems, e.g. LoadLeveler, LSF
  ● WS-Agreement compliance, JSDL
  ● Further functionality
– All are straightforward due to the good design of JOSH
"I just recommended TOG and JOSH as a starting point for a partner who wants to build Grid middleware for nuclear plants."
– Fritz Ferstl, Sun Microsystems

Demo