Kento Aida, Tokyo Institute of Technology Grid Challenge - programming competition on the Grid - Kento Aida Tokyo Institute of Technology 22nd APAN Meeting.

Presentation transcript:

Grid Challenge - programming competition on the Grid -
Kento Aida, Tokyo Institute of Technology
22nd APAN Meeting in Singapore

What is Grid Challenge?
- a programming competition to develop high-performance programs on the Grid
  - The organizer operates a Grid testbed.
  - Participants develop and run programs on the testbed.
- a special event in the Annual Symposium on Advanced Computing Systems and Infrastructures (SACSIS)
- history
  - 1st Grid Challenge in SACSIS
  - 2nd Grid Challenge in SACSIS 2006

Category
- compulsory
  - programming competition on the Grid testbed
  - solving the problem provided by the organizer: Graph Partitioning Problem
  - students (university and high school)
- free
  - giving opportunities to perform experiments on the Grid
  - presentations during the conference
  - students, engineers and researchers

Compulsory
- Graph Partitioning Problem
  - for a given undirected graph G(V,E), |V| = 2n
  - L and R are disjoint partitions generated by equally dividing G, where |L| = |R|.
  - Find the partition that minimizes the number of edges with one endpoint in L and the other in R.

Compulsory (cont’d)
- qualifying runs (3 weeks): Solve early!
  - to find a solution within a given threshold
  - shared resources
  - problem size: |V| =
- final runs (2 weeks): Solve fast!
  - dedicated time slots for finalists (2.5 h per team)
  - to find a solution within a given period (10 min)
  - A finalist with the best solution will be a winner!
  - problem size: |V| =
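
Finding the best solution within a fixed period suggests an anytime search that always holds a valid answer and stops at a deadline. The Python sketch below shows one such approach, a swap-based local search. It is an assumption for illustration, not any finalist's method; the function names, the seed, the step cap and the example graph are all hypothetical.

```python
import random
import time


def cut_size(edges, left):
    """Count the edges with exactly one endpoint in the set `left`."""
    return sum((u in left) != (v in left) for u, v in edges)


def timed_bisection(n_vertices, edges, time_limit_s, seed=0, max_steps=100_000):
    """Improve a random balanced split by vertex swaps until the deadline.

    Assumes n_vertices is even and at least 2. Swapping one vertex from each
    side keeps |L| = |R| at every step, so the current split is always valid.
    """
    rng = random.Random(seed)
    vertices = list(range(n_vertices))
    rng.shuffle(vertices)
    half = n_vertices // 2
    left, right = vertices[:half], vertices[half:]
    in_left = set(left)
    best = cut_size(edges, in_left)
    deadline = time.monotonic() + time_limit_s
    steps = 0
    while steps < max_steps and time.monotonic() < deadline:
        steps += 1
        i, j = rng.randrange(half), rng.randrange(half)
        a, b = left[i], right[j]
        # Tentatively move a to R and b to L.
        in_left.remove(a)
        in_left.add(b)
        size = cut_size(edges, in_left)
        if size <= best:
            left[i], right[j] = b, a  # keep the swap
            best = size
        else:
            in_left.remove(b)  # undo the swap
            in_left.add(a)
    return set(left), set(right), best


# Hypothetical example: two 4-vertex cliques joined by the edge (3, 4).
clique_a = [(u, v) for u in range(4) for v in range(u + 1, 4)]
clique_b = [(u, v) for u in range(4, 8) for v in range(u + 1, 8)]
edges = clique_a + clique_b + [(3, 4)]
left, right, cut = timed_bisection(8, edges, time_limit_s=0.2)
print(sorted(left), sorted(right), cut)
```

Recomputing the cut from scratch on every swap is the simplest choice; a competitive program would update it incrementally and spread the search across nodes.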

Free
- experiments of research projects (1 month)
- shared resources
- projects
  - tools: a monitoring tool, a message passing system, a programming tool, volunteer computing
  - applications: physics simulation, bioinformatics, simulation of a diesel engine, optimization problems

Participants
- compulsory: D, 2; M, 12; U, 6; H, 1
- free: D, 2; M, 5; U, 1

Testbed
- Grid Challenge Federation
  - AIST
  - Tokyo Institute of Technology
  - The University of Tokyo
  - Doshisha University
- more than 1,200 CPUs

Resources
- collection of PC clusters
- spec of a PC cluster
  - a gateway node: gateway, compiling
  - computing nodes: computation
  - global IP address/private IP address
  - NFS: “/home” is shared among nodes

Resources (cont’d)

name | site | compt. node | #compt. node (#CPUs)
F32 | AIST (Tsukuba) | Xeon 3GHz x2, 4GB mem., 1000BASE-T | 128 (256)
SAKURA | AIST (Tsukuba) | Opteron 1.8GHz x2, 3GB mem., 1000BASE-T | 16 (32)
DIS | TITECH (Yokohama) | Athlon MP GHz x2, 512MB mem., 100BASE-TX | 50 (100)
PrestoIII | TITECH (Tokyo) | Opteron 246/242 2/1.6GHz x2, 4/3/2GB mem., 1000BASE-T | 103 (206)
Tau | U. Tokyo (Tokyo) | Xeon 2.4/2.8GHz x2, 2GB mem., 1000BASE-T | 175 (350)
Chikayama | U. Tokyo (Chiba) | Xeon 2.4GHz x2, 2GB mem., 1000BASE-T | 64 (128)
Xenia | Doshisha U. (Kyoto) | Xeon 2.4GHz x2, 1GB mem., 100BASE-TX | 63 (126)

Internet Connection
(network diagram: the clusters F32, SAKURA, PrestoIII, Chikayama, Tau, DIS and Xenia, connected through Tsukuba WAN, SINET and WIDE)

Software
- Grid middleware: Globus Toolkit 2.4
- batch queueing system: Sun Grid Engine, PBS
- remote process invocation: SSH, GXP
- monitoring: Ganglia
- programming: MPICH 1.2.7, Ninf-G 2.4

GXP
- shell for distributed multi-cluster environment
  - fast simultaneous command submissions
  - parallel job pipes
  - interactive selection of nodes to execute commands
- no cumbersome per-node operations!
  - installation and deployment
  - invocation of parallel processes
  - monitoring, trouble diagnosis, debugging
  - dead processes clean-up
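
The idea of simultaneous command submission can be sketched in Python as a fan-out of one command to many nodes at once. This is not GXP's interface or implementation: `run_everywhere`, `ssh_argv`, `local_argv` and the node names are hypothetical, and the example uses a local stand-in launcher so that it runs without any remote machines.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def ssh_argv(host, command):
    """Hypothetical launcher: run `command` on `host` through ssh."""
    return ["ssh", host, command]


def run_everywhere(hosts, command, build_argv=ssh_argv, timeout_s=30):
    """Start `command` for every host at the same time and collect the results."""

    def run_one(host):
        proc = subprocess.run(
            build_argv(host, command),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return host, proc.returncode, proc.stdout.strip()

    # One worker per host, so no node waits for another to finish.
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        return list(pool.map(run_one, hosts))  # results come back in host order


def local_argv(host, command):
    """Stand-in launcher: echo the host and command locally instead of using ssh."""
    return [sys.executable, "-c", f"print({host!r} + ': ' + {command!r})"]


results = run_everywhere(["node01", "node02", "node03"], "uptime", build_argv=local_argv)
for host, code, output in results:
    print(host, code, output)
```

Passing the launcher as a parameter keeps the fan-out logic separate from the transport, which is why the same function can be exercised locally.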

Ninf-G
- reference implementation of GridRPC
- GridRPC: a simple RPC-based programming model for the Grid
  - A client invokes remote libraries installed on remote servers on the Grid.
  - utilizing task parallelism
- (diagram: a client program calls grpc_call(…), sending data to server programs that host the libraries and receiving results)
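
The task-parallel calling pattern can be illustrated with a local Python analogy. This is not the Ninf-G or GridRPC API: a thread pool stands in for the remote servers, and `server_library` and `client` are hypothetical names chosen for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def server_library(data):
    """Stand-in for a library routine installed on a remote server."""
    return sum(x * x for x in data)


def client(chunks, n_servers=2):
    """Issue one call per chunk, let them run concurrently, then gather results."""
    with ThreadPoolExecutor(max_workers=n_servers) as servers:
        # Each submit() plays the role of one remote call:
        # data goes out to a server and a result comes back.
        pending = [servers.submit(server_library, chunk) for chunk in chunks]
        return [call.result() for call in pending]


partial_sums = client([[1, 2], [3, 4], [5, 6]])
total = sum(partial_sums)
print(partial_sums, total)
```

The client stays sequential and simple while independent calls overlap, which is the appeal of an RPC-based model for task-parallel work.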

Ganglia
- a distributed monitoring tool for high-performance computing systems such as PC clusters and Grids
  - CPU load
  - memory usage
  - network traffic

Operation
- The testbed is operated by volunteers!
  - researchers/technical staff/students
- What we need to do
  - installation and its training for students
  - user management
  - job management

User Management
- local account
  - the same UID and login name for a user on all sites
  - remote login via ssh: public key
- Globus account
  - temporary CA for the Grid Challenge

Job Management
- interactive or batch
  - All sites provide both environments for job execution.
- dedicated slot
  - Finalists are assigned dedicated slots for their application runs.
  - the gentlemen’s agreement

Troubles …
- computing nodes: OS hang-ups, trouble with hard disk drives
- power supply: failure in balancing the power supply
- servers: trouble with NFS and batch queueing systems
- monitoring: trouble collecting monitoring data with Ganglia

Troubles … (cont’d)
- jobs being out of control
  - waste of CPU/memory resources by jobs being out of control
- dedicated slots
  - jobs running beyond their slots

Operational Issue
- trouble on computing nodes
  - monitoring tools to identify computing nodes
- power supply
  - critical problem for small groups, e.g., a lab in a university
  - tools for power monitoring
  - low-power processors
- servers
  - redundancy

Operational Issue (cont’d)
- user/process management
  - tools to control user processes
    - monitoring user processes
    - detecting unusual behavior
    - suspending/killing jobs being out of control
  - tools for reservation
    - reserving dedicated slots for users
    - controlling user jobs
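
What such tools might check can be sketched in a few lines of Python. This is an illustrative assumption, not a tool used on the testbed: the `Slot` and `Job` records, the function names, the team names and the times (in seconds) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    user: str
    start: float  # seconds from an arbitrary reference time
    end: float


@dataclass
class Job:
    user: str
    pid: int


def reserve(slots, new):
    """Add `new` to `slots` unless it overlaps an existing reservation."""
    for slot in slots:
        if new.start < slot.end and slot.start < new.end:
            return False
    slots.append(new)
    return True


def jobs_outside_slot(jobs, slots, now):
    """Return the jobs that do not belong to the holder of the slot active at `now`."""
    owner = next((s.user for s in slots if s.start <= now < s.end), None)
    if owner is None:
        return []  # no dedicated slot is active, so nothing is flagged
    return [job for job in jobs if job.user != owner]


slots = []
ok_a = reserve(slots, Slot("team_a", 0, 9000))       # a 2.5 h slot
ok_b = reserve(slots, Slot("team_b", 9000, 18000))   # the next 2.5 h slot
ok_c = reserve(slots, Slot("team_c", 8000, 10000))   # overlaps both, so rejected

running = [Job("team_a", 101), Job("team_b", 202)]
overdue = jobs_outside_slot(running, slots, now=9500)
print(ok_a, ok_b, ok_c, [job.pid for job in overdue])
```

A real tool would take the process list from the operating system or the batch system and then suspend or kill the flagged jobs; this sketch only covers the decision.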

Snapshots
(images: qualifying runs, final runs)

Snapshots (cont’d)

Conclusions
- Grid Challenge is a programming competition to develop high-performance programs on the Grid.
  - compulsory and free categories
- Grid testbed for Grid Challenge
  - 6 sites, 7 PC clusters, >1200 CPUs
  - Globus, SGE, PBS, GXP, Ganglia, Ninf-G, MPICH, …
- discussion about operational issues
  - tools for monitoring, power supply, user/process management

Acknowledgements
- Information Processing Society of Japan
- Sun Microsystems
- Soum Corporation
- Grid Consortium Japan

Thank you.