Running flexible, robust and scalable grid application: Hybrid QM/MD Simulation
Hiroshi Takemiya, Yusuke Tanimura and Yoshio Tanaka
Grid Technology Research Center, National Institute of Advanced Industrial Science and Technology (AIST), Japan

Goals of the experiment
To clarify the functions needed to execute large-scale grid applications that require many computing resources for a long time:
- 1,000+ CPUs
- 1 month to 1 year
Three requirements:
- Scalability: managing a large number of resources effectively
- Robustness: fault detection and fault recovery
- Flexibility: dynamic resource switching, since we cannot assume that all resources are always available during the experiment

Difficulty in satisfying these requirements
Existing grid programming models make it hard to satisfy all three requirements at once.
GridRPC:
- Dynamic configuration: no co-allocation is needed, and computing resources can easily be switched dynamically
- Good fault tolerance (detection): if one remote executable fails, the client can retry or use another remote executable
- But it is hard to manage large numbers of servers; the client becomes a bottleneck
Grid-enabled MPI:
- Flexible communication, making it possible to avoid a communication bottleneck
- But static configuration: co-allocation is needed, and the number of processes cannot be changed during execution
- Poor fault tolerance: one process failure brings down all processes, and fault-tolerant MPI is still in the research phase

Gridifying applications using GridRPC and MPI
Combining GridRPC and MPI:
- GridRPC allocates server (MPI) programs dynamically, supports loose communication between a client and its servers, and leaves the client managing only tens to hundreds of server programs
- MPI supports scalable execution of each parallelized server program
This combination is suitable for gridifying applications consisting of loosely coupled parallel programs, such as multi-disciplinary simulations and the hybrid QM/MD simulation.
[Diagram: a GridRPC client connected through GridRPC to several MPI server programs]
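To make the division of labor concrete, the following is a minimal client-side sketch using the standard GridRPC C API that Ninf-G provides. The server names, the remote entry name "qm/qm_force", the array sizes and the argument layout are hypothetical placeholders rather than the actual interface; in the deployed system each remote executable is an MPI-parallelized QM program registered with Ninf-G.

```c
#include <stdio.h>
#include "grpc.h"   /* GridRPC client API header shipped with Ninf-G */

#define N_QM    4   /* number of QM regions / remote servers (illustrative) */
#define MAXATOM 100 /* atoms per QM region (illustrative)                   */

int main(int argc, char *argv[])
{
    grpc_function_handle_t handles[N_QM];
    grpc_sessionid_t       ids[N_QM];
    /* Hypothetical server list: each entry is a cluster front end. */
    const char *servers[N_QM] = { "clusterA", "clusterB", "clusterC", "clusterD" };
    double coords[N_QM][3 * MAXATOM];   /* QM-atom positions, filled by the MD part (omitted) */
    double forces[N_QM][3 * MAXATOM];   /* QM forces returned by the servers                  */
    int i;

    grpc_initialize(argv[1]);           /* client configuration file */

    /* One handle per remote, MPI-parallelized QM executable. */
    for (i = 0; i < N_QM; i++)
        grpc_function_handle_init(&handles[i], (char *)servers[i], "qm/qm_force");

    /* Launch all QM calculations asynchronously; they run independently. */
    for (i = 0; i < N_QM; i++)
        grpc_call_async(&handles[i], &ids[i], coords[i], forces[i]);

    /* The client only waits; heavy communication stays inside each MPI job. */
    grpc_wait_all();

    for (i = 0; i < N_QM; i++)
        grpc_function_handle_destruct(&handles[i]);
    grpc_finalize();
    return 0;
}
```

Because the client exchanges only the QM-atom data and the resulting forces, it manages a handful of handles rather than thousands of processes, which is what keeps it from becoming a bottleneck.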

Related Work
Scalability: large-scale experiment at SC2004
- Gridified the QM/MD simulation program based on our approach
- Executed a simulation using ~1800 CPUs on 3 clusters
- Showed that our approach can manage a large number of computing resources
Robustness: long-run experiment on the PRAGMA testbed
- Executed the TDDFT program for over a month
- Showed that Ninf-G can detect server faults and return errors correctly
We are now conducting an experiment to show the validity of our approach:
- A long-run QM/MD simulation on the PRAGMA testbed, implementing a scheduling mechanism as well as a fault-tolerance mechanism

Large-scale experiment at SC2004
Simulation: an MD region of 110,000 atoms plus 4 QM regions (QM #1: 69 atoms including 2 H2O + 2 OH; QM #2: 68 atoms including H2O; QM #3: 44 atoms including H2O; QM #4: 56 atoms including H2O).
Resources:
- P32 (1024 CPU): Opteron (2.0 GHz) 2-way cluster
- F32 (257 CPU): Xeon (3.06 GHz) 2-way cluster
- TCS at PSC (512 CPU): ES45 Alpha (1.0 GHz) 4-way cluster
Using 1793 CPUs in total on the 3 clusters, we succeeded in running the QM/MD program for over 11 hours: our approach can manage a large number of resources.

Long-run experiment on the PRAGMA testbed
Purpose:
- Evaluate the quality of Ninf-G2
- Gain experience with how GridRPC applications can adapt to faults
Ninf-G stability:
- Number of executions: 43
- Execution time: 50.4 days in total, 6.8 days maximum, 1.2 days on average
- Number of RPCs: more than 2,500,000
- Number of RPC failures: more than 1,600 (an error rate of roughly 1,600 / 2,500,000 ≈ 0.06 %)
- Ninf-G detected these failures and returned errors to the application

Related Work
Scalability: large-scale experiment at SC2004
- Gridified the QM/MD simulation program based on our approach
- Executed a simulation using ~1800 CPUs on 3 clusters
- Showed that our approach can manage a large number of computing resources
Robustness: long-run experiment on the PRAGMA testbed
- Executed the TDDFT program for over a month
- Showed that Ninf-G can detect server faults and return errors correctly
The present experiment reinforces the evidence for the validity of our approach:
- A long-run QM/MD simulation on the PRAGMA testbed, implementing a scheduling mechanism for flexibility as well as fault tolerance

Necessity of large-scale atomistic simulation
Modern material engineering requires detailed knowledge based on microscopic analysis, for example for future electronic devices and micro electro mechanical systems (MEMS).
Features of the analysis:
- Nano-scale phenomena involving a large number of atoms
- Sensitivity to the environment
- Very high precision, including a quantum description of bond breaking
[Figures: deformation process and stress distribution; stress is suspected to enhance the possibility of corrosion]
These requirements call for large-scale atomistic simulation.

Hybrid QM/MD Simulation (1)
Enables large-scale simulation with quantum accuracy by combining classical MD simulation with QM simulation.
- MD simulation: simulates the behavior of atoms in the entire region, based on classical MD with an empirical inter-atomic potential
- QM simulation: modifies the energy calculated by the MD simulation only in the regions of interest, based on density functional theory (DFT)
[Figure: MD simulation of the entire region with embedded QM regions treated by DFT]
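The slides do not spell out the energy expression, but a common way to write this kind of hybridization, consistent with the force terms listed later under "Modifying the Original Program", is the following sketch (an assumption, not taken from the original deck):

```latex
E_{\mathrm{total}}
  = E_{\mathrm{MD}}(\text{entire system})
  + \sum_{i=1}^{N_{\mathrm{QM}}}
      \Bigl[ E_{\mathrm{QM}}(\text{region } i) - E_{\mathrm{MD}}(\text{region } i) \Bigr]
```

Each QM region enters only through its own correction term, so the QM calculations are independent of one another; this independence is what makes the method a good fit for grid computing.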

Hybrid QM/MD Simulation (2): suitability for grid computing
- Additive hybridization: QM regions can be set at will and calculated independently
- Computation dominant: MD and the QM simulations are loosely coupled, and the communication cost between QM and MD is ~O(N)
- Very large computational cost of QM: the QM computation cost scales as ~O(N^3), while MD scales as ~O(N)
Many sources of parallelism:
- The MD simulation is executed in parallel (with tight communication)
- Each QM simulation is executed in parallel (with tight communication)
- The QM simulations are executed independently of one another (no communication)
- The MD and QM simulations are executed in parallel (loosely coupled)
[Figure: MD simulation (tight internal coupling) with QM1 and QM2 regions, each tightly parallel internally, coupled loosely to MD and independent of each other]

Modifying the Original Program
- Eliminate the initial set-up routine in the QM program and add an initialization function
- Eliminate the loop structure in the QM program, tailoring the QM simulation into a callable function
- Replace the MPI communication between the MD and QM parts with Ninf-G function calls
Resulting structure: the MD part performs the initial set-up and calls the initialization function with the initial parameters; then, for each time step, it calculates the MD forces of the QM+MD regions, sends the data of the QM atoms to the remote QM force routines (one per QM region) and receives the QM forces, calculates the MD forces of the QM regions, and updates the atomic positions and velocities.
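A minimal sketch of what the restructured client-side loop might look like, assuming two hypothetical remote entry points exported by the Ninf-ied QM executable; the helper routines, names, argument lists and array sizes are placeholders rather than the application's real interface.

```c
#include "grpc.h"

#define N_QM    5        /* number of QM regions (illustrative)    */
#define N_STEPS 10000    /* MD time steps (illustrative)           */
#define MAXATOM 128      /* max atoms per QM region (illustrative) */

/* Existing MD routines of the original program (assumed, not shown here). */
extern void md_forces_all(void);                      /* MD forces of QM+MD regions */
extern void md_forces_qm_region(int i);               /* MD forces of QM region i   */
extern void update_positions(void);                   /* position/velocity update   */
extern int  pack_qm_atoms(int i, double *buf);        /* returns number of QM atoms */
extern void add_qm_forces(int i, const double *frc);  /* merge returned QM forces   */

void run_simulation(grpc_function_handle_t h_init[], grpc_function_handle_t h_force[])
{
    grpc_sessionid_t sid[N_QM];
    double atoms[N_QM][3 * MAXATOM], qmfrc[N_QM][3 * MAXATOM];
    int natom[N_QM], step, i;

    /* Replaces the eliminated initial set-up of the QM program: once per server. */
    for (i = 0; i < N_QM; i++)
        grpc_call(&h_init[i], i /* hypothetical initial parameters */);

    for (step = 0; step < N_STEPS; step++) {
        md_forces_all();                       /* classical forces, entire system   */

        for (i = 0; i < N_QM; i++) {           /* QM corrections run independently  */
            natom[i] = pack_qm_atoms(i, atoms[i]);
            grpc_call_async(&h_force[i], &sid[i], natom[i], atoms[i], qmfrc[i]);
        }
        grpc_wait_all();                       /* loose coupling: one sync per step */

        for (i = 0; i < N_QM; i++) {
            md_forces_qm_region(i);            /* MD term of the QM region          */
            add_qm_forces(i, qmfrc[i]);        /* apply the quantum correction      */
        }
        update_positions();
    }
}
```

The structure mirrors the slide: initialization happens once, the loop has disappeared from the QM side and reappears as per-step remote calls from the MD side, and MPI communication between the MD and QM parts has been replaced by GridRPC arguments and results.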

Implementation of a scheduling mechanism
A scheduling layer is inserted between the application layer and the GRPC layer in the client program, so the application does not have to care about scheduling.
Functions of the layer:
- Dynamic switching of target clusters: checking the availability of each cluster (available period, maximum execution time)
- Error detection and recovery: detecting server errors and time-outs
- Time-outs prevent the application from waiting too long, for example in a batch queue or during a long data transfer
- On an error or time-out, the layer tries to continue the simulation on another cluster
The layer is implemented using Ninf-G.
[Figure: client program layers: QM/MD simulation layer (Fortran), scheduling layer, GRPC layer (Ninf-G system)]
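A rough sketch of how such a scheduling layer could wrap a single remote call, assuming a hypothetical cluster table with an availability window and a per-cluster time limit. This is not the actual AIST implementation, only one way to realize cluster switching and time-outing on top of the standard GridRPC API.

```c
#include <time.h>
#include <unistd.h>
#include "grpc.h"

typedef struct {
    const char *server;      /* cluster front-end host (placeholder)          */
    time_t      avail_until; /* end of the cluster's available period         */
    double      max_seconds; /* maximum execution time allowed on the cluster */
} cluster_t;

/* Call `entry` on the first usable cluster; on an error or time-out, give up on
 * that cluster and move on to the next. Returns 0 on success, -1 if all fail. */
int scheduled_call(cluster_t *clusters, int n_clusters, const char *entry,
                   double *in, double *out)
{
    int c;
    for (c = 0; c < n_clusters; c++) {
        grpc_function_handle_t h;
        grpc_sessionid_t sid;
        time_t start = time(NULL);

        if (start >= clusters[c].avail_until)       /* outside the available period */
            continue;

        if (grpc_function_handle_init(&h, (char *)clusters[c].server,
                                      (char *)entry) != GRPC_NO_ERROR)
            continue;                               /* e.g. connection failure      */

        if (grpc_call_async(&h, &sid, in, out) == GRPC_NO_ERROR) {
            int done = 0;
            /* Poll until the session completes or the time limit is exceeded.
             * A real layer would also distinguish "not finished yet" from
             * "failed" when probing, instead of waiting out the full limit. */
            while (!done && difftime(time(NULL), start) < clusters[c].max_seconds) {
                if (grpc_probe(sid) == GRPC_NO_ERROR)
                    done = 1;
                else
                    sleep(1);
            }
            if (done && grpc_wait(sid) == GRPC_NO_ERROR) {
                grpc_function_handle_destruct(&h);
                return 0;                           /* success on this cluster       */
            }
            grpc_cancel(sid);                       /* time-out or error: abandon it */
        }
        grpc_function_handle_destruct(&h);          /* recover and try the next one  */
    }
    return -1;
}
```

Keeping this logic below the application keeps the simulation code free of scheduling concerns, which matches the layering shown on the slide.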

Long-run experiment on the PRAGMA testbed
Goals:
- Continue the simulation for as long as possible
- Check the viability of our programming approach
Experiment time: started on 18 April and, hopefully, running until the end of May.
Target simulation: 5 QM atoms inserted into box-shaped Si, 1728 atoms in total; 5 QM regions, each consisting of only 1 atom.
[Figures: entire region, central region, and time evolution of the system]

Testbed for the experiment
8 clusters at 7 institutes in 5 countries: AIST (UME), KU (AMATA), NCHC (ASE), NCSA (TGC), SDSC (Rocks-52 and Rocks-47), SINICA (PRAGMA) and UNAM (Malicia).
Porting is under way for 5 more clusters: BII, CNIC, KISTI, TITECH and USM.
- 2 CPUs are used for each QM simulation
- The target cluster is changed every 2 hours

Porting the application
Five steps to port our application:
(1) Check accessibility using ssh
(2) Execute a sequential program using globus-job-run
(3) Execute an MPI program using globus-job-run
(4) Execute a Ninf-ied program
(5) Execute our application
Troubles:
- jobmanager-sge had bugs in executing MPI programs; a fixed version was released by AIST
- An inappropriate MPI was specified in some jobmanagers: LAM/MPI does not support execution through Globus, and MPICH-G was not available due to a certificate problem, so the mpich library was recommended
[Figure: full certificates vs. limited certificates with GRAM, PBS/SGE and mpirun]

Executing the application
- Expiration of certificates: we had to take care of many kinds of Globus-related certificates (user certificate, host certificate, CA certificate, CRL, ...), and the Globus error message is unhelpful ("check host and port")
- Poor I/O performance: programs compiled with the Intel Fortran compiler took a long time for I/O, e.g. 2 hours to output several MB of data; specifying buffered I/O helps, and using an NFS file system is another cause of poor I/O performance
- Remaining processes: server processes remained on the back-end nodes even after the job had been deleted from the batch queue; SCMSWeb is very convenient for finding such remaining processes

Preliminary results of the experiment
- Succeeded in calculating ~ time steps during 2 weeks
- Number of GRPC calls executed: times
- Number of failures/time-outs: 524
Most of the failures (~80 %) occurred in the connection phase, due to connection failures, batch-system downtime or queuing time-outs (the queuing time-out is ~60 sec).
Other failures include:
- Exceeding the maximum execution time (2 hours)
- Exceeding the maximum execution time per time step (5 min)
- Exceeding the maximum CPU time specified by the cluster (900 sec)

Giving a demonstration!!

Execution Profile: Scheduling
[Chart: an example of exceeding the maximum execution time (~60 sec and ~80 sec)]

Execution Profile: Error Recovery
[Chart: examples of recovery from a batch-system fault, a queuing time-out and an execution time-out]