Climate Simulation on ApGrid/TeraGrid at SC2003: Ninf-G Client (AIST) and Servers on the AIST Cluster (50 CPU), Titech Cluster (200 CPU), and KISTI Cluster (25 CPU)

Presentation transcript:

Climate Simulation on ApGrid/TeraGrid at SC2003. A Ninf-G client at AIST invokes servers on the AIST Cluster (50 CPU), Titech Cluster (200 CPU), KISTI Cluster (25 CPU), and NCSA Cluster (225 CPU).

National Institute of Advanced Industrial Science and Technology. Example: Hybrid QM/MD Simulation

QM/MD simulation over the Pacific at SC2004. MD client and QM servers connected by Ninf-G; machines: TCS (512 CPU, PSC), P32 (512 CPU), F32 (256 CPU); total number of CPUs: 1792. Close-up view: corrosion of silicon under stress.

Total number of CPUs: 1793
Total simulation time: 10 hours 20 min
Number of steps: 10 (= 7 fs)
Average time per step: 1 hour
Size of generated files per step: 4.5 GB

(Some of) Lessons Learned
It is practically impossible to occupy a single large-scale system for a few weeks. How can we run the simulation for such a long period?
Faults (e.g. HDD crash, network down) cannot be avoided, and we do not want to restart manually. The simulation should be capable of automatic recovery from faults. How can the simulation recover from faults?
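As an illustration of what automatic recovery can mean on the client side, here is a minimal sketch using the standard GridRPC C API that Ninf-G implements (grpc_function_handle_init, grpc_call, grpc_function_handle_destruct); the entry name "qm/qm_force_calc", the argument list, and the server list are hypothetical:

/* Sketch: retry a remote QM computation on another cluster when a fault is detected.
 * Entry name and arguments are hypothetical; error codes are from the GridRPC API. */
#include <grpc.h>

grpc_error_t call_qm_with_retry(char *servers[], int n_servers,
                                int n_atoms, double *coords, double *forces)
{
    for (int i = 0; i < n_servers; i++) {
        grpc_function_handle_t handle;

        /* Bind a handle to the next candidate cluster; skip it if it is unreachable. */
        if (grpc_function_handle_init(&handle, servers[i], "qm/qm_force_calc") != GRPC_NO_ERROR)
            continue;

        grpc_error_t rc = grpc_call(&handle, n_atoms, coords, forces);
        grpc_function_handle_destruct(&handle);

        if (rc == GRPC_NO_ERROR)
            return GRPC_NO_ERROR;   /* QM forces obtained; no manual restart needed */
        /* Otherwise the server or network failed mid-call: fall through and try the next cluster. */
    }
    return GRPC_OTHER_ERROR_CODE;   /* every candidate cluster failed */
}

In the same spirit the client could checkpoint the MD state between steps so that a client-side failure loses at most one step; the slides do not detail that part, so the sketch covers only server-side faults.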

Objectives
Develop a flexible, robust, and efficient Grid-enabled simulation:
- flexible: allow dynamic resource allocation/migration,
- robust: detect errors and recover from faults automatically during long runs,
- efficient: manage thousands of CPUs.
Verify our strategy through large-scale experiments: we implemented a Grid-enabled SIMOX (Separation by Implanted Oxygen) simulation and ran it on the Japan-US Grid testbed for a few weeks.

Hybrid QM/CL Simulation (1)
Enabling large-scale simulation with quantum accuracy by combining classical MD simulation with quantum simulation.
CL simulation: simulates the behavior of atoms in the entire region, based on classical MD using an empirical inter-atomic potential.
QM simulation: corrects the energy calculated by the MD simulation only in the regions of interest, based on density functional theory (DFT).
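The coupling between the two levels is commonly expressed with a subtraction scheme; the formula below is an assumption based on standard hybrid QM/MD methods (it matches the force calculations listed on the next slide) rather than something stated on this slide:

E_{\mathrm{total}} = E_{\mathrm{MD}}(\text{entire system}) - E_{\mathrm{MD}}(\text{QM region}) + E_{\mathrm{QM}}(\text{QM region})

The forces follow by differentiating each term with respect to the atomic positions.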

Hybrid QM/CL Simulation (2): simulation algorithm
Each QM computation is independent of the others, compute intensive, and usually implemented as an MPI program.
MD part: initial set-up; calculate MD forces of the QM+MD regions; calculate MD forces of the QM region; update atomic positions and velocities; repeat.
QM part: calculate the QM force of each QM region (one computation per region), receiving the data of the QM atoms from the MD part and returning the QM forces.
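A per-step outline of that flow, as a sketch only: the types and helpers (System, Region, md_forces_all, md_forces_qm, qm_force, combine_forces, update_positions) are hypothetical names, and in the real application the QM calls are remote MPI jobs rather than local function calls:

/* Hypothetical sketch of one time step of the hybrid QM/MD loop. */
void qm_md_step(System *sys, Region qm_regions[], int n_qm)
{
    md_forces_all(sys);                     /* classical MD forces for the entire QM+MD system */
    md_forces_qm(sys, qm_regions, n_qm);    /* classical MD forces restricted to the QM regions */

    /* Each QM computation is independent of the others, so the regions can be farmed out in parallel. */
    for (int r = 0; r < n_qm; r++)
        qm_force(sys, &qm_regions[r]);      /* DFT-based forces for region r (compute intensive) */

    combine_forces(sys, qm_regions, n_qm);  /* total = MD(all) - MD(QM regions) + QM(QM regions) */
    update_positions(sys);                  /* integrate atomic positions and velocities */
}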

National Institute of Advanced Industrial Science and Technology. Implementation of Grid-enabled Simulation: multi-scale QM/MD simulation using GridRPC and MPI

Approach to gridify applications
GridRPC enhances flexibility and robustness by: dynamic allocation of server programs, and detection of network/cluster trouble.
MPI enhances efficiency by: highly parallel computing on a cluster, for both the client and the server programs.
The new programming approach, combining GridRPC with MPI, takes advantage of both programming models complementarily to run large-scale applications on the Grid for a long time.
(Figure: the MD client, an MPI program, connects via GridRPC to multiple QM servers, each itself an MPI program.)
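A minimal sketch of the client side of this GridRPC + MPI combination, assuming the standard GridRPC C API that Ninf-G provides; the configuration file name, the entry name "qm/qm_force_calc", and the per-region argument lists are illustrative assumptions:

/* Hypothetical helper on the MD client (itself an MPI program; this would run on its master rank):
 * dispatch the independent QM computations to remote MPI servers with asynchronous GridRPC calls. */
#include <grpc.h>

void compute_qm_forces(char *servers[], int n_regions,
                       int n_atoms[], double *coords[], double *forces[])
{
    grpc_function_handle_t handle[n_regions];
    grpc_sessionid_t       session[n_regions];

    grpc_initialize("client.conf");     /* assumed Ninf-G client configuration (once per run in practice) */

    for (int r = 0; r < n_regions; r++) {
        /* One handle per QM region, each bound to a possibly different cluster. */
        grpc_function_handle_init(&handle[r], servers[r], "qm/qm_force_calc");
        /* Non-blocking call, so all QM regions are computed concurrently on their servers. */
        grpc_call_async(&handle[r], &session[r], n_atoms[r], coords[r], forces[r]);
    }

    for (int r = 0; r < n_regions; r++) {
        grpc_wait(session[r]);          /* collect the QM forces as each server finishes */
        grpc_function_handle_destruct(&handle[r]);
    }

    grpc_finalize();                    /* likewise once per run in practice */
}

Dynamic allocation or migration of QM servers, as listed in the Objectives slide, then amounts to changing the servers[] list between steps and re-initializing the corresponding handles.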