Department of Computer Science, University of Tennessee, Knoxville

Slides:



Advertisements
Similar presentations
OpenMP Optimization National Supercomputing Service Swiss National Supercomputing Center.
Advertisements

Parallel Processing with OpenMP
SkewReduce YongChul Kwon Magdalena Balazinska, Bill Howe, Jerome Rolia* University of Washington, *HP Labs Skew-Resistant Parallel Processing of Feature-Extracting.
Slides Prepared from the CI-Tutor Courses at NCSA By S. Masoud Sadjadi School of Computing and Information Sciences Florida.
Program Analysis and Tuning The German High Performance Computing Centre for Climate and Earth System Research Panagiotis Adamidis.
Thoughts on Shared Caches Jeff Odom University of Maryland.
Dynamic Load Balancing for VORPAL Viktor Przebinda Center for Integrated Plasma Studies.
ATLSS and Uncertainty: relativity and spatially-explicit ecological models as methods to aid management planning in Everglades restoration Louis J. Gross,
Sensitivity Analysis of a Spatially Explicit Fish Population Model Applied to Everglades Restoration Ren é A. Salinas and Louis J. Gross The Institute.
On the Integration and Use of OpenMP Performance Tools in the SPEC OMP2001 Benchmarks Bernd Mohr 1, Allen D. Malony 2, Rudi Eigenmann 3 1 Forschungszentrum.
A Parallel Structured Ecological Model for High End Shared Memory Computers Dali Wang Department of Computer Science, University of Tennessee, Knoxville.
The new The new MONARC Simulation Framework Iosif Legrand  California Institute of Technology.
Models for the masses: bringing computational resources for addressing complex ecological problems to stakeholders Louis J. Gross, Eric Carr and Mark.
Charm++ Load Balancing Framework Gengbin Zheng Parallel Programming Laboratory Department of Computer Science University of Illinois at.
N Tropy: A Framework for Analyzing Massive Astrophysical Datasets Harnessing the Power of Parallel Grid Resources for Astrophysical Data Analysis Jeffrey.
Efficient Parallel Implementation of Molecular Dynamics with Embedded Atom Method on Multi-core Platforms Reporter: Jilin Zhang Authors:Changjun Hu, Yali.
Performance Evaluation of Hybrid MPI/OpenMP Implementation of a Lattice Boltzmann Application on Multicore Systems Department of Computer Science and Engineering,
GmImgProc Alexandra Olteanu SCPD Alexandru Ştefănescu SCPD.
1 Babak Behzad, Yan Liu 1,2,4, Eric Shook 1,2, Michael P. Finn 5, David M. Mattli 5 and Shaowen Wang 1,2,3,4 Babak Behzad 1,3, Yan Liu 1,2,4, Eric Shook.
Energy Profiling And Analysis Of The HPC Challenge Benchmarks Scalable Performance Laboratory Department of Computer Science Virginia Tech Shuaiwen Song,
Scientific Computing Topics for Final Projects Dr. Guy Tel-Zur Version 2,
Computational issues in Carbon nanotube simulation Ashok Srinivasan Department of Computer Science Florida State University.
The WRF Model The Weather Research and Forecasting (WRF) Model is a mesoscale numerical weather prediction system designed for both atmospheric research.
Efficient Data Accesses for Parallel Sequence Searches Heshan Lin (NCSU) Xiaosong Ma (NCSU & ORNL) Praveen Chandramohan (ORNL) Al Geist (ORNL) Nagiza Samatova.
HPC User Forum Back End Compiler Panel SiCortex Perspective Kevin Harris Compiler Manager April 2009.
NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.
NIH Resource for Biomolecular Modeling and Bioinformatics Beckman Institute, UIUC NAMD Development Goals L.V. (Sanjay) Kale Professor.
Integrated Ecological Assessment February 28, 2006 Long-Term Plan Annual Update Carl Fitz Recovery Model Development and.
MAPLD 2005/254C. Papachristou 1 Reconfigurable and Evolvable Hardware Fabric Chris Papachristou, Frank Wolff Robert Ewing Electrical Engineering & Computer.
Debugging parallel programs. Breakpoint debugging Probably the most widely familiar method of debugging programs is breakpoint debugging. In this method,
Parallel Landscape Fish Model for South Florida Ecosystem Simulation Dali Wang 1, Michael W. Berry 2, Eric A. Carr 1, Louis J. Gross 1 The ATLSS Hierarchy.
Parallelization Strategies Laxmikant Kale. Overview OpenMP Strategies Need for adaptive strategies –Object migration based dynamic load balancing –Minimal.
A Pattern Language for Parallel Programming Beverly Sanders University of Florida.
1/50 University of Turkish Aeronautical Association Computer Engineering Department Ceng 541 Introduction to Parallel Computing Dr. Tansel Dökeroğlu
Fast Data Analysis with Integrated Statistical Metadata in Scientific Datasets By Yong Chen (with Jialin Liu) Data-Intensive Scalable Computing Laboratory.
Overview of MSU ESRDC Activities related to Computational Tools for Early State Design Dr. Noel Schulz Associate Professor and TVA Endowed Professorship.
Towards a High Performance Extensible Grid Architecture Klaus Krauter Muthucumaru Maheswaran {krauter,
 155 South 1452 East Room 380  Salt Lake City, Utah  This research was sponsored by the National Nuclear Security Administration.
F1-17: Architecture Studies for New-Gen HPC Systems
Kai Li, Allen D. Malony, Sameer Shende, Robert Bell
Introduction to Parallel Computing: MPI, OpenMP and Hybrid Programming
Current Generation Hypervisor Type 1 Type 2.
GMAO Seasonal Forecast
4D-VAR Optimization Efficiency Tuning
Across Trophic Level System Simulation
How can the science of ecological modeling be used as a management tool in assessing the relative impact of alternative hydrology scenarios on populations.
Foundations of Spatially Explicit Species Index Models
FPGA: Real needs and limits
Across Trophic Level System Simulation (ATLSS)
Myoungjin Kim1, Yun Cui1, Hyeokju Lee1 and Hanku Lee1,2,*
University of Technology
Software Architecture in Practice
Performance Evaluation of Adaptive MPI
Model-Driven Analysis Frameworks for Embedded Systems
CMAQ PARALLEL PERFORMANCE WITH MPI AND OpenMP George Delic, Ph
Applying Twister to Scientific Applications
Department of Computer Science University of California, Santa Barbara
Development of the Nanoconfinement Science Gateway
SDM workshop Strawman report History and Progress and Goal.
Dynamic Code Mapping Techniques for Limited Local Memory Systems
Chapter 1 Introduction.
Bin Ren, Gagan Agrawal, Brad Chamberlain, Steve Deitz
Hybrid Programming with OpenMP and MPI
Dr. Tansel Dökeroğlu University of Turkish Aeronautical Association Computer Engineering Department Ceng 442 Introduction to Parallel.
Parallel Computing Explained How to Parallelize a Code
Parallel Programming in C with MPI and OpenMP
Department of Computer Science University of California, Santa Barbara
Parallel Exact Stochastic Simulation in Biochemical Systems
Computational issues Issues Solutions Large time scale
Parallel I/O for Distributed Applications (MPI-Conn-IO)
Presentation transcript:

Department of Computer Science, University of Tennessee, Knoxville A Parallel Structured Ecological Model for High End Shared Memory Computers Dali Wang Department of Computer Science, University of Tennessee, Knoxville dwang@cs.utk.edu

Background and Context to provide a quantitative modeling package to assist stakeholders in the South and Central Florida restoration effort. to aid in understanding how the biotic communities of South Florida are linked to the hydrologic regime and other abiotic factors, and to provide a tool for both scientific research and ecosystem management. IWOMP 2005, D. Wang

Previous Experiments IWOMP 2005, D. Wang A successful approach to the parallelization of landscape based (spatially-explicit) fish models is spatial decomposition. For these cases, each processor only simulates the ecological behaviors of fish on a partial landscape. This approach is efficient in standalone fish simulations because the low movement capability of fish does not force large data movement between processors. IWOMP 2005, D. Wang

Motivations & Objectives However, in an integrated simulation with an individual-based wading bird model, intensive data immigration across all processors is inevitable, since a bird’s flying distance may cover the whole landscape. Typical memory-intensive applications Design a new partition approach to minimize the data transfer; to efficiently utilize the advanced features of shared-memory computational platforms; IWOMP 2005, D. Wang

Model Structure and Fish Dynamics Computational domain: approximately 111,000 landscape cells, each has two basic types of area: marsh and pond. Fish Dynamics Escape, Diffusive Movement, Mortality, Aging, Reproduction, Growth IWOMP 2005, D. Wang

Parallelization Strategy Layer-wised Partition: 1) Data transfer between processes can be minimized; 2) Dynamic load balancing can be easily implemented . IWOMP 2005, D. Wang

Computational Model Model Initialization Computational thread(s) Master thread Computational thread(s) Escape (OMP_NUM_THREADS -1) threads Update low trophic data Big simulation loop Nested OpenMP parallel region Get total fish density (one thread) Diffusive Movement (OMP_NUM_THREADS -1) threads Get fish consumption (one thread) Update hydrological data Mortality (OMP_NUM_THREADS -1) threads Aging Reproduction I/O operations (one thread) Growth Model Finalization IWOMP 2005, D. Wang

Computational Platform A SGI Altix system at the Center for Computational Sciences (CCS) of ORNL. 256 Intel Itanium2 processors running at 1.5 GHz, each with 6 MB of L3 cache, 256K of L2 cache, and 32K of L1 cache. 8 GB of memory per processor for a total of 2 Terabytes of total system memory The operating system is a 64-bit version of Linux. The parallel programming model is supported by OpenMP. IWOMP 2005, D. Wang

Model Result and Performance IWOMP 2005, D. Wang

Future Work (Ecological aspect) Field Data Calibration and Verification Ecological Model Integration With individual-based wading bird model spatially-explicit spices index model Ecological Impact Assessment (scenario analysis, …) Simulation-based ecosystem management (spatial optimal control, real-time ecological system analysis) IWOMP 2005, D. Wang

Future Work (Computational aspect) Large-scale simulation: Fine resolution- A hybrid, reconfigurable two-dimensional (spatial/temporal) partitioning using a hybrid MPI/OpenMP model. Fault tolerant computing/simulation Model integration: A component based parallel simulation framework IWOMP 2005, D. Wang

Related References IWOMP 2005, D. Wang Parallel Implementation D. Wang, et al. Design and Implementation of a Parallel Fish Model for South Florida. Proceedings of the 37th Hawaii International Conference on System Sciences. D. Wang, et al. A Parallel Landscape Fish Model for Ecosystem Modeling, Simulation: The Transactions of The Society of Modeling and International. Grid Computing Module D. Wang, et al. A Grid Service Module for Natural Resource Managers, Internet Computing. Performance Evaluations D. Wang, et al. On Parallelization of a Structured Ecological Model, International Journal on High Performance Computing Applications. Simulation Framework D. Wang, et al. Toward Ecosystem Modeling on Computing Grids, Computing in Science and Engineering. D. Wang, et al. A Parallel Simulation Framework for Regional Ecosystem Modeling, IEEE Transactions on Distributed and Parallel Systems Websites www.atlss.org www.tiem.utk.edu/gem IWOMP 2005, D. Wang

Acknowledgement National Science Foundation U.S. Geological Survey, through Cooperative Agreement with UT Department of Interior’s Critical Ecosystem Studies Initiative Computational Science Initiatives through the Science Alliance at UT/ORNL This research used resources of the Center for Computational Sciences at ORNL, which is supported by the Office of Science of the Department of Energy IWOMP 2005, D. Wang

Parallel code section IWOMP 2005, D. Wang

Lessons MPI/OpenMP IWOMP 2005, D. Wang Easy Process/Thread Management (dynamic vs. static) Minimum Code Modification vs. Flexible Performance Tuning (user involvement needed) Parallel Profiling Tools (TAU, PAPI, …) High Portability (SMP and Cluster, even New systems (multi-core, embedded ) IWOMP 2005, D. Wang