Game of Life Courtesy: Dr. David Walker, Cardiff University.

A Dynamical System – WaTor Courtesy: Dr. David Walker, Cardiff University
 Tracks the evolution of life in a 2-D ocean in which sharks and fish survive
 Two important features: a. the need for dynamic load distribution; b. potential conflicts due to updates by different processors
 These features are shared by other advanced parallel applications

WaTor – The Problem
 The ocean is divided into a grid of cells
 Each grid cell can be empty or hold a fish or a shark
 The grid is initially populated with fish and sharks at random
 The population evolves over discrete time steps according to certain rules

WaTor – Rules: Fish
 At each time step, a fish tries to move to a neighboring empty cell. If no neighboring cell is empty, it stays.
 If a fish has reached the breeding age, then when it moves it breeds, leaving behind a fish of age 0. A fish cannot breed if it does not move.
 Fish never starve.

WaTor – Rules: Shark
 At each time step, if one of the neighboring cells holds a fish, the shark moves to that cell and eats the fish. Otherwise, if one of the neighboring cells is empty, the shark moves there. Otherwise, it stays.
 If a shark has reached the breeding age, then when it moves it breeds, leaving behind a shark of age 0. A shark cannot breed if it does not move.
 Sharks eat only fish. If a shark reaches the starvation age (time steps since it last ate), it dies.

Inputs and Data Structures
Inputs:
 Size of the grid
 Initial distribution of sharks and fish
 Shark and fish breeding ages
 Shark starvation age
Data structures:
A 2-D grid of cells:

    struct ocean {
        int type;                  /* shark, fish, or empty */
        struct swimmer* occupier;
    } ocean[MAXX][MAXY];

A linked list of swimmers:

    struct swimmer {
        int type;
        int x, y;
        int age;
        int last_ate;
        int iteration;
        struct swimmer* prev;
        struct swimmer* next;
    } *List;

Sequential code logic:
 Initialize the ocean array and the swimmers list
 In each time step, go through the swimmers in the order in which they are stored and perform the updates

Towards a Parallel Code
 A 2-D data distribution, similar to the ones used for Laplace and molecular dynamics, is used: each processor holds a block of ocean cells.
 For communication, each processor needs data from its 4 neighboring processors.
 Two new challenges: the potential for conflicts, and load balancing.

1st Challenge – Potential for Conflicts
 Unlike in the previous problems, border cells may change during the update due to fish or shark movement.
 Updated border cells need to be communicated back to the owning processor, so the update step involves communication.
 In the meantime, the owning processor may itself have updated the border cell; hence the potential for conflicts.
[Figure: two grids of S (shark) and F (fish) cells at time T and time T+1, illustrating a conflicting update at a processor border]

2 Techniques
 1st technique: roll back updates for those creatures (fish or sharks) that have crossed a processor boundary and are in potential conflict.
 This may lead to several rollbacks until a free space is found.
 2nd technique: synchronize during updates to avoid conflicts in the first place.

2 Techniques (continued)
 During an update, a processor x first sends its data to processor y, allows y to perform its updates, gets the updates back from y, and then performs its own updates.
 The cost of this synchronization can be reduced by sub-partitioning: divide the grid owned by a processor into sub-grids.
 This way, some parallelism is retained in the neighbor updates.

Load Imbalance
 The workload distribution changes over time
 A static 2-D block distribution is not optimal
Techniques:
 A dynamic load balancer
 Static load balancing via a different data distribution

Dynamic Load Balancing
 Performed at each time step
 Orthogonal Recursive Bisection (ORB)
Problems: complexity in finding the neighbors and the data to communicate

Static Data Distribution – Cyclic
[Figure: blocks of the ocean grid assigned cyclically to a 4×4 processor grid, so each processor (i, j) owns every 4th block in each dimension]
Problems: increase in boundary data; increase in communication