Parallel Programming Fundamentals
Eric Shook
Department of Geography, Kent State University

Our Parallel Programming Focus: Shared Memory with Data Parallelism

[Figure: processing core 0 and processing core 1 each run Task A and Task B, each on half of the data (example data value: [40.742, -74.245]).]
Memory space is shared between processing cores 0 and 1.

Local Operations: Starting off Easy

[Figure: LocalSum. The input layers are added together. Decomposition splits them into subdomains, processing core 0 and processing core 1 process the subdomains in parallel, and the results form the output layer.]
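
The LocalSum workflow above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names (`local_sum_rows`, `parallel_local_sum`), the row-wise decomposition, and the sample layers are all assumptions. Threads are used because they share one memory space, which matches the shared-memory model of this deck; in CPython they illustrate the model rather than deliver a real speedup for pure-Python loops.

```python
from concurrent.futures import ThreadPoolExecutor


def local_sum_rows(layer_a, layer_b, out, row_start, row_end):
    """Add two layers cell by cell for rows [row_start, row_end)."""
    for r in range(row_start, row_end):
        for c in range(len(layer_a[r])):
            # A local operation: each output cell depends only on the
            # same location in the input layers.
            out[r][c] = layer_a[r][c] + layer_b[r][c]


def parallel_local_sum(layer_a, layer_b, n_workers=2):
    """Hypothetical driver: decompose by rows and process in parallel."""
    n_rows = len(layer_a)
    out = [[0] * len(row) for row in layer_a]  # shared output layer
    # Row-wise decomposition into n_workers subdomains (an assumption).
    bounds = [(i * n_rows // n_workers, (i + 1) * n_rows // n_workers)
              for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(local_sum_rows, layer_a, layer_b, out, s, e)
                   for s, e in bounds]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return out


layer_a = [[1, 2], [3, 4], [5, 6], [7, 8]]
layer_b = [[10, 20], [30, 40], [50, 60], [70, 80]]
result = parallel_local_sum(layer_a, layer_b)
```

Because every worker writes only to the rows of its own subdomain, no two workers ever touch the same output cell, which is why local operations are the easy case.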

Non-Local Operations: FocalMean Example

[Figure: FocalMean applied to an input layer. Decomposition splits the layer into subdomains, which processing core 0 and processing core 1 process in parallel.]

+ Edge Cases

The edge case (literally) causes a problem. The bottom row of cells in the left subdomain (core 0) and the top row of cells in the right subdomain (core 1) do not have all the data they need for processing, because some of their neighboring cells belong to the other subdomain (missing data).

[Figure: the FocalMean input layer decomposed into subdomains for processing cores 0 and 1, with the data each core lacks at its subdomain edge marked "Missing".]
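
A small Python sketch can make the missing-data problem concrete. Everything here is an assumption for illustration: the 3x3 window, the rule that cells outside the layer are simply ignored, the sample layer, and the split into a top and a bottom subdomain.

```python
def focal_mean(layer):
    """3x3 focal mean; cells outside the layer are ignored (an assumption)."""
    n_rows, n_cols = len(layer), len(layer[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [layer[rr][cc]
                      for rr in range(max(0, r - 1), min(n_rows, r + 2))
                      for cc in range(max(0, c - 1), min(n_cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


layer = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9],
         [10, 11, 12]]

# Reference result computed on the whole layer.
serial = focal_mean(layer)

# Naive decomposition: each core sees only its own half of the rows.
top, bottom = layer[:2], layer[2:]
naive = focal_mean(top) + focal_mean(bottom)

# Rows whose results differ from the reference.
mismatched_rows = [r for r in range(len(layer)) if naive[r] != serial[r]]
```

With this layer, the mismatches are rows 1 and 2: the last row of the first subdomain and the first row of the second, exactly the cells whose neighborhoods cross the subdomain boundary.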

Ghost Zones

"Ghost zones surround a local computing environment as proxies for communicating with remote processors and, thus, simplify the handling of inter-processor communication" (Shook et al. 2013).

[Figure: each subdomain is shown with a "Local Copy" of the neighboring cells it needs.]

Ghost zones can also be expanded to handle zonal and global operations.

Reference: Shook, E., Wang, S., & Tang, W. (2013). A communication-aware framework for parallel spatially explicit agent-based models. International Journal of Geographical Information Science, 27(11), 2160-2181.
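
The ghost zone idea can be sketched in Python as follows. This is an illustrative assumption, not the framework from Shook et al. (2013): the helper `focal_mean_with_ghost`, the one-row ghost zone, and the sample layer are hypothetical. Each subdomain receives a local copy of one extra row from its neighbor, computes the focal mean on that enlarged copy, and keeps only the rows it owns.

```python
def focal_mean(layer):
    """3x3 focal mean; cells outside the layer are ignored (an assumption)."""
    n_rows, n_cols = len(layer), len(layer[0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            window = [layer[rr][cc]
                      for rr in range(max(0, r - 1), min(n_rows, r + 2))
                      for cc in range(max(0, c - 1), min(n_cols, c + 2))]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


def focal_mean_with_ghost(layer, row_start, row_end, ghost=1):
    """Focal mean for rows [row_start, row_end) using ghost rows."""
    lo = max(0, row_start - ghost)
    hi = min(len(layer), row_end + ghost)
    # Local copy of the subdomain plus its ghost rows.
    local = [row[:] for row in layer[lo:hi]]
    result = focal_mean(local)
    # Keep only the rows this subdomain owns; discard ghost-row results.
    return result[row_start - lo:row_end - lo]


layer = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9],
         [10, 11, 12]]

serial = focal_mean(layer)

# Two subdomains; each call is independent and could run on its own core.
with_ghost = (focal_mean_with_ghost(layer, 0, 2)
              + focal_mean_with_ghost(layer, 2, 4))
```

A one-row ghost zone is enough here because the 3x3 window reaches only one row beyond the cell being processed; a larger window would need a correspondingly deeper ghost zone.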

Data Dependencies

How much water accumulates at the bottom of the hill? Answering this question depends on:
- What is uphill (spatial dependency)
- How long it has been raining (temporal dependency)

Ghost zones help to resolve spatial dependencies in parallel programs.
Procedures in cartographic modeling help to resolve temporal dependencies.

Data Hazards

Data dependencies may lead to data hazards:
- Read After Write (RAW)
- Write After Read (WAR)
- Write After Write (WAW)

[Figure: four executions of the statements X = 2 + 2, X = X + 2, and X = X + 4.]
- Core 0 only, in the order X = 2 + 2, X = X + 2, X = X + 4: X = 10 (correct)
- Core 0 and core 1, X = 2 + 2, X = X + 2, X = X + 4: RAW error (X not defined)
- Core 0 and core 1, X = 2 + 2, X = X + 4, X = X + 2: X = 8 (WAR)
- Core 0 and core 1, X = 2 + 2, X = X + 4, X = X + 2: X = 6 (WAW)
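
The outcomes on this slide can be reproduced with a small deterministic simulation in Python. The exact interleavings in the original figure are not recoverable, so the schedules below are assumptions: core 0 is assumed to run X = 2 + 2 and X = X + 2, core 1 is assumed to run X = X + 4, and each update is split into a read into a private register followed by a write back to shared memory. The `run` helper and the schedule format are hypothetical.

```python
def run(schedule):
    """Execute (core, action, value) steps against one shared variable X."""
    memory = {}     # shared memory; X is absent until first written
    registers = {}  # one private register per core
    for core, action, value in schedule:
        if action == "set":      # X = value
            memory["X"] = value
        elif action == "read":   # register = X
            if "X" not in memory:
                return "not defined"  # X was read before it was written
            registers[core] = memory["X"]
        elif action == "write":  # X = register + value
            memory["X"] = registers[core] + value
    return memory["X"]


# One core, program order: 4, then 6, then 10.
serial = [(0, "set", 2 + 2),
          (0, "read", None), (0, "write", 2),
          (0, "read", None), (0, "write", 4)]

# Core 1 reads X before core 0 has written it.
read_too_early = [(1, "read", None), (0, "set", 2 + 2)]

# Both cores read X = 4; core 1 writes last, so core 0's update is lost.
core1_writes_last = [(0, "set", 2 + 2),
                     (0, "read", None), (1, "read", None),
                     (0, "write", 2), (1, "write", 4)]

# Both cores read X = 4; core 0 writes last, so core 1's update is lost.
core0_writes_last = [(0, "set", 2 + 2),
                     (0, "read", None), (1, "read", None),
                     (1, "write", 4), (0, "write", 2)]

results = [run(serial), run(read_too_early),
           run(core1_writes_last), run(core0_writes_last)]
# results == [10, "not defined", 8, 6]
```

The same three statements give four different answers depending only on the order in which the cores touch X, which is why uncoordinated writes to a shared location have to be avoided.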

PCML – Automatic Parallelization

PCML supports automatic parallelization so developers do not have to worry about all the details of parallelism. Each location in a layer is decomposed and assigned to a single subdomain. As long as developers:
- only write to the location that is being processed, they will avoid data hazards;
- properly define ghost zones, all spatial dependencies will be satisfied;
- properly create procedures and operations, all temporal dependencies will be satisfied.