Presentation is loading. Please wait.

Presentation is loading. Please wait.

Global Trees: A Framework for Linked Data Structures on Distributed Memory Parallel Systems D. Brian Larkins, James Dinan, Sriram Krishnamoorthy, Srinivasan.

Similar presentations


Presentation on theme: "Global Trees: A Framework for Linked Data Structures on Distributed Memory Parallel Systems D. Brian Larkins, James Dinan, Sriram Krishnamoorthy, Srinivasan."— Presentation transcript:

1 Global Trees: A Framework for Linked Data Structures on Distributed Memory Parallel Systems D. Brian Larkins, James Dinan, Sriram Krishnamoorthy, Srinivasan Parthasarthy, Atanas Rountev, P. Sadayappan

2 Background Trees and graphs can concisely represent relationships between data Data sets are becoming increasingly large and can require compute-intensive processing Developing efficient, memory hierarchy- aware applications is hard

3 Sample Applications n-body simulation Fast Multipole Methods (FMM) multiresolution analysis clustering and classification frequent pattern mining

4 Key Contributions Efficient fine-grained data access with a global view of data Exploit linked structure to provide fast global pointer dereferencing High-level, locality-aware, parallel operations on linked data structures Application-driven customization Empirical validation of the approach

5 Framework Design

6 Global Chunk Layer (GCL) API and run-time library for managing chunks - built on ARMCI Abstracts common functionality for handling irregular, linked data Provides a global namespace with access and modification operations Extensible and highly customizable to maximize functionality and performance

7 Chunks A chunk is: Contiguous memory segment Globally accessible Physically local to only one process Collection of user-defined elements Unit of data transfer

8 Programming Model SPMD with MIMD-style parallelism Global pointers permit fine-grained access Chunks allow coarse-grained data movement Uses get/compute/put model for globally shared data access Provides both uniform global view and chunked global view of data

9 Global Pointers c = &p.child[i] + p.child[i].ci + p.child[i].no }} c p 4252 + -4252 + 4340

10 Global Trees (GT) Run-time library and API for global view programming trees on DM clusters Built on GCL chunk communication framework High-level tree operations which work in parallel and are locality aware Each process can asynchronously access any portion of the shared tree structure

11 GT Concepts Tree Groups set of global trees allocations are made from the same chunk pool Global Node Pointers Tree Nodes link structure managed by GT body is user-defined structure

12 Example: Copying a Tree

13 Tree Traversals GT provides optimized, parallel traversals for common traversal orders Visitor callbacks are application-defined computations on a single node GT currently provides top-down, bottom- up, and level-wise traversals

14 Sample Traversal Usage

15 Node Mapping

16 Custom Allocation No single mapping of data elements to chunks will be optimal GT/GCL supports custom allocators to improve spatial locality Allocators can use a hint from call-site and can keep state between calls Default allocation is local-open

17 Experimental Results Evaluate using: Barnes-Hut from SPLASH-2 Compression operation from MADNESS GT compared with: Intel’s Cluster OpenMP and TreadMarks runtime UPC

18 Global Pointer Overhead Barnes-Hut compress()

19 Chunk Size and Bandwidth Experiments run on the department WCI Cluster - 2.33GHz Intel Xeon, 6GB RAM, Infiniband

20 Impact of Chunk Utilization Barnes-Hut Experiments run on the department WCI Cluster - 2.33GHz Intel Xeon, 6GB RAM, Infiniband

21 Barnes-Hut Chunk Size Selection Barnes-Hut application from SPLASH-2 suite

22 Barnes-Hut Scaling chunk size = 256, bodies = 512k

23 Local vs. Remote Access MADNESS compress()

24 Related Approaches Distributed Shared Memory (DSM) Cluster OpenMP, TreadMarks, Cashmere, Midway Distributed Shared Objects (DSO) Charm++, Linda, Orca Partitioned Global Address Space (PGAS) Languages and Systems UPC, Titanium, CAF, ARMCI, SHMEM, GASNET Shared pointer-based data structure support on distributed memory clusters Parallel Irregular Trees, Olden HPCS Programming Languages Chapel, X10

25 Future Work Global Graphs GT data reference locality tools More applications

26 Questions email: larkins@cse.ohio-state.edularkins@cse.ohio-state.edu


Download ppt "Global Trees: A Framework for Linked Data Structures on Distributed Memory Parallel Systems D. Brian Larkins, James Dinan, Sriram Krishnamoorthy, Srinivasan."

Similar presentations


Ads by Google