Restructuring the multi-resolution approximation for spatial data to reduce the memory footprint and facilitate scalability
Vinay Ramakrishnaiah
Mentors: Dorit Hammerling, Raghuraj Prasanna Kumar, Rich Loft
Introduction
- High-resolution global measurements of large areas
- Accurate representation and processing of spatial data to predict trends in global climate
- Traditional methods: computationally infeasible at this scale
- Multi-resolution approximation (MRA)
Multi-resolution approximation (MRA)
- Spatial statistics: make parameter inferences and spatial predictions
- Computational inference with the traditional spatial statistical approach is difficult to parallelize; for n observations:
  - Computational complexity: O(n³)
  - Memory complexity: O(n²)
- MRA approximates the remainder independently within each region:
  - Exploits parallelism
  - Reduces the memory requirement
- Sequential MRA:
  - Computational complexity: O(n log² n)
  - Memory complexity: O(n log n)
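To make the asymptotic gap concrete, here is an illustrative Python sketch (the actual implementation is in MATLAB) comparing the O(n³) traditional cost with the O(n log² n) sequential-MRA cost. The cost functions are idealized operation counts, not measured numbers:

```python
import math

def traditional_cost(n):
    # O(n^3) operations, e.g. a dense Cholesky of the n x n covariance matrix
    return n ** 3

def mra_cost(n):
    # O(n log^2 n) operations for the sequential MRA
    return n * math.log2(n) ** 2

for n in (10_000, 100_000, 1_000_000):
    ratio = traditional_cost(n) / mra_cost(n)
    print(f"n = {n:>9,}: traditional / MRA cost ratio ~ {ratio:.3g}")
```

The ratio grows roughly like n² / log² n, which is why the traditional approach becomes infeasible long before MRA does.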
Multi-resolution approximation (MRA)
- The spatial domain is recursively partitioned
- The spatial process is represented as a linear combination of basis functions at multiple spatial resolutions
- Similar in spirit to multigrid algorithms
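The recursive partitioning can be sketched as a quadtree build. This is an illustrative Python sketch, not the MATLAB implementation; the J=4 split into quadrants is an assumption matching the J=4 configurations used in the performance runs:

```python
def partition(region, level, M):
    """Recursively split a rectangular region into J=4 children down to
    resolution level M. region = (x0, y0, x1, y1).
    Returns all regions grouped by resolution level."""
    levels = {level: [region]}
    if level == M:
        return levels
    x0, y0, x1, y1 = region
    xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # split into the four quadrants (J = 4 children per parent)
    for child in [(x0, y0, xm, ym), (xm, y0, x1, ym),
                  (x0, ym, xm, y1), (xm, ym, x1, y1)]:
        for lvl, regs in partition(child, level + 1, M).items():
            levels.setdefault(lvl, []).extend(regs)
    return levels

tree = partition((0.0, 0.0, 1.0, 1.0), level=0, M=3)
for lvl in sorted(tree):
    print(f"level {lvl}: {len(tree[lvl])} regions")  # 1, 4, 16, 64
```

Each level m holds 4^m regions, so the layer widths grow geometrically toward the finest resolution.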
Outline of the algorithm
- Creation of the prior
- Posterior inference
Implementations
- Existing implementation: the original sequential MRA
- Full-layer parallelism
- Alternatives:
  - Hyper-segmentation
  - Shallow trees
Full-layer parallel approach
- Regions within a resolution layer are executed in parallel
- Layers are executed sequentially
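A minimal sketch of this scheduling pattern, in Python for illustration; `process_region` is a hypothetical stand-in for the per-region MRA computation:

```python
from concurrent.futures import ThreadPoolExecutor

def process_region(region):
    # placeholder for the real per-region prior/posterior computation
    return sum(region)

def full_layer_parallel(layers):
    """Layers run one after another; the regions of each layer run in parallel."""
    results = []
    with ThreadPoolExecutor() as pool:
        for layer in layers:  # sequential over resolution layers
            # parallel over all regions of this layer
            results.append(list(pool.map(process_region, layer)))
    return results
```

The available parallelism equals the width of the current layer, so the coarse layers near the root leave most workers idle while the finest layer dominates memory use.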
Hyper-segmentation
- First step towards reducing the memory footprint
- Trades off parallelism for memory
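A rough counting sketch of why traversing one segment of the tree at a time shrinks the footprint. These formulas are illustrative simplifications assumed for the sketch, not measurements or formulas from the implementation:

```python
def peak_live_regions_breadth(M, J=4):
    # full-layer approach: all J**M regions of the finest layer
    # are live simultaneously
    return J ** M

def peak_live_regions_depth(M, J=4):
    # segment-at-a-time traversal: roughly one active region per level
    # plus its J - 1 pending siblings at each level, and the root
    return M * J + 1

for M in (9, 10, 11):
    print(M, peak_live_regions_breadth(M), peak_live_regions_depth(M))
```

The breadth-first peak grows geometrically in M while the segmented peak grows only linearly, which is the memory-for-parallelism trade: fewer regions are in flight at once, so fewer can run concurrently.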
Shallow tree approach
- Partitioning into shallow trees starts at a chosen resolution level
- Sub-trees (shallow trees) can be executed sequentially or in a distributed fashion
- Regions within the resolution layers of a shallow tree can be executed in parallel
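An illustrative Python sketch of this control flow (the real implementation is in MATLAB); `process_region` and the index bookkeeping are hypothetical stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor

def process_region(level, idx):
    # placeholder for the real per-region computation
    return (level, idx)

def shallow_tree_pass(M, L, J=4):
    """Split the M-level tree at level L into J**L shallow trees.
    Here shallow trees are visited one after another (they could equally
    be distributed across nodes); within each shallow tree, layers run
    sequentially and the regions of a layer run in parallel."""
    out = []
    for root in range(J ** L):               # one shallow tree per root at level L
        with ThreadPoolExecutor() as pool:
            for level in range(L, M + 1):    # layers of this shallow tree
                # indices of this root's descendants at `level`
                width = J ** (level - L)
                idxs = [root * width + k for k in range(width)]
                out.extend(pool.map(lambda i, l=level: process_region(l, i), idxs))
    return out
```

Only one shallow tree's regions need to be resident at a time, so the peak footprint is set by the sub-tree size rather than the full tree, while layer-level parallelism is preserved inside each sub-tree.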
Experimental methodology
- MATLAB used for the implementation
- Geyser hardware, per node:
  - Four 10-core, 2.4 GHz Intel Xeon E7-4870 (Westmere-EX) processors
  - 1 TB DDR3-1600 memory
- Single-node (40-core) implementations of full-layer parallelism, hyper-segmentation, and shallow trees
- PMET = Peak memory × Execution time
Performance
(Plots shown for the following configurations; M = number of layers, J = children per parent, r = knots per region)
- M=9, J=4, r=32
- M=9, J=4, r=64
- M=10, J=4, r=32
- M=11, J=4, r=25
- M=11, J=4, r=40
Execution cost - PMET
Moving to distributed memory
- No MATLAB distributed computing server available on Yellowstone
- Workaround: MatlabMPI from MIT Lincoln Laboratory
  - Uses a file I/O protocol to implement MPI
  - Requires a directory visible to every machine
- Python used to run MPI and call the MATLAB functions
Conclusion
- Improvement over the existing implementations
- Reduces the memory footprint by a factor of ~3
- Increases the maximum data set size that MRA can process
- The implemented algorithms are theoretically scalable
Future work
- Restructure the data types
- Rewrite the code in a lower-level language to exploit more levels of parallelism
- Potential for a GPU implementation
Acknowledgements
- Thanks to my mentors: Dorit Hammerling, Raghuraj Prasanna Kumar, Rich Loft
- Thanks to Sophia Chen (high school intern) for the graphics used in this presentation
- Thanks to Patrick Nichols, Shiquan Su, Brian Vanderwende, Davide Del Vento, Richard Valent
- Thanks to all the NCAR administrative staff
Thank you Questions?