Restructuring the multi-resolution approximation for spatial data to reduce the memory footprint and to facilitate scalability
Vinay Ramakrishnaiah
Mentors: Dorit Hammerling, Raghuraj Prasanna Kumar, Rich Loft
Introduction
High-resolution global measurements of large areas.
Accurate representation and processing of spatial data.
Predict trends in global climate.
Traditional methods – computationally infeasible.
Multi-resolution approximation (MRA).
Multi-resolution approximation (MRA)
Spatial statistics – make parameter inferences and spatial predictions.
Computational inference using the traditional spatial statistical approach – difficult to parallelize.
For n observations:
  Computational complexity – O(n³)
  Memory complexity – O(n²)
MRA – approximate the remainder independently in each region:
  Exploit parallelism
  Reduce memory requirement
Sequential MRA:
  Computational complexity – O(n log² n)
  Memory complexity – O(n log n)
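To make the complexity gap concrete, here is a back-of-the-envelope Matlab comparison of the two cost terms (constants dropped; illustrative arithmetic only, not from the presentation):

    n = 1e6;                      % number of observations
    dense_cost = n^3;             % traditional approach: O(n^3)
    mra_cost = n * log2(n)^2;     % sequential MRA: O(n log^2 n)
    fprintf('cost ratio (dense/MRA): %.1e\n', dense_cost / mra_cost);
    % For n = 1e6 the ratio is roughly 2.5e+09.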
Multi-resolution Approximation (MRA)
The spatial domain is recursively partitioned.
The spatial process is represented as a linear combination of basis functions at multiple spatial resolutions.
Similar to a multi-grid algorithm.
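A minimal Matlab sketch of the recursive partitioning (the 1-D interval domain and the name partition_domain are assumptions for illustration, not the presented code):

    % Recursively split a domain into J child regions per region,
    % down to M resolution levels. A 1-D interval stands in for the
    % spatial domain. Example: r = partition_domain(0, 1, 4, 9);
    function regions = partition_domain(lo, hi, J, M)
        regions = cell(M, 1);       % regions{m}: [lo, hi] rows at level m
        regions{1} = [lo, hi];      % level 1 covers the whole domain
        for m = 2:M
            parents = regions{m-1};
            kids = zeros(size(parents, 1) * J, 2);
            for p = 1:size(parents, 1)
                edges = linspace(parents(p, 1), parents(p, 2), J + 1);
                kids((p-1)*J + (1:J), :) = [edges(1:J)', edges(2:J+1)'];
            end
            regions{m} = kids;      % J^(m-1) regions at level m
        end
    end

Each region at each level then carries a set of r knots at which basis functions are placed, matching the J and r parameters in the experiments below.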
Outline of the algorithm
Creation of the prior
Posterior inference
Implementations
Existing implementation:
  Original implementation – sequential MRA
  Full-layer parallelism
Alternatives:
  Hyper-segmentation
  Shallow trees
Full-layer parallel approach
Regions within a resolution layer are executed in parallel.
Layers are executed sequentially.
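A minimal Matlab sketch of this pattern (assumed structure for illustration; process_region is a hypothetical stand-in for the per-region computation, and parfor requires the Parallel Computing Toolbox):

    M = 9;   % number of resolution layers (as in the experiments below)
    J = 4;   % children per region
    for m = 1:M                    % layers run one after another
        nRegions = J^(m-1);        % regions at layer m
        results = cell(nRegions, 1);
        parfor r = 1:nRegions      % all regions of a layer run in parallel
            results{r} = process_region(m, r);
        end
    end

    function out = process_region(m, r)
        % Hypothetical stand-in for the per-region prior/posterior work.
        out = [m, r];
    end

Note that the peak memory of this scheme is set by the widest layer, since all J^(M-1) leaf regions are live at once; that memory cost is what the next two approaches trade against parallelism.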
Hyper-segmentation
First step towards reducing the memory footprint.
Trades off parallelism for memory.
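The slides do not spell out the traversal, but one way to read hyper-segmentation is as a depth-first walk that keeps only one branch (segment) of the tree live at a time. A speculative Matlab sketch under that assumption (walk_segment and process_region are hypothetical names, not the presented code):

    % Speculative sketch: visit the tree branch by branch instead of
    % layer by layer, so only the current root-to-leaf path of
    % intermediates is held in memory: a smaller peak footprint, at
    % the cost of the concurrency a full layer would offer.
    % Example: walk_segment(1, 1, 9, 4) visits the whole tree.
    function walk_segment(m, r, M, J)
        process_region(m, r);          % hypothetical per-region kernel
        if m < M
            for j = 1:J                % children visited one at a time
                walk_segment(m + 1, (r-1)*J + j, M, J);
            end
        end
    end

    function out = process_region(m, r)
        out = [m, r];                  % stand-in payload
    end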
Shallow tree approach
Partitioning into shallow trees starts at a certain resolution level.
Sub-trees (shallow trees) can be executed sequentially or in a distributed fashion.
Regions within the shallow-tree resolution layers can be executed in parallel.
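A minimal Matlab sketch of this hybrid (assumed structure; the split level L and process_region are illustrative choices, not the presented code):

    M = 9;  J = 4;  L = 3;         % split level L is a tuning choice
    roots = 1:J^(L-1);             % one shallow tree per level-L region
    for t = roots                  % sequential here; could be one tree per node
        for m = L:M                % layers inside this shallow tree
            nLocal = J^(m-L);      % this sub-tree's regions at layer m
            base = (t-1) * nLocal; % offset into the global region numbering
            out = cell(nLocal, 1);
            parfor r = 1:nLocal    % layer parallelism within the sub-tree
                out{r} = process_region(m, base + r);
            end
        end
    end
    % Layers 1..L-1 above the split are omitted for brevity.

    function out = process_region(m, r)
        % Hypothetical stand-in for the per-region MRA computation.
        out = [m, r];
    end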
Experimental methodology
Matlab used for the implementation.
Geyser – hardware per node:
  Four 10-core, 2.4 GHz Intel Xeon E7 (Westmere EX) processors
  1 TB DDR memory
Single-node (40 cores) implementations of full-layer parallel, hyper-segmentation, and shallow trees.
PMET = peak memory × execution time
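Since PMET weighs both resources equally, a method that halves peak memory while doubling runtime keeps the same score. A toy calculation (numbers invented for illustration, not measured results):

    peak_mem_gb = 200;                 % hypothetical peak memory (GB)
    exec_time_s = 120;                 % hypothetical execution time (s)
    pmet = peak_mem_gb * exec_time_s;  % PMET = 24000 GB*s
    fprintf('PMET = %g GB*s\n', pmet);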
Performance
Performance charts for the following configurations:
  Number of layers M=9, children/parent J=4, knots/region r=32
  Number of layers M=9, children/parent J=4, knots/region r=64
  Number of layers M=10, children/parent J=4, knots/region r=32
  Number of layers M=11, children/parent J=4, knots/region r=25
  Number of layers M=11, children/parent J=4, knots/region r=40
Execution cost – PMET
Moving to distributed memory
No Distributed Computing Server for Matlab on Yellowstone.
Hack: MatlabMPI by MIT Lincoln Laboratory.
  Uses a file I/O protocol to implement MPI.
  Requires a directory visible to every machine.
Python used to run MPI and call Matlab functions.
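A minimal sketch of the MatlabMPI send/receive pattern (modeled on the library's standard MPI_Send/MPI_Recv interface; the work-distribution scheme is a hypothetical example, not the presented code):

    % MatlabMPI exchanges messages through files in a shared directory,
    % so every rank must see the same filesystem path.
    MPI_Init;                           % start MatlabMPI
    comm = MPI_COMM_WORLD;              % default communicator
    my_rank = MPI_Comm_rank(comm);
    nprocs = MPI_Comm_size(comm);
    tag = 1;
    if my_rank == 0
        trees = 1:8;                    % hypothetical shallow-tree IDs
        for dst = 1:nprocs-1            % deal the trees out to the workers
            MPI_Send(dst, tag, comm, trees(dst:nprocs-1:end));
        end
    else
        my_trees = MPI_Recv(0, tag, comm);  % blocks on the message file
        disp(my_trees);                     % process the assigned sub-trees
    end
    MPI_Finalize;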
Conclusion
Improvement over the existing implementations:
  Reduced the memory footprint by a factor of ~3.
  Increased the maximum data set size that MRA can process.
  The implemented algorithms scale well in theory.
Future work
Restructure the data types.
Rewrite the code in a lower-level language to exploit more levels of parallelism.
Potential for a GPU implementation.
Acknowledgements
Thanks to my mentors: Dorit Hammerling, Raghuraj Prasanna Kumar, and Rich Loft.
Thanks to Sophia Chen (high school intern) for the graphics used in this presentation.
Thanks to Patrick Nichols, Shiquan Su, Brian Vanderwende, Davide Del Vento, and Richard Valent.
Thanks to all the NCAR administrative staff.
Thank you
Questions?