Restructuring the multi-resolution approximation for spatial data to reduce the memory footprint and to facilitate scalability Vinay Ramakrishnaiah Mentors:


1 Restructuring the multi-resolution approximation for spatial data to reduce the memory footprint and to facilitate scalability
Vinay Ramakrishnaiah
Mentors: Dorit Hammerling, Raghuraj Prasanna Kumar, Rich Loft

2 Introduction
High-resolution global measurements of large areas.
Accurate representation and processing of spatial data.
Predict trends in global climate.
Traditional methods are computationally infeasible.
Multi-resolution approximation (MRA).

3 Multi-resolution approximation (MRA)
Spatial statistics: make parameter inference and spatial predictions.
Computational inference using the traditional spatial statistical approach is difficult to parallelize.
For the number of observations n:
Computational complexity: O(n^3)
Memory complexity: O(n^2)
MRA: approximate the remainder independently; exploit parallelism; reduce the memory requirement.
Sequential MRA:
Computational complexity: O(n log^2 n)
Memory complexity: O(n log n)
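To give a feel for the gap between these complexity classes, the sketch below evaluates the growth terms for a few values of n. It ignores constant factors and assumes 8-byte double-precision storage and base-2 logarithms; neither assumption comes from the slides, so the numbers are orders of magnitude only.

```python
import math

BYTES_PER_DOUBLE = 8  # assumption: dense double-precision storage


def dense_costs(n):
    """Growth terms of the traditional approach: (n^3 time, n^2 memory)."""
    return n ** 3, n ** 2


def mra_costs(n):
    """Growth terms of sequential MRA: (n log^2 n time, n log n memory)."""
    log_n = math.log2(n)  # assumption: base-2 logarithm
    return n * log_n ** 2, n * log_n


def to_gib(entries):
    """Convert a count of stored doubles to GiB."""
    return entries * BYTES_PER_DOUBLE / 2 ** 30


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        _, dense_mem = dense_costs(n)
        _, mra_mem = mra_costs(n)
        print(f"n={n:>9,}: dense term ~{to_gib(dense_mem):,.1f} GiB, "
              f"MRA term ~{to_gib(mra_mem):,.3f} GiB")
```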

4 Multi-resolution approximation (MRA)
The spatial domain is recursively partitioned.
The spatial process is a linear combination of basis functions at multiple spatial resolutions.
Similar to a multi-grid algorithm.
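The recursive partitioning can be pictured with a small sketch. It assumes a rectangular domain split into four equal quadrants per level (matching J = 4 used later in the slides); the `Region` class and function names are hypothetical and only illustrate the tree structure, not the basis functions or knots.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    level: int
    bounds: tuple  # (x_min, x_max, y_min, y_max)
    children: list = field(default_factory=list)


def partition(bounds, max_level, level=0):
    """Recursively split a rectangle into 4 quadrants until max_level is reached."""
    region = Region(level, bounds)
    if level < max_level:
        x0, x1, y0, y1 = bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quadrants = ((x0, xm, y0, ym), (xm, x1, y0, ym),
                     (x0, xm, ym, y1), (xm, x1, ym, y1))
        for quad in quadrants:
            region.children.append(partition(quad, max_level, level + 1))
    return region


def regions_per_level(root):
    """Count the regions at each resolution level of the tree."""
    counts = {}
    stack = [root]
    while stack:
        node = stack.pop()
        counts[node.level] = counts.get(node.level, 0) + 1
        stack.extend(node.children)
    return counts


if __name__ == "__main__":
    tree = partition((0.0, 1.0, 0.0, 1.0), max_level=3)
    print(regions_per_level(tree))
```

With four children per parent, level m holds 4^m regions, so the finest layers dominate both the work and the memory.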

5 Outline of the algorithm
Creation of prior.
Posterior inference.

6 Implementations
Existing implementation: original implementation (sequential MRA); full-layer parallelism.
Alternatives: hyper-segmentation; shallow trees.

7-11 Full-layer parallel approach
Regions within a resolution layer are executed in parallel.
Layers are executed sequentially.
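The scheduling pattern described here can be sketched as follows. This is a minimal illustration, not the Matlab implementation: `process_region` is a hypothetical placeholder for the per-region work, regions are identified only by (level, index), and a thread pool stands in for whatever parallel mechanism the actual code uses.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def process_region(level, index):
    """Hypothetical placeholder for the per-region computation."""
    return (level, index)


def full_layer_parallel(num_levels, children_per_parent=4, workers=4):
    """Run layers one after another; run all regions of a layer in parallel."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for level in range(num_levels):  # layers: sequential
            n_regions = children_per_parent ** level
            # Regions of this layer: parallel. Collecting the results into a
            # list waits for the whole layer before the next one starts.
            layer = list(pool.map(partial(process_region, level), range(n_regions)))
            results.append(layer)
    return results


if __name__ == "__main__":
    for level, layer in enumerate(full_layer_parallel(num_levels=4)):
        print(f"level {level}: {len(layer)} regions processed")
```

Because an entire layer is in flight at once, the widest layer sets the peak memory, which is the cost the alternative approaches try to reduce.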

12-19 Hyper-segmentation
First step towards reducing the memory footprint.
Trades off parallelism for memory.

20-36 Shallow tree approach
Partitioning into shallow trees starts at a certain resolution level.
Sub-trees (shallow trees) can be executed sequentially or in a distributed fashion.
Regions within the shallow tree resolution layers can be executed in parallel.
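A rough sketch of this scheduling idea follows. It assumes regions at each level are numbered so that the children of region i are regions J*i to J*i + J - 1 on the next level; that numbering, the `split_level` parameter, and the placeholder `process_region` are assumptions for illustration. How the layers above the split level are handled is not described in the slides and is left out here.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def process_region(level, index):
    """Hypothetical placeholder for the per-region computation."""
    return (level, index)


def subtree_layers(root_index, split_level, num_levels, j=4):
    """Yield (level, region indices) for the sub-tree rooted at split_level."""
    for level in range(split_level, num_levels):
        width = j ** (level - split_level)  # regions of this sub-tree on this level
        start = root_index * width
        yield level, range(start, start + width)


def run_shallow_trees(split_level, num_levels, j=4, workers=4):
    """Process sub-trees one at a time; within a sub-tree, go layer by layer."""
    processed = []
    widest = 0  # largest number of regions in flight at once
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for root in range(j ** split_level):  # sub-trees: sequential here
            for level, indices in subtree_layers(root, split_level, num_levels, j):
                widest = max(widest, len(indices))
                processed.extend(pool.map(partial(process_region, level), indices))
    return processed, widest


if __name__ == "__main__":
    done, widest = run_shallow_trees(split_level=2, num_levels=5)
    print(f"{len(done)} regions processed, at most {widest} in flight at once")
```

In this sketch the widest batch shrinks from J^(M-1) regions for a full layer to J^(M-1-split_level) regions per sub-tree, which is the kind of parallelism-for-memory trade the approach aims at. The outer loop over sub-trees is also the natural place to distribute work across nodes.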

37 Experimental methodology
Matlab used for implementation.
Geyser hardware per node: four 10-core, 2.4 GHz Intel Xeon E (Westmere EX) processors; 1 TB DDR memory.
Single-node (40 cores) implementations of full-layer parallel, hyper-segmentation, and shallow trees.
PMET = peak memory × execution time
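The PMET metric is a plain product, so it is easy to compute from any pair of measurements. The sketch below also shows one way to obtain such a pair for a Python callable using the standard `tracemalloc` and `time` modules; the units (bytes and seconds) and the helper names are assumptions, and the slides do not say how the measurements were taken in Matlab.

```python
import time
import tracemalloc


def pmet(peak_memory, execution_time):
    """Execution cost as defined on the slide: peak memory times execution time."""
    return peak_memory * execution_time


def measure_pmet(func, *args, **kwargs):
    """Run func once and return (result, peak_bytes, seconds, pmet).

    tracemalloc only sees allocations made through Python's allocator,
    so this is an illustration rather than a full process-level measurement.
    """
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak_bytes, elapsed, pmet(peak_bytes, elapsed)


if __name__ == "__main__":
    _, peak, seconds, cost = measure_pmet(lambda: [0.0] * 1_000_000)
    print(f"peak={peak} B, time={seconds:.4f} s, PMET={cost:.1f} B*s")
```

A lower PMET means an approach pays less in combined memory and time, so a method that runs somewhat slower can still come out ahead if it cuts peak memory enough.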

38 Performance: number of layers M = 9, children/parent J = 4, knots/region r = 32

39 Performance: number of layers M = 9, children/parent J = 4, knots/region r = 64

40 Performance: number of layers M = 10, children/parent J = 4, knots/region r = 32

41 Performance: number of layers M = 11, children/parent J = 4, knots/region r = 25

42 Performance: number of layers M = 11, children/parent J = 4, knots/region r = 40

43 Execution cost - PMET

44 Moving to distributed memory
No distributed computing server for Matlab on Yellowstone.
Hack: MatlabMPI by MIT Lincoln Laboratory.
Uses a file I/O protocol to implement MPI.
There must be a directory visible to every machine.
Python to run MPI and call Matlab functions.
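The idea of message passing through a shared directory can be illustrated with a short sketch. This is not MatlabMPI's code or file layout: the file names, the `send`/`recv` signatures, and the use of `pickle` are all assumptions. It shows the general pattern of writing a data file and then a separate marker file that tells the receiver the data is complete.

```python
import os
import pickle
import tempfile
import time


def _paths(comm_dir, source, dest, tag):
    """Hypothetical naming scheme for one message's data and marker files."""
    base = os.path.join(comm_dir, f"msg_{source}_to_{dest}_tag_{tag}")
    return base + ".buf", base + ".lock"


def send(comm_dir, source, dest, tag, payload):
    """Write the payload, then create a marker file to signal it is complete."""
    buf_path, lock_path = _paths(comm_dir, source, dest, tag)
    with open(buf_path, "wb") as fh:
        pickle.dump(payload, fh)
    # The marker is created only after the data file is fully written.
    open(lock_path, "w").close()


def recv(comm_dir, source, dest, tag, timeout=10.0, poll=0.01):
    """Poll the shared directory for the marker, then read and clean up."""
    buf_path, lock_path = _paths(comm_dir, source, dest, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(lock_path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message from {source} with tag {tag}")
        time.sleep(poll)
    with open(buf_path, "rb") as fh:
        payload = pickle.load(fh)
    os.remove(buf_path)
    os.remove(lock_path)
    return payload


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shared:  # stands in for the shared directory
        send(shared, source=0, dest=1, tag=7, payload={"region": 3, "values": [1.0, 2.0]})
        print(recv(shared, source=0, dest=1, tag=7))
```

The scheme only works if every process sees the same directory, which is why the slide lists a directory visible to every machine as a requirement.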

45 Conclusion
Improvement over existing implementations.
Capable of reducing the memory footprint by a factor of ~3.
Able to increase the maximum size of data set that can be processed by MRA.
The implemented algorithms should, in theory, scale well.

46 Future work
Restructure the data types.
Rewrite the code in a lower-level language to exploit more levels of parallelism.
Potential for GPU implementation.

47 Acknowledgements
Thanks to my mentors: Dorit Hammerling, Raghuraj Prasanna Kumar, Rich Loft.
Thanks to Sophia Chen (high school intern) for the graphics used in this presentation.
Thanks to Patrick Nichols, Shiquan Su, Brian Vanderwende, Davide Del Vento, Richard Valent.
Thanks to all the NCAR administrative staff.

48 Thank you Questions?

