Out-of-core Streamline Generation Using Flow-guided File Layout
Chun-Ming Chen
788 Project

Background
Visualize flow fields with streamlines
Scientific data is huge
– Traditional: compute on clusters
– Drawbacks: high equipment cost, inter-node communication

Background
Nowadays: multi-core CPUs on a single machine
The machine may not have enough memory capacity
Out-of-core computation is needed
– Out-of-core: data cannot be fully loaded into main memory

Goal
Compute streamlines on a lower-cost multi-core machine with limited memory, given arbitrary seeds

Demand Paging Algorithm
Preparation stage:
– Break flow fields into blocks
Streamline generation stage:
– Only load needed blocks during computation
– Release the least recently used (LRU) block when memory is full
[Diagram labels: Load data from disk; Store data in memory pool; Compute; Release data (LRU)]
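
The streamline generation stage above can be sketched as a small block cache. This is a minimal illustration, not the project's implementation: the class name `BlockCache`, the `load_block` callable, and the choice to measure capacity in blocks rather than bytes are all assumptions made for the example.

```python
from collections import OrderedDict


class BlockCache:
    """Demand-paged pool of flow-field blocks with LRU release (illustrative sketch)."""

    def __init__(self, load_block, capacity):
        self.load_block = load_block  # hypothetical callable: block_id -> block data
        self.capacity = capacity      # assumed limit: number of resident blocks
        self.pool = OrderedDict()     # block_id -> data, least recently used first
        self.loads = 0                # number of loads from disk so far

    def get(self, block_id):
        if block_id in self.pool:
            # Hit: mark the block as most recently used.
            self.pool.move_to_end(block_id)
            return self.pool[block_id]
        if len(self.pool) >= self.capacity:
            # Pool is full: release the least recently used block.
            self.pool.popitem(last=False)
        # Miss: load only the block that is actually needed.
        data = self.load_block(block_id)
        self.loads += 1
        self.pool[block_id] = data
        return data
```

With `capacity=2`, requesting blocks 1, 2, 1, 3 in that order releases block 2, because block 1 was touched more recently.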

Multi-core Streamline Computation
[Diagram labels: Job Queue (Seeds for block 1, Seeds for block 2, Seeds for block 3, Seeds for block 4); Threaded Computation; New seeds generated from block 1]
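
One way to read the job-queue diagram on this slide is sketched below: each job is the set of seeds that fall in one block, worker threads trace those seeds, and streamlines that leave the block produce new seeds that are queued as new jobs. The function names `run_jobs` and `toy_trace`, and the one-dimensional toy field, are hypothetical and only serve to make the sketch runnable.

```python
import queue
import threading


def run_jobs(initial_seeds, trace_in_block, num_workers=4):
    """Process per-block seed jobs with worker threads (illustrative sketch).

    initial_seeds: dict mapping block_id -> list of seeds.
    trace_in_block(block_id, seeds) must return (finished, new_seeds), where
    new_seeds maps block_id -> list of seeds that entered that block.
    """
    jobs = queue.Queue()
    finished = []
    lock = threading.Lock()

    for block_id, seeds in initial_seeds.items():
        jobs.put((block_id, seeds))

    def worker():
        while True:
            job = jobs.get()
            if job is None:          # sentinel: no more work
                jobs.task_done()
                return
            block_id, seeds = job
            done, new_seeds = trace_in_block(block_id, seeds)
            with lock:
                finished.extend(done)
            # Queue follow-up jobs before marking this job as done, so that
            # jobs.join() cannot return while work is still being generated.
            for next_block, next_seeds in new_seeds.items():
                jobs.put((next_block, next_seeds))
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    jobs.join()                      # wait for all jobs, including spawned ones
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    return finished


def toy_trace(block_id, seeds, num_blocks=4):
    """Hypothetical 1D field: every seed moves one block to the right per job."""
    done, out = [], {}
    for x in seeds:
        x_new = x + 1.0
        if x_new >= num_blocks:
            done.append(x_new)       # streamline left the domain
        else:
            out.setdefault(int(x_new), []).append(x_new)
    return done, out
```

For example, `run_jobs({0: [0.25], 2: [2.5]}, toy_trace, num_workers=2)` traces both seeds across the four toy blocks until they leave the domain.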

Problem of Out-of-core Computation
Earlier tests: 1 GB data
– Environment: 8-core Intel machine, memory usage limited to 25 MB
– Time generating streamlines: s
– Time loading flow field: s
I/O is the bottleneck

More Tests
Read all blocks in a 6 GB dataset
Unit block size: 16x16x16 floats (49,152 bytes, which corresponds to three 4-byte floats per grid point)
Total: 131,072 blocks
– Random access: sec
– Sequential read: sec
– Reverse-sequential read: sec
Sequential read can be 20 times faster
Reason: disk prefetching
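
A comparison of this kind can be sketched with a small timing helper. This is not the benchmark used for the numbers on the slide; the function name `time_block_reads` is hypothetical, and the block size assumes three 4-byte floats per grid point, which is what 49,152 bytes for a 16x16x16 block works out to. Note that 131,072 blocks of this size total exactly 6 GiB.

```python
import time

# 16 x 16 x 16 grid points, 3 components, 4 bytes each = 49,152 bytes.
# The 3-component float32 layout is an assumption inferred from the block size.
BLOCK_BYTES = 16 * 16 * 16 * 3 * 4


def time_block_reads(path, order):
    """Read whole blocks from `path` in the given block order.

    Returns (elapsed_seconds, bytes_read). `order` is any iterable of block
    indices, for example range(n) for a sequential read, reversed(range(n))
    for a reverse-sequential read, or a shuffled list for random access.
    """
    start = time.perf_counter()
    bytes_read = 0
    with open(path, "rb") as f:
        for block_index in order:
            f.seek(block_index * BLOCK_BYTES)
            bytes_read += len(f.read(BLOCK_BYTES))
    return time.perf_counter() - start, bytes_read
```

Timings from a helper like this are only meaningful when the file is much larger than the operating system's page cache, or when the cache is cleared between runs; otherwise repeated reads are served from memory rather than from disk.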

File Layout
Rearrange data to increase the number of sequential reads
[Figure: Hilbert curve layout]
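
To illustrate how a Hilbert curve orders blocks so that spatial neighbors tend to sit close together in the file, here is a two-dimensional sketch using the standard iterative coordinate-to-index conversion. It is kept in 2D for brevity; a volumetric flow field would need the 3D variant, and the function names `hilbert_index` and `hilbert_layout` are hypothetical.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a 2D Hilbert curve on an n x n grid.

    n must be a power of two, and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_layout(n):
    """Block coordinates of an n x n grid in the order they would be written to file."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))
```

Consecutive entries of `hilbert_layout(n)` are always adjacent grid cells, which is the property that makes a streamline moving between neighboring blocks more likely to trigger a nearby read in the file.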

Result of Scheduling for Hilbert Curve Layout
Scheduler: only read forward
Test: 1 GB data
– Environment: 8-core Intel machine, memory usage limited to 25 MB
Old test:
– Time generating streamlines: s
– Time loading flow field: s
Hilbert layout:
– Time generating streamlines: s
– Time loading flow field: s
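
The slide states only that the scheduler reads forward. One possible interpretation, sketched here under that assumption, is to serve pending block requests in increasing file offset starting from the current read position, and to wrap around to the beginning of the file once nothing further ahead is pending. The function name `forward_read_order` and the wrap-around behavior are assumptions, not details taken from the project.

```python
import bisect


def forward_read_order(pending_offsets, start=0):
    """Order pending block offsets for a forward-only sweep (illustrative sketch).

    Offsets at or after `start` are read first in increasing order; the
    remaining offsets are read in increasing order on the next sweep.
    """
    offsets = sorted(set(pending_offsets))
    split = bisect.bisect_left(offsets, start)
    return offsets[split:] + offsets[:split]
```

For example, with pending offsets 40, 10, 30, 20 and a current position of 25, the sweep reads 30 and 40 first, then wraps around to 10 and 20.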

Layout by Flow Direction

Next and Conclusion
Next:
– Better layout?
– Rearrange data based on flow direction
– NP-hard problem
Conclusion:
– Analyzing large scientific data on a single machine requires out-of-core computation, now and in the future
– A good file layout is important for out-of-core computation