EAVL: Extreme-scale Analysis and Visualization Library. Jeremy Meredith, SDAV Next-Gen Library Meeting, September 2012.


1 EAVL: Extreme-scale Analysis and Visualization Library
Jeremy Meredith, SDAV Next-Gen Library Meeting, September 2012

2 History
Originally an ORNL LDRD
– Jeremy Meredith, Sean Ahern, Dave Pugmire
– plus Rob Sisneros joined as a postdoc
Many hours sitting in conference rooms arguing over things like “what does it mean to have one of your dimensions be unstructured?”
– then determine what to do that’s practical without falling off the data modeling deep end
Exascale focus

3 Approaching the Exascale Problems
Update traditional data model to handle modern simulation codes and a wider range of data.
Investigate how an updated data and execution model can achieve the necessary computational, I/O, and memory efficiency.
Explore methods for visualization algorithm developers to achieve these efficiency gains and better support exascale architectures.

4 Data Modeling Challenges

5 A Traditional Data Set Model
Data Set, one of:
– Rectilinear: Dimensions, 3D Axis Coordinates, Cell Fields, Point Fields
– Structured: Dimensions, 3D Point Coordinates, Cell Fields, Point Fields
– Unstructured: Connectivity, 3D Point Coordinates, Cell Fields, Point Fields

6 Challenge: Non-Physical Data Analysis
Graph Data
– topologically 0D vertices, 1D edges
– non-spatial; storing X/Y/Z values is wasted space
Pure Parameter Studies
– e.g. reaction rate of combustion: FOUR “spatial” dimensions
– e.g. methane concentration vs oxygen concentration vs temperature vs pressure
– more complex reaction → higher dimensionality
[Figure: parameter-space plot with axes oxygen, methane, temperature, pressure]

7 Challenge: Molecular Data (e.g., LAMMPS, VASP)
To represent using vtkPolyData or vtkUnstructuredGrid:
– VTK_VERTEX cells for the atoms
– VTK_LINE cells for the bonds
Any field data must exist on both element types
– Not only inefficient: dummy bond strengths on the atoms? dummy atomic numbers on the bonds?
– But also incorrect: e.g. average(BondStrength) uses dummy values from atoms?
[Figure: molecule of H and C atoms, with BondStr values (1, 2) on the bonds and AtomicNum values (6, 1) on the atoms]

8 Challenge: Side Sets (e.g. Exodus, flux surfaces)
The flow from A to B is defined on a set of faces
The flux variable is defined only on those faces
– do you combine them into a single mesh? That wastes space on dummy values, potentially introducing errors
– or create a separate mesh and lose the mapping info? The mapping is horribly expensive and error-prone to recalculate
[Figure: regions A and B; the flux surface lives inside the volumetric mesh]

9 Challenge: Dimensionality, Refinement (e.g. GenASiS)
(a) seven (or eight) dimensional mesh
– f(x,y,z,ϴ,ϕ,λ,F)=E, plus time
(b) refinement occurs on a per-cell basis
– can’t assume per-block refinement
– sometimes referred to as “unstructured AMR”

10 Challenge: Unique Mesh Topologies (e.g. MADNESS)
MADNESS does not have a traditional mesh
– Just a quad-tree with polynomial coefficients
– Up to 30 refinement levels / tree depth
[Figure: the spatial structure of the quad-tree beside its internal tree representation, with a root and nodes 1 to 5; source: www.vacet.org]

11 Challenge: Very High Order Fields (e.g. MADNESS)
Legendre polynomial series at each tree node
– Each tree node has K^dim coefficients
– K can be up to approx. 20, i.e. 400 coeffs per tree node in 2D, 8000 in 3D
Example with K=3, dim=2 (source: www.vacet.org):
    0.834  0.592  0.003
    0.592  0.003  0.010
    0.003  0.010  0.007
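
The coefficient counts quoted on this slide follow directly from K^dim. A minimal sketch of that arithmetic, where the function name `coefficientsPerNode` is made up for illustration and is not part of MADNESS or EAVL:

```cpp
#include <cassert>
#include <cstdint>

// Number of polynomial coefficients stored at one tree node: K^dim.
// Hypothetical helper for illustration only.
std::uint64_t coefficientsPerNode(std::uint64_t K, unsigned dim)
{
    std::uint64_t n = 1;
    for (unsigned d = 0; d < dim; ++d)
        n *= K;          // one factor of K per spatial dimension
    return n;
}
```

With K=20 this gives 400 coefficients in 2D and 8000 in 3D, matching the slide; the K=3, dim=2 example has 9 entries.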

12 The EAVL Data Model

13 A Traditional Data Set Model (again)
Data Set, one of:
– Rectilinear: Dimensions, 3D Axis Coordinates, Cell Fields, Point Fields
– Structured: Dimensions, 3D Point Coordinates, Cell Fields, Point Fields
– Unstructured: Connectivity, 3D Point Coordinates, Cell Fields, Point Fields

14 The EAVL Data Set Model
Data Set: Cells[], Points[], Fields[]
– CellSet, one of: Explicit (Connectivity), Structured (Dimensions), QuadTree (Tree), Subset (CellList)
– Coords: FieldName, Component
– Field: Name, Association, Values

15 Example: An Unstructured Grid (with interleaved coordinates)
eavlDataSet: Cells[1], Points[1], Fields[1]
– eavlExplicitCellSet: Connectivity: (a bunch of cells)
– eavlCoordinates: FieldName: “c” “c” “c”; Component: 0 1 2
– eavlField: Name: “c”; Association: Points; Values[3*npts]

16 Example: An Unstructured Grid (with separated coordinates)
eavlDataSet: Cells[1], Points[1], Fields[3]
– eavlExplicitCellSet: Connectivity: (a bunch of cells)
– eavlCoordinates: FieldName: “x” “y” “z”; Component: 0 0 0
– eavlField #0: Name: “x”; Association: Points; Values[npts]
– eavlField #1: Name: “y”; Association: Points; Values[npts]
– eavlField #2: Name: “z”; Association: Points; Values[npts]

17 Example: A Curvilinear Grid
eavlDataSet: Cells[1], Points[1], Fields[3]
– eavlStructuredCellSet: RegularStructure: 30 40 30
– eavlCoordinates: FieldName: “x” “y” “z”; Component: 0 0 0
– eavlField #0: Name: “x”; Association: Points; Values[npts]
– eavlField #1: Name: “y”; Association: Points; Values[npts]
– eavlField #2: Name: “z”; Association: Points; Values[npts]

18 Example: A Rectilinear Grid
eavlDataSet: Cells[1], Points[1], Fields[3]
– eavlStructuredCellSet: RegularStructure: 30 40 30
– eavlCoordinates: FieldName: “x” “y” “z”; Component: 0 0 0
– eavlField #0: Name: “x”; Association: LogicalDim0; Values[ni]
– eavlField #1: Name: “y”; Association: LogicalDim1; Values[nj]
– eavlField #2: Name: “z”; Association: LogicalDim2; Values[nk]

19 Example: High-Dimensional Grid
eavlDataSet: Cells[1], Points[1], Fields[5]
– eavlStructuredCellSet: RegularStructure: 30 40 30 4 360
– eavlCoordinates: FieldName: “x” “y” “z” “μ” “ϴ”; Component: 0 0 0 0 0
– eavlField #0: Name: “x”; Association: LogicalDim0; Values[ni]
– eavlField #1: Name: “y”; Association: LogicalDim1; Values[nj]
– eavlField #2: Name: “z”; Association: LogicalDim2; Values[nk]
– eavlField #3: Name: “μ”; Association: LogicalDim3; Values[nμ]
– eavlField #4: Name: “ϴ”; Association: LogicalDim4; Values[nϴ]

20 Example: Geospatial Data
eavlDataSet: Cells[1], Points[2], Fields[3]
– eavlStructuredCellSet: RegularStructure: 30 40
– eavlCoordinates: FieldName: “lat” “lon”; Component: 0 0
– eavlCoordinates: FieldName: “c” “c” “c”; Component: 0 1 2
– eavlField #0: Name: “lat”; Association: LogicalDim0; Values[ni]
– eavlField #1: Name: “lon”; Association: LogicalDim1; Values[nj]
– eavlField #2: Name: “c”; Association: Points; Values[3*npts]

21 Example: Molecular Data
eavlDataSet: Cells[2], Points[1], Fields[3]
– eavlExplicitCellSet #0: Connectivity: the atoms
– eavlExplicitCellSet #1: Connectivity: the bonds
– eavlCoordinates: FieldName: “c” “c” “c”; Component: 0 1 2
– eavlField #0: Name: “c”; Association: Points; Values[3*npts]
– eavlField #1: Name: “atomic number”; Association: Cell Set #0; Values[ncells #0]
– eavlField #2: Name: “bond strength”; Association: Cell Set #1; Values[ncells #1]
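
To make the benefit of per-cell-set field association concrete, here is a minimal sketch using simplified stand-in structs. `Field`, `Molecule`, `averageField`, and `makeEthylene` are hypothetical names for illustration and are not the EAVL classes; the ethylene molecule is an assumed example. Because each field stores values only for its own cell set, averaging bond strength never touches dummy values on atoms.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the EAVL classes.
struct Field {
    std::string name;
    int cellSet;                // index of the cell set the values live on
    std::vector<float> values;  // exactly one value per cell of that cell set
};

struct Molecule {
    std::vector<int> atoms;                  // cell set #0: one vertex cell per atom
    std::vector<std::pair<int, int>> bonds;  // cell set #1: one line cell per bond
    std::vector<Field> fields;
};

// Average of a named field over the cells it is actually defined on.
float averageField(const Molecule &m, const std::string &name)
{
    for (const Field &f : m.fields)
        if (f.name == name && !f.values.empty())
            return std::accumulate(f.values.begin(), f.values.end(), 0.0f) /
                   static_cast<float>(f.values.size());
    return 0.0f;  // field not found or empty
}

// Assumed example: ethylene, two carbons (double bond) and four hydrogens.
Molecule makeEthylene()
{
    Molecule m;
    m.atoms = {0, 1, 2, 3, 4, 5};
    m.bonds = {{0, 1}, {0, 2}, {0, 3}, {1, 4}, {1, 5}};
    m.fields.push_back({"atomic number", 0, {6, 6, 1, 1, 1, 1}});
    m.fields.push_back({"bond strength", 1, {2, 1, 1, 1, 1}});
    return m;
}
```

Here `averageField(makeEthylene(), "bond strength")` averages over the five bonds only, which is the behavior the single-cell-set VTK representation on slide 7 cannot guarantee.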

22 Example: Face-centered Data
eavlDataSet: Cells[2], Points[1], Fields[2]
– eavlExplicitCellSet: Connectivity: volumetric
– eavlAllFacesOfExplicit: Parent: ( ) (refers to the explicit cell set)
– eavlCoordinates: FieldName: “c” “c” “c”; Component: 0 1 2
– eavlField #0: Name: “c”; Association: Points; Values[3*npts]
– eavlField #1: Name: “facevariable”; Association: Cell Set #2; Values[nfaces]

23 Filtering in EAVL

24 Data flow networks in EAVL (or not)
A “Filter” is a stage in a data flow network
– Creates a new data set from an old one
Many operations do not change a mesh structure (assuming data model is sufficiently descriptive)
– Arithmetic expressions: only modifies fields
– External facelist: points and structure remain
– Feature edges: just a new cell set with old points
– Smooth, displace, elevate: only modify coordinates
So: eavlMutator is an alternative to eavlFilter
– Modifies a data set in-place

25 eavlMutator
In-place data set modification
Support for destructive in-place operation
– free memory as you go
Execute multiple mutators simultaneously on the same data set (barring conflicts)
– e.g. displace (coords) + threshold (cells) concurrently
How about data flow network support?
– encapsulate an eavlMutator through an eavlFilterFromMutator facade
Of course, some operations are natively eavlFilters
– can facade through eavlMutatorFromFilter (?)

26 Example: Thresholding an RGrid (a)
Explicit cells can be combined with structured coordinates.
Before:
– eavlStructuredCellSet: RegularStructure: 30 40 30
– eavlCoordinates: FieldName: “x” “y” “z”; Component: 0 0 0
– eavlField #0: Name: “x”; Association: LogicalDim0; Values[ni]
– eavlField #1: Name: “y”; Association: LogicalDim1; Values[nj]
– eavlField #2: Name: “z”; Association: LogicalDim2; Values[nk]
After (coordinates and fields unchanged):
– eavlExplicitCellSet: Connectivity: (a bunch of cells)

27 Example: Thresholding an RGrid (b)
A second Cell Set can be added which refers to the first one
Before:
– eavlStructuredCellSet: RegularStructure: 30 40 30
– eavlCoordinates: FieldName: “x” “y” “z”; Component: 0 0 0
– eavlField #0: Name: “x”; Association: LogicalDim0; Values[ni]
– eavlField #1: Name: “y”; Association: LogicalDim1; Values[nj]
– eavlField #2: Name: “z”; Association: LogicalDim2; Values[nk]
After (original cell set, coordinates, and fields unchanged), plus:
– eavlSubset: Cells: (…); Parent: ( ) (refers to the structured cell set)

28 Example: Structured External Facelist
Add six new subset-cell sets to original mesh
Before:
– eavlStructuredCellSet: RegularStructure: 30 40 30
– eavlCoordinates: FieldName: “x” “y” “z”; Component: 0 0 0
– eavlField #0: Name: “x”; Association: Points; Values[npts]
– eavlField #1: Name: “y”; Association: Points; Values[npts]
– eavlField #2: Name: “z”; Association: Points; Values[npts]
After (original cell set, coordinates, and fields unchanged), plus:
– eavlStructSubset x6: each 30x30 or 30x40

29 Example: Elevating a Structured Grid
No problem-sized data modifications.
– Interleaved and separated coordinates can be used simultaneously.
Before:
– eavlStructuredCellSet: RegularStructure: 30 40
– eavlCoordinates: FieldName: “c” “c”; Component: 0 1
– eavlField #0: Name: “c”; Association: Points; Values[2*npts]
– eavlField #1: Name: “val”; Association: Points; Values[npts]
After (cell set and fields unchanged):
– eavlCoordinates: FieldName: “c” “c” “val”; Component: 0 1 0

30 Example: Elevating a Regular Grid
No problem-sized data modifications.
– Some axes on logical dims, with others on the points.
Before:
– eavlStructuredCellSet: RegularStructure: 30 40
– eavlCoordinates: FieldName: “x” “y”; Component: 0 0
– eavlField #0: Name: “x”; Association: LogicalDim0; Values[ni]
– eavlField #1: Name: “y”; Association: LogicalDim1; Values[nj]
– eavlField #2: Name: “val”; Association: Points; Values[npts]
After (cell set and fields unchanged):
– eavlCoordinates: FieldName: “x” “y” “val”; Component: 0 0 0

31 Dealing With Concurrency

32 Concurrency at Multiple Levels
Distributed Parallelism
– Message passing still works well
– Avoid global communication
    local domain interconnectivity information
– Hybrid (e.g. spatiotemporal) parallelism
Task Parallelism
– Fine-grain dependency tracking
    e.g. displace (coords) + threshold (cells) concurrently
– eavlMutator helps
– single eavlDataSet container class helps
Thread Parallelism
– Fine-grain data parallelism; CUDA, OpenMP

33 Data Parallelism for Developers
Functor + iterator paradigm
Iteration patterns for mesh topologies
CUDA + OpenMP execution back-ends

34 A Simple Data-Parallel Operation
Internal Library API Provides This:
    void CellToCellDivide(Field &a, Field &b, Field &c)
    {
        for_each(i)
            c[i] = a[i] / b[i];
    }
Algorithm Developer Writes This:
    void CalculateDensity(...)
    {
        //...
        CellToCellDivide(mass, volume, density);
    }

35 Functor + Iterator Approach
Internal Library API Provides This:
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T &f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }

    struct Divide
    {
        void operator()(float &a, float &b, float &c)
        {
            c = a / b;
        }
    };
Algorithm Developer Writes This:
    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, Divide());
    }
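
The slide code is pseudocode (`for_each(i)` and `Field` are not defined there). A compilable sketch of the same pattern follows, under the assumption that a field is simply a flat `std::vector<float>`; this illustrates the functor + iterator idea and is not EAVL's actual API. The functor is taken by value here so that a temporary such as `Divide()` can be passed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption for this sketch: a field is a flat array of floats.
using Field = std::vector<float>;

// "Library" side: the iteration pattern, written once.
template <class T>
void CellToCellBinaryOp(const Field &a, const Field &b, Field &c, T f)
{
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        f(a[i], b[i], c[i]);  // the functor decides what happens per cell
}

// A functor supplied by the library.
struct Divide {
    void operator()(const float &a, const float &b, float &c) const
    {
        c = a / b;
    }
};

// "Algorithm developer" side: pick a pattern and a functor.
Field CalculateDensity(const Field &mass, const Field &volume)
{
    Field density;
    CellToCellBinaryOp(mass, volume, density, Divide());
    return density;
}
```

Because the loop lives in one place, a back-end (OpenMP, CUDA) could replace it without touching `Divide` or `CalculateDensity`, which is the point the slide is making.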

36 Custom Functor
Internal Library API Provides This:
    template <class T>
    void CellToCellBinaryOp(Field &a, Field &b, Field &c, T &f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }
Algorithm Developer Writes These:
    struct MyFunctor
    {
        void operator()(float &a, float &b, float &c)
        {
            c = a + 2*log(b);
        }
    };

    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, MyFunctor());
    }

37 Functor Efficiency on CPU and GPU
Data: noise.silo
Surface normal
[Chart not recoverable from the transcript]

38 Binding Values to Functors
    struct ScaleByConst
    {
        float scale;
        ScaleByConst(float s) : scale(s) { }
        void operator()(float &a, float &b)
        {
            b = a * scale;
        }
    };

    void CalculateDensity(...)
    {
        //...
        cell_volume = mesh_volume / mesh_numcells;
        CellToCellUnaryOp(mass, density, ScaleByConst(1.0/cell_volume));
    }

39 Data Parallelism Basics

40 Map with 1 input, 1 output
Simplest data-parallel operation. Each result item can be calculated from its corresponding input item alone.
    struct f
    {
        float operator()(float x) { return x*2; }
    };
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
result:  6 14  0  2  8  0  0  8 10  6  2  0

41 Map with 2 inputs, 1 output
With two input arrays, the functor takes two inputs. You can also have multiple outputs.
    struct f
    {
        float operator()(float a, float b) { return a+b; }
    };
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
y:       2  4  2  1  8  3  9  5  5  1  2  1
result:  5 11  2  2 12  3  9  9 10  4  3  1
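
Both map variants correspond to `std::transform` in standard C++. The sketch below is a serial illustration of the pattern, not EAVL code; the names `mapDouble` and `mapAdd` are chosen here for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Map with 1 input, 1 output: result[i] depends only on x[i].
std::vector<float> mapDouble(const std::vector<float> &x)
{
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), result.begin(),
                   [](float v) { return v * 2; });
    return result;
}

// Map with 2 inputs, 1 output: result[i] depends only on x[i] and y[i].
std::vector<float> mapAdd(const std::vector<float> &x,
                          const std::vector<float> &y)
{
    assert(x.size() == y.size());
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), y.begin(), result.begin(),
                   [](float a, float b) { return a + b; });
    return result;
}
```

Since every output element is independent, the loop inside `std::transform` can be split across threads in any order, which is what makes map the simplest data-parallel primitive.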

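42 Scatter with 1 input (and thus 1 output)
Possibly inefficient, risks of race conditions and uninitialized results. (Can also scatter to larger array if desired.) Often used in a scatter_if-type construct.
No functor.
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
[Figure: each x value is written into the result array at the position given by the matching entry of an indices array; the figure's index and result values are not recoverable from the transcript]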

43 Gather with 1 input (and thus 1 output)
Unlike scatter, no risk of uninitialized data or race condition. Plus, parallelization is over a shorter indices array, and caching helps more, so can be more efficient.
No functor.
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
indices: 1  9  6  9  3
result:  7  3  0  3  1
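
The difference between the two patterns is which side of the assignment is indexed indirectly. A serial sketch, with function names chosen for this example rather than taken from EAVL:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gather: loop over the (short) indices array and read from x.
// Every result element is written exactly once.
std::vector<float> gather(const std::vector<float> &x,
                          const std::vector<int> &indices)
{
    std::vector<float> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i)
        result[i] = x[indices[i]];
    return result;
}

// Scatter: loop over x and write into result at indices[i].
// Positions never named in indices keep their old contents, and
// duplicate indices would race in a parallel version.
void scatter(const std::vector<float> &x,
             const std::vector<int> &indices,
             std::vector<float> &result)
{
    for (std::size_t i = 0; i < x.size(); ++i)
        result[indices[i]] = x[i];
}
```

With the slide's data, `gather(x, {1, 9, 6, 9, 3})` yields 7 3 0 3 1. For scatter, the caller must pre-size and initialize `result`, which is exactly the "uninitialized results" risk the scatter slide warns about.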

44 Reduction with 1 input (and thus 1 output)
Example: max-reduction. Sum is also common. Often a fat-tree-based implementation.
    struct f
    {
        float operator()(float a, float b) { return a>b ? a : b; }
    };
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
result:  7
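
A reduction combines all elements with a binary functor. The sketch below uses the serial `std::accumulate` to show the semantics only; a parallel back-end would apply the same functor in a tree, which requires the operation to be associative. `Max`, `Add`, and `reduceWith` are names invented for this example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Binary functor for a max-reduction, as on the slide.
struct Max {
    float operator()(float a, float b) const { return a > b ? a : b; }
};

// Binary functor for a sum-reduction.
struct Add {
    float operator()(float a, float b) const { return a + b; }
};

// Serial reference reduction: fold f over x, starting from init.
template <class F>
float reduceWith(const std::vector<float> &x, float init, F f)
{
    return std::accumulate(x.begin(), x.end(), init, f);
}
```

For the slide's array, the max-reduction returns 7 and the sum-reduction returns 28.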

45 Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output
Value at result[i] is sum of values x[0]..x[i]. Surprisingly efficient parallel implementation. Basis for many more complex algorithms.
No functor.
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
result:  3 10 10 11 15 15 15 19 24 27 28 28

46 Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output
Initialize with zero, value is sum of only up to x[i-1]. May be more commonly used than inclusive scan.
No functor.
index:   0  1  2  3  4  5  6  7  8  9 10 11
x:       3  7  0  1  4  0  0  4  5  3  1  0
result:  0  3 10 10 11 15 15 15 19 24 27 28
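
The two scans differ only in whether x[i] is included in result[i]. A serial reference sketch follows (a parallel version would use a tree-based algorithm, not shown); the function names are chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Inclusive scan: result[i] = x[0] + ... + x[i].
std::vector<int> inclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    std::partial_sum(x.begin(), x.end(), result.begin());
    return result;
}

// Exclusive scan: result[i] = x[0] + ... + x[i-1], with result[0] = 0.
std::vector<int> exclusiveScan(const std::vector<int> &x)
{
    std::vector<int> result(x.size());
    int running = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        result[i] = running;  // sum of everything before position i
        running += x[i];
    }
    return result;
}
```

The exclusive form is the inclusive result shifted right by one with a leading zero, which is why it directly yields output positions when scanning a 0/1 flag array.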

47 Data Parallelism on Meshes

48 Example: Surface Normal
For each 2D cell (i.e. each polygon):
– Get three adjacent points
– Pair-wise vector subtract
– Cross product
Data-parallel:
– Repeat for all cells

49 Example: Surface Normal
INPUT:
– 3-dimensional coordinates array on the mesh NODES
– example: length = 9
OUTPUT:
– 3-component surface normals array on the mesh CELLS
– example: length = 4
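
The 9-node, 4-cell example can be sketched as a 3x3 grid of nodes forming 2x2 quads. The code below is an illustrative serial version under assumptions of this example (flat unit-spaced grid, counter-clockwise node ordering, hypothetical names `Vec3`, `faceNormal`, `cellNormals`, `makeFlatGrid`); it is not EAVL code.

```cpp
#include <array>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Normal of a polygon from its first three points:
// two pair-wise edge vectors, then their cross product.
Vec3 faceNormal(const Vec3 &p0, const Vec3 &p1, const Vec3 &p2)
{
    Vec3 a{p1.x - p0.x, p1.y - p0.y, p1.z - p0.z};
    Vec3 b{p2.x - p1.x, p2.y - p1.y, p2.z - p1.z};
    return Vec3{a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x};
}

// Node-to-cell pattern: one independent normal per cell.
std::vector<Vec3> cellNormals(const std::vector<Vec3> &nodes,
                              const std::vector<std::array<int, 4>> &cells)
{
    std::vector<Vec3> normals(cells.size());
    for (std::size_t c = 0; c < cells.size(); ++c)
        normals[c] = faceNormal(nodes[cells[c][0]],
                                nodes[cells[c][1]],
                                nodes[cells[c][2]]);
    return normals;
}

// Assumed test mesh: 3x3 nodes in the z=0 plane, 2x2 quads.
void makeFlatGrid(std::vector<Vec3> &nodes,
                  std::vector<std::array<int, 4>> &cells)
{
    for (int j = 0; j < 3; ++j)
        for (int i = 0; i < 3; ++i)
            nodes.push_back(Vec3{float(i), float(j), 0.0f});
    for (int j = 0; j < 2; ++j)
        for (int i = 0; i < 2; ++i)
            cells.push_back({j * 3 + i, j * 3 + i + 1,
                             (j + 1) * 3 + i + 1, (j + 1) * 3 + i});
}
```

Each iteration of the loop in `cellNormals` reads only its own cell's nodes and writes only its own output slot, so the loop is the part a data-parallel back-end would distribute.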

50 Under the Covers: Node-to-Cell on CPU
    void NodeToCellOp3::ExecuteCPU()
    {
        #pragma omp parallel for
        for (int index=0; index < input->NumCells(); index++)
        {
            // get cell node indices
            int nNodes, nodeIds[8];
            float nodeValues[3][8];
            conn.GetCellNodes(index, nNodes, nodeIds);

            // get coordinates for nodes
            for (int i=0; i<nNodes; i++)
            {
                nodeValues[0][i] = array0[nodeIds[i]];
                nodeValues[1][i] = array1[nodeIds[i]];
                nodeValues[2][i] = array2[nodeIds[i]];
            }

            // call functor, writing this cell's outputs
            functor(nodeValues[0], nodeValues[1], nodeValues[2],
                    &out0[index], &out1[index], &out2[index]);
        }
    }

51 Under the Covers: Node-to-Cell on GPU
    void NodeToCellOp3::ExecuteGPU()
    {
        float *d_arr0 = (float*)array0->GetCUDAArray();
        float *d_arr1 = (float*)array1->GetCUDAArray();
        float *d_arr2 = (float*)array2->GetCUDAArray();
        float *d_out0 = (float*)output0->GetCUDAArray();
        float *d_out1 = (float*)output1->GetCUDAArray();
        float *d_out2 = (float*)output2->GetCUDAArray();

        // calculate CUDA thread grid num blocks & threads
        // based on input->NumCells()
        NodeToCellKernel3<<< /* blocks, threads */ >>>(d_arr0, d_arr1, d_arr2,
                                                       d_out0, d_out1, d_out2,
                                                       conn, functor);
    }

52 Under the Covers: The CUDA Kernel
    template <class F>
    __global__ void NodeToCellKernel3(float *array0, float *array1,
                                      float *array2, float *out0,
                                      float *out1, float *out2,
                                      ExplicitConnectivity conn,
                                      F functor)
    {
        // one thread per cell
        const int index = blockIdx.x * blockDim.x + threadIdx.x;

        // get cell node indices
        int nNodes, nodeIds[8];
        float nodeValues[3][8];
        conn.GetCellNodes(index, nNodes, nodeIds);

        // get coordinates for nodes
        for (int i=0; i<nNodes; i++)
        {
            nodeValues[0][i] = array0[nodeIds[i]];
            nodeValues[1][i] = array1[nodeIds[i]];
            nodeValues[2][i] = array2[nodeIds[i]];
        }

        // call functor, writing this cell's outputs
        functor(nodeValues[0], nodeValues[1], nodeValues[2],
                &out0[index], &out1[index], &out2[index]);
    }

53 Writing Algorithms in EAVL

54 Face Surface Normal Functor+Iterator
    struct PolyNormalFunctor
    {
        void operator()(int nvals, int shapetype,
                        float x[], float y[], float z[],
                        float *nx, float *ny, float *nz)
        {
            // get two adjacent edge vectors
            float ax = x[1]-x[0], ay = y[1]-y[0], az = z[1]-z[0];
            float bx = x[2]-x[1], by = y[2]-y[1], bz = z[2]-z[1];

            // calculate their cross product
            *nx = ay*bz - az*by;
            *ny = az*bx - ax*bz;
            *nz = ax*by - ay*bx;
        }
    };

    void CalculateFaceNormals(...)
    {
        eavlExecutor::AddOperation(
            new eavlTopologyMap_3_3(inputcells, EAVL_NODES_OF_CELLS,
                                    xcoord, ycoord, zcoord,
                                    xnormal, ynormal, znormal,
                                    PolyNormalFunctor()));
    }

55 Making it Easy for Developers
Single functor code works on either CPU or GPU
Iteration patterns have multiple back-ends
CPU vs GPU execution choice is made at runtime
EAVL array class automatically supports heterogeneous memory spaces for GPUs
Internal APIs support CUDA if developers want to write their own CUDA kernels

56 Algorithms are Sequences of Operations
(a) calculate face surface normals with node-to-cell operation
(b) average to point normals via cell-to-node operation

57 Example: Threshold

58 Starting Mesh
We want to threshold a mesh based on its density values (shown here).
density, as laid out on the mesh:
    43 47 52 63
    32 38 42 49
    31 37 41 38
index:    0  1  2  3  4  5  6  7  8  9 10 11
density: 43 47 52 63 32 38 42 49 31 37 41 38
If we threshold 35 < density < 45, we want this result:
    43
       38 42
       37 41 38

59 Which Cells to Include?
Evaluate a Map operation with this functor:
    struct InRange
    {
        float lo, hi;
        InRange(float l, float h) : lo(l), hi(h) { }
        int operator()(float x) { return x>lo && x<hi; }
    };
index:    0  1  2  3  4  5  6  7  8  9 10 11
density: 43 47 52 63 32 38 42 49 31 37 41 38
inrange:  1  0  0  0  0  1  1  0  0  1  1  1   (result of InRange())

60 How Many Cells in Output?
Evaluate a Reduce operation using the Add<> functor. We can use this to create output cell length arrays.
index:    0  1  2  3  4  5  6  7  8  9 10 11
inrange:  1  0  0  0  0  1  1  0  0  1  1  1
result (plus): 6

61 Where Do the Output Cells Go?
Input indices, as laid out on the mesh:
    0  1  2  3
    4  5  6  7
    8  9 10 11
Output indices, as laid out on the mesh:
    0
       1  2
       3  4  5
input cell:  0 1 2 3 4 5 6 7 8 9 10 11
output cell: 0 1 2 3 4 5
How do we create this mapping?

62 Create Input-to-Output Indexing?
Exclusive Scan (exclusive prefix sum) gives us the output index positions.
index:     0  1  2  3  4  5  6  7  8  9 10 11
inrange:   1  0  0  0  0  1  1  0  0  1  1  1
startidx:  0  1  1  1  1  1  2  3  3  3  4  5

63 Create Output-to-Input Indexing?
We want to work in the shorter output-length arrays and use gathers. A specialized scatter in EAVL creates this reverse index.
index:     0  1  2  3  4  5  6  7  8  9 10 11
startidx:  0  1  1  1  1  1  2  3  3  3  4  5
revindex:  0  5  6  9 10 11

64 Gather Input Mesh Arrays to Output?
We can now use simple gathers to pull input arrays (density, pressure) into the output mesh.
index:           0  1  2  3  4  5  6  7  8  9 10 11
density:        43 47 52 63 32 38 42 49 31 37 41 38
revindex:        0  5  6  9 10 11
output_density: 43 38 42 37 41 38
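
The whole threshold walk-through (map, reduce, exclusive scan, reverse index, gather) can be strung together as follows. This is a serial sketch of the sequence of primitives, not EAVL's implementation; `ThresholdResult` and `thresholdCells` are hypothetical names, and the "specialized scatter" is written here as a plain guarded loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct InRange {
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) {}
    int operator()(float x) const { return x > lo && x < hi; }
};

// Intermediate and final arrays of the walk-through (hypothetical container).
struct ThresholdResult {
    std::vector<int> inrange, startidx, revindex;
    std::vector<float> density;
};

ThresholdResult thresholdCells(const std::vector<float> &density,
                               float lo, float hi)
{
    const std::size_t n = density.size();
    ThresholdResult r;

    // 1. Map: flag each cell that passes the test.
    InRange f(lo, hi);
    r.inrange.resize(n);
    for (std::size_t i = 0; i < n; ++i)
        r.inrange[i] = f(density[i]);

    // 2. Reduce (add): number of output cells.
    int count = 0;
    for (std::size_t i = 0; i < n; ++i)
        count += r.inrange[i];

    // 3. Exclusive scan: output position for each input cell.
    r.startidx.resize(n);
    int running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        r.startidx[i] = running;
        running += r.inrange[i];
    }

    // 4. Scatter the input index of each kept cell to its output slot.
    r.revindex.resize(count);
    for (std::size_t i = 0; i < n; ++i)
        if (r.inrange[i])
            r.revindex[r.startidx[i]] = static_cast<int>(i);

    // 5. Gather: pull the field into the output-length array.
    r.density.resize(count);
    for (int k = 0; k < count; ++k)
        r.density[k] = density[r.revindex[k]];
    return r;
}
```

Calling `thresholdCells` on the slide's density array with lo=35 and hi=45 reproduces the inrange, startidx, revindex, and output_density rows shown on slides 59 to 64. Step 5 can be repeated for any other cell field (e.g. pressure) with the same `revindex`.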

65 Plans and Status

66 Plans and Status
Now Open Source (BSD “2-clause”)
– Project website at http://ft.ornl.gov/eavl
– Source code, docs at https://github.com/jsmeredith/EAVL
Data Model
– Most infrastructure in place
– Still a few areas not fully fleshed out
– The client-facing API needs work
Execution Model
– Currently imperative style
– Exploring pipeline paths, contract/metadata needs
Productization Status
– More iterators
– More optimization
– Lots of internal cleanup

