Algebraic Manipulation of Scientific Datasets
Bill Howe and David Maier
OGI School of Science and Engineering at Oregon Health and Science University
Portland State University

Environmental Observation and Forecasting on the Columbia River: Sensors, Simulation, Data Products
Gridded Scientific Datasets [Figure: a grid annotated with sample values in degrees C: 12.6, 13.1, 13.2, 12.8, 12.5]
Some CORIE Grids
H = 2d Horizontal Grid
T = 1d Time Grid
V = 1d Vertical Grid
[Figure labels: Mean Sea Level; Underground (not shown)]

Thesis
– Grid topology requires explicit data model support.
– Transformations can be expressed via composition of a few logical operators.
– Performance can be preserved via algebraic optimization and specialized operator implementations.

Roadmap
– Domain Introduction
– Model Introduction
– Conventional Approaches
– Examples of Optimization
– Conclusion

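Grid Topology
– A collection of cells of various dimensions
– Implicit or explicit incidence relationships
[Figure: two 2-cells, A and B, sharing an edge; 1-cells labeled m, n, o, p, q; 0-cells labeled 0, 1, 2, 3]
Incidence, 2-cells to 0-cells: (A,0), (A,1), (A,3), (B,1), (B,2), (B,3)
Incidence, 1-cells to 0-cells: (m,0), (m,1), (n,1), (n,2), ...
2-Cells = {A,B}
1-Cells = {m,n,o,p,q}
0-Cells = {0,1,2,3}
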
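Grid Properties
– Topology <> geometry: topology is distinct from geometry
– A grid may contain cells of multiple dimensions and multiple “shapes”
– The dimension of a grid is the maximum dimension of its cells
[Figure: the two-cell grid (A, B; m, n, o, p, q) extended with an additional 0-cell 4 and 1-cell r]
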
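
The grid definition above can be sketched in a few lines of Python. This is an illustrative representation only, not the authors' implementation: the `Grid` class, its field names, and the `incident_to` helper are assumptions made for the example. Cell labels follow the two-triangle figure, and only the incidence pairs legible on the slide are included.

```python
from dataclasses import dataclass, field


@dataclass
class Grid:
    # cells[k] is the set of identifiers of the k-dimensional cells
    cells: dict
    # incidence holds (higher-dimensional cell, lower-dimensional cell) pairs
    incidence: set = field(default_factory=set)

    def dimension(self):
        """The dimension of a grid is the maximum dimension of its cells."""
        return max(k for k, ids in self.cells.items() if ids)

    def incident_to(self, cell):
        """Lower-dimensional cells recorded as incident to `cell`."""
        return {low for high, low in self.incidence if high == cell}


# Cell and edge labels as in the slide's figure; geometry is deliberately absent.
two_triangles = Grid(
    cells={0: {0, 1, 2, 3}, 1: {"m", "n", "o", "p", "q"}, 2: {"A", "B"}},
    incidence={
        ("A", 0), ("A", 1), ("A", 3), ("B", 1), ("B", 2), ("B", 3),
        ("m", 0), ("m", 1), ("n", 1), ("n", 2),  # remaining edge pairs elided on the slide
    },
)

print(two_triangles.dimension())                # 2
print(sorted(two_triangles.incident_to("A")))   # [0, 1, 3]
```

Nothing in this structure stores coordinates, which is the point of the "topology <> geometry" bullet: positions would be attached later as ordinary data.
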
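GridField: Grid with Bound Data
– Tuples of numeric primitives
– Total functions over cells of dimension k
– Two gridfields may share a grid
[Figure: one grid with two bound datasets: a table with attributes (x, y, salt, temp), one row per node x1..x4, and a table with attributes (flux, area)]
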
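
A minimal sketch of the gridfield idea, assuming a plain dictionary for the grid and per-cell dictionaries for the tuples. The `GridField` class, its constructor arguments, and all numeric values are hypothetical. The constructor enforces the "total function" requirement by rejecting data that leaves any k-cell without a tuple.

```python
class GridField:
    """Data bound to the k-cells of a grid (illustrative sketch only)."""

    def __init__(self, grid, k, data):
        # grid: dict mapping dimension -> list of cell ids
        # data: dict mapping each k-cell id -> dict of numeric attributes
        missing = set(grid[k]) - set(data)
        if missing:
            # A gridfield is a total function: every k-cell needs a tuple.
            raise ValueError(f"no data bound to {k}-cells: {sorted(missing)}")
        self.grid, self.k, self.data = grid, k, dict(data)


grid = {0: [0, 1, 2, 3], 2: ["A", "B"]}

# Hypothetical values; the attribute names follow the slide.
node_data = GridField(
    grid, 0, {n: {"salt": 30.0 - n, "temp": 12.5 + 0.1 * n} for n in grid[0]}
)
cell_data = GridField(
    grid, 2, {"A": {"flux": 1.5, "area": 0.5}, "B": {"flux": 0.7, "area": 0.4}}
)

# Two gridfields may share a grid: both refer to the same grid object.
print(node_data.grid is cell_data.grid)  # True
```
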
1) Modeling with Relations
– Trivial join dependency embedded in the key
– Decomposition won’t help
– No notion of “grid”
Node Data (x, y, t, salt, temp): rows keyed by (x1,y1) through (x4,y4), with the same four nodes appearing again in later rows
Cell Data (cid, flux, area): rows a, b, c
Incidence (cid, x, y): (a, x1, y1), (a, x2, y2), (a, x4, y4), (b, x2, y2), ...

2) Spatial Extensions
– Incidence relationship dependent on geometry rather than topology
– Geometry information redundantly defined in nodes and cells
– No concept of a “grid”: impedance mismatch with visualization applications
Node Data (Node::Point, t, salt, temp): rows keyed by Point(x1,y1), Point(x2,y2), Point(x3,y3), Point(x4,y4), with the same four points appearing again in later rows
Cell Data (Cell::Polygon, flux, area): Polygon(Point(x1,y1),…), Polygon(Point(x2,y2),…), Polygon(Point(x1,y1),…)

3) Visualization Libraries
– Different algorithms, each dependent on data characteristics
– Programmer’s responsibility to match algorithms with data
– Logical equivalences are obscured
Grid restriction: restrict
With VTK: vtkExtractGeometry, vtkThreshold, vtkExtractGrid, vtkExtractVOI, vtkThresholdPoints

Operators
Task                               Operator
associate grids with data          bind (b)
combine grids topologically        union, intersection, cross product (⊗)
reduce a grid using data values    restrict (r)
transform grids or data            aggregate (a)

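
As an illustration of combining grids topologically, the sketch below builds a cross product of two small grids. It assumes the usual product-of-cell-complexes rule, in which an i-cell paired with a j-cell yields an (i + j)-cell; the slides do not spell out the operator's definition, so treat this rule, the `cross` function, and the cell labels as assumptions.

```python
from itertools import product


def cross(g1, g2):
    """Pair every cell of g1 with every cell of g2.

    Grids are dicts mapping dimension -> list of cell ids. An i-cell paired
    with a j-cell is filed under dimension i + j (assumed product rule).
    """
    out = {}
    for (i, cells_i), (j, cells_j) in product(g1.items(), g2.items()):
        out.setdefault(i + j, []).extend(product(cells_i, cells_j))
    return out


# One triangle (its edges omitted for brevity) and a two-level vertical grid.
H = {0: ["h0", "h1", "h2"], 2: ["A"]}
V = {0: ["z0", "z1"], 1: ["e"]}

HV = cross(H, V)
print(max(HV))   # 3
print(HV[3])     # [('A', 'e')]
```

Under this rule a 2d horizontal grid crossed with a 1d vertical grid yields a 3d grid whose top-dimensional cells are prisms, here the single pair of triangle A with segment e.
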
Restrict Semantics
restrict(<24)
[Figure: the result of restrict(<24) in two cases: values bound to 0-cells (nodes), and values bound to 2-cells (triangles)]

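
The two cases can be sketched as follows. The slide's figure is not recoverable here, so the rules below are one plausible reading and are labeled as assumptions: when values are bound to nodes, a triangle survives only if all of its nodes survive; when values are bound to triangles, the surviving triangles keep their incident nodes. Function names and values are hypothetical.

```python
def restrict_nodes(triangles, node_values, pred):
    """Restrict on 0-cells: drop failing nodes, then drop any triangle
    that lost a node (assumed rule, so the result is still a grid)."""
    kept_nodes = {n for n, v in node_values.items() if pred(v)}
    kept_tris = {t: ns for t, ns in triangles.items() if set(ns) <= kept_nodes}
    return kept_nodes, kept_tris


def restrict_cells(triangles, cell_values, pred):
    """Restrict on 2-cells: keep passing triangles and the nodes they touch
    (assumed rule)."""
    kept_tris = {t: ns for t, ns in triangles.items() if pred(cell_values[t])}
    kept_nodes = {n for ns in kept_tris.values() for n in ns}
    return kept_nodes, kept_tris


triangles = {"A": (0, 1, 3), "B": (1, 2, 3)}
node_temp = {0: 22.0, 1: 23.5, 2: 25.0, 3: 21.0}   # hypothetical values
cell_temp = {"A": 22.0, "B": 23.0}                 # hypothetical values


def below_24(v):
    return v < 24


print(restrict_nodes(triangles, node_temp, below_24)[1])   # {'A': (0, 1, 3)}
print(sorted(restrict_cells(triangles, cell_temp, below_24)[1]))  # ['A', 'B']
```

With these values the same predicate removes triangle B when the data sits on nodes (node 2 fails) but keeps it when the data sits on triangles, which is why the dimension the data is bound to matters.
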
Working With GridFields
Inputs: H : (x,y,b) and V : (z)
Operator sequence: ⊗, r(z>b), b(s), r(region), render
Intermediate results: (H ⊗ V); r(H ⊗ V); b(r(H ⊗ V)); r(b(r(H ⊗ V)))
Figure label: “wetgrid”

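
The plan above can be traced with a small sketch that treats each gridfield as a list of node tuples and ignores higher-dimensional cells entirely. The helper names (`cross`, `restrict`, `bind`), the `salt_lookup` stand-in for reading simulation output, and every numeric value are assumptions made for illustration; rendering is omitted.

```python
from itertools import product

# Hypothetical inputs: b is the bottom elevation at each horizontal node.
H = [{"x": 0.0, "y": 0.0, "b": -10.0}, {"x": 1.0, "y": 0.0, "b": -4.0}]
V = [{"z": -8.0}, {"z": -2.0}]


def cross(g1, g2):
    """Pair every tuple of g1 with every tuple of g2 (nodes only)."""
    return [{**a, **b} for a, b in product(g1, g2)]


def restrict(pred, gf):
    """Keep the tuples that satisfy the predicate."""
    return [t for t in gf if pred(t)]


def bind(name, lookup, gf):
    """Attach a new attribute to every tuple."""
    return [{**t, name: lookup(t)} for t in gf]


def salt_lookup(t):
    # Stand-in for fetching salinity from simulation output (hypothetical).
    return 30.0 + t["z"]


wet = restrict(lambda t: t["z"] > t["b"], cross(H, V))     # r(z>b)(H x V)
with_salt = bind("s", salt_lookup, wet)                    # b(s)
region = restrict(lambda t: t["x"] < 0.5, with_salt)       # r(region)

print(len(wet))                      # 3
print([t["s"] for t in region])      # [22.0, 28.0]
```

Each step consumes and produces the same kind of object, which is what allows the operators to be composed and later reordered.
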
Optimize: Push Restricts
– salt, temp defined on G
– Materialize pointers to elements of salt, temp
– Bind salt, temp to a subgrid of G, G'
On G: salt = (s1, s2, s3, s4, s5, ...), temp = (t1, t2, t3, t4, t5, ...)
On G': salt' = (s1, s3, s5, ...), temp' = (t1, t3, t5, ...)
[Plan figure: r(p(x,y)) applied to H : (x,y,b) and r(p(z)) applied to V : (z), followed by r(z>b) and b(s)]

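
A small sketch of both ideas on this slide, under simplifying assumptions: grids are reduced to lists of nodes, the predicates `p_xy` and `p_z` are hypothetical, and all values are invented. The first part checks that restricting H and V before the cross product gives the same result as restricting afterwards. The second part materializes pointers (indices into the arrays defined on G) so that salt and temp can be bound to the subgrid G' without touching the discarded elements.

```python
from itertools import product

H = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]   # (x, y) per horizontal node
V = [-8.0, -5.0, -2.0]                     # z per vertical node


def p_xy(h):
    return h[0] <= 1.0


def p_z(z):
    return z > -6.0


# Restrict after the cross product: every pair is built, then filtered.
late = [(h, z) for h, z in product(H, V) if p_xy(h) and p_z(z)]

# Restrict first: each predicate reads only one input, so it can be pushed down.
early = list(product([h for h in H if p_xy(h)], [z for z in V if p_z(z)]))

print(late == early)   # True
print(len(early))      # 4

# Pointer materialization: G' keeps elements 1, 3, 5 of G, as on the slide.
salt = [31.0, 30.5, 29.8, 29.1, 28.7]      # s1..s5 on G (hypothetical)
temp = [12.6, 13.1, 13.2, 12.8, 12.5]      # t1..t5 on G (hypothetical)
in_subgrid = [True, False, True, False, True]
pointers = [i for i, keep in enumerate(in_subgrid) if keep]

salt_sub = [salt[i] for i in pointers]     # salt' on G'
temp_sub = [temp[i] for i in pointers]     # temp' on G'
print(salt_sub)        # [31.0, 29.8, 28.7]
```

The predicate z > b reads attributes from both inputs, so it cannot be pushed below the cross product in the same way; the plan on the slide keeps it above.
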
Optimization Results
Horizontal Slice
Plan 1: H(x,y,b), V(z); r(z>b); b(s); slice
Plan 2: H(x,y,b); r(z>b); b(s); apply

Transect (Vertical Slice)
[Plan figure: H(x,y,b) and V(z) combined; r(z>b); b(s); then “join” with P; figure labels P and P V]

Transect (Vertical Slice)
[Plan figure: operands V(z), P, and H(x,y,b), with two “join” steps and b(s); figure labels A, B, C, P]

Transect Optimizations