
1 Cross Disciplinary Applications of Multiplex Observational and Computational Datasets Using for Archiving and High Performance Processing
Marcel Ritter, Werner Benger, Joseph Stoeckl, Donna Delparte, Mike Folk, Quincey Koziol, Frank Steinbacher and Markus Aufleger
Center for Computation & Technology, ASTRO@UIBK

2 Outlook
Motivation
Requirements on a Data Format
Introduction HDF5
F5
– Introduction
– Examples of Data Sets
Application Example:
– The Hawaiian Geospatial Data Repository
Conclusion

3 Motivation Workgroup A Workgroup B Workgroup C Workgroup D Software 3 Software 4 Scientific Collaboration

4 Motivation Workgroup A Workgroup B Workgroup C Workgroup D Software Tool 1 Software 3 Software 4 Software Tool 2 File Format 2 Scientific Collaboration File Format 1

5 Motivation Workgroup A Workgroup B Workgroup C Workgroup D Software Tool 1 Software 3 Software 4 Software Tool 2 File Format 2 File Format 1 Data Exchange

6 Motivation Workgroup C Workgroup D Software 3 Software 4 File Format 2 File Format 1 File Format 3 File Format 4 File Format 5 … File Format N

7 Motivation Workgroup C Workgroup D Software 3 Software 4 File Format 2 File Format 1 File Format 3 File Format 4 File Format 5 … File Format N Huge Implementation Effort O(N²)

8 Motivation Workgroup C Workgroup D Software 3 Software 4 File Format 2 File Format 1 File Format 3 File Format 4 File Format 5 … File Format N Common Data Format Common Data Format Less Implementation Effort O(N)
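The scaling argument on these two slides can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one converter per direction between two formats; the function names are made up for illustration.

```python
def pairwise_converters(n_formats: int) -> int:
    """Converters needed when every format is translated directly into every other one."""
    return n_formats * (n_formats - 1)


def common_format_converters(n_formats: int) -> int:
    """Converters needed when each format only has a reader and a writer for one common format."""
    return 2 * n_formats


# Compare both strategies for a growing number of file formats.
for n in (2, 5, 10, 50):
    print(f"N={n:>2}: pairwise={pairwise_converters(n):>4}, "
          f"common format={common_format_converters(n):>3}")
```

Under this assumption the pairwise count grows quadratically while the common-format count grows linearly, which is the point of the O(N²) versus O(N) labels.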

9 Motivation Workgroup A Workgroup B Workgroup C Workgroup D Software 3 Software Tool 1 Software 4 Software Tool 2 Common Data Format Common Data Format Easier collaboration More time for science

10
– Easy to read and write
– Fast and efficient
– Hold huge data sets (terabytes)
– Multiple operating systems
– Hold huge variety of data
– Store meta information of the data
– Self-descriptive
– Well-documented, active support and community
– Sustainable (still easily accessible in >10 years)

11 Hierarchical Data Format 5 http://www.hdfgroup.org/HDF5

12 A Few Analogies
– File system (in a file)
– Binary XML file
– PDF for numerical data
– Database (container for array variables)

13 Relationships
Group: /
Dataset:
lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6
Attributes: Parameters 10;100;1000, Timestep 36,000
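The three object kinds on this slide (group, dataset, attribute) can be tried out with the h5py Python bindings. This is a minimal sketch, not taken from the presentation: the file name, the group name "station", the dataset name "measurements" and the choice to attach both attributes to the dataset are assumptions made for illustration.

```python
import h5py
import numpy as np

# In-memory HDF5 file (core driver, nothing is written to disk); the name is a placeholder.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    # Group: a container, comparable to a directory. The name "station" is hypothetical.
    grp = f.create_group("station")

    # Dataset: the lat/lon/temp table from the slide, stored as a compound-type array.
    row = np.dtype([("lat", "i4"), ("lon", "i4"), ("temp", "f4")])
    table = np.array([(12, 23, 3.1), (15, 24, 4.2), (17, 21, 3.6)], dtype=row)
    dset = grp.create_dataset("measurements", data=table)

    # Attributes: small pieces of metadata attached to an object (here: the dataset).
    dset.attrs["Parameters"] = [10, 100, 1000]
    dset.attrs["Timestep"] = 36000

    # Read everything back through its path, as in a file system.
    stored = f["/station/measurements"]
    lats = stored["lat"].tolist()
    parameters = stored.attrs["Parameters"].tolist()
    timestep = int(stored.attrs["Timestep"])

print(lats, parameters, timestep)
```

The data and its metadata travel together in one file, which is what makes the format self-descriptive.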

14 What Users Get…
A multi-platform library and tools built on over 10 years of experience in large data handling from the high performance computing (HPC) community.
A capability that:
– Lets them organize large and/or complex collections of data
– Gives them efficient and scalable data storage and access
– Lets them integrate a wide variety of types of data and data sources
– Guarantees long-term data integrity and preservation

15 Shapefiles: HDF5 as container format Browser application

16 Shapefiles: HDF5 as container format Browser application Vector data Pixel data Attribute data

17 More Applications
– Billions of elements / dozens of associated values
– Earth Science (Earth Observing System)
– Big simulations
– Movie Making
– Flight Testing

18

19 Fiber Bundle Data Model http://www.fiberbundle.net

20 Based on HDF5
Inspired by concepts of:
– Topology
– Differential Geometry
– Geometric Algebra
Separation of Geometry (Grids) and Datafield (Fields)

21 Hierarchical Structure:

22 Visible to the end user Hierarchical Structure:

23

24 Multi Channel – Multi Resolution Images:

25 Time Grid Topology Representation Field [Datatype]
/1.4/Satellite/VertexRefinement1x1/Cartesian/Positions [uniform-grid]
/RGB [byte,byte,byte]
/N-IR [float64]
/T-IR [float64]
/VertexRefinement2x2/Cartesian/Positions
/RGB “
/N-IR
/T-IR
/1.6/ …
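The five-level layout named in the slide header (Time, Grid, Topology, Representation, Field) can be illustrated by splitting a full path into its levels. This is a sketch under the assumption that a complete path always has exactly these five components; `F5Path` and `parse_f5_path` are hypothetical helpers and not part of the F5 library.

```python
from typing import NamedTuple


class F5Path(NamedTuple):
    """The five hierarchy levels named in the slide header."""
    time: str
    grid: str
    topology: str
    representation: str
    field: str


def parse_f5_path(path: str) -> F5Path:
    """Split a full '/Time/Grid/Topology/Representation/Field' path into its levels."""
    parts = path.strip("/").split("/")
    if len(parts) != 5:
        raise ValueError(f"expected 5 levels, got {len(parts)}: {path!r}")
    return F5Path(*parts)


# Full path taken from the slide.
p = parse_f5_path("/1.4/Satellite/VertexRefinement1x1/Cartesian/Positions")
print(p.time, p.grid, p.topology, p.representation, p.field)
```

Because every dataset sits at the same depth, a tool can locate, for example, all fields of one grid at one time step without format-specific knowledge.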

26 Full Waveform LIDAR: t1, t2, t3, t_emission

27 Full Waveform LIDAR: Laser Data
Time Grid Topology Representation Field [Datatype]
/CorseTime/LASER/POINTS/CartesianCoords/Positions [point3D]
/TimeStamp [float64]
/Waveform [uint16,uint16]
/Reflectance [float32]
/SHOTS /SHOTSAsPOINTS/Positions vlen[uint32]
/Origin [point3D]
/Direction [vector3D]
/EmissionTime [float64]
t1, t2, t3, t_emission

28 Full Waveform LIDAR: Airplane Data
/CorseTime/PLANE/POINTS/CartesianCoords/Positions [point3D]
/Rotation [rotor3D]
/TimeStamps [float64]

29 Benefits
Bringing together in F5:
– Satellite data
– LIDAR
– Shapefiles
Features of HDF5
– Sustainable storage
– Meta data
– Compression
– Parallel IO
– Hyperslab access
Consistent data organization of simple and complex spatial-temporal data
Handle time series of data easily
Make tools of other disciplines applicable to the geoscience community, such as astrophysics imaging mosaic tools for satellite data: Montage, http://montage.ipac.caltech.edu
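Two of the HDF5 features listed here, compression and hyperslab access, can be shown together with h5py. This is a minimal sketch using synthetic data; the file name, dataset name, chunk shape and gzip level are arbitrary assumptions rather than settings from the presentation.

```python
import h5py
import numpy as np

# Synthetic 1000 x 1000 raster standing in for one image channel.
image = np.arange(1_000_000, dtype="f8").reshape(1000, 1000)

# In-memory file so the example leaves nothing on disk.
with h5py.File("raster.h5", "w", driver="core", backing_store=False) as f:
    # Chunked, gzip-compressed dataset.
    dset = f.create_dataset(
        "channel", data=image, chunks=(100, 100),
        compression="gzip", compression_opts=4,
    )

    # Hyperslab read: only the chunks overlapping the requested
    # 50 x 200 window have to be read and decompressed.
    window = dset[100:150, 300:500]
    compression = dset.compression

print(window.shape, compression, float(window[0, 0]))
```

Reading a window this way avoids loading the whole array, which matters once a single dataset no longer fits in memory.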

30 Application Example http://www.epscor.hawaii.edu

31 Goal: Centralized integrative capability to store and manage access to massive (terabytes) research datasets
Users:
– University of Hawaii research teams
– Broad statewide research community
Objectives:
– Collect, store and manage access to data
– Utilize user portals
– Utilize and link to the Maui High Performance Computing Center (MHPCC)
Mission: Discovery, manipulation, fusion and visualization

32 Geospatial Information and Mass Storage

33 Geospatial Information and Mass Storage: How to manage and store large complex datasets?!!

34

35

36 A common data format eases collaboration and reduces the time wasted on data conversions.
Data formats for sustainable, transparent storage of huge and complex data exist; one just has to use them.
– captures observational and simulation data consistently.
Geoscience repositories, such as the can be built upon this format.

37 References:
http://www.hdfgroup.org/HDF5
http://www.fiberbundle.net
http://www.epscor.hawaii.edu
http://montage.ipac.caltech.edu
http://sciviz.cct.lsu.edu

38 HDFView screenshot of shapefiles

39 Geospatial Information and Mass Storage
– Weather station data; marine buoy sensor data; GPS data collection; database datasets, Excel files; spatial data (imagery, LiDAR, GIS)
– Geoweb application services (WMS, WFS, WPC); database management; data streaming; data storage of statewide datasets
– Upload and download capability; metadata search capacity; visualization of spatial and non-spatial datasets
– Access to HPC services: real-time modeling and analysis

40 Grid – Manifold describing the base space Topology Refinement level Coordinate representation Vertex positions in representation

