Big Applications: Simulations, Models, Visualization, … Scientific data management for big computers and big data HDF5 (serial.

Presentation transcript:

Big Applications: Simulations, Models, Visualization, …

Scientific data management for big computers and big data

HDF5 (serial and/or parallel)
Parallel UDM

Software Stacks

Applications and readers, often customized for particular technical fields, enable users to create, manipulate, and view scientific and engineering data. With the support of intervening libraries, common interfaces, and HDF5, scientists and engineers in many fields are able to share data and software.

Specialized libraries and common interfaces use the HDF5 layer for data management and often provide specialized metadata, context, and tools for data transformations and exchange.

The HDF5 layer provides many data management functions, including machine-independent storage of all datatypes, metadata describing datatypes, user-defined attributes, and sophisticated subsetting and subsampling capabilities. Parallel HDF5 uses MPI-IO to provide parallel file system functionality and global file access.

Stack diagram labels: SAF, LibSheaf, HDF-EOS; Readers; Common Interfaces.
Examples: thermonuclear simulations, product modeling, data mining tools, visualization tools, climate models, IDL.

Storage: HDF5 virtual file layer (I/O drivers)
MPI I/O: file on parallel file system
Stdio: file
Split Files: split metadata and raw data files
Custom: user-defined device (?)

Virtual File Layer

The HDF5 VFL, or virtual file layer, provides access to many different data input and output mechanisms. The standard (stdio), split, and MPI drivers read from and write to files on storage media; the stream driver reads and writes virtual files or streams of data. The VFL also enables the creation of custom drivers, such as the stream driver, for specialized or user-defined situations.
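The driver idea behind the virtual file layer can be illustrated with a small, library-independent sketch. The Python below is not the HDF5 API: the class names (Driver, FileDriver, StreamDriver) and the save/load helpers are hypothetical, chosen only to show how the same higher-level code can sit on top of interchangeable I/O back ends, one writing to a file on storage media and one writing to an in-memory stream.

```python
import io
import os
import tempfile
from abc import ABC, abstractmethod


class Driver(ABC):
    """Hypothetical minimal driver interface: write bytes, read them back."""

    @abstractmethod
    def write(self, data):
        ...

    @abstractmethod
    def read(self):
        ...


class FileDriver(Driver):
    """Stores bytes in an ordinary file on disk."""

    def __init__(self, path):
        self.path = path

    def write(self, data):
        with open(self.path, "wb") as f:
            f.write(data)

    def read(self):
        with open(self.path, "rb") as f:
            return f.read()


class StreamDriver(Driver):
    """Stores bytes in an in-memory stream instead of a file."""

    def __init__(self):
        self.stream = io.BytesIO()

    def write(self, data):
        # Replace any previous contents of the stream.
        self.stream.seek(0)
        self.stream.truncate()
        self.stream.write(data)

    def read(self):
        return self.stream.getvalue()


def save(driver, values):
    """Higher-level code: it does not care which driver is underneath."""
    driver.write(",".join(str(v) for v in values).encode("ascii"))


def load(driver):
    """Read the values back through whichever driver was supplied."""
    return [int(v) for v in driver.read().decode("ascii").split(",")]


# The same save/load calls work with either back end.
stream_driver = StreamDriver()
save(stream_driver, [1, 2, 3])
stream_result = load(stream_driver)

with tempfile.TemporaryDirectory() as tmp:
    file_driver = FileDriver(os.path.join(tmp, "data.txt"))
    save(file_driver, [1, 2, 3])
    file_result = load(file_driver)
```

A custom back end would be added in this sketch by writing one more Driver subclass; save and load would not need to change.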
Stream: across the network or to/from another application or library

Representative Technical Fields* in which HDF5 Is Used
* from selected HDF5 download registrations, 15 October 2001 through 22 February 2002

Tools

Various tools provide means of accessing HDF5 files, including the data, metadata, and hierarchical structure, without having to write new software. HDFView, illustrated at the top of this image, displays the structure of a simple HDF5 file in one panel, raw data in another, and, if appropriate, an image or portion of it in a third. The larger image is the full, independently generated gravity wave image.

HDF5 runs on almost all computers, including many parallel computers

Lawrence Livermore National Laboratory
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

Matter & the universe
Weather and climate

A 15-projector display wall (resolution 6400 x 3072) for viewing interactive applications and pre-computed animations at Lawrence Livermore National Laboratory.

Total Column Ozone (Dobson): August 24, 2001 and August 24, 2002

Answering big questions … involves big data … on big computers.

The ASCI White system contains 8,192 interconnected processors. Its 6.2 terabyte (trillion byte) memory is about 97,000 times that of a 64-MB PC. Its 7,000 disk drives, with 160 terabytes of storage space, have about 16,000 times the storage capacity of a desktop computer with a 10-GB hard disk.

Life and nature

How do we… Describe big data? Store it? Find it? Share it? Mine it? Move it into, out of, and between computers?
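The ASCI White comparisons above can be checked with a few lines of arithmetic. This sketch assumes decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes, 1 TB = 10^12 bytes), which is what "trillion byte" suggests; with binary units the memory figure would come out somewhat differently.

```python
# Assumption: decimal units, matching the "trillion byte" wording.
MB, GB, TB = 10**6, 10**9, 10**12

asci_memory = 6200 * GB   # 6.2 terabytes of memory
pc_memory = 64 * MB       # a 64-MB PC
asci_disk = 160 * TB      # 160 terabytes of disk storage
pc_disk = 10 * GB         # a 10-GB hard disk

memory_ratio = asci_memory / pc_memory
storage_ratio = asci_disk / pc_disk

print(round(memory_ratio))   # 96875, i.e. about 97,000
print(round(storage_ratio))  # 16000
```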
A file format and software to describe, organize, store, share, and access big data:

Store large, complex scientific and engineering data sets
Retrieve complete data or partial data, easily and quickly
Enable parallel I/O, remote access, specialized access

A free, open standard developed by NCSA and the Lawrence Livermore, Sandia, and Los Alamos National Laboratories, with additional support from NASA

The name HDF5 derives from the term hierarchical data format. An HDF5 file is a hierarchically structured set of groups, datasets, and metadata.

Density gradient in the plasma causes the laser beam to self-focus and then split up into several "filaments". Simulation of a NIF laser beam passing through a plasma. Simulation by Bert Still, Visualization by Steve Langer, LLNL

HDF5 File Structure

Copyright 2002 by the Board of Trustees of the University of Illinois

HDF5

Courtesy of Arthur Mirin, LLNL

University of Illinois, NASA, National Science Foundation, DOE SciDAC, LANL, LLNL, SNL, TriLab, NASA

Visualization courtesy of John Shalf, NERSC/Lawrence Berkeley Laboratory, using data computed on the NERSC SP2 by Dennis Pollney and the Cactus Team, Albert Einstein Institute

Technical fields: Aerospace; Agricultural research; Air traffic control; Aircraft emissions database; Applied mathematics; Astrophysics; Astrophysics / supernovae; Atmospheric chemistry; Atmospheric physics; Bioengineering; CEM Simulation; Climatology / hydrology; Computational fluid dynamics; Computational physics; Computational physics / education; Computational physics and computational astrophysics; Computer modeling; Computer science; Data processing; Earth observation / atmospheric science; Earth science; Environment; Fast searching, sorting and retrieval; Film making special effects; Fluid mechanics; GIS; Geodetic Science; Geology; Gravitational physics; Hydrology; Information technology; Magnetic mass spectrometer development; Marine biology / ecology; Materials science; Meteorological data products; Meteorology; Microscopy; Molecular biology; Nano device simulation; Neutron scattering; Ocean color; Ocean remote sensing; Optics / optoelectronics; Petroleum engineering; Photonic band gap studies; Photonic crystals; Photonics; Post-fire erosion analysis; Protein crystallography, molecular modeling; Protostellar accretion discs; Remote sensing; SAR processing; Satellite / weather radar remote sensing; Satellite oceanography; Semiconductor process simulation; Software engineering, distributed systems; Space geodesy; Space physics; Surface water flow and sediment transport; Theoretical chemistry; Visualization; Volcanology; Water resources management; X-ray physics

Computers and operating systems include: MacOS X, MS Windows, UNIX, Linux, FreeBSD, OSF1, HP-UX, IBM SP, SGI IRIX64, Cray T3E, Cray SV1, Sun Solaris, IA-32 and IA-64

Clusters and high performance computers include: ASCI Red, ASCI Blue Mountain, ASCI Blue Pacific, ASCI White, various experimental clusters

Other HDF5 sponsors include
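The file structure described earlier (a hierarchy of groups, datasets, and metadata) and the idea of retrieving partial data can be sketched with a toy in-memory model. This is not the HDF5 library: the Group and Dataset classes, the lookup and select methods, the path "/simulation/density", and the attribute values are all hypothetical, and the start, count, and stride parameters are simply this sketch's way of expressing a subset or a subsample.

```python
class Dataset:
    """Hypothetical stand-in for a dataset: a 2-D grid plus attributes."""

    def __init__(self, rows, attrs=None):
        self.rows = [list(r) for r in rows]
        self.attrs = dict(attrs or {})

    def select(self, start, count, stride=(1, 1)):
        """Return count[0] x count[1] values, beginning at `start` and
        stepping by `stride` along each axis."""
        r0, c0 = start
        nr, nc = count
        sr, sc = stride
        return [
            [self.rows[r0 + i * sr][c0 + j * sc] for j in range(nc)]
            for i in range(nr)
        ]


class Group:
    """Hypothetical stand-in for a group: named children plus attributes."""

    def __init__(self, attrs=None):
        self.children = {}
        self.attrs = dict(attrs or {})

    def add(self, name, node):
        self.children[name] = node
        return node

    def lookup(self, path):
        """Resolve a slash-separated path such as '/simulation/density'."""
        node = self
        for part in path.strip("/").split("/"):
            node = node.children[part]
        return node


# Build a small hierarchy: root group -> "simulation" group -> "density" dataset.
root = Group()
sim = root.add("simulation", Group(attrs={"code": "example"}))
grid = [[10 * r + c for c in range(6)] for r in range(4)]  # 4 rows x 6 columns
sim.add("density", Dataset(grid, attrs={"units": "g/cm^3"}))

density = root.lookup("/simulation/density")

# Subsetting: a contiguous 2 x 3 block starting at row 1, column 2.
subset = density.select(start=(1, 2), count=(2, 3))

# Subsampling: every second value along both axes, starting at the origin.
sampled = density.select(start=(0, 0), count=(2, 3), stride=(2, 2))
```

Here subset holds rows 1 and 2, columns 2 through 4, and sampled holds rows 0 and 2, columns 0, 2, and 4; in both cases only the requested values are copied out rather than the whole grid.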