Jay Lofstead: Input/Output APIs and Data Organization for High Performance Scientific Computing (November 17, 2008)

Presentation transcript:

Slide 1: Input/Output APIs and Data Organization for High Performance Scientific Computing
November 17, 2008, Petascale Data Storage Institute Workshop at Supercomputing 2008
Jay Lofstead (GT), Fang Zheng (GT), Scott Klasky (ORNL), Karsten Schwan (GT)

Slide 2: Overview
- Motivation
- Architecture
- Performance

Slide 3: Motivation
- Many codes write lots of data but rarely read it back
- TBs of data, of different types and sizes
- HDF5 and PnetCDF are used for their convenient tool integration and portable format

Slide 4: Performance/Resilience Challenges
- PnetCDF: "right sized" header; coordination for each data declaration; data stored as logically described
- HDF5: B-tree format; coordination for each data declaration; single metadata store is vulnerable to corruption

Slide 5: Architecture (ADIOS)
- Change the IO method by changing the XML file (see the sketch below)
- Switch between synchronous and asynchronous transports
- Hook into other systems such as visualization and workflow
[Architecture diagram: scientific codes call the ADIOS API, driven by external metadata (an XML file); transport methods include DART, LIVE/DataTap, MPI-IO, POSIX IO, HDF5, PnetCDF, viz engines, and other plug-ins; the layer provides buffering, scheduling, and feedback]
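To make the "change the IO method by changing the XML file" idea concrete, below is a minimal sketch of the kind of external XML descriptor ADIOS reads. The group name, variable names, and buffer size are hypothetical, and exact element and attribute names vary across ADIOS releases.

```xml
<?xml version="1.0"?>
<adios-config host-language="C">
  <!-- One IO group: the variables the code writes in a single open/close cycle -->
  <adios-group name="restart">
    <var name="NX" type="integer"/>
    <var name="temperature" type="double" dimensions="NX"/>
  </adios-group>

  <!-- Switching from MPI-IO to, for example, PHDF5 or POSIX means editing only
       this line; no application code changes and no recompilation -->
  <method group="restart" method="MPI"/>

  <!-- Buffer used for aggregated/asynchronous output -->
  <buffer size-MB="50" allocate-time="now"/>
</adios-config>
```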

Slide 6: Architecture (BP)
- Individual outputs go into "process group" segments
- Metadata indices come next
- Index offsets and a version flag sit at the end of the file
[File layout: Process Group 1 | Process Group 2 | ... | Process Group n | Process Group Index | Vars Index | Attributes Index | Index Offsets and Version #]
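This layout means a reader can locate everything starting from the tail of the file: read the trailing offsets and version flag, then jump to each index. The C sketch below assumes a footer of three 64-bit offsets plus a 32-bit version word, following the slide's list; the actual BP on-disk encoding (field widths, endianness handling) may differ.

```c
/* Sketch: locate the BP indices by reading the trailing offsets/version block.
 * The footer layout assumed here (three 64-bit offsets + one 32-bit version
 * word) follows the slide's description, not the exact BP specification. */
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint64_t pg_index_offset;    /* start of the process group index */
    uint64_t vars_index_offset;  /* start of the variables index */
    uint64_t attrs_index_offset; /* start of the attributes index */
    uint32_t version;            /* format version / endianness flag */
} bp_footer;

int read_bp_footer(const char *path, bp_footer *f)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return -1;
    /* The footer is written last, a fixed distance from the end of the file. */
    if (fseek(fp, -(long)(3 * sizeof(uint64_t) + sizeof(uint32_t)), SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    int ok = fread(&f->pg_index_offset,    sizeof(uint64_t), 1, fp) == 1
          && fread(&f->vars_index_offset,  sizeof(uint64_t), 1, fp) == 1
          && fread(&f->attrs_index_offset, sizeof(uint64_t), 1, fp) == 1
          && fread(&f->version,            sizeof(uint32_t), 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```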

Slide 7: Resilience Features
- Random node failure: timeouts mark index entries as suspect
- Root node failure: scan the file to rebuild the index, using local size values to find offsets (see the sketch below)
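A hypothetical sketch of the root-failure recovery path: walk the data region from the front of the file, using each process group's locally recorded size to hop to the next one and rebuild the index entries. The assumption that every process group record begins with its own length field is illustrative of how "use local size values to find offsets" can work; it is not the exact BP record layout.

```c
/* Sketch: rebuild the process group index by scanning the data region.
 * ASSUMPTION (illustrative only): each process group record begins with a
 * uint64_t holding its total length; the real BP record layout differs. */
#include <stdio.h>
#include <stdint.h>

/* Scan [0, data_end) and report every process group's offset and length. */
int rebuild_pg_index(FILE *fp, uint64_t data_end)
{
    uint64_t offset = 0, length = 0;
    while (offset < data_end) {
        if (fseek(fp, (long)offset, SEEK_SET) != 0)
            return -1;
        if (fread(&length, sizeof(length), 1, fp) != 1 || length == 0)
            return -1;                       /* truncated or corrupt record */
        printf("process group at offset %llu, length %llu\n",
               (unsigned long long)offset, (unsigned long long)length);
        offset += length;                    /* local size gives the next offset */
    }
    return 0;
}
```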

Slide 8: Data Characteristics
- Identify file contents efficiently: min/max values, local array sizes
- Local-only computation makes them "free": no communication required
- Indices support summaries and direct access; copies provide resilience
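A small illustration of why the characteristics are "free": each process computes min/max over its own buffer as it writes, so no MPI communication is involved. The function below is a generic sketch, not ADIOS code.

```c
/* Sketch: per-process data characteristics computed in a single local pass.
 * No communication is needed because only locally owned data is inspected. */
#include <stddef.h>

typedef struct {
    double min;
    double max;
} characteristics;

/* Assumes n > 0; the caller records the result alongside the local array size. */
static characteristics characterize(const double *data, size_t n)
{
    characteristics c = { data[0], data[0] };
    for (size_t i = 1; i < n; i++) {
        if (data[i] < c.min) c.min = data[i];
        if (data[i] > c.max) c.max = data[i];
    }
    return c;
}
```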

Slide 9: Architecture (Strategy)
- Use the ADIOS API for flexibility
- Use Parallel HDF5/PnetCDF methods during development for correctness checking
- Use POSIX/MPI-IO methods (BP output format) during production runs for performance
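This strategy works because the application's write path stays the same no matter which method the XML file selects. Below is a minimal sketch in the style of the ADIOS 1.x C API; exact signatures (notably the communicator arguments) changed between releases, and the group and variable names match the hypothetical XML descriptor sketched earlier.

```c
/* Sketch: the same write sequence is used whether the XML selects PHDF5,
 * PnetCDF, MPI-IO, or POSIX; only the <method> element in the XML changes.
 * Function signatures follow the ADIOS 1.x C API but vary between releases. */
#include <stdint.h>
#include <mpi.h>
#include "adios.h"

void write_restart(MPI_Comm comm, int rank, int NX, double *temperature)
{
    int64_t  fd;
    uint64_t group_size, total_size;

    adios_init("config.xml", comm);               /* parse the external XML metadata   */
    adios_open(&fd, "restart", "restart.bp", "w", comm);

    /* Declare how much this process will write so ADIOS can size its buffers. */
    group_size = sizeof(int) + (uint64_t)NX * sizeof(double);
    adios_group_size(fd, group_size, &total_size);

    adios_write(fd, "NX", &NX);
    adios_write(fd, "temperature", temperature);

    adios_close(fd);                              /* data flushed per the chosen method */
    adios_finalize(rank);                         /* normally called once at shutdown   */
}
```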

Slide 10: Performance Overview
- Chimera (supernova), 8,192 processes: relatively small writes (~1 MB per process); 1400 seconds with Parallel HDF5 vs. 1.4 seconds with POSIX (or 10 seconds with MPI-IO independent)
- GTC (fusion), 29,000 processes: 25 GB/sec (out of a 40 GB/sec theoretical maximum) writing restarts; 3% of wall-clock time spent on IO; more than 60 TB of total output

Slide 11: Performance Overview
- The overhead of collecting data characteristics is too small to measure
- Tests used arrays of 10, 50, and 100 million entries per process, weak scaling across process counts

Slide 12: Performance Overview
- Data conversion: the Chimera 8,192-process run took 117 seconds to convert to HDF5 on a single process (compare with the 1400 seconds needed to write HDF5 directly)
- Other tests have shown conversion time to scale linearly with data size
- Parallel conversion will be faster still

Slide 13: Summary
- Use the ADIOS API; selectively choose consistency
- BP intermediate format: performance and resilience; convert to HDF5/netCDF later
- Questions?