I/O and the SciDAC Software API Robert Edwards U.S. SciDAC Software Coordinating Committee May 2, 2003

SciDAC: Scientific Discovery through Advanced Computing

SciDAC Project Goals
- Portable, scalable software
- High-performance optimization on two target architectures: clusters and the QCDOC
- Exploitation and optimization of the existing application base
- Infrastructure for the (US) national community
- Sharing of valuable lattice data, and data management via the Grid (ILDG)

SciDAC Software Structure
- Level 3: Optimised Dirac operators and inverters (optimised for P4 and QCDOC)
- Level 2: QDP (QCD Data Parallel): lattice-wide operations, data shifts (exists in C/C++)
- Level 1: QMP (QCD Message Passing) and QLA (QCD Linear Algebra) (exists in C/C++, implemented over MPI, GM, QCDOC)
- QIO: XML I/O and DIME packaging (the focus of this talk)

Data-Parallel QDP/C, C++ API
- Hides architecture and layout
- Operates on lattice fields across sites
- Linear algebra tailored for QCD
- Shifts and permutation maps across sites
- Reductions
- Subsets

Data-Parallel Operations
- Unary and binary operators: -a; a-b; ...
- Unary functions: adj(a), cos(a), sin(a), ...
- Random numbers (platform independent): random(a), gaussian(a)
- Comparisons (booleans): a <= b, ...
- Broadcasts: a = 0, ...
- Reductions: sum(a), ...
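
As an illustration only (not from the slides), a minimal QDP++-style fragment combining several of these operations is sketched below. The types and free functions follow later QDP++ releases, so details may differ from the 2003 API; lattice initialization and RNG seeding are omitted.

  // Illustrative QDP++-style sketch; assumes QDP has been initialized and the RNG seeded.
  #include "qdp.h"
  using namespace QDP;

  void data_parallel_demo()
  {
    LatticeReal a, b, c;              // real-valued fields over all lattice sites
    random(a);                        // uniform random numbers, site by site
    gaussian(b);                      // Gaussian random numbers, site by site
    c = -a + cos(b);                  // unary minus, unary function, binary add: all site-wise
    LatticeBoolean mask = (a <= b);   // site-wise comparison gives a boolean field
    Double total = sum(c);            // global reduction over all sites
    c = where(mask, a, b);            // select a or b per site (assumed QDP++ helper)
  }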

QDP Expressions
Can create expressions in QDP/C++ code:

  multi1d<LatticeColorMatrix> u(Nd);
  LatticeDiracFermion b, c, d;
  int mu;
  c = u[mu] * shift(b, mu) + 2 * d;

PETE (Portable Expression Template Engine): temporaries eliminated, expressions optimised.

Generic QDP Binary File Formats
- Composed of one or more application records
- A single application record holds one QDP field or an array of fields
- Binary data in lexicographic site-major order
- Physics metadata for the file and for each record
- DIME used to package the pieces
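
For concreteness, a small sketch (not from the slides) of what lexicographic site-major ordering means: the site index is built from the coordinates with one coordinate varying fastest. The choice of x[0] as the fastest coordinate is an assumption here; the actual file format fixes its own convention.

  // Sketch: lexicographic site index on a 4-d lattice of extents L[0..3],
  // assuming x[0] varies fastest (illustrative convention only).
  #include <cstdint>

  int64_t site_index(const int x[4], const int L[4])
  {
    int64_t idx = 0;
    for (int mu = 3; mu >= 0; --mu)   // from slowest to fastest coordinate
      idx = idx * L[mu] + x[mu];
    return idx;
  }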

Metadata
- XML used for file and record metadata
- File and record metadata managed at the user's convenience
- No agreed minimum standard
- binX used to describe the binary data
- binX is not part of the record metadata; it provides serialization info

Gauge Fields
- For published data, the UKQCD schema will be used
- Arrays of fields written as one record (all 3 rows)
- Site-major order (slowest varying)
- A single format and byte ordering will be adopted

File Format
- File physics metadata
- Application record 1
  - Physics metadata
  - binX description
  - Binary data (may have array indices within sites)
  - Checksum
- Record 2 (possible additional records)
  - Physics metadata
  - binX description
  - Binary data
  - Checksum
- Record 3 ...

Data Hierarchy
- Project built from datasets (e.g. gauge fields and propagators)
- Dataset built from files (e.g. gauge fields)
- File built from records (e.g. eigenvectors)
- Record = one QDP field plus metadata
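
A minimal sketch of this containment hierarchy as C++ data structures; all type names here are illustrative assumptions, not part of QIO or QDP.

  #include <cstdint>
  #include <string>
  #include <vector>

  struct Record {                   // one QDP field plus its metadata
    std::string physics_xml;        // record-level physics metadata
    std::string binx_xml;           // binX serialization description
    std::vector<char> binary_data;  // field data, lexicographic site-major order
    uint32_t checksum;
  };
  struct File    { std::string physics_xml; std::vector<Record> records; };
  struct Dataset { std::vector<File> files; };        // e.g. an ensemble of gauge fields
  struct Project { std::vector<Dataset> datasets; };  // e.g. gauge fields plus propagators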

Direct Internet Message Encapsulation (DIME)
- Data written to (read from) a list of records
- Each record has:
  - a DIME type (required): a URI or a MIME-like type
  - a DIME Id (optional URL)
- Maximum record size is 2 GB
- Data larger than 2 GB can be split into successive record "chunks"
- Chunking is easy; file sizes above 2 GB are the real problem
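
For orientation, a rough sketch of the fixed part of a DIME record header, paraphrased from the DIME draft specification; the bit-level packing is simplified into comments, and this is not the QIO implementation.

  #include <cstdint>

  struct DimeRecordHeader {
    // First 16 bits hold VERSION (5 bits), the MB/ME flags (first/last record of a
    // message), the CF chunk flag, and the 4-bit TYPE_T field saying whether TYPE
    // is a URI or a MIME-like media type.
    uint16_t version_flags_and_type_t;
    uint16_t options_length;   // length of the OPTIONS field, in bytes
    uint16_t id_length;        // length of the ID field (the optional URL)
    uint16_t type_length;      // length of the TYPE string
    uint32_t data_length;      // payload length; the slides quote a 2 GB per-record limit
    // The header is followed by OPTIONS, ID, TYPE and DATA, each padded to a
    // 4-byte boundary.
  };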

QIO: Grid-Friendly I/O
- Reader/Writer API for metadata and physics data
- Reads/writes simple XML documents (no data binding)
- Metadata used like a buffer, physics data like a stream
- QDP I/O (QIO):
  - Serial: all nodes stream through one node
  - Parallel: if available, many nodes write to a parallel filesystem

Example (QDP/C++):

  MetaWriter file_xml, rec_xml;
  SerialFileWriter out(file_xml, "foo.dat");
  LatticeDiracFermion psi;
  out.write(rec_xml, psi);
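
The slides show only the writer for a single fermion field. Below is a sketch of two plausible variations: writing a gauge-field array as one record (as the gauge-field slide describes) and reading a field back. SerialFileReader and its read() method are assumptions that simply mirror the writer above; the real QIO/QDP++ class and method names may differ.

  // Writing a gauge field: the whole multi1d array goes into one record.
  MetaWriter file_xml, rec_xml;
  SerialFileWriter out(file_xml, "lattice.dat");
  multi1d<LatticeColorMatrix> u(Nd);   // Nd colour-matrix fields
  out.write(rec_xml, u);

  // Reading back, with a reader assumed to mirror the writer (illustrative only):
  MetaReader file_in_xml, rec_in_xml;
  SerialFileReader in(file_in_xml, "foo.dat");
  LatticeDiracFermion psi;
  in.read(rec_in_xml, psi);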

MetaReader
- XML Reader/Writer supports recursive serialization
- To/from buffers (strings): metadata
- To/from files: physics data
- Intended to drive codes rather than the DataGrid
- C and C++ versions

Example (QDP/C++), reading values out of the file's XML:

  struct foo_t foo;
  struct bar_t bar;
  double kappa;
  MetaReader in;
  char *key = "/foo/bar/kappa";
  in.get(foo, "/foo");
  in.get(kappa, key);

Current Status
- Releases and documentation
- QMP and QDP/C, C++ in first release
- Performance improvements and testing underway
- Porting and development of physics codes over QDP on-going
- QIO near completion
- DIME completed (Bálint Joó)
- XML Reader/Writer in development