NUG Meeting 1 File and Data Conversion Jonathan Carter NERSC User Services 510-486-7514.


NUG Meeting 1 File and Data Conversion Jonathan Carter NERSC User Services

NUG Meeting 2 Introduction
- Converting files and data for use on the IBM SP
- IBM uses:
  - IEEE data representation
  - Industry-standard Fortran unformatted file structure
- Tools available on the Cray systems
- Tools available on the IBM SP

NUG Meeting 3 Demand for File Conversion
- Currently: CTSS text files
  - ctou and rlib will be available on the IBM SP
- After decommissioning the Cray systems in October 2002:
  - Cray Fortran unformatted files
  - Cray C binary files

NUG Meeting 4 Tools on the Cray Systems - FFIO
- Flexible File I/O (FFIO) is a general system for specifying how data should be written or read
- Can be used without recompiling or relinking (Fortran)
- Can be changed at runtime
- Various layers are available to convert both file structure and data
- Controlled via the assign command

NUG Meeting 5 assign Command
- Can specify how I/O is done
  - On a Fortran unit basis: assign -F f77 u:10
  - On a filename basis: assign -F f77 f:filename
- Common options
  - Clear assigns: assign -R
  - See current assigns in effect: assign -V

NUG Meeting 6 Fortran Unformatted Sequential-access Files
- Cray uses a vendor-specific format called COS blocked, or simply blocked
- IBM (and most Unix vendors) use f77 blocking
- Use the -F f77 option to have the FFIO f77 blocking layer used instead of the default COS blocking:
    assign -F f77 u:10
- T3E already uses IEEE arithmetic, so -F f77 is sufficient
  - Note that default real and integer data types on the T3E are 64 bit
- SV1 data needs to be converted, so an IEEE conversion layer is needed
  - -N ieee performs basic conversion:
    assign -F f77 -N ieee f:filename
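
To make the target file structure concrete, here is a minimal Python sketch of f77 blocking as it is commonly implemented by Unix Fortran compilers: each record is framed by a length marker before and after the data. The 4-byte big-endian marker is an assumption about the target system, and the helper names `write_f77_record` and `read_f77_records` are hypothetical, not part of FFIO or any library named in these slides.

```python
import io
import struct

MARKER = ">i"  # assumption: 4-byte big-endian record-length markers


def write_f77_record(stream, payload):
    """Write one record: length marker, payload bytes, length marker."""
    marker = struct.pack(MARKER, len(payload))
    stream.write(marker + payload + marker)


def read_f77_records(stream):
    """Yield the payload of each record in an f77-blocked stream."""
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of file
        if len(head) != 4:
            raise ValueError("truncated leading record marker")
        (length,) = struct.unpack(MARKER, head)
        payload = stream.read(length)
        tail = stream.read(4)
        if len(payload) != length or len(tail) != 4:
            raise ValueError("truncated record")
        if struct.unpack(MARKER, tail)[0] != length:
            raise ValueError("leading and trailing markers disagree")
        yield payload


# Round trip: one record of three 8-byte reals, one of two 4-byte integers.
buf = io.BytesIO()
write_f77_record(buf, struct.pack(">3d", 1.0, 2.0, 3.0))
write_f77_record(buf, struct.pack(">2i", 7, 8))
buf.seek(0)
records = list(read_f77_records(buf))
print([len(r) for r in records])  # [24, 8]
```

Checking that the two markers agree is a cheap way to notice that a file is still COS blocked or was written with a different marker size.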

NUG Meeting 7 Fortran Unformatted Direct-access Files
- Files are not blocked on Cray or IBM
- Data conversion layers can be used as in sequential-access files for the SV1 machines:
    assign -N ieee u:20
- T3E files don’t need any conversion
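
Because direct-access files carry no blocking, a record can be located by arithmetic alone. The Python sketch below assumes fixed-length records stored back to back with no markers; the record length `RECL` and the helper `read_direct_record` are hypothetical values and names chosen for illustration.

```python
import io
import struct

RECL = 16  # hypothetical record length in bytes (two 8-byte reals)


def read_direct_record(stream, recno, recl=RECL):
    """Return record number recno (1-based) from an unblocked file."""
    stream.seek((recno - 1) * recl)
    data = stream.read(recl)
    if len(data) != recl:
        raise ValueError("record %d is past the end of the file" % recno)
    return data


# Three records, each holding two big-endian IEEE doubles.
raw = io.BytesIO(
    b"".join(struct.pack(">2d", float(i), i / 2) for i in range(1, 4))
)
third = struct.unpack(">2d", read_direct_record(raw, 3))
print(third)  # (3.0, 1.5)
```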

NUG Meeting 8 C Binary Files
- Files are not blocked on Cray or IBM
- The FFIO conversion layer is not easy to use
- Use library routines such as cry2cri

NUG Meeting 9 Using FFIO to Convert a File
- Isolate the I/O statements for the file from the program to make a simple conversion program
- Pair each read with a write
- Use assign to have all written data converted, or use data conversion routines

NUG Meeting 10 Tools on the IBM SP - NCARU Library
- Library developed by the SCD at NCAR
- Reads COS blocked files
- Converts Cray data to IEEE data
- Does not use the Fortran API, so program modification is required
- Basic calls are crayopen, crayread, crayrew, crayback, crayclose
- Calls to crayread can convert data if the record is composed of one data type only; otherwise the user must handle conversion explicitly
- Conversion routines are ctodpf, ctospf, ctospi
- Cray Fortran I/O sometimes inserts padding, which the user must handle explicitly

NUG Meeting 11 Using the NCARU Library
- To use:
    module load ncaru
    xlf -o a.out b.f $NCARU
- Limitations
  - 2GB limit for unblocked files
  - Currently no 64 bit address space support
  - Not thread-safe
  - No support for 128 bit data

NUG Meeting 12 Dealing with Different Files
- Open using the blocked option to crayopen for Fortran unformatted sequential access; open with the unblocked option for Fortran unformatted direct access
- If the file was written on the SV1, use the conversion option on read, or call the conversion routines directly
- C binary files can be read by the unblocked I/O calls, or by the usual C I/O followed by data conversion routines

NUG Meeting 13 Records with Mixed Data Types
- Read into a buffer and convert items one by one

    real x(50)
    integer n(50)
    real*8 buffer(100)
    ! declare the functions integer; implicit typing would make them real
    integer crayopen, crayread
    ! open in blocked mode
    ifc = crayopen('filename',10,0)
    ! read record without converting
    nwds = crayread(ifc,buffer,100,0)
    ! convert data
    call ctospf(buffer,x,50)
    call ctospi(buffer(51),n,50)
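
The same read-then-convert pattern can be sketched in Python to show what the conversion step has to do. This is not the NCARU implementation. It assumes the classic Cray 64-bit floating-point word (1 sign bit, 15-bit exponent biased by 16384, 48-bit mantissa with no hidden bit) and assumes integers are stored as 64-bit two's complement big-endian words. The names `cray_word_to_float` and `split_mixed_record` are hypothetical.

```python
import math
import struct

CRAY_EXP_BIAS = 0o40000  # 16384, bias of the 15-bit Cray exponent


def cray_word_to_float(word):
    """Decode one 64-bit Cray (non-IEEE) floating-point word."""
    sign = -1.0 if word >> 63 else 1.0
    exponent = (word >> 48) & 0x7FFF
    mantissa = word & 0xFFFFFFFFFFFF  # 48 bits, no hidden bit
    if mantissa == 0:
        return 0.0
    # value = 0.mantissa * 2**(exponent - bias); the extra -48 turns the
    # integer mantissa into a fraction. math.ldexp raises OverflowError
    # for values beyond the IEEE double range.
    return sign * math.ldexp(mantissa, exponent - CRAY_EXP_BIAS - 48)


def split_mixed_record(record, nreal, nint):
    """Split a raw record of 8-byte words into reals, then integers."""
    real_bytes = record[: 8 * nreal]
    int_bytes = record[8 * nreal : 8 * (nreal + nint)]
    reals = [cray_word_to_float(w) for w in struct.unpack(">%dQ" % nreal, real_bytes)]
    ints = list(struct.unpack(">%dq" % nint, int_bytes))
    return reals, ints


# Two Cray reals (1.0 and -2.5) followed by two integers (42 and -3).
record = struct.pack(">2Q2q", 0x4001800000000000, 0xC002A00000000000, 42, -3)
reals, ints = split_mixed_record(record, 2, 2)
print(reals, ints)  # [1.0, -2.5] [42, -3]
```

The record layout has to be known in advance, which is why a record with mixed types cannot be converted by a single call.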

NUG Meeting 14 Data Padding
- With Cray Fortran I/O, extra bytes are sometimes inserted into the user data.
- Where padding occurs, bytes are inserted so that every datum of length 8 bytes starts at a byte offset, measured from the beginning of the record, that is a multiple of 8.
- The end of the record is then padded so that the whole record length is a multiple of 8 bytes.
- Padding occurs only if you have used character variables whose lengths are not a multiple of 8, or real*4 or integer*4 data on the T3E (on the SV1 systems, 8 bytes are used for these types).

NUG Meeting 15 Example
A Fortran record is written on an SV1:

    real a(50)
    integer n(50)
    character*17 label
    write(50) n, a, label

The elements of n and a are 8 bytes each, and label is 17 bytes long. Within the Fortran record, n starts at offset 0, a at offset 400, and label at offset 800. The only padding that occurs is at the end of the record, where 7 bytes are added to the 817 bytes of data to make the total record length 824 bytes, which is a multiple of 8.

NUG Meeting 16 Example
A Fortran record is written on an SV1:

    real a(50)
    integer n(50)
    character*17 label
    write(50) label, n, a

Without padding, the offsets are label at 0, n at 17, and a at 417. Since n has elements of length 8 bytes, it must be written at an offset that is a multiple of 8 bytes; therefore a pad of 7 bytes is inserted between the end of label and the beginning of n. In the record that is written to the file, label is at offset 0, n at offset 24, and a at offset 424.

NUG Meeting 17 Example
A Fortran record is written on the T3E:

    real a(40), b(40)
    integer*4 n(13), m(13)
    character*12 label
    write(50) label, n, a, m, b

The data has lengths: label 12 bytes, n and m 52 bytes each, and a and b 320 bytes each. Without padding, the offsets are label at 0, n at 12, a at 64, m at 384, and b at 436. a and b need to be at offsets that are a multiple of 8 bytes; the offset of a is already correct, but 4 bytes must be inserted before b, so that it starts at offset 440.
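
The padding rule can be checked mechanically. The Python sketch below applies the rule as stated on the Data Padding slide (align every item with 8-byte elements, then round the record length up to a multiple of 8) to the T3E record above. The function `padded_layout` and its input format are hypothetical, introduced only for this illustration.

```python
def round_up_8(offset):
    """Round a byte offset up to the next multiple of 8."""
    return (offset + 7) // 8 * 8


def padded_layout(items):
    """Return ({name: offset}, record_length) under the padding rule.

    items is a list of (name, element_size_in_bytes, element_count)
    in the order the variables appear in the write statement.
    """
    offsets = {}
    offset = 0
    for name, elem_size, count in items:
        if elem_size == 8:
            offset = round_up_8(offset)  # pad before 8-byte data
        offsets[name] = offset
        offset += elem_size * count
    return offsets, round_up_8(offset)  # pad the end of the record


# write(50) label, n, a, m, b on the T3E
t3e_record = [
    ("label", 1, 12),
    ("n", 4, 13),
    ("a", 8, 40),
    ("m", 4, 13),
    ("b", 8, 40),
]
offsets, length = padded_layout(t3e_record)
print(offsets)  # {'label': 0, 'n': 12, 'a': 64, 'm': 384, 'b': 440}
print(length)   # 760
```

A table of offsets like this is what a conversion program needs in order to skip pad bytes when it pulls items out of a raw record.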

NUG Meeting 18 crayconv Utility
- crayconv automatically converts files written on the SV1 to IBM-compatible format
- Basic Fortran data types only
- Sequential-access unformatted files only
- Possible problem if the compiler option -Onofastint was used, or integer*8 was explicitly declared and written: integers over 2^46 are not correctly interpreted
- Pad data is not removed
- Extension to T3E data and direct-access unformatted files is planned

NUG Meeting 19 More Information -by Mike Stewart ml man ncaru
