SDF: Stanford Data Format Zak Sahul, Oscar Carrillo, Jed Black Stanford University.



SDF Philosophy ● Allow storage of arbitrary data types – signed and unsigned integers – arbitrary size representations (1, 2, 4, etc. bytes) – floating point numbers – complex numbers – ASCII text ● Record events – as flags, or text with a timestamp – simultaneous events

SDF Philosophy ● Arbitrary signal characteristics – each signal at a different sampling frequency – signal range – any number of signals ● Store any and all ancillary information – calibration parameters – filter settings – patient identification ● Easily extensible – need to store new information

SDF Overview

SDF ● Each study in a folder ● “Control.txt” file contains information about the study and about the other files in ASCII text form. ● Each channel's data in a separate file – Each file contains header and data sections ● Each type of event in a separate file – Each file contains header and data sections ● Header section contains information on what is in the data section. ● Header section is in human-readable ASCII text.

control.txt ● human-readable text file ● contains lines, each a keyword followed by attribute name and value pairs: – patient -id “John Doe” -age 32 ● As in Tcl/Tk, attribute names start with a “-”. ● A minimal, though not mandatory, set of keywords is defined for PSGs. ● An application may define new keywords or attribute name-value pairs. ● Unfamiliar keywords or name-value pairs should be ignored.

control.txt keywords ● patient – -name – -id ● recording – -id ● datafile – -name – -label ● eventfile – -name – -label

Example
patient -name “John Doe” -age 42 -number
recording -id “Sleep Study”
datafile -name ch0.x -label EEG
datafile -name ch1.x -label EEG
datafile -name ekg.x -label EKG
eventfile -name artifact.dat -label “Artifact”
eventfile -name stage.dat -label “Sleep Stage”
videofile -name video.mpeg -label “PLMS_Video”
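The keyword and attribute layout described for control.txt can be read with a few lines of Python. This is only a sketch under stated assumptions, not the SDF authors' reader: the function names `parse_sdf_line` and `parse_control` and the `KNOWN_KEYWORDS` set are hypothetical, values are assumed to be quoted with plain ASCII double quotes (the curly quotes on the slides are treated as a typesetting artifact), and each attribute is assumed to take at most one value.

```python
import shlex

# Keywords listed on the "control.txt keywords" slide; anything else is ignored.
KNOWN_KEYWORDS = {"patient", "recording", "datafile", "eventfile"}


def _is_number(token):
    """True for tokens such as -5 or 3.2, so negative values are not
    mistaken for attribute names."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_sdf_line(line):
    """Split one line into (keyword, {attribute: value}); None if blank."""
    # Assumption: normalize curly quotes to plain double quotes.
    line = line.replace("\u201c", '"').replace("\u201d", '"')
    tokens = shlex.split(line)
    if not tokens:
        return None
    keyword, attrs, name = tokens[0], {}, None
    for tok in tokens[1:]:
        if tok.startswith("-") and not _is_number(tok):
            name = tok[1:]
            attrs[name] = None      # attribute seen, value (if any) follows
        elif name is not None:
            attrs[name] = tok
            name = None
    return keyword, attrs


def parse_control(text):
    """Return the recognized (keyword, attrs) entries, skipping unfamiliar ones."""
    entries = []
    for line in text.splitlines():
        parsed = parse_sdf_line(line)
        if parsed and parsed[0] in KNOWN_KEYWORDS:
            entries.append(parsed)
    return entries
```

With this sketch an application-defined line such as the `videofile` entry is simply skipped, which matches the rule that unfamiliar keywords should be ignored. An attribute that has no value in the file is kept with the value `None` rather than guessed.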

Data Files ● Header and data sections, with the header in ASCII. ● As in control.txt, there are keywords and zero or more attribute name and value pairs. ● Keywords and attributes are used to describe the data and how to read it. ● A minimal, though not mandatory, set of keywords is defined for PSGs. ● Unrecognized keywords and attributes should be ignored by applications.

Data Files ● Default values should be assumed for keywords and attributes that are not present in the header, so in principle no application should fail because of what it considers an incomplete header. ● Raw data starts after the keyword “data” with no attributes.
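The two rules above (fall back to defaults, and treat a bare `data` line as the end of the header) might be sketched as follows. The slides do not say what the default values are, so the `DEFAULTS` table here is purely hypothetical, as are the function names. The sketch also assumes that the bare `data` line ends with a newline and that the raw data begins at the very next byte.

```python
import shlex

# Hypothetical defaults; the slides do not specify the actual values.
DEFAULTS = {
    "ascii": "false",
    "bytes": "2",
    "type": "int",
    "storage": "little-endian",
}


def split_header_and_data(raw):
    """Return (header_lines, data_bytes) for the bytes of one data file."""
    header, offset = [], 0
    while offset < len(raw):
        end = raw.find(b"\n", offset)
        if end == -1:
            end = len(raw)
        # strip() also removes a trailing carriage return from DOS files
        line = raw[offset:end].decode("ascii").strip()
        offset = end + 1
        if line == "data":          # bare keyword: raw data starts here
            return header, raw[offset:]
        if line:
            header.append(line)
    return header, b""


def data_attributes(header_lines):
    """Collect attributes of all 'data' header lines on top of the defaults.

    Assumes every attribute is followed by exactly one value.
    """
    attrs = dict(DEFAULTS)
    for line in header_lines:
        tokens = shlex.split(line)
        if not tokens or tokens[0] != "data":
            continue
        for name, value in zip(tokens[1::2], tokens[2::2]):
            attrs[name.lstrip("-")] = value
    return attrs
```

Because missing attributes are filled from the table, a header that omits, say, `-type` still yields a usable description instead of an error.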

Data File Minimal Keyword Set ● patient – -id – -name ● recording – -id ● data – -ascii: true or false – -binary: true or false – -bytes: byte length per data point – -type: char, int, float, double, etc.

Data File Minimal Keyword Set ● data – -samplingrate: in Hz – -calibration: calibration method – -storage: “big-endian” or “little-endian”
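To show how the `-bytes`, `-type` and `-storage` attributes together determine how binary samples are read, here is a minimal sketch using Python's standard `struct` module. The function name `decode_samples`, its default arguments, and the mapping from type names to `struct` codes are assumptions made for illustration; only signed integers and IEEE floating point values are covered.

```python
import struct

# (type name, byte length) -> struct format code. Illustrative mapping only.
_CODES = {
    ("int", 1): "b",
    ("int", 2): "h",
    ("int", 4): "i",
    ("int", 8): "q",
    ("float", 4): "f",
    ("double", 8): "d",
}


def decode_samples(payload, type_name="int", nbytes=2, storage="little-endian"):
    """Decode a block of raw binary samples into a list of Python numbers."""
    code = _CODES[(type_name, nbytes)]
    # '<' and '>' select byte order and fixed standard sizes in struct
    order = ">" if storage.startswith("big") else "<"
    count = len(payload) // nbytes          # ignore a trailing partial sample
    return list(struct.unpack(f"{order}{count}{code}", payload[:count * nbytes]))
```

Given a sampling rate in Hz from `-samplingrate`, sample `i` of a channel would then correspond to time `i / samplingrate` seconds from the start of that file.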

Example data file
patient -name “John Doe” -id age 42
recording -site “Sleep Disorders Clinic”
data -ascii true
data -type “integer” -calibration insignal
data -sampling rate
data

Event files ● Header and event data sections ● All event data should be ASCII as well. ● Data section consists of columns. ● First column represents a measure of time: clock time, time elapsed since start of recording, epoch number, sample number, etc. ● The remaining columns represent the event that happened at the time specified in the first column. ● The header documents how the events are stored.

Event File Minimal Keyword Set ● patient – -name – -id ● recording – -id ● event – -label: name of event – -basis: time, sample or epoch # – -basisunits: units of the basis (e.g. seconds) – -variable: variable name representing data, listed in order – -type: int, double, char, string, etc.

Example
patient -name “John Doe” -age 42
recording -site “Sleep Clinic”
event -label “Calibration” -datafile ch0.x
event -basis “time” -basisunits “seconds”
event -variable “slope” -type “float” -units “uV/t”
event -variable “offset” -type “float” -units “uV”
data
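The column layout of an event file's data section can be read as sketched below. The slides do not specify the column separator, so whitespace is assumed; the basis column is assumed to be numeric (for example seconds or a sample number), and the function name `parse_event_rows` is hypothetical. The `variables` argument lists the `-variable` / `-type` pairs in the order they appear in the header.

```python
def parse_event_rows(data_text, variables):
    """Parse the ASCII data section of an event file.

    data_text: text that follows the bare 'data' keyword.
    variables: list of (name, type_name) pairs in header order.
    Returns one dict per row; the first column is stored under 'basis'.
    """
    casts = {"int": int, "float": float, "double": float}
    rows = []
    for line in data_text.splitlines():
        fields = line.split()               # assumption: whitespace-separated
        if not fields:
            continue
        row = {"basis": float(fields[0])}   # assumption: numeric basis
        for (name, type_name), field in zip(variables, fields[1:]):
            # unknown types (char, string, ...) are kept as text
            row[name] = casts.get(type_name, str)(field)
        rows.append(row)
    return rows
```

For a calibration event file like the example, the call would pass `[("slope", "float"), ("offset", "float")]` as `variables`, giving each row a time, a slope and an offset.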

SDF Advantages ● ASCII header: data files are easy to understand and decipher. ● A minimal set of keywords and attributes allows exchange of data. ● Other keywords can be defined by individual institutions for their own benefit. ● All types of data supported. Event data stored efficiently. ● Each study is a collection of smaller files.

SDF Advantages ● Each channel's data can be accessed independently and quickly, which gives faster throughput for analysis programs and makes it easy to blind certain channels for RCTs. ● One can encrypt certain attribute values after data has been recorded. ● Applications are not encumbered with an epoch duration, sampling rate, number of signals, etc. imposed by the data format.

SDF Disadvantages ● A collection of files can lead to data loss if the files are not cataloged carefully. ● Line-ending differences (carriage return and line feed in DOS versus line feed only in Unix) have been an irritant. ● Combining ASCII and binary in one file is not aesthetically pleasing.