Data, models and tools: Dealing with any complex hydraulic engineering problem invariably involves the use of data, models and tools.



What is the problem? The quality, quick availability and accessibility of data for analysis purposes are currently not satisfactory. Models and tools used and developed by engineers are neither sufficiently documented nor version controlled. We can do much better! Data: data not under version control, multitude of file formats, metadata not available within data files. Models and tools: different tool versions on users' PCs, confusion about which version of a tool was used to perform calculations. Result: inefficiency!

OpenEarth (BwN) provides the infrastructure to deal with this problem. Basic elements: a SubVersion server and an OPeNDAP server. Paradigm: fixed structure, flexible access.
[Diagram labels: User, Supplier; Raw Data, Models, Tools; SubVersion Server, OPeNDAP Server; Detailed, Simplified.]

What is NetCDF? An array-based data structure for storing multidimensional data.
N-dimensional coordinate systems
–X coordinate (e.g. longitude)
–Y coordinate (e.g. latitude)
–Z coordinate (e.g. altitude)
–Time dimension
–… other dimensions
Variables: support for multiple variables
–Temperature, humidity, pressure, salinity, etc.
Geometry: implicit or explicit
–Regular grid (implicit)
–Irregular grid
–Points
NetCDF: NASA's Earth Science Data Systems Standards Process Group recommends NetCDF as a data storage standard. Pros: data exchangeability, platform independent, robust in use and easy to understand.

Efficient data storage: The binary NetCDF format enables complete variable definition with a minimal set of numbers (see example) and minimal metadata repetition. Result: efficiency in disk space, easy database querying.
[Figure: example comparing two layouts of the same data, labelled XYZQ (32 numbers) and X, Y, Z (14 numbers).]
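The saving can be illustrated with a small count. The sketch below is not the slide's own example, which is not recoverable from the transcript; it assumes a regular 2 x 2 x 2 grid with one quantity Q per node, which happens to reproduce the 32 versus 14 numbers quoted. All values are made up.

```python
from itertools import product

# Assumed 2 x 2 x 2 regular grid; coordinate and Q values are illustrative only.
x = [0.0, 10.0]
y = [0.0, 5.0]
z = [-1.0, -2.0]
q = [float(i) for i in range(len(x) * len(y) * len(z))]  # one value per grid node

# Flat table: every row repeats its X, Y and Z coordinates next to Q.
flat_rows = [(xi, yi, zi, qi) for (xi, yi, zi), qi in zip(product(x, y, z), q)]
flat_count = sum(len(row) for row in flat_rows)

# Array layout: each coordinate axis is stored once, plus the block of Q values.
array_count = len(x) + len(y) + len(z) + len(q)

print(flat_count, array_count)
```

The gap widens quickly with grid size, because the flat table repeats every coordinate for every node while the array layout stores each axis only once.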

Example: transect.nc

    netcdf transect.nc {
    dimensions:
        crossshore = 198 ;
        time = 3 ;
    variables:
        float crossshore_distance(crossshore), shape = [198]
            crossshore_distance:unit = "meter"
        float year(time), shape = [3]
            year:unit = "year"
        float height(time,crossshore), shape = [3 198]
            height:unit = "meter"
    data:
        crossshore_distance = (-65:5:920);
        year = (2006:2008);
        height = [ … … … ];
    }

    % Read the three variables by name and plot the elevations
    x = nc_varget('transect.nc', 'crossshore_distance');
    y = nc_varget('transect.nc', 'year');
    z = nc_varget('transect.nc', 'height');
    surface(x, y, z);

Example NetCDF file: 198 cross-shore points, 3 timestamps, 3 x 198 surface elevations. Metadata is stored in one file together with the data. NB: transect.nc is a binary file (the listing shows its contents as text). Easy Matlab routines are available: nc_varput, nc_addvar, nc_varget (as used in the Matlab lines of the example).
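The dimensions in the header of the example fix the shape of every variable. The sketch below mirrors that structure in plain Python to check that the coordinate ranges match the declared dimensions. It is a stand-in, not a NetCDF reader: the dictionaries and the `shape` helper are hypothetical, and the height values are left out because the slide elides them.

```python
# Plain-Python stand-in for the structure of transect.nc (not a real NetCDF reader).
dimensions = {"crossshore": 198, "time": 3}

variables = {
    "crossshore_distance": {"dims": ("crossshore",), "unit": "meter"},
    "year": {"dims": ("time",), "unit": "year"},
    "height": {"dims": ("time", "crossshore"), "unit": "meter"},
}

def shape(name):
    """Shape of a variable, derived from the dimensions it is defined on."""
    return tuple(dimensions[d] for d in variables[name]["dims"])

# Coordinate values as given in the data section of the header dump.
crossshore_distance = list(range(-65, 920 + 1, 5))  # Matlab: -65:5:920
year = list(range(2006, 2008 + 1))                  # Matlab: 2006:2008

# The coordinate vectors must be as long as the dimensions they live on.
assert len(crossshore_distance) == shape("crossshore_distance")[0]
assert len(year) == shape("year")[0]

n_heights = shape("height")[0] * shape("height")[1]
print(shape("height"), n_heights)
```

Because units and dimension names travel with the variables, a reader of the file needs no separate documentation to interpret the height array.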

SubVersion: open source version control system. Users ‘commit’ their files to one central database (and update their local copy regularly). Every commit receives a unique revision number. Comments indicate per commit which changes were made.

Blame functionality: For each line of code, Subversion records who changed it, when, and as part of which revision. Colors indicate the age of the code (bluer = older). Any change can be rolled back at any time.

Merge tool: Changes made between any two versions of a tool are easily revealed using the Merge tool. The Merge tool also helps to resolve coding conflicts in case multiple users modified the same code.

Version control: any routine or data file can automatically be given a comment block with information on last change date, author, revision number, etc. Recording the revision info of the tools and data used in a project enhances the reproducibility of results.
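Such a comment block can also be read back by a script, so the revision of each tool used can be logged next to the results. The sketch below assumes the block uses Subversion's expanded keyword format (`$Keyword: value $`, enabled through the `svn:keywords` property). The header text, author name, date and revision number are made up, and `read_keywords` is a hypothetical helper, not part of OpenEarthTools.

```python
import re

# Made-up example header; the author, date and revision are illustrative only.
HEADER = """\
% $Date: 2010-03-01 12:00:00 +0100 (Mon, 01 Mar 2010) $
% $Author: example_user $
% $Revision: 1234 $
"""

# Matches an expanded keyword such as "$Revision: 1234 $".
KEYWORD = re.compile(r"\$(\w+):\s*(.*?)\s*\$")

def read_keywords(text):
    """Return a dict mapping keyword name to its expanded value."""
    return dict(KEYWORD.findall(text))

info = read_keywords(HEADER)
print(info["Revision"], info["Author"])
```

Writing `info` into the output of a calculation ties each result to the exact tool revision that produced it.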

Statistics: A separate repository can be made per project or tool. Combining reusable tools in one central repository provides large advantages (sharing, cooperation, learning). OpenEarthTools is open source and freeware.

Data workflow: OpenEarth prescribes the following steps to make data available:
1. put raw data in a SubVersion repository,
2. use scripts to transform the data to NetCDF, including metadata,
3. upload the *.nc files to the OPeNDAP server, and
4. provide easy access.
[Diagram: Raw data (OpenEarth RawData) → Scripts (OpenEarth Tools) → Database (OpenEarth OPeNDAP), corresponding to Extract, Transform, Load and Provide. Raw data is stored in Subversion to keep track of its history; a script converts the raw data into NetCDF and adds meta information; the stored NetCDF files are accessible through the web and feed charts and maps, tools and websites.]
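Steps 1 and 2 of the workflow above can be sketched as a small extract-and-transform script. This is an illustration only: the raw file layout, the file path, the function names and all numbers are hypothetical, and a plain dictionary stands in for the NetCDF file that a real script would write. Steps 3 and 4 (upload and access) are not shown.

```python
import csv
import io

# Hypothetical raw file contents: cross-shore distance [m] and height [m] per line.
RAW = """\
distance,height
-65,3.2
-60,3.0
-55,2.7
"""

def extract(raw_text):
    """Step 1 stand-in: read the raw file as it is stored in the repository."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows, source):
    """Step 2 stand-in: build named variables and attach metadata."""
    return {
        "dimensions": {"crossshore": len(rows)},
        "variables": {
            "crossshore_distance": {
                "unit": "meter",
                "data": [float(r["distance"]) for r in rows],
            },
            "height": {
                "unit": "meter",
                "data": [float(r["height"]) for r in rows],
            },
        },
        # Recording the origin keeps the link back to the raw data.
        "attributes": {"source": source},
    }

dataset = transform(extract(RAW), source="raw/transect_example.csv")
print(dataset["dimensions"], dataset["attributes"])
```

Keeping the conversion in a script, rather than doing it by hand, means the NetCDF file can be regenerated whenever the raw data or the script changes.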

Community of practice: OpenEarth has a wide community of users (Building with Nature, EU FP7 MICORE, Delft Cluster, etc.). A wide range of training sessions is available (SubVersion use, programming standards, etc.).