Vega: A Flexible Data Model for Environmental Time Series Data L. A. Winslow, B. J. Benson, K. E. Chiu, P. C. Hanson, T. K. Kratz.

Presentation transcript:

Storing High Resolution Sensor Data in a Relational Database
Deploy system
Create data table
Date/Time column
Each variable is unique column
Mendota_Buoy_Table:

Accommodate Additional Site
Create Additional Table
Table Name from Site Name
Mendota_Buoy_Table:
Long_Lake_Buoy_Table:
What about 5 sites? Or 10?

Changes in Measured Variables
Add or remove variables
End up with many NULL fields
Legacy Structure
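
The NULL-field problem described on this slide can be reproduced in a few lines. The following is a minimal sketch using Python's built-in sqlite3 module, not the authors' actual database. The column name WATER_TEMP, the timestamps, and the readings are hypothetical, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Legacy layout: one table per site, one column per measured variable.
# WATER_TEMP and the sample readings are hypothetical.
conn.execute(
    "CREATE TABLE Mendota_Buoy_Table ("
    "DateTime TEXT PRIMARY KEY, WATER_TEMP REAL, DO_05M REAL)"
)
conn.execute(
    "INSERT INTO Mendota_Buoy_Table VALUES (?, ?, ?)",
    ("2008-06-01 12:00", 18.2, 9.1),
)

# A new sensor is deployed later, so the table structure itself must change.
conn.execute("ALTER TABLE Mendota_Buoy_Table ADD COLUMN WIND_SPEED REAL")
conn.execute(
    "INSERT INTO Mendota_Buoy_Table (DateTime, WIND_SPEED) VALUES (?, ?)",
    ("2008-06-02 12:00", 3.4),
)

rows = conn.execute(
    "SELECT * FROM Mendota_Buoy_Table ORDER BY DateTime"
).fetchall()

# Count the cells that hold no measurement at all.
null_count = sum(value is None for row in rows for value in row)
print(rows)
print("NULL cells:", null_count)
```

Every row written before the new column existed carries a NULL for it, and every row that lacks an older variable carries a NULL for that one, so the share of empty cells grows with each structural change.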

Add Complex Metadata
Add Metadata
  – Sensor Info
  – Data steward
  – Offset (depth, height)
  – Sampling Method
Combine in Field Name
  – DO_05M
  – DO_DOPTO_05M
  – DO_YSI_10M
  – DO_YSI_CALIBRATED_10M
  – WIND_SPEED_VECTOR_AVG

Long-term datasets are becoming more common

Vega Data Model
Goals
  – Accommodate dataset changes over time
      Eliminate legacy structure
  – Easy to understand and develop software
  – Maintain rapid query times
Inspired by the CUAHSI ODM

Central Concepts
Values
  – Individual observation (floating point format)
  – Air temp at airport at 12: (-5.1 °C)
  – Individually linked to metadata
Data Streams
  – Group of Values which vary only in time
  – Individual time series
  – All air temp sampled at airport
      Wind speed is different Data Stream
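
One way to picture the two concepts is as a pair of tables: one row per Data Stream and one row per Value. The sketch below is an assumption-laden illustration using Python's sqlite3 module, not the published Vega schema. The table names Streams and Values, the metadata columns (Site, Variable, Unit), the timestamps, and every reading other than the -5.1 °C air temperature are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: each stream row carries the metadata
    -- shared by all of its values.
    CREATE TABLE Streams (
        StreamID INTEGER PRIMARY KEY,
        Site     TEXT NOT NULL,
        Variable TEXT NOT NULL,
        Unit     TEXT NOT NULL
    );
    -- Each value is one floating point observation linked to one stream.
    CREATE TABLE "Values" (
        StreamID INTEGER NOT NULL REFERENCES Streams(StreamID),
        DateTime TEXT NOT NULL,
        Value    REAL NOT NULL
    );
""")

# Air temperature and wind speed at the same site are different streams.
conn.executemany("INSERT INTO Streams VALUES (?, ?, ?, ?)", [
    (1, "Airport", "air_temp", "degC"),
    (2, "Airport", "wind_speed", "m/s"),
])
conn.executemany('INSERT INTO "Values" VALUES (?, ?, ?)', [
    (1, "2009-01-15 12:00", -5.1),
    (1, "2009-01-15 13:00", -4.8),
    (2, "2009-01-15 12:00", 3.2),
])

# "All air temp sampled at airport" is a single stream lookup.
air_temp = conn.execute(
    'SELECT v.DateTime, v.Value FROM "Values" AS v '
    "JOIN Streams AS s ON s.StreamID = v.StreamID "
    "WHERE s.Site = ? AND s.Variable = ? ORDER BY v.DateTime",
    ("Airport", "air_temp"),
).fetchall()
print(air_temp)
```

In a layout like this, a new variable or a new site becomes a new row in the stream table instead of a new column or a new table, which is the property the goals slide asks for.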

Vega: Simple

Indexing
Speeds up searching through large tables
  – Vega impossible without it
Similar to an alphabetized phonebook
With Index:
  – Time ~ Log(number of rows)
Without Index:
  – Time ~ number of rows
Values Index (also Unique)
  – DateTime
  – StreamID
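
A unique index over the two columns named on this slide might be declared as follows. This is a sketch in Python's sqlite3 module under stated assumptions: the index name idx_values_stream_time, the column order (StreamID first), and the sample readings are hypothetical, and the talk does not say which database engine is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Values" (StreamID INTEGER, DateTime TEXT, Value REAL)'
)
# One index serves both purposes: fast lookup and uniqueness.
# The column order is an assumption, not taken from the slides.
conn.execute(
    'CREATE UNIQUE INDEX idx_values_stream_time '
    'ON "Values" (StreamID, DateTime)'
)
conn.execute(
    'INSERT INTO "Values" VALUES (?, ?, ?)', (1, "2009-01-15 12:00", -5.1)
)

# Ask SQLite how it would run a single-value lookup.
plan = conn.execute(
    'EXPLAIN QUERY PLAN SELECT Value FROM "Values" '
    "WHERE StreamID = ? AND DateTime = ?",
    (1, "2009-01-15 12:00"),
).fetchall()
plan_detail = " ".join(row[-1] for row in plan)
print(plan_detail)

# The same stream cannot hold two values for the same timestamp.
try:
    conn.execute(
        'INSERT INTO "Values" VALUES (?, ?, ?)', (1, "2009-01-15 12:00", -4.0)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("duplicate rejected:", duplicate_rejected)
```

The query plan names the index instead of reporting a full table scan, which corresponds to the "With Index" case above, and the uniqueness constraint guards against loading the same observation twice.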

Performance
40 million Value Database
Time to Query
  – One Value: 0.07 Sec
  – ~20k Values: 0.5 Sec
Data Volumes
  – GLEON ~90,000 new values per day
  – Currently storing 30 million values
  – Values table 2.6 GB
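
The figures on this slide support a rough back-of-the-envelope estimate, sketched below in Python. It rests on assumptions the slide does not confirm: that the 2.6 GB table size corresponds to the 30 million stored values, that 1 GB is 10**9 bytes, and that an indexed lookup behaves like a plain binary search. Real database indexes are usually B-trees, which need fewer steps.

```python
import math

values_stored = 30_000_000      # "Currently storing 30 million values"
table_bytes = 2.6e9             # "Values table 2.6 GB", assuming 1 GB = 10**9 bytes
new_values_per_day = 90_000     # "GLEON ~90,000 new values per day"

# Average storage per value, including whatever overhead the table carries.
bytes_per_value = table_bytes / values_stored

# Projected growth if the daily rate stays constant.
values_per_year = new_values_per_day * 365
growth_gb_per_year = values_per_year * bytes_per_value / 1e9

# "Time ~ Log(number of rows)": comparisons for a binary search
# over the 40 million row test database, versus a scan of every row.
rows = 40_000_000
binary_search_steps = math.ceil(math.log2(rows))

print(f"{bytes_per_value:.0f} bytes per value")
print(f"{values_per_year:,} new values per year")
print(f"{growth_gb_per_year:.2f} GB of growth per year")
print(f"{binary_search_steps} comparisons vs {rows:,} rows scanned")
```

Under those assumptions each value costs roughly 87 bytes, the table grows by a little under 3 GB per year, and a lookup needs about 26 comparisons where a scan would touch all 40 million rows.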

Software Development Gains
Software for one site works for all sites
Example: HTML
  – Many document formatting standards
  – HTML emerged as standard
  – Millions of websites can be read by one browser
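
The claim that software for one site works for all sites can be illustrated with one retrieval function that never mentions a site-specific table. This is a hedged sketch in Python's sqlite3 module. The helper names make_demo_db and get_series, the table layout, the variable name water_temp, and the sample readings are all hypothetical and are not part of the GLEON software.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database with two sites (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Streams (
            StreamID INTEGER PRIMARY KEY, Site TEXT, Variable TEXT
        );
        CREATE TABLE "Values" (
            StreamID INTEGER, DateTime TEXT, Value REAL,
            UNIQUE (StreamID, DateTime)
        );
    """)
    conn.executemany("INSERT INTO Streams VALUES (?, ?, ?)", [
        (1, "Mendota", "water_temp"),
        (2, "Long Lake", "water_temp"),
    ])
    conn.executemany('INSERT INTO "Values" VALUES (?, ?, ?)', [
        (1, "2008-06-01 12:00", 18.2),
        (1, "2008-06-02 12:00", 18.9),
        (2, "2008-06-01 12:00", 16.4),
    ])
    return conn


def get_series(conn, site, variable, start, end):
    """Return (DateTime, Value) pairs for one stream with start <= DateTime < end.

    Timestamps are ISO-style text, so string comparison matches time order.
    """
    query = (
        'SELECT v.DateTime, v.Value FROM "Values" AS v '
        "JOIN Streams AS s ON s.StreamID = v.StreamID "
        "WHERE s.Site = ? AND s.Variable = ? "
        "AND v.DateTime >= ? AND v.DateTime < ? "
        "ORDER BY v.DateTime"
    )
    return conn.execute(query, (site, variable, start, end)).fetchall()


conn = make_demo_db()
# The same function serves both sites; only the arguments differ.
mendota = get_series(conn, "Mendota", "water_temp", "2008-06-01", "2008-06-03")
long_lake = get_series(conn, "Long Lake", "water_temp", "2008-06-01", "2008-06-03")
print(mendota)
print(long_lake)
```

With one table per site, the same function would need the table name and column list rewritten for every deployment. Here a new site is only new rows.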

Current software for GLEON and Madison LTER: Data Acquisition

Data Retrieval: dbBadger.gleonrcn.org

Data QA/QC

Vision
Simple software package
  – No IT support required
  – Facilitate web-enabled data sharing
Future
  – Expand to all GLEON sites
  – Include those with custom IM system in place

Acknowledgements
This work was supported by National Science Foundation grants DEB , DBI , and DBI and by the Gordon and Betty Moore Foundation.
