Vegetation Data Management:

Presentation transcript:

Vegetation Data Management: VegBank John Harris - NCEAS January 9, 2002 Funding: National Science Foundation (DBI-9906838)

VegBank
Project organized and conducted by:
- Robert K. Peet, University of North Carolina
- Marilyn Walker, USDA Forest Service & U. Alaska
- Dennis Grossman, The Nature Conservancy / NatureServe
- Michael Jennings, USGS-BRD
- John Harris, NCEAS
Project supported by:
- National Center for Ecological Analysis & Synthesis
- U.S. National Science Foundation
- USGS-BRD Gap Analysis Program
- NatureServe / The Nature Conservancy

VegBank Design Goals
- Support the National Vegetation Classification.
- Provide a comprehensive facility to store the most commonly collected vegetation plot data attributes.
- Provide the user with a large number of user-defined attributes to store not-so-commonly collected data.
- Integrate plots with the dynamic plant taxonomy and vegetation community data.

Core elements of the National Plots Database
- Project
- Plot
- Plot Observation
- Taxon Observation
- Taxon Interpretation
- Plot Interpretation
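The core elements listed on this slide can be pictured as a small set of nested record types. The sketch below is only an illustration: the slide names the elements but not their relationships or fields, so the nesting (a project holds plots, a plot holds observations, and so on), every field name, and the sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaxonInterpretation:
    """A party's opinion on which taxon an observed name refers to (assumed fields)."""
    party: str
    taxon_name: str


@dataclass
class TaxonObservation:
    """A plant name as recorded on the plot, plus any later interpretations."""
    author_name: str
    interpretations: List[TaxonInterpretation] = field(default_factory=list)


@dataclass
class PlotInterpretation:
    """A party's assignment of the observed plot to a community type."""
    party: str
    community_name: str


@dataclass
class PlotObservation:
    """One sampling event on a plot."""
    start_date: str
    taxon_observations: List[TaxonObservation] = field(default_factory=list)
    interpretations: List[PlotInterpretation] = field(default_factory=list)


@dataclass
class Plot:
    author_plot_code: str
    observations: List[PlotObservation] = field(default_factory=list)


@dataclass
class Project:
    name: str
    plots: List[Plot] = field(default_factory=list)


# Hypothetical example data, not taken from any real plot.
observation = PlotObservation(start_date="2001-06-15")
observation.taxon_observations.append(
    TaxonObservation("Example name", [TaxonInterpretation("Party A", "Example taxon")])
)
observation.interpretations.append(PlotInterpretation("Party A", "Example community"))
project = Project("Example project", [Plot("P1", [observation])])
```

Keeping interpretations separate from observations means a later reinterpretation can be appended without touching what was recorded in the field.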

Partial view of the VegBank Entity-Relationship Diagram

VegBank Plots Database

Taxonomy Module
- Smithsonian meeting: Peet-Taswell model vs Berendsohn model
- FGDC Biological Nomenclature Working Group
- Update on NatureServe & HDMS
- Prospects for implementation

A usage represents a unique combination of a taxon and a name. Usages can be used to track nomenclatural synonyms.
(Diagram labels: Name, Usage, Taxon)
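One way to read the usage idea is as a set of (name, taxon) pairs: storing the pairs in a set keeps each combination unique, and listing every name attached to one taxon gives its synonyms. This is a minimal sketch under that reading; `Usage`, `UsageRegistry`, the identifiers, and the placeholder names are all hypothetical and are not part of the VegBank schema.

```python
from typing import List, NamedTuple, Set


class Usage(NamedTuple):
    """A unique combination of a name and a taxon."""
    name: str
    taxon_id: str


class UsageRegistry:
    def __init__(self) -> None:
        # A set guarantees that each (name, taxon) pair is stored only once.
        self._usages: Set[Usage] = set()

    def add(self, name: str, taxon_id: str) -> Usage:
        usage = Usage(name, taxon_id)
        self._usages.add(usage)
        return usage

    def __len__(self) -> int:
        return len(self._usages)

    def names_for(self, taxon_id: str) -> List[str]:
        """All names applied to one taxon, i.e. its synonyms."""
        return sorted(u.name for u in self._usages if u.taxon_id == taxon_id)


registry = UsageRegistry()
registry.add("Name A", "taxon-1")
registry.add("Name B", "taxon-1")
registry.add("Name A", "taxon-1")  # repeated pair, stored only once
```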

‘www.VegBank.org’ beta release March 2002
The bulk of the work to date has been on the ‘back-end’, and we have been waiting for the meeting to discuss the ‘front-end’ or interface issues. Maybe this should only be the back-end discussion?
- Conceptual diagrams showing client-server relationships
- Show the architectural/system diagrams and servlet information
- Question the importance of a standalone system and discuss the use of Morpho and beans
- Pinpoint the integration of the chosen tools into the system that has already been designed

Publication Analysis Extraction Archival Integration Collection

Development Cycle
- Preliminary design
- Build prototype #1 interface
- User evaluates
- Evaluation studied by designer
- Design modifications are made
- Prototype #n
Supported by 3 other NCEAS developers
- Database Design: Aug. 2000 – Jan. 2001
- Interface Design: Nov. 2000 – Feb. 2001
- Backend Development: Jan. 2001 –
- Interface Development: Mar. 2001 –
- Backend Version: Prototype 3
- Interface Version: Prototype 1
- Expected Beta Release: Late Sept. – Mid Oct.

Taxonomy Database Design Goals
- Logical separation of a "taxonomic name" from the "taxonomic concept", so that taxonomic data can be stored at the most 'atomic' level without ambiguity
- The ability to incorporate multiple organizations' 'views' of how a taxonomic name is applied to a taxonomic concept
- The ability to link a taxonomic name used in the Plots database with a 'name - concept' pair in the taxonomic database
*Although one can store vegetation community data in the same database table structure as the plant taxonomy database, we have implemented two separate table structures and have created two separate data sets.

Development Choices
Representative tools reflect the desire to have the following features:
- High performance
- Robust
- Open architecture
- Platform neutral
- Scalable
Recognition that tools will have to be flexible/scalable to act as a central server and a desktop client
Minimum hardware requirements
This is not locked, because the interface design may impact the use of some of these tools

Features - JAVA
- Java -- Write once, cross-platform – Linux, Windows, MacOS*
- Java Servlet -- Dynamic, database-driven web content
- JDBC -- Connect to any database - Oracle, PostgreSQL, SQL Server backend
- Swing -- Classy interface tools
- Beans -- Reusable components
Servlets: dynamic HTML generation, including media generated from database query results, logic, calculations, etc.
* Not tested yet :-)

Features - XML
- XML is the format for structured data on the Web.
- Simple and flexible data conversions, using XSLT
- Straightforward to write generic tools which export parts of a relational database as XML-encoded data, or even to write generic code that serializes Java (or other) objects as XML data structures.
Examples later…
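The claim that a generic database-to-XML exporter is straightforward to write can be illustrated with a short sketch. It is written in Python rather than the project's Java, and it uses an in-memory SQLite database as a stand-in for the Oracle, PostgreSQL, or SQL Server back ends named in the slides. The table, its columns, and the function name `query_to_xml` are assumptions made for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, root_tag, row_tag):
    """Run a query and wrap each result row in an XML element.

    Column names become child element names, so they must be valid XML names.
    """
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        row_element = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            child = ET.SubElement(row_element, column)
            child.text = "" if value is None else str(value)
    return root


# Hypothetical table and rows, used only to exercise the exporter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plot (authorPlotCode TEXT, plotShape TEXT)")
conn.executemany("INSERT INTO plot VALUES (?, ?)", [("P1", "square"), ("P2", "circle")])

xml_root = query_to_xml(
    conn,
    "SELECT authorPlotCode, plotShape FROM plot ORDER BY authorPlotCode",
    "plots",
    "plot",
)
xml_text = ET.tostring(xml_root, encoding="unicode")
```

Because the exporter reads column names from the cursor, the same function works for any query without table-specific code.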

An Example Workflow Using Wisconsin Plots Data
- What data integration means to us
- Taxonomic / Semantic Integration
- Data formatting for database ingestion
- General Comments about Current Format
- Data Parsing
- Transformation to XML standard
- Legacy Data Loader

Plots Data Integration & DB Ingestion
(Diagram labels: Research; Reformat by Hand; MS Access; MS Excel; Perl; Shell scripts; ?; Plots DB)
What is meant by data integration? …

Taxonomic Integration
(Diagram labels: Carya ovata (Miller) K. Koch; Carya carolinae-sept. (Ashe) Engler & Graebner; sec. Gleason 1952; sec. Radford et al. 1968)
Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.
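The ambiguity can be made concrete by storing each concept as a name plus the reference ("sec.") that defines it, and refusing to resolve a bare name when more than one concept carries it. The sketch below is hedged: the slide lists two names and two references, but the pairing used here (Gleason 1952 treating Carya ovata as one species, Radford et al. 1968 recognizing both names) is an assumption inferred from the "splitting one species into two" remark, and `Concept` and `resolve` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Concept(NamedTuple):
    name: str
    reference: str  # the "sec." (according to) source that defines the concept


OVATA = "Carya ovata (Miller) K. Koch"
CAROLINAE = "Carya carolinae-sept. (Ashe) Engler & Graebner"

# Assumed pairing of names and references; see the lead-in above.
CONCEPTS = [
    Concept(OVATA, "Gleason 1952"),
    Concept(OVATA, "Radford et al. 1968"),
    Concept(CAROLINAE, "Radford et al. 1968"),
]


def resolve(name: str, reference: Optional[str] = None) -> Concept:
    """Return the single concept matching a name, or raise if none or several match."""
    matches = [
        c for c in CONCEPTS
        if c.name == name and (reference is None or c.reference == reference)
    ]
    if not matches:
        raise LookupError(f"no concept found for {name!r}")
    if len(matches) > 1:
        references = ", ".join(c.reference for c in matches)
        raise LookupError(f"{name!r} is ambiguous; candidate references: {references}")
    return matches[0]
```

Calling `resolve(OVATA)` raises because two concepts share that name, while `resolve(OVATA, "Gleason 1952")` returns exactly one.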

Semantic Integration of Plot Attributes: ‘Basic yet Important’
- Cover Scales
- Strata Dimensions
- Environmental Attributes

Parse Data from Forms into Table Structure to be Transformed into XML Consistent with the Database Structure
(Diagram labels: Text Forms; Columnar Tables; XML)

Parsed Data
(Diagram labels: Text Forms; Columnar Tables)

Transform Parsed Data to XML Consistent with the Plots Database
(Diagram labels: Columnar Tables; Data Definition (XML); Legacy Data Loader; Plots DB XML; Plots DB)

Data Definition (XML) – Single file
<plotDataPackage>
  <plotDataFile>
    <fileName>siteData.csv</fileName>
    <attributeDelimeter>','</attributeDelimeter>
    <fileTheme> site data </fileTheme>
    <attribute>
      <attributeName>plotCode</attributeName>
      <plotDBAttribute>authorPlotCode</plotDBAttribute>
      <attributePosition>1</attributePosition>
    </attribute>
    <attribute>
      <attributeName>communityName</attributeName>
      <plotDBAttribute>communityName</plotDBAttribute>
      <attributePosition>2</attributePosition>
    </attribute>
    … <state>
  </plotDataFile>
</plotDataPackage>
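A data definition like the one on this slide could drive a generic loader: read the delimiter and the position-to-attribute mapping from the XML, then relabel each CSV column with its database attribute name. The sketch below follows that idea and is not the project's Legacy Data Loader. The element names come from the slide (including its spelling of `attributeDelimeter`), while the function name, the sample CSV rows, the assumption that the file has no header row, and the handling of the quoted delimiter are all illustrative.

```python
import csv
import io
import xml.etree.ElementTree as ET

# A complete, well-formed version of the slide's fragment (elided parts omitted).
DATA_DEFINITION = """\
<plotDataPackage>
  <plotDataFile>
    <fileName>siteData.csv</fileName>
    <attributeDelimeter>','</attributeDelimeter>
    <fileTheme>site data</fileTheme>
    <attribute>
      <attributeName>plotCode</attributeName>
      <plotDBAttribute>authorPlotCode</plotDBAttribute>
      <attributePosition>1</attributePosition>
    </attribute>
    <attribute>
      <attributeName>communityName</attributeName>
      <plotDBAttribute>communityName</plotDBAttribute>
      <attributePosition>2</attributePosition>
    </attribute>
  </plotDataFile>
</plotDataPackage>
"""


def load_rows(definition_xml, csv_text):
    """Map CSV columns to database attribute names using the data definition."""
    data_file = ET.fromstring(definition_xml).find("plotDataFile")
    # The slide writes the delimiter inside quotes, so strip them off.
    delimiter = data_file.findtext("attributeDelimeter").strip().strip("'")
    # 1-based column position -> database attribute name
    columns = {
        int(attribute.findtext("attributePosition")): attribute.findtext("plotDBAttribute")
        for attribute in data_file.findall("attribute")
    }
    rows = []
    for record in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        rows.append({name: record[position - 1].strip() for position, name in columns.items()})
    return rows


# Hypothetical file contents, assumed to have no header row.
SITE_DATA = "P1,Example community A\nP2,Example community B\n"
site_rows = load_rows(DATA_DEFINITION, SITE_DATA)
```

Because the mapping lives in the definition file, a differently laid out CSV needs only a new definition, not new loader code.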

Data Definition (XML) – Multiple files
<plotDataFile>
  <fileName>vegData.csv</fileName>
  <constraint>
    <fileName>siteData.csv</fileName>
    <themeName> site data </themeName>
    <attributeName>authorPlotCode</attributeName>
    <cardnality>'+'</cardnality>
  </constraint>
  <attributeDelimeter>','</attributeDelimeter>
  <fileTheme>species</fileTheme>
  <attribute>
    <attributeName>plotName</attributeName>
    <plotDBAttribute>authorPlotCode</plotDBAttribute>
    <attributePosition>1</attributePosition>
  </attribute>
  <attribute>
    <attributeName>scientificName</attributeName>
    <plotDBAttribute>taxonName</plotDBAttribute>
    <attributePosition>2</attributePosition>
  </attribute>
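The `<constraint>` element ties the species file to the site file through `authorPlotCode`. The slide does not spell out what the cardinality value means, so the sketch below assumes DTD-style symbols, with '+' read as "each plot code in the site file has one or more rows in the species file". The function name, the symbol table, and the sample codes are hypothetical.

```python
from collections import Counter
from typing import Iterable, List

# Assumed DTD-style meanings: symbol -> (minimum, maximum or None for unbounded)
CARDINALITIES = {"1": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}


def check_constraint(
    parent_keys: Iterable[str], child_keys: Iterable[str], cardinality: str = "+"
) -> List[str]:
    """Return a list of problems found when linking child rows to parent rows."""
    minimum, maximum = CARDINALITIES[cardinality]
    parents = set(parent_keys)
    counts = Counter(child_keys)
    problems = []
    # Child rows that point at a plot code missing from the parent file.
    for key in sorted(set(counts) - parents):
        problems.append(f"{key}: not present in the parent file")
    # Parent rows with too few or too many matching child rows.
    for key in sorted(parents):
        n = counts[key]
        if n < minimum or (maximum is not None and n > maximum):
            problems.append(f"{key}: {n} matching rows, expected '{cardinality}'")
    return problems


# Hypothetical plot codes from siteData.csv and vegData.csv.
site_codes = ["P1", "P2", "P3"]
veg_codes = ["P1", "P1", "P2", "P9"]
problems = check_constraint(site_codes, veg_codes, "+")
```

Running a check like this before loading surfaces orphaned species rows and plots with no species data while the files are still easy to correct.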

Plots Database XML
<plot>
  <authorPlotCode> </authorPlotCode>
  <plotType> </plotType>
  <samplingMethod> </samplingMethod>
  <plotOriginLat> </plotOriginLat>
  <plotOriginLong> </plotOriginLong>
  <plotShape> </plotShape>
  <plotSize> </plotSize>
  <plotSizeAcc> </plotSizeAcc>
  <altValue> </altValue>
  <altPosAcc> </altPosAcc>
  <slopeAspect> </slopeAspect>
  <slopeGradient> </slopeGradient>
  <slopePosition> </slopePosition>
  <hydrologicRegime> </hydrologicRegime>
  <soilDrainage> </soilDrainage>
  <surfGeo> </surfGeo>
  <plotObservation>
    <previousPlot> </previousPlot>
    <plotStartDate> </plotStartDate>
    <plotStopDate> </plotStopDate>
    <dateAccuracy> </dateAccuracy>
    <effortLevel> </effortLevel>
    <strata>
      <stratumCover> </stratumCover>
      <stratumHeight> </stratumHeight>
    </strata>
    <taxonObservations>
      <authNameId> </authNameId>
      <originalAuthority> </originalAuthority>
      <strataComposition>
        <strataType> </strataType>
        <percentCover> </percentCover>
      </strataComposition>
    </taxonObservations>
    <communityType>
      <classAssociation> </classAssociation>
      <classQuality> </classQuality>
      <startDate> </startDate>
      <stopDate> </stopDate>
    </communityType>
  </plotObservation>
  <plotContributor>
    <role> </role>
    <party> </party>
  </plotContributor>
</plot>
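Once the parsed data carry database attribute names, producing a document in the shape of this template is mostly a matter of nesting. The sketch below builds a small subset of the structure. The element names are taken from the slide, whereas the input dictionary layout, the function name `build_plot`, and the sample values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Plot-level elements copied straight from the record when present (subset of the slide's list).
PLOT_LEVEL_TAGS = ("authorPlotCode", "plotShape", "plotSize")


def build_plot(record):
    """Build a <plot> element from a parsed record (hypothetical dictionary layout)."""
    plot = ET.Element("plot")
    for tag in PLOT_LEVEL_TAGS:
        if tag in record:
            ET.SubElement(plot, tag).text = str(record[tag])
    observation = ET.SubElement(plot, "plotObservation")
    ET.SubElement(observation, "plotStartDate").text = record["plotStartDate"]
    for taxon in record["taxa"]:
        taxon_element = ET.SubElement(observation, "taxonObservations")
        ET.SubElement(taxon_element, "authNameId").text = taxon["name"]
        composition = ET.SubElement(taxon_element, "strataComposition")
        ET.SubElement(composition, "strataType").text = taxon["stratum"]
        ET.SubElement(composition, "percentCover").text = str(taxon["cover"])
    return plot


# Hypothetical record, not real plot data.
example = {
    "authorPlotCode": "P1",
    "plotShape": "square",
    "plotStartDate": "2001-06-15",
    "taxa": [{"name": "Example name", "stratum": "tree", "cover": 40}],
}
plot_element = build_plot(example)
plot_xml = ET.tostring(plot_element, encoding="unicode")
```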

Existing Prototype Functionality

Vegetation Database Client

Agenda:
- Over-Arching Concepts: Project Overview · Impact · Database Design · System Architecture · Challenges
- Use-Case Example: Wisconsin Data
- Data Management Recommendations
- Future Directions

Vegetation Desktop Database Client

Extra slides to follow:

General Data Management Practices
- general formats
- weird formats
- unusable formats
Modeled the software after the way that people collect plots data -- at least that is what I thought
At times a tortuous path to the database in terms of reformatting class indices (these are rectified at the plots loading software step)

Management Case: Example from Wisconsin
- Baraboo Hills -- Collected Yesterday
- PEL -- Legacy Data

Data Transformation of Forms