Vegetation Data Management:

1 Vegetation Data Management:
VegBank
John Harris - NCEAS
January 9, 2002
Funding: National Science Foundation (DBI )

2 VegBank
Project organized and conducted by:
Robert K. Peet, University of North Carolina
Marilyn Walker, USDA Forest Service & U. Alaska
Dennis Grossman, The Nature Conservancy / NatureServe
Michael Jennings, USGS-BRD
John Harris, NCEAS
Project supported by:
National Center for Ecological Analysis & Synthesis
U.S. National Science Foundation
USGS-BRD Gap Analysis Program
NatureServe / The Nature Conservancy

3 VegBank Design Goals
Support the National Vegetation Classification.
Provide a comprehensive facility to store the most commonly collected vegetation plot data attributes.
Provide the user with a large number of user-defined attributes to store not-so-commonly collected data.
Integrate plots with the dynamic plant taxonomy and vegetation community data.

4 Core elements of the National Plots Database
Project
Plot
Plot Observation
Taxon Observation
Taxon Interpretation
Plot Interpretation

10 Partial view of the VegBank Entity-Relationship Diagram

11 VegBank Plots Database

12 Taxonomy Module
Smithsonian meeting:
Peet-Taswell model vs Berendsohn model
FGDC Biological Nomenclature Working Group
Update on NatureServe & HDMS
Prospects for implementation

13 A usage represents a unique combination of a taxon and a name.
Usages can be used to track nomenclatural synonyms.
[Diagram: Name, Usage, Taxon]
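
The usage idea on this slide can be sketched as a small data structure. This is only an illustration, not the VegBank schema: the class name `Usage`, the field names, and the placeholder names and taxon identifiers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """One unique pairing of a name with a taxon (hypothetical field names)."""
    name: str
    taxon_id: str


def synonyms(usages, name):
    """Return the other names attached to any taxon that `name` is used for."""
    taxa = {u.taxon_id for u in usages if u.name == name}
    return sorted({u.name for u in usages if u.taxon_id in taxa and u.name != name})


# A set enforces the "unique combination" rule: adding the same pair twice
# leaves a single usage. All values below are placeholders.
usages = {
    Usage("Name A", "taxon-1"),
    Usage("Name B", "taxon-1"),
    Usage("Name C", "taxon-2"),
    Usage("Name A", "taxon-1"),  # duplicate pair, collapsed by the set
}
```

Here `synonyms(usages, "Name A")` finds "Name B" because both names are attached to the same taxon through separate usages.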

16 ‘www.VegBank.org’ beta release March 2002
Bulk of work to date has been on the ‘back-end’; we have been waiting for the meeting to discuss the ‘front-end’ or interface issues.
Maybe this should only be the backend discussion?
Conceptual diagrams showing client-server relationships.
Show the architectural/system diagrams - servlet information.
Question the importance of a standalone system and discuss the use of Morpho and Beans.
Pinpoint the integration of the chosen tools with the system that has already been designed.

21 [Diagram labels: Publication, Analysis, Extraction, Archival, Integration, Collection]

22 Development Cycle
Preliminary design
Build prototype #1 interface
User evaluates
Evaluation studied by designer
Design modifications are made
prototype #n
Supported by 3 other NCEAS Developers
Database Design: Aug – Jan. 2001
Interface Design: Nov – Feb. 2001
Backend Development: Jan
Interface Development: Mar –
Backend Version: Prototype 3
Interface Version: Prototype 1
Expected Beta Release: Late Sept. – Mid Oct.

24 Taxonomy Database Design Goals
Logical separation of a "taxonomic name" from the "taxonomic concept", so that taxonomic data can be stored at the most 'atomic' level without ambiguity.
The ability to incorporate multiple organizations' 'views' of how a taxonomic name is applied to a taxonomic concept.
The ability to link a taxonomic name used in the Plots database with a 'name - concept' pair in the taxonomic database.
*Although one can store vegetation community data in the same database table structure as the plant taxonomy database, we have implemented two separate table structures and have created two separate data sets.

25 Development Choices
Representative tools reflect the desire to have the following features:
High performance
Robust
Open architecture
Platform neutral
Scalable
Recognition that tools will have to be flexible/scalable to act as both a central server and a desktop client.
Minimum hardware requirements.
This is not locked, because the interface design may impact the use of some of these tools.

26 Features - Java
Java -- Write once; cross platform: Linux, Windows, MacOS*
Java Servlet -- Dynamic, database-driven web content
JDBC -- Connect to any database backend: Oracle, PostgreSQL, SQL Server
Swing -- Classy interface tools
Beans -- Reusable components
Servlets: dynamic HTML generation, including media generated from:
database query results
logic
calculations
etc.
* Not tested yet :-)

27 Features - XML
XML is the format for structured data on the Web.
Simple and flexible data conversions, using XSLT.
Straightforward to write generic tools that export parts of a relational database as XML-encoded data, or even to write generic code that serializes Java (or other) objects as XML data structures.
Examples later…

28 An Example Workflow Using Wisconsin Plots Data
What data integration means to us
Taxonomic / Semantic Integration
Data formatting for database ingestion
General Comments about Current Format
Data Parsing
Transformation to XML standard
Legacy Data Loader

29 Plots Data Integration & DB Ingestion
[Diagram labels: Research; Reformat by Hand; MS Access; MS Excel; Perl; Shell scripts; ?; Plots DB; Integration]
What is meant by data integration? …

30 Taxonomic Integration
Carya ovata (Miller) K. Koch
Carya carolinae-sept. (Ashe) Engler & Graebner
sec. Gleason 1952
sec. Radford et al. 1968
Splitting one species into two illustrates the ambiguity often associated with scientific names. If you encounter the name “Carya ovata (Miller) K. Koch” in a database, you cannot be sure which of two meanings applies.
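
The ambiguity described on this slide can be made concrete with a small lookup, sketched here under assumptions: a concept is identified by a (name, "sec." reference) pair, and the name alone is not enough. The function names are hypothetical, and the assignment of Carya carolinae-sept. to Radford et al. 1968 is an assumption for illustration, since the slide does not state which reference made the split.

```python
from collections import defaultdict

# (name, "sec." reference) pairs. Which reference carries the split
# circumscription is an assumption made for this illustration.
CONCEPTS = [
    ("Carya ovata (Miller) K. Koch", "Gleason 1952"),
    ("Carya ovata (Miller) K. Koch", "Radford et al. 1968"),
    ("Carya carolinae-sept. (Ashe) Engler & Graebner", "Radford et al. 1968"),
]


def build_index(concepts):
    """Group the references under which each name has been applied."""
    index = defaultdict(list)
    for name, reference in concepts:
        index[name].append(reference)
    return index


def resolve(index, name, reference=None):
    """Return the single (name, reference) concept, or raise if not unique."""
    candidates = index.get(name, [])
    if reference is not None:
        candidates = [r for r in candidates if r == reference]
    if len(candidates) != 1:
        raise LookupError(f"{name!r} matches {len(candidates)} concepts")
    return (name, candidates[0])
```

Calling `resolve` with only the name "Carya ovata (Miller) K. Koch" raises `LookupError`, while supplying the reference as well returns exactly one concept.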

31 Semantic Integration of Plot Attributes
‘Basic yet Important’
Cover Scales Strata Dimensions Environmental Attributes
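
As one illustration of what integrating cover scales can involve, the sketch below converts author-specific cover class codes to a common percent value. Everything in it is an assumption: the scale `AUTHOR_SCALE` is hypothetical, and taking the class midpoint is just one possible convention, not necessarily the one VegBank uses.

```python
# Hypothetical author-defined cover classes: code -> (lower %, upper %).
AUTHOR_SCALE = {
    "1": (0.0, 5.0),
    "2": (5.0, 25.0),
    "3": (25.0, 50.0),
    "4": (50.0, 75.0),
    "5": (75.0, 100.0),
}


def class_to_percent(code, scale):
    """Convert a cover class code to the midpoint of its percent range."""
    lower, upper = scale[code]
    return (lower + upper) / 2


def harmonize(records, scale):
    """Map (taxon, class code) records to (taxon, percent cover) records."""
    return [(taxon, class_to_percent(code, scale)) for taxon, code in records]
```

Keeping the scale as data rather than code means each contributing data set can ship its own table while the conversion logic stays the same.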

32 Parse Data from Forms into Table Structure to be Transformed into XML Consistent with the Database Structure
[Diagram labels: Text Forms, Columnar Tables, XML]

33 Parsed Data
[Diagram labels: Text Forms, Columnar Tables]

34 Transform Parsed Data to XML Consistent with the Plots Database
[Diagram labels: Columnar Tables, Legacy Data Loader, Plots DB XML, Data Definition (XML)]

35 Data Definition (XML) – Single file
<plotDataPackage>
  <plotDataFile>
    <fileName>siteData.csv</fileName>
    <attributeDelimeter>','</attributeDelimeter>
    <fileTheme> site data </fileTheme>
    <attribute>
      <attributeName>plotCode</attributeName>
      <plotDBAttribute>authorPlotCode</plotDBAttribute>
      <attributePosition>1</attributePosition>
    </attribute>
    <attribute>
      <attributeName>communityName</attributeName>
      <plotDBAttribute>communityName</plotDBAttribute>
      <attributePosition>2</attributePosition>
    </attribute>
    … <state>
  </plotDataFile>
</plotDataPackage>
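
A loader could use a definition like the one on this slide to map CSV columns onto Plots database attributes. The sketch below is not the Legacy Data Loader itself: the function names are hypothetical, the embedded definition wraps each attribute in its own `<attribute>` element and leaves out the elided part of the slide, and the sample rows in the usage note are invented.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's definition, without the elided attributes.
DEFINITION = """\
<plotDataPackage>
  <plotDataFile>
    <fileName>siteData.csv</fileName>
    <attributeDelimeter>','</attributeDelimeter>
    <fileTheme> site data </fileTheme>
    <attribute>
      <attributeName>plotCode</attributeName>
      <plotDBAttribute>authorPlotCode</plotDBAttribute>
      <attributePosition>1</attributePosition>
    </attribute>
    <attribute>
      <attributeName>communityName</attributeName>
      <plotDBAttribute>communityName</plotDBAttribute>
      <attributePosition>2</attributePosition>
    </attribute>
  </plotDataFile>
</plotDataPackage>
"""


def load_mapping(definition_xml):
    """Return (delimiter, {zero-based column index: plotDBAttribute})."""
    data_file = ET.fromstring(definition_xml).find("plotDataFile")
    # The slide quotes the delimiter, so strip the surrounding quotes.
    delimiter = data_file.findtext("attributeDelimeter").strip().strip("'")
    columns = {
        int(a.findtext("attributePosition")) - 1: a.findtext("plotDBAttribute")
        for a in data_file.findall("attribute")
    }
    return delimiter, columns


def map_rows(csv_text, delimiter, columns):
    """Turn each CSV row into a dict keyed by Plots database attribute."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return [{attr: row[i] for i, attr in columns.items()} for row in reader]
```

With invented input such as `"P001,Oak forest\n"`, `map_rows` yields a record whose keys are `authorPlotCode` and `communityName`, so the author's own column names never reach the database layer.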

36 Data Definition (XML) – Multiple files
<plotDataFile>
  <fileName>vegData.csv</fileName>
  <constraint>
    <fileName>siteData.csv</fileName>
    <themeName> site data </themeName>
    <attributeName>authorPlotCode</attributeName>
    <cardnality>'+'</cardnality>
  </constraint>
  <attributeDelimeter>','</attributeDelimeter>
  <fileTheme>species</fileTheme>
  <attribute>
    <attributeName>plotName</attributeName>
    <plotDBAttribute>authorPlotCode</plotDBAttribute>
    <attributePosition>1</attributePosition>
  </attribute>
  <attribute>
    <attributeName>scientificName</attributeName>
    <plotDBAttribute>taxonName</plotDBAttribute>
    <attributePosition>2</attributePosition>
  </attribute>
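
The `<constraint>` element ties the species file to the site file through a shared plot code. A check along these lines could run before loading. It is a sketch that assumes '+' means "one or more species rows per site row", which the slide does not spell out, and the function name is hypothetical.

```python
from collections import Counter


def check_cardinality(parent_keys, child_keys, cardinality="+"):
    """Return a list of problems; an empty list means the constraint holds.

    parent_keys: plot codes found in the constrained-to file (site data).
    child_keys: plot codes found in the constrained file (species data).
    """
    counts = Counter(child_keys)
    problems = []
    # Every child row must point at a plot that exists in the parent file.
    for key in sorted(set(child_keys) - set(parent_keys)):
        problems.append(f"{key}: no matching row in the parent file")
    # Assumed meaning of '+': each parent row needs at least one child row.
    if cardinality == "+":
        for key in sorted(set(parent_keys)):
            if counts[key] == 0:
                problems.append(f"{key}: parent row has no child rows")
    return problems
```

Reporting every problem at once, rather than stopping at the first, suits legacy data, where a contributor usually wants the full list of mismatched plot codes before reformatting.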

37 Plots Database XML
<plot>
  <authorPlotCode> </authorPlotCode>
  <plotType> </plotType>
  <samplingMethod> </samplingMethod>
  <plotOriginLat> </plotOriginLat>
  <plotOriginLong> </plotOriginLong>
  <plotShape> </plotShape>
  <plotSize> </plotSize>
  <plotSizeAcc> </plotSizeAcc>
  <altValue> </altValue>
  <altPosAcc> </altPosAcc>
  <slopeAspect> </slopeAspect>
  <slopeGradient> </slopeGradient>
  <slopePosition> </slopePosition>
  <hydrologicRegime> </hydrologicRegime>
  <soilDrainage> </soilDrainage>
  <surfGeo> </surfGeo>
  <plotObservation>
    <previousPlot> </previousPlot>
    <plotStartDate> </plotStartDate>
    <plotStopDate> </plotStopDate>
    <dateAccuracy> </dateAccuracy>
    <effortLevel> </effortLevel>
    <strata>
      <stratumCover> </stratumCover>
      <stratumHeight> </stratumHeight>
    </strata>
    <taxonObservations>
      <authNameId> </authNameId>
      <originalAuthority> </originalAuthority>
      <strataComposition>
        <strataType> </strataType>
        <percentCover> </percentCover>
      </strataComposition>
    </taxonObservations>
    <communityType>
      <classAssociation> </classAssociation>
      <classQuality> </classQuality>
      <startDate> </startDate>
      <stopDate> </stopDate>
    </communityType>
  </plotObservation>
  <plotContributor>
    <role> </role>
    <party> </party>
  </plotContributor>
</plot>
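
Producing a document in this shape from parsed records might look like the following. It is a minimal sketch covering only a few of the elements above. The function name and argument layout are hypothetical, and repeating `<strata>` and `<taxonObservations>` once per item is an assumption about how the structure is meant to be used.

```python
import xml.etree.ElementTree as ET


def plot_to_xml(author_plot_code, strata, taxa):
    """Build a partial <plot> element.

    strata: iterable of (cover, height) pairs.
    taxa: iterable of (auth_name_id, strata_type, percent_cover) triples.
    """
    plot = ET.Element("plot")
    ET.SubElement(plot, "authorPlotCode").text = author_plot_code
    observation = ET.SubElement(plot, "plotObservation")
    for cover, height in strata:
        stratum = ET.SubElement(observation, "strata")
        ET.SubElement(stratum, "stratumCover").text = str(cover)
        ET.SubElement(stratum, "stratumHeight").text = str(height)
    for auth_name_id, strata_type, percent_cover in taxa:
        taxon = ET.SubElement(observation, "taxonObservations")
        ET.SubElement(taxon, "authNameId").text = auth_name_id
        composition = ET.SubElement(taxon, "strataComposition")
        ET.SubElement(composition, "strataType").text = strata_type
        ET.SubElement(composition, "percentCover").text = str(percent_cover)
    return plot
```

`ET.tostring(plot, encoding="unicode")` then serializes the element, and because the tree is built through the library rather than by string concatenation, the output is always well-formed.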

38 Existing Prototype Functionality

39 Vegetation Database Client

40 Agenda:
Over-Arching Concepts
Project Overview · Impact · Database Design · System Architecture · Challenges
Use-Case Example: Wisconsin Data
Data Management Recommendations
Future Directions

44 Vegetation Desktop Database Client

49 Extra slides to follow:

52 General Data Management Practices
General formats
Weird formats
Unusable formats
The software was modeled after the way that people collect plots data -- at least that is what I thought.
At times the path to the database is tortuous in terms of reformatting class indices (these are rectified at the plots loading software step).

53 Management Case: Example from Wisconsin
Baraboo Hills -- Collected Yesterday
PEL -- Legacy Data

54 Data Transformation of Forms
