LTER Data Management
Margaret O’Brien
Santa Barbara Coastal Long Term Ecological Research (LTER) Project
Santa Barbara Channel Biodiversity Observation Network
LTER Network
o 25 sites, funded by NSF (1980)
o Independent research, locally driven
  – Primary production
  – Populations
  – Ecosystem cycling: organic, inorganic (nutrients)
  – Disturbance
o Considerable leveraging
o Partnerships: USDA, USFS
o Integrated data management
LTER Data Management
o Sites are the experts for their own data, and should work with the systems and partnerships that are available
o Sites collaborate as appropriate within the LTER to develop the software they need, and create economy of scale by converging practices and mechanisms
o Network sets goals and provides a collaborative atmosphere, but does not dictate mechanisms
o Cross-site comparisons began in the 1990s
o Developed XML-based data sharing mechanism
o EML released in 2003
LTER Data Management Today
o Active DM committee
  – “Best Practice” docs
  – ESIP member
o Central data catalog
  – DOI assignment
o Data package checking
o Automated code generation
  – R
  – Matlab
  – SAS
o Federation
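As an illustration of what "data package checking" can mean in practice, here is a minimal sketch of a structural check that compares a delimited data table against the column names and types its metadata declares. The function name `check_table`, the column names, and the sample data are all hypothetical; this is not the LTER checker itself, only an example of the general idea.

```python
import csv
import io


def check_table(expected, data_text, delimiter=","):
    """Return a list of problems found; an empty list means the checks passed.

    `expected` is a list of (column_name, cast) pairs taken from the metadata.
    """
    problems = []
    rows = list(csv.reader(io.StringIO(data_text), delimiter=delimiter))
    if not rows:
        return ["data table is empty"]
    header, body = rows[0], rows[1:]
    names = [name for name, _ in expected]
    # The header row must match the metadata's column names, in order.
    if header != names:
        problems.append(f"header {header} does not match metadata {names}")
    for line_no, row in enumerate(body, start=2):
        # Every record must have one field per declared column.
        if len(row) != len(expected):
            problems.append(
                f"line {line_no}: expected {len(expected)} fields, found {len(row)}"
            )
            continue
        # Every value must parse as the declared type.
        for (name, cast), value in zip(expected, row):
            try:
                cast(value)
            except ValueError:
                problems.append(
                    f"line {line_no}: {name}={value!r} is not a valid {cast.__name__}"
                )
    return problems


# Hypothetical metadata and data, for illustration only.
EXPECTED = [("site", str), ("fronds", int)]
GOOD = "site,fronds\nreef_a,12\nreef_b,30\n"
BAD = "site,fronds\nreef_a,12\nreef_b,many\nreef_c\n"

good_report = check_table(EXPECTED, GOOD)
bad_report = check_table(EXPECTED, BAD)
```

Returning a list of problems, rather than stopping at the first one, lets a data manager fix a package in a single pass.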
EML Overview
o Related specifications
  – FGDC
  – ISO
  – Darwin Core (DwC)
o Communities
  – OBIS, GBIF (Darwin Core Archive)
  – TDWG
o Features
  – Rich structural metadata allows automated ingestion
  – Active development community
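To make "rich structural metadata allows automated ingestion" concrete, the sketch below reads a delimited table using only what the metadata says about it: header lines, field delimiter, column names, and storage types. The XML is a heavily simplified, hypothetical fragment that borrows EML element names (`dataTable`, `attributeList`, `attributeName`, `storageType`, `numHeaderLines`, `fieldDelimiter`); it is not a schema-valid EML document, and the table, columns, and values are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified EML-style fragment (not schema-valid EML).
EML_FRAGMENT = """
<dataTable>
  <entityName>kelp_counts</entityName>
  <physical>
    <dataFormat>
      <textFormat>
        <numHeaderLines>1</numHeaderLines>
        <simpleDelimited>
          <fieldDelimiter>,</fieldDelimiter>
        </simpleDelimited>
      </textFormat>
    </dataFormat>
  </physical>
  <attributeList>
    <attribute>
      <attributeName>site</attributeName>
      <storageType>string</storageType>
    </attribute>
    <attribute>
      <attributeName>fronds</attributeName>
      <storageType>integer</storageType>
    </attribute>
    <attribute>
      <attributeName>depth_m</attributeName>
      <storageType>float</storageType>
    </attribute>
  </attributeList>
</dataTable>
"""

# Assumed mapping from storage type labels to Python casts.
CASTS = {"string": str, "integer": int, "float": float}


def read_table(eml_xml, data_text):
    """Parse data_text into typed records, driven only by the metadata."""
    table = ET.fromstring(eml_xml)
    text_format = table.find("physical/dataFormat/textFormat")
    header_lines = int(text_format.findtext("numHeaderLines", default="0"))
    delimiter = text_format.findtext("simpleDelimited/fieldDelimiter", default=",")
    columns = [
        (a.findtext("attributeName"), CASTS[a.findtext("storageType")])
        for a in table.findall("attributeList/attribute")
    ]
    reader = csv.reader(io.StringIO(data_text), delimiter=delimiter)
    rows = list(reader)[header_lines:]  # skip the declared header lines
    return [
        {name: cast(value) for (name, cast), value in zip(columns, row)}
        for row in rows
    ]


DATA = "site,fronds,depth_m\nreef_a,12,6.5\nreef_b,30,8.0\n"
records = read_table(EML_FRAGMENT, DATA)
```

Because nothing about the table is hard-coded in `read_table`, the same function can ingest any table whose metadata follows the same pattern. The automated R, Matlab, and SAS code generation mentioned on the previous slide presumably rests on the same principle.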
Santa Barbara Coastal LTER
Understand factors controlling structure and function of kelp forests of the Santa Barbara Channel
Santa Barbara Coastal LTER IMS
o Established protocols for handling and processing
o RDBMS -> XML
o Community standards and expertise
o Structural quality control for data tables and metadata
o Machine-readable data
o Federation
o Collection to Distribution
o Servers
o Backup & processing
o EML Metadata Export
o Field Data collection protocols
o Data Package Checking
o Data Package Design
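The "RDBMS -> XML" step can be sketched as follows: column definitions held in a relational database are read back and written out as an EML-style attribute list. The example uses an in-memory SQLite database and a hypothetical `kelp_counts` table purely for illustration. It does not describe the actual SBC LTER system, and the output is a simplified fragment rather than complete, schema-valid EML.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_attribute_list(conn, table):
    """Build a simplified EML-style <dataTable> element from a table's schema."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    # The table name is interpolated directly, so only pass trusted names.
    columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
    data_table = ET.Element("dataTable")
    ET.SubElement(data_table, "entityName").text = table
    attribute_list = ET.SubElement(data_table, "attributeList")
    for _cid, name, declared_type, *_rest in columns:
        attribute = ET.SubElement(attribute_list, "attribute")
        ET.SubElement(attribute, "attributeName").text = name
        # Reuse the declared SQL type as the storage type label (an assumption).
        ET.SubElement(attribute, "storageType").text = declared_type.lower()
    return data_table


# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kelp_counts (site TEXT, fronds INTEGER, depth_m REAL)")

element = table_to_attribute_list(conn, "kelp_counts")
xml_text = ET.tostring(element, encoding="unicode")
attribute_names = [a.findtext("attributeName") for a in element.iter("attribute")]
```

Generating metadata from the database schema keeps the exported XML in step with the tables it describes, which is one way to support structural quality control.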
Santa Barbara Channel BON IMS
o Adopted community protocols and standards
o Per LTER
o RDBMS -> XML
o Structural quality control
o Local catalog TBD
o Federation mechanism
o Servers
o Integrate ingested & de novo data
o EML Metadata Export
o Data Package Checking
o DataONE Member Node?
o Data Package Design
Possible Next Steps
o Outline potential strategies for posting data
  – Examine repositories
  – Exchange specification(s)
  – Multiple partners & endpoints? Consider impact on local groups
o Alternatives
  – Submit once to federated repository for broad retrieval
o Begin developing data quality requirements
  – Examine available tools and vocabularies
  – Define use cases