George E. Brown, Jr. Network for Earthquake Engineering Simulation
4th regular meeting of the NEES preservation advisory committee
Stanislav Pejša
AGENDA
Overview of infrastructure
Overview of year 4 - FY 2013
Data Seal of Approval
Plans for year 5 - FY 2014
Open Forum
OVERVIEW OF INFRASTRUCTURE
Infrastructure
Workflows
Metadata and documentation
NEES Infrastructure
14 engineering laboratories
Shake Tables
– University at Buffalo
– UC San Diego
– UN, Reno
Tsunami Wave Basin
– OSU
Geotechnical Centrifuges
– RPI
– UC Davis
Field Experiments
– UC Los Angeles
– UC Santa Barbara
– UT at Austin
Large Scale Laboratories
– Cornell University
– Lehigh University
– UC Berkeley
– UIUC
– UM, Twin Cities
Infrastructure workflow at NEES site
* these tasks are carried out by the RT at non-NEES sites
Curation workflow
curation is a process
– starts early
– "exit" interview
– data upload
– reminders
– data review
– experiment review
– copyright compliance
– preservation
– DOI management
File system structure - Metadata proxy
Collection of metadata and documentation
Auto complete
– Names
– Organization
– Facility
Check boxes
– Equipment
System-generated folders/Tabs
– Genre/Format
– Proxy for metadata
– Location on file system
– File format constraints
Templates
– Sensors
Forms
– Material properties
Metadata
– Names of researchers
– Affiliated organization
– Description
– Title
– Dates
– Testing facility
– Equipment
– Material properties
– Type of test
– Proper location
– Adequate file format
– Sensors
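The metadata elements listed on this slide can be pictured as a simple record with a completeness check, the kind of check a curator might run during data review. This is only a sketch: the class name `ExperimentMetadata`, the field names, and the helper `missing_fields` are hypothetical and do not describe the actual NEEShub data model.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ExperimentMetadata:
    """Hypothetical record mirroring the metadata elements on the slide."""
    title: str = ""
    description: str = ""
    researchers: list = field(default_factory=list)
    organization: str = ""
    dates: str = ""
    testing_facility: str = ""
    type_of_test: str = ""
    equipment: list = field(default_factory=list)
    material_properties: dict = field(default_factory=dict)
    sensors: list = field(default_factory=list)


def missing_fields(record):
    """Return the names of all fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


record = ExperimentMetadata(title="Example experiment", researchers=["Researcher 1"])
print(missing_fields(record))
```

A list of missing fields, rather than a single pass/fail flag, fits the reminder and review steps of the curation workflow, since it tells the research team exactly what is still outstanding.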
OVERVIEW OF YEAR 4 - FY 2013
Improved Communication
Improved Curation Performance
NEES research data publication
Some statistics as of
7289 registered users
200 research projects
– 101 of them are completed and curated
volume of uploaded files in research projects: GB (18.1 TB)
number of files in research projects:
files per research project: 8455
Some more statistics
… And more statistics
FY 2013               FY 2012               OVERALL
extension     files   extension     files   extension     files
.jpg                  .jpg                  .jpg
.txt          70060   .txt          41354   .txt
.msd          19385   .csv           3512   .bin          43011
.mat          17464   .png           3232   .log          30006
.bin           7925   seed           2806   .mat          20399
.nef           5057   .avi           1561   .msd          19385
.png           3976   .xls + .xlsx   1287   .csv          18274
.mp4           3369   .mat           1279   .png          18223
.csv           3238   .pdf           1042   .avi          15359
.pdf           2809   .zip            989   .zip          10168
.wmv           1740   .ini            917   .dat           9336
.xls + .xlsx   1669   info            819   .pdf           8111
.dat            655   .bin            803   .xls + .xlsx   6750
.out            604   .raw            701   .nef           5057
.thm            589   count           700   .mp4           3953
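A tally like the one above could be produced by counting file names by extension. The sketch below is an illustration only, not the script used for the slide: the function names, the sample paths, and the rule that a file without an extension is counted under its bare name (a guess at how rows such as "seed" and "info" arose) are all assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumption: .xls and .xlsx are reported together, as in the table.
MERGED = {".xlsx": ".xls"}


def extension_key(filename):
    """Return the lowercased extension, or the bare name if there is none."""
    path = PurePosixPath(filename)
    ext = path.suffix.lower()
    if not ext:
        return path.name.lower()
    return MERGED.get(ext, ext)


def count_by_extension(filenames):
    """Count files per extension key."""
    return Counter(extension_key(name) for name in filenames)


# Hypothetical sample paths, for illustration only.
sample = ["exp1/photo1.JPG", "exp1/photo2.jpg", "exp1/data.xls",
          "exp2/data.xlsx", "exp2/seed"]
for key, count in count_by_extension(sample).most_common():
    print(key, count)
```

Lowercasing the extension before counting keeps `.JPG` and `.jpg` in one row, which matters for camera output that often uses uppercase names.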
Improved communication
more staff - a new workflow curator + two graduate students
more frequent and earlier communication with RT
more frequent reviews
post-test review of curation requirements with RT (some sites)
curation handout with summary of requirements
escalation process
greater involvement of the Strategic Council
curation status - public now
Improved curation performance
Publishing NEES research data
(table columns: Date, # of issued DOIs)
A curated public dataset
Citation and Attribution
Recommended citation format
Researcher 1, Researcher 2, Researcher 3 (YYYY), “Experiment Title” Network for Earthquake Engineering Simulation (distributor), Dataset, DOI: /D3SQ8QH1F
Users of the data are expected to cite the data sets they used in the recommended format shown above and to include an acknowledgement to the NEES Data Repository.
To acknowledge the NEEShub Data Repository:
The facilities of the George E. Brown Network for Earthquake Engineering Simulation (NEES) Data Repository were used for access to data and metadata used in this study ( The NEES Data Repository is funded through the National Science Foundation and specifically the CMMI Directorate through the National Science Foundation under Cooperative Agreement Number CMMI
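As an illustration, the recommended citation pattern can be assembled from a dataset's metadata with a small helper. This is a sketch under stated assumptions: `format_citation` is a hypothetical function, and the researcher names, title, and DOI in the usage line are placeholders, not a real NEES dataset.

```python
DISTRIBUTOR = "Network for Earthquake Engineering Simulation (distributor)"


def format_citation(researchers, year, title, doi):
    """Build a citation string following the recommended pattern on the slide."""
    authors = ", ".join(researchers)
    return f"{authors} ({year}), “{title}” {DISTRIBUTOR}, Dataset, DOI: {doi}"


# Placeholder values for illustration only.
print(format_citation(["Researcher 1", "Researcher 2"], 2013,
                      "Experiment Title", "10.1234/EXAMPLE"))
```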
Data Publication at NEES
All recently curated experiments
– Have an assigned DOI
– Have improved metadata that facilitate discovery
Datasets are considered published information
The Earthquake Spectra journal is accepting a new type of manuscript called Data Papers.
– Peer-reviewed papers that describe datasets of interest to the earthquake community
– Data must be publicly available with a Digital Object Identifier (DOI)
– Inaugural issue of data papers in Earthquake Spectra
Discoverable NEES datasets
Exposure of the NEES data repository
DATA SEAL OF APPROVAL
DSA - Expected Timeline
August 2013, started
September, rough draft
October 7, DSA workshop
October 30, submission for review
November 30, results of the review
PLANS FOR YEAR 5 - FY 2014
Improving infrastructure
Amending current features
Preparing NEES data repository for transfer to a new awardee
Improvement of infrastructure
improving the workflow
support for submitting jobs to the grid environment
finalising work on the file registry
linking publications and metadata
introducing OAI-PMH or ResourceSync
automated data upload from several sites to the NEES data repository
improved delivery of the file format registry
loading keyword terms from ASCE into NEEShub
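For context on the OAI-PMH item: a harvester retrieves metadata by sending HTTP requests with a `verb` parameter to a repository's base URL. The sketch below only builds a `ListRecords` request URL and sends nothing over the network. The base URL is a made-up placeholder (the NEES repository had no such endpoint at the time of this plan), and `list_records_url` is a hypothetical helper; `oai_dc` is the Dublin Core format that every OAI-PMH repository must support.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no request is sent)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Selective harvesting: only records changed on or after this date.
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, for illustration only.
print(list_records_url("https://repository.example.org/oai", from_date="2013-10-01"))
```

The optional `from` argument is what makes incremental harvesting possible: an aggregator can ask only for records added or changed since its last visit.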
OPEN FORUM
Are there other grants/programs/centers that are comparable in terms of data collection, data sharing, preservation, and curation?