# 1 METADATA: A LEGACY FOR OUR GRANDCHILDREN N. Scott Urquhart STARMAP Program Director Department of Statistics Colorado State University.

Presentation transcript:

# 1 METADATA: A LEGACY FOR OUR GRANDCHILDREN N. Scott Urquhart STARMAP Program Director Department of Statistics Colorado State University

# 2 DISCLAIMERS
- The work reported here today was developed under the STAR Research Assistance Agreement CR awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the author and of STARMAP, the program he represents. EPA does not endorse any products or commercial services mentioned in this presentation.
- The people of CEER-GOM have heard parts of this presentation. Sorry. That presentation at Ocean Springs, MS (3/26/02) led to an invitation for this talk.

# 3 CONTEXT FOR COMMENTS
- SPACE-TIME AQUATIC RESOURCES MODELING AND ANALYSIS PROGRAM = STARMAP
- STARMAP IS FUNDED BY EPA’s STAR PROGRAM, AS ARE ALL OF THE EaGLes PROGRAMS (==> “SIBLING” PROGRAMS)
- STARMAP IS TO USE EMAP AS A DATA SOURCE AND CONTEXT
- NSU = STARMAP PROGRAM DIRECTOR AT CSU
  - 10 YEARS OF COLLABORATION WITH EMAP
  - 40+ YEARS AS STATISTICIAN WORKING WITH ECOLOGISTS

# 4 AN IMPORTANT LESSON
- YOU DO NOT KNOW WHAT YOUR DATA WILL BE USED FOR 20 YEARS FROM NOW
- BY THE TIME THE VARIOUS EaGLes PROGRAMS ARE COMPLETE, WE, AS TAXPAYERS, WILL HAVE INVESTED > $40M IN THE VARIOUS STUDIES
- THE RESULTING DATA NEED TO BE RESPONSIBLY AND READILY AVAILABLE TO FUTURE GENERATIONS

# 5 YOU DO NOT KNOW WHAT YOUR DATA WILL BE USED FOR 20 YEARS FROM NOW
- POPULAR PERSPECTIVE - WE “KNOW” LOTS ABOUT THE “ENVIRONMENT”
- REALITY: GOOD AQUATIC DATA IS SCARCE
  - SPATIALLY EXTENSIVE
  - OVER A REASONABLE TIME SPAN
  - WELL DOCUMENTED PROCEDURES
  - WELL TRAINED CREWS
  - CAREFULLY EXECUTED STUDIES
  - DATA PUBLICLY AVAILABLE

# 6 THE VALUE OF “METADATA”
- DATA WITHOUT CONTEXT ARE NUMBERS
  - NEARLY WORTHLESS TO OTHERS
  - How many file cabinets full of data are in your park offices?
- DATA WITH CONTEXT IS INFORMATION
  - CAN BE VALUABLE TO OTHERS
- CONTEXT IS CALLED METADATA

# 7 VERY DISCOURAGING EXPERIENCE WITH HISTORIC DATA
- THREE HISTORIC DATA SETS
  - NUTRIENTS IN NORTHEAST LAKES
    Larsen, D. P., N. S. Urquhart and D. Kugler (1995). Regional scale trend monitoring of indicators of trophic condition of lakes. Water Resources Bulletin 31:
  - E. COLI IN A RIVER BASIN IN OREGON
  - NUTRIENTS IN LAKES & STREAMS IN EPA REGION 10
- EMAP SURFACE WATERS
  - I THOUGHT THIS WAS WELL DOCUMENTED!

# 8 SO WHAT IS METADATA?
- BEST DEF’N SEEMS TO BE ORGANIZED “DATA ABOUT DATA”
- VERY DIVERSE VIEWS ABOUT WHAT IT SHOULD CONTAIN:
  - LIBRARIANS
  - W3 GROUP: DEFINING FEATURES OF THE WORLD WIDE WEB { title, description, publication date and author }
  - CENSUS-BUREAU TYPES, WORLDWIDE GEOGRAPHIC DATA STANDARDS
  - EPA’s STORET

# 9 WHAT IS METADATA GOOD FOR?
- A librarian probably would answer
  - Discovery
  - Managing the resource (ownership & responsibility)
    - ARCHIVING
    - AUTHENTICATING - QA/QC - UNCHANGING
    - GROWING
- This statistician answers
  - For correctly analyzing data in the future
  - Not discovery, but correct utilization
  - Paths to related documents based on the same dataset

# 10 METADATA COMPONENTS IMPORTANT TO A PERSON ANALYZING THE DATA
- NAME OF DATASET
- DEFINITION OF RESPONSES EVALUATED
- MOTIVATING FACTORS
- INTERNAL FEATURES OF DATASET
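The four components on this slide can be pictured as fields of a single record. The following is only a minimal sketch in Python: the class name `AnalystMetadata`, its field names, the helper `missing_components`, and the example values are all hypothetical and are not part of the presentation or of any metadata standard.

```python
from dataclasses import dataclass, fields


@dataclass
class AnalystMetadata:
    """Hypothetical record holding the four components named on the slide."""
    dataset_name: str
    response_definitions: str
    motivating_factors: str
    internal_features: str


def missing_components(record: AnalystMetadata) -> list[str]:
    """Return the names of components that are empty or whitespace only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Made-up example values, for illustration only.
example = AnalystMetadata(
    dataset_name="EXAMPLE-LAKES-SURVEY",
    response_definitions="Field and laboratory protocols: see operations manual",
    motivating_factors="",
    internal_features="Records keyed by site ID and visit date",
)

print(missing_components(example))  # ['motivating_factors']
```

A check like this only confirms that each component has been written down; it says nothing about whether the text is adequate for a future analyst.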

# 11 IMPORTANT METADATA COMPONENT: DATASET NAME
- IS THIS REALLY IMPORTANT?
- YES!
- IMPORTANT FINDINGS FROM A DATASET WILL BE PUBLISHED.
  - WE NEED TO ADOPT A CONVENTION THAT THE DATASET NAME IS A KEYWORD.
  - The name needs to be permanent and consistently used
  - THEN FUTURE INVESTIGATORS CAN USE STANDARD SEARCH TOOLS TO FIND INFORMATION EXTRACTED FROM EACH DATASET.
- MUCH LONGER LIVED THAN WEB LINKS

# 12 IMPORTANT METADATA COMPONENT: DATASET NAME { continued }
- Filtering criteria for data on which publication is based
  - Name of existing named subset
  - Geographic/temporal subset
  - Response subset
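One way to make the dataset name and the filtering criteria travel together with a publication is to build a single keyword line from them. This is a sketch under assumptions: the class `SubsetCitation`, its fields, the output format, and the example values are invented for illustration and do not come from the presentation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SubsetCitation:
    """Hypothetical description of the data a publication is based on."""
    dataset_name: str                          # permanent, consistently used name
    named_subset: Optional[str] = None         # name of an existing named subset
    geographic_temporal: Optional[str] = None  # geographic/temporal subset
    responses: Tuple[str, ...] = ()            # response subset

    def keyword_line(self) -> str:
        """Build one searchable line; filters that were not used are omitted."""
        parts = [f"Dataset: {self.dataset_name}"]
        if self.named_subset:
            parts.append(f"subset: {self.named_subset}")
        if self.geographic_temporal:
            parts.append(f"extent: {self.geographic_temporal}")
        if self.responses:
            parts.append("responses: " + ", ".join(self.responses))
        return "; ".join(parts)


# Made-up example values, for illustration only.
citation = SubsetCitation(
    dataset_name="EXAMPLE-LAKES-SURVEY",
    geographic_temporal="region A, 2001-2002",
    responses=("total phosphorus", "chlorophyll a"),
)
print(citation.keyword_line())
```

Because the dataset name always comes first and is spelled the same way each time, a standard text search on that name would find every publication that carries such a line.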

# 13 IMPORTANT METADATA COMPONENT: DEFINITION OF RESPONSES EVALUATED
- USE IT TO DOCUMENT
  - SITE SELECTION AND LOCATION
  - FIELD PROTOCOLS FOR GATHERING DATA & MATERIAL
    Peck DV, Lazorchak JM, Klemm DJ, editors EMAP Surface Waters: Western Pilot Study field operations manual for wadeable streams. Corvallis (OR): U.S. Environmental Protection Agency, Office of Research and Development. 275 p.
  - LABORATORY METHODS
  - QUALITY ASSURANCE/QUALITY CONTROL

# 14 IMPORTANT METADATA COMPONENT: MOTIVATING FACTORS
- WHAT WERE THE STUDY OBJECTIVES?
  - Scale = one page (perhaps a lot more in this context):
    - Specific objectives
    - Narrative on their origin
- WHY & HOW WERE THE SITES SELECTED?
  - From some population of sites (restrictions)
  - Purposefully
  - Good idea - accessibility of whole study plan

# 15 IMPORTANT METADATA COMPONENT: INTERNAL FEATURES OF DATASET
- LARGE DATASETS OFTEN CONSIST OF MANY SUB-DATASETS
- EG: EMAP MAHA DATA COLLECTION CONSISTS OF 42 SAS DATASETS
  - UNIQUE SITE IDENTIFICATION; TOGETHER WITH DATE OF SITE VISIT, DATA ARE UNIQUELY IDENTIFIED.
  - Why was this subset of the data constructed?
  - Who knows more about it?
  - Which responses are in which data sets? Be careful that values are the same in each data set
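The two cautions on this slide (records identified by site and visit date, and values agreeing across sub-datasets) can be checked mechanically. The sketch below uses plain Python lists of dictionaries rather than SAS datasets; the field names `site_id`, `visit_date`, and `lat`, the function names, and all values are assumptions made up for illustration.

```python
from collections import Counter

KEY = ("site_id", "visit_date")  # assumed key: site identifier plus visit date


def duplicate_keys(rows, key=KEY):
    """Return key values that occur more than once within one sub-dataset."""
    counts = Counter(tuple(row[k] for k in key) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def conflicting_values(rows_a, rows_b, field, key=KEY):
    """Return (key, value_a, value_b) where a shared field disagrees."""
    index_a = {tuple(r[k] for k in key): r[field] for r in rows_a if field in r}
    conflicts = []
    for row in rows_b:
        row_key = tuple(row[k] for k in key)
        if field in row and row_key in index_a and index_a[row_key] != row[field]:
            conflicts.append((row_key, index_a[row_key], row[field]))
    return conflicts


# Two made-up sub-datasets that both carry the site latitude.
chemistry = [
    {"site_id": "S001", "visit_date": "2000-07-01", "lat": 44.5, "ph": 7.1},
    {"site_id": "S002", "visit_date": "2000-07-02", "lat": 45.0, "ph": 6.8},
]
habitat = [
    {"site_id": "S001", "visit_date": "2000-07-01", "lat": 44.5},
    {"site_id": "S002", "visit_date": "2000-07-02", "lat": 45.1},
]

print(duplicate_keys(chemistry))                       # []
print(conflicting_values(chemistry, habitat, "lat"))   # one conflict, for S002
```

Exact equality is used here for simplicity; for measured values a tolerance may be more appropriate, which is itself a choice worth recording in the metadata.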

# 16 IMPORTANT METADATA COMPONENT: INTERNAL FEATURES OF DATASET (continued)
- Data dictionary
  - Usable paths to definition of variables
- METHODS USED TO DEAL WITH
  - NONDETECTS, MISSING OR LOST DATA, ETC.
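A data dictionary is only useful if every variable in the data has an entry and each measured variable says how nondetects and missing values were handled. The following sketch shows one possible check; the dictionary layout, the keys `definition`, `units`, `measured`, and `nondetect_rule`, and every value shown are hypothetical and not taken from EMAP or any other program.

```python
def undocumented_columns(columns, dictionary):
    """Return column names that have no data dictionary entry."""
    return sorted(c for c in columns if c not in dictionary)


def missing_nondetect_rule(dictionary):
    """Return measured variables whose entry records no nondetect handling."""
    return sorted(
        name
        for name, entry in dictionary.items()
        if entry.get("measured") and not entry.get("nondetect_rule")
    )


# Made-up data dictionary, for illustration only.
data_dictionary = {
    "site_id": {"definition": "Unique site identifier", "measured": False},
    "visit_date": {"definition": "Date of site visit", "measured": False},
    "total_p": {
        "definition": "Total phosphorus",
        "units": "ug/L",
        "measured": True,
        "nondetect_rule": "flagged in a separate column (hypothetical)",
    },
    "ph": {"definition": "pH of water sample", "measured": True},
}

columns_in_file = ["site_id", "visit_date", "total_p", "ph", "secchi"]

print(undocumented_columns(columns_in_file, data_dictionary))  # ['secchi']
print(missing_nondetect_rule(data_dictionary))                 # ['ph']
```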

# 17 THANK YOU FOR YOUR ATTENTION
- Acknowledgement: Nancy Chaffin, Metadata Librarian, Morgan Library, Colorado State University
- QUESTIONS and/or COMMENTS ARE WELCOME