Data Management Considerations for the International Polar Year

Presentation transcript:

Data Management Considerations for the International Polar Year
World Data Center for Glaciology, Boulder: facilitating the international exchange of snow and ice data
“In the midst of the present IGY, with its vast ramifications, its flood of observations, messages, reports, etc. threatening to overwhelm the individual, the just proudness of this … progress is mingled with a sentimental nostalgia [for] IPY2 … IPY2 was like chamber-music compared to the symphony of the present IGY.” Julius Bartels, Annals of the International Geophysical Year, Vol. 1, p. 205
Mark A. Parsons, Ronald L. Weaver, Ruth Duerr, and Roger G. Barry
American Geophysical Union, San Francisco, California, 14 December 2004

IPY1, IPY2, IGY (IPY3), IPY4? If the IGY was a symphony, what is IPY4, with its overwhelming volume of data? Let’s hope it’s not cacophony.

What will IPY4 bring?
Will you be able to find all the data relevant to your research and see relationships between data sets? Will you be able to retrieve IPY4 data in 2050? Will you be able to merge and integrate different data sets across experiments and disciplines? Will you be able to subset, visualize, and transform your data? Etc.
Data Management Considerations for IPY; Parsons, Weaver, Duerr, Barry; AGU, 14 December 2004

Organization of IPY Data Management
Diagram labels: Data Policy & Management Subcommittee; scientists; data managers; funding agencies; IPY Joint Committee; eGY; Programme Office; Data & Information Service (DIS); Users; Projects; Data Centers, Virtual Observatories, etc.
The organization comes from the IPY Framework document, based on recommendations from JCADM, CliC, and the ICSU Priority Area Assessment on Scientific Data and Information. Note that it is a service, not a system, but it still needs to be a portal. The framework recommends, and I am generally assuming, open and free access. The DIS is the “conductor” that ensures all data components are coordinated and follow best practices.

Systems and Innovation
Chart categories: Succeeded, “Challenged”, Failed. Risk is greater with size and complexity. Source: The Standish Group’s “CHAOS report”, an assessment of 40,000 IT application projects.
Be careful of a monolithic data system or overreliance on technology; build on what exists, as stated in the framework. A federated approach can encourage innovation with less risk, all bound together with a simple DIS. The DIS is the conductor, but we all need to know the music.

Organization of IPY Data Management
Diagram labels: Data Policy & Management Subcommittee; scientists; data managers; funding agencies; IPY Joint Committee; eGY; Programme Office; Data & Information Service; Users; Projects; Data Centers, Virtual Observatories, etc.
Seeking a step improvement in data management.

The People Part
“A striking proportion of project difficulties stem from people in both customer and supplier organisations failing to implement known best practice.” (Oxford University/Computer Weekly survey of public and private sector IT projects; emphasis added)
However, people are much more able to adapt to change, uncertainty, and messy systems. Systems don’t solve problems; people do, but they need standards and best practices. Service counts.

The People Part: Science and Data Management
Many have stated the need to involve scientists in data management, but it is also important to involve data managers in conducting science.
Field experiments: 20% increase in data quality (Parsons et al., 2004); 70% of experiment cost is data collection (Longley et al., 2001). Observing systems.
The NRC (repeatedly), the US Climate Change Science Program, JCADM, the PAA, and IPY agree on scientific involvement, which enhances usability. We also saw a ~20% improvement in data quality, with increased completeness and an improved data collection protocol, when data managers were involved in data collection for a large field experiment. This is especially important given all the experiments and new observing systems slated to be part of IPY.

Preservation and Access: Two Peas in a Pod
Scientific data stewardship:
“preservation and responsive supply of reliable and comprehensive data, products, and information for use in building new knowledge to…” (USGCRP, 1998)
“the long-term preservation of the scientific integrity, monitoring and improving the quality, and the extraction of further knowledge from the data” (H. Diamond et al., NOAA/NESDIS, 2003)
Archive needs, as described in OAIS, include fixity, integrity, etc.; access needs include catalog metadata, which the archive also needs. Preservation and access have overlapping metadata requirements and overlapping integrity requirements. Both are driven by changing user needs, and both are disrupted by changing technology. Therefore, both must be considered during planning, collection, processing, archiving…
“Scientific data and information management can no longer be viewed as a task for untrained amateurs or as part of routine ‘clean up’ at the completion of a research project.” (ICSU PAA)
However, access still needs work…
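The fixity and integrity needs mentioned for archives can be illustrated with a small checksum manifest. The following is only a sketch under stated assumptions: the function names (`sha256_of`, `record_fixity`, `verify_fixity`) are hypothetical, the choice of SHA-256 is an assumption, and nothing here is prescribed by the OAIS Reference Model or by the presentation.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a manifest mapping each file name to its digest (hypothetical helper)."""
    return {str(p): sha256_of(p) for p in paths}


def verify_fixity(manifest):
    """Return the names of files that are missing or whose digest has changed."""
    return [
        name
        for name, expected in manifest.items()
        if not Path(name).exists() or sha256_of(name) != expected
    ]
```

A manifest recorded when data are collected can be checked again at ingest and at any later access; an empty list from `verify_fixity` would suggest the files are unchanged, which serves preservation and access alike.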

Access. What is it?
Preservation requirements are well defined in the Open Archival Information System (OAIS) Reference Model, but there is no similar model for access requirements; eGY could help. There is not even a common definition of “access” and what restricts it. Social science data and non-digital collections (physical samples, photographs, audio, etc.) have unique access requirements.
Access needs relate to the portal. We have OAIS; now we need an access reference model. Its first elements should be a definition of open access and data integration. Access relates to data policy, capacity, association with publication, finding, data mining, and data integration.

Documentation
Use existing standards, e.g. the ISO 19115 metadata standard and the OAIS Reference Model. No new standards!
Describe uncertainty. Uncertainty and errors (data quality) are different things, e.g. uncertainty in an algorithm vs. errors in measurement. Some uncertainty is inherent and uncorrectable (Couclelis, 2003).
Challenge your assumptions. “We must not … start from any and every accepted opinion, but only from those we have defined: those accepted by our judges or by those whose authority they recognize.” (Aristotle, c. 350 BC) Lakoff and Johnson (1980) argue that people need a conceptual basis to understand something and that scientists invoke key metaphorical concepts to work observations into a clear, consistent structure.
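The distinction between uncertainty and error could be kept explicit in a data set's documentation, for example as below. This is a minimal sketch under stated assumptions: the class and field names (`DatasetRecord`, `QualityStatement`, `kind`, `correctable`) are hypothetical illustrations, not ISO 19115 elements, and the sample values in the usage note are invented.

```python
import json
from dataclasses import asdict, dataclass, field

ALLOWED_KINDS = ("uncertainty", "error")


@dataclass
class QualityStatement:
    kind: str          # "uncertainty" (e.g. in an algorithm) or "error" (e.g. in measurement)
    source: str        # where it arises
    description: str   # free-text explanation for future users
    correctable: bool  # some uncertainty is inherent and cannot be corrected


@dataclass
class DatasetRecord:
    title: str
    quality: list = field(default_factory=list)

    def add_statement(self, kind, source, description, correctable):
        """Attach a quality statement, keeping the two kinds distinct."""
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"kind must be one of {ALLOWED_KINDS}")
        self.quality.append(QualityStatement(kind, source, description, correctable))

    def statements_of(self, kind):
        """Return only the statements of one kind."""
        return [s for s in self.quality if s.kind == kind]

    def to_json(self):
        """Serialize the record so it can travel with the data."""
        return json.dumps(asdict(self), indent=2)
```

For instance, `record = DatasetRecord("Snow depth transects")` followed by `record.add_statement("uncertainty", "retrieval algorithm", "model assumes uniform density", correctable=False)` keeps algorithmic uncertainty separate from any measurement errors recorded later.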

The Data Itself
Formats: archives and users may have different needs. Consider four themes (Raymond, 2004): transparency, interoperability, extensibility, and storage or transaction economy. Audio and video raise unique considerations.
There will never be a single standard format, but some good examples are in the works (text, OGC, others). eGY could help here.

Data Management Considerations or Themes
In summary:
Manage technical innovation
Systems need people
Scientists and data managers working together
Preservation and access: two peas in a pod
The nature of the documentation
The nature of the data

Data Management Principles (bumper stickers)
A good bumper sticker is catchy and conveys much in few words.
Preservation without access is pointless; access without preservation is impossible.
It’s about DATA, not systems.
Involve scientists in data management and data managers in science.
Think about long-term archiving NOW!
Document uncertainty!
Keep things simple and flexible.
Consider the needs of current, future, and unknown users.

What’s Next?
The Data and Information Service should be created soon. The Data Sub-Committee needs to consider these themes and principles when developing the IPY data policy. If we don’t think about data and their stewardship now and continuously over the next several years, the results of the International Polar Year will be meaningless.