Data Preservation Imperatives: The Role of the US National Science Foundation. Lucy Nowell, Ph.D., Office of Cyberinfrastructure.

Presentation transcript:

1 Data Preservation Imperatives: The Role of the US National Science Foundation Lucy Nowell, Ph.D. Office of Cyberinfrastructure Conference on Permanent Access to the Records of Science Brussels, Belgium 15 November 2007

2 Outline: NSF Office of Cyberinfrastructure; Motivation for Data Preservation; Role of Universities and Academic Libraries; Characteristics of the Digital Age; NSF OCI Data Strategic Vision and Goals

3

4 NSF Act of 1950: “To promote the progress of science…” Encourage & develop a national policy for the promotion of basic research and education in the math, physical, medical, biological, engineering and other sciences. Initiate & support basic scientific research in the sciences.

5 U.S. President: Science Advisor; Office of Science and Technology Policy; Office of Management and Budget; Other boards, councils, etc. Major Departments: Agriculture; Commerce; Defense; Energy; Health and Human Services; Homeland Security; Interior. Independent Agencies: National Aeronautics and Space Administration; Environmental Protection Agency; Smithsonian Institution; Nuclear Regulatory Commission; Other agencies.

6 National Science Foundation: National Science Board; Director; Deputy Director. Research Directorates: Biological Sciences; Computer & Info. Science & Eng.; Education & Human Resources; Engineering; Geosciences; Mathematical & Physical Sciences; Social, Behavioral & Econ. Sciences. Offices: CyberInfrastructure; Integrative Activities; Polar Programs; International Science and Engineering.

7 New Modes of Investigation The conduct of science and engineering is changing and evolving. This is due, in large part, to the expansion of networked cyberinfrastructure … NSF Strategic Plan

8 Office of CyberInfrastructure (OCI). Dan Atkins, Office Director; José Muñoz, Dep. Office Dir.; Terry Langendoen; Lucy Nowell; Diana Rhoten; Kevin Thompson; Judy Hayden; Mary Daley; Irene Lombardo; Deborah White; Steve Meacham; Abani Patra. Program areas: Data; Learning & Workforce; Virtual Organizations; Software/Middleware; High Performance Computing.

9 Cyberinfrastructure … is the organized aggregate of technologies that enable us to access and integrate today’s information technology resources (data and storage, computation, communication, visualization, networking, scientific instruments, expertise) to facilitate science and engineering goals. - Fran Berman, Director, SDSC

10 CI Vision: 4 Interrelated Perspectives: Data, Data Analysis & Visualization; High Performance Computing; Collaboratories, Observatories & Virtual Organizations; Learning & Workforce Development

11 The Fragility of Memory in a Digital Age Report of the Task Force on Archiving of Digital Information Commission on Preservation and Access and the Research Libraries Group “In 1964, the first electronic mail message was sent from either MIT, the Carnegie Institute, or Cambridge University. The message does not survive, however, and so there is no documentary record to determine which group sent the pathbreaking message.”

12 NASA plans new search for missing moon tapes Aug. 15, 2006, 5:13PM Seth Borenstein, Associated Press WASHINGTON —NASA said today it was launching an official search for more than 13,000 original tapes of the historic Apollo moon missions.

13
Study | Resource type | Resource half-life
Koehler (1999 and 2002) | Random Web pages | 2.0 years
Nelson and Allen (2002) | Digital Library Object | 24.5 years
Harter and Kim (1996) | Scholarly Article Citations | 1.5 years
Rumsey (2002) | Legal Citations | 1.4 years
Markwell and Brooks (2002) | Biological Science Education Resources | 4.6 years
Spinellis (2003) | Computer Science Citations | 4.0 years
Source: Koehler W. (2004) Information Research, 9 (2), 174
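
To make the half-life figures concrete, here is a minimal sketch of what they imply for long-term availability. It assumes simple exponential decay, so that half of the remaining resources disappear every half-life; that model is an assumption of this example, not a claim made in the slide or the cited studies. The names `fraction_remaining`, `survival_table`, and `HALF_LIVES` are hypothetical.

```python
# Sketch only: assumes exponential decay, which the slide does not state.

# Half-lives in years, copied from the table on this slide.
HALF_LIVES = {
    "Random Web pages": 2.0,
    "Digital Library Object": 24.5,
    "Scholarly Article Citations": 1.5,
    "Legal Citations": 1.4,
    "Biological Science Education Resources": 4.6,
    "Computer Science Citations": 4.0,
}


def fraction_remaining(years: float, half_life_years: float) -> float:
    """Fraction of resources still reachable after `years` under exponential decay."""
    if half_life_years <= 0:
        raise ValueError("half_life_years must be positive")
    return 0.5 ** (years / half_life_years)


def survival_table(years: float) -> dict:
    """Expected surviving fraction for each resource type after `years`."""
    return {name: fraction_remaining(years, h) for name, h in HALF_LIVES.items()}


if __name__ == "__main__":
    for name, frac in survival_table(10).items():
        print(f"{name}: {frac:.1%} expected to remain after 10 years")
```

Under this assumed model, a resource type with a 2.0-year half-life would retain about 3% of its items after a decade, which illustrates why preservation cannot be left to chance.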

14 Replication of Results: A Cornerstone of Science “…the results of one scientist's experiment are not considered reliable until another scientist has replicated them. The reproducibility of results plays several different, crucial roles in science…[but] in many circumstances, considerations of time and money often make reproducibility impractical.” The Key Role of Replication in Science, Nancy S. Hall, The Chronicle of Higher Education, 10 November 2000

15 Replication of Results First and foremost, scientists attempt to reproduce someone else's experiment if they doubt that the results are accurate, or if the results contradict a view that is widely accepted in the field. An experiment is so reproducible that replicating it becomes a test of the student; if the student cannot replicate the experiment, it is the student who is at fault. As a training exercise, a new person [in a group] might be asked to repeat experiments that others have already performed, both to familiarize the newcomer with the work of the group and to give the older members a sense of the newcomer's expertise. The Key Role of Replication in Science, Nancy S. Hall, The Chronicle of Higher Education, 10 November 2000

16 Replication of Data Collection Not Always Feasible Medical experiments carried out over years or decades, involving hundreds or even thousands of human subjects. Events that are singular and beyond the experimenter's control, like comets, earthquakes, and volcanic eruptions. The Key Role of Replication in Science, Nancy S. Hall, The Chronicle of Higher Education, 10 November 2000

17 A Global Response “Ensuring research data are easily accessible, so that they can be used as often and as widely as possible, is a matter of sound stewardship of public resources.” Organization for Economic Cooperation and Development (OECD); “Promoting Access to Public Research Data for Scientific, Economic, and Social Development”

18 A Challenge for Society “If we are effectively to preserve for future generations the … corpus of information in digital form that represents our cultural record, we need … to commit ourselves technically, legally, economically, and organizationally to the full dimensions of the task.” Report of the Task Force on Archiving of Digital Information, 1996, Commission on Preservation and Access and the Research Libraries Group

19 The Universities “Ever since their inception, universities have been occupied with the fundamental elements of what we now call 'knowledge management', i.e. the creation, collection, preservation and dissemination of knowledge.” André Oosterlinck, Knowledge Management in Post-Secondary Education: Universities

20 The distinctive mission of the University is to serve society as a center of higher learning, providing long-term societal benefits through transmitting advanced knowledge, discovering new knowledge, and functioning as an active working repository of organized knowledge. Mission Statement of the University of California

21 The Academic Libraries “It is to the research library community that others will look for the preservation of … digital assets, as they have looked to us in the past for reliable, long-term access to the ‘traditional’ resources and products of research and scholarship.” Association of Research Libraries (ARL) Strategic Plan

22 Information is the currency of the digital age and information integration is the means for mobilizing that currency for discovery, innovation, learning, and progress.

23

24

25

26

27 Before the Digital Age: A World Constrained to 4 Dimensions [Figure: spatial axes x, y, z and time t]

28 CI: 5th Dimension [Figure: spatial axes x, y, z and time t, with CI labeled as the 5th dimension]

29 Opening a 5th dimension through cyberinfrastructure is the revolutionary force of the digital age …

30 Characteristics of a 5D World (in priority order): 1. Time and place are no longer barriers to participation and interaction. 2. Access is open to specialists and non-specialists alike. 3. Information is the primary driver for progress. 4. The realm of the possible is expanded through new capabilities, resources, and mechanisms.

31 Individuals, groups, organizations, and nations that don’t embrace the 5th dimension will fall behind in the digital age

32 “The World Is Flat” - Thomas Friedman. More room for innovation; new spaces for learning and discovery; expanded opportunities for collaboration and interaction; greater capabilities for research and education. “The flat world is expanding” - Anonymous OCI program director

33 NSF Draft Strategic Plan for Data, Data Analysis, and Visualization Chapter 3

34 Vision “Science and engineering digital data are routinely deposited in a well-documented form, are regularly and easily consulted and analyzed by specialists and non-specialists alike, are openly accessible while suitably protected, and are reliably preserved.” NSF Cyberinfrastructure Vision for 21st Century Discovery, Chapter 3

35 Goals To catalyze the development of a system of science and engineering data collections that is open, extensible and evolvable. To support development of a new generation of tools and services facilitating data acquisition, mining, integration, analysis, and visualization.

36 Principles: Data generated with NSF funding will be accessible and reliably preserved. Research/education opportunities determine investment priorities. Broad community engagement is necessary in reviewing and prioritizing data activities.

37 Principles (cont’d): Data is only useful if it can be found, understood, and analyzed. Legitimate privacy, confidentiality, and intellectual property rights must be protected. International, interagency, and public-private partnerships are essential.

38 Digital Data Preservation and Access Framework [Diagram labels: USER; Federal; State; Local; International; Non-profit; College; University; Commercial] Multi-Sector; Nimble; Sustainable; Reliable; User-centric

39 DataNet: A robust and resilient national and global digital data framework for preservation and access to the resources and products of the digital age. Provide reliable digital preservation, access, integration and analysis capabilities for science and/or engineering over a decades-long timeline: sustainability. Continuously anticipate and adapt to changes in technologies & user needs and expectations. Engage at the frontiers of science & engineering research & education, with research & development to drive the leading edge forward. Serve as component elements of an interoperable data preservation and access network, spanning national and international boundaries: shared governance and standards. Creation of new types of organizations that fully integrate all of these capabilities.

40 DataNet Partners: Combine expertise in library and archival sciences; computer, computational and information sciences; cyberinfrastructure; and domain sciences and engineering. Develop models for economic and technological sustainability over multiple decades. Engage at the frontiers of science and engineering research and education. Work cooperatively and in coordination to create a functional data network with revolutionary new capabilities for information access, use, and integration without regard to conventional barriers such as data type and format, discipline or subject area, and time and place/institution.

41 DataNet Partner Responsibilities: Provide for full data management life cycle: data deposition/acquisition/ingest; data curation & metadata management; data protection, including privacy; data discovery, access, use, & dissemination; data interoperability, standards, & integration; data evaluation, analysis, & visualization. Engage in research central to DataNet responsibilities. Education & training. Community & user input assessment. International engagement: collaborate & coordinate closely with preservation & access organizations to catalyze formation of a global data network. Foreign collaborators are expected to secure support from their own national sources.

42 Summary: Strategic Plan. Promote a change in culture; catalyze development of a national digital data framework; support new generations of tools, services, and capabilities.

43 NSFNet Traffic September 1991

44 The World Wide T=T0 = Data point-of-presence

45 The World Wide T=TN

46 The Whole Is Greater Than the Sum of Its Parts: Climate Change; Pandemic; Drought and Starvation; Sustainable Energy; Aging Populations; Human Behavior under Stress; Etc.

47 Thank you!