Background Chronopolis Goals Data Grid supporting a Long-term Preservation Service Data Migration Data Migration to next generation technologies Trust.

Slides:



Advertisements
Similar presentations
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Data Grids for Collection Federation Reagan W. Moore University.
Advertisements

Moving Forward With Digital Preservation at the Library of Congress Laura Campbell Associate Librarian for Strategic Initiatives Library of Congress.
Digital Preservation Lifecycle Management Building a demonstration prototype for the preservation of large-scale multi-media collections Arcot Rajasekar.
The National Digital Stewardship Alliance: Community, Content, Commitment.
Mairéad Martin, Penn State University Commons Solutions Group Storage Workshop May 2010.
SACNAS, Sept 29-Oct 1, 2005, Denver, CO What is Cyberinfrastructure? The Computer Science Perspective Dr. Chaitan Baru Project Director, The Geosciences.
Data Grid: Storage Resource Broker Mike Smorul. SRB Overview Developed at San Diego Supercomputing Center. Provides the abstraction mechanisms needed.
The Frame NSF-funded national supercomputer centers Centers have hosted significant projects: TeraGrid, NPACI, GEON, SCEC, Chronopolis Fostered development.
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO Disk and Tape Storage Cost Models Richard Moore & David Minor San Diego Supercomputer.
The Digital Preservation Network at UT Austin Chris Jordan Texas Advanced Computing Center.
INFSO-RI Enabling Grids for E-sciencE Grid & Data Preservation Boon Low System Development, EGEE Training National.
PREMIS in Thought: Data Center for LC Digital Holdings Ardys Kozbial, Arwen Hutt, David Minor February 11, 2008.
Chronopolis: Preserving Our Digital Heritage David Minor UC San Diego San Diego Supercomputer Center.
ADAPT An Approach to Digital Archiving and Preservation Technology Principal Investigator: Joseph JaJa Lead Programmers: Mike Smorul and Mike McGann Graduate.
Robust Tools for Archiving and Preserving Digital Data Joseph JaJa, Mike Smorul, and Mike McGann Institute for Advanced Computer Studies Department of.
Tools and Services for the Long Term Preservation and Access of Digital Archives Joseph JaJa, Mike Smorul, and Sangchul Song Institute for Advanced Computer.
14 July 2000TWIST George Brett NLANR Distributed Applications Support Team (NCSA/UIUC)
NASA World Wind. What is NASA World Wind? A richly interactive 3D planetary visualization tool. Smart client architecture. Portal for NASA data. Integrates.
Mike Smorul Saurabh Channan Digital Preservation and Archiving at the Institute for Advanced Computer Studies University of Maryland, College Park.
Richard MARCIANO Chien-Yi HOU School of Information and Library Science (SILS) Sustainable Archives & Leveraging Technologies Group (SALT) University of.
SAN DIEGO SUPERCOMPTER CENTERUC SAN DIEGO LIBRARIESNDIIPP PARTNERS MEETING David Minor SDSC Robert H. McDonald SDSC Sangchul Song UMIACS Bryan.
Simo Niskala Teemu Pasanen
Dogan Seber, PhD San Diego Supercomputer Center University of California, San Diego I. DLESE Library II. DISCOVER OUR EARTH Earth Science Resources for.
Trusted Datagrids: Library of Congress Projects with UCSD Ardys Kozbial – UCSD Libraries David Minor - SDSC.
UNIVERSITY of MARYLAND GLOBAL LAND COVER FACILITY High Performance Computing in Support of Geospatial Information Discovery and Mining Joseph JaJa Institute.
NSF Vision and Strategy for Advanced Computational Infrastructure Vision: NSF Leadership in creating and deploying a comprehensive portfolio…to facilitate.
Toward a Distributed and Collaborative Framework for Preservation Martin Halbert, UNT Dean of Libraries David Minor, Chronopolis Program Manager Katherine.
Computing in Atmospheric Sciences Workshop: 2003 Challenges of Cyberinfrastructure Alan Blatecky Executive Director San Diego Supercomputer Center.
Presenter: Karla Strieb Assistant Executive Director Transforming Research Libraries June 3, 2010 Supporting E-science: Progress at Research Institutions.
San Diego Supercomputer CenterUniversity of California, San Diego Preservation Research Roadmap Reagan W. Moore San Diego Supercomputer Center
CI Days: Planning Your Campus Cyberinfrastructure Strategy Russ Hobby, Internet2 Internet2 Member Meeting 9 October 2007.
Ensemble Computing in the National Science Digital Library (NSDL)
Semantic Cyberinfrastructure for Knowledge and Information Discovery (SCiKID) Proposal Principle Investigator: Eric Rozell Tetherless World Constellation.
DOE 2000, March 8, 1999 The IT 2 Initiative and NSF Stephen Elbert program director NSF/CISE/ACIR/PACI.
Libraries, Archives, and Digital Preservation: The Reality of What We Must Do Leslie Johnston Acting Director, National Digital Information Infrastructure.
Interoperability within the Grid NDIIPP Partners Meeting Arlington, VA July 9, 2008 Interoperability within the Grid Robert H. McDonald Digital Preservation.
Rule-Based Preservation Systems Reagan W. Moore Wayne Schroeder Mike Wan Arcot Rajasekar Richard Marciano {moore, schroede, mwan, sekar,
Grid Architecture William E. Johnston Lawrence Berkeley National Lab and NASA Ames Research Center (These slides are available at grid.lbl.gov/~wej/Grids)
National Partnership for Advanced Computational Infrastructure San Diego Supercomputer Center Persistent Archive for the NSDL Reagan W. Moore Charlie Cowart.
SEEK Welcome Malcolm Atkinson Director 12 th May 2004.
Data Integration and Management A PDB Perspective.
GRIDS Center Middleware Overview Sandra Redman Information Technology and Systems Center and Information Technology Research Center National Space Science.
Cyberinfrastructure What is it? Russ Hobby Internet2 Joint Techs, 18 July 2007.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
What is NDIIPP doing?. July 7 th, Web-At-Risk is opening its archives for public access, having captured nearly 6 TB of data—the entire CA State Government.
The Global Land Cover Facility is sponsored by NASA and the University of Maryland.The GLCF is a founding member of the Federation of Earth Science Information.
“A Library outranks any other one thing a community can do to benefit its people.” --Andrew Carnegie.
GEOSCIENCE NEEDS & CHALLENGES Dogan Seber San Diego Supercomputer Center University of California, San Diego, USA.
Chronopolis – MetaArchive Improving and Strengthening Inter-Institutional Preservation.
Introduction to The Storage Resource.
1 NSF/TeraGrid Science Advisory Board Meeting July 19-20, San Diego, CA Brief TeraGrid Overview and Expectations of Science Advisory Board John Towns TeraGrid.
Millman—Nov 04—1 An Update on Digital Libraries David Millman Director of Research & Development Academic Information Systems Columbia University
C HRONOPOLIS TM and the D IGITAL P RESERVATION I MPERATIVE Brian E. C. Schottlaender The Audrey Geisel University Librarian ECAR Symposium, 4 December.
Cyberinfrastructure Overview Russ Hobby, Internet2 ECSU CI Days 4 January 2008.
Cyberinfrastructure: Many Things to Many People Russ Hobby Program Manager Internet2.
Infrastructure Breakout What capacities should we build now to manage data and migrate it over the future generations of technologies, standards, formats,
A Project of the University Libraries Ball State University Libraries A destination for research, learning, and friends.
National Geospatial Enterprise Architecture N S D I National Spatial Data Infrastructure An Architectural Process Overview Presented by Eliot Christian.
The Earth Information Exchange. Portal Structure Portal Functions/Capabilities Portal Content ESIP Portal and Geospatial One-Stop ESIP Portal and NOAA.
SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO Tapping into National Cyberinfrastructure Resources Donald Frederick SDSC
SAN DIEGO SUPERCOMPUTER CENTER Replication Policies for Federated Digital Repositories Robert H. McDonald Chronopolis Project Manager
Managing live digital content with DuraSpace services Bill Branan PASIG Spring 2015.
Collection-Based Persistent Archives Arcot Rajasekar, Richard Marciano, Reagan Moore San Diego Supercomputer Center Presented by: Preetham A Gowda.
The National Digital Stewardship Alliance: Stewardship, Collaboration, Inclusiveness, Exchange.
Get Data to Computation eudat.eu/b2stage B2STAGE How to shift large amounts of data Version 4 February 2016 This work is licensed under the.
INTRODUCTION TO XSEDE. INTRODUCTION  Extreme Science and Engineering Discovery Environment (XSEDE)  “most advanced, powerful, and robust collection.
The National Digital Stewardship Alliance: Community, Content, Commitment.
Data Grids, Digital Libraries and Persistent Archives: An Integrated Approach to Publishing, Sharing and Archiving Data. Written By: R. Moore, A. Rajasekar,
Joseph JaJa, Mike Smorul, and Sangchul Song
Presentation transcript:

Background

Chronopolis Goals Data Grid supporting a Long-term Preservation Service Data Migration Data Migration to next generation technologies Trust Trust Agreements between sites Replication Replication of data at multiple, geographically distinct sites UCSD Libraries Data Providers (Data repositories, libraries, etc.)

Chronopolis: Basic Facts Based on a 3-node federated data grid at geographically separate sites Current capacity of up to 50 TB of data per node (150 TB total) Using Storage Resource Broker (SRB) for data management Using BagIt and SRB protocols to transfer data Using several monitoring tools: Auditing Control Environment (ACE), SRB Replication Monitor, SRB System Monitor Analyzing metadata that is created by the various parts of the system Writing best practices documents for clients and partners

Chronopolis Management Chronopolis is being developed by a national consortium led by SDSC and the UCSD Libraries. Initial Chronopolis nodes include: SDSC and the UCSD Libraries at UC San Diego University of Maryland Institute for Advanced Computer Studies (UMIACS) National Center for Atmospheric Research (NCAR) in Boulder, CO UCSD Libraries

SDSC Founded in 1985 with a $170 million grant from the National Science Foundation's Supercomputer Centers program, SDSC is an organized research unit of the University of California, San Diego Its staff of more than 300 includes professionals in multidisciplinary science and technology including software development; data management, analysis and preservation; visualization; high-end computing; and code optimization for a variety of scientific applications SDSC is a founding member of the the TeraGrid, a multi-year effort to build and maintain the world's most powerful and comprehensive distributed computational infrastructure for open scientific research.

NCAR

UMIACS Interdisciplinary research institute with a broad range of research programs at the interface between computer science and other disciplines Annual budget around $25M, primarily coming from NSF, DoD, NASA, NIH, and Industry Over 65 faculty from Computer Science, Engineering, Information Sciences, Linguistics, Life Sciences, and Social Sciences.

Chronopolis Lessons Complexities of multi-organizational model – MOUs, SLAs, multiple business offices, etc Benefit of staff – breadth and depth as well as redundancy Wider palette of tools and diversity of infrastructure

Membership

Current Chronopolis collections March 2009 Data Providers: Inter-university Consortium of Political and Social Research – preservation copy of all collections including 40 years of social science data and Census California Digital Library – political and government web crawls, Web-at-risk collection SIO Explorer – data from 50 years of research voyages NCSU Libraries -- State and local geospatial data

New collections and customers We’re looking for new customers! What kinds of users? – Chronopolis is: Large in scale Not designed as an access system Agnostic to content

New nodes We’re looking (possibly) for new nodes! Who would want to join? – Organizations which see the value in geographic replication – Organizations which have some expertise/infrastructure but not the whole enterprise Why would they want to join? – Gain large preservation environment – New working relationships with other communities (possibly geographically-based)

Extensibility versus Cloning

Advice Don’t re-invent the wheel: – Use existing tools and processes Focus on your targeted, unique collections: – Local content – Shared collection foci Understand the underlying technologies and which would be of benefit (e.g. SRB, BagIt, ACE)

Thanks