Lessons About Sustainability Learned from the Open Science Data Cloud
Robert Grossman, University of Chicago & Open Cloud Consortium



Bionimbus Protected Data Cloud
– Operated by University of Chicago
– 1+ PB controlled access cancer data
– Started in 2013

Open Science Data Cloud
– Operated by NFP Open Cloud Consortium
– 6 PB (pan science)
– Started in [...]
– [...] users that compute over data
– 150 active each month

Users utilize between 1,000 and 100,000+ core hours per month.
Same open source software stack.

Useful Models Today
Organizations and projects can contribute through a “condo model”.
Modest fees can be charged to those communities that can pay, including commercial entities.
PIs can add charges in their grants for data infrastructure, which are paid when they contribute their data to the infrastructure.
PIs can add charges in their grants which are paid when they make significant use of data infrastructure.
PIs can add charges in their grants to contribute capital expenditure (cap exp) to publicly funded repositories and infrastructure.
Federal agencies can provide grants to support publicly funded data infrastructure.
There are very important differences between small, medium and large scale data infrastructure.

Lessons Learned / Principles
Pay in at constant capital expenditure (cap exp), despite the fact that no one wants to pay and that there is an expectation that data infrastructure should be free.
Permanent IDs with queryable metadata are essential.
Portability of data and the data environment (at scale) is critical so researchers can liberate their data.
Peering of Tier 1 data repositories and infrastructure and interoperability of public data infrastructure are critical, but don’t get ahead of reference implementations.
Essential to keep costs down:
– Efficiency gained through medium scale infrastructure
– Open source software
– Use of infrastructure management and automation tools
– Misconceptions about the costs / tradeoffs of public cloud infrastructure