DuraCloud Enabling services for managing data in the cloud Michele Kimpton, CBO DuraSpace Bill Branan, Senior Developer DuraSpace.

Slides:



Advertisements
Similar presentations
What is Cloud Computing? Massive computing resources, deployed among virtual datacenters, dynamically allocated to specific users and tasks and accessed.
Advertisements

DuraSpace, Fedora and DuraCloud Triangle Research Libraries Network September, 2009.
DuraSpace: Digital Information All Ways, Always Pretoria, South Africa May 14 th, 2009.
Cloud Computing: Theirs, Mine and Ours Belinda G. Watkins, VP EIS - Network Computing FedEx Services March 11, 2011.
The Internet2 NET+ Services Program Jerry Grochow Interim Vice President CSG January, 2012.
Archiving research data in the cloud or in a local repository Michele Kimpton, CEO DuraSpace CNI Dec 2014.
The Digital Preservation Network at UT Austin Chris Jordan Texas Advanced Computing Center.
Building community clouds to support access to scholarship Michele Kimpton CEO, DuraSpace Jonathan Markow CSO, DuraSpace.
Hydra Partners Meeting March 2012 Bill Branan DuraCloud Technical Lead.
Moving libraries to Web scale Matt Goldner Product & Technology Advocate 14 June 2011.
1 The Australian Partnership for Sustainable Repositories Margaret Henty Digital Futures Industry Briefing November 8, 2006.
Cloud Computing (101).
Preservation In The Cloud Markus Wust NCSU Libraries.
An Introduction to DuraCloud Carissa Smith, Partner Specialist Michele Kimpton, Project Director Bill Branan, Lead Software Developer Andrew Woods, Lead.
Plan Introduction What is Cloud Computing?
Data-PASS Shared Catalog Micah Altman & Jonathan Crabtree 1 Micah Altman Harvard University Archival Director, Henry A. Murray Research Archive Associate.
Clouds on IT horizon Faculty of Maritime Studies University of Rijeka Sanja Mohorovičić INFuture 2009, Zagreb, 5 November 2009.
New Value from the DSpace Foundation and Fedora Commons Michele Kimpton and Sandy Payette Executive Directors DuraSpace.
Page  1 SaaS – BUSINESS MODEL Debmalya Khan DEBMALYA KHAN.
“ Does Cloud Computing Offer a Viable Option for the Control of Statistical Data: How Safe Are Clouds” Federal Committee for Statistical Methodology (FCSM)
Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over the Internet. Cloud is the metaphor for.
DuraCloud A service provided by Sandy Payette and Michele Kimpton.
SobekCM’s Community Ecosystems & Socio-Technical Practices Presented by Mark V. Sullivan June 10 th, 2014 Sobek image created by Jeff Dahl and is shared.
Hydra from 35,000ft Chris Awre Hydra Europe Symposium London School of Economics, 23 rd April 2015.
DuraCloud Managing durable data in the cloud Michele Kimpton, Director DuraSpace.
CLOUD COMPUTING. What is cloud computing ? History Virtualization Cloud Computing hardware Cloud Computing services Cloud Architecture Advantages & Disadvantages.
An Introduction to DuraCloud Michele Kimpton, Project Director Carissa Smith, Partner Specialist DuraSpace Webinar  Sept 2011.
Per Møldrup-Dalum State and University Library SCAPE Information Day State and University Library, Denmark, SCAPE Scalable Preservation Environments.
DuraCloud pilot program Michele Kimpton, CEO DuraSpace Richard Rodger, Dept Head Software development, M.I.T. Libraries Claire Stewart Dept Head Digital.
Texas Digital Library CENTRAL TEXAS AND SAN ANTONIO-AREA REGIONAL MEETING SEPTEMBER 5, 2013.
Electronic Records Management: A Checklist for Success Jesse Wilkins April 15, 2009.
The Global Video Grid: DigitalWell Update & Plan For SRB Integration Myke Smith, Manager Streaming Media Technologies University of Washington / ResearchChannel.
Mehdi Ghayoumi Kent State University Computer Science Department Summer 2015 Exposition on Cyber Infrastructure and Big Data.
& Collaborating to Build an Open Access Archive of Public Policy Research Coalition for Networked Information Task Force Meeting.
BMC Open Access Colloquium, 8 February Morgan: "Open Access Repositories"
Presented by: Presented by: Tim Cameron CommIT Project Manager, Internet 2 CommIT Project Update.
2009 Federal IT Summit Cloud Computing Breakout October 28, 2009.
The Canadian Information Network for Research in the Social Sciences and Humanities Tim Au Yeung and Mary Westell Libraries.
Digital Commons & Open Access Repositories Johanna Bristow, Strategic Marketing Manager APBSLG Libraries: September 2006.
Session 3.  Now you know WHY to make policies and WHAT they should contain…  But HOW do you implement policies?  And then HOW do you implement a program.
May 2, 2013 An introduction to DSpace. Module 1 – An Introduction By the end of this module, you will … Understand what DSpace is, and what it can be.
February, 2006 Open Repositories, Sydney, Australia Transition to a Broader Participation: Experience from the DSpace Project MacKenzie Smith MIT Libraries.
Spotlight on Cloud Computing February 2, 2011 A Vision for Cloud- and Earth- based Services at UC Davis 1.
WHAT OUR CUSTOMERS ARE SAYING “After thorough market research and a review process, Qorus Breeze Proposals stood out from the competitors because of its.
ScholarSpace & Open UH Mānoa March 2013 Beth Tillinghast Web Support Librarian ScholarSpace & eVols Project Manager UHM Library.
Technology for Social Justice Enhancing community sector service delivery Stefanie Kechayas – Senior Consultant 17 November 2015 SharePoint Connect and.
Millman—Nov 04—1 An Update on Digital Libraries David Millman Director of Research & Development Academic Information Systems Columbia University
From ePrints to eSPIDA: Digital Preservation at the University of Glasgow William J Nixon, Service Development DAEDALUS, University of Glasgow DPC: Digital.
DuraCloud Open technologies and services for managing durable data in the cloud Michele Kimpton, CBO DuraSpace.
DuraCloud for Dummies Should I stay or should I go [to the cloud]? ~Carissa Smith, DuraSpace.
The DuraCloud Workshop Your hosts: Bill Branan & Carissa Smith.
April 14, 2005MIT Libraries Visiting Committee Libraries Strategic Plan Theme III Work to shape the future MacKenzie Smith Associate Director for Technology.
Archiving and Preservation Michele Kimpton CEO, DuraSpace Bryan Beecher Director, ICPSR DuraSpace Webinar November 2, 2011.
CLOUD COMPUTING RICH SANGPROM. What is cloud computing? “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a.
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware A Cloud Computing Methodology Study of.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI strategy and Grand Vision Ludek Matyska EGI Council Chair EGI InSPIRE.
Store, Manage, and Archive Content in the Cloud Michele Kimpton, DuraSpace CEO DuraSpace Nate Klingenstein, Internet 2 Internet 2 meeting, April 2013.
Built on the Powerful Microsoft Azure Platform, Forensic Advantage Helps Public Safety and National Security Agencies Collect, Analyze, Report, and Distribute.
HUIT Cloud Initiative Update November, /20/2013 Ryan Frazier & Rob Parrott.
Managing live digital content with DuraSpace services Bill Branan PASIG Spring 2015.
Katherine Skinner, Martin Halbert & Matt Schultz Educopia Institute and MetaArchive Cooperative NDSA Infrastructure Committee
Fedora Commons Overview and Background Sandy Payette, Executive Director UK Fedora Training London January 22-23, 2009.
Preservation support and data management services built on cloud infrastructure Features:  Ingest once, and make multiple copies to multiple storage.
EGI-InSPIRE RI EGI Compute and Data Services for Open Access in H2020 Tiziana Ferrari Technical Director, EGI.eu
Agenda  What is Cloud Computing?  Milestone of Cloud Computing  Common Attributes of Cloud Computing  Cloud Service Layers  Cloud Implementation.
Recommendation 6: Using ‘cloud computing’ to meet the societal need ‘Faster and transparent access to public sector services’ Cloud computing Faster and.
Introduction to D4Science
Michele Kimpton Project Director, DuraCloud NDIPP Partner meeting
Archiving and preservation services in the cloud
Presentation transcript:

DuraCloud Enabling services for managing data in the cloud Michele Kimpton, CBO DuraSpace Bill Branan, Senior Developer DuraSpace

Agenda  What problem are we trying to solve  What is DuraCloud  Pilot  Timeline

Challenges (From our communities) Digital preservation and archiving is hard to achieve, even just basic replication Making digital content more accessible and useable to researchers Easy and elastic provisioning of shared infrastructure (also across institutions!) Robust compute environments for large indexing jobs, data mining and analysis of large datasets

What About the Cloud? A style of computing where massively scalable IT-related capabilities are provided “as a service” using Internet technologies to multiple external customers. (Gartner, 6/08).

More definitions… (UC Berkeley RAD Lab) Public Cloud The service being sold is “Utility Computing” Private Cloud Internal datacenters of a business or other organization not made available to the general public Community Clouds Networks of private clouds, available within a given community

Cloud services

Public Cloud Services Elastic web-based infrastructure for storage and compute

Focus group study Academic CIO’s and Library IT Institutions’ perceptions of current compute and storage costs Current market understanding, willingness to consider cloud computing DuraSpace value Need for DuraCloud services

Cloud Computing Today All participants possess some understanding of cloud compute and storage No institution using external cloud provider to manage or store collections today Majority of institutions think it is likely they will use cloud computing in some capacity in the next twelve months

Cloud Computing Barriers Long-term trustworthiness and sustainability of solution Access and reliability Data security and privacy concerns

DuraSpace Value Proposition “The community they have historically served brings a sense of trust and commitment to the market. Nothing is as powerful in higher education as strong references from the higher education community.” “We would work with DuraSpace over Sun or Google any day in this market. Given the size and the strength of the education sector overall (small relative to their other markets), we can never count on the big players to not cut the services that are key to us.” “We view DuraSpace as part of our community, not just a vendor. We have different roles, but we are all working on behalf of the same things.”

DuraCloud Proposition Trust and durability in the cloud DuraCloud is a service aimed at supporting libraries, universities, and other cultural heritage organizations that wish to provide perpetual access to their digital content. The service replicates and distributes content across multiple cloud providers and enables the deployment of services to support: * access * preservation * re-use

Preservation Services -Online backup in the cloud -ability to replicate content to multiple providers and locations -ability to synchronize backup with primary store or repository system -management,monitoring, audit and repair through web based interface Hosted by DuraSpace not-for-profit org Partnerships with cloud providers

Access and compute services

Application Exchange

What DuraCloud is not… Repository platform Hosting service Central archive or library You own and manage your data, we enable technology and services utilizing the cloud

Partners and Pilots Selected initial cloud providers Selected 2 initial pilot partners

Cloud partnerships Preferred pricing Co-Development of services Free storage for open data pool Early notification of architecture/API changes Immediate notification of security breach Possible enhanced SLA’s-data loss? Sponsorship of non-profit

NYPL pilot -back up copy 800k images (50 TB data) -transformation from Tiff to JPEG run image server in cloud -Push JPEG 2000 back into Fedora Repository Digital Gallery Collection

NYPL workflow

BHL pilot -back up copy entire corpus (40 TB data) from multiple sources -have multiple copies including Europe -Do compute intensive data mining over corpus BioDiversity Heritage Library

Pilot use cases NYPL Replication and preservation support Format conversion Instant provisioning Synchronization with repository BHL Replication and preservation support International collaborative infrastructure Researcher platform for data mining

Timeline Begin pilots(MOU’s in place) – September 2009 DuraCloud Alpha Pilot release- Oct 2009 Pilot data loading and testing – Fall 2009 Beta for repository community - Q Pilot testing with software services Q Cloud partner evaluations complete-Q Strategic cloud partnerships in place- Q Pricing Model determined-Q Report pilot results – Q Launch production service Q3 2010

Pilot Success  Can replicate content across 2 or more cloud providers through web interface  Can manage and check and repair content through web interface  Can perform at least one service on content  Migrated small, medium and large datasets into cloud ( up to 40 TB)  Established pricing strategy  Have integrated both DSpace and Fedora  Have multiple strategic cloud partnerships in place  Launch service mid 2010

Project Director- myself Senior Developer- Bill Branan Senior Developer-Andrew Woods Web Developer- Danny Bernstein Technical oversight- Brad McLean Fedora integration- Chris Wilper DSpace integration- Tim Donohue Marketing communications-Carol Minton Morris Core Team

Thank You For more information: DuraSpace Organization: Wiki: commons.org/confluence/display/duracloudpilot /