Curation Micro-Services: “It’s a Series of Tubes”

Presentation transcript:

Curation Micro-Services: “It’s a Series of Tubes”

The Unix philosophy
“Make each program do one thing well”
“To do a new job, build afresh rather than complicate old programs by adding new features”
“Expect the output of every program to become the input to another, as yet unknown, program”
“Design and build software … to be tried early”
“Don't hesitate to throw away the clumsy parts and rebuild them”
Source: M. D. McIlroy et al., “UNIX Time-Sharing System: Foreword,” Bell System Technical Journal 57:6, part 2 (1978): 1902

The micro-services “philosophy”

Curation micro-services
Metaphors: Pipeline; Lego bricks
Assumptions: Safety through redundancy; Meaning through context; Utility through service; Value through use (and reuse); Stewardship is a relay
Principles: Modularity; Granularity; Orthogonality; Emergence; Evolution; Parsimony
Preferences: The small and simple over the large and complex; The minimally sufficient over the feature laden; The configurable over the prescribed; The proven over the (merely) novel
Practices: Focus on outcomes, not means; Complexity through composition, not addition; Policy neutral, platform and protocol independent; Approach sufficiency through incrementally necessary steps; Early prototyping, frequent refactoring; Code to interfaces

Curation micro-services
Mode | Focus | Value | Service | Valence | Visibility
Curation Value Accretion Annotation UI / Access control / Message queuing Interoperation User-facing Visibility Notification Utility Accessibility Access Application Derivation Transformation Selectivity Search Actionability Index Stewardship Ingest Preservation Context Epistemology Characterization Interpretation Provider-facing Ontology Inventory State Reliability Replication Protection Fixity Stability Storage Identity

Design goals
Principle of least surprise
Multiple interface modalities
  – RESTful HTTP
  – Command line
  – Procedural (Java, Perl, Ruby, …)
Linked data
Stable URL references
The file system is the database
Storage service URL diagram: State or content (state/), Storage node (default), Object (1234), Version (3), File (xyz)
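The path fragment on this slide (state/, default/1234/3/xyz) suggests that a stable reference is assembled from a mode (state or content), a storage node, an object, a version, and a file. The following is a minimal sketch of that composition. The function name `storage_path`, the placement of the mode ahead of the node, and the percent-encoding of each component are assumptions made for illustration; none of them is taken from the storage service's specification.

```python
from urllib.parse import quote


def storage_path(mode, node, obj=None, version=None, file=None):
    """Build a relative URL path from the slide's components (hypothetical helper)."""
    if mode not in ("state", "content"):
        raise ValueError("mode must be 'state' or 'content'")
    parts = [mode, node]
    gap_seen = False
    # Object, version, and file are optional, but a deeper level
    # cannot be given when a shallower one is missing.
    for level in (obj, version, file):
        if level is None:
            gap_seen = True
            continue
        if gap_seen:
            raise ValueError("cannot skip a level in the hierarchy")
        parts.append(str(level))
    # Percent-encode each component so that "/" only ever separates levels.
    return "/".join(quote(part, safe="") for part in parts)


print(storage_path("state", "default", "1234", 3, "xyz"))
# prints: state/default/1234/3/xyz
```

Because every level of the hierarchy maps to one path segment, the same function yields references to a node, an object, a version, or a single file simply by omitting the trailing arguments.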

“You say micro, I say macro…”
Access, ANVL, ARK, bagit.pl, BagIt, CAN, checkm.pl, Checkm, Dflat, ERC, EZID, GhOST, Ingest, Inventory, LockIt, N2T, namaste.pl, Namaste, Noid, Pairtree, ReDD, RUU, Storage
Legend: Service, Tool, Convention

Development roadmap
Waves: First wave, Second wave, Third wave, Fourth wave, Fifth wave, Sixth wave
Services: Identity, Inventory, Index, Search, Notification, Annotation; Storage, Ingest / Access, Fixity, Replication; Characterization, Transformation
Related work: IDm / Authn / Authz; Metadata standards; Object / collection modeling; Semantic interoperability; Policy / business model development

Ingest process flow
Participants: Submitting user agent, Ingest, Inventory, Storage, Node, Identity
Messages: Submit; Create identifier; Identifier; Add version; Get version metadata; Version metadata; Notification; Version metadata; Get version metadata; Add version
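The message labels on this slide can be read as one possible call sequence: a submission reaches Ingest, which obtains an identifier from Identity, adds a version to Storage, and notifies Inventory, which then fetches the version metadata from Storage. The sketch below follows that reading. The ordering, the class and method names, the `id-N` identifier format, and the metadata fields are all assumptions for illustration, and the separate storage node is left out for brevity.

```python
import itertools


class Identity:
    """Mints identifiers (the 'id-N' format is hypothetical)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def create_identifier(self):
        return f"id-{next(self._counter)}"


class Storage:
    """Keeps versions in memory; stands in for the storage service."""

    def __init__(self):
        self._versions = {}  # identifier -> list of version metadata dicts

    def add_version(self, identifier, files):
        versions = self._versions.setdefault(identifier, [])
        metadata = {
            "identifier": identifier,
            "version": len(versions) + 1,
            "files": sorted(files),
        }
        versions.append(metadata)
        return metadata

    def get_version_metadata(self, identifier, version):
        return self._versions[identifier][version - 1]


class Inventory:
    """On notification, pulls version metadata from storage and catalogs it."""

    def __init__(self, storage):
        self._storage = storage
        self.catalog = {}

    def notify(self, identifier, version):
        metadata = self._storage.get_version_metadata(identifier, version)
        self.catalog[(identifier, version)] = metadata


class Ingest:
    """Coordinates the other services for one submission."""

    def __init__(self, identity, storage, inventory):
        self._identity = identity
        self._storage = storage
        self._inventory = inventory

    def submit(self, files):
        identifier = self._identity.create_identifier()          # Create identifier
        metadata = self._storage.add_version(identifier, files)  # Add version
        self._inventory.notify(identifier, metadata["version"])  # Notification
        return metadata


storage = Storage()
inventory = Inventory(storage)
ingest = Ingest(Identity(), storage, inventory)
print(ingest.submit(["b.txt", "a.txt"]))
```

Each service only calls the public methods of the others, so any one of them could be replaced without touching the rest.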

Ingest implementation
Components: Submitting user agent, Submitter, Consumer, Ingester, Storage, Queue
Implementation notes: HTML form; Servlet, implicitly multi-threaded; Servlet, implicitly multi-threaded; Dæmon, explicitly multi-threaded; Zookeeper dæmon
Data flows: Job metadata; Job payload; Submission notification; Ingest notification; Batch or single object
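The components named on this slide follow a producer and consumer pattern: a submitter places jobs on a queue, and explicitly created worker threads take jobs off the queue and hand them to an ingester. The sketch below illustrates only that pattern. It substitutes Python's in-process `queue.Queue` for the Zookeeper-backed queue, and the function names, the job dictionary layout, and the worker count are assumptions, not details of the actual implementation.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def ingest(job):
    """Stand-in for the ingester: report that the payload was handled."""
    return f"ingested {job['payload']}"


def run_pipeline(objects, workers=3):
    """Queue one job per object and process the jobs with worker threads."""
    jobs = queue.Queue()  # in-process substitute for the Zookeeper queue
    done = []
    lock = threading.Lock()

    def consumer():
        while True:
            job = jobs.get()
            if job is _STOP:
                break
            outcome = ingest(job)
            with lock:  # guard the shared result list
                done.append(outcome)

    threads = [threading.Thread(target=consumer) for _ in range(workers)]
    for thread in threads:
        thread.start()

    # Submitter: a batch is simply several single-object jobs.
    for obj in objects:
        jobs.put({"payload": obj})

    # One sentinel per worker, queued after all real jobs (FIFO order).
    for _ in threads:
        jobs.put(_STOP)
    for thread in threads:
        thread.join()
    return sorted(done)


print(run_pipeline(["a", "b", "c"]))
# prints: ['ingested a', 'ingested b', 'ingested c']
```

Placing a queue between submitter and ingester lets submissions return quickly while ingest proceeds at its own pace, which is one plausible reason for the design shown on the slide.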

Questions? silverpipes.jog / firstpresmacomb.org

More information
UC Curation Center (UC3)
Micro-service specifications
Digital curation group
UC3: Stephen Abrams, Patricia Cruse, Scott Fisher, Erik Hetzner, Greg Janée, John Kunze, Margaret Low, David Loy, Isaac Rabinovitch, Mark Reyes, Tracy Seneca, Marisa Strong, Perry Willett