Automation, Virtualization, and Integration of a Digital Repository Server Architecture or How to Deploy Three Production DSpaces in One Night.


Automation, Virtualization, and Integration of a Digital Repository Server Architecture, or How to Deploy Three Production DSpaces in One Night and Be Home for Dinner. TAMU Libraries Digital Initiatives: James Creel, Micah Cooper, Jeremy Huff. TCDL 2015, Austin, TX.

Talk Outline: (1) The OAK Trust digital repository. (2) Technical debts: the cost of customization; the evolution of server architecture. (3) A trio of innovations: automation of deployments, virtualization of infrastructure, modularization of customizations. (4) Lessons learned for IT in libraries.

The OAK Trust Digital Repository A brief overview and history

OAK Trust: A branded, customized DSpace instance hosted in-house at TAMU Libraries. Launched as “The Texas A&M University Digital Repository” in 2005 with an eye toward archiving ETDs. Rebranded with the launch of the OAK (Open Access to Knowledge) Fund, which underwrites TAMU researchers’ publication fees for open-access journals if they agree to submit their articles to the repository. Has grown to host ~70,000 items, including articles, books, maps, and photography from diverse sources.

OAK Trust - Hosting: From inception to 2013, hosted on dedicated Solaris hardware. Database, assetstore, and SSO authentication were each hosted on their own separate hardware.

OAK Trust Customizations: Over half a dozen XMLUI themes (TAMU, Image Gallery, Primeros Libros, Periodicals, Geofolios, ESL, Capstone, Fanzine). Extensive custom Java: expanding/collapsing community/collection browser; links to collection handles from the group listing on the profile page; recording context to keep the page on login; metadata-tree browser within a collection; export of metadata from search results.

Technical Debt: Cumulative Costs of Customization

Manually Upgrading DSpace: Preparations in Development Environment. Compare old and new configuration files line by line. Get a realistic duplicate of the production db. Run mvn package, ant install. Tweak configurations as needed; configuration tweaks include server addresses and directory paths. Test basic things: themes look good, widgets work, search and browse work, webapps run ok. This can be an iterative process instead of three steps; you might end up having to go back into development after problems become apparent in pre-production.

Problems with the Development Deployment: Configuration files are big, and the old config and new config must be compared line by line. Java files reference each other’s contents in structurally and nominally particular ways. A change to core code on which your customization depends requires that the customization be rewritten. Coding to Java interfaces helps, but interfaces change too.
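
The line-by-line comparison of old and new configuration files can be partly mechanized. Below is a minimal Python sketch, assuming the configuration is a flat properties-style file of `key = value` lines; the keys and values in the sample are illustrative placeholders, not TAMU's configuration. It reports which keys were removed, added, or changed, and does not handle multi-line (continued) values.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def diff_properties(old_text, new_text):
    """Return the keys removed, added, and changed between two configs."""
    old, new = parse_properties(old_text), parse_properties(new_text)
    return {
        "removed": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical old and new configuration contents, for illustration only.
OLD = """
# old release
dspace.hostname = repo-old.example.edu
db.url = jdbc:postgresql://db.example.edu/dspace
mail.server = smtp.example.edu
"""
NEW = """
# new release
dspace.hostname = repo-new.example.edu
db.url = jdbc:postgresql://db.example.edu/dspace
handle.prefix = 123456789
"""

report = diff_properties(OLD, NEW)
for kind, keys in report.items():
    print(kind, keys)
```

A report like this narrows the manual review to the keys that actually differ, but it does not replace reading the release notes for settings whose meaning changed.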

Manually Upgrading DSpace: Preparations in Pre-production. Mount assetstore and log directory. Run mvn package, ant install. Tweak configurations and environment as needed: configuration tweaks include server addresses and directory paths; environment changes may include java version, tomcat version, build tools versions. Test more extensive things: authentication works, communication with other servers works.

Problems with the Pre-production Deployment: A pre-production environment on a physically provisioned machine is rather different from your development one. We typically develop on Macs, and historically were deploying to Solaris. Surprises in the tweaks (e.g. “Oh, we need Java 1.7 not 1.6”) must be meticulously recorded in anticipation of the ultimate production deployment.

Manually Upgrading DSpace in Production: Announce plans for downtime to customers and family members. Mount assetstore and log directory. Run mvn package, ant install. Tweak configurations and environment as needed: configuration tweaks include server addresses and directory paths; environment changes may include java version, tomcat version, build tools versions. Test even more things: authentication still works, handle server works, statistics still showing up, all the webapps reply.
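
The "all the webapps reply" check is the kind of post-deployment test that is easy to script. The following is a hedged Python sketch, not the procedure used at TAMU: the base URL and webapp paths are hypothetical, and the demo uses a stand-in fetch function so that it runs without a live server. The real `http_status` helper would be used against an actual deployment.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no reply arrives."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:  # the server replied, but with an error status
        return err.code
    except (URLError, OSError):  # connection refused, DNS failure, timeout
        return None


def smoke_test(base_url, paths, fetch=http_status):
    """Return a list of (path, status) pairs for paths that did not reply 200."""
    failures = []
    for path in paths:
        status = fetch(base_url.rstrip("/") + path)
        if status != 200:
            failures.append((path, status))
    return failures


# Hypothetical webapp paths and canned replies standing in for a real server.
WEBAPP_PATHS = ["/xmlui", "/oai", "/solr"]
canned = {
    "https://repo.example.edu/xmlui": 200,
    "https://repo.example.edu/oai": 500,
}

failures = smoke_test("https://repo.example.edu/", WEBAPP_PATHS, fetch=canned.get)
print(failures)
```

Passing the fetch function as a parameter keeps the check testable offline; against a real host, call `smoke_test(base_url, WEBAPP_PATHS)` and treat a non-empty result as a failed deployment.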

Problems with the Production Deployment: The expanded to-do list from the pre-production deployment may be lengthy, and the team is then expected to perform the procedure identically on the production box with minimal downtime. A production environment on a physically provisioned machine is always at least a little different from the pre-production one.

Summary of Problems: Rewriting features to work after changes in the stock code base. Hardware and software environment differences. Reproducing an extensive, detailed process perfectly, by hand, late into the evening.

Three Remedies to the Deployment Problems: Relieving the burden of technical debt. Modularization of code, virtualization of infrastructure, automation of deployment.

Modularization of code. Problem: Rewriting custom features to work after changes in the stock code base. Solution: Separate out customizations and cleanly integrate them with core code. Solution in context: DSpace modules, modularizing XSL, pull requests to DuraSpace.

Modularization of code. DSpace modules: Since DSpace 3.x, core DSpace can be customized by overriding core files with custom files placed in a modules directory. Adding your own customizations need not disturb the core code base.
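
The override principle can be illustrated with a small Python sketch. This is a conceptual model of overlay precedence only, not DSpace's actual build logic; the directory layout and file names are hypothetical. A file in the modules directory wins over the core file at the same relative path, and everything else falls through to core.

```python
import tempfile
from pathlib import Path


def resolve(relative_path, core_dir, modules_dir):
    """Return the override from modules_dir if present, else the core file."""
    override = Path(modules_dir) / relative_path
    if override.is_file():
        return override
    return Path(core_dir) / relative_path


# Build a throwaway directory tree to demonstrate the precedence rule.
with tempfile.TemporaryDirectory() as tmp:
    core = Path(tmp) / "core"
    modules = Path(tmp) / "modules"
    (core / "themes").mkdir(parents=True)
    (modules / "themes").mkdir(parents=True)

    (core / "themes" / "page.xsl").write_text("stock page")
    (core / "themes" / "search.xsl").write_text("stock search")
    (modules / "themes" / "page.xsl").write_text("custom page")  # local override

    for name in ("themes/page.xsl", "themes/search.xsl"):
        chosen = resolve(name, core, modules)
        print(name, "->", chosen.read_text())
```

Because the core tree is never edited, an upgrade replaces it wholesale and only the files under the modules directory need review.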

Modularization of code. Modularizing XSL: We have continued in this principle of hierarchical modularization by putting empty placeholders in stock XSL, enabling extension in sub-themes. This has led to an extreme reduction in redundant code, with some files reduced by more than 90%. (Figure: before and after.)

Modularization of code. Pull requests to DuraSpace: Technical debt can be further reduced by adopting the open source mindset of developing for the larger community first, as opposed to an institutionally centric approach. If a custom feature is integrated into the core code, it need not be locally rewritten when upgrading.

Virtualization of Infrastructure. Problem: Server environments are inevitably unique and idiosyncratic on physically provisioned hardware for development, pre-production, and production. Solution: Deploy virtual machines with standardized environments, abstracting away hardware concerns. Solution in context: OpenStack, VMware, Vagrant.

Virtualization of Infrastructure. VMware: a framework for the creation and management of completely virtualized sets of hardware. Vagrant: a lightweight, reproducible, and portable virtual development environment.

Automation of Deployment. Problem: People make mistakes when forced to execute detailed procedures in a hurry, and it’s stressful anyway! Solution: Script the deployment so it is programmatically identical with each execution. Solution in context: Chef.

Automation of Deployment. Chef: “Infrastructure as Code”. Chef is a framework for the scripted automation of application deployment. It is versionable, testable, and repeatable.
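
The idea of a deployment that is "programmatically identical with each execution" can be sketched in a few lines of Python. This is not a Chef recipe and not the authors' tooling; it only illustrates the principle of running a fixed, versionable list of steps and stopping at the first failure. The `mvn package` and `ant install` commands come from the manual procedure above; the step names and the `run_steps` helper are hypothetical.

```python
import subprocess
import sys


def run_steps(steps, run=subprocess.run):
    """Run (name, command) steps in order; raise on the first non-zero exit."""
    completed = []
    for name, command in steps:
        result = run(command)
        if result.returncode != 0:
            raise RuntimeError(
                f"step {name!r} failed with exit code {result.returncode}"
            )
        completed.append(name)
    return completed


# The build steps from the manual procedure, kept as data so they can be
# reviewed and version-controlled. Not executed here: they need a DSpace
# source tree plus Maven and Ant on the PATH.
DEPLOY_STEPS = [
    ("build", ["mvn", "package"]),
    ("install", ["ant", "install"]),
]

# Demo with harmless stand-in commands so the sketch runs anywhere.
demo_steps = [
    ("first", [sys.executable, "-c", "pass"]),
    ("second", [sys.executable, "-c", "pass"]),
]
print(run_steps(demo_steps))
```

On a prepared build host, `run_steps(DEPLOY_STEPS)` would run the real steps. A configuration-management tool such as Chef goes further by describing the desired state of the whole machine, not just the order of commands.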

Amusing Anecdotes and Takeaways

Big Technical Changes are Expensive: Implementing a virtual infrastructure and automating deployment is a huge cultural and technical shift. Many stakeholders have to buy in to the long-term investment. Lots of work has to be done before any benefit is realized.

Production deployments of DSpace at TAMU are now fast! The work is nearly all front-loaded. The production deployment is a “one-click” process, undertaken with a higher degree of certainty.

Thanks for coming! Any questions? TAMU Libraries Digital Initiatives: James Creel, Micah Cooper, Jeremy Huff. TCDL 2015, Austin, TX.