CMR ECHO Transition and Client Collaboration

Presentation transcript:

CMR ECHO Transition and Client Collaboration Winter ESIP 2016 Jason Gilman The material is based upon work supported by the National Aeronautics and Space Administration under Contract Number NNG15HZ39C

CMR & ECHO At the EOSDIS Technical Interchange Meeting back in November, Dana Shum and Katie Baynes spoke about some of the changes under way to modernize the legacy ECHO components of the CMR. We’re moving some of the core capabilities from ECHO technologies to the more modern approaches used in the CMR. We’re going to get a lot of benefits out of this move, which I’ll detail along with some new features and REST APIs. At the same time, some of the older SOAP-based APIs are going away. I’m going to go into more detail here and explain why we’re making this move, how we’re doing it, and what benefits you’re going to see.

ECHO Prehistory To understand where ECHO and the CMR are today we have to jump back to the origins. Originally ECHO was a single monolithic application with a SOAP API. Providers sent their metadata over FTP to an Ingest process. Users would search for and order data through a client called WIST, which was a copy of an older client that had run at individual archive centers. Note: Describe the purpose of the colors here. Grey: legacy components. Blue: clients.

ECHO Middle Ages The next step in ECHO evolution brought some separate services which are shown in pink. We added REST APIs for ingesting, searching for, and ordering data. New response formats were added. Catalog REST and Elasticsearch were added for a new way to ingest and search. The Reverb client was developed around this time which was a significant improvement over the previous WIST client.

CMR and ECHO Together The next stage brings the addition of the CMR, here in green, which provides ingest and search capabilities. Every diagram I've shown so far has only added new things. Nothing has gone away with the exception of WIST. Providers can still use FTP Ingest. They can ingest through Catalog REST. And starting just recently they can ingest directly into the CMR. It all ends up in the CMR and is available for searching. All of the existing clients like Reverb can still work through their existing APIs. You can see here we have a mix of different technologies and approaches from three different eras all working together. We’ve got legacy monolithic Java applications in grey, Ruby applications here in red, and the newer microservices of the CMR.

Legacy System Problems It’s great that we’ve been able to continue to use our legacy applications for so long but there are a lot of limitations with their continued use.

Cost It’s expensive to maintain this system of legacy technologies. There are additional hardware costs to run the legacy applications. The legacy technologies require developers to be familiar with a larger set of code bases, which means we have to spend longer fixing things. Throughout the entire lifetime of the CMR we’ve had a team working on it but we also have several people dedicated to the legacy systems. Those additional developers are a cost both in terms of their salary and in terms of opportunity. They’re spending time maintaining an old system when we could be adding new value to the earth science community through new features.

Complexity Maintaining a system with that number of legacy components adds a lot of complexity. The size of the boxes I showed in the diagram before doesn’t represent the complexity of an individual box. The CMR services are relatively simple microservices that focus on doing one thing well. They’re easy to deploy and run. The legacy components represent very complex, decades old applications with many 10s to 100s of thousands of lines of code. They use different older technologies which means that developers have to be familiar with all of the different technologies being used.

Holding Us Back The limitations of the legacy components prevent us from making some fundamental improvements in the CMR. The CMR utilizes legacy ECHO code for orders, authentication, authorization and other things. The next couple diagrams will help demonstrate why that’s an issue.

Reliability: Degraded Service Events (DSE) since CMR went live This is a diagram showing counts of degraded service events (aka outages) since the CMR went live, categorized by cause. (Explain each column) You can see that the largest cause of outages was in ECHO code and related software while the newer CMR approaches have not yet been the source of any outages. When one of these outages occurs it’s not like just the legacy component itself goes down. The legacy systems are still a fundamental part of the CMR. When they don’t work it means that the CMR doesn’t work, so users can’t perform searches and data providers can’t ingest data. Categories: CMR: caused by a code problem in the CMR or a software issue in the CMR application servers and third-party libraries. ECHO Scalability: caused by the system’s inability to scale to handle many requests. Process: caused by someone making a mistake in a process that was not automated. Shared Infrastructure: the cause was in the infrastructure (load balancers, Puppet management, NASA network, etc.). ECHO: caused by a code problem in ECHO or a software issue in the ECHO application servers and third-party libraries.

Granule Search Performance This graph shows the performance improvement when the CMR went live. CMR is fast in spite of ECHO. We have to jump through hoops to make the CMR fast. We need to enforce access control rules during a query. Fetching this data from ECHO takes about a minute. We have to make sure to cache that, keep the cache warm, and only ever fetch it on a background thread. These legacy systems just weren’t designed for the performance requirements the CMR has.
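The caching approach described above can be sketched roughly as follows. This is a minimal illustration, not the CMR's actual implementation: the `WarmCache` class, its method names, and the shape of the cached data are all assumptions made for the example. The point is that queries only ever read the cached copy, while the slow fetch is repeated on a background thread to keep the cache warm.

```python
import threading
from typing import Callable, Dict, Optional


class WarmCache:
    """Hypothetical cache whose slow fetch never runs on a query's thread."""

    def __init__(self, fetch: Callable[[], Dict[str, list]], refresh_seconds: float):
        self._fetch = fetch                      # slow call (stand-in for the ECHO fetch)
        self._refresh_seconds = refresh_seconds
        self._lock = threading.Lock()
        self._value: Optional[Dict[str, list]] = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        # Load once at startup, before any query is served, then keep
        # refreshing in the background.
        self._refresh()
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def get(self) -> Dict[str, list]:
        # Queries read the cached copy and never trigger the slow fetch.
        with self._lock:
            if self._value is None:
                raise RuntimeError("cache not started")
            return self._value

    def _refresh(self) -> None:
        value = self._fetch()                    # slow work happens outside the lock
        with self._lock:
            self._value = value

    def _run(self) -> None:
        # Event.wait returns True once stop() is called, which ends the loop.
        while not self._stop.wait(self._refresh_seconds):
            self._refresh()
```

A query handler would call `get()` and apply the returned rules; because the swap under the lock is only a reference assignment, readers are never blocked for the duration of a fetch.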

Transition Goals We are going to transition the legacy components of ECHO into the CMR.

Incremental Rollout We’re not going to do a big bang release. We’re going to go capability by capability and transition each one to the CMR.

Minimize Impact* We’re going to do what we can to minimize the impact of the change on clients and users. That means your data isn’t going to disappear. That also means you shouldn’t see downtime during the switchover. If you’re using one of the currently supported REST APIs it should seem like nothing has changed. * Legacy SOAP APIs are retiring in September 2016.

Safety Net We’re planning an incremental rollout using techniques that have worked before. But part of making a big transition like this is identifying risks and planning for what could go wrong. When transitioning a capability if we find a serious bug that we somehow missed in testing in various environments we have a toggle that we can throw to switch back to the old implementation. That should be transparent to clients and users. We probably won’t have to do it but it’s insurance in case of problems.
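The toggle idea can be illustrated with a small sketch. Everything here is hypothetical (the `CapabilityRouter` class, the capability names, and the handler functions are invented for the example, and the real switch may work quite differently); it only shows how a per-capability flag lets callers keep using the same entry point while the implementation behind it is switched or rolled back.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class CapabilityRouter:
    """Hypothetical router: one toggle per capability, legacy by default."""

    def __init__(self, legacy: Dict[str, Handler], cmr: Dict[str, Handler]):
        self._legacy = legacy
        self._cmr = cmr
        self._use_cmr: Dict[str, bool] = {}

    def switch_to_cmr(self, capability: str) -> None:
        if capability not in self._cmr:
            raise KeyError(f"no CMR implementation for {capability}")
        self._use_cmr[capability] = True

    def roll_back(self, capability: str) -> None:
        # The safety net: flip back to the old implementation.
        self._use_cmr[capability] = False

    def handle(self, capability: str, request: dict) -> dict:
        # Callers use the same entry point before and after the switch.
        table = self._cmr if self._use_cmr.get(capability, False) else self._legacy
        return table[capability](request)


def legacy_orders(request: dict) -> dict:
    return {"handled_by": "legacy", "order": request["order"]}


def cmr_orders(request: dict) -> dict:
    return {"handled_by": "cmr", "order": request["order"]}


router = CapabilityRouter({"orders": legacy_orders}, {"orders": cmr_orders})
```

Because each capability has its own flag, one capability can be rolled back without affecting others that have already transitioned, which fits the capability-by-capability rollout.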

Transition Plan

For Each Capability Design and implement CMR component. Integrate use of CMR component into legacy code. Live switch over.

Before Capability This is the current state of the CMR and ECHO. It’s the same diagram I showed before.

After Capability Added Note that clients can continue to use existing APIs during the transition without problems or interruption. This is a temporary increase in complexity on the path to a simplified experience.

Eventually… (~September 2016) Eventually, after transitioning all the components and retiring the legacy parts, we’ll be at a much simpler architecture.

Benefits Reduced cost Improved reliability and performance Easily add new capabilities Consistent APIs and features

Data Stored as Immutable Revisions Speaking of consistent APIs and features, one nice feature of the transition is that we will be storing the transferred data in the same way we store granules and collections. You’ve probably heard that the CMR stores every update of a granule or collection as a separate immutable revision. We don’t overwrite your data. We create a new copy in our database and eventually age off the old copies to save space. The MMT app exposes this to allow you to view your update history, see who made changes, revert to a previous revision and even undelete. Those are some great features. We’re going to be storing all of the transitioned ECHO data the same way, so you’ll have the same capabilities with ECHO data like access control rules, orders, data quality summaries, and order options.
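To make the idea of immutable revisions concrete, here is a minimal in-memory sketch. It is not the CMR's storage code: `RevisionStore`, `Revision`, and the concept identifier `"ACL-1"` are hypothetical names, a deletion is modeled as a revision with no content (an assumption), and aging off old revisions is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Revision:
    revision_id: int
    user: str                  # who made the change
    content: Optional[dict]    # None marks a deletion (assumed tombstone model)


class RevisionStore:
    """Hypothetical append-only store: updates never overwrite old data."""

    def __init__(self) -> None:
        self._revisions: Dict[str, List[Revision]] = {}

    def _append(self, concept_id: str, user: str, content: Optional[dict]) -> Revision:
        history = self._revisions.setdefault(concept_id, [])
        revision = Revision(len(history) + 1, user, content)
        history.append(revision)
        return revision

    def save(self, concept_id: str, user: str, content: dict) -> Revision:
        return self._append(concept_id, user, dict(content))

    def delete(self, concept_id: str, user: str) -> Revision:
        return self._append(concept_id, user, None)

    def revert(self, concept_id: str, user: str, revision_id: int) -> Revision:
        # Reverting (or undeleting) writes a new revision that copies an old one.
        old = self._revisions[concept_id][revision_id - 1]
        content = dict(old.content) if old.content is not None else None
        return self._append(concept_id, user, content)

    def latest(self, concept_id: str) -> Revision:
        return self._revisions[concept_id][-1]

    def history(self, concept_id: str) -> List[Revision]:
        return list(self._revisions[concept_id])
```

For example, after two saves and a delete of `"ACL-1"`, calling `revert("ACL-1", "alice", 2)` adds a fourth revision carrying the content of revision 2, and `history("ACL-1")` still lists every earlier revision along with the user who made it.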

We need your help. Transition off of legacy APIs: SOAP -> REST, FTP Ingest -> REST Ingest. Provide feedback: use EDSC and MMT, and give feedback on CMR APIs.

API Versioning Finally, I’m going to discuss something that we’ve been thinking about as we grow the CMR and add new APIs: API versioning.

API Improvements vs. API Stability There’s a natural push and pull between the stability of an API and the ability to improve it. As more people use our APIs it gets harder to change them. We don’t want to break existing clients. But we also want to be able to improve the APIs over time. If the CMR is going to be around for a long time it has to be able to change.

Solution! API Versioning and Client Collaboration are part of the solution here. API Versioning and similar techniques let us add new things and make forward progress without breaking clients. We get improvements and stability. Client collaboration is needed as we go forward. We need to know what’s working and what’s not working. And eventually we need cooperation when we’re going to get rid of old versions.

API Versioning Strategies

URI Path https://cmr…/search/v2/collections Popular (Twitter, GitHub, etc.). Problems: URIs should be stable. Granularity problems.

Query Parameter or Custom Header https://cmr…/search/collections?ver=2 Easy to use. Problems: Not RESTful. Doesn’t take advantage of built-in HTTP capabilities. Granularity problems.

Content Negotiation GET https://cmr…/search/collections Accept: application/reference2+json RESTful. URLs stay the same. Request and response content versioned separately. Problems: More difficult to specify.

Content Negotiation via URL ext. GET https://cmr…/search/collections.ref2_json Easy to specify for clients that can’t specify a header. Problems: Doesn’t handle request content. Changes the URL.

What will we use? Currently researching approaches. May use combination approach: Content Negotiation + URL extensions Feedback Requested
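As a rough sketch of how a combined approach could resolve the requested version, the function below checks for a versioned URL extension and otherwise falls back to the Accept header. The `application/reference2+json` media type and the `.ref2_json` extension come from the slides; everything else is an assumption made for illustration, including the `resolve_format` name, the `ref` to `reference` mapping, the regular expressions, and the choice to let the URL extension take precedence.

```python
import re
from typing import Optional, Tuple

# Assumed shapes: "application/<name><version>+<encoding>" and ".<short><version>_<encoding>"
MEDIA_TYPE = re.compile(r"^application/([a-z]+)(\d+)\+([a-z]+)$")
URL_EXT = re.compile(r"\.([a-z]+)(\d+)_([a-z]+)$")
SHORT_NAMES = {"ref": "reference"}  # hypothetical mapping from extension to format name


def resolve_format(path: str, accept: Optional[str]) -> Optional[Tuple[str, int, str]]:
    """Return (format name, version, encoding), or None if no version was requested."""
    ext = URL_EXT.search(path)
    if ext:
        # Assumption: an explicit URL extension wins over the Accept header.
        name, version, encoding = ext.groups()
        return SHORT_NAMES.get(name, name), int(version), encoding
    if accept:
        for part in accept.split(","):
            media = part.split(";")[0].strip().lower()   # drop parameters such as q=0.9
            match = MEDIA_TYPE.match(media)
            if match:
                name, version, encoding = match.groups()
                return name, int(version), encoding
    return None
```

With these assumptions, a request for `/search/collections` with `Accept: application/reference2+json` and a request for `/search/collections.ref2_json` with no header both resolve to version 2 of the reference format in JSON, while an unversioned `Accept: application/json` resolves to nothing and could be served a default version.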

Client Collaboration

CMR Wiki https://wiki.earthdata.nasa.gov/display/CMR

Client Developer Forum https://wiki.earthdata.nasa.gov/display/CMR/CMR+Client+Developer+Forum

CMR Designs in Wiki We also hold design reviews where people can give feedback.

CMR Client Developer Email List cmr-client-developers@lists.nasa.gov Sign up here: https://lists.nasa.gov/mailman/listinfo/cmr-client-developers

This material is based upon work supported by the National Aeronautics and Space Administration under Contract Number NNG15HZ39C.