FTP versus HTTPS in EOSDIS Data Access
WGISS 40 – September 30, 2015
Andrew Mitchell

Agenda
User Registration System – URS
–Earthdata Login
Requiring Registration for Data Access at EOSDIS
–FTP/HTTP Comparison
URS Guidance and Policy
FTP retirement at Data Centers
–Lessons Learned
Backup: File Transfer Protocol (FTP/HTTP)
–Engineering Perspective
–Performance Study

NASA USER REGISTRATION – EARTHDATA LOGIN

Earthdata Login

Capturing User’s Area of Interest

Study Areas & Application Domains
NASA - Primary study area*: Air sea interaction, Atmospheric aerosols, Biological Oceanography, Clouds, Cryospheric studies, Geophysics, Global biosphere, Human dimensions of global change, Hydrologic cycle, Land processes, Physical Oceanography, Polar processes, Radiation budget, Sea ice, Tropospheric chemistry, Upper atmospheric composition, Upper atmospheric dynamics, Other
ESA - Primary Application Domain*: Atmosphere, Sea-Ice, Geodesy, Geology, Hazards, Hydrology, Ice, Land Environment, Methods, Oceanography, Renewable Resources, Topographic Mapping, Other, Calibration/Validation, Coastal Zones

Federated User Identity Study
Performing a study of other (non-OAuth2) Single Sign-On technologies that will allow Earthdata Login to become interoperable with user registration systems from other systems and agencies.

Architecture
–LDAP store
–LDAP proxy (via LDAP store)
–HTTP-accessible RESTish API
–FTP clients
–HTTP clients
–Web-based user maintenance

REQUIRING REGISTRATION FOR DATA ACCESS AT EOSDIS
FTP and HTTP comparison

Impact of requiring authentication with FTP at DAACs
Advantages:
–Minimal impact to existing users
–Minimal impact to data centers
–No changes to firewall rules or similar configuration
–*Direct support for anonymous access
–Maturity of capability / protocol
Disadvantages:
–Multiple flavors deployed at the data centers (5 different ftp servers)
–No direct support for LDAP authentication on some of the flavors
–Not authenticated securely: some flavors unable to support secure authentication
–Prohibited at LP DAAC due to DoI regulations
–Does not integrate well with REST API for support of OpenID or OGC

Impact of requiring authentication with HTTP at DAACs
Advantages:
–Comprehensive support from the user community: protocol is well established and mature, all data centers use the same http server (apache)
–Modules can be applied to support many extensions and metrics gathering unavailable to certain ftpds
–Easily accommodates a REST API and provides well established LDAP modules for simple configuration and integration
–Permitted as a transfer protocol by the DoI
–Supports a secure authentication mechanism (https)
Disadvantages:
–End user scripts will have to change, as will manual access to the files they access
–Data center configurations will have to change (on the firewall and the apache server)
–DAACs custom code will have to change
–Data Center customizations and extensions will need to be modified

URS GUIDANCE & POLICY

Guidance for EOSDIS DAACs, Subsystems and Applications
Purpose: To provide guidance and clarify the integration requirements for the URS into EOSDIS systems and components.
Scope: This guidance applies to all EOSDIS DAACs, subsystems (ECHO, GCMD, Earthdata, GIBS, etc.) and related EOSDIS services and applications (Reverb, ASTER GDEM Explorer, ASF Vertex, etc.).
Guidance: URS will be implemented by DAACs, subsystems and related services for the following capabilities:
–Downloading science data files from HTTP, HTTPS and FTP services.
–Web services and tools allowing access to science data files (e.g. OPeNDAP, Web Coverage Services, analysis tools, DAAC-unique ordering tools).
–Online collaboration and comment tools (e.g. Wikis, Forums, Code Repositories).
–Other tools and services that currently have optional or required user registration.
Registration is NOT required for:
–Read-access to Web pages and documentation.
–Data discovery services such as Reverb, Earthdata Search Client (EDSC), Global Change Master Directory keyword services, CMR and DAAC-unique search clients.
Note: This portion of the policy applies up until the point where science data downloads are performed, or where write operations such as saving search parameters or inputting or updating metadata records are performed.

Evolution and Transition Planning
URS is available and this guidance will go into immediate effect.
–A staggered approach will be used to implement URS throughout DAACs, subsystems and applications.
–Schedules and transition plans for implementation will be negotiated between affected systems and ESDIS.
Milestones and Timeline
–In 2015, HTTPS Access with URS 4 (SSO) must be available for all current equivalent FTP/HTTP Access.
–DAACs, subsystems and applications are allowed to run HTTPS access and FTP/HTTP* access in parallel.

FTP RETIREMENT AT DATA CENTERS
Lessons Learned

Near Real Time Data Access (LANCE)
HTTPS File Distribution Requirements for LANCE
LANCE Elements shall integrate with the URS and restrict access to NRT data to users with valid URS accounts.
URL structure should be decided by the data providers.
From a user's perspective, it should be possible to get all the files simply by using curl or wget:
–e.g.: wget -r
–which would download all the OMTO3 data files and the Manifest for the date 2007/05/11.
–To get the entire month use: wget -r
–To get the entire year use: wget -r -nd
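The day, month and year retrievals above imply a date-organized directory tree. The slides do not show the actual URLs, so the following is only a minimal sketch: the base URL and the product/year/month/day layout are assumptions made for illustration, and the real structure is left to each data provider.

```python
from datetime import date
from typing import Optional

# Hypothetical base URL: the real LANCE host and path are not given in the slides.
BASE_URL = "https://example.invalid/lance"


def directory_url(product: str, year: int,
                  month: Optional[int] = None,
                  day: Optional[int] = None) -> str:
    """Build the directory URL for a whole year, a month, or a single day.

    Assumes a product/YYYY/MM/DD/ layout, which is an illustrative guess.
    """
    if day is not None and month is None:
        raise ValueError("a day requires a month")
    parts = [BASE_URL, product, f"{year:04d}"]
    if month is not None:
        # date() raises ValueError for an impossible month or day.
        date(year, month, day if day is not None else 1)
        parts.append(f"{month:02d}")
    if day is not None:
        parts.append(f"{day:02d}")
    return "/".join(parts) + "/"


if __name__ == "__main__":
    print(directory_url("OMTO3", 2007, 5, 11))  # one day
    print(directory_url("OMTO3", 2007, 5))      # whole month
    print(directory_url("OMTO3", 2007))         # whole year
```

With a layout like this, a recursive client only needs the shortest prefix that covers the period of interest.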

LP DAAC migration to HTTP
The LP DAAC switched from FTP to HTTP for data access on June 4, This change was advertised on the LP DAAC Web site as a News item. For users who do not regularly visit our page, we encourage them to consider subscribing to the RSS News Feed so as not to miss out on future announcements. The News Item for the FTP to HTTP is available at ( _june_4_2013).
Note: The cURL command handles http and has been used by some to update their scripted access to Data Pool.
LP DAAC provides a good model for HTTPS data distribution.

User Feedback
“I think that the data should be delivered by a ftp server, because in my case, here in PARAGUAY the internet signal is not stable. During downloads, my connection was interrupted many times forcing me to restart the request process and download it again.”
“We used to receive order by as ftp, currently it is only http, which is taking more time in downloading, can we go back to ftp option?”
“The problem I have with the http protocol is I don't know how to automate my wget script to get new data. With ftp I can use a wildcard at the end of the full file path. With the current naming of the .hdf files, MYD11C1.A hdf I don't know the filenames ahead of time, so I cannot even use a brute force, name every file to get approach. Is there some way you can recommend to automatically get these data? Can I request an automatic push to my incoming ftp site?”
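The wildcard question in the last comment can be approached by reading the HTML directory listing and filtering its links on the client side. This is a minimal sketch, assuming the server returns an ordinary HTML index page with one link per file; the file names in the example are made up and are not real granule names.

```python
from fnmatch import fnmatchcase
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def match_listing(listing_html: str, pattern: str) -> List[str]:
    """Return the links in a directory listing that match a shell-style wildcard."""
    parser = LinkCollector()
    parser.feed(listing_html)
    return [href for href in parser.links if fnmatchcase(href, pattern)]


if __name__ == "__main__":
    # Hypothetical listing; a real one would be fetched over HTTPS first.
    sample = (
        '<a href="../">Parent Directory</a>'
        '<a href="file_a.hdf">file_a.hdf</a>'
        '<a href="notes.txt">notes.txt</a>'
        '<a href="file_b.hdf">file_b.hdf</a>'
    )
    for name in match_listing(sample, "*.hdf"):
        print(name)
```

A script can run this against each day's listing and download only the names it has not seen before, which gives roughly the behaviour of an FTP wildcard.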

Summary
Understanding that many of our users use scripts to get data from our anonymous FTP servers, this will require social as well as technical changes.
We are gathering use cases and lessons learned from other DAACs in addition to providing ‘recipes’, reference software to automate authenticated HTTPS downloads, bulk download web clients, user tutorials and documentation.
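One possible recipe addresses the user feedback about interrupted connections: HTTP lets a client ask for the remainder of a file with a Range header instead of restarting the download. The sketch below only builds the request and does not contact a server. It assumes the server honours range requests, and the URL used in the example is hypothetical.

```python
import os
import urllib.request


def build_resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request that continues a download from the bytes already on disk.

    If partial_path does not exist or is empty, a normal full request is returned.
    A server that supports range requests answers 206 Partial Content; one that
    does not answers 200 with the whole file, so callers should check the status.
    """
    request = urllib.request.Request(url)
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if offset > 0:
        # "bytes=N-" asks for everything from byte offset N to the end of the file.
        request.add_header("Range", f"bytes={offset}-")
    return request


if __name__ == "__main__":
    # Hypothetical URL and local file name, for illustration only.
    req = build_resume_request("https://example.invalid/data/file.hdf", "file.hdf.part")
    print(req.get_header("Range"))
```

Authentication is left out on purpose; it would be layered on top of this request by whatever login mechanism the data center requires.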

Summary
URS is also being enhanced to work with multiple web services (e.g. OGC, OAI-PMH, OPeNDAP, REST/SOAP).
How to get HTTPS directory listings fast:
Some DAACs will be exempt from the HTTP requirement (via waivers)
–Our CDDIS DAAC is serving over 1.8M files and 380 Gbytes/day to over 13K distinct users via ftp.

FILE TRANSFER PROTOCOL ENGINEERING PERSPECTIVE
Backup - FTP versus HTTP

FTP/HTTP Comparison

FTP/HTTP Comparison (cont'd)
Legend: Performance (speed), Security

FILE TRANSFER PROTOCOL PERFORMANCE STUDY
Backup

Study Background
Sending files over a high-speed network doesn’t guarantee that the end-to-end performance will match the network capacity or meet user expectations.
When transferring data, network latency (round-trip time or RTT) and packet loss can impact the transmission rate in conjunction with the file transfer protocol used, and the characteristics and tuning parameters of the end systems.
EOSDIS performed a study of a set of file transfer protocols from ESDIS Networks to determine how each one performed in different network environments
–All protocols studied use TCP for transport
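The interaction between RTT and end-system tuning can be illustrated with the standard window limit for a single TCP stream: at most one window of data can be in flight per round trip, so throughput cannot exceed window size divided by RTT. The sketch below is not taken from the study, and the window, RTT and link values are example assumptions.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(link_bps: float, rtt_seconds: float) -> float:
    """Buffer size needed to keep a link of the given capacity full."""
    return link_bps * rtt_seconds / 8


if __name__ == "__main__":
    # Example values only: a 64 KiB window on a path with a 100 ms RTT.
    limit = window_limited_throughput_bps(64 * 1024, 0.100)
    print(f"64 KiB window, 100 ms RTT: about {limit / 1e6:.1f} Mbit/s")

    # Buffer needed to fill a 1 Gbit/s link at the same RTT.
    needed = bandwidth_delay_product_bytes(1e9, 0.100)
    print(f"1 Gbit/s, 100 ms RTT: about {needed / 1e6:.1f} MB of buffer")
```

With these example numbers, a default-sized buffer holds a long-distance transfer to a few Mbit/s regardless of link capacity, which is why host tuning matters as much as the protocol.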

Study Summary
High speed networks don’t come with high speed end-to-end performance guarantees
–File transfer protocol performance impacted by file size, host buffer size and TCP behavior
–Network latency (round-trip time, RTT) and packet loss
Most common file transfer protocols were designed when network capacity was much less than today
–FTP over TCP/IP was developed in the 1980s
–Single TCP stream
New file transfer protocols are designed to better adapt to changes in high speed network environments
–Multiple, parallel TCP streams
Other strategies are being employed to increase performance
–Increasing packet size
–Encrypting only sensitive data

Study Conclusions
No single file transfer protocol works best in every network environment
Data delivery requirements should be used to determine choice of file transfer protocol
–Multi-stream protocols (bbFTP and GridFTP) are best at sending larger files over WANs (long RTT, higher packet loss)
–Efficient, single stream protocols (FTP, HTTP) work best at sending smaller files over LANs (short RTT, lower packet loss)
–Encryption processing software overhead lowers throughput
–Increased CPU load
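A rough way to see why multiple streams help on long, lossy paths is the well-known Mathis et al. approximation for loss-limited TCP throughput, rate ≈ (MSS / RTT) × (C / √p) with C ≈ √(3/2). The sketch below is an illustration, not the method used in the study. The example RTT, loss rate and link capacity are assumptions, and treating parallel streams as independent is a simplification.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Approximate loss-limited throughput of one TCP stream (Mathis et al.)."""
    if rtt_seconds <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("need rtt_seconds > 0 and 0 < loss_rate <= 1")
    return (mss_bytes * 8 / rtt_seconds) * math.sqrt(1.5) / math.sqrt(loss_rate)


def aggregate_throughput_bps(streams: int, per_stream_bps: float, link_bps: float) -> float:
    """Idealized aggregate of independent parallel streams, capped by the link."""
    return min(streams * per_stream_bps, link_bps)


if __name__ == "__main__":
    link = 1e9  # assumed 1 Gbit/s link
    # Assumed WAN path: 100 ms RTT, 0.01% packet loss, 1460-byte segments.
    wan = mathis_throughput_bps(1460, 0.100, 1e-4)
    print(f"WAN, 1 stream:  {aggregate_throughput_bps(1, wan, link) / 1e6:.0f} Mbit/s")
    print(f"WAN, 8 streams: {aggregate_throughput_bps(8, wan, link) / 1e6:.0f} Mbit/s")
    # Assumed LAN path: 1 ms RTT at the same loss rate; one stream already fills the link.
    lan = mathis_throughput_bps(1460, 0.001, 1e-4)
    print(f"LAN, 1 stream:  {aggregate_throughput_bps(1, lan, link) / 1e6:.0f} Mbit/s")
```

Under these assumptions a single stream fills the link on the short path, while on the long path each extra stream adds throughput until the link capacity is reached, which is consistent with the conclusions above.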