Remote execution of long-running CGIs


Remote execution of long-running CGIs: solving the 30-second Web timeout problem

Current Architecture
- The Web server runs the CGI application directly.
- Every request is subject to the 30-second Web timeout.

Remote CGI infrastructure
- Web server: the CGI2RCGI gateway relays the CGI environment and input streams to the grid.
- NetSchedule Grid: the back-end CGIs run on worker nodes.
- While the query is running, the gateway checks the job status and displays a status report HTML page (customizable) with the progress report.
- When the job is done, the output is transferred to the Web front end.
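The gateway's choice between a status page and the final output can be sketched as below. This is only an illustration: GatewayJobState, GatewayReply and BuildGatewayReply are hypothetical names, and the meta-refresh polling is an assumption, not something the slides confirm about CGI2RCGI.

```cpp
#include <string>

// Hypothetical job states as seen by the gateway
// (not NetSchedule's real status codes).
enum GatewayJobState { kGatewayPending, kGatewayRunning, kGatewayDone };

struct GatewayReply {
    bool is_final;      // true once the back-end CGI output is being returned
    std::string body;   // HTML sent to the browser
};

// Decide what the gateway sends back for one browser request.
GatewayReply BuildGatewayReply(GatewayJobState state,
                               const std::string& progress_message,
                               const std::string& job_output)
{
    GatewayReply reply;
    if (state == kGatewayDone) {
        // Job finished: relay the back-end output unchanged.
        reply.is_final = true;
        reply.body = job_output;
        return reply;
    }
    // Job still pending or running: show a status page that reloads itself.
    // The 5-second refresh interval is an arbitrary value for illustration.
    const std::string text = progress_message.empty()
        ? std::string("Your request is being processed.")
        : progress_message;
    reply.is_final = false;
    reply.body =
        "<html><head><meta http-equiv=\"refresh\" content=\"5\"></head>"
        "<body><p>" + text + "</p></body></html>";
    return reply;
}
```

Each request to the gateway stays short because it either returns a status page or relays output that already exists, so the front end never holds a connection open for the full job duration.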

Why does the 30-second timeout happen?
- Peak hours: the number of processor-hungry tasks exceeds the number of CPUs.
- CGIs are used as a platform to implement complex algorithms.
- CGI inefficiency.

NetCache
Brief: temporary, network-accessible BLOB storage. A client submits data and receives a token (BLOB ID) that can be used to access the data; the token is ready for use in URLs, cookies, etc. BLOBs are deleted automatically based on access pattern and timeout.
Used for:
- Session data storage (CGIs)
- Storage of graphics generated on-line (CGIs)
- Local data cache (Object Manager)
Characteristics:
- Load balanced using LBSM
- Cross-platform
- Uses Berkeley DB (fast, transaction protected, can be redistributed and embedded into other apps such as GBENCH)
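The put-data, get-token, expire-on-timeout behaviour described above can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions: BlobStoreSketch and its methods are hypothetical names, not the NetCache client API, and the expiry rule (time since last access exceeds a fixed timeout) is a simplification of "access pattern and timeout".

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical in-memory stand-in for a NetCache-style BLOB store.
class BlobStoreSketch {
public:
    typedef std::chrono::steady_clock Clock;

    explicit BlobStoreSketch(std::chrono::seconds ttl) : ttl_(ttl), next_id_(0) {}

    // Store data and return a token (the "BLOB ID") that is safe in URLs.
    std::string Put(const std::string& data, Clock::time_point now) {
        std::string key = "blob_" + std::to_string(++next_id_);
        Entry entry;
        entry.data = data;
        entry.last_access = now;
        blobs_[key] = entry;
        return key;
    }

    // Copy the BLOB into `out` if it exists and has not expired.
    // A successful read refreshes the access time.
    bool Get(const std::string& key, Clock::time_point now, std::string& out) {
        auto it = blobs_.find(key);
        if (it == blobs_.end()) return false;
        if (now - it->second.last_access > ttl_) {
            blobs_.erase(it);   // expired: delete on access
            return false;
        }
        it->second.last_access = now;
        out = it->second.data;
        return true;
    }

    // Delete every BLOB not accessed within the timeout; returns the count.
    std::size_t Purge(Clock::time_point now) {
        std::size_t removed = 0;
        for (auto it = blobs_.begin(); it != blobs_.end();) {
            if (now - it->second.last_access > ttl_) {
                it = blobs_.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

private:
    struct Entry {
        std::string data;
        Clock::time_point last_access;
    };
    std::chrono::seconds ttl_;
    std::uint64_t next_id_;
    std::unordered_map<std::string, Entry> blobs_;
};
```

The clock value is passed in explicitly so the expiry behaviour can be exercised without waiting; a real server would read the clock itself.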

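NetSchedule Push-Pull Model
- The NetSchedule server maintains several FIFO queues (Queue 1, Queue 2, ...), each holding jobs (Job 1, Job 2, Job 3, ...).
- Push Job: jobs are pushed into a queue.
- Pull Job: jobs are pulled from a queue.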
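A minimal sketch of the push-pull idea, assuming one FIFO per named queue. QueueServerSketch, Push and Pull are hypothetical names chosen for illustration and are not the NetSchedule API.

```cpp
#include <deque>
#include <map>
#include <string>

// Hypothetical stand-in for a server that keeps several named FIFO queues.
class QueueServerSketch {
public:
    // Push: append a job to the end of the named queue.
    void Push(const std::string& queue, const std::string& job) {
        queues_[queue].push_back(job);
    }

    // Pull: take the oldest job from the named queue.
    // Returns false when the queue does not exist or is empty.
    bool Pull(const std::string& queue, std::string& job) {
        auto it = queues_.find(queue);
        if (it == queues_.end() || it->second.empty()) return false;
        job = it->second.front();
        it->second.pop_front();
        return true;
    }

private:
    std::map<std::string, std::deque<std::string> > queues_;
};
```

The server never calls out to anyone in this model: both pushing and pulling are initiated by the other side, and jobs leave each queue in the order they arrived.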

NetSchedule internals
- Clients and worker nodes reach NetSchedule through a network communication front end.
- Each queue keeps an in-memory FSM that holds one bit vector per job status, backed by a transactional database.
Queue 1 FSM (in-memory):
  Pending : 0 0 1 ...
  Running : 0 1 0 ...
  ...
  Done    : 1 0 0
Queue 2 FSM (in-memory):
  Pending : 0 1 1 ...
  Running : 0 0 0 ...
  ...
  Done    : 1 0 0
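One reading of the Pending/Running/Done rows is a bit vector per status, where bit i is set in exactly one vector for job i. The sketch below follows that reading. JobFsmSketch and its members are hypothetical, and only three states are modelled, while the slide's elision suggests the real server has more.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical subset of job states; the slide elides the others.
enum FsmJobState { kFsmPending = 0, kFsmRunning, kFsmDone, kFsmStateCount };

// One bit vector per state: job i is in state s iff bits_[s][i] is set.
class JobFsmSketch {
public:
    // Register a new job in the Pending state and return its index.
    std::size_t AddJob() {
        for (auto& v : bits_) v.push_back(false);
        std::size_t id = bits_[kFsmPending].size() - 1;
        bits_[kFsmPending][id] = true;
        return id;
    }

    // Move a job between states; fails if it is not currently in `from`.
    bool Move(std::size_t id, FsmJobState from, FsmJobState to) {
        if (id >= bits_[from].size() || !bits_[from][id]) return false;
        bits_[from][id] = false;
        bits_[to][id] = true;
        return true;
    }

    // Render one state's bit vector the way the slide draws it, e.g. "0 0 1".
    std::string Row(FsmJobState s) const {
        std::string out;
        for (std::size_t i = 0; i < bits_[s].size(); ++i) {
            if (i != 0) out += ' ';
            out += bits_[s][i] ? '1' : '0';
        }
        return out;
    }

private:
    std::vector<bool> bits_[kFsmStateCount];
};
```

Adding three jobs, finishing the first and starting the second reproduces the Queue 1 picture: Pending 0 0 1, Running 0 1 0, Done 1 0 0.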

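NetCache/NetSchedule in the Grid Framework
Components:
- CGI: submits the job, waits for the result (off-line), renders the result as HTML.
- NetCache server: keeps input and output BLOBs.
- NetSchedule server: controls the job queue.
- Worker node: gets a new job, gets the input data (BLOB), does the job, submits the output BLOB, submits the job result.
Sequence:
1. Put input data (CGI to NetCache); 1.1 data key returned (inp_blob_key)
2. Job submit (inp_blob_key) (CGI to NetSchedule); 2.1 (job_key) returned
3. Get job (worker node from NetSchedule)
4. Get input data (inp_blob_key) (worker node from NetCache)
5. Put result (worker node to NetCache); 5.1 data key returned (out_blob_key)
6. Put job result (out_blob_key) (worker node to NetSchedule)
7. Get result (job_key) (CGI from NetSchedule); 7.1 (out_blob_key) returned
8. Get output data (out_blob_key) (CGI from NetCache); 8.1 output data returned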
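The numbered exchange can be traced end to end with two in-memory stand-ins. Everything here is hypothetical (GridCacheSketch, GridSchedulerSketch, RunWorkerOnce, and the string-reversal "job"); the sketch only shows which key travels where, not the real NetCache or NetSchedule client classes.

```cpp
#include <deque>
#include <map>
#include <string>

// Hypothetical BLOB store: Put returns a key, Get returns the data.
struct GridCacheSketch {
    std::map<std::string, std::string> blobs;
    int next_id = 0;

    std::string Put(const std::string& data) {
        std::string key = "blob_" + std::to_string(++next_id);
        blobs[key] = data;
        return key;
    }
    std::string Get(const std::string& key) const {
        auto it = blobs.find(key);
        return it == blobs.end() ? std::string() : it->second;
    }
};

struct GridJobRecord {
    std::string inp_blob_key;
    std::string out_blob_key;
    bool done;
    GridJobRecord() : done(false) {}
};

// Hypothetical scheduler: stores only keys, never the data itself.
struct GridSchedulerSketch {
    std::map<std::string, GridJobRecord> jobs;
    std::deque<std::string> pending;
    int next_id = 0;

    std::string Submit(const std::string& inp_blob_key) {        // steps 2, 2.1
        std::string job_key = "job_" + std::to_string(++next_id);
        jobs[job_key].inp_blob_key = inp_blob_key;
        pending.push_back(job_key);
        return job_key;
    }
    bool GetJob(std::string& job_key, std::string& inp_blob_key) {  // step 3
        if (pending.empty()) return false;
        job_key = pending.front();
        pending.pop_front();
        inp_blob_key = jobs[job_key].inp_blob_key;
        return true;
    }
    void PutResult(const std::string& job_key,
                   const std::string& out_blob_key) {             // step 6
        GridJobRecord& job = jobs[job_key];
        job.out_blob_key = out_blob_key;
        job.done = true;
    }
    bool GetResult(const std::string& job_key,
                   std::string& out_blob_key) const {             // steps 7, 7.1
        auto it = jobs.find(job_key);
        if (it == jobs.end() || !it->second.done) return false;
        out_blob_key = it->second.out_blob_key;
        return true;
    }
};

// One worker iteration: steps 3 to 6. The "job" is a placeholder that
// reverses its input string.
bool RunWorkerOnce(GridSchedulerSketch& sched, GridCacheSketch& cache) {
    std::string job_key, inp_blob_key;
    if (!sched.GetJob(job_key, inp_blob_key)) return false;   // 3. get job
    std::string input = cache.Get(inp_blob_key);              // 4. get input data
    std::string output(input.rbegin(), input.rend());         //    do the job
    std::string out_blob_key = cache.Put(output);             // 5. put result
    sched.PutResult(job_key, out_blob_key);                   // 6. put job result
    return true;
}
```

On the CGI side, steps 1 and 2 are `cache.Put(...)` followed by `sched.Submit(...)`, and steps 7 and 8 are `sched.GetResult(...)` followed by `cache.Get(...)`. Keeping payloads in the cache and only keys in the scheduler matches the split of responsibilities listed on the slide.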

How to convert an existing CGI?

    #include <misc/grid_cgi/remote_cgiapp.hpp>
    ...
    class CRemoteCgiAppSample : public CRemoteCgiApp
    {
        ...
    };

    void CRemoteCgiAppSample::Init()
    {
        // Standard CGI framework initialization
        CRemoteCgiApp::Init();
        ...
    }

    int CRemoteCgiAppSample::ProcessRequest(CCgiContext& ctx)
    {
        ...
        // Report progress to the user while the job is running
        PutProgressMessage("Work in progress");
        ...
    }

High availability
- All central components (queue and data storage) are duplicated.
- All components are controlled by the NCBI load balancer.
- Protection against back-end (remote CGI) failures, by timeout or via explicit rescheduling.

Features
- Back-end servers can run requests for longer than the 30-second Web timeout.
- Easy application migration (a minor code tweak and recompilation).
- Back-end machines can send progress messages (feedback to the user).

Architecture
- Diagram: client-submitters and a client waiting for a job (GetStatus()) talk to NetSchedule (Queue 1, Queue 2, ...; Job 1, Job 2, Job 3). Worker nodes, either active or waiting for a new job, call GetJob() and PutResult(). Notifications reach the worker nodes over UDP/IP.
- Push-pull design: the NetSchedule server is passive.
- Worker nodes can subscribe to queue event notifications.
- Notifications use a connectionless network protocol.

Framework Levels
- Remote CGI: the application looks like an NCBI CGI application.
- Grid framework: high-level API; job input and output are just C++ streams.
- NetSchedule/NetCache API: low-level queue and storage access API.
All APIs need the NetCache and NetSchedule components running in the local network.

Worker node API
- High-level design (use of C++ streams, compatibility with ASN.1 serialization)
- Support for SMP (a node can run jobs in parallel)
- Remote administrative access to worker nodes (shutdown, availability check, statistics)

Performance & Latency
Stress test results: NetSchedule queue performance guarantees low overhead compared to the conventional CGI model.
  Submit 5000 jobs: elapsed 2.147625 sec, avg time 0.000430 sec
  GetStatus 5000 jobs: elapsed 1.428061 sec, avg time 0.000286 sec
  Take-Return jobs (returned 2500 jobs, jobs processed: 2500): elapsed 1.848050 sec, avg time 0.000739 sec
Test environment: Linux-to-Linux, 2-CPU machine, one active submitter.

Job Timeout Protection
- Estimated job execution time (based on input analysis): 50 seconds.
- Diagram: NetSchedule has given Job N to worker node 1; worker node 2 is also available.

Job Timeout Protection: node failure
- Worker node 1 fails while running Job N.
- NetSchedule keeps a timer for Job N; when it expires, Job N is rescheduled and goes to worker node 2.
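The expire-and-reschedule behaviour in these two slides can be sketched as a queue that records a deadline whenever a job is handed out. TimeoutQueueSketch is a hypothetical class, time is passed in as plain seconds for clarity, and the sketch does not track which worker holds a job, which a real server would need in order to reject late results from a failed node.

```cpp
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical queue with per-job run timeouts.
class TimeoutQueueSketch {
public:
    // Submit a job together with its estimated run timeout in seconds.
    void Submit(const std::string& job, long run_timeout_sec) {
        timeout_[job] = run_timeout_sec;
        pending_.push_back(job);
    }

    // A worker takes the oldest pending job; the timer starts now.
    bool GetJob(long now_sec, std::string& job) {
        if (pending_.empty()) return false;
        job = pending_.front();
        pending_.pop_front();
        deadline_[job] = now_sec + timeout_[job];
        return true;
    }

    // A worker reports completion. Returns false if the job is not
    // currently running (for example, it already expired).
    bool PutResult(const std::string& job) {
        return deadline_.erase(job) == 1;
    }

    // Put every running job whose deadline has passed back into the
    // pending queue; returns how many jobs were rescheduled.
    std::size_t RescheduleExpired(long now_sec) {
        std::size_t count = 0;
        for (auto it = deadline_.begin(); it != deadline_.end();) {
            if (now_sec > it->second) {
                pending_.push_back(it->first);
                it = deadline_.erase(it);
                ++count;
            } else {
                ++it;
            }
        }
        return count;
    }

private:
    std::map<std::string, long> timeout_;   // job -> allowed run time
    std::map<std::string, long> deadline_;  // running job -> expiry time
    std::deque<std::string> pending_;
};
```

With a 50-second timeout, a job taken at time 0 and never completed is returned to the queue by the first `RescheduleExpired` call after time 50, and the next `GetJob` call hands it to another worker.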