Can we use the XROOTD infrastructure in the PROOF context?

The need for and the functionality of a PROOF Master coordinator were discussed during the meeting on March 9th.
Main question addressed here: can we base the PROOF Master coordinator on the XRD framework?
Also: can we take advantage of the XRD load balancer system?

How does XROOTD work

Multi-component server based on a multi-threaded architecture.
xrd component: provides networking, thread management, protocol scheduling.
Minimal set of threads:
- Acceptor: opens the connection; matches the protocol; submits a job to the scheduler
- Pollers: react to any activity on open links; submit a job to the scheduler
- Scheduler: schedules the work to be done (jobs)
- Worker(s): wait for jobs to process
- Buffer manager: dynamically optimizes the use of memory buffers
Workers are created and destroyed as needed.
Links are not attached to a specific worker: the first free worker takes the job.
Jobs ≡ data/information to be processed for a given link.
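The point that links are not bound to a worker can be illustrated with a small shared-queue thread pool. This is only a sketch under assumptions: `SketchPool`, `Schedule` and `ProcessedJobs` are hypothetical names, the pool size is fixed (the slide says xrd creates and destroys workers as needed), and none of this is taken from the xrd sources.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: a fixed pool of workers sharing one job queue.
class SketchPool {
public:
    explicit SketchPool(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] { WorkerLoop(); });
    }

    // Lets the workers drain the queue, then joins them.
    ~SketchPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wakeup_.notify_all();
        for (auto &t : threads_) t.join();
    }

    // What an acceptor or poller would call: queue a job for some link.
    void Schedule(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wakeup_.notify_one();
    }

private:
    // Whichever worker is free first picks up the next job.
    void WorkerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wakeup_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to do
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wakeup_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> threads_;  // declared last: started after the rest exists
};

// Schedules njobs trivial jobs and reports how many were processed.
int ProcessedJobs(int njobs, std::size_t workers) {
    std::atomic<int> done{0};
    {
        SketchPool pool(workers);
        for (int i = 0; i < njobs; ++i)
            pool.Schedule([&done] { ++done; });
    }  // the destructor drains the queue and joins the workers
    return done.load();
}
```

Calling `ProcessedJobs(100, 4)` runs one hundred jobs on four workers; which worker handles which job is not determined in advance, which is the property the slide describes.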

[Diagram: How does XROOTD work. Labels: accept, poller, scheduler, jobs, WN, BM, ProtObj, links, files, XROOTD]

Scheduler

One instance per main process:
- new jobs are always added at the end of the queue
- can schedule jobs at a later time using a timer (presently used for internal optimization)
- can handle forking of external processes (presently used to serve TNetFile requests by forking a rootd process)
- keeps statistics on all activity
Presently missing: high-priority scheduling (not needed for file serving)
- pollers use poll(): could be implemented using the POLLPRI flag (requires OOB?)
- other solutions?
- Andy is willing to implement a viable solution.
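The two queueing rules above (new jobs go last, timed jobs become runnable later) can be sketched in a few lines. This is a single-threaded illustration with a logical clock, not the xrd scheduler: `SketchScheduler`, `ScheduleAt`, `RunOne` and `DemoOrder` are hypothetical names, and the missing high-priority path is deliberately not modelled.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of a FIFO scheduler with timer-deferred jobs.
class SketchScheduler {
public:
    using Job = std::function<void()>;

    // New jobs always go to the last position of the queue.
    void Schedule(Job job) {
        assert(job != nullptr);
        ready_.push_back(std::move(job));
    }

    // Deferred job: becomes runnable once the clock reaches 'when'.
    void ScheduleAt(long when, Job job) {
        assert(job != nullptr);
        timed_.emplace(when, std::move(job));
    }

    // Moves due timed jobs to the back of the queue, then runs one job.
    // Returns false when nothing is runnable at time 'now'.
    bool RunOne(long now) {
        auto due = timed_.upper_bound(now);
        for (auto it = timed_.begin(); it != due; ++it)
            ready_.push_back(std::move(it->second));
        timed_.erase(timed_.begin(), due);
        if (ready_.empty()) return false;
        Job job = std::move(ready_.front());
        ready_.pop_front();
        job();
        ++executed_;
        return true;
    }

    long Executed() const { return executed_; }  // minimal statistics

private:
    std::deque<Job> ready_;
    std::multimap<long, Job> timed_;
    long executed_ = 0;
};

// Records the order in which two immediate jobs and one timed job run.
std::vector<std::string> DemoOrder() {
    SketchScheduler s;
    std::vector<std::string> log;
    s.Schedule([&log] { log.push_back("a"); });
    s.ScheduleAt(5, [&log] { log.push_back("timer"); });
    s.Schedule([&log] { log.push_back("b"); });
    for (long now = 0; now <= 5; ++now)
        while (s.RunOne(now)) {}
    return log;
}
```

In `DemoOrder` the timed job is submitted before job "b" but runs after it, because it only joins the queue when the clock reaches tick 5.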

XConnections

Physical connections:
- one per client session / host server
- based on TXSocket (: public TSocket); active use of timeouts both when opening and in read/write operations
- Reader thread (optional): fills a message queue from which the messages are picked up
- Handler for unsolicited server messages (requires the reader thread)
- Multiple sockets foreseen
Logical connections:
- one per open entity (file, …)
- can share the same physical connection with another logical connection
Connection manager:
- keeps the list of existing (logical, physical) connections
- provides Connect / Disconnect / ReadRaw / WriteRaw functionality
- collects garbage
Connection module (TXNetConn):
- runs the xrootd protocol; handles redirections
- this is what the client class TXNetFile uses
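The relation between logical and physical connections can be sketched as simple bookkeeping: several logical connections to the same host share one physical connection, which is collected when its last user disconnects. `SketchConnMgr` and its methods are hypothetical names, and the reference-counting policy is an assumption about what "collects garbage" could mean; no sockets are opened here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical sketch of the (logical, physical) connection bookkeeping.
class SketchConnMgr {
    struct Physical { int users = 0; };  // stand-in for a real socket

public:
    // Opens a logical connection to 'host', reusing the physical one if present.
    int Connect(const std::string &host) {
        Physical &phys = physical_[host];  // created on first use
        ++phys.users;
        int id = nextId_++;
        logical_[id] = host;
        return id;
    }

    // Closes a logical connection; drops the physical one when unused.
    void Disconnect(int id) {
        auto it = logical_.find(id);
        if (it == logical_.end()) return;  // unknown id: nothing to do
        auto ph = physical_.find(it->second);
        assert(ph != physical_.end());
        if (--ph->second.users == 0) physical_.erase(ph);
        logical_.erase(it);
    }

    std::size_t PhysicalCount() const { return physical_.size(); }
    std::size_t LogicalCount() const { return logical_.size(); }

private:
    std::map<std::string, Physical> physical_;  // host -> physical connection
    std::map<int, std::string> logical_;        // logical id -> host
    int nextId_ = 1;
};
```

Opening two files on the same server would give two logical ids but leave `PhysicalCount()` at one.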

Generic protocol interface

Defined by the following methods:
- XrdProtocol *Match(XrdLink *lp)
  invoked when a new link is created, to determine whether this protocol can handle the open link
- int Process(XrdLink *lp)
  invoked when a link has data to be processed
- void Recycle(XrdLink *lp, int secs, char *reason)
  invoked when the instance is no longer needed
- int Stats(char *buf, int blen, int do_sync)
  invoked when statistics about all instances of the protocol are needed
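To make the call sequence concrete, here is a toy protocol written against a simplified stand-in for the interface above. Everything in it is an assumption made for illustration: `SketchLink` replaces XrdLink with an in-memory buffer, `SketchProtocol` copies only the four method names and argument lists from the slide, and the "ECHO " handshake is invented. It should not be read as the real xrd headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Stand-in for XrdLink: an in-memory buffer instead of a socket.
struct SketchLink {
    std::string inbox;   // bytes received from the client
    std::string outbox;  // bytes to send back
};

// Simplified mirror of the four methods listed on the slide.
class SketchProtocol {
public:
    virtual ~SketchProtocol() = default;
    virtual SketchProtocol *Match(SketchLink *lp) = 0;
    virtual int Process(SketchLink *lp) = 0;
    virtual void Recycle(SketchLink *lp, int secs, const char *reason) = 0;
    virtual int Stats(char *buf, int blen, int do_sync) = 0;
};

// Toy protocol: claims links that start with the made-up "ECHO " handshake.
class EchoProtocol : public SketchProtocol {
public:
    SketchProtocol *Match(SketchLink *lp) override {
        assert(lp != nullptr);
        if (lp->inbox.compare(0, 5, "ECHO ") != 0) return nullptr;  // not ours
        lp->inbox.erase(0, 5);
        return new EchoProtocol;  // one instance per accepted link
    }
    int Process(SketchLink *lp) override {
        lp->outbox += lp->inbox;  // echo the pending data back
        lp->inbox.clear();
        ++processed_;
        return 0;
    }
    void Recycle(SketchLink *, int, const char *) override { delete this; }
    int Stats(char *buf, int blen, int) override {
        // Counter shared by all instances of this protocol.
        return std::snprintf(buf, static_cast<std::size_t>(blen),
                             "echo processed=%ld", processed_);
    }

private:
    static long processed_;
};
long EchoProtocol::processed_ = 0;

// Drives one link through Match -> Process -> Recycle.
std::string ServeOnce(SketchProtocol &prototype, const std::string &bytes) {
    SketchLink link{bytes, ""};
    SketchProtocol *p = prototype.Match(&link);
    if (p == nullptr) return "<no match>";
    p->Process(&link);
    p->Recycle(&link, 0, "done");
    return link.outbox;
}
```

`ServeOnce` plays the role of the acceptor and a worker in sequence: a prototype object is asked whether it recognises the link, and the instance it returns handles the data until it is recycled.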

Existing protocol implementations

XrdXrootdProtocol: implements the protocol for file serving and directory handling
- login, authentication (validates a physical link)
- open, close, read, write, …
- putfile, getfile, rm, mv, stat
- mkdir, rmdir, dirlist
XrdRootdProtocol: implements the solution for TNetFile backward compatibility
- Match() transfers the open connection by execv(rootd,…)

XPROOFD: master coordinator

Masters cannot be run in a multi-threaded environment:
- the interpreter is not MT-safe
- a crash of one master would compromise all the others
Mixed solution:
- run in the MT environment the "administrative tasks" (connection handling, logging, collection of results, …), which are not subject to unexpected bugs in client code
- process data in external "job agents" (one per protocol instance)
The job agents open a link to the main application to communicate with the related protocol instance, subscribing to the poller(s).
Caching and saving of the (temporary) results could be handled by the protocol instance inside the MT main application: this could help with the privacy of the results.
Job agents would essentially be the present proofserv, modified for xrd message handling.
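The isolation argument (a crash in one job agent must not take down the coordinator) can be demonstrated with plain POSIX calls. The sketch below makes several simplifying assumptions: it uses `fork()` and `socketpair()` instead of an execv'ed proofserv that connects back to the server, it handles a single short message per agent, and `RunInAgent` and the "crash" request are hypothetical. It needs a POSIX system such as Linux.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs one request in a separate "job agent" process.
// Returns the agent's reply, or an empty string if the agent died.
std::string RunInAgent(const std::string &request) {
    assert(!request.empty() && request.size() <= 256);
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return std::string();
    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return std::string();
    }
    if (pid == 0) {  // job agent: owns sv[1]
        close(sv[0]);
        char buf[256];
        ssize_t n = read(sv[1], buf, sizeof buf);
        if (n <= 0) _exit(1);
        std::string req(buf, static_cast<std::size_t>(n));
        if (req == "crash") std::abort();  // simulate a bug in user code
        std::string reply = "done:" + req;
        ssize_t w = write(sv[1], reply.data(), reply.size());
        _exit(w < 0 ? 1 : 0);
    }
    // Coordinator side: owns sv[0]
    close(sv[1]);
    char buf[512];
    ssize_t w = write(sv[0], request.data(), request.size());
    ssize_t n = (w < 0) ? -1 : read(sv[0], buf, sizeof buf);
    close(sv[0]);
    int status = 0;
    waitpid(pid, &status, 0);  // reap the agent whether it crashed or not
    return n > 0 ? std::string(buf, static_cast<std::size_t>(n)) : std::string();
}
```

After `RunInAgent("crash")` returns an empty string, the calling process is still alive and can start further agents, which is the behaviour the mixed solution is meant to guarantee.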

XrdProofdProtocol

Two kinds of instances:
- normal: the lifetime of the proofserv instance is uncorrelated with the logical connection that started it
- query: gets information about the status of submitted jobs; its lifetime is the same as the connection lifetime
Normal:
- setup (authenticate, …), create the job agent (the job agent would be started via execv by a dedicated thread)
- transfer messages from the client to the job agent
- transfer messages from the job agent or server to the client
- save the results at the end of processing
Query:
- query the status of jobs
- retrieve the IDs of running sessions
- retrieve the results of terminated sessions
Specific XProof protocol for handling / structuring messages (analogous to XProtocol.hh)

TXProofServ (Job Agent)
- based on TProofServ
- at setup, opens an XConnection to the MT parent
- HandleSocketInput, HandleUrgentData would be part of the UnsolicitedRequestHandler
- other modifications may depend on where we go with sandboxes
TXSlave
- as TSlave, but using TXProofConn
TXProofConn
- connection module based on the connection manager, running the xproofd protocol (XrdProofdProtocol)

[Diagram: How would the XPROOFD master coordinator work. Labels: client, accept, poller, scheduler, jobs, WN, BM, ProtObj, links, XROOTD, three job agents, worker nodes]

XPROOFD: worker node

A similar structure could be applied to worker nodes.
Ideally one could optimize the load on the existing job agents by making them interchangeable, i.e. not tied to a particular protocol instance. This would require loading the required library environment in each of the JA instances.

[Diagram: worker node. Labels: master, accept, poller, scheduler, jobs, WN, BM, ProtObj, links, XROOTD, three job agents]

OLBD

In xrootd: a control network that determines, among the servers holding the file, the best one to which the client should be directed.
In PROOF: it should find the best subset of worker nodes, among those it knows about, on which to start the PROOF session, based on:
- CPU, memory load
- expected termination time of ongoing PROOF processing
- location of the files to be processed
- …
According to Andy, changing the policy is pretty easy.
Requires worker nodes to register with the masters.
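A selection policy of the kind listed above could look like the following sketch. It is not the OLBD algorithm: the `WorkerInfo` fields, the linear `Cost` function and all of its weights are placeholders chosen only to show how the listed criteria might be combined into a ranking.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a registered worker node.
struct WorkerInfo {
    std::string name;
    double cpuLoad;     // fraction of CPU in use, 0..1
    double memLoad;     // fraction of memory in use, 0..1
    double secsToFree;  // expected termination time of ongoing PROOF processing
    int localFiles;     // files to be processed that are stored on this node
};

// Lower is better. The weights are arbitrary placeholders.
double Cost(const WorkerInfo &w) {
    return 1.0 * w.cpuLoad + 0.5 * w.memLoad + 0.01 * w.secsToFree
           - 0.2 * w.localFiles;
}

// Returns the names of the 'wanted' cheapest workers, best first.
std::vector<std::string> PickWorkers(std::vector<WorkerInfo> known,
                                     std::size_t wanted) {
    std::stable_sort(known.begin(), known.end(),
                     [](const WorkerInfo &a, const WorkerInfo &b) {
                         return Cost(a) < Cost(b);
                     });
    std::vector<std::string> names;
    for (std::size_t i = 0; i < known.size() && i < wanted; ++i)
        names.push_back(known[i].name);
    assert(names.size() <= wanted);
    return names;
}
```

Changing the policy then amounts to replacing `Cost`, which fits the remark that the policy is easy to change; a real implementation would also have to decide how stale load reports are treated.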

To summarize

- Though designed for file serving, the xrd component of xrootd provides an infrastructure and an interface general enough to handle generic server tasks.
- Even if the proofservs cannot be run in threads, the general framework seems to provide most of the coordinating functionality we put on the list of desiderata.
- Andy is at CERN in two weeks and he proposed to have a discussion about this, in particular about the additional levels of abstraction which could be needed to use the framework in other contexts.