Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware Harshawardhan Gadgil Advisor: Prof. Geoffrey Fox Ph.D. Defense.

Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware Harshawardhan Gadgil Advisor: Prof. Geoffrey Fox Ph.D. Defense Exam April 5, 2007

2 Talk Outline Use Cases and Motivation Architecture Handling Consistency and Security Issues Performance Evaluation Application: Managing Grid Messaging Middleware Conclusion Thesis Contribution and Future Work

3 Grid Large Number of Distributed Resources Applications distributed and composed of a large number and type (hardware, software) of resources Components widely dispersed and disparate in nature and access

4 Sensor Grid * * Galip Aydin, Ph.D. Thesis, Jan 2007

5 Example Audio Video Conferencing GlobalMMCS project, which uses NaradaBrokering as a event delivery substrate Consider a scenario where there is a teacher and 10,000 students. One way is to form a TREE shaped hierarchy of brokers One broker can support up to 400 simultaneous video clients and 1500 simultaneous audio clients with acceptable quality*. So one would need (10000 / 400 ≈ 25 broker nodes). * * “Scalable Service Oriented Architecture for Audio/Video Conferencing”, Ahmet Uyar, Ph.D. Thesis, May 2005 ……… … 400 participants A single participant sends audio / video

6 Definition: Use of term: Resource Consider a Digital Entity on the network Specific case where this entity can be controlled by modest external state Can be captured via a few messages (typically 1) The digital entity in turn can bootstrap and manage components that may require more state. Thus, Digital entity + Component = Manageable Resource Thus could be hardware or software (services) Henceforth, we refer to Service being managed as a Manageable Resource

7 Definition: What is Management ? Resource Management – Maintaining System’s ability to provide its specified services with a prescribed QoS Management Operations* include Configuration and Lifecycle operations (CREATE, DELETE) Handle RUNTIME events Monitor status and performance Maintain system state (according to user defined criteria) This thesis addresses: Configuring, Deploying and Maintaining Valid Runtime Configuration Crucial to successful working of applications Static (configure and bootstrap) and Dynamic (monitoring / event handling) * From WS – Distributed Management http://devresource.hp.com/drc/slide_presentations/wsdm/index.jsp http://devresource.hp.com/drc/slide_presentations/wsdm/index.jsp

8 Existing Systems Distributed Monitoring frameworks NWS, Ganglia, MonALISA Primarily serve to gather metrics (which is one aspect of resource management, as we defined) Management Frameworks SNMP – primarily for hardware (hubs, routers) CMIP – Improved security & logging over SNMP JMX – Managing and monitoring for Java applications WBEM – System management to unify management of distributed computing environments Management systems not-interoperable – Move to Web Services based management of resources XML based interactions that facilitate implementation in different languages, running on different platforms and over multiple transports Competing Specifications (WS – Management and WS – Distributed Management)

9 Motivation: Issues in Management Resources must meet General QoS and Life-cycle features (User defined) Application specific criteria Improper management such as wrong configuration – major cause of service downtime Large number of widely dispersed Resources Decreasing hardware cost => Easier to replicate for fault-tolerance (Espl. Software replication) Presence of firewalls may restrict direct access to resources Resource specific management systems have evolved independently (different platform / language / protocol) Requires use of proprietary technologies Central management System Scalability and single point of failure

10 Desired Features of the Management Framework Fault Tolerance Failures are Normal, Resources may fail, but so also components of the management framework. Framework MUST recover from failure Scalability With Growing Complexity of application, number of resources (application components) increase E.g. LHC Grid consists of a large number of CPUs, disks and mass storage servers (on the order of ~ 30K) In future, much larger systems will be built MUST cope with large number of resources in terms of Additional components Required

11 Performance Initialization Cost, Recovery from failure, Responding to run-time events Interoperability Services exist on different platforms, Written in different languages, managed using system specific protocols and hence not INTEROPERABLE Framework must implement interoperable protocols such as based on Web-Service standards Generality Management framework must be a generic framework. Should be applicable to any type of resource (hardware / software) Usability Autonomous operation (as much as possible) Desired Features of the Management Framework

12 Architecture We assume resource specific external state to be maintained by a Registry (assumed scalable, fault-tolerant by known techniques) We leverage well-known strategies for providing Fault-tolerance (E.g. Replication, periodic check-pointing, request-retry) Fault-detection (E.g. Service heartbeats) Scalability (E.g. hierarchical organization)

13 Management Architecture built in terms of Hierarchical Bootstrap System Resources in different domains can be managed with separate policies for each domain Periodically spawns a System Health Check that ensures components are up and running Registry for metadata (distributed database) – Robust by standard database techniques and our system itself for Service Interfaces Stores resource specific information (User-defined configuration / policies, external state required to properly manage a resource) Generates a unique ID per instance of registered component Our present implementation is a simple registry service

14 Messaging Nodes form a scalable messaging substrate Provides transport protocol independent messaging between components Can provide Secure delivery of messages In our case, we use NaradaBrokering Broker as a messaging node Managers – Active stateless agents that manage resources. Since they don’t maintain state, hence robust Actual management functions are performed by a Resource specific manager component Resources – what you are managing Wrapped by a Service Adapter which provides a Web Service interface. Service Adapter connects to messaging node to leverage transport independent publish subscribe communication with other components Management Architecture built in terms of

15 Architecture: Scalability: Hierarchical distribution ROOT US EUROPE FSU CARDIFF CGL Active Bootstrap Nodes /ROOT/EUROPE/CARDIFF Always the leaf nodes in the hierarchy Responsible for maintaining a working set of management components in the domain Passive Bootstrap Nodes Only ensure that all child bootstrap nodes are always up and running … Spawns if not present and ensure up and running …

16 Architecture: Conceptual Idea (Internals) Connect to Messaging Node for sending and receiving messages User writes system configuration to registry Manager processes periodically checks available resources to manage. Also Read/Write resource specific external state from/to registry Always ensure up and running Periodically Spawn WS Management Publish Subscribe based communication via Messaging Node

17 Architecture: User Component “Resource Characteristics” are determined by the user. Events generated by the Resources are handled by the manager Event processing is determined by via WS-Policy constructs For e.g., Automatically instantiate a failed resource instance <pol:Policy xmlns:pol=http://schemas.xmlsoap.org/ws/2004/09/policy xmlns:pol1="http://www.hpsearch.org/schemas/2006/07/policy"> <pol1:AUTOInstantiate forkProcessLocator="udp://156.56.104.152:65535"/> Managers can set up services Writing information to registry can be used to start up a set of services

18 Issues in the distributed system Consistency Examples of inconsistent behavior Two or more managers managing the same resource Old messages / requests reaching after new requests Multiple copies of resources existing at the same time / Orphan Resources leading to inconsistent system state Use a Registry generated monotonically increasing Unique Instance ID (IID) to distinguish between new and old instances Requests from manager A are considered obsolete IF IID(A) < IID(B) Service Adapter stores the last known MessageID (IID:seqNo) allowing it to differentiate between duplicates AND obsolete messages Service adapter periodically renews with registry IF IID(ResourceInstance_1) < IID(ResourceInstance_2) THEN ResourceInstance_1 is OBSOLETE SO ResourceInstance_1 silently shuts down

19 Issues in the distributed system Security * NB-Topic Creation and Discovery - Grid2005 / IJHPCN http://grids.ucs.indiana.edu/ptliupages/publications/NB-TopicDiscovery-IJHPCN.pdf # NB-Security (Grid2006) http://grids.ucs.indiana.edu/ptliupages/publications/NB-SecurityGrid06.pdf NaradaBrokering’s Topic Creation and Discovery* and Security Scheme # addresses Message level security Provenance, Lifetime, Unique Topics Secure Discovery of endpoints Prevent unauthorized access to services Prevent malicious users from modifying message Thus message interactions are secure when passing through insecure intermediaries

20 Implemented: Management framework Management of NaradaBrokering Brokers Released with NaradaBrokering in Feb 2007 WS – Specifications WS – Management (could use WS-DM) -June 2005 parts (WS – Transfer [Sep 2004], WS – Enumeration [Sep 2004]) and WS – Policy[Sep 2004], SOAP v 1.2 (needed for WS-Management) WS – Eventing (Leveraged from the WS – Eventing support in NaradaBrokering) Used XmlBeans 2.0.0 for manipulating XML in custom container

21 Performance Evaluation Measurement Model – Test Setup Multithreaded manager process - Spawns a Resource specific management thread (A single manager can manage multiple different types of resources) Limit on maximum resources that can be managed Limited by maximum threads per JVM possible (memory constraints) Theoretically ~1800 resources (refer: Thesis writeup) Practical Limit on maximum requests that can be handled Performance Model in Thesis

22 Performance Evaluation Results

23 Performance Evaluation Results Scenario illustrating a case with multiple concurrent events Response time increases with increasing number of concurrent requests Response time is RESOURCE-DEPENDENT and the shown times are illustrative Increases rapidly as no. of requests > 210 MAY involve dependency on external services such as Registry access which will increase overall response time but can allow more than (210) concurrent requests to be processed

24 Performance Evaluation Comparing Increasing Managers on same machine w.r.t. different machines

25 Performance Evaluation Research Question: How much infrastructure is required to manage N resources ? N = Number of resources to manage M = Max. no. of resources that connect to a single messaging node D = Maximum concurrent requests that can be processed by a single manager process before saturating For analysis, we set this as the number of resources assigned per manager R = min. no. of registry service components required to provide desired level of fault-tolerance Assume every leaf domain has 1 messaging node. Hence we have N/M leaf domains Further, No. of managers required per leaf domain is M/D Other passive bootstrap nodes are not counted here since << N Total Components in lowest level = (R registry + 1 Bootstrap Service + 1 Messaging Node + M/D Managers) * (N/M such leaf domains) = (2 + R + M/D) * (N/M)

26 Performance Evaluation Research Question: How much infrastructure is required to manage N resources ? Thus percentage of additional infrastructure is = [(2 + R + M/D)*N/M] * 100 / N % = [(2 +R)/M + 1/D] * 100 % A Few Cases If, D = 200, M = 800 and R = 4, then Additional Infrastructure = [(2+4)/800 + 1/200] * 100 % ≈ 1.2 % Shared Registry then there is one registry interface per domain, R = 1, then Additional Infrastructure = [(2+1)/800 + 1/200] * 100 % ≈ 0.87 % If NO messaging node is used (assume D = 200), then Additional Infrastructure = [(R registry + 1 bootstrap node + N/D managers)/N] * 100 % = [(1+R)/N + 1/D] * 100 % ≈ 100/D % (for N >> R) ≈ 0.5% No. of Resources (N), No. of Resource assigned to manager (D), Registry Service Instances (R), Max. Entities connected to Messaging Node (M)

27 Performance Evaluation Research Question: How much infrastructure is required to manage N resources ?

28 Prototype: Managing Grid Messaging Middleware We illustrate the architecture by managing the distributed messaging middleware: NaradaBrokering This example motivated by the presence of large number of dynamic peers (brokers) that need configuration and deployment in specific topologies Runtime metrics provide dynamic hints on improving routing which leads to redeployment of messaging system (possibly) using a different configuration and topology or use (dynamically) optimized protocols (UDP v TCP v Parallel TCP) and go through firewalls Broker Service Adapter NB illustrates an electronic entity that didn’t start off with an administrative service interface So add wrapper over the basic NB BrokerNode object that provides WS – Management front-end Allows CREATION, CONFIGURATION and MODIFICATION of broker configuration and broker topologies

29 Performance Evaluation XML Processing Overhead XML Processing overhead is measured as the total marshalling and un-marshalling time required including validation against schema In case of Broker Management interactions, typical processing time (includes validation against schema) ≈ 5 ms Broker Management operations invoked only during initialization and failure from recovery Reading Broker State using a GET operation involves 5ms overhead and is invoked periodically (E.g. every 1 minute, depending on policy) Further, for most operation dealing with changing broker state, actual operation processing time >> 5ms and hence the XML overhead of 5 ms is acceptable.

30 Prototype: Observed Operation Costs (Individual Resources – Brokers) Operation Time (msec) (average values) Un-Initialized (First time) Initialized (Later modifications) Set Configuration 778 ± 533 ± 3 Create Broker 610 ± 657 ± 2 Create Link 160 ± 227 ± 2 Delete Link 104 ± 220 ± 1 Delete Broker 142 ± 1129 ± 2

31 Recovery: Estimated Recovery Cost per broker Topology Number of Resource specific Configuration Entries Recovery Time = T(Read State From Registry) + T(Bring Resource up to speed) = T(Read State) + T[SetConfig + Create Broker + CreateLink(s)] Ring N nodes, N links (1 outgoing link per Node) 2 Resource Objects Per node 10 + (778 + 610 + 160) = 1558 msec Cluster N nodes, Links per broker vary from 0 – 3 1 – 4 Resource Objects per node Min: 5 + 778 + 610 = 1393 msec Max: 20 + 778 + 610 + 160 + 2* 27 = 1622 msec * Assuming 5ms Read time from registry per resource object

32 Prototype: Observed Recovery Cost per Broker OperationAverage (msec) *Spawn Process 2362 ± 18 Read State 8 ± 1 Restore (1 Broker + 1 Link) 1421 ± 9 Time for Create Broker depends on the number & type of transports opened by the broker E.g. SSL transport requires negotiation of keys and would require more time than simply establishing a TCP connection If brokers connect to other brokers, the destination broker MUST be ready to accept connections, else topology recovery takes more time. Restore (1 Broker + 3 Link) 1616 ± 82

33 Contributions Designed and implemented a Resource Management Framework Scalable to manage large number of resources Tolerant to failures in framework itself Can handle failures in managed resources via user defined policies We have shown that Management framework can be built on top of a publish subscribe framework to provide transport independent messaging between framework components Implemented Web Service Management to manage resources Detailed evaluation of the system components to show that the proposed architecture has acceptable costs Implemented Prototype to illustrate management of a distributed messaging middleware system: NaradaBrokering

34 Future Work Apply the framework to broader domains Investigate application of architecture to resources with significant runtime state that needs to be maintained Higher frequency and size of messages XML processing overhead becomes significant Investigate strategies to distribute framework components (load balance) considering factors such as locality of resources and runtime metrics

35 Publications On the presented work: Scalable, Fault-tolerant Management in a Service Oriented Architecture Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce Accepted as poster in HPDC 2007 Managing Grid Messaging Middleware Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce In Proceedings of “Challenges of Large Applications in Distributed Environments” (CLADE 2006), pp. 83 - 91, June 19, 2006, Paris, France A Retrospective on the Development of Web Service Specifications Shrideep Pallickara, Geoffrey Fox, Mehmet Aktas, Harshawardhan Gadgil, Beytullah Yildiz, Sangyoon Oh, Sima Patel, Marlon Pierce and Damodar Yemme Chapter in Book Securing Web Services: Practical Usage of Standards and Specifications On NaradaBrokering: On the Secure Creation, Organization and Discovery of Topics in Distributed Publish/Subscribe systems Shrideep Pallickara, Geoffrey Fox, Harshawardhan Gadgil (To Appear) International Journal of High Performance Computing and Networking (IJHPCN), 2006. Special Issue of extended versions of the 6 best papers at the ACM/IEEE Grid 2005 WorkshopIJHPCNGrid 2005 On the Discovery of Brokers in Distributed Messaging Infrastructure Shrideep Pallickara, Harshawardhan Gadgil, Geoffrey Fox In Proceedings of the IEEE Cluster 2005 Conference. Boston, MAIEEE Cluster 2005

36 Publications (contd…) On NaradaBrokering (contd…): On the Discovery of Topics in Distributed Publish/Subscribe systems Shrideep Pallickara, Geoffrey Fox, Harshawardhan Gadgil In Proceedings of the 6th IEEE/ACM International Workshop on Grid Computing Grid 2005 Conference, pp. 25-32, Seattle, WA (Selected as one of six Best Papers)Grid 2005 A Framework for Secure End-to-End Delivery of Messages in Publish/Subscribe Systems Shrideep Pallickara, Marlon Pierce, Harshawardhan Gadgil, Geoffrey Fox, Yan Yan, Yi Huang, (To Appear) In Proceedings of “The 7th IEEE/ACM International Conference on Grid Computing” (Grid 2006), Barcelona, September 28th-29th, 2006Grid 2006 HPSearch, GIS and Misc: A Scripting based Architecture for Management of Streams and Services in Real-time Grid Applications Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce, Robert Granat, In Proceedings of the IEEE/ACM Cluster Computing and Grid 2005 Conference, CCGrid 2005, Vol. 2, pp. 710-717, Cardiff, UKCCGrid 2005 Building Messaging Substrates for Web and Grid Applications Geoffrey Fox, Shrideep Pallickara, Marlon Pierce, Harshawardhan Gadgil In special Issue on Scientific Applications of Grid Computing in Philosophical Transactions of the Royal Society, London, Volume 363, Number 1833, pp 1757- 1773, August 2005

37 Publications (contd…) HPSearch, GIS and Misc (contd…): Building and Applying Geographical Information System Grids Galip Aydin, Ahmet Sayar, Harshawardhan Gadgil, Mehmet S. Aktas, Geoffrey C. Fox, Sunghoon Ko, Hasan Bulut,and Marlon E. Pierce, 12 January 2006, Special Issue on Geographical information Systems and Grids based on GGF15 workshop, Concurrency and Computation: Practice and Experience SERVOGrid Complexity Computational Environments (CCE) Integrated Performance Analysis Galip Aydin, Mehmet S. Aktas, Geoffrey C. Fox, Harshawardhan Gadgil, Marlon Pierce, Ahmet Sayar, In Proceedings of the 6th IEEE/ACM International Workshop 13-14 Nov. 2005. Page(s): 256 - 261 (Grid 2005)

Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware Harshawardhan Gadgil Advisor: Prof. Geoffrey Fox Ph.D. Defense.

Similar presentations

Presentation on theme: "Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware Harshawardhan Gadgil Advisor: Prof. Geoffrey Fox Ph.D. Defense."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware Harshawardhan Gadgil Advisor: Prof. Geoffrey Fox Ph.D. Defense.

Similar presentations

Presentation on theme: "Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware Harshawardhan Gadgil Advisor: Prof. Geoffrey Fox Ph.D. Defense."— Presentation transcript:

Similar presentations

About project

Feedback