Enhancing Real-time CORBA via Real-time Java Features
International Conference on Distributed Computing Systems, Tokyo, Japan
Friday, March 19, 2004
Arvind S. Krishna, Douglas C. Schmidt: Elec & Comp. Eng. Dept, Vanderbilt University {arvindk,
Raymond Klefstad: Elec & Comp. Eng. Dept, University of California, Irvine
Talk Outline
– Tech transitions in the DRE domain: Real-time middleware & Real-time Java
– RT-Java + real-time middleware: ZEN project
– ZEN R&D process
– Design & Architecture
– Applying Real-time Java features
– Empirical Results
– Concluding remarks and References
DRE Domain: Characteristics
Types
– Wide applicability: DRE systems range from total ship-board computing systems to industrial process control systems
– Increasingly, DRE applications are combined to form “systems of systems”
Common Requirement
– The right answer delivered too late becomes the wrong answer
Characteristics and Requirements
– Distributed systems require capabilities to manage connections and message transfer between separate machines
– Real-time systems require predictable and efficient control over end-to-end system resources
– Embedded systems have weight, cost, and power constraints that limit their computing and memory resources
[Figure labels: Total Ship C&C Center; Total Ship Computing Environments; Industrial Process Control]
Current Real-time Middleware Trends
Current Real-time CORBA ORBs are developed in C and C++
– ACE+TAO from ISIS, Vanderbilt
– ORBexpress from OIS
– eORB from Prism Technologies
Real-time CORBA has not been universally adopted by DRE application developers
– Complexity of the CORBA C++ mapping
– Steep learning curve caused by the feature-rich and complex C++ language
– Increasingly hard to find “good” C++ application developers and retain them
The Java programming language has emerged as a less complex alternative to C++
– Safety
– Simplicity
– Productivity
However, standard Java is not suitable for developing DRE applications
– The scheduling of Java threads is purposely under-specified (to allow easy implementation of JVMs on as many platforms as possible)
– The GC can preempt Java threads for an unbounded amount of time
– Java provides only coarse-grained control over memory allocation and no access to raw memory
– Java provides neither high-resolution time nor access to signals, e.g. POSIX signals
The Real-Time Specification for Java (RTSJ) extends Java in the following areas
– New memory management models that can be used in lieu of garbage collection
– Stronger semantics for threads and their scheduling
– Access to physical memory
– Asynchronous event handling mechanism
– Timers and higher time resolution
– Priority pre-emptive scheduler
RTSJ Thread & Memory Models
RTSJ Thread Model
– Real-time threads: priority and scheduling characteristics are specified
– NoHeapRealtimeThreads (NHRT): do not “touch” the heap
– NHRT threads can have execution eligibility higher than that of the GC
Scoped Memory
– Region-based memory whose lifetime is tied to the number of active threads in that region
– Any thread may create a scoped region using the new operator
– However, only a real-time thread may allocate from that region
Immortal Memory
– Same lifetime as the JVM
– Objects allocated here are never garbage collected
Physical Memory
– Allows access to specific locations based on addresses
Scoped Memory in Action
– new A(…): object A is created in the heap region
– (new RealtimeThread(…)).start()
[Diagram: heap region shown against a time axis]
Scoped Memory in Action (continued)
– ma1 = new LTMemory(…), then (new RealtimeThread(…)).start()
– ma1 is an “inner scope”: objects in it can hold references only to objects in the same region or in “outer regions”
– Note: the ma1 reference itself is in the heap; reference count = 0
– ma1.enter(logic1), where logic1 is an instance of Runnable: ma1 becomes the current allocation context; reference count = 1
Scoped Memory Properties
– Reference counted: the count is the number of active threads in the region
– When the reference count of a region drops to zero, all objects within that region are considered unreachable and their finalizers run
Assignment Rules
– An object in region ma can hold a reference to an object in region mb if lifetime(ma) <= lifetime(mb)
[Diagram: heap and ma1 regions shown against a time axis]
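The reference-counting and assignment rules above can be illustrated with a small model. This is a sketch in plain Java, not the javax.realtime API: the class ModelScope and its methods are hypothetical names introduced here, the model is single-threaded, and "reclaiming" a region is reduced to clearing a list.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a scoped region (hypothetical; NOT javax.realtime.ScopedMemory).
// Single-threaded illustration only: no synchronization is attempted.
class ModelScope {
    final String name;
    final ModelScope parent;          // outer, longer-lived region; null for the root (heap)
    private int refCount = 0;         // number of logic runs currently inside the region
    private final List<Object> objects = new ArrayList<>();
    int reclaimCount = 0;             // how many times the region has been emptied

    ModelScope(String name, ModelScope parent) {
        this.name = name;
        this.parent = parent;
    }

    // Run the logic with this region as the allocation context.
    void enter(Runnable logic) {
        refCount++;
        try {
            logic.run();
        } finally {
            refCount--;
            if (refCount == 0) {      // last occupant left: contents become unreachable
                objects.clear();
                reclaimCount++;
            }
        }
    }

    void allocate(Object o) { objects.add(o); }
    int refCount() { return refCount; }
    int liveObjects() { return objects.size(); }

    // Assignment rule: an object in 'from' may reference an object in 'to'
    // only if 'to' is 'from' itself or one of its outer regions,
    // i.e. lifetime(from) <= lifetime(to).
    static boolean mayReference(ModelScope from, ModelScope to) {
        for (ModelScope s = from; s != null; s = s.parent) {
            if (s == to) {
                return true;
            }
        }
        return false;
    }
}
```

With a root region heap and an inner region ma1 created as new ModelScope("ma1", heap), mayReference(ma1, heap) holds while mayReference(heap, ma1) does not, which mirrors why a heap object must not point into a scope that may be reclaimed first.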
Motivation for ZEN Real-time ORB
Integrate best aspects of several key technologies
– Java: simple, less error-prone, large user base
– Real-time Java: real-time support
– CORBA: standards-based distributed applications
– Real-time CORBA: CORBA with real-time QoS capabilities
ZEN project goals
– Make development of distributed, real-time, & embedded (DRE) systems easier, faster, & more portable
– Provide an open-source Real-time CORBA ORB written in Real-time Java to enhance international middleware R&D efforts
Phase I – Applying Optimization Strategies
Foot-print Reduction Optimization
– Micro ORB Architecture: Virtual Component pattern
– Micro POA Architecture: pluggable components
Request Demux/Dispatch Optimizations
– Connection Management: Acceptor-Connector pattern, Reactor (Java’s nio package)
– Buffer Management Strategies
– Request Demultiplexing: Active Demultiplexing & Perfect Hashing
POA Optimizations
– Object Key Processing Strategies: Asynchronous Completion Token pattern
– Servant lookup: reverse lookup map
– Concurrency Strategies: Half-Sync/Half-Async
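As a rough illustration of the active demultiplexing idea named above: the key handed out at registration time embeds the table index directly, so a later lookup is a bounds check plus an array access rather than a search or a hash. The sketch below is a generic version of that technique, not ZEN's implementation; ActiveDemuxTable, the 64-bit key layout, and the generation counter used to reject stale keys are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Generic active-demultiplexing table (hypothetical sketch, not ZEN code).
// Key layout (assumed): high 32 bits = slot index, low 32 bits = slot generation.
class ActiveDemuxTable<T> {
    private final List<T> slots = new ArrayList<>();
    private final List<Integer> generations = new ArrayList<>();

    // Registers an entry and returns the key to embed in the object reference.
    // Registration scans for a free slot; only lookups need to be constant time.
    long bind(T entry) {
        for (int i = 0; i < slots.size(); i++) {
            if (slots.get(i) == null) {
                slots.set(i, entry);
                return key(i, generations.get(i));
            }
        }
        slots.add(entry);
        generations.add(0);
        return key(slots.size() - 1, 0);
    }

    // Removes the entry and bumps the generation so old keys stop resolving.
    void unbind(long key) {
        if (find(key) != null) {
            int index = indexOf(key);
            slots.set(index, null);
            generations.set(index, generations.get(index) + 1);
        }
    }

    // O(1) lookup: index straight into the table, then validate the generation.
    T find(long key) {
        int index = indexOf(key);
        int generation = (int) key;               // low 32 bits
        if (index < 0 || index >= slots.size()) {
            return null;
        }
        if (generations.get(index) != generation) {
            return null;                          // stale key for a reused slot
        }
        return slots.get(index);
    }

    private static long key(int index, int generation) {
        return ((long) index << 32) | (generation & 0xFFFFFFFFL);
    }

    private static int indexOf(long key) {
        return (int) (key >>> 32);
    }
}
```

The generation counter is a design choice for the sketch: without it, a key issued for a deactivated entry would silently resolve to whatever was later registered in the same slot.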
Phase II – Applying RTSJ
Phase I: Optimization patterns and principles
– ORB-Core Optimizations
  – Micro ORB Architecture: Virtual Component pattern
  – Connection Management: Acceptor-Connector pattern, Reactor (Java’s nio package)
  – Collocation and Buffer Management Strategies
– POA Optimizations
  – Request Demultiplexing: Active Demultiplexing & Perfect Hashing
  – Object Key Processing Strategies: Asynchronous Completion Token pattern
  – Servant lookup: reverse lookup map
  – Concurrency Strategies: Half-Sync/Half-Async
Phase II: Enhance predictability by applying RTSJ features
– Associate scoped memory with key ORB components
  – I/O Layer: Acceptor-Connector, Transports
  – ORB Layer: CDR Streams, Message Parsers
  – POA Layer: Thread-Pools and Upcall Objects
– Use NoHeapRealtimeThreads
  – Ultimately use NHRT threads for request/response processing
  – Reduce priority inversions from the garbage collector
Phase III: Build a Real-Time CORBA ORB that runs atop a mature RTSJ layer
Applying RTSJ features – Motivation
Original design of ZEN
– All components are allocated in the heap
– A request processing thread may be preempted by the GC (demand garbage collection)
Design of ZEN for Real-Time CORBA
– Apply scoped memory along the critical request processing path
– Provide policies at the POA level for RTSJ-aware users
– Allow NHRT threads to be used for request processing
– Proper use of NHRT threads would minimize GC execution during request processing
Goals
– Compliance with the CORBA specification
– Interoperability with classic CORBA
– Reduce overhead for applications not using real-time features
– End-user transparent
Analyzing Request Processing Steps
Server Side Connection Acceptance
– 4. An acceptor accepts the new incoming connection.
– 6. A new connection handler T1 is created to service requests.
– 7. The Transport's event loop waits for data events from the client.
Server Side Request Processing Steps
– 11. The request header on the connection is read to determine the size of the request.
– 12. A buffer of the corresponding size is obtained from the buffer manager to hold the request, and the data is read.
– 13. The request is then demultiplexed to obtain the target POA, servant, and skeleton servicing the request. The upcall is dispatched to the servant after demarshaling the request.
– 14. The reply is marshaled using the corresponding GIOP message writer; the Transport sends the reply to the client.
Properties of these steps
– Repetitive & Ephemeral: carried out for each client request; objects typically live for one cycle
– Independent: steps for two different clients do not share context
– Thread Bound: steps are executed by I/O threads and thread-pool threads
Application of Scoped Memory – ZEN
Applying Scoped Memory
– 1. Break the steps into three broad regions based on the request processing steps
  – I/O scope: read request
  – ORB scope: process request
  – POA scope: perform upcall, send reply
– 2. Recursively enter each space, from the I/O scope through to the POA scope (I/O -> POA)
– 3. Implicitly exit the regions in reverse order (POA -> I/O)
Independent, Ephemeral
– Create all objects in a scoped region
– Two requests can be mapped to two separate scoped regions
– Temporary objects may be cleared after request processing
Thread-bound
– Encapsulate the steps as a “logic” class and associate this logic class with real-time threads
– A thread enters the scoped region, processes the request, and exits the region, enabling the objects to be finalized
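The recursive enter and implicit exit described above can be sketched as nested Runnable "logic" objects. The model below is plain Java and only traces the order of events; it does not allocate in real scoped regions, and RequestPipelineModel with its enter and handleRequest methods are hypothetical names used for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java trace of the I/O -> ORB -> POA nesting (hypothetical model,
// not javax.realtime and not ZEN code).
class RequestPipelineModel {
    final Deque<String> scopeStack = new ArrayDeque<>(); // models the scope stack
    final List<String> trace = new ArrayList<>();        // records the order of events

    // Models entering a region: push it, run the logic, then leave.
    // The exit is implicit: it happens when the logic returns.
    void enter(String scope, Runnable logic) {
        scopeStack.push(scope);
        trace.add("enter " + scope);
        try {
            logic.run();
        } finally {
            scopeStack.pop();
            trace.add("exit " + scope);
        }
    }

    // One request: each stage's logic enters the next, more deeply nested, scope.
    void handleRequest() {
        enter("I/O", () -> {
            trace.add("read request");
            enter("ORB", () -> {
                trace.add("process request");
                enter("POA", () -> {
                    trace.add("perform upcall");
                    trace.add("send reply");
                });
            });
        });
    }
}
```

Because each inner stage runs inside the outer stage's logic, the regions unwind in POA, ORB, I/O order without any explicit exit call.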
Scoped Memory in ZEN: Action
– The three scopes (I/O, ORB, POA) are created during ORB initialization
– 1: (new RealtimeThread(default Scope)): associate a start scope with the real-time thread
– 2: (I/O Scope).enter(): the I/O scope is now the current active region
– 3: waitForData(): the I/O thread waits for data
[Diagram: I/O, ORB, and POA scopes and the network shown against a time axis]
The following slides are adapted from Angelo Corsaro
I/O Stage Processing
– 1: Data arrives from the network
– 2: The data is read
– 3: A new message is created with new GIOPMessage(); the message is created within the scoped region
I/O Scope Participants
– The participants for this phase include acceptors and transports
RTSJ application
– Each of these components is thread-bound and is designed around an inner logic class
– The logic class corresponds to the logic run by the thread
– Instead of creating the entire component in scoped memory, we create the inner logic class in a scoped memory region, m_io
– This logic class is associated with the thread at creation time
[Diagram: I/O scope and the network shown against a time axis]
ORB Stage Processing
– The thread enters the ORB scope to parse the request: parseAndProcessRequest()
– Scope stack: I/O Scope, then ORB Scope
– Nested inner scope: all references from ORB -> I/O are valid
ORB Scope Participants
– Message parsers, CDR Streams
RTSJ application
– The appropriate message parser is associated with the request in order to parse it
– The message parser and buffer are created in a nested memory region, m_orb
– Under the RTSJ memory rules, references from the ORB space to the I/O space are valid
[Diagram: I/O and ORB scopes and the network shown against a time axis, with new GIOPMessage() in the I/O scope]
POA Stage Processing
– The thread enters the POA scope to process the request and send the response: performUpCall()
– Upcall-related objects are created in this scope
– Scope stack: I/O Scope, then ORB Scope, then POA Scope
Steps
– Demultiplex the request to get the target POA, servant, and skeleton
– Perform the upcall on the servant
– Marshal the reply back to the client
RTSJ Application
– Message parser: parses the request to find the target servant and skeleton, and sets up the context for the upcall
– Upcall object: holds the information necessary to perform the upcall
– Output buffer: holds the response
[Diagram: I/O, ORB, and POA scopes and the network shown against a time axis, with new GIOPMessage() and parseAndProcessRequest() in the outer scopes]
Predictability Enhancement
Overview
– POA demultiplexing experiment conducted to measure the improvement in predictability
Result Synopsis
– Average measures: scoped memory does have some overhead, ~ 3 s
– Dispersion measures: considerable improvement in predictability; dispersion improves by a factor of ~4
– Worst-case measures: scoped memory bounds the worst case; the heap shows marked variability
Associating scoped memory
– Does not compromise performance
– Significantly enhances predictability
– Bounds worst-case latency
Enter/Exit Analysis
Overview
– Quantify the overhead incurred by using scoped memory
  – enter(): entering the scoped region
  – exit(): leaving the scoped region
Result Synopsis
– Average measures: constant enter() time across all message sizes; exit() time increases with message size
– Dispersion measures: exit() incurs considerable variability compared to enter()
– Worst-case measures: enter() and exit() times show similar behavior
Explanation
– On exit, the finalizers of objects run, hence larger messages have higher average latency
– enter() uses a constant-time O(1) algorithm to check for illegal entry
Roundtrip Latency Analysis
Overview
– Influence of scoped memory on the roundtrip latency measures
Result Synopsis
– Average measures: for smaller clients, scoped memory incurs greater overhead; as requests increase, scoped memory outperforms the heap
– Dispersion measures: considerable improvement in predictability; dispersion improves by as much as 50%
– Worst-case measures: scoped memory bounds the worst case; though mean values are greater, the 99% and worst-case measures are bounded
Explanation
– As the number of requests increases, GC activity increases for heap memory
– Scoped memory reduces GC activity, thereby improving processing time
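For readers who want to compute the same kinds of summary measures (average, dispersion, 99%, worst case) on their own latency samples, a minimal sketch follows. The slides do not say which dispersion measure or percentile method was used, so the population standard deviation and the nearest-rank percentile here are assumptions, and LatencySummary is a hypothetical helper, not part of ZEN or its benchmarks.

```java
import java.util.Arrays;

// Summary statistics over latency samples (hypothetical helper).
// Assumptions: dispersion = population standard deviation,
// 99% measure = nearest-rank 99th percentile.
class LatencySummary {
    final double mean;
    final double stdDev;
    final double p99;
    final double max;

    LatencySummary(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);

        double sum = 0;
        for (double s : sorted) {
            sum += s;
        }
        mean = sum / sorted.length;

        double squaredDiffs = 0;
        for (double s : sorted) {
            squaredDiffs += (s - mean) * (s - mean);
        }
        stdDev = Math.sqrt(squaredDiffs / sorted.length);

        // Nearest rank: smallest sample with at least 99% of samples at or below it.
        int rank = (int) Math.ceil(0.99 * sorted.length);
        p99 = sorted[rank - 1];
        max = sorted[sorted.length - 1];     // worst case
    }
}
```

Comparing stdDev and max between a heap-based run and a scoped-memory run is one way to see the kind of predictability difference the synopsis describes, independent of the mean.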
Concluding Remarks & Future Work
Future Real-Time CORBA Research
– Policies at the POA level for RTSJ-aware users
– Use NHRT threads for request/response processing
– Threading models for RTSJ
– Modeling RTSJ exceptions, e.g. ScopedCycleException
– Complete implementation of the Real-time CORBA specification
Concluding Remarks
– We presented R&D efforts on the integration of RTSJ and RT-CORBA
– Our efforts focus on the effective use of RTSJ and Real-time CORBA to improve QoS for Java-based real-time systems
References
– ZEN open-source download & web page:
– Real-time Java (JSR-1): jsr_001_real_time.html
– Dynamic scheduling RFP: Dynamic_Scheduling_RFP.html
– Distributed Real-time Java (JSR-50): jsr_050_drt.html
– AspectJ web page:
– JRate