February 13, 2007 Dynamic Software Reconfiguration in Programmable Networks Nico Janssens DistriNet, Department of Computer Science, K.U. Leuven
February 13, 2007 DistriNet DistriNet: “Distributed Systems and Computer Networks”. Development of open, distributed object support platforms for advanced applications, using state-of-the-art software technology; always application driven; often conducted in close collaboration with industry. Research topics include: security, middleware, mobile / sensor networks, embedded systems, autonomous and decentralized systems, software architecture, language technology
February 13, 2007 Overview of this talk Problem statement Scope and background Approach Local reconfigurations Distributed reconfiguration Performance measurements Contributions and future research
February 13, 2007 Problem statement Computer networks core of distributed systems availability is crucial
February 13, 2007 Problem statement Network functions are typically abstracted away from end-users and applications; intermediate nodes are (mostly) closed, vertically integrated systems. Since the mid-1990s: various initiatives to open up the network infrastructure, to increase its programmability
February 13, 2007 Problem statement Besides programmability, reconfigurability is also an important issue: to support the increasing evolution of network software; adaptive networks (figure: compression, decompression)
February 13, 2007 Problem statement Changing the software of a (programmable) network device off-line may break the network’s availability, hence the need for dynamic software reconfiguration
February 13, 2007 Problem statement To be beneficial, dynamic reconfiguration must be effective and efficient, but it is often complex and error prone. Support is needed to conduct dynamic software reconfiguration in programmable networks
February 13, 2007 Problem statement Main requirements 1. Correct reconfigurations 2. Limited reconfiguration overhead 3. Limited user input 4. Reusability NeCoMan (Network reConfiguration Management): middleware coordinating dynamic reconfigurations in programmable networks
February 13, 2007 Overview Problem statement Scope Approach Local reconfigurations Distributed reconfigurations Performance measurements Contributions and future research
February 13, 2007 Scope dynamic software reconfiguration programmable networks dynamic change management support
February 13, 2007 Scope dynamic software reconfiguration dynamic change management support dynamic software reconfiguration in out-of-band active networks programmable networks
February 13, 2007 Scope – Programmable Networks Out-of-band active networks [Coulson 2003] (In-band active networks)
February 13, 2007 Scope – Programmable Networks Dynamic software reconfiguration The majority of programmable network architectures enable the initial deployment of specific services … … but they do not support subsequent reconfigurations of services that are already in use [Hicks 2000]. exceptions include Click, Cactus, Netkit and Ensemble
February 13, 2007 Scope dynamic software reconfiguration programmable networks dynamic change management support compositional adaptation of pipe-and-filter based (network) architectures
February 13, 2007 Scope – Dynamic software reconfiguration Compositional adaptation [McKinley 2004] addition replacement removal
February 13, 2007 Scope – Node architectural style Pipe-and-filter (network) architectures [Shaw & Garlan 1996] Click, NetScript, CANEs, NetBind, DiPS+
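The three compositional adaptations named in this talk (addition, replacement, removal) can be illustrated on a toy pipe-and-filter chain. This is only a minimal sketch: the `Pipeline` class and its method names are hypothetical and do not correspond to Click, DiPS+ or any other architecture listed above, and it ignores everything that makes a live reconfiguration hard (packets in flight, component state).

```python
from typing import Callable, List, Tuple

# A filter maps one packet to one packet (assumption made for this sketch).
Filter = Callable[[bytes], bytes]


class Pipeline:
    """A linear pipe-and-filter chain of named filters (hypothetical model)."""

    def __init__(self) -> None:
        self._stages: List[Tuple[str, Filter]] = []

    def _index(self, name: str) -> int:
        for i, (stage_name, _) in enumerate(self._stages):
            if stage_name == name:
                return i
        raise KeyError(name)

    def add(self, name: str, filt: Filter, position: int = -1) -> None:
        # Addition: a negative position appends at the end of the pipe.
        if position < 0:
            self._stages.append((name, filt))
        else:
            self._stages.insert(position, (name, filt))

    def replace(self, name: str, filt: Filter) -> None:
        # Replacement: the new filter takes the old filter's place in the pipe.
        self._stages[self._index(name)] = (name, filt)

    def remove(self, name: str) -> None:
        # Removal: neighbouring stages become directly connected.
        del self._stages[self._index(name)]

    def process(self, packet: bytes) -> bytes:
        for _, filt in self._stages:
            packet = filt(packet)
        return packet

    def names(self) -> List[str]:
        return [name for name, _ in self._stages]
```

For example, `pipe.add("upper", bytes.upper)` followed by `pipe.replace("upper", bytes.lower)` swaps one stage without touching the others.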
February 13, 2007 Scope dynamic software reconfiguration programmable networks dynamic change management support customizable change management support for out-of- band active networks
February 13, 2007 Scope – Dynamic change management support Goal: improve effectiveness and efficiency of a dynamic reconfiguration. Most existing change management support conforms to the black-box philosophy; customizability is needed to optimize dynamic reconfigurations. E.g. replacement of a compression service with a new incompatible version vs. replacement of a reliability service with a compatible version.
February 13, 2007 Scope – Dynamic change management support Goal: improve effectiveness and efficiency of a dynamic reconfiguration. Most existing change management support conforms to the black-box philosophy; customizability is needed to optimize dynamic reconfigurations. E.g. replacement of a compression service with a new incompatible version vs. replacement of a compression service with a compatible version.
February 13, 2007 Scope dynamic software reconfiguration programmable networks dynamic change management support network service characteristics
February 13, 2007 Scope – Network services Isolated network services (e.g. filter and logging service) self-contained (no dependencies) reactive processes filter
February 13, 2007 Scope – Network services Distributed network services (e.g. compression, reliability, fragmentation, encryption, etc.) distributed dependencies client-server based collaboration reactive processes asynchronous buffered communications many-to-many service composition
February 13, 2007 Overview Problem statement Scope Approach Local reconfigurations Distributed reconfigurations Performance measurements Contributions and future research
February 13, 2007 Approach NeCoMan contains 4 (basic) algorithms to carry out an extensive set of reconfigurations local and distributed reconfigurations NeCoMan contains various predefined customizations to these algorithms pre-conditions to apply these customizations Main requirements: 1. Correct reconfigurations 2. Limited reconfiguration overhead 3. Limited user input 4. Reusability
February 13, 2007 Approach NeCoMan restricts the user input to a description of the reconfiguration that NeCoMan must carry out a description of the service characteristics and reconfiguration semantics the IP addresses of all affected nodes Based on the service characteristics and the reconfiguration semantics, NeCoMan selects the appropriate reconfiguration algorithm and (if possible) applies some of the predefined customizations to this algorithm Main requirements: 1. Correct reconfigurations 2. Limited reconfiguration overhead 3. Limited user input 4. Reusability
February 13, 2007 Approach Main requirements: 1. Correct reconfigurations 2. Limited reconfiguration overhead 3. Limited user input 4. Reusability separation of concerns to promote reusability NeCoMan contains no node or service specific reconfiguration support! must be provided by the nodes’ reconfiguration support 8 primitives NeCoMan coordinates the execution of these primitives
February 13, 2007 Approach Main requirements: 1. Correct reconfigurations 2. Limited reconfiguration overhead 3. Limited user input 4. Reusability separation of concerns to promote reusability script generator contains reconfiguration logic composes a tailored reconfiguration based on the user input (node-specific) virtual machine executes (portable) reconfiguration scripts conducts the actual node reconfiguration
February 13, 2007 Overview Problem statement Scope Approach Local reconfigurations Algorithms Customizations Distributed reconfigurations Performance measurements Contributions and future research
February 13, 2007 Local Reconfigurations 2 main algorithms: (1) replacement of a component of a distributed service; (2) addition, replacement and removal of isolated services (similar, will not be discussed). 6 predefined customizations
February 13, 2007 Local Reconfigurations Algorithm for replacing component of a distributed network service
February 13, 2007 Local Reconfigurations: approach 1. Partial ordering of high-level reconfiguration phases 2. Preliminary partial ordering of NeCoMan’s reconfiguration actions 3. Partial ordering of NeCoMan’s reconfiguration actions 4. Linearization
February 13, 2007 Local Reconfigurations: approach Partial ordering of high-level reconfiguration phases result from reconfiguration conditions
February 13, 2007 Local Reconfigurations: approach Preliminary partial ordering of NeCoMan’s reconfiguration actions
February 13, 2007 Local Reconfigurations: approach Partial ordering of NeCoMan’s reconfiguration actions
February 13, 2007 Local Reconfigurations: approach Complete ordering of NeCoMan’s reconfiguration actions
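The last step of the approach, turning a partial ordering of actions into a complete one, amounts to a topological sort. The sketch below uses Python's standard `graphlib` module. The action names are taken from the replacement example in this talk; the ordering constraints in `MUST_FOLLOW` are illustrative assumptions, since the slides' ordering diagrams are not available in this transcript.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# action -> actions that must complete first (ASSUMED constraints, for illustration)
MUST_FOLLOW: Dict[str, Set[str]] = {
    "create new component": set(),
    "link outports new component": {"create new component"},
    "intercept packets": set(),
    "impose safe state": {"intercept packets"},
    "start processes": {"link outports new component", "impose safe state"},
    "link inports": {"start processes"},
    "release packets": {"link inports"},
    "unlink outports old component": {"release packets"},
    "delete old component": {"unlink outports old component"},
}


def linearize(must_follow: Dict[str, Set[str]]) -> List[str]:
    """Return one complete ordering that respects every precedence constraint."""
    return list(TopologicalSorter(must_follow).static_order())
```

Any ordering returned by `linearize` is valid with respect to the constraints; a partial order usually admits several linearizations, and which one is preferable is outside the scope of this sketch.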
February 13, 2007 Local Reconfigurations: example example: replacement compression component
February 13, 2007 Local Reconfigurations: example Install new component create new component link outports new component
February 13, 2007 Local Reconfigurations: example Finish old component intercept packets impose safe state
February 13, 2007 Finishing impose safe state packet monitoring
February 13, 2007 Finishing impose safe state protocol-transaction monitoring state transfer
February 13, 2007 Local Reconfigurations: example Activate new component start processes link inports release packets
February 13, 2007 Local Reconfigurations: example Remove old component unlink outports old component delete old component
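The four phases of the example (install, finish, activate, remove) can be walked through on a toy node. This is a sketch under strong assumptions: `ToyNode` and `replace_component` are hypothetical names, components are stateless functions, and outport linking is left implicit. It only shows why intercepting and later releasing packets avoids packet loss during the swap.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List

Component = Callable[[str], str]


class ToyNode:
    """A node with one active component and a buffer for intercepted packets."""

    def __init__(self, component: Component) -> None:
        self.active = component
        self.intercepting = False
        self.held: Deque[str] = deque()
        self.delivered: List[str] = []

    def receive(self, packet: str) -> None:
        if self.intercepting:
            self.held.append(packet)  # buffered until released
        else:
            self.delivered.append(self.active(packet))


def replace_component(node: ToyNode, new_component: Component,
                      arriving_meanwhile: Iterable[str] = ()) -> None:
    # 1. Install: the new component is created by the caller; linking its
    #    outports is implicit in this toy model.
    # 2. Finish old component: intercept packets, then impose a safe state
    #    (nothing to do here because the toy components are stateless).
    node.intercepting = True
    for packet in arriving_meanwhile:
        node.receive(packet)  # packets arriving during the swap are buffered
    # 3. Activate new component: link inports, then release packets.
    node.active = new_component
    node.intercepting = False
    while node.held:
        node.receive(node.held.popleft())
    # 4. Remove old component: it is no longer referenced by the node.
```

Packets that arrive between interception and release are delayed rather than dropped, which is the communication disruption that the customizations later in the talk try to reduce.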
February 13, 2007 Customizations: overview 6 predefined customizations, resulting from re-ordering and discarding all reconfiguration actions that both local algorithms include. From these combinations, we selected the customizations that limit the reconfiguration overhead and still yield a valid reconfiguration (given that some additional pre-conditions are fulfilled). Customizations: activate before finishing; no finishing; no processes; only client of server processes; only service-internal inports or outports; addition or removal. Table columns: replacement of a component of a distributed service; addition, replacement and removal of isolated services
February 13, 2007 Customization: activate before finishing Communication is disrupted from the moment packets are intercepted until they are released again. However, this disruption can be reduced by activating the new component before finishing the old one
February 13, 2007 Customization: activate before finishing Pre-conditions: the old component is stateless; the new service component is able to process all ongoing protocol-transactions; the network tolerates packet re-ordering. Effect: the activation phase is executed before the finishing phase; no packet interception
February 13, 2007 Customization: no finishing for some reconfigurations, there is no need to finish the old service e.g. when replacing a compression component in a TCP/IP network old component can be removed without reaching a reconfiguration-safe state
February 13, 2007 Customization: no finishing Pre-conditions the affected components operate in a best-effort network the new component is able to process all ongoing protocol- transactions inconsistent execution states (if any) do not compromise the correct functioning of the network e.g. for stateless components Effect old component will not be finished may reduce communication disruption... … if inconsistencies do not impact the network performance!
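The pre-conditions of the two local customizations discussed so far can be written down as simple predicates. The sketch below is an assumption about how such a check could look, not NeCoMan's actual selection logic: the `Context` fields and the function name are hypothetical, and only the pre-conditions listed on the slides are encoded.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Context:
    """User-supplied service characteristics (hypothetical field names)."""
    old_component_stateless: bool
    new_handles_ongoing_transactions: bool
    network_tolerates_reordering: bool
    best_effort_network: bool
    inconsistencies_harmless: bool


def applicable_customizations(ctx: Context) -> List[str]:
    """List the customizations whose stated pre-conditions all hold."""
    result: List[str] = []
    if (ctx.old_component_stateless
            and ctx.new_handles_ongoing_transactions
            and ctx.network_tolerates_reordering):
        result.append("activate before finishing")
    if (ctx.best_effort_network
            and ctx.new_handles_ongoing_transactions
            and ctx.inconsistencies_harmless):
        result.append("no finishing")
    return result
```

If no pre-condition set is satisfied, the list is empty and the unmodified algorithm would be the safe default in this sketch.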
February 13, 2007 Overview Problem statement Scope Approach Local reconfigurations Distributed reconfiguration Algorithms Customizations Performance measurements Contributions and future research
February 13, 2007 Distributed Reconfigurations 2 main algorithms one for reconfigurations that involve reaching quiescence another one for reconfigurations where no quiescence will be reached 8 predefined customizations
February 13, 2007 Distributed Reconfigurations Quiescence [Kramer & Magee 1990] service is frozen and consistent applied to distributed network services: all ongoing protocol-transactions have completed no new protocol-transactions will be initiated until after the reconfiguration actions have completed
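The two conditions for quiescence stated above (all ongoing protocol-transactions have completed, and no new ones are initiated) can be captured by a small bookkeeping object. `QuiescenceMonitor` is a hypothetical name used for illustration; how real components report transaction boundaries is not covered by this sketch.

```python
from typing import Set


class QuiescenceMonitor:
    """Tracks ongoing protocol-transactions of one service (hypothetical model)."""

    def __init__(self) -> None:
        self._ongoing: Set[str] = set()
        self._frozen = False

    def begin(self, transaction_id: str) -> bool:
        # New transactions are refused while the service is frozen.
        if self._frozen:
            return False
        self._ongoing.add(transaction_id)
        return True

    def complete(self, transaction_id: str) -> None:
        self._ongoing.discard(transaction_id)

    def freeze(self) -> None:
        self._frozen = True

    def unfreeze(self) -> None:
        self._frozen = False

    @property
    def quiescent(self) -> bool:
        # Frozen (no new transactions) and consistent (none still running).
        return self._frozen and not self._ongoing
```

The reconfiguration would proceed only once `quiescent` is true, and `unfreeze` would be called after the reconfiguration actions have completed.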
February 13, 2007 Distributed Reconfigurations Algorithm for reconfigurations that involve reaching quiescence distributed reconfigurations = actions for local reconfiguration + distributed synchronization
February 13, 2007 Distributed Reconfigurations: approach 1. Partial ordering of high-level reconfiguration phases 2. Preliminary partial ordering of NeCoMan’s reconfiguration actions 3. Partial ordering of NeCoMan’s reconfiguration actions 4. Linearization
February 13, 2007 Distributed Reconfigurations: example Example: replacement of compression service with new version
February 13, 2007 Distributed Reconfigurations: example Install new component create new component link outports new component Install new component create new component link outports new component
February 13, 2007 Distributed Reconfigurations: example Finish old component impose safe state Finish old component intercept packets impose safe state
February 13, 2007 Distributed Reconfigurations: example Activate new component start processes link inports Activate new component start processes link inports release packets
February 13, 2007 Distributed Reconfigurations: example Remove old component unlink outports old component delete old component Remove old component unlink outports old component delete old component
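The idea that a distributed reconfiguration equals the local actions plus distributed synchronization can be sketched as a coordinator that runs each phase on every node and inserts synchronization points between selected phases. Where exactly the synchronization points sit is an assumption here: the sketch places them after finishing and after activation, since the talk speaks of coordinated finishing and coordinated activation. The names `PHASES`, `SYNC_AFTER` and `coordinate` are hypothetical.

```python
from typing import List, Sequence, Tuple

PHASES = ("install", "finish", "activate", "remove")
# ASSUMED synchronization points: all nodes finish before any node activates,
# and all nodes activate before any node removes its old component.
SYNC_AFTER = frozenset({"finish", "activate"})


def coordinate(nodes: Sequence[str]) -> List[Tuple[str, str]]:
    """Return a trace of (phase, node) steps and ("sync", phase) barriers."""
    trace: List[Tuple[str, str]] = []
    for phase in PHASES:
        for node in nodes:
            trace.append((phase, node))  # local action on one node
        if phase in SYNC_AFTER:
            trace.append(("sync", phase))  # wait until every node is done
    return trace
```

The trace is sequential for readability; in a real network the per-node actions within one phase would run concurrently and the barrier would be realized with messages.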
February 13, 2007 Distributed Reconfigurations 2 main algorithms one for reconfigurations that involve reaching quiescence another one for reconfigurations where no quiescence will be reached 8 predefined customizations
February 13, 2007 Distributed Reconfigurations Monitoring for quiescence can be very time consuming (may significantly delay the reconfiguration): when many protocol-transactions are active at the same time (e.g. many compressed packets in transit); when it takes a long time for the ongoing protocol-transactions to complete (e.g. TCP protocol). In some cases it may even be impossible to reach a quiescent state, for instance because the employed protocol is non-deterministic. Alternative for quiescence: deactivating the affected components immediately, at each node independently; restoring consistency by transferring the execution state of the old components to the new ones
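The alternative to quiescence, transferring the execution state of the old component to the new one, can be illustrated with a toy reliability component whose only state is a sequence counter. Both classes and the `export_state` method are hypothetical; real components would need an agreed state format, which this sketch simply assumes.

```python
from typing import Dict, Optional, Tuple


class OldReliability:
    """Old version: numbers outgoing packets (toy model)."""

    def __init__(self) -> None:
        self.next_seq = 0

    def send(self, payload: str) -> Tuple[int, str]:
        seq = self.next_seq
        self.next_seq += 1
        return seq, payload

    def export_state(self) -> Dict[str, int]:
        # Snapshot of the execution state at the moment of deactivation.
        return {"next_seq": self.next_seq}


class NewReliability:
    """New version: resumes from transferred state instead of starting at 0."""

    def __init__(self, state: Optional[Dict[str, int]] = None) -> None:
        self.next_seq = state["next_seq"] if state else 0

    def send(self, payload: str) -> Tuple[int, str]:
        seq = self.next_seq
        self.next_seq += 1
        return seq, payload
```

Because the new component continues the old numbering, the peer does not observe a reset, and the old component can be deactivated immediately without waiting for ongoing transactions to drain.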
February 13, 2007 Distributed Reconfigurations Different algorithm replacement only (no addition and removal) no coordinated finishing no coordinated activation thus … independent execution of local reconfiguration algorithm
February 13, 2007 Customizations: overview 8 predefined customizations, resulting from re-ordering and discarding all reconfiguration actions as well as the synchronization points that both distributed algorithms include. Customizations: no coordinated activation; activate before finishing; no finishing; no finishing of server processes; no processes; only client of server processes; only service-internal inports or outports; addition or removal. Table columns: quiescence; no quiescence
Customizations: no coordinated activation Examples: replacing the components of a compression service where the old and new processes are able to accept and service each other's invocations; adding a compression service on two programmable nodes when the network or the applications can deal with (or filter out) compressed packets. Effect: no distributed synchronization when activating the new service; less communication disruption. Pre-conditions: old and new service components are compatible (in case of replacement), or the network is able to deal with incorrect service compositions (in case of addition or removal)
February 13, 2007 Customization: no coordinated activation resulting algorithm no coordinated activation no coordinated finishing thus … independent execution of local reconfiguration algorithm
February 13, 2007 Customizations: activate before finishing Pre-conditions the old service components do not share their execution state (if any) with their client applications if these components encapsulate execution state no state transfer the network tolerates packet re-ordering Effect packet marking and dispatching 3 instead of 2 synchronization messages no packet interception
February 13, 2007 Activate before finishing: example Example: replacement of compression service with new version
February 13, 2007 Activate before finishing: example Install new component create new component add marking support Install new component create new component link service-external outports new component add dispatching support
February 13, 2007 Activate before finishing: example Activate new component start processes link inports Activate new component start processes
February 13, 2007 Activate before finishing: example Finish old component impose safe state Finish old component intercept packets impose safe state
February 13, 2007 Activate before finishing: example Remove old component remove dispatching support unlink outports old component delete old component Remove old component remove marking support unlink outports old component delete old component
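Packet marking and dispatching, as used in the example above, lets old and new components coexist while both versions' packets are in transit. The sketch below is an illustration under assumptions: the one-byte tags, the function names, and the choice of `zlib` as the "old" and `bz2` as the "new" compression are all hypothetical. For simplicity both versions' packets carry a tag, whereas the slides only mention adding marking support alongside the new component.

```python
import bz2
import zlib

OLD_TAG = b"\x00"  # hypothetical marking for packets of the old compressor
NEW_TAG = b"\x01"  # hypothetical marking for packets of the new compressor


def send(payload: bytes, use_new: bool) -> bytes:
    """Sending node: compress with the old or new component and mark the packet."""
    if use_new:
        return NEW_TAG + bz2.compress(payload)
    return OLD_TAG + zlib.compress(payload)


def dispatch(packet: bytes) -> bytes:
    """Receiving node: route the packet to the matching decompressor."""
    tag, body = packet[:1], packet[1:]
    if tag == NEW_TAG:
        return bz2.decompress(body)
    if tag == OLD_TAG:
        return zlib.decompress(body)
    raise ValueError("unknown marking")
```

Since every packet names the component that produced it, the receiving node handles packets correctly even when they arrive re-ordered across the switch-over, which is why no packet interception is needed in this customization.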
February 13, 2007 Overview Problem statement Scope Approach Local reconfigurations Distributed reconfiguration Performance measurements Contributions and future research
February 13, 2007 Performance measurements: test setup
February 13, 2007 Communication disruption: add compression
February 13, 2007 Communication disruption: replace compression
February 13, 2007 Communication disruption: remove compression
February 13, 2007 Communication disruption: add reliability
February 13, 2007 Communication disruption: replace reliability
February 13, 2007 Communication disruption: remove reliability
February 13, 2007 Overview Problem statement Scope Approach Local reconfigurations Distributed reconfiguration Performance measurements Contributions and future research
February 13, 2007 Contributions: Programmable Networks NeCoMan: a middleware to dynamically reconfigure out-of-band active networks. Extensive analytical validation. Proof-of-concept prototype: DiPS+ node architecture, CuPS node reconfiguration support, NeCoMan DiPS+ VM, NeCoMan script generator. Performance evaluation. Projects: PEPITA (ITEA), SCAN (IWT), RACING (FWO), AgCo2 (GOA)
February 13, 2007 Contributions: Dynamic Software Reconfiguration Specification of the reconfiguration conditions that must be fulfilled to conduct correct and efficient reconfigurations. These reconfiguration conditions specify the partial ordering of the identified reconfiguration actions. Benefits of making these conditions explicit: they provide some guidance for the development of future reconfiguration support; they make it possible to reason about the reconfiguration process
February 13, 2007 Contributions: Dynamic Change Management Support Customizable change management support to reconfigure programmable networks. Black-box change management support: the reconfiguration process cannot be changed. Open change management support [Hillman and Warren 2004]: increases again the cost and risks of dynamic software reconfiguration. NeCoMan: allows customizing the reconfiguration process, yet still protects the user from the complexity of composing a correct and efficient reconfiguration algorithm
February 13, 2007 Future research Programmable Networks: NeCoMan cannot always select the optimal algorithm for each reconfiguration (this requires taking into account extra context-specific information); NeCoMan assumes that the execution of its actions never fails (requires failure recovery support). Other research areas: reflective middleware; middleware for transparent application reconfiguration; dynamic distributed aspect weaving
February 13, 2007 Thank you for your attention!