*An Integrated Self-Testing Framework for Autonomic Computing Systems Tariq M. King, Alain E. Ramirez, Rodolfo Cruz, Peter J. Clarke School of Computing and Information Sciences Florida International University FIU-SCIS Departmental Colloquium 07/14/2007 * To appear in Issue 8 2007 of the Journal of Computers * Supported in part by the National Science Foundation under grants IIS-0552555 and HRD-0317692
Introduction Continual growth in size and complexity of computing systems has led to a need for support tasks (maintenance, configuration, fault management) to be shifted from people to technology. Movement towards self-managing systems that can dynamically configure, heal, protect, and optimize themselves – Autonomic Computing. There is a lack of techniques to dynamically validate such systems. How can we be sure that AC systems behave correctly after a change at runtime? Much research is being performed in this area; however, a preliminary investigation revealed little or no research on testing these systems (a lot of existing research could be applied, but nothing specifically targeted the AC architecture).
Introduction (cont’d) This work presents a methodology which dynamically validates changes resulting from self-management in autonomic systems. Applies the concepts of autonomic managers, knowledge sources and manual management facilities to testing activities. Provides two general strategies based on system constraints and feasibility of use. Elaborates on a prototype that uses a generic design for autonomic systems that supports runtime testing.
Outline Background Overview of Testing Approach Test Managers Test Support Components Challenges Prototype Related Work Conclusion & Future Work
Autonomic Computing (1) IBM’s solution to the problems of integrating and managing highly complex computing systems Human Body – ANS regulates vital bodily functions without conscious human involvement (homeostasis) Applied to Computing Systems – automation of low-level tasks, and high-level goal specification. Main Characteristics Self-Configuration: configuring existing / new components Self-Optimization: balancing workloads, tuning efficiency Self-Protection: safeguarding from attacks or failures Self-Healing: diagnosing and repairing problems Remember to mention that features should be PROACTIVE
Autonomic Computing (2) Management console to facilitate human activity Coordinates Touchpoint AMs (within or across) Manages resources directly through touchpoint layer Implements sensor and effector interface for MR H/W or S/W entities being managed Start from bottom and come up, don’t forget to explain vertical layer of knowledge sources that allow information to be shared between managers Layered Architecture of Autonomic Computing Source: An Architectural Blueprint for Autonomic Computing, Fourth Ed., IBM Corporation, June 2006.
Autonomic Computing (3) Monitor: collects state information from MR Analyze: determines if and when self-* is needed Plan: formulates a plan to address change request Execute: executes the change plan on the MR Knowledge: stores data shared by the MAPE functions Monitor – could be static (structure) or dynamic (behavior) state information Analyze – MAPE Structure of Autonomic Managers Source: An Architectural Blueprint for Autonomic Computing, Fourth Ed., IBM Corporation, June 2006.
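The MAPE structure above can be sketched as a single pass of a control loop. This is a minimal illustration, not the blueprint's reference design: `ManagedBuffer`, `SketchAutonomicManager`, the 80% trigger, and the capacity-doubling plan are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical managed resource (MR): getters act as sensors, setCapacity as the effector.
class ManagedBuffer {
    private int capacity; // assumed positive
    private int used;

    ManagedBuffer(int capacity) { this.capacity = capacity; }

    void store() { if (used < capacity) used++; }
    int used() { return used; }
    int capacity() { return capacity; }
    void setCapacity(int newCapacity) { capacity = newCapacity; }
}

// Hypothetical autonomic manager with one method per MAPE function.
class SketchAutonomicManager {
    private final ManagedBuffer resource;
    // Knowledge: data shared by the MAPE functions (here, a simple log).
    private final List<String> knowledge = new ArrayList<String>();

    SketchAutonomicManager(ManagedBuffer resource) { this.resource = resource; }

    // Monitor: collect state information from the MR (percent full).
    int monitor() { return resource.used() * 100 / resource.capacity(); }

    // Analyze: decide whether self-configuration is needed (assumed 80% trigger).
    boolean analyze(int percentFull) { return percentFull >= 80; }

    // Plan: formulate the change (assumed policy: double the capacity).
    int plan() { return resource.capacity() * 2; }

    // Execute: apply the change plan through the effector.
    void execute(int newCapacity) { resource.setCapacity(newCapacity); }

    // One pass of the loop; returns true if a change was applied.
    boolean runOnce() {
        int percentFull = monitor();
        knowledge.add("observed " + percentFull + "% full");
        if (!analyze(percentFull)) {
            return false;
        }
        int target = plan();
        execute(target);
        knowledge.add("resized to " + target);
        return true;
    }

    List<String> knowledge() { return knowledge; }
}
```

A test manager in the sense of this talk would follow the same shape, with monitoring aimed at change requests and execution aimed at running test cases.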
Software Testing - Categories Two broad types of testing: Blackbox – specification-based, focuses on functionality, i.e., inputs → expected outputs. Whitebox – implementation-based, focuses on whether or not the program has been thoroughly (adequately) exercised. Regression Testing – determines whether or not modifications to software have introduced new errors into previously tested code. Retest-all – retest the entire test suite Selective – only retest a strict subset
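The difference between retest-all and selective regression testing can be sketched as follows. The selection rule used here (pick every test that exercises a changed method) is one common criterion and an assumption for illustration; `RegressionSelector` and the coverage map are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RegressionSelector {
    // Retest-all: every test case in the suite is rerun.
    static List<String> retestAll(Map<String, Set<String>> coverage) {
        return new ArrayList<String>(coverage.keySet());
    }

    // Selective: rerun only tests that exercise at least one changed method.
    // coverage maps a test case name to the set of methods it exercises.
    static List<String> selective(Map<String, Set<String>> coverage,
                                  Set<String> changedMethods) {
        List<String> selected = new ArrayList<String>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            for (String method : entry.getValue()) {
                if (changedMethods.contains(method)) {
                    selected.add(entry.getKey());
                    break; // one hit is enough to select this test
                }
            }
        }
        return selected;
    }
}
```

If only a `resize` method changed, the selective strategy would rerun just the tests whose coverage set contains `resize`, while retest-all would rerun the whole suite.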
Software Testing - Automation Automating the testing process involves: Designing test cases Developing test scripts Setting up a test harness for automatically: Setting up the test environment Executing test cases Logging the test results Evaluating the test log What happens when the post-test evaluation passes or fails?
Safe Adaptation Zhang et al. (WADS 2004) Source: J. Zhang, B. H. C. Cheng, Z. Yang, and P. K. McKinley. Enabling safe dynamic component-based software adaptation. In WADS, pages 194–211, 2004.
Testing Approach – Overview Idea: Develop an implicit self-test characteristic for autonomic computing systems Integrate activities of a self-testing framework into the workflow of autonomic managers (AMs) Test Managers interface with the autonomic system and coordinate the testing activity Framework consistent with grand vision of AC Based on two strategies: Safe Adaptation with Validation Replication with Validation
Testing Approach - Architecture Integrated Self-Testing Framework for AC Systems Extends the dynamic test model by King et al., “Towards Self-Testing in Autonomic Computing Systems” Based on two general approaches: Safe Adaptation with Validation, and Replication with Validation
Test Managers (TMs) Extend the concept of autonomic managers to testing activities, i.e., Self-Testing via MAPE Responsible for: Test Coordination Test Planning and Execution Test Suite Management Pre- and Post-test Setup Post-test Evaluation Storage of Test Artifacts
High-Level Test Coordination 1a,1b) Monitors OAM & TKS for state changes 2) Uploads new validation policy to Touchpoint TM 3) Touchpoint TM needs support tools to be set up 4) Invokes effector of ATS to configure support tool 5) Support tool configured successfully for TTM 6) TTM notified that it can begin test execution Orchestrating TM Interactions Manages multiple components in the self-testing framework for high-level test coordination
Low-Level Testing Tasks Touchpoint TM Interactions Performs testing on managed resources of the autonomic system via two control loops (a) and (b)
Test Support Components Auxiliary Test Services (ATS) allows TMs to configure external or third-party testing tools, code profilers, performance analyzers, etc. Provides mechanisms for accessing information in the test knowledge sources: updating validation policies, test suites, test logs/histories. Implements facilities for administrative functions related to manual test management: interactive test management, defect tracking, scenarios. Administrative functions will be integrated with the console of the autonomic system.
Challenges V&V techniques for offline systems not scalable; and still heavily dependent on human tester. Drawbacks magnified for adaptive systems. Testing adaptive nature requires dynamic: Regression test case selection Test suite analysis to: – generate new test cases when necessary. – identify test cases that are no longer applicable. Testing autonomic systems in the presence of unforeseen conditions
Overcoming Challenges Using proposed framework as the focal point for research directions to support QA in AC systems Some Directions: Use of formal specifications to generate test sequences, test oracles, and test data Executable formal specifications also support visualization systems (grand challenge of AC) Mechanisms for policy-based risk analysis and trust can support testing in unforeseen circumstances Risk-based strategies for regression test case selection With respect to generating test data: formal specifications can provide useful information for test case generation such as preconditions, postconditions, and invariants. Mechanisms for policy-based risk analysis: testing requirements can be based on the expected risk of interactions with unknown entities or in the presence of uncertain conditions.
Prototype - Features Autonomic Container – Data structure with autonomic capabilities and implicit self-test Version 1 – Stack with self-configuration and self-test. When 80% full, reconfigure by increasing capacity Version 2 – Implemented as a remote stack and added self-protection. When a user's stack exceptions > 3, protect by disabling the user Implemented Replication with Validation Validation required 100% pass rate for test cases, 75% for both branch and statement coverage Test suite: 24 test cases, using boundary, random, and equivalence partitioning.
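The validation criteria above (100% test case pass rate, 75% branch and statement coverage) amount to a simple threshold check. The sketch below is an illustration only: `ValidationPolicy` and its method are hypothetical names, not the prototype's classes, and in the prototype the inputs would come from the test and coverage tools rather than being passed in by hand.

```java
// Hypothetical validation policy holding thresholds as fractions in [0, 1].
class ValidationPolicy {
    private final double minPassRate;
    private final double minBranchCoverage;
    private final double minStatementCoverage;

    ValidationPolicy(double minPassRate,
                     double minBranchCoverage,
                     double minStatementCoverage) {
        this.minPassRate = minPassRate;
        this.minBranchCoverage = minBranchCoverage;
        this.minStatementCoverage = minStatementCoverage;
    }

    // Returns true only if every threshold is met; an empty test run never validates.
    boolean isSatisfiedBy(int passed, int total,
                          double branchCoverage, double statementCoverage) {
        if (total <= 0) {
            return false;
        }
        double passRate = (double) passed / total;
        return passRate >= minPassRate
                && branchCoverage >= minBranchCoverage
                && statementCoverage >= minStatementCoverage;
    }
}
```

Under Replication with Validation, a check like this would be evaluated against test results gathered on a copy of the managed resource, and the change would be accepted only when it returns true. The coverage figures used to exercise the sketch should be treated as made-up inputs, not reported results.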
Prototype – Setup Environment Developed in Java 5.0 using Eclipse 3.3 SDK with required test support plugins and libraries. Design Tools: StarUML – class, activity, package diagrams OribeXML – creating and modifying XML policies Test Support Tools: JUnit – a Java unit testing tool from the xUnit family of testing frameworks Cobertura – a Java code coverage analysis tool that calculates the percentage of source code exercised by unit tests.
Prototype – Top-Level Design Top-Level Design of Autonomic Container
Prototype – Manager Design Generic Design of Autonomic Managers
Prototype – Policy Design
Prototype – Self-Test Classes Minimal Class Diagram of Self-Test Subsystem
Prototype – Evaluation Used a mutation testing technique to evaluate the fault detecting ability of the prototype Strategy: simulate faulty change requests to the managed resource (stack) under SC and SP: SC – created mutant stack by modifying the resize method to cause decreased capacity SP – created mutant stack by altering disable account method to enable the user Analyzed the results of executing the self-testing framework on original stack and the mutants
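The self-configuration (SC) mutant described above can be sketched as a subclass whose resize step shrinks the stack instead of growing it. All class and method names here are hypothetical, and the doubling and halving factors are assumptions; the prototype's actual stack is not shown in these slides.

```java
// Hypothetical stack that grows when it becomes 80% full.
class SelfConfigStack {
    protected int[] items;
    protected int size;

    SelfConfigStack(int capacity) { items = new int[capacity]; }

    int capacity() { return items.length; }

    void push(int value) {
        if (size == items.length) {
            throw new IllegalStateException("stack full");
        }
        items[size++] = value;
        if (size * 100 >= items.length * 80) {
            resize(); // self-configuration trigger
        }
    }

    int pop() {
        if (size == 0) {
            throw new IllegalStateException("stack empty");
        }
        return items[--size];
    }

    // Correct behavior: increase capacity (assumed factor of 2).
    protected void resize() { setCapacity(items.length * 2); }

    protected void setCapacity(int newCapacity) {
        int[] replacement = new int[newCapacity];
        int kept = Math.min(size, newCapacity);
        System.arraycopy(items, 0, replacement, 0, kept);
        items = replacement;
        size = kept;
    }
}

// Mutant: the faulty change request makes resize decrease capacity.
class MutantStack extends SelfConfigStack {
    MutantStack(int capacity) { super(capacity); }

    @Override
    protected void resize() { setCapacity(Math.max(1, items.length / 2)); }
}

class ResizeCheck {
    // Test case: after filling to the 80% threshold, capacity must have grown.
    static boolean capacityGrowsAtThreshold(SelfConfigStack stack) {
        int before = stack.capacity();
        int toPush = (before * 80 + 99) / 100; // ceiling of 80% of capacity
        for (int i = 0; i < toPush; i++) {
            stack.push(i);
        }
        return stack.capacity() > before;
    }
}
```

Running `capacityGrowsAtThreshold` on the original stack passes and on the mutant fails, which is the signal a test manager would use to reject the faulty change.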
Prototype – Results Correct change scenarios: Incorrect change scenarios: Two TC failures each, coverage omitted Mutation analysis produced favorable results as testing would have prevented potentially harmful changes to the managed resource Feature TC Pass Rate Branch Statement Self-Config 100% 85% Self-Protect 88%
Prototype – Lessons Learned Provided us with insight on the scope of the self-testing subsystem w.r.t. responsibilities. Synchronization of AMs and intelligent control loops is a major challenge. Limitations: Dynamic test planning (current: static lookup) Need to implement Safe Adaptation with Validation Strategy Code-based changes Responsibilities w.r.t. what operations should be performed by the autonomic computing system, and what should be performed by the self-testing framework.
Related Work Towards Self-Testing in Autonomic Computing Systems, King et al., ISADS ‘07. A Self-Testing Autonomic Container, Stevens et al., ACMSE ‘07. Synthesizing assertions from observed behavior, Denaro et al., ACC ’05. Embeds assertions into the communication infrastructure. Assertions are checked at runtime. Making components self-testable, Le Traon et al., TOOLS ’99. Le Traon – test sequences and test oracles are included in the implementation; they only consider the unit level.
Conclusion and Future Work Proposed a framework that dynamically validates change requests in AC systems. Approach extends the current structure of AC systems to include self-testing. Supports two validation strategies. Developed a prototype to show their feasibility. Future work calls for: Extending the capabilities of the prototype Implementing safe adaptation with validation Evaluating efficiency of the two approaches
Summer ’07 and Ongoing Work REU Program 2007 – Alain E. Ramirez and Barbara Morales; Autonomic Job Scheduler A Generic O-O Design to Support Self-Testable Autonomic Systems (SAC ‘07) Student Paper to ACMSE 2008 Fall 2007 – Gonzalo Argote-Garcia Embedding Formal Specifications into Autonomic Components ? – Alain E. Ramirez Applying Design Patterns to AC systems
Acknowledgements Djuradj Babich Jonatan Alava Ronald Stevens Brittany Parsons Mitul Patel Dr. S.M. Sadjadi Tariq
Thank You Questions?