Decision-Theoretic Planning with (Re)Deployment of Components in Distributed Real-time & Embedded Systems Douglas C. Schmidt

Slides:



Advertisements
Similar presentations
IBM Software Group ® Integrated Server and Virtual Storage Management an IT Optimization Infrastructure Solution from IBM Small and Medium Business Software.
Advertisements

Distributed Multimedia Systems
CS 795 – Spring  “Software Systems are increasingly Situated in dynamic, mission critical settings ◦ Operational profile is dynamic, and depends.
Resource Management of Highly Configurable Tasks April 26, 2004 Jeffery P. HansenSourav Ghosh Raj RajkumarJohn P. Lehoczky Carnegie Mellon University.
Variability Oriented Programming – A programming abstraction for adaptive service orientation Prof. Umesh Bellur Dept. of Computer Science & Engg, IIT.
SIGMETRICS 2008: Introduction to Control Theory. Abdelzaher, Diao, Hellerstein, Lu, and Zhu. CPU Utilization Control in Distributed Real-Time Systems Chenyang.
RT Reading Group Meeting 06/27/2002 Veljko Krunic.
Vertically Integrated Analysis and Transformation for Embedded Software John Regehr University of Utah.
Resource Management – a Solution for Providing QoS over IP Tudor Dumitraş, Frances Jen-Fung Ning and Humayun Latif.
Quality of Service in IN-home digital networks Alina Albu 7 November 2003.
High-level System Modeling and Power Management Techniques Jinfeng Liu Dept. of ECE, UC Irvine Sep
1 12/10/03CCM Workshop QoS Engineering and Qoskets George Heineman Praveen Sharma Joe Loyall Richard Schantz BBN Technologies Distributed Systems Department.
1 Quality Objects: Advanced Middleware for Wide Area Distributed Applications Rick Schantz Quality Objects: Advanced Middleware for Large Scale Wide Area.
Improving Robustness in Distributed Systems Jeremy Russell Software Engineering Honours Project.
WPDRTS ’05 1 Workshop on Parallel and Distributed Real-Time Systems 2005 April 4th and 5th, 2005, Denver, Colorado Challenge Problem Session Detection.
In-Band Flow Establishment for End-to-End QoS in RDRN Saravanan Radhakrishnan.
Chapter 13 Embedded Systems
Investigating Lightweight Fault Tolerance Strategies for Enterprise Distributed Real-time Embedded Systems Tech-X Corporation Boulder, Colorado Vanderbilt.
A General approach to MPLS Path Protection using Segments Ashish Gupta Ashish Gupta.
Copyright Arshi Khan1 System Programming Instructor Arshi Khan.
EMBEDDED SOFTWARE Team victorious Team Victorious.
23 September 2004 Evaluating Adaptive Middleware Load Balancing Strategies for Middleware Systems Department of Electrical Engineering & Computer Science.
QoS-enabled middleware by Saltanat Mashirova. Distributed applications Distributed applications have distinctly different characteristics than conventional.
Sensor Coordination using Role- based Programming Steven Cheung NSF NeTS NOSS Informational Meeting October 18, 2005.
Software Engineering Muhammad Fahad Khan
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 18 Slide 1 Software Reuse.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
Tufts Wireless Laboratory School Of Engineering Tufts University “Network QoS Management in Cyber-Physical Systems” Nicole Ng 9/16/20151 by Feng Xia, Longhua.
26 Sep 2003 Transparent Adaptive Resource Management for Distributed Systems Department of Electrical Engineering and Computer Science Vanderbilt University,
Adaptive QoS Management for IEEE Future Wireless ISPs 通訊所 鄭筱親 Wireless Networks 10, 413–421, 2004.
Integrating Fine-Grained Application Adaptation with Global Adaptation for Saving Energy Vibhore Vardhan, Daniel G. Sachs, Wanghong Yuan, Albert F. Harris,
Overview of the Resource Allocation & Control Engine (RACE) Ed Mulholland Tom Damiano Patrick Lardieri Jai Balasubramanian James Hill Will Otte Nishanth.
Cluster Reliability Project ISIS Vanderbilt University.
Magnetic Field Measurement System as Part of a Software Family Jerzy M. Nogiec Joe DiMarco Fermilab.
SAMANVITHA RAMAYANAM 18 TH FEBRUARY 2010 CPE 691 LAYERED APPLICATION.
Reconfigurable Real-Time Middleware for Distributed Cyber-Physical Systems with Aperiodic Events Yuanfang Zhang, Christopher Gill, Chenyang Lu Department.
A Framework for Elastic Execution of Existing MPI Programs Aarthi Raveendran Graduate Student Department Of CSE 1.
1 of 14 1/15 Synthesis-driven Derivation of Process Graphs from Functional Blocks for Time-Triggered Embedded Systems Master thesis Student: Ghennadii.
Salim Hariri HPDC Laboratory Enhanced General Switch Management Protocol Salim Hariri Department of Electrical and Computer.
MURI: Integrated Fusion, Performance Prediction, and Sensor Management for Automatic Target Exploitation 1 Dynamic Sensor Resource Management for ATE MURI.
Performance evaluation of component-based software systems Seminar of Component Engineering course Rofideh hadighi 7 Jan 2010.
Investigating Survivability Strategies for Ultra-Large Scale (ULS) Systems Vanderbilt University Nashville, Tennessee Institute for Software Integrated.
Software Product Line Material based on slides and chapter by Linda M. Northrop, SEI.
Resource Mapping and Scheduling for Heterogeneous Network Processor Systems Liang Yang, Tushar Gohad, Pavel Ghosh, Devesh Sinha, Arunabha Sen and Andrea.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
A Software Framework for Distributed Services Michael M. McKerns and Michael A.G. Aivazis California Institute of Technology, Pasadena, CA Introduction.
Accommodating Bursts in Distributed Stream Processing Systems Yannis Drougas, ESRI Vana Kalogeraki, AUEB
NetQoPE: A Middleware-based Netowork QoS Provisioning Engine for Distributed Real-time and Embedded Systems Jaiganesh Balasubramanian
Adaptive Resource Management Architecture for DRE Systems Nishanth Shankaran
DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters Nanyang Technological University Shanjiang Tang, Bu-Sung Lee, Bingsheng.
Full and Para Virtualization
Review of Parnas’ Criteria for Decomposing Systems into Modules Zheng Wang, Yuan Zhang Michigan State University 04/19/2002.
Topic 2: The Role of Open Standards, Open-Source Development, & Different Development Models & Processes (on Industrializing Software) ARO Workshop Outbrief,
Chapter 1 Basic Concepts of Operating Systems Introduction Software A program is a sequence of instructions that enables the computer to carry.
August 20, 2002 Applying RT-Policies in CORBA Component Model Nanbor Wang Department of Computer Science Washington University in St. Louis
Improving System Availability in Distributed Environments Sam Malek with Marija Mikic-Rakic Nels.
Euro-Par, HASTE: An Adaptive Middleware for Supporting Time-Critical Event Handling in Distributed Environments ICAC 2008 Conference June 2 nd,
FLARe: a Fault-tolerant Lightweight Adaptive Real-time Middleware for Distributed Real-time and Embedded Systems Dr. Aniruddha S. Gokhale
September 2003, 7 th EDG Conference, Heidelberg – Roberta Faggian, CERN/IT CERN – European Organization for Nuclear Research The GRACE Project GRid enabled.
Resource Optimization for Publisher/Subscriber-based Avionics Systems Institute for Software Integrated Systems Vanderbilt University Nashville, Tennessee.
Why is Design so Difficult? Analysis: Focuses on the application domain Design: Focuses on the solution domain –The solution domain is changing very rapidly.
Input and Output Optimization in Linux for Appropriate Resource Allocation and Management James Avery King.
Marilyn Wolf1 With contributions from:
Standards and Patterns for Dynamic Resource Management
Gabor Madl Ph.D. Candidate, UC Irvine Advisor: Nikil Dutt
CIS 488/588 Bruce R. Maxim UM-Dearborn
Utility-Function based Resource Allocation for Adaptable Applications in Dynamic, Distributed Real-Time Systems Presenter: David Fleeman {
SAMANVITHA RAMAYANAM 18TH FEBRUARY 2010 CPE 691
Presented By: Darlene Banta
Paper by D.L Parnas And D.P.Siewiorek Prepared by Xi Chen May 16,2003
Presentation transcript:

Decision-Theoretic Planning with (Re)Deployment of Components in Distributed Real-time & Embedded Systems Douglas C. Schmidt Nishanth Shankaran, John S. Kinnebrew, Gautam Biswas, Dipa Suri, & Adam S. Howell

2 Enterprise Distributed Real-time Embedded (DRE) Systems Operate under limited resources Tight real-time performance QoS constraints Dynamic & uncertain environments Dynamic & changing goals Distribution of computation Multiple onboard processors Task distribution among satellites/processors Integration of information Data collection – Gizmo agent Process data – Science agent Downlink to earth – Comm. agent Coordinated operation Limited CPU, memory and network bandwidth Possible hardware failure Limited CPU, memory and network bandwidth

3 Motivating Application: Earth Science Enterprise Mission Collect Data Process Data Downlink to Earth Trigger Message A Final Result Message B System Description End-to-end systems-tasks/work-flows represented as operational string of components Operational strings simultaneously share resources Strings are dynamically added/removed from the system based on mission & mode System requirements 1.Automatically & accurately adapt to dynamic changes in mission requirements & environmental conditions 2.Handle failures arising from system failures Classes of operational strings with respect to importance Mission Critical, Mission Support, & Best Effort Solution: Spreading Activation Partial Order Planner (SA-POP) for decision-theoretic planning under resource constraints, coupled with Resource Allocation & Control Engine (RACE), a dynamic resource management framework for DRE systems

4 SA-POP Research and Development Challenges Research Challenges 1.Efficiently handle uncertainty in planning 2.Incorporate resource-aware scheduling with planning Development Challenges 1.Take advantage of functionally interchangeable components to efficiently meet resource constraints 2.Plan with multiple interacting goals, but produce distinct operational strings

5 SA-POP: Planning in DRE Systems with Components Task is an abstraction of functionality Multiple (parameterized) components may have the same function but different resource usage Task Network specifies probabilistic effects and requirements for tasks Condition nodes specify data flow and system/environmental conditions Task nodes have links to/from condition nodes specifying effects/preconditions Links incorporate probabilistic information about domains Task Map allows conversion between tasks and components Maps tasks (functionality abstraction) to parameterized components (implementation) Associates expected or worst case resource usage with each implementation Operational String specifies a component-based application to achieve a goal Set of tasks along with ordering and timing constraints Data connections between tasks Implementation (parameterized component) suggested for each task

6 SA-POP: Expected Utility Calculation using Spreading Activation ajaj cici ckck c c c Task node Precondition nodes (input data and system/environment preconditions) Effect nodes (output data and system/environment effects) From Action/Task Nodes To Action/Task Nodes Precondition Link Weights Forward propagation of probabilities Backward propagation of utilities Effect Link Weights

7 Least commitment partial order planning SA-POP: Operational String Generation Four hierarchical decision points in each planning step: 1.Goal/subgoal choice: choose an open condition, which is goal or subgoal unsatisfied in the current plan. 2.Task choice: choose a task that can achieve current open condition. 3.Task instantiation: choose an implementation for this task from the Task Map. 4.Scheduling decision(s): adjust task start/end time windows and/or add ordering constraints between tasks to avoid resource violations. Resource constrained scheduling } }

8 RACE Research & Development Challenges Research Challenges 1.Efficiently allocate computing & network resources to application components 2.Avoid over-utilization of system resources – ensure system stability 3.Maintain QoS even in the presence of failure 4.Ensure end-to-end QoS requirements are met – even under high load conditions Development Challenges 1.Need multiple resource management algorithms depending on application characteristics & current system condition (resource availability) 2.Single resource management mechanism customized for a specific mission goal or set of mission goals might be effective for that specific scenario 3.However, can not be reused for other scenario  Reinvent the wheel for every scenario

9 RACE Functional Architecture Dynamic resource management framework atop CORBA Component Model (CCM) middleware (CIAO/DAnCE) Allocates components to available resources Configure components to satisfy QoS requirements based on dynamic mission goals Perform run-time adaptation Coarse-grained mechanisms React to new missions, drastic changes in mission goals, or unexpected circumstances such as loss of resources e.g., component re-allocation or migration Fine-grained mechanisms Compensate for drift & smaller variations in resource usage e.g., adjustment of application parameters, such as QoS settings

10 RACE Software Component Architecture Input Adapter A generic interface that translates application components descriptors to internal data structure Plan Analyzer Examines input descriptors & select appropriate allocation algorithms based on application characteristics Add appropriate application QoS & resource monitors Planner Manager Maintains a registry of installed planners (algorithm implementations) along with their over head & currently executing sequences of planners that are generated by the Plan Analyzer

11 RACE Software Component Architecture CIAO/DAnCE Middleware A CCM middleware framework atop of which components of the operational strings are deployed Target Manager Runtime resource utilization monitor that tracks the utilization of system resources, such as onboard CPU, memory, and network bandwidth utilization.

12 RACE: Addressing Dynamic Resource Management Challenges Need for a control framework Allocation algorithms allocates resource to components based on current system condition & estimated resource requirements No accurate apriori knowledge of input workload & how the execution time depend on input workload Dynamic changes in system operating modes Control objectives: Ensure end-to-end QoS requirements are met at all times Ensure resource utilization is below the set-point – ensure system stability RACE is available with the Component-Integrated ACE ORB (CIAO) ( RACE Controller: Reallocates resources to meet control objectives RACE Control Agents: Maps resource reallocation to OS/application specific parameters

13 Experimentation Results – Hardware / Software Testbed Experiments were performed on the ISISLab testbed at Vanderbilt University ( Hardware Configuration 6 nodes with 2.8 GHz Intel Xeon dual processor, 1 GB physical memory, 1Ghz Ethernet network interface, and 40 GB hard drive Software Configuration Redhat Fedora Core Release 4 operating ACE+TAO+CIAO Middleware Two operational strings - one mission-critical, one best effort -with 6 components each were deployed on 6 nodes

14 Experimentation Results – Performance Analysis An end-to-end deadline of 500 ms was specified for the mission-critical operational string Mission critical string was deployed at time T = 0s, and best-effort was deployed at time T = 1800 sec Until T = 1800 sec, end-to-end execution time of mission critical string is lower than its deadline At T = 1800 sec, end-to-end execution time of mission critical string is way above its deadline This is due to excessive resources consumption by best-effort string RACE reacts to increase in execution time by perform adaptive system control modifications by modifying operating system priority, scheduler class and/or tearing down lower priority operational string(s) Deadline Execution Time RACE ensures end-to-end deadline of mission critical string is met even under fluctuations in resource availability/demand

15 Lessons Learned from SA-POP and RACE Integration Flexible Pluggable resource allocation and control algorithms in RACE SA-POP task network and task map tailored to domain Unlimited combinations of goals, goal priorities, and timing requirements in SA-POP Shared task map allows substitution of functionally equivalent task implementations by RACE Scalable Separation of concerns between SA-POP and RACE limits search spaces in each SA-POP handles cascading planning choices in operational string generation SA-POP only considers resource allocation feasibility with course-grained (system-wide) resource constraints RACE handles resource allocation optimization with fine-grained (individual processing node) resource constraints and dynamic control for fixed operational strings

16 Lessons Learned from SA-POP and RACE Integration Dynamic SA-POP task network with spreading activation provides expected utility information for generating robust applications in uncertain environments SA-POP replanning achieved efficiently with incrementally updated task network and plan repair as necessary RACE control algorithms alleviate need for replanning in many cases RACE provides reallocation and redeployment of revised operational strings when replanning is necessary