1
Action Breakout Session
Anil, AP, Nina Bhatti, Charles Berdnall, Joe Hellerstein, Wei Hu, Anthony Joseph, Randy Katz, Li, Machi Mukund, Kimmo Raatikainen, Siva
2
Breakout Goal
Identify research questions and issues related to adaptive action invocation to enhance the dependability and security of distributed systems
Customer is the “system administrator,” not the end user
3
Breakout Process
Define actions by example
Discuss cross-layer interaction and coordination
Distill underlying principles
4
Key Observations
Distinguish between control actions (e.g., “slow down”) and data actions (e.g., “drop packets”)
Distinguish between internal/locally performed actions and actions that affect global behavior
Control loops operate at multiple levels, regionally and globally
Performance-related actions are the basic building block
The control system itself can be the target of an adversarial attack
5
Working Examples
Network Storage Service; Media Streaming Service
  Multiple instances of the service at various places in the network
  Direct requests to the best available service instance
  Balance requests among service instances
  Fall back to an alternative service instance in the face of failure or DoS attack
  Coordinate measurements on client side and server side to reduce load through admission control and content adaptation
  Distinguish between server overload and network overload
  For clients “not in the loop” (heterogeneous clients, adversarial clients), proxy the necessary behavior inside the network
Network Denial of Service
  Overload data traffic and starve control traffic
  Secondary performance effects: session resets, router CPUs driven to high utilization, etc.
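The instance-selection and fallback behavior for the service examples can be sketched as follows. This is a minimal illustration only: the `Instance` record, its `healthy` and `load` fields, and the 0.9 overload threshold are hypothetical and do not come from the breakout discussion.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool   # outcome of the most recent health check (hypothetical signal)
    load: float     # fraction of capacity in use, 0.0 to 1.0 (hypothetical signal)


def pick_instance(instances, overload_threshold=0.9):
    """Return the least-loaded healthy instance, or None if none qualifies."""
    candidates = [i for i in instances if i.healthy and i.load < overload_threshold]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i.load)


instances = [
    Instance("east", healthy=True, load=0.70),
    Instance("west", healthy=True, load=0.35),
    Instance("south", healthy=False, load=0.10),
]

# Direct the request to the best available instance.
chosen = pick_instance(instances)
print(chosen.name)

# Simulate a failure or DoS attack on that instance, then fall back.
instances[1].healthy = False
fallback = pick_instance(instances)
print(fallback.name)
```

Choosing the least-loaded healthy instance covers both "direct to best available" and "balance requests" in a single rule; a real deployment would also need to separate server overload from network overload, which this sketch does not attempt.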
6
Control Theoretic Viewpoint
Black boxes that are managed by a control system
Actuation points that can be acted upon to control the system
  E.g., apply backpressure to clients to slow down request rate (control); degrade content quality (data)
  E.g., prioritize/reserve bandwidth for control traffic
Policy settings are control actions; enforcement of policy is a data action
Single vs. independent control loops: which is better?
Theory provides tools for managing “disturbances”
Note that the control system can itself be the target of attack
Hellerstein: an action is a change to a configuration
  E.g., buffer pool size, weights in load balancer
  E.g., uninstall/reinstall software
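The backpressure example can be read as a simple feedback loop: measure latency, compare it with a target, and adjust the admitted request rate. The sketch below assumes a made-up linear plant model (latency = 20 ms + 0.8 ms per request/s) and an integral-style controller; the gain, target, and model are all illustrative assumptions, not values from the session.

```python
def simulate_backpressure(target_latency_ms=100.0, gain=0.5, steps=30):
    """Adjust the admitted request rate until measured latency meets the target."""
    rate = 200.0  # admitted requests per second at the start (assumed)
    history = []
    for _ in range(steps):
        # Hypothetical black-box model: latency grows linearly with the rate.
        latency = 20.0 + 0.8 * rate
        error = target_latency_ms - latency
        # Control action: positive error admits more, negative error applies backpressure.
        rate = max(0.0, rate + gain * error)
        history.append((rate, latency))
    return history


history = simulate_backpressure()
final_rate, final_latency = history[-1]
print(round(final_rate, 2), round(final_latency, 2))
```

With these assumed numbers the loop settles where the modeled latency equals the target, at roughly 100 requests/s. The only actuation point here is the admitted rate (a control action); degrading content quality would be a separate data action.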
7
General Observations: Causality and Visibility
Actions can lead to cascaded actions
Can interactions/side effects be modeled/made explicit?
Action graph model: probability that a following action will be invoked as the result of a given current action
  In general, difficult to determine in advance
  Could it be learned via observe/analyze?
Is it feasible to place action points at every potential bottleneck site?
Note that routers are badly designed black boxes; it is difficult and time consuming to extract their internal state
Tradeoff between centralized collection of state that may be “complete” but out of date vs. decentralized collection that may be more timely but globally incomplete
Principle of containment: first do no harm; local actions are potentially less disastrous than global actions
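One way to picture learning the action graph by observation is to count, in logged action sequences, how often each action is immediately followed by another. The sketch below assumes such traces exist; the action names and the trace format are invented for illustration.

```python
from collections import Counter, defaultdict


def learn_action_graph(traces):
    """Estimate P(next action | current action) from observed action sequences."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Count each adjacent (current, following) pair in the trace.
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    graph = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        graph[current] = {action: n / total for action, n in followers.items()}
    return graph


# Hypothetical logged traces of cascaded actions.
traces = [
    ["throttle_clients", "degrade_content", "reroute"],
    ["throttle_clients", "degrade_content"],
    ["throttle_clients", "reroute"],
    ["degrade_content", "reroute"],
]
graph = learn_action_graph(traces)
print(graph["throttle_clients"])
```

Note that this estimate conditions on some action having followed; a trace that ends with an action contributes nothing to that action's outgoing edges, so the probability that no cascade occurs is not captured.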
8
General Observations: Managing Disturbances
Instabilities arise where delays in taking action are introduced
  Latencies in response
  Imperfect knowledge of the state
Tradeoff in making decisions based on longer intervals spanning more state vs. shorter intervals spanning less state
Time intervals adapt … short time to ensure useful work always being done
E.g., disk scheduling in Storage Server
  You can only do work you are aware of
  Keep the queues short to achieve best performance
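The claim that delay introduces instability can be illustrated with a toy loop in which the controller acts on a stale measurement. The gain of 0.8, the two-step delay, and the target of 100 are assumed values chosen only to make the effect visible.

```python
def run_loop(delay_steps, gain=0.8, target=100.0, steps=60):
    """Return the tracking error over time when measurements are delay_steps old."""
    values = [0.0]  # controlled quantity, e.g. an admitted request rate
    for n in range(steps):
        # The controller sees the value from delay_steps iterations ago.
        observed = values[max(0, n - delay_steps)]
        values.append(values[n] + gain * (target - observed))
    return [target - v for v in values]


prompt_errors = run_loop(delay_steps=0)
delayed_errors = run_loop(delay_steps=2)

# Compare the worst error over the last ten steps of each run.
prompt_tail = max(abs(e) for e in prompt_errors[-10:])
delayed_tail = max(abs(e) for e in delayed_errors[-10:])
print(prompt_tail < 1e-6, delayed_tail > 100.0)
```

With a fresh measurement the error shrinks at every step, while the same gain applied to a two-step-old measurement overshoots and oscillates with growing amplitude. Lowering the gain or shortening the delay restores stability in this toy model.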
9
General Observations: Predictive Actions
Waiting too long to detect a problem limits the ability to respond
Characterize workload/response changes as signature of impending system performance failure
Response to workload changes: “gradual” vs. cliff degradation
E.g., as I/O workload grows, predict increases in response latency
E.g., IBM detects changes to slope of activity to trigger resource allocation to manage flash crowds in web server farms
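A slope-change trigger of the kind mentioned for flash crowds might look like the sketch below. It is not the IBM mechanism, whose details are not given here; the window size, the `slope_jump` threshold, and the sample request rates are all assumptions for illustration.

```python
def slope(samples):
    """Least-squares slope of samples taken at unit time intervals."""
    n = len(samples)
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def should_allocate(request_rates, window=5, slope_jump=20.0):
    """Trigger when the recent slope exceeds the preceding window's slope by slope_jump."""
    if len(request_rates) < 2 * window:
        return False  # not enough history to compare two windows
    earlier = slope(request_rates[-2 * window:-window])
    recent = slope(request_rates[-window:])
    return recent - earlier > slope_jump


steady = [100, 102, 101, 103, 102, 104, 103, 105, 104, 106]
flash = [100, 102, 101, 103, 102, 110, 140, 180, 230, 290]
print(should_allocate(steady), should_allocate(flash))
```

Comparing slopes rather than absolute levels lets the trigger fire while the load is still well below the "cliff", which is the point of acting predictively.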
10
General Observations: Don’t Ignore the Human Decision Maker
Human operators in the loop
Research challenge: visualizing the configuration and state of the system to a human decision maker
Higher order configuration and administration tools and frameworks