Published by Chester McBride. Modified over 9 years ago.
1
MyOps: An Operational Framework for PlanetLab Deployments
2
Outline
o Objective of MyOps
o Current status
o Future ideas
o Questions at any time
3
Example of Feedback
4
Objective: Close Operational Cycle
System - Provides service (slice)
Monitoring - Feedback from running system
Operator - Interpret feedback into tasks
Management - Control running system
5
Challenges: Break-down
System may not deliver service
Monitoring may not observe useful metrics
Operator may not know
o how to interpret observations
o how to control the system
o what the service goals are
Management may not control system
6
Requirements for Operational Systems
Satisfy Minimal Conditions
1. Physical Integrity
2. Interconnectivity
3. Controllable
4. Provide a Service
Two requirements
o Reliably reach the final condition
o When failures occur, repair or report automatically
Two approaches in MyOps
o Precise bootstrap stages (not discussed)
o Operational monitoring & management in platform
7
System: PlanetLab Slices
8
Monitoring Types
Open-loop monitoring
o Identify the unknown
o More information, fine-grained
Operational monitoring (closed-loop)
o Correctness
o Less information, coarse-grained
o Actionable
9
Management Types
Open-loop management
o Bootstrap/Deploy from the ground up
o Inefficient, coarse-grained
o No feedback
Operational management (closed-loop)
o Tweak the system to correct behavior
o More efficient, fine-grained
10
Example
Observe: Node is Off-line
Control: Attempt to Power-On
Observe: Node is On-line but Failed to boot
Observe: Failed to boot Error
Control: Create ticket & Send email to local contact
Time passes
Control: Disable slice creation
Observe: Local contact responds
Observe: Node is Powered-on and Running
Control: Re-enable slice creation
Control: Close ticket
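The observe/control sequence in this example can be read as a lookup from observations to control actions. The following is a minimal sketch under that reading, not MyOps code: the observation and action names are hypothetical labels for the steps on the slide, and "no_response" stands in for the "Time passes" step.

```python
from typing import Dict, List

# Hypothetical observation -> control mapping, taken from the example steps.
# None of these identifiers come from MyOps itself.
RESPONSES: Dict[str, List[str]] = {
    "offline": ["attempt_power_on"],
    "failed_to_boot": ["create_ticket", "email_local_contact"],
    "no_response": ["disable_slice_creation"],  # stands in for "Time passes"
    "running": ["re_enable_slice_creation", "close_ticket"],
}


def control_for(observation: str) -> List[str]:
    """Return the control actions for one observation (empty if none apply)."""
    return list(RESPONSES.get(observation, []))


def replay(observations: List[str]) -> List[str]:
    """Apply the mapping to a sequence of observations, in order."""
    actions: List[str] = []
    for obs in observations:
        actions.extend(control_for(obs))
    return actions
```

Replaying the observations from the slide, in order, yields the same control steps the slide lists. An observation with no entry, such as the local contact responding, produces no action of its own.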
11
History of PlanetLab Operations
Open-loop Monitoring with Open-loop Management
o Collect fine-grained statistics using CoMon
o Act with coarse-grained operations (e.g. Reinstall)
o Manual bridge between the two
Moving towards Closed-loop Operations
o Collect targeted metrics
o Take directed, problem-specific actions
o Automate actions based on policy
12
PlanetLab Operations
Close the monitor/management cycle
o Direct automation of common operations
o Indirect through remote contacts and incentives
13
MyOps Architecture
Collection from Node
Translated by policy to Automated action
14
MyOps Architecture
Collection from Node
Send notice to Local contact to take action
15
MyOps Architecture
When there is no response
Indirect influence with incentives
16
Collection
Operational monitoring of specific targets, such as:
o Boot status, Filesystem status
o DNS - internal and external
o RPMs
o System services, etc.
Periodic collection
o Coarse-grained collection at a human timescale
o Time-series of events and status
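One way to picture how periodic status samples become a time-series of events is to keep only the samples where a target's status changes. The sketch below illustrates that idea; the `Event` record, the `to_events` helper, and the integer timestamps are assumptions made for this example and are not taken from MyOps.

```python
from typing import Dict, List, NamedTuple, Tuple


class Event(NamedTuple):
    time: int     # sample timestamp, e.g. hours since collection began
    target: str   # what was checked, e.g. "boot" or "dns"
    status: str   # status observed at that time


def to_events(samples: List[Tuple[int, str, str]]) -> List[Event]:
    """Keep only samples where a target's status differs from its previous one."""
    last: Dict[str, str] = {}
    events: List[Event] = []
    for time, target, status in sorted(samples):
        if last.get(target) != status:
            events.append(Event(time, target, status))
            last[target] = status
    return events
```

Repeated samples with an unchanged status are dropped, so a node that stays healthy for days contributes a single event rather than one row per collection round.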
17
Policy
Constraints over a time-series of events
To satisfy a constraint
o Automated action
o Send notice
o Apply incentive
Policy defines
o Preferred status of system
o Frequency of actions
o Magnitude of incentives
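A policy of this kind can be sketched as a preferred status plus a minimum interval between actions, evaluated against a time-ordered list of status events. This is an illustrative sketch only: the `Policy` fields, the function names, and the integer time units are assumptions, and the magnitude of incentives is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Policy:
    preferred_status: str  # status the policy wants the system to be in
    action_interval: int   # minimum time between actions (frequency of actions)


def violation_start(events: List[Tuple[int, str]], policy: Policy) -> Optional[int]:
    """Return when the current violation began, or None if the constraint holds.

    `events` is a time-ordered list of (time, status) pairs.
    """
    start: Optional[int] = None
    for time, status in events:
        if status == policy.preferred_status:
            start = None          # constraint satisfied again
        elif start is None:
            start = time          # first event of a new violation
    return start


def should_act(events: List[Tuple[int, str]], policy: Policy,
               now: int, last_action: Optional[int]) -> bool:
    """Act only while the constraint is violated, and no more often than allowed."""
    if violation_start(events, policy) is None:
        return False
    return last_action is None or now - last_action >= policy.action_interval
```

For example, with `Policy("online", 24)` and a node that went down at time 5, `should_act` returns True for the first check, False for a check 8 time units after the last action, and True again once 24 units have passed.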
18
Automation
Automatic correction of common bootstrap problems
o Communication errors with MyPLC
o Corrupt filesystem repair
o Retry when state is unknown
o PCU Reboot
o Reinstall
Automation Notices
o Bad disk
o Minimal hardware
o Bad DNS
o Bad node configuration
19
Notices & Incentives
Notices are indirect paths to node management
o Node down / online / specific problem (e.g. DNS, disk)
o Site down / online
o Privilege reduced / restored
o PCU errors
The incentives on MyPLC
o Sites 10 slices
o Disable slice creation
o Disable running slices
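The progression from notices to incentives can be pictured as an escalation ladder keyed on how long a problem has gone unresolved. The sketch below is a hypothetical illustration: the day thresholds are invented for the example and do not come from the slides, and only the two unambiguous incentives listed above are modelled.

```python
from typing import List, Tuple

# Hypothetical thresholds, in days without resolution. These numbers are
# assumptions for illustration, not MyOps policy values.
LADDER: List[Tuple[int, str]] = [
    (0, "send_notice"),
    (7, "disable_slice_creation"),
    (14, "disable_running_slices"),
]


def applied_steps(days_down: int) -> List[str]:
    """Return every escalation step whose threshold has been reached."""
    return [step for threshold, step in LADDER if days_down >= threshold]


def steps_to_restore(days_down: int) -> List[str]:
    """Steps to undo once the node is back online, most severe first."""
    return [s for s in reversed(applied_steps(days_down)) if s != "send_notice"]
```

Keeping the ladder as data rather than branching logic means the magnitude and timing of incentives can be changed without touching the code that applies them.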
20
Validation of Notices & Incentives
[Figure: periods A to E, annotated with Notice, BugFix, Kernel BugFix, Fix2]
21
Time to Restore Down Node (all issues)
22
Future Ideas
Generalize Configuration
o Collect from multiple sources
o Expose policy
o Act on multiple targets
Self-monitoring
Positive Incentives
o Special access to services
o Additional resources (Slices, Bandwidth, CPU, etc.)
23
Time to Reply (when there is a reply)