Sponsored by the National Science Foundation
GENI Campus Ops Workflow
Chaos Golubitsky
San Juan, Puerto Rico Mar

March 16, 2011, www.geni.net

Outline
Introduction
Experimenter support
Resources
Monitoring

Towards a more “production-like” GENI
Some Spiral 3 ops goals:
  – Resources are easier for experimenters to find and use
  – Provisioning an experiment doesn’t require picking up the phone (as often)
  – Resources are more reliably available
  – Problems with resources are easier to detect and resolve
Here are some steps we think will be useful

Campus ops workflow?
A workflow is a set of steps to achieve a goal:
  – Become a production GENI campus!
This process will change as more campuses try it
Proposed workflow steps we think will be useful
Three categories:
  – Experimenter support
  – Resource deployment
  – Monitoring
There’s more than one way to do this; input is welcome!

GPO as reference campus
We try things out, test, and provide guidance and support to campuses deploying similar things
  – And pass along ideas for other reference campuses
We hope to help:
  – Small testbeds with diverse resources (OpenFlow, MyPLC, ProtoGENI, L2 backbone connectivity)
  – Campuses that want to create testbeds
  – Bigger testbeds (where we can)
We’re working on:
  – Experimenter support
  – More (and more GENI-like) resources
  – Useful monitoring
  – Templates for transitioning to GENI operations

Workflow Steps for Experimenter Support
Subscribe to
  – Report your outages
  – Answer questions from experimenters
Tell GPO you’re willing to support some experimenters: rces
Create a page advertising each of your aggregates: http://groups.geni.net/geni/wiki/GeniAggregate/YourSiteAggregate
  – What resources do you have?
  – Who can use them?
  – How do they use them?
  – Resources don’t need to be fully open to the public to be advertised here
  – Template:

Experimenter Support at GPO

Workflow Steps for Adding Resources
Connectivity
Aggregates:
  – Give local users access to your resources
  – Run software that supports the GENI AM API
  – Give remote users access to your resources (consistent with your site policy)
Configuration management:
  – Know what you’re running
  – Especially if it’s GENI software (things change fast)
  – Allows you to help experimenters better
  – Allows us (and other campuses) to help you better

Resources at GPO
GPO can provide templates and help for aggregates we have experience with
Things we have:
  – Connections to NLR and I2 backbones
  – OpenFlow switches (HP/NEC/Quanta), FlowVisors, controllers, GENI AM API support
  – Reference installation of WiMAX software
  – ProtoGENI cluster
A simple resource you can deploy:
  – MyPLC plus SFA to support the GENI AM API: eImplementation eImplementation

Workflow Steps for Monitoring (1)
Two consumers of monitoring data:
  – Operators and experimenters
Operators:
  – Goals:
      Detect and resolve outages quickly
      Plan for the future
  – Monitoring steps:
      Polling and trending of local resources
      Alerting on local resource outages
      Visibility into status of connected remote resources
      Visibility into many remote resources in a consistent format
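
The first two operator steps (polling with trending, and alerting on local outages) can be sketched as follows. This is a minimal illustration, not GPO's tooling: the `ResourcePoller` class, its `check` probe, and the three-failure alert threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ResourcePoller:
    """Polls one local resource, keeps history for trending, flags outages."""
    name: str
    check: Callable[[], bool]        # hypothetical probe: True means "up"
    failures_before_alert: int = 3   # assumed threshold; tune per site
    history: List[Tuple[float, bool]] = field(default_factory=list)
    consecutive_failures: int = 0

    def poll(self, now: float) -> bool:
        """Run one check, record it, and return True if an alert should fire."""
        up = self.check()
        self.history.append((now, up))  # every sample is kept for trending
        self.consecutive_failures = 0 if up else self.consecutive_failures + 1
        # Fire exactly once, at the moment the threshold is first reached.
        return self.consecutive_failures == self.failures_before_alert
```

Requiring several consecutive failures before alerting is one common way to avoid paging on a single dropped probe, while the full history stays available for trending.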

Workflow Steps for Monitoring (2)
Experimenters:
  – Goals:
      Identify problems affecting the slice
      Collect measurement data for their slice
  – Monitoring steps:
      Status of available resources (how many nodes?)
      Status of resources I’m using (is my node up?)
      External characteristics of slice (CPU usage? Network bandwidth?)
      Internal characteristics of slice (I&M working session Thursday)

Monitoring at GPO
Strategy:
  – Collect as much data as possible from our site now:
  – Integrate our data with collectors (GMOC, aggregates)
Tactics:
  – Trending is more important than alerting:
      Remote operators and experimenters are casual consumers
      Don’t want alerts for resources which may not be relevant
      Do want historical availability information on request
  – Collect numeric trending data in a consistent format:
      Using ganglia to collect data in rrdtool format for now
  – Generate webpages that format ganglia’s data more meaningfully
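
To illustrate why stored trending data can answer "historical availability on request", here is a small sketch that computes availability over a time window from timestamped up/down samples. The `availability` function and the sample layout are assumptions for this example. It works on plain Python tuples and does not read rrdtool files.

```python
from typing import Iterable, Optional, Tuple


def availability(samples: Iterable[Tuple[float, bool]],
                 start: float, end: float) -> Optional[float]:
    """Fraction of samples in [start, end) that reported the resource up.

    Each sample is (timestamp, up). Returns None when the window has no
    samples, since "no data" is not the same as "0% available".
    """
    window = [up for ts, up in samples if start <= ts < end]
    if not window:
        return None
    return sum(window) / len(window)


# Usage: four one-minute samples, one of which saw the resource down.
samples = [(0.0, True), (60.0, True), (120.0, False), (180.0, True)]
print(availability(samples, 0.0, 240.0))
```

A casual consumer can ask this question after the fact, for any window, without having subscribed to alerts beforehand.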

Monitoring at GPO: Ganglia’s native UI

Monitoring at GPO: Collecting GENI Data
Active testing:
  – Use simple scripts to run tests and report results to ganglia
  – Test recent values for freshness and sanity
  – GPO uses this to monitor reachability across the NLR and Internet2 OpenFlow backbone
Collecting external slice data:
  – Run locally on aggregate manager
  – Query aggregate data: slice names, node counts
  – Query operational data: packet counters, node state, CPU usage
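
A simple active-testing script of this kind might look like the sketch below, which is not GPO's actual script. It reports a test result through ganglia's `gmetric` command-line tool and separately checks a stored value for freshness and sanity. The function names, the metric name `of_ping_ms`, and the thresholds are hypothetical. The example assumes `gmetric` is installed and on the PATH wherever `report` is called.

```python
import subprocess
from typing import List, Optional


def gmetric_command(name: str, value: float, units: str = "",
                    metric_type: str = "float",
                    tmax: Optional[int] = None) -> List[str]:
    """Build the gmetric argument list for one metric sample."""
    cmd = ["gmetric", "--name", name, "--value", str(value),
           "--type", metric_type]
    if units:
        cmd += ["--units", units]
    if tmax is not None:
        # tmax: maximum expected seconds between reports of this metric
        cmd += ["--tmax", str(tmax)]
    return cmd


def report(name: str, value: float, **kwargs) -> None:
    """Send one sample to ganglia; raises if gmetric exits non-zero."""
    subprocess.run(gmetric_command(name, value, **kwargs), check=True)


def is_fresh_and_sane(value: float, age_seconds: float, max_age: float,
                      low: float, high: float) -> bool:
    """True if the value is recent enough and inside the plausible range."""
    return age_seconds <= max_age and low <= value <= high


# Usage (hypothetical metric): a 12.5 ms round trip measured 30 s ago.
print(gmetric_command("of_ping_ms", 12.5, units="ms"))
print(is_fresh_and_sane(12.5, age_seconds=30, max_age=120, low=0, high=1000))
```

Keeping command construction separate from execution lets the script be checked on a machine without ganglia installed.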

Monitoring at GPO: Status of core VLANs

Monitoring at GPO: FlowVisor slice status

Summary
Spiral 3 ops goals:
  – Test operations across several unaffiliated campuses
  – Ramp up GENI-wide experiment support
GPO is trying to be an example campus, but there are many others
If you do only two things, please:
  – Join
  – Make sure we know what you would like to support this year, and what we can do to