1
Cloud Ops Master Class:
Lessons learned from a multi-year implementation of cloud automation at scale.
Michael Osburn, DevSecOps, @mosburn
Nathan Wallace, Founder & CEO, Turbot, @nathanwallace
2
Our Technical Ecosystem
80+ DevOps Teams
Millions of Production Users
Broad App Stack (Legacy + Cutting Edge)
Tens of Thousands of Transactions Per Second
Lambda
Mike: Overview of MHE setup / requirements
3
How can we achieve agility, ensure control & accelerate best practice?
Mike: Balancing these forces. Tell a story about the challenge when unbalanced.
4
Continuous Deployment
Diagram labels: ← DEV OPS →; DEVELOPMENT (Code, CI, Releases & Artifacts v1.0.0 to v1.3.0, RELEASE); DEPLOY & RUN (CD to DEV, TEST, STAGE, PROD; Secret & Environment Mgmt); FEEDBACK (Monitoring, Incident Mgmt, Issue Tracking, ChatOps).
Mike: We’re all familiar with the DevOps cycle. But doing this consistently for 80 teams is hard. We need standards. We need controls. Story about how to scale / manage software DevOps? Automation is powerful, but also increases risk (DevOps Borat joke). And CD has traditionally focused on software, not cloud infrastructure. Infrastructure change adds more fundamental risks.
5
LESSON 1: Workload isolation
Hard blast radius
Clear ownership
Cost allocation
Network isolation
Access management
Change management
Mike: Tell stories about why isolation became important to MHE.
6
Mike: Graphic to help depict the separation between dev & prod, etc.
7
LESSON 2: Ride the rockets
Do not abstract or compete
Their speed is your advantage
Focus on enabling your business
Unlock the power of open source
Nathan
8
Maturity Model for AWS Account Management
Innovator: Small team working on shared goal.
Share House: Multiple teams sharing an account for different projects.
Hosted Services: Handful of accounts (dev, prod, etc.) are centrally managed and shared by different teams / projects.
Multi-tenant: Projects operate with independence and isolation within agreed rules and services.
Nathan: Connect back to Mike’s isolation comments. Focus on this as an example of riding the rocket.
9
LESSON 3: Teach, don’t do
Avoid being a bottleneck
Eliminate the cycle of blame
Leverage public tutorials & answers
(You can’t do it in real-time anyway!)
Mike: With isolation, we can now enable the teams. Move out of being the ticket taker / bottleneck. And since the app controls their infrastructure, we can’t do it for them anyway!
10
Diagram labels: App Team, REQUEST, FULFILL, Infrastructure Team (Network, Hardware, VMs, DBs, Software); App Teams, SELF-SERVICE & APIS, Cloud Team (Customer and/or Partner): CONFIGURE, SECURE, MONITOR, MANAGE, AUTOMATE, SUPPORT, LEARN.
11
LESSON 4: Policies
Simple rules behind the controls
Policies (MUST) vs recommendations (SHOULD)
Full automation requires a lot of policies
There are always exceptions!
Use exceptions to experiment & learn
Mike: With the isolation, we need rules for how they will work, e.g. S3 must be encrypted. Exceptions always happen; be ready to handle them at scale.
12
Mike: Policy example if you find it helpful
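One way to picture the MUST / SHOULD / exception idea from Lesson 4 is the minimal sketch below. It is not the presenters' implementation: the `Policy` class, the resource dictionary shape, the policy names and the account names are all hypothetical, chosen only to illustrate "S3 must be encrypted" with an approved exception.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Level(Enum):
    MUST = "MUST"      # policy: a failure is a violation
    SHOULD = "SHOULD"  # recommendation: a failure is only a warning


@dataclass(frozen=True)
class Policy:
    name: str
    level: Level
    # Predicate over a plain dict describing the resource (hypothetical shape).
    check: Callable[[dict], bool]


def evaluate(resource: dict, policies: list[Policy],
             exceptions: set[tuple[str, str]]) -> dict[str, str]:
    """Return one verdict per policy: ok, exception, violation or warning."""
    verdicts = {}
    for policy in policies:
        if policy.check(resource):
            verdicts[policy.name] = "ok"
        elif (resource["account"], policy.name) in exceptions:
            verdicts[policy.name] = "exception"  # approved, recorded, not blocked
        elif policy.level is Level.MUST:
            verdicts[policy.name] = "violation"
        else:
            verdicts[policy.name] = "warning"
    return verdicts


# Hypothetical policy set: one MUST and one SHOULD.
POLICIES = [
    Policy("s3-encrypted", Level.MUST, lambda r: r.get("encrypted", False)),
    Policy("s3-owner-tag", Level.SHOULD, lambda r: "owner" in r.get("tags", {})),
]

# Hypothetical approved exception: one sandbox account may experiment.
EXCEPTIONS = {("sandbox-01", "s3-encrypted")}

bucket = {"account": "prod-07", "encrypted": False, "tags": {}}
print(evaluate(bucket, POLICIES, EXCEPTIONS))
```

Keeping exceptions as data, keyed by account and policy, is one way to handle them at scale without editing the policies themselves.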
13
LESSON 5: Learn by doing
Experiment within blast radius
Use exceptions & limited SuperUser
Collaborate on new services
Hand build > pattern > automation
Mike: Collaborate side-by-side. How can we make this cloud service work within our policies?
14
Kickstart with best practice
Learn by doing with specific apps
Automate & teach other teams
15
LESSON 6: Guardrails
Detect & Correct
Real-time: more effective & more user friendly
Native to the services & tools
Automate patterns & best practices
Nathan
16
Diagram labels: CHANGE, AWS Events & SNS, SQS, Context & Policies, Guardrail, Audit Trail, MANAGE, REPORT.
Nathan: Talk through real-time configuration event in AWS tools.
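The detect-and-correct flow on this slide could be sketched roughly as follows. This is an assumption-laden illustration, not the tool shown in the talk: a standard-library `queue.Queue` stands in for SQS, no AWS calls are made, and the event fields, the `inventory` dictionary and the `guardrail_encrypt` function are hypothetical names.

```python
import queue
from datetime import datetime, timezone


def guardrail_encrypt(resource: dict) -> bool:
    """Detect an unencrypted S3 bucket and correct it. Returns True if changed."""
    if resource.get("type") == "s3" and not resource.get("encrypted", False):
        resource["encrypted"] = True  # the "correct" half of detect & correct
        return True
    return False


def process(q: queue.Queue, inventory: dict, trail: list) -> None:
    """Drain the queue, run the guardrail on each changed resource, log the result."""
    while not q.empty():
        event = q.get()
        resource = inventory[event["resource_id"]]
        corrected = guardrail_encrypt(resource)
        # Every automated decision is recorded, whether or not it changed anything.
        trail.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "resource_id": event["resource_id"],
            "event": event["name"],
            "action": "corrected" if corrected else "none",
        })


events = queue.Queue()  # stand-in for SQS (assumption)
audit_trail = []        # stand-in for a durable audit store (assumption)
inventory = {"bucket-a": {"type": "s3", "encrypted": False}}

# Illustrative change event; the field names are hypothetical.
events.put({"name": "CreateBucket", "resource_id": "bucket-a"})
process(events, inventory, audit_trail)
print(inventory["bucket-a"]["encrypted"], audit_trail[0]["action"])
```

Reacting to each change event as it arrives, rather than scanning on a schedule, is what makes the correction effectively real-time for the team that made the change.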
17
LESSON 7: Patterns at scale
Use common language & models
Automate & repeat patterns
Avoid custom central services
Learn & enhance patterns over time
Accelerate, don’t constrain, teams
Nathan: We need to repeat rollout, not bundle things together. Use a small DB for each app, not a shared DB. Common language: accelerate conversation.
19
LESSON 8: Visibility
Audit trail for security & compliance
Change history to understand behavior
Review code & setup, not docs
Automated decisions need records
Detailed logs for trouble-shooting
Nathan: With real-time infra, we need full visibility into what happened, both for audit and for devs!
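As a rough illustration of "change history" and "automated decisions need records", the sketch below stores a before/after diff with each audit record. The record fields, the `diff_config` helper and the example values are hypothetical and are not taken from any particular tool.

```python
def diff_config(before: dict, after: dict) -> dict:
    """Return {key: (old, new)} for every top-level key whose value changed.

    A key missing on one side is reported with None for that side.
    """
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def audit_record(actor: str, reason: str, before: dict, after: dict) -> dict:
    """Bundle who changed something, why, and exactly what changed."""
    return {"actor": actor, "reason": reason, "diff": diff_config(before, after)}


# Hypothetical example: an automated correction applied to a bucket.
before = {"encrypted": False, "tags": {}}
after = {"encrypted": True, "tags": {"owner": "team-a"}}
record = audit_record("guardrail", "policy: S3 must be encrypted", before, after)
print(record["diff"])
```

Recording the reason alongside the diff serves both audiences named in the notes: auditors can see why a change happened, and developers can see what changed when troubleshooting.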
21
LESSON 9: Automate3!
Kill the ticket
Automate all Level 1-2 responses
Invest to elevate & remain agile
22
Software Defined Operations: Go faster, safely.
Application teams get self-service: Direct AWS console & API access. Hard blast radius.
Cloud team has oversight & policy mgmt: Policy management. Performance & metrics monitoring. Support request fulfillment. Monitoring and incident response. Training on best practices.
Automation scales solution: Preventative & detective controls. Automation of common deployment patterns. Automated ticketing response. Increased coverage area over time.
Diagram labels: Cloud Team, App Team, Application, DB, OS, …; SELF-SERVICE, APIS, CONFIGURE, AUTOMATE, SECURE, MONITOR, HELP, LEARN.
23
Let’s see it live: #sdops
Main screen: separate workloads into different accounts
Click to account
Click to AWS login
Create S3 bucket
Show automatic updates to it (tags, permissions, etc.)
Go back to Turbot
See activity history built in there
Show the diff
Show controls; dive deep into the actual event
Show policies; this is how we decide what to do
Show an exception
Show permissions: simple, repeatable model
Show a change
24
Benefits
Workload Isolation; Patterns at scale; Visibility; Teach, don’t do; Ride the rockets; Guardrails; Policies; Learn by doing
Move at cloud speed; Common language & patterns; Security; Compliance; Cost control; Optimal use of skills; Alignment & reduced friction
25
Questions?
Michael Osburn, DevSecOps, @mosburn
Nathan Wallace, Founder & CEO, Turbot, @nathanwallace