Beyond DevOps: Database Reliability Engineering


Dave Dustin, Principal Engineer - Xero

Site Reliability Engineering (SRE)
A discipline that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create ultra-scalable and highly reliable software systems. According to Ben Treynor, founder of Google's Site Reliability Team, SRE is "what happens when a software engineer is tasked with what used to be called operations."

Today's database professionals must be engineers, not administrators. We build things. We create things. As engineers practicing DevOps, we are all in this together, and nothing is someone else's problem. As engineers, we apply repeatable processes, established knowledge, and expert judgment to design, build, and operate production data stores and the data structures within. As database reliability engineers, we must take the operational principles and the depth of database expertise that we possess one step further.

Introducing Data(base) Reliability Engineering (DRE)

Guiding Principles of DRE
- Protect the Data
- Elimination of Toil
- Self-Service for Scale
- Databases Are Not Special Snowflakes
- Eliminate the Barriers Between Software and Operations

Protect the Data
Traditionally, protecting data has always been a foundational principle of the database professional. It still is.

Protect the Data
The generally accepted approach has relied on:
- A strict separation of duties between the software and the database engineer
- Rigorous backup and recovery processes, regularly tested
- Well-regulated security procedures, regularly audited
- Expensive database software with strong durability guarantees
- Underlying expensive storage with redundancy of all components
- Extensive controls on changes and administrative tasks

Protect the Data
In teams with collaborative cultures, the strict separation of duties can become not only burdensome, but also restrictive of innovation and velocity. We need to discuss ways to create safety nets and reduce the need for separation of duties. Additionally, these environments focus more on testing, automation, and impact mitigation than extensive change controls.

Protect the Data
More often than ever, architects and engineers are choosing open source datastores that cannot guarantee durability the way that something like SQL Server might have in the past. Sometimes, that relaxed durability gives needed performance benefits to a team looking to scale quickly. Recognizing that different data calls for different tools, and choosing among them effectively, is rapidly becoming the norm.

Protect the Data
The new approach to data protection might look more like this:
- Responsibility for the data shared by cross-functional teams.
- Standardized and automated backup and recovery processes blessed by DRE.
- Standardized security policies and procedures blessed by DRE and Security teams.
- All policies enforced via automated provisioning and deployment.

Protect the Data
The new approach to data protection might look more like this:
- Data requirements dictate the datastore, with evaluation of durability needs becoming part of the decision-making process.
- Reliance on automated processes, redundancy, and well-practiced procedures rather than expensive, complicated hardware.
- Changes incorporated into deployment and infrastructure automation, with focus on testing, fallback, and impact mitigation.
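One way to picture "all policies enforced via automated provisioning" is a check that runs before any datastore is created. The sketch below is illustrative only: the DatastoreSpec fields, the two thresholds, and the policy_violations function are assumptions made for this example, not something from the talk or from any real provisioning tool.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical thresholds; real values would be agreed by the DRE and Security teams.
MIN_REPLICAS = 2
MAX_BACKUP_INTERVAL_HOURS = 24


@dataclass(frozen=True)
class DatastoreSpec:
    """A team's requested datastore, as it might appear in a provisioning request."""
    name: str
    replicas: int
    backup_interval_hours: int
    encrypted_at_rest: bool


def policy_violations(spec: DatastoreSpec) -> List[str]:
    """Return human-readable violations; an empty list means the spec passes."""
    problems = []
    if spec.replicas < MIN_REPLICAS:
        problems.append(
            f"{spec.name}: needs at least {MIN_REPLICAS} replicas, got {spec.replicas}"
        )
    if spec.backup_interval_hours > MAX_BACKUP_INTERVAL_HOURS:
        problems.append(
            f"{spec.name}: backups must run at least every "
            f"{MAX_BACKUP_INTERVAL_HOURS} hours"
        )
    if not spec.encrypted_at_rest:
        problems.append(f"{spec.name}: encryption at rest is required")
    return problems
```

A provisioning pipeline could refuse to proceed whenever policy_violations returns a non-empty list, so the policy is applied the same way for every team without a manual review step.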

Elimination of Toil
The Google SRE teams often use the phrase “Elimination of Toil.” Toil is defined as "the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows."

Elimination of Toil
Effective use of automation and standardization is necessary to ensure that teams and engineers are not overburdened by toil. The definition of the word "toil" is vague and differs throughout industries, with lots of preconceptions that vary from person to person. Normally though, we are talking about manual work that is repetitive, non-creative, and non-challenging.

Elimination of Toil
For many organisations, specialist DBAs are asked to review and apply database changes to not just production, but other environments. These can include modifications to tables or indexes, the addition, modification, or removal of data, or any other number of tasks. Everyone feels reassured that the DBA is applying these changes and monitoring the impact of the changes in real time.

Elimination of Toil
With modern agile businesses, the rate of change can be quite high, and those changes are often impactful. DBAs have been known to spend all their time running these repetitive tasks over and over, eventually becoming jaded and often quitting.
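Applying the same change by hand to each environment is the kind of toil described above. A minimal sketch of the alternative follows, using Python's built-in sqlite3 module so it runs anywhere. The MIGRATIONS list, the schema_version table, and the apply_migrations function are hypothetical names chosen for this example; a real team would more likely adopt an existing migration tool.

```python
import sqlite3

# Hypothetical ordered changesets (version, SQL). In practice these would
# live in version control next to the application code.
MIGRATIONS = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "CREATE INDEX idx_customers_name ON customers (name)"),
]


def apply_migrations(conn, migrations=MIGRATIONS):
    """Apply changesets not yet recorded as applied; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    newly_applied = []
    for version, sql in sorted(migrations):
        if version in applied:
            continue  # already applied in this environment, so skip it
        with conn:
            # The version row is written only if the changeset ran without error.
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
        newly_applied.append(version)
    return newly_applied
```

Because applied versions are recorded in the database itself, the same call can be run against development, test, and production; each environment picks up only the changes it is missing, and running it twice is harmless.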

Self-Service for Scale
A talented DBA is a rare commodity. Most companies cannot afford and retain more than one or two. With the adoption of DevOps and team autonomy, expecting each team to properly manage their own datastores can be unrealistic. So, we must create the most value possible, which comes from creating self-service platforms for teams to use. By setting standards and providing tools, teams are able to deploy new services and make appropriate changes at the required pace without serializing on an overworked database engineer.

Self-Service for Scale
Examples of these kinds of self-service methods include:
- Ensuring the appropriate metrics are being collected from data stores by providing the correct plug-ins.
- Building backup and recovery utilities that can be deployed for new data stores.
- Defining reference architectures and configurations for data stores that are approved for operations, and can be deployed by teams.
- Working with Security to define standards for data store deployments.
- Building safe deployment methods and test scripts for database changesets to be applied.

In other words, an effective Data Reliability Engineer functions by empowering others and guiding them, not acting as a gatekeeper.
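As an illustration of a backup and recovery utility that teams could run themselves, the sketch below copies a database and then checks that the copy is usable. It relies on sqlite3.Connection.backup (Python 3.7+) and SQLite's PRAGMA integrity_check; the helper names table_counts and backup_and_verify are hypothetical, and comparing row counts is an assumed, deliberately simple stand-in for a fuller restore test.

```python
import sqlite3


def table_counts(conn):
    """Map each table name to its row count."""
    tables = [
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    ]
    return {
        table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        for table in tables
    }


def backup_and_verify(source):
    """Copy `source` into a fresh in-memory database and check the copy.

    Returns (restored_connection, ok). `ok` is True when the copy passes
    SQLite's integrity check and has the same per-table row counts.
    """
    restored = sqlite3.connect(":memory:")
    source.backup(restored)  # sqlite3.Connection.backup, available since Python 3.7
    integrity = restored.execute("PRAGMA integrity_check").fetchone()[0]
    ok = integrity == "ok" and table_counts(restored) == table_counts(source)
    return restored, ok
```

The point of the design is that the backup is not considered done until a restore has been exercised, which turns "regularly tested" from a manual chore into part of the utility itself.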

Databases are not special snowflakes
Our systems are no more or less important than any other components serving the needs of the business. We must strive for standardization, automation, and resilience. Critical to this is the idea that the components of database clusters are not sacred. We should be able to lose any component and efficiently replace it without worry. Fragile data stores in glass rooms are a thing of the past.

Databases are not special snowflakes
The metaphor of pets versus cattle is often used to show the difference between a special snowflake and a commodity service component. A pet server is one that you feed, care for, and nurture back to health when it is sick. It also has a name. Cattle servers have numbers, not names. You don’t spend time customizing servers, much less logging on to individual hosts. When they show signs of sickness, you cull them from the herd.
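The cattle model can be sketched in a few lines: nodes are identified by number, and an unhealthy node is replaced rather than repaired. Everything here is an assumption for illustration, including the db-NNNN naming scheme, the make_provisioner helper, and the health check being passed in as a plain function.

```python
import itertools


def make_provisioner(start):
    """Return a function that hands out the next numbered node ID on each call."""
    counter = itertools.count(start)
    return lambda: f"db-{next(counter):04d}"


def replace_unhealthy(nodes, is_healthy, provision):
    """Swap every unhealthy node for a freshly provisioned one; keep healthy nodes."""
    return [node if is_healthy(node) else provision() for node in nodes]
```

For example, with a pool of db-0001 to db-0003 and a provisioner starting at 4, a failed db-0002 is simply dropped and db-0004 takes its place; nobody logs on to the sick host to nurse it back to health.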

Eliminate the Barriers between Software & Operations
Your infrastructure, configurations, data models, and scripts are all part of software. Study and participate in the software development lifecycle as any engineer would. Code, test, integrate, build, test, and deploy. Did we mention test?

Eliminate the Barriers between Software & Operations
In a traditional environment, the process of designing, building, testing, and deploying infrastructure and related services was separate from software development, system engineering, and DBAs. Software development teams often have very defined approaches to building, testing, and deploying features and applications. The paradigm shifts discussed previously are pushing for removal of this impedance mismatch, which means DREs and Systems Engineers find themselves needing to use similar methodologies to do their jobs.

Eliminate the Barriers between Software & Operations
DREs might also find themselves embedded directly in a software engineering team, working in the same code base, examining how code is interacting with the datastores, and modifying code for performance, functionality, and reliability. The removal of organizational separation creates an improvement in reliability, performance, and velocity an order of magnitude greater than traditional models, and DREs must adapt to these new processes, cultures, and tooling.

The DRE role is a paradigm shift from an existing, well-known role. More than anything, the framework gives us a new way to approach the functions of managing datastores in a continually changing world.

Thank you…
