Idea Generation Pipeline
SPARK, 4/27/2017

[Diagram labels: EVALUATE, USE, VALIDATE; CONCEPT, PROTOTYPE, A/B FLIGHT; Ideas; Knowledge; Quant only, Qual + Quant, Qual only]
1-week turnaround from idea to prototype
Flights assess user feedback quantitatively

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

The states of feature development
Inner Dev Loop: feature development; concludes at checkin
Outer Dev Loop: build validation; concludes at PROD deployment
Monitoring: Live Site quality; continuous
Flighting: controlled exposure of features

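To make "controlled exposure of features" concrete, here is a minimal sketch of one common way to gate a feature to a fixed share of users: deterministic hash bucketing. The slide does not say how flights are assigned, so the function name `in_flight`, the flight name, and the 10,000-bucket granularity are all assumptions for illustration.

```python
import hashlib

def in_flight(user_id: str, flight_name: str, exposure_percent: float) -> bool:
    """Return True if this user falls inside the exposed share of a flight.

    Hypothetical helper: hashing flight name plus user id gives each user a
    stable bucket per flight, so the same user always sees the same variant.
    """
    digest = hashlib.sha256(f"{flight_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < exposure_percent * 100  # e.g. 5% exposes buckets 0..499

# Usage: expose a hypothetical flight to about 10% of users.
exposed = sum(in_flight(f"user-{i}", "new-answer-card", 10) for i in range(10_000))
print(exposed)  # close to 1000; the exact count depends on the hash values
```

Raising `exposure_percent` only adds users to the flight, because a user's bucket never changes, so exposure can be widened step by step without reshuffling who has already seen the feature.
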
The states of feature testing
Inner Dev Loop: mocked automation, visual validation, perf analysis
Outer Dev Loop: E2E automation
Monitoring: exploratory testing, AP monitoring, feature parity
Flighting: pre-rotation validation
Testing is composed of overlapping states

The Agility Pipeline @Bing
[Diagram: stack and release cadence. Labels: One Repo; The Application Platform; Scenarios; Bing.com; APIs; XBOX; Cortana; Windows 10; Mobile; MVC; ASP.NET; Windows Server]
Release cadence: 20x / week
… and 1000s of experiments / month

The Agility Pipeline @ Bing
Continuous Integration
Run All Tests: < 15 minutes, 500 tests/sec
Tests: functional tests, browser tests, device tests, performance tests
Tools: Azure Service Bus, Selenium, Chutzpah, MSTest, Jasmine

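The two figures on this slide, 500 tests/sec and a run under 15 minutes, imply a rough capacity budget. The sketch below works through that arithmetic under the assumption that the quoted rate is sustained for the whole window, which the slide does not state. The 90,000-test suite in the usage lines is a made-up size, not a figure from the deck.

```python
# Figures quoted on the slide.
TESTS_PER_SECOND = 500
RUN_BUDGET_MINUTES = 15

def max_tests_in_budget(tests_per_second: int, budget_minutes: int) -> int:
    """Upper bound on test executions if the rate holds for the whole budget."""
    return tests_per_second * budget_minutes * 60

def minutes_needed(test_count: int, tests_per_second: int) -> float:
    """Wall-clock minutes to run test_count tests at a sustained rate."""
    return test_count / tests_per_second / 60

print(max_tests_in_budget(TESTS_PER_SECOND, RUN_BUDGET_MINUTES))  # 450000
print(minutes_needed(90_000, TESTS_PER_SECOND))  # 3.0 (hypothetical suite size)
```
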
Developer footprint: the testing pyramid
Browser-based: highest cost (~10-40s); most flakiness, programmatic
Visual Parity: moderately costly (~10-15s); flakiness, though limited interactions
L2/AQG HTTP request: least costly (~1-2s); very high reliability
Unit tests (DI/MOQ): zero cost; highest reliability

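The base of the pyramid, unit tests built on dependency injection and mocks, can be illustrated with a short sketch. The slide names DI/MOQ, which points at the Moq library for .NET; the example below swaps in Python's standard `unittest.mock` to show the same idea. `SearchHandler` and its `rank` dependency are hypothetical names invented for this example.

```python
from unittest.mock import Mock

class SearchHandler:
    """Hypothetical component: the ranker is injected, not constructed inside."""

    def __init__(self, ranker):
        self._ranker = ranker

    def top_result(self, query: str) -> str:
        results = self._ranker.rank(query)
        return results[0] if results else ""

def test_top_result_uses_injected_ranker():
    # The mock stands in for the real ranker, so no network or browser is needed.
    ranker = Mock()
    ranker.rank.return_value = ["first", "second"]
    handler = SearchHandler(ranker)
    assert handler.top_result("weather") == "first"
    ranker.rank.assert_called_once_with("weather")

def test_top_result_handles_no_results():
    ranker = Mock()
    ranker.rank.return_value = []
    assert SearchHandler(ranker).top_result("weather") == ""

test_top_result_uses_injected_ranker()
test_top_result_handles_no_results()
```

Because the dependency is passed in, the test controls every input and never touches a live service, which is consistent with the low cost and high reliability the slide attributes to this layer.
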
Test automation landscape
[Diagram: layers of the test automation stack and the tools that fill them]
Layers: Analysis and Reporting; Agility and Validation Pipeline; Parallelized Test Execution; Context Independency; Browser/Client/Device UI Drivers; Functional/Feature Test Framework
Tools: BTS; Treadmill / RO; ATQ and in-bed testing; CITA; Selenium / Selenium Grid; Turanga

Measure Everything – Example
We review metrics weekly with management and ICs
Metrics track Live Site, Outer Loop, Inner Loop
Metrics regressions focus next areas to invest in
Metrics set targets and track progress toward goals
Scorecard is generated by scripts for very accurate and clear week over week comparison

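A script-generated scorecard with week over week comparison could look roughly like the sketch below. The deck does not show the actual scripts, so the metric names, the sample values, and the "lower is better" rule used to flag regressions are all assumptions for illustration.

```python
def week_over_week(previous: dict, current: dict) -> list:
    """Build (metric, previous, current, delta) rows, sorted by metric name."""
    rows = []
    for name in sorted(current):
        cur = current[name]
        prev = previous.get(name)
        delta = None if prev is None else cur - prev
        rows.append((name, prev, cur, delta))
    return rows

def regressions(rows: list) -> list:
    """Metrics that got worse, assuming lower is better for every metric."""
    return [name for name, _prev, _cur, delta in rows if delta is not None and delta > 0]

def format_scorecard(rows: list) -> str:
    """Render the rows as a fixed-width plain-text table."""
    lines = [f"{'metric':<28}{'prev':>8}{'cur':>8}{'delta':>8}"]
    for name, prev, cur, delta in rows:
        prev_s = "n/a" if prev is None else f"{prev:g}"
        delta_s = "n/a" if delta is None else f"{delta:+g}"
        lines.append(f"{name:<28}{prev_s:>8}{cur:>8g}{delta_s:>8}")
    return "\n".join(lines)

# Invented sample data, one metric per loop named on the slide.
last_week = {"inner_loop_test_minutes": 14, "outer_loop_build_minutes": 40}
this_week = {"inner_loop_test_minutes": 12, "outer_loop_build_minutes": 45,
             "live_site_incidents": 1}

rows = week_over_week(last_week, this_week)
print(format_scorecard(rows))
print("regressed:", regressions(rows))
```

Generating the table from the raw numbers each week keeps the comparison mechanical, so a regression such as the build-time increase in the sample data is flagged the same way every time.
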
The impact of flakiness: impact to agility
Outer Loop metrics
With flakiness: high bug count, long triage, build abandoned
Without flakiness: no new bugs, no triage, auto-ship!
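One way to see why flakiness blocks auto-ship is to estimate how often a full run contains at least one spurious failure. The sketch below assumes every test fails spuriously with the same independent probability, which is a simplification; the slide gives no flake rates, so the numbers in the usage lines are illustrative only.

```python
def spurious_failure_probability(test_count: int, flake_rate: float) -> float:
    """Chance that at least one of test_count independent tests flakes."""
    return 1.0 - (1.0 - flake_rate) ** test_count

def expected_spurious_failures(test_count: int, flake_rate: float) -> float:
    """Average number of spurious failures per run, each needing triage."""
    return test_count * flake_rate

# Illustrative numbers: 1,000 tests, each flaking once in 1,000 executions.
p = spurious_failure_probability(1_000, 0.001)
print(round(p, 3))  # about 0.632
```

Under these assumptions, even a per-test flake rate of 0.1% leaves most runs of a 1,000-test suite with something to triage, and the effect grows with suite size.
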