AppInsight: Mobile App Performance Monitoring in the Wild

Slides:



Advertisements
Similar presentations
Pages and boxes Building quick user interfaces. learning objectives o Build a quick UI with pages and boxes o understand how pages and boxes work o click.
Advertisements

W3C Workshop on Web Services Mark Nottingham
Objectives Overview Define an operating system
Mobile DevOps Mobile Apps + APIs = Mobile DevOps Alex Gaber Crittercism QCon New York 2014.
Trace Analysis Chunxu Tang. The Mystery Machine: End-to-end performance analysis of large-scale Internet services.
VanarSena: Automated App Testing. App Testing Test the app for – performance problems – crashes Testing app in the cloud – Upload app to a service – App.
Automatic and Scalable Fault Detection for Mobile Applications Lenin Ravindranath, Suman Nath, Jitu Padhye, Hari Balakrishnan.
Ellucian Mobile: Don’t text and drive, kids!
Contiki A Lightweight and Flexible Operating System for Tiny Networked Sensors Presented by: Jeremy Schiff.
Precept 3 COS 461. Concurrency is Useful Multi Processor/Core Multiple Inputs Don’t wait on slow devices.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 11: Monitoring Server Performance.
Chapter 11 - Monitoring Server Performance1 Ch. 11 – Monitoring Server Performance MIS 431 – created Spring 2006.
Real-Time Kernels and Operating Systems. Operating System: Software that coordinates multiple tasks in processor, including peripheral interfacing Types.
70-291: MCSE Guide to Managing a Microsoft Windows Server 2003 Network Chapter 14: Troubleshooting Windows Server 2003 Networks.
Loupe /loop/ noun a magnifying glass used by jewelers to reveal flaws in gems. a logging and error management tool used by.NET teams to reveal flaws in.
Timecard: Controlling User-Perceived Delays in Server-Based Mobile Applications Lenin Ravindranath, Jitu Padhye, Ratul Mahajan, Hari Balakrishnan.
OPERATING SYSTEMS AND SYSTEMS SOFTWARE. SYSTEMS SOFTWARE Systems software consists of the programs that control the operations of the computer and its.
Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.
Module 8: Monitoring SQL Server for Performance. Overview Why to Monitor SQL Server Performance Monitoring and Tuning Tools for Monitoring SQL Server.
Hands-On Microsoft Windows Server 2008 Chapter 11 Server and Network Monitoring.
Basics of Operating Systems March 4, 2001 Adapted from Operating Systems Lecture Notes, Copyright 1997 Martin C. Rinard.
Virtual Memory Tuning   You can improve a server’s performance by optimizing the way the paging file is used   You may want to size the paging file.
Apps VS Mobile Websites Which is better?. Bizness Apps Survey Bizness Apps surveyed over 500 small business owners with both a mobile app and a mobile.
Computer System Architectures Computer System Software
CS 0004 –Lecture 1 Wednesday, Jan 5 th, 2011 Roxana Gheorghiu.
Restaurants & Mobile Why Your Restaurant Needs A Mobile Experience.
CS378 - Mobile Computing App Project Overview. App Project Teams of 2 or 3 students Develop an Android application of your choosing subject to instructor.
CN1176 Computer Support Kemtis Kunanuraksapong MSIS with Distinction MCT, MCTS, MCDST, MCP, A+
1 CS 501 Spring 2003 CS 501: Software Engineering Lecture 16 System Architecture and Design II.
Explain the purpose of an operating system
Ideas to Improve SharePoint Usage 4. What are these 4 Ideas? 1. 7 Steps to check SharePoint Health 2. Avoid common Deployment Mistakes 3. Analyze SharePoint.
Mobile Apps For Small Businesses Your customers are mobile. Is your business?
Chapter 34 Java Technology for Active Web Documents methods used to provide continuous Web updates to browser – Server push – Active documents.
Architectures of distributed systems Fundamental Models
Lecture 3 Process Concepts. What is a Process? A process is the dynamic execution context of an executing program. Several processes may run concurrently,
Procrastinator: Pacing Mobile Apps’ Usage of the Network mobisys 2014.
Learningcomputer.com SQL Server 2008 – Profiling and Monitoring Tools.
Timecard: Controlling User-Perceived Delays in Server-Based Mobile Applications Lenin Ravindranath, Jitu Padhye, Ratul Mahajan, Hari Balakrishnan.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 11: Monitoring Server Performance.
Developer TECH REFRESH 15 Junho 2015 #pttechrefres h Understand your end-users and your app with Application Insights.
Scheduling Lecture 6. What is Scheduling? An O/S often has many pending tasks. –Threads, async callbacks, device input. The order may matter. –Policy,
AppInsight: Mobile App Performance Monitoring In The Wild Lenin Ravindranath, Jitu Padhye, Sharad Agarwal, Ratul Mahajan, Ian Obermiller, Shahin Shayandeh.
End-to-End Performance Analytics For Mobile Apps Lenin Ravindranath, Jitu Padhye, Ratul Mahajan Microsoft Research 1.
CISC Machine Learning for Solving Systems Problems Presented by: Suman Chander B Dept of Computer & Information Sciences University of Delaware Automatic.
Restaurants & Mobile Why Your Restaurant Needs A Mobile Experience.
Improving and Controlling User-Perceived Delays in Mobile Apps Lenin Ravindranath By the time it loads, the church service is over. Too slow!!! Slow responsiveness,
Module 9 Planning and Implementing Monitoring and Maintenance.
Thread basics. A computer process Every time a program is executed a process is created It is managed via a data structure that keeps all things memory.
6.894: Distributed Operating System Engineering Lecturers: Frans Kaashoek Robert Morris
Chapter 9 Operating Systems Discovering Computers Technology in a World of Computers, Mobile Devices, and the Internet.
Improving and Controlling User-Perceived Delays in Mobile Apps Lenin Ravindranath By the time it loads, the church service is over. Too slow!!! Slow responsiveness,
MGT305 - Application Management in Private and Public Clouds Sean Christensen Senior Product Marketing Manager Microsoft Corporation MGT305.
WIDESCREEN PRESENTATION Tips and tools for creating and presenting wide format slides.
Internet of Things. Creating Our Future Together.
A Checklist of Testing Tips for Developing a Mobile App Presented By: Konstant Infosolutions.
Tuning Threaded Code with Intel® Parallel Amplifier.
CHAPTER 6 Threads, Handlers, and Programmatic Movement.
Maintaining and Updating Windows Server 2008 Lesson 8.
3 main operating system! BY Charlotte Oates. Microsoft windows! A family of operating systems for personal computers. Windows dominates the personal computer.
Session Name Pelin ATICI SQL Premier Field Engineer.
Mobile Application Test Case Automation
Development-Introduction
Week 01 Comp 7780 – Class Overview.
AppInsight: Mobile App Performance Monitoring in the Wild
Serverless CQRS in Azure!
Cloud Connect Seamlessly
Chapter 4.
PredictRemainingTime
Report from the trenches of an HTML5 game provider
Threads CSE 2431: Introduction to Operating Systems
Presentation transcript:

AppInsight: Mobile App Performance Monitoring in the Wild Lenin Ravindranath, Jitu Padhye, Sharad Agarwal, Ratul Mahajan, Ian Obermiller, Shahin Shayandeh Microsoft Research

> One Million Apps > 300,000 Developers

There’s an App For That But its slow “… Too slow - killing the usefulness when you really need to go.”

“So slow. Did an intern write this app??” “Slower than a snail.” “Slow and unresponsive like mud” “Sluggish and freezes my HTC phone.” “Very very slow compared to even browsing web.” “Consistently 3 seconds behind where I touch.” “Loading GPS data is *** slow”

Performance problems are inevitable in the wild “So slow. Did an intern write this app??” “Slower than a snail.” Diverse environmental conditions Network connectivity, GPS signal quality, etc Variety of hardware and OS versions Wide range of user interactions “Slow and unresponsive like mud” “Sluggish and freezes my HTC phone.” “Very very slow compared to even browsing web.” Hard to emulate in the lab “Consistently 3 seconds behind where I touch.” “Loading GPS data is *** slow” Performance problems are inevitable in the wild

What is the user-perceived delay? Monitor Performance in the Hands of Users What is the user-perceived delay? Where is the bottleneck? Little platform support Only option is to instrument your app Manage your own logging infrastructure Two sub bullets Goal, How? Significant barrier for most app developers

AppInsight What is the user-perceived delay? Where is the bottleneck? Automatic App Instrumentation Zero developer effort Binary Instrumentation Readily deployable No changes to the OS or runtime Low overhead Does not need any app specific information

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Store

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Store

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Store

App Instrumentation is Challenging Instrumentation impacts app performance They are already slow enough!!! Limited Resources Network, Battery, Memory Make it a bullet, everything is black

App Instrumentation is Challenging Highly Asynchronous Programming Pattern Highly interactive, UI centric Single UI thread that should not be blocked Most tasks are performed asynchronously Asynchronous APIs for Network, Sensor, IO etc. Computation performed on background threads Blue color for title Tracing async code is challenging

53% Mitt Rating Mitt Rating Internet Mitt Romney Tweets “If I had a nickel for every time Mitt Romney said something stupid I'd be in his tax bracket” “Mitt Romney might vote for Obama as well” “We recommend Mitt Romney for president” “I would definitely trust Mitt Romney with my money.”

Hypothetical Synchronous Code Mitt Rating ClickHandler() { tweets = HttpGet(url); rating = ProcessTweets(tweets); display.Text = rating; } ProcessTweets(tweets) ... 53% LogStart(); LogEnd(); Hypothetical Synchronous Code Thread Click Handler Start Click Handler End User Perceived Delay

Asynchronous Code System 53% ClickHandler() { Mitt Rating ClickHandler() { AsyncHttpGet(url, DownloadCallback); } DownloadCallback(tweets) rating = ProcessTweets(tweets); UIDispatch(DisplayRating, rating); DisplayRating(rating) display.Text = rating; ProcessTweets(tweets) ... 53% System Download Callback UI Dispatch Background Thread Asynchronous code ProcessTweets Async Get Call UI Thread ClickHandler End ClickHandler Start Display Rating User Click

System User Transaction 53% User Perceived Delay Transaction time Mitt Rating 53% User Transaction System Download Callback UI Dispatch Background Thread Dotted lines Async Get Call UI Thread ClickHandler Start User Perceived Delay Transaction time Display Rating

User Transaction 53% Transaction time Background Thread UI Thread Get Rating 53% User Transaction Background Thread UI Thread Transaction time

Mitt Rating 53% User Transaction Background Thread UI Thread 7 edges

Where is the bottleneck? Apps are highly asynchronous 30 popular apps 167,000 transactions Up to 7000 edges On average, 19 asynchronous calls per user transaction On average, 8 parallel threads per user transaction Where is the bottleneck? Focus development efforts

User Transaction 47% Twitter Process Tweets Callback Fire Facebook Mitt Rating Tweets 47% Posts Twitter Process Tweets Callback Fire Background Thread Facebook Process Posts Callback Fire Background Thread Facebook Twitter Background Thread Lets try to add a feature to our example app here. Instead of making a single request, we want to make two requests and compare the rating. Here is how the execution trace looks like. This is still is a simple user transaction but it already contains 18 edges. - Since most tasks need to be performed asynchronously, transactions quickly gets complicated with complex thread synchronization. Here is a example of transaction which processes data from two http calls. When the user interacts, it creates a background thread And start two Http requests. Then it blocks the thread. The first response comes back, does some processing and informs the blocked thread that it has finished. Later, the second response comes back, and fires the blocked thread At this point the thread wakes up, does some processing and updates the UI To get the transaction graph and the user perceived delay, you have to track all the causality between threads and thread synchronizations. And the user perceived delay is from the start to the end. Thread Blocked Thread Wakeup UI Thread User Click User Transaction Display

Critical Path User Transaction Optimizing the critical path reduces the user perceived delay Twitter Process Tweets Callback Fire Background Thread Facebook Process Posts Callback Fire Background Thread Facebook Twitter Background Thread Lets try to add a feature to our example app here. Instead of making a single request, we want to make two requests and compare the rating. Here is how the execution trace looks like. This is still is a simple user transaction but it already contains 18 edges. - Since most tasks need to be performed asynchronously, transactions quickly gets complicated with complex thread synchronization. Here is a example of transaction which processes data from two http calls. When the user interacts, it creates a background thread And start two Http requests. Then it blocks the thread. The first response comes back, does some processing and informs the blocked thread that it has finished. Later, the second response comes back, and fires the blocked thread At this point the thread wakes up, does some processing and updates the UI To get the transaction graph and the user perceived delay, you have to track all the causality between threads and thread synchronizations. And the user perceived delay is from the start to the end. Thread Blocked Thread Wakeup UI Thread User Click User Transaction Display

AppInsight What is the user-perceived delay? Where is the bottleneck? Automatically instruments the app to track user transactions and critical path

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Store

Capturing User Transaction Should not impact app performance Limited Resources Low Overhead User Transaction Background Thread UI Thread

User Transaction Background Thread UI Thread

Capture UI Manipulations User Transaction User Click Event Handler Background Thread UI Thread Event Handler User Click

Capture UI Manipulations User Transaction Background Thread UI Thread

Capture UI Manipulations Thread Execution User Transaction Callback Start Callback End Background Thread UI Thread Click Handler Start Click Handler End Update UI Start End

Capture UI Manipulations Thread Execution User Transaction Background Thread UI Thread

Capture UI Manipulations Thread Execution Async Calls and Callbacks User Transaction Download Callback UI Dispatch Call Background Thread Async Get Call UI Thread UI Dispatch Callback

Capture UI Manipulations Thread Execution Async Calls and Callbacks User Transaction Background Thread UI Thread

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates User Transaction User Transaction Background Thread UI Thread UI Update

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates User Transaction Background Thread UI Thread

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization User Transaction Fire Background Thread Fire Background Thread Background Thread Thread Blocked Thread Wakeup UI Thread

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization User Transaction Background Thread UI Thread

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization User Transaction User Transaction Callback Start Callback End Background Thread UI Thread Click Handler Start Click Handler End Update UI Start End

Capturing Thread Execution Trace every method Prohibitive overhead Enough to log thread boundaries Log entry and exit of Upcalls System App Callback Start Callback End Background Thread Upcalls UI Thread Click Handler Start Click Handler End Update UI Start End

Identify Upcalls Event Handlers are Upcalls Function pointers point to potential Upcalls Callbacks are passed as function pointers ClickHandler() { AsyncHttpGet(url, DownloadCallback); } DownloadCallback(tweets) rating = ProcessTweets(tweets); UIDispatch(DisplayRating, rating); DisplayRating(rating) display.Text = rating; ProcessTweets(tweets) ... <Button Click=“ClickHandler” />

Instrument Upcalls Low Overhead Rewrite app 7000 times less overhead Trace Upcalls ClickHandler() { Logger.UpcallStart(1); AsyncHttpGet(url, DownloadCallback); Logger.UpcallEnd(1); } DownloadCallback(response) Logger.UpcallStart(2); rating = ProcessTweets(tweets); UIDispatch(DisplayRating, rating); Logger.UpcallEnd(2); DisplayRating(rating) Logger.UpcallStart(3); display.Text = rating; Logger.UpcallEnd(3); ProcessTweets(tweets) ... Low Overhead 7000 times less overhead Callback Start Callback End Background Thread UI Thread Click Handler Start Click Handler End Update UI Start End

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization User Transaction Download Callback UI Dispatch Call Background Thread Async Get Call UI Thread UI Dispatch Callback

Async Calls and Callbacks Log Callback Start We capture start of the thread Log Async Call Any call that accepts a function pointer Match Async Call to its Callback ClickHandler() { AsyncHttpGet(url, DownloadCallback); Logger.AsyncCall(5); } DownloadCallback(tweets) Logger.UpcallStart(2); .... Callback Async Get Call

Matching Async Call to its Callback A method could be a callback to many async calls Replicate the callback for each Async call [Hirzel’01] Async call called in a loop (e.g. download list of URLs) Cannot be matched correctly Solution: Detour Callbacks

Callbacks System App AsyncHttpGet Async Call DownloadCallback DownloadCallback(response) { } Async Call

Detour Callbacks System App AsyncHttpGet Async Call obj.DetourCallback DownloadCallback DownloadCallback(response) { } Async Call while(...) { } MatchId = 3 MatchId = 3 obj = new DetourObject(DownloadCallback, MatchId++); AsyncHttpGet(url, obj.DetourCallback); AsyncHttpGet(url, GetCallback); class DetourObject { } MatchId = 3 DetourCallback(response) { DownloadCallback(response); }

Detour Callbacks System App AsyncHttpGet Async Call obj.DetourCallback DownloadCallback(response) { } Async Call while(...) { } MatchId = 4 MatchId = 4 obj = new DetourObject(DownloadCallback, MatchId++); AsyncHttpGet(url, obj.DetourCallback); class DetourObject { } MatchId = 4 DetourCallback(response) { DownloadCallback(response); }

Detour Callbacks System App AsyncHttpGet Async Call obj.DetourCallback DownloadCallback(response) { } Async Call while(...) { } MatchId = 5 MatchId = 5 obj = new DetourObject(DownloadCallback, MatchId++); AsyncHttpGet(url, obj.DetourCallback); class DetourObject { } MatchId = 5 DetourCallback(response) { DownloadCallback(response); }

Matching Async Call to its Callback Detour Callbacks Event Subscriptions Delayed Callbacks Track Object Ids Low Overhead Download Callback UI Dispatch Call Background Thread Async Get Call UI Thread UI Dispatch Callback

Capture UI Manipulations Thread Execution Async Calls and Callbacks UI Updates Thread Synchronization User Transaction Background Thread UI Thread

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Marketplace

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Marketplace

Analysis Critical Path Aggregate Analysis Exception Path

Critical Path Bottleneck path in the user transaction Critical Path Optimizing the critical path reduces user perceived delay Critical Path UI Update User Manipulation Multiple UI updates Timers, Sensors

Analysis Critical Path Aggregate Analysis Exception Path

Aggregate Analysis Group similar transactions Outliers Same transaction graph Outliers Points to corner cases Highlight common critical paths Focus development effort Root causes of Performance Variability Highlight what really matters in the wild

Analysis Critical Path Aggregate Analysis Exception Path

Failures in the wild Exception Path Exception at parseXML() DownloadCallback() at parseXML() Background Thread ThreadCallback() Stack trace parseXML() … DownloadCallback() Background Thread ClickHandler() AsyncHttpGet(url) UI Thread ThreadStart() User Manipulation Exception Path

Analysis Critical Path Aggregate Analysis Exception Path

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Store

Developer Feedback Long User Transactions Critical Path Web based Interface Long User Transactions Critical Path Aggregate Analysis Outliers Common Case Factors affecting Crashes - Exception Path

Deployment Developer Feedback Instrumenter Analysis Server App App Instrumented App Traces Downloads App Store

Deployment 30 Windows Phone apps 30 Users Over 4 months of data 6750 app sessions 33,000 minutes in apps 167,000 user transactions

User Transactions and Critical Path 15% of the user transactions take more than 5 seconds Apps are highly asynchronous Up to 7000 edges # edges in the critical path << # edges in the transaction Top 2 edges responsible for 82% of transaction time App developer can focus on optimizing these edges Key reason why a system like AppInsight is needed

Key reason why performance monitoring in the wild is important Aggregate Analysis High variability in the wild 29% transaction groups has multiple critical paths You would think that critical path remains the same Network, GPS state, Device model, User all affects performance Key reason why performance monitoring in the wild is important

AppInsight Overhead 0.02% 2% 4% 1.2% <1% Negligible Overhead Compute Memory Network Binary Size Battery 0.02% 2% 4% 1.2% <1% Negligible Overhead Impact on app performance is minimal Our users reported no cases of performance degradation For any system like this, the question is what is the overhead Low resource consumption

Developer Feedback Instrumenter Analysis Server App App Developer Instrumented App Traces Downloads App Store

Improved Developer Feedback Instrumenter Analysis Server App App Instrumented App Traces Downloads App Store

Developer Case Study App: My App Problem: UI Hog AppInsight: Aggregate analysis showed high performance variability Attributed to UI thread Abnormal latencies only at the start of the session System loading DLLs in the critical path

Developer Case Study App: Popular App Problem: Slow Transactions AppInsight: Aggregate analysis showed 3G latencies significantly affecting critical paths Known issue but no quantitative data Current caching policy insufficient Quantitative data

Developer Case Study App: Professional App Problem: Slow Transactions AppInsight: Custom instrumentation in the critical path Affecting user perceived delay

Helps app developers understand performance bottlenecks in the wild AppInsight Helps app developers understand performance bottlenecks in the wild Automatically tracks User Transactions Identifies Critical Paths and Exception Paths Aggregate Analysis highlights factors affecting performance Extremely low overhead Zero developer effort Readily deployable

FAQ Who cares about Windows Phone. Can you do it for Android and iPhone? This was done in 1982. Don’t you read related work? Are you one of those who don’t care about user privacy? Sounds too good. Are there any limitations? Are you going to release it? Or is it a one off paper? I totally don’t get why you are doing this!?

Limitations AppInsight can miss certain causal relationships Does not track data dependencies Two threads communicating through a shared variable One thread polling Uses heuristics to warn the developer Will miss implicit casual relationships Resource contention Does not track that state left behind Dependencies resulting from such state