patterns & practices: Designing for Performance (for .NET). Vance Morrison, Performance Architect, .NET Runtime Team.


1 patterns & practices Designing for Performance (for .NET)
Vance Morrison, Performance Architect, .NET Runtime Team

2 patterns & practices Goals of Talk
1. Motivate doing perf during development
   – Encourage best practices, especially at design time
   – Show you the best available tools
2. Provide a framework for understanding perf
   – Talk meant to be a ‘Survey Course’
   – Lots of links for more information
   – If you remember only one thing: Vance Morrison's Weblog

3 patterns & practices To Design for High Performance
1. Care about Performance
   – Performance IS extra work; schedule for it
   – You must know what performance you care about
   – You must PLAN for performance from Design to Servicing
2. Measure, Measure, Measure
   – You will be measuring in ALL parts of the release cycle
   – Measurement is often neglected early in the product cycle
   – You can also lose perf anywhere along the way (even in servicing)

4 patterns & practices Talk Outline
1. Perf Early in Design: Planning and Design
2. Perf Theory: What’s Important and Why
3. Perf Practice: Measuring Tools

5 patterns & practices Performance Planning
ALL projects should have a performance plan
Performance plans CAN be easy
1. Start with your most important User Scenarios
   e.g. Startup, various response times
2. Articulate what is Bad, Good and Excellent Perf
   e.g. Startup: 10 sec Bad, 3 sec Good, < 1 sec Excellent
3. Coarse Estimate if ‘Good Perf’ is in Jeopardy
   Uncertainty => more prototyping and measurement
   Bad Perf => design change
Followup: CLR Inside Out: Measure Early and Often for Performance, Part 1
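A performance plan of this kind can be kept as plain data and checked against measurements. The following is a minimal Python sketch, not something from the talk: the `ScenarioGoal` class and its field names are hypothetical, and the "Below Good" label for times between the Good and Bad thresholds is an assumption, since the slide does not name that band. Only the startup numbers (10 sec, 3 sec, 1 sec) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioGoal:
    """Hypothetical container for one user scenario's performance goals."""
    name: str
    excellent_sec: float  # strictly faster than this is Excellent
    good_sec: float       # at or under this is Good
    bad_sec: float        # at or over this is Bad

    def rate(self, measured_sec: float) -> str:
        if measured_sec < self.excellent_sec:
            return "Excellent"
        if measured_sec <= self.good_sec:
            return "Good"
        if measured_sec >= self.bad_sec:
            return "Bad"
        # Assumption: the slide does not name the band between Good and Bad.
        return "Below Good"


# Thresholds taken from the slide's startup example.
startup = ScenarioGoal("Startup", excellent_sec=1.0, good_sec=3.0, bad_sec=10.0)
```

Calling `startup.rate(measured)` on each build's measured startup time is one way to notice early when ‘Good Perf’ is in jeopardy.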

6 patterns & practices Importance of Design Time
Most performance is lost during initial design
Design perf loss can’t be fixed easily
Guiding Principle: Pay For Play
   – Users should not pay for what they don’t use
Lay groundwork for better perf in V2
Need to know what things cost to do design
1. Getting data from references / past experience
2. Doing experiments to gather needed data
Followup: Perf data on .NET primitive operations: CLR Inside Out: Measure Early and Often for Performance, Part 2
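"Doing experiments to gather needed data" usually means timing a candidate operation in a tight loop before committing to a design. The sketch below shows the general shape of such an experiment in Python. It is an illustration only, not the MeasureIt tool listed later in the talk, and `measure_us` is a hypothetical helper name. It reports the best of several repeats, which reduces (but does not remove) noise from other activity on the machine.

```python
import time


def measure_us(operation, iterations=10_000, repeats=5):
    """Return the best observed cost of one call to `operation`, in microseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(iterations):
            operation()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Note: the result includes loop and call overhead, so treat it as an estimate.
    return best / iterations * 1e6


# Example experiment: membership test in a list versus a set of the same items.
items_list = list(range(1_000))
items_set = set(items_list)
list_cost = measure_us(lambda: 999 in items_list, iterations=1_000)
set_cost = measure_us(lambda: 999 in items_set, iterations=1_000)
```

The absolute numbers depend on the machine; what matters at design time is the relative cost of the alternatives.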

7 patterns & practices If You Want Perf, You MUST Design for It
Examples of What Happens When You Don’t Care
1. NotePad.exe vs Browse.exe
   – Notepad maps whole file into memory, scans it => Notepad is unusable on files > 100 Meg
2. IE as XmlViewer vs XmlView
   – IE maps XML into DOM, XMLView keeps pointers => IE is unusable on XML files > 50 Meg
3. XmlDocument vs XmlReader
   – XmlDocument reads whole document into memory => unusable on XML files > 50 Meg
4. MS Manager Review Tool
   – Downloaded 50 Meg into an In Memory Database Cache => Long startup times, large working sets, sluggishness because of paging
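The XmlDocument vs XmlReader contrast is between building the whole tree in memory and streaming through it. As an analogy only (this is Python's standard `xml.etree.ElementTree`, not the .NET classes named on the slide), the sketch below counts elements both ways. The function names and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_in_memory(source, tag):
    """Whole-document approach: parse the full tree, then walk it."""
    root = ET.parse(source).getroot()
    return sum(1 for _ in root.iter(tag))


def count_streaming(source, tag):
    """Streaming approach: handle each element as it completes, then release it."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's text and children once it has been handled.
        # The parent still keeps a small empty stub for it.
        elem.clear()
    return count


sample = b"<root><item>1</item><item>2</item><other/></root>"
in_memory = count_in_memory(io.BytesIO(sample), "item")
streamed = count_streaming(io.BytesIO(sample), "item")
```

Both give the same answer on a small file; the difference only shows up in memory use as the input grows, which is why the choice has to be made at design time.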

8 patterns & practices Talk Outline
1. Perf Early in Design: Planning and Design
2. Perf Theory: What’s Important and Why
3. Perf Practice: Measuring Tools

9 patterns & practices What Should You Measure?
Lots to choose from
– OS METRICS: %CPU, User Time, System Time, Working Set, Private Working Set, Commit Size, Page Faults, I/O Counts, Bytes Read, Cache Misses, Branch Mispredicts, TLB Misses, Interrupts, Context Switches, Registry Access, File Access, DLLs Loaded, Thread Count, …
– .NET METRICS: Methods JIT Compiled, IL Size Compiled, % Time in JIT, # GCs, GC Memory Allocated, GC Heap Size, % Time in GC, # Exceptions, # Contentions, # CCWs, # Transitions, …
– Other METRICS Groups: IE, SQL Server, ASP.NET, WPF, IPSec, TCP/IP, …
Simplify: You really care about TIME
– You only care about other metrics to the extent that they affect the TIME of interest
– TIME is the ‘currency’ for making tradeoffs

10 patterns & practices Taxonomy of a Perf Investigation
1. Determine Interval of Time Of Interest
   – Startup Time, Response Time, Throughput (Time for batch of work), …
2. Determine the Critical Path
   – If there is any concurrency, you only care about the longest sequential path
3. Minimize Critical Path Time
   – Do less work
   – Do work more efficiently (less expensive operations)
   – Move work off critical path (use multiple threads)
[Figure: timelines comparing Synchronous I/O (Disk Read 1 ms, Process 2 ms; Critical Path 3 ms) with Asynchronous I/O (ReadBuff 1 us, ProcessBuff 2 ms; Critical Path 2.001 ms)]
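The critical path is the longest chain of dependent work, so it can be computed as a longest-path walk over a task graph. The Python sketch below reproduces the slide's two numbers (3 ms and 2.001 ms) under one assumed reading of the figure: in the synchronous case processing waits for the disk read, and in the asynchronous case the disk read for the next buffer overlaps with processing of the current one. The task names and the `critical_path_us` helper are hypothetical.

```python
from functools import lru_cache


def critical_path_us(durations, deps):
    """Length of the longest dependency chain, in microseconds.

    durations: task name -> duration in microseconds
    deps: task name -> list of tasks that must finish first
    """
    @lru_cache(maxsize=None)
    def finish(task):
        start = max((finish(d) for d in deps.get(task, ())), default=0)
        return start + durations[task]

    return max(finish(task) for task in durations)


# Synchronous I/O: processing cannot start until the disk read completes.
sync_us = critical_path_us(
    {"disk_read": 1000, "process": 2000},
    {"process": ["disk_read"]},
)

# Asynchronous I/O (assumed reading of the figure): the next disk read runs
# concurrently, so the thread only pays a 1 us buffer hand-off before processing.
async_us = critical_path_us(
    {"disk_read_next": 1000, "read_buff": 1, "process_buff": 2000},
    {"process_buff": ["read_buff"]},
)
```

Here `sync_us` is 3000 and `async_us` is 2001: moving the disk read off the critical path removes almost all of its cost, even though the total work done is unchanged.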

11 patterns & practices Minimizing Critical Path Time
What Can a Single Thread Be Doing?
1. CPU (Executing Instructions)
2. Blocked (Waiting For Something Else)
   1. Disk (Fetching persistent Data)
   2. Network (Waiting on Cross-Machine Resources)
   3. Event/Locks (Waiting on other Threads (e.g. SQL DB))
Concentrate optimization on the biggest items
– Easy to only worry about CPU; think about blocked time too!
– Critical path can change to another thread

12 patterns & practices Blocked Time: Events/Locks
Client programs typically don’t have problems here
– Client programs tend to be sequential
– Some other thread is doing work while main thread waits
For parallel (server) workloads, can be a big issue
– Several threads can be blocked waiting for an event/lock
– ‘Hot Locks’ are the most common reason for poor scaling
– Symptom is that CPU is not being consumed fully
The best way to solve a scaling problem is to share less between threads
– Read-only data is much cheaper (memory system makes copies)
– Update in place is generally bad. Functional style is good.
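"Sharing less" often means letting each thread build a private result and combining the results once at the end, instead of updating shared state under a lock. The Python sketch below shows the two structures side by side. It is illustrative only: the function names are hypothetical, and in standard CPython builds threads do not run CPU-bound code in parallel, so this demonstrates the shape of the design rather than a measured speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def count_with_hot_lock(chunks, predicate):
    """Update in place: every match takes the same lock (a potential 'hot lock')."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        for value in chunk:
            if predicate(value):
                with lock:
                    total += 1

    with ThreadPoolExecutor() as pool:
        list(pool.map(work, chunks))  # force all workers to finish
    return total


def count_with_private_results(chunks, predicate):
    """Functional style: each worker returns its own count; combine once."""
    def work(chunk):
        return sum(1 for value in chunk if predicate(value))

    with ThreadPoolExecutor() as pool:
        return sum(pool.map(work, chunks))


chunks = [range(0, 100), range(100, 200)]
is_even = lambda value: value % 2 == 0
locked_total = count_with_hot_lock(chunks, is_even)
private_total = count_with_private_results(chunks, is_even)
```

Both versions return the same count; the second one has no lock for the workers to contend on.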

13 patterns & practices Blocked Time: Network
Network is slower than Disk (> 10 msec round trip)
Many Apps don’t have any network cost (good)
If yours does, however, manage it carefully
– Design should minimize round trips
– Synchronous waits on network are particularly bad
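The cost of round trips is easy to estimate at design time. The Python sketch below is a back-of-the-envelope model, assuming each round trip costs a fixed 10 msec (the slide gives this as a lower bound) and ignoring payload transfer time. The function and parameter names are hypothetical.

```python
import math


def network_wait_ms(items, items_per_round_trip, round_trip_ms=10.0):
    """Estimated blocked time when fetching `items` in batches of a given size."""
    round_trips = math.ceil(items / items_per_round_trip)
    return round_trips * round_trip_ms


one_at_a_time_ms = network_wait_ms(100, items_per_round_trip=1)
batched_ms = network_wait_ms(100, items_per_round_trip=100)
```

Under these assumptions, fetching 100 items one at a time blocks for about 1000 msec, while a single batched request blocks for about 10 msec.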

14 patterns & practices Blocked Time: Disk
Disk is 10,000 X slower than RAM
– 4-8 msec to seek, 20 msec / Meg to transfer
Disk time dominates on ‘Cold Startup’ (page faults)
– OS caches disk data, so ‘Warm Startup’ uses no disk
– Cold times of 10 sec or more are not uncommon
To improve Disk time
– Run less code at startup (and load fewer DLLs)
– Pack the data you do bring in from disk
   Unmanaged code: use Profile-Guided Optimization
   Nothing for managed code (yet)
– Use less .NET Reflection (harder to pack well)
Follow up:
– Vance Morrison's Weblog: A model for cold startup time
– Track down DLL loading using Visual Studio
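The seek and transfer figures on this slide are enough for a rough cold-startup estimate. The Python sketch below is a simple additive model built only from those figures (4-8 msec per seek, 20 msec per Meg). It is an assumption for illustration, not the model described in the weblog article referenced above, and the seek count and data size in the example are made-up inputs.

```python
def cold_disk_time_ms(seeks, megabytes, seek_ms, transfer_ms_per_meg=20.0):
    """Rough disk time: every seek pays seek_ms, every Meg pays the transfer cost."""
    return seeks * seek_ms + megabytes * transfer_ms_per_meg


def cold_disk_time_range_ms(seeks, megabytes):
    """Low and high estimates using the slide's 4-8 msec seek range."""
    low = cold_disk_time_ms(seeks, megabytes, seek_ms=4.0)
    high = cold_disk_time_ms(seeks, megabytes, seek_ms=8.0)
    return low, high


# Hypothetical startup: 200 seeks (scattered pages) and 50 Meg read from disk.
low_ms, high_ms = cold_disk_time_range_ms(seeks=200, megabytes=50)
```

For these inputs the estimate is 1800 to 2600 msec. The model also shows why packing helps: well-packed data needs fewer seeks for the same number of bytes.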

15 patterns & practices CPU Time Optimization
CPU cost breaks down as
1. Time to execute the instructions (often 1 cycle or less)
2. Time used to fetch arguments (can be many cycles)
Improve CPU by
1. Executing fewer instructions (better algorithms)
2. Keeping args in cache (making structures / code smaller)
If CPU is used by the .NET GC
1. If GC uses > 10% CPU, GC Heap needs tuning
2. To improve, allocate less, avoid ‘long lifetimes that die’
If CPU is used by the .NET JIT Compiler
– Use NGEN to pre-compile the .NET code
Followup:
– Garbage Collector Basics and Performance Hints
– Maoni Stephens's WebLog
– Speed: NGen Revs Up Your Performance with Powerful New Features

16 patterns & practices Why / When Memory is Important
Memory is not a primary metric (Time is)
Important when it affects time
– Code size affects startup (page faults) and task switching
– Data size affects CPU Cache, which affects CPU Time
– Heap size affects .NET GC
– Memory your app uses ‘steals’ memory from other applications
Some memory is more important than other memory
– Read-only memory (code) can be shared across processes, so it is less ‘expensive’ if it is actually shared (OS DLLs etc.)
– ‘Private’ (Heap, or GC Heap) memory is more expensive
Follow up: Memory Usage Auditing for .NET Applications

17 patterns & practices Talk Outline
1. Perf Early in Design: Planning and Design
2. Perf Theory: What’s Important and Why
3. Perf Practice: Measuring Tools

18 patterns & practices Monitoring Tools
Task Manager (start taskmgr.exe)
– Built into Windows
– Monitors at process granularity
– Resource Monitor is a very useful addition
– Process Explorer (free) is a more feature-rich option
Performance Counters (start PerfMon.exe)
– Also built into Windows
– Also monitors at process granularity
– A large number of counters available
   e.g. .NET Memory: # Gen0, # Gen1, # Gen2, Bytes In All Heaps
– Designed for long-lived (server) process monitoring

19 patterns & practices Event Tracing For Windows (ETW)
A high-performance logging infrastructure
Kernel and .NET already support it
In Vista+, supports stack traces on kernel events
Important Events
– Process Start/End, Thread Start/End, DLL Load/Unload
– 1 msec Sampling per CPU
– Thread Context Switch
– Page Faults (Soft Faults, Hard Faults), VirtualAlloc Calls
– Disk I/O
– File System Access, Registry Access
– ReadyThread (what makes a thread runnable)
– .NET Thread Pool, GC, Module Loads, Appdomains, …
Further Reading:
– Event Tracing: Improve Debugging And Performance Tuning With ETW
– Core OS Events in Windows 7, Part 1
– Core Instrumentation Events in Windows 7, Part 2

20 patterns & practices CPU Measurement
Instrumentation based profiling
– Modify code to add logging on method entry and exit
– Requires modification of code, will affect memory cache behavior
– Can slow the program substantially
– CLRProfiler and some Visual Studio profiling work this way
Sample based profiling
– Stop the processor and crawl the stack at a given interval (e.g. 1 msec)
– Assign the full 1 msec to wherever the sample was taken
– Efficient (< 5% overhead), dialable, non-intrusive
– Sampling ‘noise’: need 10 samples in an interval to start to be meaningful
ETW CPU profiling is sample based. Other events are instrumentation based.
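The attribution rule for sample-based profiling ("assign the full 1 msec to wherever the sample was taken") can be sketched in a few lines. The Python below is an illustration of that bookkeeping only, not how any particular profiler is implemented. The stack samples are made-up data, the function names are hypothetical, and reporting an inclusive total per frame is an added convenience that the slide does not mention.

```python
from collections import Counter


def attribute_samples(stacks, interval_ms=1.0):
    """Turn stack samples into per-function time estimates.

    stacks: list of call stacks, each ordered from outermost to innermost frame.
    Returns (exclusive, inclusive) Counters of estimated milliseconds.
    """
    exclusive = Counter()  # time charged to the frame that was executing
    inclusive = Counter()  # time charged to every frame on the stack
    for stack in stacks:
        if not stack:
            continue
        exclusive[stack[-1]] += interval_ms
        for frame in set(stack):  # count recursive frames once per sample
            inclusive[frame] += interval_ms
    return exclusive, inclusive


def is_meaningful(estimate_ms, interval_ms=1.0, min_samples=10):
    """Apply the slide's rule of thumb: fewer than 10 samples is mostly noise."""
    return estimate_ms / interval_ms >= min_samples


# Made-up samples: 12 taken in Parse, 3 taken in Render, all called from Main.
samples = [["Main", "Parse"]] * 12 + [["Main", "Render"]] * 3
exclusive_ms, inclusive_ms = attribute_samples(samples)
```

With these samples, Parse is charged 12 msec and passes the 10-sample threshold, while Render's 3 msec is too few samples to trust.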

21 patterns & practices Investigation Tools
XPERF Windows Performance Analyzer (WPA) - Free Microsoft Download
– Collects and visualizes ETW logs (ETL files)
– Vista and above. Can collect stacks for system events
– Fixed, 1 msec sampling for CPU
– Symbolic resolution for unmanaged stacks
– Currently does not support symbolic stacks for managed code
Visual Studio 2008 Profiler (Visual Studio Team System)
– Works on .NET code or unmanaged
– Can do sample based and instrumentation based profiling
– Sample based profiling only does user mode CPU profiling
– Sample based profiling can sample other useful CPU investigation metrics (cache misses, mispredicts …)
Visual Studio 2010 Parallel Performance Analyzer (ETW based)
– Shows all threads, and what they are doing (CPU, Disk, Blocked)
– Allows you to determine which threads unblocked a blocked thread (what it was waiting for)
VMMap - Free Microsoft Download
– Shows coarse memory usage of a single process
– Useful for seeing whether unnecessary DLLs are loaded
ClrProfiler - Free Microsoft Download
– Shows fine-grained usage of the .NET GC heap

22 patterns & practices Investigation Technique
Understand your Critical Path, and the resource that constrains you
– Visual Studio 2010 Parallel Performance Analyzer
– XPerf
Based on the critical resource, you can drill down with other tools
CPU
– Visual Studio 2010 Parallel Performance Analyzer
– Visual Studio 2008 Profiler (Visual Studio Team System)
– XPerf
Disk
– Visual Studio 2010 Parallel Performance Analyzer
– XPerf
Blocked / Network
– Visual Studio 2010 Parallel Performance Analyzer
Measuring Memory
– VMMap: shows coarse memory usage of a single process
– ClrProfiler: shows fine-grained usage of the .NET GC heap

23 patterns & practices Links and More Links
Articles
– CLR Inside Out: Measure Early and Often for Performance, Part 1
– CLR Inside Out: Measure Early and Often for Performance, Part 2
– Memory Usage Auditing for .NET Applications
Blogs
– Vance Morrison's Weblog
– Windows Performance Analysis Developer Center (not really a blog, but has FAQ and links to other blogs …)
– CLR and Framework Perf Blog (.NET Runtime’s Performance Team notes on Performance)
– Rico Mariani's Performance Tidbits
– Visual Studio Profiler Team Blog
– Hazim Shafi's Blog (details on VS 2010 new Performance tools)
– Pigs Can Fly : Xperf, a new tool in the Windows SDK
Tools
– MeasureIt (Benchmarking tool for design time)
– Visual Studio 2008 Profiler (Part of Visual Studio Team System): General CPU profiling
– Visual Studio 2010 Parallel Performance Analyzer (Part of Visual Studio Team System): Good all-round profiling (CPU, Disk, Blocked)
– Windows Performance Analyzer (WPA) (XPERF): General sub-process performance analysis
– VMMap (Measuring the coarse memory usage within a process)
– CLR Profiler for the .NET Framework 2.0 (Measuring detailed memory usage within the GC heap)
– Process Explorer (A more feature-rich Task Manager)
– Process Monitor (A tool for monitoring …)
Event Tracing for Windows (ETW) Articles
– Event Tracing: Improve Debugging And Performance Tuning With ETW
– Core OS Events in Windows 7, Part 1
– Core Instrumentation Events in Windows 7, Part 2

24 patterns & practices Review
FOLLOW UP
– Slides at Vance Morrison's Weblog; follow the links
CARE ABOUT PERF (especially at design time)
– Understand the scenarios that are performance critical, set Goals
– Estimate Perf at Design time, Experiment to reduce uncertainty
MEASURE, MEASURE, MEASURE
– You care about TIME
– Understand how other metrics affect TIME
– Invest in understanding your tools and metrics
– Don’t stop measuring (go all the way through servicing)

25 patterns & practices Questions?

