DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime.

Presentation transcript:

DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime

Outline Introduction and design patterns Managed code performance issues Cost model Tools Wrap-up

Slow Software is Bad Don’t Ship It Symptoms Locked UI – splash screen, wait cursor Bad citizenship – paging, CPU utilization Scalability – server farms Ultimate causes Inattentive engineering Bad design – bad architecture, interfaces, data structures, algorithms Waste not Premature optimization...

Design Patterns Faster Code, Smaller Data Measure it – time and space Speedup techniques Cache, batch, precompute, defer Smarter recalc Incremental, progressive, background Smaller data Don’t hoard; size appropriately Arrays vs. links; frugal interfaces

Performance Anti-Patterns Think (Twice) Waiting on remote data XML Excessive OOP Ignorance and Apathy Not measuring Not setting performance goals

Perf Process Patterns “That which gets measured gets done” Perf budgets, goals, requirements Perf unit tests, regression tests Process of “constant” improvement Measuring, tracking, refining, trend lines Perf culture Users: perf as a key feature Devs: perf as a correctness issue

Outline Introduction and design patterns Managed code performance issues Cost model Tools Wrap-up

Why Managed Code? Programmer productivity Goodbye, corrupt heap debugging Target modern requirements FX: ++clean, ++consistent, ++streamlined Better apps sooner Performance barrier to adoption? Real – improves with each release Perceived – “blame it on managed code” Reality – it's “pedal to the metal”

The Challenge of Writing Fast Managed Code We’re all newbies! Learning how May not be learning how much things cost Everything is easier... The Knowledge Ildasm, debuggers, CLR Profiler, profilers, timing, vadump, events, Rotor

Managed Code Close to the Machine Not your father’s bytecode interpreter Source → IL → native (JIT or NGEN) Optimizing JIT compiler Constant folding; Constant and copy propagation; Common subexpression elimination; Code motion of loop invariants; Dead store and dead code elimination; Register allocation; Method inlining; Loop unrolling (small loops/small bodies) .NET Framework 1.1 NGEN does same opts Disabled when debugging

Managed Data Automatic Storage Management Fast new; fast garbage collection GC traces and compacts reachable object graph >50 million objects per second Generational GC Heaps Gen0 – new objects – cache sized; fast GC Gen1 – objects survived a GC of gen0 Gen2 – objects survived a GC of gen1,2 Large object heap Server GC Cache affinitive; concurrent; ASP.NET/hosted Managed data costs space & time over its lifetime

Managed Data Best Practices Often performance == allocation profile Short lived objects are cheap (not free) Try not to churn old objects Inspect with CLR Profiler GC “gotchas” Keeping refs to “dead” object graphs Caches; weak references Pinning Boxing Finalization...
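
Boxing is one of the GC "gotchas" listed above. The following is a minimal sketch of how it adds to the allocation profile, not code from the talk. The class and method names are hypothetical, and `List<int>` is a generic collection that arrived in .NET 2.0, after this presentation, so its use here is an assumption made purely for contrast.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo class; not part of the original slides.
public static class BoxingDemo
{
    // ArrayList stores object references, so every int added is boxed
    // into its own small heap object and unboxed again when read.
    public static long SumBoxed(int count)
    {
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++)
            list.Add(i);              // boxing: int -> object
        long sum = 0;
        foreach (object o in list)
            sum += (int)o;            // unboxing cast
        return sum;
    }

    // List<int> keeps the ints inline in its backing array,
    // so adding an element does not allocate a box.
    public static long SumUnboxed(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++)
            list.Add(i);
        long sum = 0;
        foreach (int v in list)
            sum += v;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumBoxed(1000));
        Console.WriteLine(SumUnboxed(1000));
    }
}
```

Both methods compute the same sum; the difference is the number of short-lived objects created, which a tool such as CLR Profiler would show.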

Managed Data Finalization and the Dispose Pattern Finalization: ~C() : non-deterministic resource cleanup GC; object unref’d; promote; queue finalizer Costs – retains object and its objects; finalizer thread; bookkeeping; call Use rarely; use Dispose Pattern Implement IDisposable Call GC.SuppressFinalize Hold few obj fields and null them out ASAP Dispose early; try/finally ; C# using
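
The Dispose pattern named on this slide can be sketched as follows. This is one common shape of the pattern under stated assumptions, not the speaker's own code: `ResourceHolder` and its fake `handle` field are hypothetical stand-ins for a class that owns a real unmanaged resource.

```csharp
using System;

// Hypothetical wrapper; "handle" stands in for a real unmanaged resource.
public class ResourceHolder : IDisposable
{
    private IntPtr handle = new IntPtr(1);
    private bool disposed;

    public bool IsDisposed { get { return disposed; } }

    // Deterministic cleanup: release now and tell the GC that the
    // finalizer no longer needs to run for this object.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        // A real class would release its unmanaged resource here.
        handle = IntPtr.Zero;
        disposed = true;
    }

    // Safety net only: runs on the finalizer thread if Dispose was never called.
    ~ResourceHolder()
    {
        Dispose(false);
    }
}

public static class DisposeDemo
{
    public static bool UseAndDispose()
    {
        ResourceHolder holder = new ResourceHolder();
        using (holder)
        {
            // Work with the resource; Dispose runs even if this block throws.
        }
        return holder.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(UseAndDispose());
    }
}
```

The C# `using` statement expands to try/finally, which is the "dispose early" advice in code form.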

Managed Code Threading and Synchronization Use the ThreadPool Easy, self tuning, good citizen QueueUserWorkItem, BeginInvoke lock() – not cheap Granularity trade-off – concurrency vs. cost Scales much better in .NET 1.1 Consider Interlocked.Exchange, ReaderWriterLock
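
As a rough illustration of the advice above, the sketch below queues work on the ThreadPool and updates a shared counter with `Interlocked` operations rather than `lock`. The class name and parameters are hypothetical, and `Interlocked.Increment` is used as a sibling of the `Interlocked.Exchange` call the slide mentions.

```csharp
using System;
using System.Threading;

// Hypothetical demo class; not part of the original slides.
public static class ThreadPoolDemo
{
    public static int CountWithThreadPool(int workItems, int incrementsPerItem)
    {
        if (workItems <= 0) return 0;

        int total = 0;
        int pending = workItems;
        using (ManualResetEvent done = new ManualResetEvent(false))
        {
            for (int i = 0; i < workItems; i++)
            {
                // Queue the work instead of creating a dedicated thread per item.
                ThreadPool.QueueUserWorkItem(delegate(object state)
                {
                    for (int j = 0; j < incrementsPerItem; j++)
                        Interlocked.Increment(ref total);   // atomic, no lock taken

                    // The last work item to finish signals the waiting caller.
                    if (Interlocked.Decrement(ref pending) == 0)
                        done.Set();
                });
            }
            done.WaitOne();
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithThreadPool(8, 1000));
    }
}
```

Interlocked operations suit single-word updates like this counter; anything that must update several fields together still needs a lock, which is where the granularity trade-off comes in.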

Managed Code Reflection Slower and larger than direct use Prefer is / as to typeof() == Member lookup/enum slow but cached Reflective invoke is quite slow Lookup, overload res., security, stack frame Activator.CreateInstance too Prefer MethodInfo.Invoke to Type.InvokeMember Beware of code that uses reflection Late binding in VB.NET: use Option Explicit On and Option Strict On
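
To make the comparison concrete, here is a small sketch of a direct call next to a reflective call through a `MethodInfo` that is looked up once and reused. `Calculator` and `ReflectionDemo` are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Reflection;

// Hypothetical target type for the comparison.
public class Calculator
{
    public int Twice(int x) { return 2 * x; }
}

public static class ReflectionDemo
{
    // Pay for the member lookup once and reuse the result.
    private static readonly MethodInfo TwiceMethod =
        typeof(Calculator).GetMethod("Twice");

    // Direct call: statically bound, eligible for inlining.
    public static int CallDirect(Calculator c, int x)
    {
        return c.Twice(x);
    }

    // Reflective call: allocates an argument array, boxes the argument
    // and the result, and goes through the reflection invoke machinery.
    public static int CallReflective(Calculator c, int x)
    {
        return (int)TwiceMethod.Invoke(c, new object[] { x });
    }

    public static void Main()
    {
        Calculator c = new Calculator();
        Console.WriteLine(CallDirect(c, 21));
        Console.WriteLine(CallReflective(c, 21));
    }
}
```

Both calls return the same value; timing each in a loop is a simple way to see the gap on a given runtime.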

Managed Code P/Invoke and COM Interop Efficient, but frequent calls add up Costs depend on marshaling Primitive types and arrays of same: cheap Others, not; e.g. Unicode to ANSI strings COM interop – learn threading models Avoid STA threaded components Avoid calling or being callable via IDispatch Mitigate interop call costs Chunky interfaces; move to managed code

Outline Introduction and design patterns Managed code performance issues Cost model Tools Wrap-up

C/C++ Cost Models The Gut-Feel Cost of a Line of Code C – close to the machine WYWIWYG; int = * + call → instructions C++ (OOP) C features: same cost New features: additional, hidden costs Ctors; SI, MI, VI; virtual; PMs; EH; RTTI What does a function cost?

Towards a Managed Code Cost Model Optimized native code C features: similar cost? OOP features: similar cost? Let’s measure it Simple timing loops, unrolled some Modified to prevent CSE/dead code elim. 50 ms each (2^18 to 2^30 iterations) Measured on 1.1 GHz P-III laptop, Win XP Disclaimers: uncertainty, subject to change

Costs: Math Nicely optimized and run at full native code speed

Costs: Method Calls Inlining – !virtual, small, simple, no try Instance method call-site null this check Virtual – like C++: (*this->MT[m])(…) Interface – quadruple indirect (*this->MT->itfmap[i]->MT[m])( … ) Disclaimers: !inlining, branch prediction, arguments

Costs: Construction
class A { int a; }      // L1
class B : A { int b; }  // L2
class C : B { int c; }  // L3, etc.
Allocation / management / GC cost Value types: “0” Ref types: fast but ~proportional to size Construction cost All fields 0-initialized Small ctors can be inlined Larger ctors incur up to 1 call/level

Costs: Casts and IsInsts Safe, secure, verifiable → type safety Cast may throw exception IsInst will not – is and as operators Up casts always safe and free Down casts incur a helper function call
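
The difference between a cast and an IsInst check can be sketched as below. `Shape`, `Circle`, and `CastDemo` are hypothetical types used only to show the two behaviours side by side.

```csharp
using System;

// Hypothetical type hierarchy for the demonstration.
public class Shape { }
public class Circle : Shape { public double Radius = 1.0; }

public static class CastDemo
{
    // "as" compiles to isinst: it yields null instead of throwing
    // when the object is not of the requested type.
    public static double RadiusOrZero(Shape s)
    {
        Circle c = s as Circle;
        return c != null ? c.Radius : 0.0;
    }

    // A cast expression compiles to castclass: a failed down cast
    // throws InvalidCastException.
    public static bool CastThrows(Shape s)
    {
        try
        {
            Circle c = (Circle)s;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RadiusOrZero(new Circle()));
        Console.WriteLine(RadiusOrZero(new Shape()));
        Console.WriteLine(CastThrows(new Shape()));
    }
}
```

When a type mismatch is an expected outcome rather than a bug, `as` followed by a null test avoids using an exception for control flow.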

Costs: Write Barriers Gen0 GC: trace roots and gen0 only? Could miss gen0 refs from gen1/gen2 Write barrier notes obj ref field stores contact.address = newAddress; Tracks refs to newer gen objects Not needed for locals, non-objects Incurs a helper function call

Costs: Array Bounds Checks For productivity and runtime integrity Checks index against array.Length Inlined, optimized – inexpensive Range check elimination: for (i=0; i < a.Length; i++)…a[i]… Helper call for store object array elt. Bounds check, type check, write barrier
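
The range check elimination pattern on this slide can be illustrated with two loops that differ only in their upper bound. This is a sketch under the assumption that the JIT recognizes the canonical `i < a.Length` form, as the slide states; whether the checks are actually removed depends on the runtime version, and the names below are hypothetical.

```csharp
using System;

// Hypothetical demo class; not part of the original slides.
public static class BoundsCheckDemo
{
    // Canonical form: the loop bound is a.Length itself, so the JIT can
    // prove every a[i] is in range and may drop the per-access check.
    public static long SumCanonical(int[] a)
    {
        long sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += a[i];
        return sum;
    }

    // Same work, but the bound is an unrelated parameter. The JIT cannot
    // assume n <= a.Length, so each access is likely to stay checked.
    public static long SumWithSeparateBound(int[] a, int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3, 4 };
        Console.WriteLine(SumCanonical(data));
        Console.WriteLine(SumWithSeparateBound(data, data.Length));
    }
}
```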

Summary A Managed Code Cost Model Like C/C++, close to the machine ~1 ns: int, float = * + - * == ~6 ns: calls (perfectly predicted) Unlike C++ ~20-40 ns: new small obj + gen0 GC, box ~6-8 ns: casts, write barriers ~16 ns: object[] stores Reflection? Think 100 times slower

(Get Real) Consider Computer Architecture Cache misses, page faults 1983: 1 MIPS; 300 ns DRAM; 25 ms disk 2003: 10 BOPS; 100 ns DRAM; 10 ms disk “Branch-predicting out-of-order superscalar trace-cache RISC w/ 3L data caches” Issue 10,000 ops in 1 μs – or 10 DRAM reads Full cache miss – 1,000 ops Page fault – 100 M ops! 100 ns full cache miss >> any CLR operation Locality of reference matters
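
The figures on this slide follow from the 2003 numbers it quotes, reading 10 BOPS as $10^{10}$ operations per second. A worked check of the arithmetic, using only those quoted values:

```latex
\begin{align*}
\text{ops issued in } 1\,\mu\text{s} &= 10^{10}\,\text{ops/s} \times 10^{-6}\,\text{s} = 10^{4} = 10{,}000 \\
\text{DRAM reads in } 1\,\mu\text{s} &= \frac{1\,\mu\text{s}}{100\,\text{ns}} = 10 \\
\text{ops lost per full cache miss} &= 10^{10}\,\text{ops/s} \times 100 \times 10^{-9}\,\text{s} = 10^{3} = 1{,}000 \\
\text{ops lost per page fault} &= 10^{10}\,\text{ops/s} \times 10 \times 10^{-3}\,\text{s} = 10^{8} = 100\,\text{M}
\end{align*}
```

Set against the roughly 1 to 40 ns per-operation costs in the cost model summary, a single 100 ns cache miss outweighs almost any individual CLR operation.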

Outline Introduction and design patterns Managed code performance issues Cost model Tools Wrap-up

Tools Inspectors Ildasm, debuggers – beware “debug mode” Measurers Code profilers, perfmon, CLR Profiler, vadump Simple timing loops [...InteropServices.DllImport("KERNEL32")] private static extern bool QueryPerformanceCounter(ref long lpCount); QueryPerformanceFrequency(ref long lpFreq); Rotor
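
A simple timing loop of the kind mentioned above might look like the sketch below. It is an assumption-laden substitute rather than the slide's code: it uses `System.Diagnostics.Stopwatch`, which was added in .NET 2.0 and wraps the same high-resolution counter on Windows, instead of the P/Invoke declarations shown on the slide. The class name, the loop body, and the `Sink` field are hypothetical.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing harness; not part of the original slides.
public static class TimingDemo
{
    // Publishing the result to a static field keeps the JIT from
    // treating the loop body as dead code.
    public static int Sink;

    public static double NanosecondsPerIteration(int iterations)
    {
        int acc = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            acc += i ^ (acc >> 3);     // cheap integer work with a loop-carried dependency
        sw.Stop();
        Sink = acc;
        return sw.Elapsed.TotalMilliseconds * 1e6 / iterations;
    }

    public static void Main()
    {
        NanosecondsPerIteration(1 << 20);   // warm-up run so JIT time is excluded
        Console.WriteLine(NanosecondsPerIteration(1 << 26) + " ns per iteration");
    }
}
```

Run it from an optimized build with no debugger attached, since the slide warns that debug mode disables optimizations.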

Outline Introduction and design patterns Managed code performance issues Cost model Tools Wrap-up

In Conclusion… The Secret to Faster Managed Code “There is no magic faster pixie dust!” Look in the mirror You have the power and the responsibility Mantra: Set goals, measure, understand the platform

Resources Managed code perf papers at MSDN .NET Developer Center: “GC Basics and Performance Hints”; “Writing Faster Managed Code: Know What Things Cost” CLR Profiler (same site) SSCLI Stutz et al., Shared Source CLI Essentials

Community Resources Most Valuable Professional (MVP). Newsgroups: converse online with Microsoft newsgroups, including worldwide. User Groups: meet and learn with your peers.

© 2003 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.