Published by Ella West. Modified over 9 years ago.
1
Visual C++ 2005 New Optimizations
Ayman Shoukry, Program Manager, Visual C++, Microsoft Corporation
2
How can your application run faster?
► Maximize optimization for each file.
► Whole Program Optimization (WPO) goes beyond individual files.
► Profile Guided Optimization (PGO) tailors optimizations to how your application is actually used.
► New floating point model.
► OpenMP.
► 64-bit code generation.
3
Maximum Optimization for Each File
► The compiler optimizes each source code file separately for best runtime performance
    The only type of optimization available in Visual C++ 6
► Visual C++ 2005 has better optimization algorithms
    Specialized support for newer processors such as the Pentium 4
    Improved speed and better precision of floating point operations
    New optimization techniques like loop unrolling
4
Whole Program Optimization
► Typically Visual C++ optimizes a program by generating code for each object file separately
► Introducing whole program optimization
    First introduced with Visual C++ 2002 and improved since
    Enabled with new compiler and linker options (/GL and /LTCG)
    The compiler has freedom to do additional optimizations
        ► Cross-module inlining
        ► Custom calling conventions
    Visual C++ 2005 supports this on all platforms
    Whole program optimization is widely used for Microsoft products.
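A minimal sketch of what cross-module inlining buys, assuming two hypothetical source files, math_utils.cpp and app.cpp. They are shown as one listing so the fragment compiles on its own; the build lines in the trailing comment use only the /GL and /LTCG options named on the slide.

```cpp
// ---- math_utils.cpp (hypothetical file name) ----
// Without /GL the compiler sees only this file when it generates code,
// so a caller in another file has to make a real call to square().
int square(int x) {
    return x * x;
}

// ---- app.cpp (hypothetical file name) ----
// Only the declaration is visible here at compile time.
int square(int x);

// With /GL and /LTCG, code generation is deferred to link time, where the
// body of square() is visible and can be inlined into this loop.
int sum_of_squares(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += square(i);
    }
    return total;
}

// Possible build, assuming the two files above:
//   cl /c /O2 /GL math_utils.cpp app.cpp
//   link /LTCG math_utils.obj app.obj /OUT:app.exe
```

Whether the call is actually inlined is the optimizer's decision; the sketch only shows the situation in which whole program optimization makes it possible.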
5
Profile Guided Optimization
► Static analysis leaves many optimization questions open for the compiler, leading to conservative optimizations
► Visual C++ programs can be tuned for expected user scenarios by collecting information from the running application
► Introducing profile guided optimization
    Optimizes code based on how customers actually use the program
    Runs optimizations at link time, like whole program optimization
    Available in Visual Studio 2005
    Widely adopted within Microsoft
if (p != NULL) {
    /* Perform action with p */
} else {
    /* Error code */
}
Is it common for p to be NULL? If it is not, the error code should be placed with other infrequently used code.
6
PGO: Instrumentation
► We instrument with "probes" inserted into the code
► Two main types of probes
    Value probes
        ► Used to construct a histogram of values
    Count (simple/entry) probes
        ► Used to count the number of times a path is taken
► We try to insert the minimum number of probes to get full coverage
    Minimizes the cost of instrumentation
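The two probe kinds can be imitated by hand. The sketch below is not what the compiler emits; ProfileData, classify and odd_count are hypothetical names, and the counters are ordinary fields updated in source code. It also shows why fewer probes than paths can be enough: the count for the odd path is derived from the entry count and the even count.

```cpp
#include <cstddef>
#include <map>

// Hypothetical, hand-written stand-ins for compiler-inserted probes.
struct ProfileData {
    std::size_t entry_count = 0;           // entry probe: calls to classify()
    std::size_t even_count = 0;            // count probe: "even" path taken
    std::map<int, std::size_t> histogram;  // value probe: histogram of i
};

// Returns 0 for even input and 1 for odd input, recording profile data.
int classify(int i, ProfileData& profile) {
    ++profile.entry_count;     // entry probe
    ++profile.histogram[i];    // value probe on the argument
    if (i % 2 == 0) {
        ++profile.even_count;  // count probe on one of the two paths
        return 0;
    }
    return 1;                  // no probe needed on this path
}

// The unprobed path's count follows from the other two counters.
std::size_t odd_count(const ProfileData& profile) {
    return profile.entry_count - profile.even_count;
}
```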
7
PGO Optimizations
► Switch expansion
► Better inlining decisions
► Cold code separation
► Virtual call speculation
► Partial inlining
8
Profile Guided Optimization
► Source: compile with /GL and optimizations on (e.g. /O2) to produce object files
► Object files: link with /LTCG:PGI to produce an instrumented image
► Instrumented image: run scenarios; the output is profile data
► Object files plus profile data: link with /LTCG:PGO to produce the optimized image
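The steps can be sketched for a single source file. The file name app.cpp and the function count_valid are hypothetical; the switches in the trailing comment are the ones named on the slide. The function stands in for a workload that main() would drive with inputs representing typical customer scenarios.

```cpp
#include <cstddef>

// Hypothetical workload: counts usable entries. NULL entries are assumed to
// be the rare error case in the training scenarios.
int count_valid(const char* const* items, std::size_t n) {
    int valid = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (items[i] != NULL) {
            ++valid;  // common path in the assumed scenarios
        }
    }
    return valid;
}

// Build and training steps, assuming the code lives in app.cpp:
//   cl /c /O2 /GL app.cpp                  (object files)
//   link /LTCG:PGI app.obj /OUT:app.exe    (instrumented image)
//   app.exe                                (run scenarios, producing profile data)
//   link /LTCG:PGO app.obj /OUT:app.exe    (optimized image)
```

The quality of the result depends on the scenarios: if the training runs do not resemble real usage, the profile will steer the optimizer toward the wrong paths.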
9
PGO: Inlining Sample
► Profile guided optimization uses call graph path profiling.
[Figure: call graph with nodes foo, bat, bar, baz and a]
10
PGO: Inlining Sample (cont)
► Profile guided optimization uses call graph path profiling.
[Figure: call graph (foo, bat, bar, baz, a) with measured call counts on each path]
11
PGO – Inlining Sample (cont)
► Inlining decisions are made at each call site.
[Figure: call graph (foo, bat, bar, baz, a) with call counts per call site]
12
PGO – Switch Expansion
► Most frequent values are pulled out.
Before:
// 90% of the time i == 10
switch (i) {
    case 1: …
    case 2: …
    case 3: …
    default: …
}
After:
if (i == 10) goto default;   // pseudocode: jump straight to the default case
switch (i) {
    case 1: …
    case 2: …
    case 3: …
    default: …
}
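A compilable rendering of the same transformation, done by hand. The return values and the function names handle_original and handle_expanded are placeholders for the elided case bodies, and the hot value 10 is taken from the slide's example.

```cpp
// Original form: every value goes through the switch dispatch.
int handle_original(int i) {
    switch (i) {
        case 1:  return 100;
        case 2:  return 200;
        case 3:  return 300;
        default: return -1;
    }
}

// Expanded form: the value the profile reported as most frequent is tested
// first and goes directly to the work of the default case.
int handle_expanded(int i) {
    if (i == 10) {
        return -1;  // same result as the default case
    }
    switch (i) {
        case 1:  return 100;
        case 2:  return 200;
        case 3:  return 300;
        default: return -1;
    }
}
```

Both functions return the same result for every input; only the order of the tests changes.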
13
PGO – Code Separation
[Figure: control flow graph of basic blocks A, B, C and D with edge counts 100 and 10, shown in the default layout and in the optimized layout]
► Basic blocks are ordered so that the most frequent path falls through.
14
PGO – Virtual Call Speculation
class Base { … virtual void call(); };
class Foo : public Base { … void call(); };
class Bar : public Base { … void call(); };
Before:
void Func(Base *A) {
    …
    while (true) {
        …
        A->call();
        …
    }
}
After:
void Func(Base *A) {
    …
    while (true) {
        …
        if (type(A) == Foo) {   // pseudocode: check the dynamic type of A
            // inline of A->call();
        } else
            A->call();
        …
    }
}
► The profiles showed that the type of object A in function Func was almost always Foo.
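The speculation can be written out by hand as a runnable sketch. The bodies of call(), its int return type, and the functions run_virtual and run_speculated are assumptions made for illustration; typeid is used for the type guard here, while a real compiler would likely use a cheaper check.

```cpp
#include <typeinfo>

struct Base {
    virtual ~Base() {}
    virtual int call() { return 0; }
};
struct Foo : Base {
    int call() override { return 1; }
};
struct Bar : Base {
    int call() override { return 2; }
};

// Original form: every iteration makes an indirect virtual call.
// Assumes a is not null.
int run_virtual(Base* a, int iterations) {
    int total = 0;
    for (int i = 0; i < iterations; ++i) {
        total += a->call();
    }
    return total;
}

// Speculated form: guard on the dynamic type the profile saw most often.
// Assumes a is not null.
int run_speculated(Base* a, int iterations) {
    int total = 0;
    for (int i = 0; i < iterations; ++i) {
        if (typeid(*a) == typeid(Foo)) {
            total += 1;          // body of Foo::call() inlined by hand
        } else {
            total += a->call();  // fallback: regular virtual call
        }
    }
    return total;
}
```

The guard keeps the transformation correct for any type: objects that are not Foo still take the virtual call.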
15
PGO – Partial Inlining
[Figure: control flow graph: Basic Block 1, then Cond branching to Cold Code and Hot Code, then More Code]
16
PGO – Partial Inlining (cont)
[Figure: control flow graph: Basic Block 1, then Cond branching to Cold Code and Hot Code, then More Code]
► The hot path is inlined, but NOT the cold path.
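A hand-written illustration of the idea, assuming a hypothetical digit parser: parse_digit is the callee as the programmer wrote it, and sum_digits shows a caller after the hot half of the callee has been copied in while the cold half remains an out-of-line call. All names are invented for the sketch.

```cpp
#include <cstddef>

// Cold part of the callee, kept out of line (hypothetical error handling).
int handle_error_cold(int* error_count) {
    ++*error_count;
    return -1;
}

// Callee as written: a condition, a hot path and a cold path.
int parse_digit(char ch, int* error_count) {
    if (ch >= '0' && ch <= '9') {           // Cond
        return ch - '0';                    // hot code
    }
    return handle_error_cold(error_count);  // cold code
}

// Caller after partial inlining, done by hand: the condition and the hot
// path of parse_digit() are inlined, the cold path is still a call.
int sum_digits(const char* text, int* error_count) {
    int sum = 0;
    for (std::size_t i = 0; text[i] != '\0'; ++i) {
        char ch = text[i];
        if (ch >= '0' && ch <= '9') {
            sum += ch - '0';                 // inlined hot path
        } else {
            handle_error_cold(error_count);  // cold path stays out of line
        }
    }
    return sum;
}
```

Keeping the cold code out of the caller limits code growth, which is the usual cost of inlining.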
17
Demo
Optimizing applications with VC++ 2005
18
New Floating Point Model
► /Op made your code run slow
    No intermediate switch
► New floating point model
    /fp:fast
    /fp:precise (default)
    /fp:strict
    /fp:except
19
/fp:precise
► The default floating point switch
► Performance and precision
► IEEE conformant
► Rounds to the appropriate precision
    At assignments, casts and function calls
20
/fp:fast
► When performance matters most
► You know your application does simple floating point operations
► What can /fp:fast do?
    Association
    Distribution
    Factoring inverse
    Scalar reduction
    Copy propagation
    And others …
21
/fp:except
► Reliable floating point exceptions
► Thrown when expected, and not thrown when not expected
    Faults and traps, when reliable, should occur at the line that causes the exception
    FWAITs on x86 might be added
► Cannot be used with /fp:fast or in managed code
22
/fp:strict
► The strictest FP option
    Turns off contractions
    Assumes the floating point control word can change or that the user will examine flags
► /fp:except is implied
► Low double-digit percent slowdown versus /fp:fast
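A contraction replaces two operations, each with its own rounding, by one fused operation with a single rounding. The sketch below makes both forms explicit with std::fma from the standard library; the function names are hypothetical, and the volatile store is there only to force the intermediate rounding that a contracting compiler would skip.

```cpp
#include <cmath>

// Two roundings: the product is rounded to double, then the sum is rounded.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;  // forces the intermediate rounding
    return product + c;
}

// One rounding: the fused multiply-add a contraction would produce.
double contracted(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27 and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 as a double. mul_then_add therefore returns 0, while contracted returns -2^-54. Turning contractions off removes this source of variation between builds.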
23
What is the output?
#include <stdio.h>
int main() {
    double x, y, z;
    double sum;
    x = 1e20; y = -1e20; z = 10.0;
    sum = x + y + z;
    printf("sum=%f\n", sum);
    return 0;
}
/fp:fast /O2 = 0.000
/fp:strict /O2 = 10.0
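The two results correspond to two evaluation orders of the same sum. The sketch below spells them out; which order a given compiler picks under /fp:fast is an assumption, the point is only that reassociation is permitted there. The function names are hypothetical, and volatile is used to pin each order regardless of compiler settings.

```cpp
// Left-to-right evaluation, as the source is written: (x + y) + z.
double sum_as_written(double x, double y, double z) {
    volatile double t = x + y;  // volatile pins this evaluation order
    return t + z;
}

// One reassociation a fast model may choose: x + (y + z).
double sum_reassociated(double x, double y, double z) {
    volatile double t = y + z;
    return x + t;
}
```

With x = 1e20, y = -1e20 and z = 10.0, sum_as_written returns 10.0 because x + y is exactly 0. sum_reassociated returns 0.0 because adjacent doubles near 1e20 are 16384 apart, so adding 10.0 to -1e20 leaves it unchanged.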
24
OpenMP
► A specification for writing multithreaded programs
► It consists of a set of simple #pragmas and runtime routines
► Makes it very easy to parallelize loop-based code
► Helps with load balancing, synchronization, etc.
► In Visual Studio, only available in C++
25
OpenMP Parallelization
► Can parallelize loops and straight-line code
► Includes synchronization constructs
void test(int first, int last) {
    #pragma omp parallel for
    for (int i = first; i <= last; ++i) {
        a[i] = b[i] + c[i];
    }
}
[Figure: with first = 1 and last = 1000, the iterations are split into four ranges: 1 ≤ i ≤ 250, 251 ≤ i ≤ 500, 501 ≤ i ≤ 750, 751 ≤ i ≤ 1000]
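A self-contained version of the loop, assuming std::vector arguments in place of the slide's arrays a, b and c, whose declarations are not shown. The function name add_arrays is hypothetical. The pragma takes effect only when OpenMP is enabled (for example /openmp with Visual C++); otherwise it is ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// Adds b and c element-wise into a over the inclusive range [first, last].
// The caller must ensure all three vectors have more than `last` elements.
void add_arrays(std::vector<double>& a, const std::vector<double>& b,
                const std::vector<double>& c, int first, int last) {
    // Each iteration writes a distinct a[i] and reads only b[i] and c[i],
    // so iterations are independent and can be divided among threads.
    #pragma omp parallel for
    for (int i = first; i <= last; ++i) {
        a[i] = b[i] + c[i];
    }
}
```

No synchronization is needed here because no two iterations touch the same element. A loop that accumulated into a shared variable would need one of the synchronization constructs the slide mentions.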
26
64-bit Compiler in VC2005
► 64-bit compiler, cross tools
    The compiler is a 32-bit binary but the resulting image is 64-bit
► 64-bit compiler, native tools
    The compiler and the resulting image are both 64-bit binaries.
► All previous optimizations apply to 64-bit as well.
27
Resources
► Visual C++ Dev Center: http://msdn.microsoft.com/visualc
    This is the place to go for all our news and whitepapers
    Also VC2005-specific forums at http://forums.microsoft.com
► Myself: http://blogs.msdn.com/aymans