More HP Caliper, an Update on the Itanium® HP-UX and Linux Performance Measurement Tool
Tuesday, September 21, 2004
Speaker: Dave Babcock, Caliper Development Team, Hewlett-Packard

Preface
HP Caliper: New Performance Analysis Tool
An introduction to HP Caliper, what it is, and how to use it.
Eric Gouriou, Caliper Development Team
Webcast: September 9, 2003
Replay:,,498!0!,00.html

Agenda
- Overview of HP Caliper
- New Features in Caliper 3.6
- Caliper on Itanium Linux
- Hints and Tips
- Summary
- DSPP Information
- Q & A

What is HP Caliper?
- Execution [performance] measurement tool
- Itanium, Itanium 2 native code
- HP-UX, Linux
- "Swiss Army Knife"
  - Many different measurements
  - Common user interface and options
  - Multiple report formats: text, csv, html, visualizer
- Per-process (user + kernel), system-wide
- Uses Performance Monitor Unit (PMU) hardware and dynamic instrumentation as needed
- Measures any application

Measured Applications
- C, C++, Fortran9x, assembly
- 32-bit, 64-bit executable
- Single-process, multi-process
- Single-thread, multi-threaded
- Optimized, un-optimized, debug, stripped
- Main, static/dynamic shared libraries
- Entire run, partial run, server/daemon
- PA-RISC binaries "tracked"
- "Just run it"

Making Measurements
    caliper [options] application ….
- or -
    caliper [options] --attach=
e.g.:
    caliper cgprof sweep3d
    caliper dcache_miss -o report --attach=1234
    caliper func_cover -o report --process=all \
        cc -o hello +O hello.c
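The two invocation forms above can be sketched in Python as argument-list builders. This is a minimal sketch, not part of Caliper: the helper names `caliper_launch` and `caliper_attach` are hypothetical, and the only options used are ones that appear on this slide (`-o`, `--attach=`). Nothing is executed; the commands are only assembled and printed.

```python
import shlex


def caliper_launch(measurement, app_argv, options=()):
    """Build 'caliper <measurement> [options] application ...' (hypothetical helper)."""
    return ["caliper", measurement, *options, *app_argv]


def caliper_attach(measurement, pid, options=()):
    """Build 'caliper <measurement> [options] --attach=<pid>' (hypothetical helper)."""
    return ["caliper", measurement, *options, f"--attach={pid}"]


# Reproduce the first two examples from the slide.
launch = caliper_launch("cgprof", ["sweep3d"])
attach = caliper_attach("dcache_miss", 1234, ["-o", "report"])

print(shlex.join(launch))
print(shlex.join(attach))
```

Building an argument list instead of a shell string leaves quoting to `shlex.join`, and the same list could be handed to `subprocess.run` on a system where Caliper is installed.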

Measurements
- Overview: cpu_metrics, total_cpu
- Profiles: alat_miss, branch_prediction, dcache_miss, dtlb_miss, fprof, icache_miss, itlb_miss
- Traces: pmu_trace
- Coverage: func_cover*
- Counts: arc_count*, func_count*
- Call graph: cgprof*
* not in Linux beta
Used for: What? Where? Details? (instrumented)

Caliper Text Report
    caliper fprof --out=report my_app
- Run info
- Total run metrics
- Metrics per load module
- Metrics per function

Caliper Text Report (cont'd) Function details

Caliper CSV Report
    caliper fprof --csv=report my_app
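A CSV report lends itself to post-processing in a script. The sketch below ranks rows by a numeric column using Python's standard `csv` module. The column names `function` and `percent` and the sample rows are invented purely for illustration; this deck does not show the actual layout of a Caliper CSV report, so the column names would need to be adapted to the real file.

```python
import csv
import io


def top_rows(csv_text, column, n=3):
    """Return the n rows with the largest numeric value in `column`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: float(row[column]), reverse=True)
    return rows[:n]


# Hypothetical sample data, NOT real Caliper output.
SAMPLE = """function,percent
main,12.5
solve,61.0
init,3.5
"""

hottest = top_rows(SAMPLE, "percent", n=2)
print([row["function"] for row in hottest])
```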

Caliper HTML Report
    caliper fprof --html=report my_app

New Feature – cpu_metrics
- Measures and reports a group of related CPU metrics
  - total_cpu = 4 CPU metrics (max) per run
  - cpu_metrics = 1 complete CPU metric group per run
- Groups (metric sets):
  - cycles per instruction, instruction dispersal, stalls
  - L1 data, L1 instruction, L2, and L3 caches
  - branch path, branch prediction
  - cache coherence, tlb misses
  - control speculation, data speculation
  - cpu bus, system bus, queues
- Requires HP-UX 11i V2 (B.11.23.0409) or later

cpu_metrics Examples
    caliper cpu_metrics --metrics=stall my_app
    caliper cpu_metrics --metrics=cpi my_app
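The `cpi` group name refers to cycles per instruction. As a reminder of the arithmetic behind that metric, here is a small worked example; the counter values are hypothetical and are not taken from any Caliper report.

```python
def cycles_per_instruction(cpu_cycles, instructions_retired):
    """CPI = CPU cycles / instructions retired. Lower is better."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return cpu_cycles / instructions_retired


# Hypothetical counts for one run of an application.
cpi = cycles_per_instruction(cpu_cycles=3_000_000, instructions_retired=2_000_000)
print(f"CPI = {cpi:.2f}")
```

A CPI that rises between two builds of the same code is a hint to look at the stall-related groups next.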

New Feature – func_cover
- Measures and reports function execution (coverage)
- Reports in text and csv formats
- Coverage data from multiple runs can be aggregated into a single report
- Multiple levels of reports:
  - load module summary
  - source directory summary
  - source file summary
  - function level detail
- HP-UX only

func_cover Example
    caliper func_cover my_app

New Feature – Cell Local Memory
- Measures and reports Cell Local Memory (CLM) usage
- Can be requested in addition to any measurement
- Specified with --memory-usage=exit option
- Reports at application exit
  - Total system memory usage by cell
  - Application memory usage by cell
- Works with multi-cell (ccNUMA) and SMP systems
- Runs on HP-UX 11i V2 or later

Cell Local Memory Example
    caliper dcache_miss --memory-usage=exit my_app

New Feature – System-wide Kernel Profiling
- Measures and reports kernel performance profiles
- Can use any overview or profile measurement
  - Overview metrics of kernel and/or user code
  - Profilers can only measure kernel code
- Can measure DLKMs
- All CPUs are measured and results aggregated
- Must have root privileges
- No other Caliper runs at the same time
- Requires HP-UX 11i V2 (B.11.23.0409) or later

System-wide Kernel Profiling Example
    caliper fprof --scope=kernel --duration=10

New Feature – Datafile Enhancements
- Save performance data for later analysis & [re-]reporting
- Datafiles can contain performance data for:
  - single or multiple runs
  - single or multiple processes
  - single or multiple threads
- New --join option can:
  - merge different types of performance data onto one report
  - aggregate similar performance data into one set for reporting

New Feature – Graphical Data Visualizer
- Interactive tool to graphically explore performance data
- Works with one or more pre-collected datafiles
- Full GUI supports:
  - easy selection of performance data
  - one or more metrics viewed together in 2D graphs
  - selectable sorts
  - click-down from application, through shared libraries and functions, to source statements and assembly code
- Java client can be run on local or remote system
- Server runs on Itanium collection system
- Preview release: we'd like your feedback

Graphical Data Visualizer Example

New Features – Miscellaneous
- Support for new system & compiler features
  - Adaptive Address Space
  - New HP C/C++ compiler
  - Unwind Express
  - Enhanced feedback-based optimization
  - New Itanium processors
- Other new Caliper features
  - Attach to multiple processes
  - Greater control of measurements & reports

Caliper on Itanium Linux
- Same features as HP-UX version except:
  - instrumented measurements
  - OS feature differences
- Linux 2.6.x kernel (or later)
  - SUSE Linux Enterprise Server 9
  - Red Hat Enterprise Linux 4.0
- Perfmon 2.0 subsystem (or later)
- Beta version available now
  - Available through your technical support contact
- Send product feedback (

Hints and Tips
- Use overview metrics to determine what the performance problem is, and then profilers to determine where
- Use the overview metric total value to gauge the sampling rate for profile measurements
- Save measurement runs in datafiles for re-reporting
- Make source code available for source-correlated reports
- Create custom personal/site configuration files for common measurements
- Use caliper info to determine CPU metrics to use
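One way to read the sampling-rate hint: if an overview run reports the total count of an event, dividing that total by the number of samples you want gives a rough sampling period (events per sample) for the follow-up profile run. The sketch below shows only that arithmetic. The function name, the example total, and the target of 10,000 samples are assumptions for illustration; the deck does not say how Caliper itself chooses or accepts a sampling rate.

```python
def sampling_period(total_events, target_samples):
    """Approximate events-per-sample so a run yields about target_samples samples."""
    if total_events <= 0 or target_samples <= 0:
        raise ValueError("both arguments must be positive")
    # Never go below one event per sample.
    return max(1, total_events // target_samples)


# Hypothetical: an overview run counted 50 million events in total.
period = sampling_period(total_events=50_000_000, target_samples=10_000)
print(period)
```

Too small a period inflates measurement overhead, while too large a period leaves short-running functions with too few samples to be meaningful.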

Hints and Tips (cont'd)
Extensive shared library usage:
    --module-include/exclude=…    measure only some libraries
Multi-threaded applications:
    --thread=sum-all    for aggregated report
    --thread=all        for per-thread reports
Multi-process applications:
    --process=all       measure all processes
    --process=some:…    measure only some processes
    --process=          custom selection of processes

Process Tree Example
    caliper fprof --process=cg.B.4 mpirun -np 4 cg.B.4

Summary
- Itanium execution performance tool
- Measures production applications
- Wide range of performance metrics available
- Explore performance data through various reports
- Available on HP-UX and Linux (beta)

DSPP Tools & Resources for Itanium® 2 Architecture Set You Up for Success
- Software: development environments, compilers, operating systems, installation/configuration tools, performance tools and more
- Technical documentation: white papers, tutorials, reference documents and manuals, FAQs, known problems, sample code, etc.
- Training and Education: online and classroom training

More DSPP Tools & Resources
- Community: Itanium® architecture forums, source code repository, document sharing and mailing lists
- Equipment: rentals and purchase discounts
- Partner Resources
- News & Events

Where to go …
Start with HP's web site for Itanium® Architecture DSPP partners:
Contact points for additional information, general support, equipment, localization resources and more:
- Americas telephone 1.800.249.3294
- Europe telephone 800.100.929.70
- Asia-Pac or go to for local country

HP & Intel Webcast Series Promotion
HP & Intel are giving away an iPAQ, digital camera, or all-in-one to 1 (one) lucky winner!!
Promotion Period: 9 am PST September 21, 2004 through 12 am PST October 23, 2004
How to become eligible:
1. Attend the September 21 webcast titled More Caliper, An Update on the Itanium® HP-UX and Linux Performance Measurement Tool AND complete the post-event survey. OR
2. View the replay of the webcast AND complete the post-event survey. OR
3. Complete and mail a 3 x 5 card with your name, employer name and address, phone number, email address to: HP/Intel Webcast Series, Attn: R.Keyler, Suite 100, 510 E. 96th Street, Indianapolis, Indiana 46240
Full promotion details can be found on DSPP at:,1252,6543,00.html

Intel® Early Access Program Benefits
- Delivers the resources your company needs to develop and market cutting-edge software solutions that run best on Intel's latest processors.
- One company-level membership for all your developers and marketers.

Technology
The Early Access Program (EAP) gives you access to Intel® technology to support your current development cycle as well as early access to tools and information on new technologies. Your membership includes:
- Early access to pre-release software development platforms
- Access to Intel and 3rd party software and testing tools
- Training through Intel® Software College and Web events
- Technical content and how-to articles
- Protected remote access to easily evaluate and develop software safely and securely on platforms over the Internet

Marketing Opportunities and Support
Extensive marketing and business development opportunities:
- Inclusion in online and print versions of the Intel® Developer Solutions Catalog
- Intel quotes to support your PR
- Case studies
- Access to Intel's event marketing asset kit
- Participation in selected industry events and trade shows
Support in your development efforts provided through:
- A dedicated Intel Account Representative who acts as your primary contact
- Intel® Premier Support for confidential technical support
- 24/7 online support via

Related Intel® Resources
- Intel® Early Access Program Homepage
- Intel® Developer Services Homepage
- Intel® Software College
- Intel® Software Development Tools
- Intel® Remote Access Forum
- Experience Intel® Itanium® 2 Architecture

Experience Intel® Itanium® 2 Microarchitecture
Shared access systems available for all Intel® Developer Services members
- Upload, execute, and test ported 64-bit server applications
- Evaluate and test compatibility with the latest versions of middleware and database apps
- All without setting up a single server!
Simply go to
- Follow instructions on this page to select the Test Drive configuration and establish your account

Questions & Answers

Intel, the Intel logo and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Portions © 1998-2004 Intel Corporation | Portions © 1998-2004 Hewlett-Packard Corporation

