Summertime Fun
Everyone loves performance
Shirley Browne, George Ho, Jeff Horner, Kevin London, Philip Mucci, John Thurman

Presentation transcript:

Projects
PerfAPI
Cache Simulator
DOD Performance Optimization
Rice/PET Collaboration
Benchmarking
HPD implementation
Graduation

PerfAPI
Research gathered on the following platforms
–Sun Ultra
–Pentium Pro/II
–IBM Power Series
–MIPS R10000
–DEC Alpha, Cray T3E
–Cray T90/SV1

PerfAPI
Standard API to access hardware performance counters
Standard set of definitions for performance metrics
Resulting in:
–Data for performance tool developers
–Data for tuning and evaluating applications
–Portable performance tools for every major platform
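The idea of a standard set of metric definitions layered over platform-specific counters can be pictured as a lookup table. The following is a minimal sketch in Python, not the PerfAPI draft itself: the metric names, platform keys, and native event names are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical sketch: portable metric names resolved to per-platform native events.
# None of these identifiers come from the PerfAPI draft; they are placeholders.

STANDARD_METRICS = {
    "L1_DATA_MISSES": "Level 1 data cache misses",
    "FP_OPS": "Floating point operations",
    "TOTAL_CYCLES": "Total cycles",
}

# Per-platform table: standard metric name -> native event name (placeholders).
NATIVE_EVENTS = {
    "platform_a": {
        "L1_DATA_MISSES": "native_dcache_miss",
        "FP_OPS": "native_fp_retired",
        "TOTAL_CYCLES": "native_cycles",
    },
    "platform_b": {
        "L1_DATA_MISSES": "native_l1d_miss",
        "TOTAL_CYCLES": "native_clk",
        # FP_OPS is deliberately absent: not every counter exists on every chip.
    },
}


def resolve(platform, metric):
    """Return the native event for a standard metric, or None if unsupported."""
    if metric not in STANDARD_METRICS:
        raise KeyError(f"unknown standard metric: {metric}")
    return NATIVE_EVENTS[platform].get(metric)


def supported_metrics(platform):
    """List the standard metrics that a platform can actually count."""
    return sorted(m for m in STANDARD_METRICS if m in NATIVE_EVENTS[platform])
```

A tool written against the standard names would call `resolve` once per platform and degrade gracefully when it returns `None`, which is one way portable tools could cope with counters that are missing on a given processor.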

PerfAPI
Research on the user’s needs through
Mailing list
Web page
Collaboration with existing researchers
Vendors (Sun, Cray, Digital, SGI)
SPDT98/SC98 poster

PerfAPI - Coming this fall
Draft API by 8/31 released to mailing list
9/30 Revisions incorporated into a tech report
10/31 Implementations for MIPS and Ultra
11/30 Implementations for IBM and Intel
12/31 Implementations for Alpha
12/31 Portable hardware-counter-based prof

Cache Simulator
Motivated by the need for information correlated with the source code and run-time reference patterns
Redesign into object-oriented structure and raw output format
Statistical reduction techniques
GUI design
Parser design

Cache Simulator
GUI written in Java
Parsers written using Octave from Edinburgh. (?)
Tool will allow browsing and instrumentation of source
Reporting will be done with Perl scripts
Money from Sandia pending
Conflict matrix adopted by IBM/Watson
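The slides do not define the conflict matrix, so the following is one plausible reading, sketched in Python: a direct-mapped cache simulator that, on every eviction, records which source-level object displaced which. The function name, the trace format of `(label, address)` pairs, and the default cache geometry are assumptions made for this illustration and do not describe the actual tool.

```python
from collections import defaultdict


def simulate(trace, num_sets=4, line_size=16):
    """Replay (label, address) references through a direct-mapped cache.

    Returns (misses, conflicts), where misses[label] counts misses charged to
    that label and conflicts[(evictor, victim)] counts how often a reference
    by `evictor` displaced a line last loaded by `victim`.
    """
    sets = [None] * num_sets          # each slot holds (tag, label) or None
    misses = defaultdict(int)
    conflicts = defaultdict(int)
    for label, addr in trace:
        line = addr // line_size      # cache-line number of this address
        index = line % num_sets       # which set the line maps to
        tag = line // num_sets        # identifies the line within that set
        resident = sets[index]
        if resident is not None and resident[0] == tag:
            continue                  # hit: nothing to record
        misses[label] += 1
        if resident is not None:
            conflicts[(label, resident[1])] += 1
        sets[index] = (tag, label)
    return dict(misses), dict(conflicts)


# Two arrays whose first elements map to the same set (64 bytes apart here).
trace = [("a", 0), ("b", 64), ("a", 0), ("b", 64)]
misses, conflicts = simulate(trace)
```

In this small trace every reference misses, because `a` and `b` keep evicting each other. Keying the matrix by array name is what ties the result back to the source code, which is the kind of correlation the simulator is motivated by.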

DOD Performance Optimization
Optimization Tutorial and Poster at User’s Group Meeting
Visit to ASC to help scalability of Cobalt
Putting together a performance team with a suite of in-house tools.
Meeting at Rice to direct a possible collaboration with Rice on run-time data collection for optimization.

DOD Performance Optimization
Upcoming tutorial at ARL
2 day + 1 day workshop
Speaking at annual ARL UGM
Work with Cobalt, MAGI, GAMESS, HELIX
Developers’ reluctance!
Lack of tools!
PET lead?

DOE Benchmarking
Attended Weather/Climate modeling conference in June.
Need for standardized benchmarks
Complete lack of understanding of performance
Opportunity for UT involvement
Meeting in September at NCAR
Virgin territory?

Benchmarking
Modifications to 3 benchmarks plus a new one.
–MPBench
–BLASBench
–CacheBench
–ClockBench
Standardized options and graph generation
Integration into a Low-Level suite

Benchmarking MSRC Machines
Completed benchmark runs for BLASBench, CacheBench and MPBench on all the MSRC platforms.
In the process of finishing dedicated runs of the ParkBench suite.
Made graphs of the results and added them to the BenchRib repository.

MPBench
–Add alltoall test
–Dynamic memory allocation
–NUMA measurement
–Cache flushing code

Benchmarking
BLASBench
–Addition of solvers
CacheBench
–Latency benchmark
ClockBench
–Timer accuracy
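The slides do not describe how ClockBench measures timer accuracy. A common approach, shown here as a hedged Python sketch rather than the benchmark's own method, is to read a clock repeatedly and record the smallest nonzero step it ever reports. The function name `timer_resolution` and its parameters are assumptions for this example; `time.perf_counter_ns` is the standard-library clock used by default.

```python
import time


def timer_resolution(clock=time.perf_counter_ns, samples=1000):
    """Return the smallest nonzero step seen between successive clock readings.

    The result is an upper bound on the clock's granularity: it also includes
    the cost of calling the clock, so a very fine timer will report its call
    overhead instead.
    """
    smallest = None
    for _ in range(samples):
        start = clock()
        now = clock()
        while now == start:           # spin until the clock visibly advances
            now = clock()
        delta = now - start
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest


if __name__ == "__main__":
    print("observed resolution (ns):", timer_resolution())
```

Passing a different `clock` callable makes it possible to compare several timers on the same machine, or to substitute a deterministic fake clock when testing the measurement loop itself.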

HPD Implementation
Elimination of Dolphin as a solution provider
Received front end parser
Building distributed Perl framework
DBX/GDB as backends