Performance Tuning ECEN5043 Software Engineering of Multi-Program Systems University of Colorado.


June 7, 2002, ECEN5043, University of Colorado, Performance Tuning

Reference
- Connie U. Smith and Lloyd G. Williams, Performance Solutions, Addison-Wesley, Object Technology Series, ISBN

FOCUS
- Fixing what has already been implemented
  - Harder to get large impact at local code level
  - Harder to fix what is really a design problem
- Final touches on a performance-oriented design
- May be needed for positive reasons
  - Actual usage requires different choices
  - New, unanticipated performance requirements

Realistic Achievement
- Benefit of tuning
  - May obtain significant performance improvement
  - Would have been even better if designed in
- Cost of implementing tuning tricks
  - Code typically less reusable
  - Harder to maintain
- Biggest benefit will occur in the code most frequently executed -- limit tuning to that code

Tuning, Not Fundamental Change
- Tuning aims to perform the same or an equivalent function better
  - Not changing what is accomplished
  - Changing how it is done
- Nature of the process provided in this part of the book
  - Assumes changes focus on highest immediate payoff
  - Software developers have limited time available
- Can also be used to attempt to reach new performance objectives imposed (discovered) after installation*

Tuning Overview
- Prepare test plans
  - Probably have reason to believe tuning is needed
  - What is being measured (what kind of performance is the issue?)
  - What kind of data is needed to know if objectives are met?
  - How will the data be gathered (measurement tools & procedures)?
- Execute the plans; analyze the results
- Consider improving environment instead of code
- Determine heaviest users of bottleneck points

Tuning Overview (cont.)
- Profile the heavy hitters to isolate hot spots inside those processes
  - This is hard, detailed, specific
  - Need operational loading conditions
  - Emphasis on getting useful data -- some measurement attempts may alter performance -- think
- Identify remedies; quantify improvement vs. impact
- Select and Do.
(Note: overall approach is a basic Plan-Do-Check-Act cycle, typical for improvement efforts.)
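
The profiling step on this slide can be sketched with Python's standard cProfile and pstats modules. This is a minimal illustration, not the course's tooling: the functions hot_spot, cold_path, and workload are hypothetical stand-ins for a real heavy hitter. Note that the profiler itself adds overhead, which is one instance of the slide's warning that measurement may alter performance.

```python
import cProfile
import io
import pstats


def hot_spot(n):
    # Hypothetical expensive routine standing in for a real hot spot.
    return sum(i * i for i in range(n))


def cold_path(n):
    # Hypothetical cheap routine that should not dominate the profile.
    return n + 1


def workload():
    # Hypothetical workload in which one call dominates the run time.
    total = 0
    for _ in range(50):
        total += hot_spot(2000)
        total += cold_path(1)
    return total


def profile_workload():
    """Run the workload under the profiler and return a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    workload()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the heaviest call chains appear first.
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()


report = profile_workload()
print(report)
```

The report lists call counts and cumulative times per function; in a real tuning effort the workload would be driven by operational loading conditions rather than a fixed loop.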

Generic Performance Solutions
- Useful whether OO or not
- Applicable to detailed design
- Applicable for initial coding style, not just tuning after the fact

Dominant Workload Functions
- Centering Principle and Fast Path -- focus on code executed in the dominant workload functions
- Profile* to identify operations called most frequently
  - Tells which operations and in what order
- Order evaluation of conditions by likely order of occurrence so the most likely hit occurs most quickly
- Optimize algorithms & data structures for the typical case
- Minimize object creation and unnecessary context switches; consider adjusting data structure format*
  - Extreme suggestions for... extreme situations. Only.
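
The advice to order conditions by likelihood can be made concrete by counting how many tests a linear if/elif chain performs. The sketch below assumes a hypothetical request mix (90 reads, 8 writes, 2 admin calls); the names SAMPLE, count_tests, and order_by_frequency are illustrative only.

```python
from collections import Counter

# Hypothetical request mix: "read" dominates the workload.
SAMPLE = ["read"] * 90 + ["write"] * 8 + ["admin"] * 2


def count_tests(order, requests):
    """Count the equality tests a linear if/elif chain in `order` performs."""
    tests = 0
    for request in requests:
        for kind in order:
            tests += 1
            if request == kind:
                break
    return tests


def order_by_frequency(requests):
    """Order request kinds from most to least frequent, as a profile would suggest."""
    return [kind for kind, _ in Counter(requests).most_common()]


naive_order = ["admin", "write", "read"]
tuned_order = order_by_frequency(SAMPLE)

print(count_tests(naive_order, SAMPLE))  # 288 tests
print(count_tests(tuned_order, SAMPLE))  # 112 tests
```

With this mix, checking the rare case first costs 288 tests and checking the common case first costs 112, for identical results. The gain depends entirely on how skewed the real workload is, which is why the slide puts profiling first.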

Improve Scalability
- If CPU is the bottleneck, add processors?
  - THAT requires an initial design in which processing steps can be performed in parallel
  - Very hard to retrofit
- Keep the monolith for totally independent copies of software run concurrently
- Increase throughput of individual processes by upgrading processors
- Parallel design allows for concurrent threads
  - Decide optimal number of tasks & whether lightweight (threads) or heavyweight (processes)

Thread and Process Performance
- Lightweight threads
  - Initiated dynamically during execution
  - Priority depends on priority of parent process
- Heavyweight processes
  - Initiated at software startup
  - Each has its own priority
- May need to fine-tune number of concurrent tasks and how they share and exchange information
  - More processes/threads = more context switches --> overhead
  - May need fewer threads to improve throughput
  - Communication protocol -- distributed systems
  - Concurrent tasks --> accessing shared data... and all that entails...
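
One way to make the number of concurrent tasks tunable is to run work through a bounded pool and treat the worker count as a parameter to measure. The sketch below uses Python's standard ThreadPoolExecutor; handle_request and run_with_workers are hypothetical names, and the task body is a placeholder for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id):
    # Hypothetical unit of work; a real task would do I/O or computation.
    return request_id * 2


def run_with_workers(requests, worker_count):
    """Process all requests with at most `worker_count` concurrent threads."""
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        # map() preserves input order regardless of which thread finishes first.
        return list(pool.map(handle_request, requests))


results = run_with_workers(range(10), worker_count=4)
print(results)
```

Because the worker count is a single argument, the same workload can be timed at several settings to find the point where extra threads stop helping and context-switch overhead starts to dominate.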

Thread & Process Performance (cont.)
- Have already studied concurrent-process/shared-resource issues -- nuances of (in)correct algorithms
- Note that the book's term "spin locks" applies to what we called short critical sections (resource likely to be available soon) that engage in busy-waiting
- Careful thought needed to implement appropriate P & V code solutions for long critical sections*
- If you use a language-defined mechanism, make sure you understand how it really works
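
As a reminder of what P and V look like in a language-provided form, the sketch below guards a shared counter with Python's threading.Semaphore, whose acquire() blocks the caller rather than busy-waiting. The SharedCounter class and run function are hypothetical; this illustrates the blocking style suited to longer critical sections, not a spin lock.

```python
import threading


class SharedCounter:
    """Counter whose update is a critical section guarded by a binary semaphore."""

    def __init__(self):
        self._mutex = threading.Semaphore(1)
        self.value = 0

    def increment(self):
        self._mutex.acquire()  # P: block until the resource is free
        try:
            self.value += 1  # critical section
        finally:
            self._mutex.release()  # V: signal that the resource is free


def run(thread_count, increments_per_thread):
    """Start the threads, wait for all of them, and return the final count."""
    counter = SharedCounter()

    def worker():
        for _ in range(increments_per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(run(4, 1000))
```

The try/finally ensures V is executed even if the critical section raises, which is the kind of detail the slide means by understanding how a language-defined mechanism really works.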

Other Choices
- Algorithms -- books are available that discuss performance characteristics
- Data structure choices -- influence the speed with which certain tasks, such as searches, can be performed
- Hash collections
  - Initial size big enough
  - Prime number vs. power of two
  - Verify that the default hash code method distributes objects evenly over the collection's buckets
  - Integer keys (base types)
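
The check that a hash function spreads keys evenly can be done by building a histogram of bucket occupancy. The sketch below is an assumption-laden illustration: it uses a hypothetical identity hash on integer keys that are all multiples of 8, and compares a power-of-two bucket count with a prime one.

```python
from collections import Counter


def identity_hash(key):
    # Hypothetical hash code method: an integer key hashes to itself.
    return key


def bucket_histogram(keys, bucket_count, hash_fn):
    """Return how many keys land in each bucket under `hash_fn(key) % bucket_count`."""
    counts = Counter(hash_fn(key) % bucket_count for key in keys)
    return [counts.get(bucket, 0) for bucket in range(bucket_count)]


# Hypothetical key set with a regular stride, e.g. aligned addresses or IDs.
keys = list(range(0, 800, 8))

power_of_two = bucket_histogram(keys, 8, identity_hash)
prime = bucket_histogram(keys, 7, identity_hash)

print(power_of_two)
print(prime)
```

With 8 buckets every key collides in bucket 0, while 7 buckets give a nearly uniform spread. This is the situation the prime-vs-power-of-two bullet warns about; a hash function that mixes bits well makes the bucket count matter much less.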

Time vs. Space
- Algorithms for efficient code
  - May require more data storage to allow the code to be more efficient
- Data structures for efficient use of memory
  - May require more code to access them
- Trading off space vs. time is a classic performance choice -- book gives 6 rules on this topic
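
The memory-efficient side of this tradeoff can be illustrated with a bit set: packing eight flags into each byte saves space, but every access needs extra shift-and-mask code. The BitSet class below is a hypothetical sketch, not one of the book's rules.

```python
class BitSet:
    """Stores `size` boolean flags in one bit each instead of one object each."""

    def __init__(self, size):
        self._size = size
        self._bits = bytearray((size + 7) // 8)  # round up to whole bytes

    def _check(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)

    def set(self, index):
        self._check(index)
        # index >> 3 selects the byte, index & 7 selects the bit inside it.
        self._bits[index >> 3] |= 1 << (index & 7)

    def test(self, index):
        self._check(index)
        return bool(self._bits[index >> 3] & (1 << (index & 7)))

    def storage_bytes(self):
        return len(self._bits)


flags = BitSet(1000)
flags.set(3)
flags.set(999)
print(flags.test(3), flags.test(4), flags.storage_bytes())
```

A thousand flags fit in 125 bytes, at the cost of arithmetic on every read and write that a plain list of booleans would not need.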

Performance Solutions for OO
- Some performance problems are specific to OO software characteristics
  - Some are OO language independent
  - Some are OO language specific
- Code optimization level
  - Limited performance payback
  - Code may be harder to understand or modify
  - Recommend: use as last resort on the Fast Path and nowhere else
  - May be necessary

Language-independent solutions
- Reduce unnecessary object creation & destruction
  - Creation of an object requires creation of all those objects nested within
  - Creation of an object in an inheritance hierarchy requires creation of all ancestors -- memory to hold them allocated
  - Destruction of an object requires destruction of all the above and the reclamation of the memory
  - Biggest impact from most complex objects
- Heap vs. stack
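
One common way to reduce object creation and destruction is to reuse objects through a pool. The sketch below is a hypothetical example, assuming fixed-size buffers that each request borrows and returns; BufferPool and process_requests are illustrative names.

```python
class BufferPool:
    """Hands out reusable bytearray buffers instead of allocating one per request."""

    def __init__(self, buffer_size):
        self._zeros = bytes(buffer_size)  # template used to reset a buffer
        self._free = []
        self.created = 0  # how many buffers were actually allocated

    def acquire(self):
        if self._free:
            return self._free.pop()
        self.created += 1
        return bytearray(self._zeros)

    def release(self, buffer):
        buffer[:] = self._zeros  # clear old contents before reuse
        self._free.append(buffer)


def process_requests(pool, payloads):
    """Hypothetical request loop: each request borrows a buffer and returns it."""
    lengths = []
    for payload in payloads:
        buffer = pool.acquire()
        buffer[:len(payload)] = payload
        lengths.append(len(payload))
        pool.release(buffer)
    return lengths


pool = BufferPool(16)
print(process_requests(pool, [b"abc", b"defgh", b"ij"]), pool.created)
```

Three requests are served with a single allocation. Pooling pays off mainly for objects that are complex or expensive to build, which matches the slide's point that the biggest impact comes from the most complex objects.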

Is it really a good thing?
- Encapsulation requires method calls to access data values
  - Method calls involve stack usage
  - Expensive overhead for what may be just an assignment
- Does your compiler do an inline optimization?
  - If called from several places, inlining at each place may increase overall code size (another space vs. time tradeoff)
  - C++ lets programmer decide when
- Need to understand language implementations to avoid some and take advantage of other performance characteristics

Now Consider Performance Improvement Suggestions in Multi-Program Context
- Suggestions re inlining?
- Suggestions re modifying data structures?
- Multithreading
  - Synchronization issues?
  - Language choice issues?
- Optimizing serialization?
- God class and other method-call inefficiencies?
- Batching, Coupling and other method-call and data access efficiencies?
- Customizing to underlying platform?

Are these contradictory?
- Store Precomputed Results
  - Compute the results of expensive functions once, store the results, and satisfy subsequent requests with a table lookup
- Postpone evaluation until an item is needed
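
The two suggestions can coexist: one avoids repeating work, the other avoids work that is never requested. The sketch below shows both with Python's standard functools helpers (cached_property needs Python 3.8 or later); the names expensive, CALLS, and Report are hypothetical.

```python
from functools import cached_property, lru_cache

# Counts real evaluations so the effect of caching is observable.
CALLS = {"expensive": 0}


@lru_cache(maxsize=None)
def expensive(n):
    """Store precomputed results: each distinct `n` is computed only once."""
    CALLS["expensive"] += 1
    return sum(i * i for i in range(n))


class Report:
    """Postpone evaluation: the summary is built only if someone asks for it."""

    def __init__(self, rows):
        self._rows = rows
        self.summaries_built = 0

    @cached_property
    def summary(self):
        self.summaries_built += 1
        return sum(self._rows)


first = expensive(1000)
second = expensive(1000)  # served from the cache, no recomputation
report = Report([1, 2, 3])
print(first == second, CALLS["expensive"], report.summaries_built)
print(report.summary, report.summaries_built)
```

Precomputation suits results that will certainly be needed many times; postponed evaluation suits results that may never be needed. Here the cached property combines them: nothing is computed until first access, and nothing is recomputed afterwards.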

Second verse, same as the first
- Many performance tuning suggestions are localized variations of earlier design-level principles and patterns
  - For example, Fixing-Point Principle, Locality Principle, Processing vs. Frequency Principle
- Even if they were applied early in design, it may be necessary to apply them repeatedly at lower levels of abstraction

Little dogs have little fleas
upon their backs to bite 'em.
And little fleas have lesser fleas
And so ad infinitum.