Writing High Performance Delphi Applications

Slides:



Advertisements
Similar presentations
Recursion Chapter 14. Overview Base case and general case of recursion. A recursion is a method that calls itself. That simplifies the problem. The simpler.
Advertisements

Fundamentals of Python: From First Programs Through Data Structures
Algorithm Analysis (Big O) CS-341 Dick Steflik. Complexity In examining algorithm efficiency we must understand the idea of complexity –Space complexity.
Review for Test 2 i206 Fall 2010 John Chuang. 2 Topics  Operating System and Memory Hierarchy  Algorithm analysis and Big-O Notation  Data structures.
Tirgul 8 Universal Hashing Remarks on Programming Exercise 1 Solution to question 2 in theoretical homework 2.
Elementary Data Structures and Algorithms
Sorting Algorithms and Analysis Robert Duncan. Refresher on Big-O  O(2^N)Exponential  O(N^2)Quadratic  O(N log N)Linear/Log  O(N)Linear  O(log N)Log.
CPU PROFILING FIND THE BOTTLENECK. WHAT? WHEN? HOW?
Pleasures and Pitfalls of Profiling Primož Gabrijelčič.
SIGCSE Tradeoffs, intuition analysis, understanding big-Oh aka O-notation Owen Astrachan
Chocolate Bar! luqili. Milestone 3 Speed 11% of final mark 7%: path quality and speed –Some cleverness required for full marks –Implement some A* techniques.
CIS 068 Welcome to CIS 068 ! Lesson 9: Sorting. CIS 068 Overview Algorithmic Description and Analysis of Selection Sort Bubble Sort Insertion Sort Merge.
Chapter 12 Recursion, Complexity, and Searching and Sorting
C++ Programming: From Problem Analysis to Program Design, Second Edition Chapter 19: Searching and Sorting.
CS 162 Intro to Programming II Searching 1. Data is stored in various structures – Typically it is organized on the type of data – Optimized for retrieval.
SEARCHING. Vocabulary List A collection of heterogeneous data (values can be different types) Dynamic in size Array A collection of homogenous data (values.
FastMM in Depth Primož Gabrijelčič,
Interface: (e.g. IDictionary) Specification class Appl{ ---- IDictionary dic; dic= new XXX(); application class: Dictionary SortedDictionary ----
1 5. Abstract Data Structures & Algorithms 5.6 Algorithm Evaluation.
Searching and Sorting Searching: Sequential, Binary Sorting: Selection, Insertion, Shell.
Searching Topics Sequential Search Binary Search.
CMPT 120 Topic: Searching – Part 2 and Intro to Time Complexity (Algorithm Analysis)
1 Algorithms Searching and Sorting Algorithm Efficiency.
1 5. Abstract Data Structures & Algorithms 5.6 Algorithm Evaluation.
Jim Fawcett CSE 691 – Software Modeling and Analysis Fall 2000
Measuring Where CPU Time Goes
Data Structures I (CPCS-204)
The Design and Analysis of Algorithms
Faster parallel programs with improved FastMM
Algorithmic Efficency
Informatica PowerCenter Performance Tuning Tips
Introduction to complexity
Sorting by Tammy Bailey
Hashing Exercises.
Teach A level Computing: Algorithms and Data Structures
ITEC 2620M Introduction to Data Structures
Main Memory Management
Map interface Empty() - return true if the map is empty; else return false Size() - return the number of elements in the map Find(key) - if there is an.
CSCI1600: Embedded and Real Time Software
Memory Management 11/17/2018 A. Berrached:CS4315:UHD.
Advanced Debugging in Delphi
Data Structures & Algorithms Priority Queues & HeapSort
Computer Science 2 Hashing
Searching and Sorting Topics Sequential Search on an Unordered File
CS 3343: Analysis of Algorithms
Welcome to CIS 068 ! Lesson 9: Sorting CIS 068.
Writing High Performance Delphi Application
Introduction to Testing and Maintainable code
Searching and Sorting Topics Sequential Search on an Unordered File
Analyzing an Algorithm Computing the Order of Magnitude Big O Notation
Measuring “Work” Linear and Binary Search
Simple Sorting Methods: Bubble, Selection, Insertion, Shell
Searching, Sorting, and Asymptotic Complexity
Database Design and Programming
Contents Memory types & memory hierarchy Virtual memory (VM)
Analysis of Algorithms
Analysis of Algorithms
Revision of C++.
Min Heap Update E.g. remove smallest item 1. Pop off top (smallest) 3
Data Structures Introduction
Fundaments of Game Design
CSE 589 Applied Algorithms Spring 1999
Analysis of Algorithms
Chapter 11 Instructor: Xin Zhang
Virtual Memory Lecture notes from MKP and S. Yalamanchili.
Working with the Compute Block
CSCI1600: Embedded and Real Time Software
Rohan Yadav and Charles Yuan (rohany) (chenhuiy)
Analysis of Algorithms
CSE 542: Operating Systems
Presentation transcript:

Writing High Performance Delphi Applications Primož Gabrijelčič

http://tiny.cc/dk2018gp1 http://tiny.cc/dk2018hp

About performance

Performance What is performance? How do we “add it to the program”? There is no silver bullet!

What is performance? Running “fast enough” Raw speed Responsiveness Non-blocking

Improving performance Analyzing algorithms Measuring execution time Fixing algorithms Fine tuning the code Mastering memory manager Writing parallel code Importing libraries Execution time: a bit on that tomorrow Parallel code: sic

Algorithm complexity

Algorithm complexity Tells us how algorithm slows down if data size is increased by a factor of n O() O(n), O(n2), O(n log n) … Time and space complexity

O(1) accessing array elements O(log n) searching in ordered list O(n) linear search O(n log n) quick sort (average) O(n2) quick sort (worst), naive sort (bubblesort, insertion, selection) O(cn) recursive Fibonacci, travelling salesman

Comparing complexities Data size O(1) O(log n) O(n) O(n log n) O(n2) O(cn) 1 10 4 43 100 512 8 764 10.000 1029 300 9 2.769 90.000 1090

Complexities in RTL http://bigocheatsheet.com/ Lists Dictionary Trees access O(1) search O(n) / O(n log n) sort O(n log n) Dictionary * O(1) search by value O(n) Unordered! Spring4D 1.2.1: CreateSortedDictionary Trees * O(log n) Spring4D 1.2.1: TRedBlackTree http://bigocheatsheet.com/

Measuring performance

Measuring Manual Automated - Profilers GetTickCount QueryPerformanceCounter TStopwatch Automated - Profilers Sampling Instrumenting Binary Source

Free profilers AsmProfiler Sampling Profiler André Mussche Instrumenting and sampling 32-bit https://github.com/andremussche/asmprofiler Sampling Profiler Eric Grange Sampling https://www.delphitools.info/samplingprofiler

Commercial profilers AQTime Nexus Quality Suite ProDelphi Instrumenting and sampling 32- and 64-bit https://smartbear.com/ Nexus Quality Suite Instrumenting https://www.nexusdb.com ProDelphi Instrumenting (source) http://www.prodelphi.de/

A task for you! If you choose to accept it …

Task 1_1 Primoz\Task11 Move one line in function Test to a different place (in the same function) to make the code run faster. About 8x on my laptop

Fixing the algorithm

Fixing the algorithm Find a better algorithm If a part of program is slow, don’t execute it so much If a part of program is slow, don’t execute it at all Borrowed from CI philisophy: If it hurts, do it more often.

“Don’t execute it so much” Don’t update UI thousands of times per second Call BeginUpdate/EndUpdate Don’t send around millions of messages per second

“Don’t execute it at all” UI virtualization Virtual listbox Virtual TreeView Memoization Caching Dynamic programming TGpCache<K,V> O(1) all operations GpLists.pas, https://github.com/gabr42/GpDelphiUnits/

A task for you! If you choose to accept it …

Task 1_2 Primoz\Task12 Make function Test run faster!

Fine tuning the code

Compiler settings

Behind the scenes Strings Arrays Records Classes Interfaces Reference counted, Copy on write Arrays Static Dynamic Reference counted, Cloned Records Initialized if managed Classes Reference counted on ARC Interfaces Reference counted ARC is going away!

Calling methods Parameter passing Inlining Dynamic arrays are strange Single pass compiler!

A task for you! If you choose to accept it …

Task 1_3 Primoz\Task13 Without changing the program architecture, make it faster! 4 changes: OPTIMIZATION ON, RANGECHECKS OFF, for..in, const data

Memory management

Why memory manager? No fine-grained allocation on OS level Speed Additional functionality (debugging) VirtualAlloc = 4K (ok, one page)

String and memory allocations NO†: string := string + ‘c’ NO†: SetLength(array, Length(array) + 1) KIND OF OK‡: for … do list.Add(something) † KIND OF OK with a good memory manager ‡ May waste LOTS of memory [but improved in 10.3]

Memory management functions GetMem, AllocMem, ReallocMem, FreeMem GetMemory, ReallocMemory, FreeMemory New, Dispose Initialize, Finalize

Records vs. objects Objects Records† TObject.NewInstance TObject.InstanceSize GetMem InitInstance constructor (inherited …) AfterConstruction (inherited …) Records† [_Initialize] † Will change in 10.3

FastMM4: 58 memory managers in one! Image source: ‘Delphi High Performance’ © 2018 Packt Publishing

Optimizations Reallocation Allocator locking Small blocks: New size = at least 2x old size Medium blocks: New size = at least 1.25x old size Allocator locking Small block only Will try 2 ‘larger’ allocators Problem: Freeing memory allocIdx := find best allocator for the memory block repeat if can lock allocIdx then break; Inc(allocIdx); Dec(allocIdx, 2) until false

Optimizing parallel allocations FastMM4 from GitHub https://github.com/pleriche/FastMM4 DEFINE LogLockContention or DEFINE UseReleaseStack

Alternatives ScaleMM TBBMalloc https://github.com/andremussche/scalemm https://www.threadingbuildingblocks.org

A task for you! If you choose to accept it …

Task 1_4 Primoz\Task14 Optimize memory allocations to make the program run faster!