Published by Daisy Clarke. Modified over 9 years ago.
1
Ultrasound solution: Impact of C++ DSP optimization techniques
2
Research team discussion
An ultrasound probe (20 MHz) sends signals into the body that reflect off moving blood cells (in an artery? a vein?).
The received ultrasound frequency is Doppler shifted relative to the transmitted frequency, just like the sound of an ambulance going by: higher if approaching, lower if receding.
The team gets the positive frequencies (flow towards the probe) on the left audio channel and the negative frequencies (flow away) on the right audio channel.
9/17/2015 ENCM515 – Ultrasound Problem Copyright smithmr@ucalgary.ca 2 / 33
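The Doppler relationship described on this slide can be sketched numerically. This is a minimal illustration, not the research team's code: the function name `DopplerShiftHz`, the 1540 m/s speed of sound in soft tissue, and the beam-to-flow angle parameter are all assumptions that do not appear in the slides. It uses the standard pulse-echo form, shift = 2 * f0 * v * cos(angle) / c.

```cpp
#include <cassert>
#include <cmath>

// Assumed value: about 1540 m/s is a common textbook figure for soft tissue.
constexpr double kSpeedOfSoundTissue = 1540.0;
constexpr double kPi = 3.14159265358979323846;

// Doppler shift in Hz for a reflector moving at velocity_mps.
// Positive velocity means motion towards the probe (positive shift),
// negative velocity means motion away (negative shift).
double DopplerShiftHz(double transmit_hz, double velocity_mps, double angle_deg) {
    assert(transmit_hz > 0.0);
    const double angle_rad = angle_deg * kPi / 180.0;
    return 2.0 * transmit_hz * velocity_mps * std::cos(angle_rad) / kSpeedOfSoundTissue;
}
```

With a 20 MHz probe and blood moving at 0.5 m/s straight towards it, the shift comes out near 13 kHz, which is why the shifted signal can be carried on an audio channel.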
3
Picture looks like this
Note that the display loses all direction information.
Can I help them to output the maximum frequency?
4
Captured audio signal
Engineering problems
Problem 5: Different amplitudes in the two channels are common.
Problem 6: Why are the funny dead spots not lining up in the left and right channels? We are handling stereo, not mono, signals; the cause is incorrect labeling / misinterpretation.
Problem 7: How to remove the dead spots?
5
Max frequency: definition 1
The frequency below which X% of the frequencies fall.
The signal is noisy for large thresholds (> 80%).
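Definition 1 can be sketched as a cumulative sum over a power spectrum. This is only an illustration under stated assumptions: the function name `MaxFrequencyHz`, the use of a per-bin power array, and the convention that bin k sits at k * bin_width_hz are hypothetical and not taken from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the frequency of the first bin at which the cumulative power
// reaches `percent` of the total power. Bin k is assumed to represent
// the frequency k * bin_width_hz.
double MaxFrequencyHz(const std::vector<double>& power, double bin_width_hz, double percent) {
    assert(percent > 0.0 && percent <= 100.0);
    const double total = std::accumulate(power.begin(), power.end(), 0.0);
    if (total <= 0.0) return 0.0;  // empty or silent spectrum
    const double target = total * percent / 100.0;
    double running = 0.0;
    for (std::size_t k = 0; k < power.size(); ++k) {
        running += power[k];
        if (running >= target) return static_cast<double>(k) * bin_width_hz;
    }
    return static_cast<double>(power.size() - 1) * bin_width_hz;
}
```

A high threshold makes the result depend on the low-power tail of the spectrum, which is consistent with the slide's remark that the output gets noisy above 80%.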
6
After XPI Stage 2
Have a working algorithm concept.
Engineering Problem 1: Complex math (a + jb) on SHARC!
Engineering Problem 2: Define maximum frequency. Zillions of blood cells, therefore a distribution of frequencies. Workable prototype; discuss more with customer.
Engineering Problem 3: SHARC D/A can't handle a DC signal. Workable prototype; discuss more with customer.
Engineering Problem 4: Can SHARC handle all this in real time?
Problem 5: Are different amplitudes on the input channels common? Yes.
Problem 6: Why are funny dead spots not lining up in left and right channels? Artifact: mislabeled and misinterpreted samples.
Problem 7: How to remove dead spots? Discuss more with customer.
7
ProcessBlock( ) DONE OUTSIDE THE INTERRUPT AVOIDS A RACE CONDITION
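One common way to keep block processing out of the interrupt is for the handler to only raise a flag, with the main loop doing the work. The sketch below is a portable stand-in, not the course code: `AudioInterruptHandler`, `PollAndProcess`, and the counter inside `ProcessBlock` are hypothetical names and bodies chosen for illustration.

```cpp
#include <atomic>

std::atomic<bool> g_block_ready{false};
int g_blocks_processed = 0;

// Hypothetical interrupt handler: kept short, it only signals that a
// full block of samples is available.
void AudioInterruptHandler() {
    g_block_ready.store(true, std::memory_order_release);
}

// Stand-in for the real block processing.
void ProcessBlock() {
    ++g_blocks_processed;
}

// One pass of the main loop: atomically consume the flag and, if it was
// set, process the block outside interrupt context.
bool PollAndProcess() {
    if (g_block_ready.exchange(false, std::memory_order_acq_rel)) {
        ProcessBlock();
        return true;
    }
    return false;
}
```

Because the handler never touches the data being processed, a new interrupt cannot modify a block halfway through `ProcessBlock`.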
8
Real-life problem: stereo
Minor changes to the audio preemptive task.
9
Make the “C” code more general
Moved buffer[ ] to external files.
Unknown size of arrays being processed.
10
Switch to Release mode
Switching to the optimizing compiler (ReleaseNWC) means you can no longer set breakpoints. Fix with these steps.
11
First look at code
Timing: software loop with r2 as the loop counter, test at end.
N * (10 - 1) cycles (the jump is not a delayed branch, db); the -1 is for 1 parallel instruction.
12
Use the Compiler Info button
3 stalls: 2 on the software jump, 1 on ?
13
Obvious things to do
We are already processing the left and right channels in one program.
Switch to left audio in dm memory and right audio in pm memory.
Need to do:
Make the right buffers ‘pm’.
Change the prototype of the function to pass pm.
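The prototype change might look roughly like the sketch below. `pm` is a memory-space keyword in the Analog Devices SHARC compilers, so the sketch hides it behind a macro that expands to nothing on other compilers. The macro name `RIGHT_MEM`, the function name `CopyStereoBlock`, and the use of `__ADSP21000__` to detect the SHARC toolchain are assumptions for illustration; the actual prototype in the slides is only shown as an image.

```cpp
#include <cstddef>

// Assumption: the SHARC toolchain predefines __ADSP21000__. On any other
// compiler the qualifier disappears and this is ordinary C++.
#if defined(__ADSP21000__)
#define RIGHT_MEM pm
#else
#define RIGHT_MEM
#endif

// Left channel buffers live in default (dm) memory, right channel buffers
// in pm memory, so one left access and one right access can be issued in
// the same cycle on the SHARC.
void CopyStereoBlock(const float* left_in, const float RIGHT_MEM* right_in,
                     float* left_out, float RIGHT_MEM* right_out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        left_out[i] = left_in[i];
        right_out[i] = right_in[i];
    }
}
```

On the SHARC the right-channel arrays themselves would also need to be declared with the same qualifier so that the pointer types match.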
14
As expected, 2 cycles saved
Parallel dm and pm reads and writes.
15
Why a software loop?
The compiler does not know the size of the loop, so it can't optimize the loop.
THIS PRAGMA IS A CONTRACT BETWEEN THE DEVELOPER AND THE COMPILER. DON'T LIE.
16
This does not compile
Pragma variables are not handled by the preprocessor.
17
Variable as end of loop
The compiler will not optimize when the loop parameter is declared external, internal, or static.
18
Loop parameters all constants known to the compiler
Drop from 8 cycles to 2 cycles, as the compiler knows enough to switch to hardware loop control. STALLS FROM JUMP GONE.
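In portable C++, one way to make the trip count a compile-time constant is a `constexpr` bound together with fixed-size array references. The names `kBlockSize` and `ScaleBlock` below are hypothetical, and whether a given compiler then chooses a hardware loop is up to that compiler; the sketch only shows the loop bound being visible as a constant.

```cpp
#include <cstddef>

// Compile-time block size. 128 matches the N = 128 used in the timing
// figures; the name is an assumption.
constexpr std::size_t kBlockSize = 128;

// The array size is part of the parameter type, so the loop bound is a
// constant the optimizer can see rather than a run-time variable.
void ScaleBlock(const float (&in)[kBlockSize], float (&out)[kBlockSize], float gain) {
    for (std::size_t i = 0; i < kBlockSize; ++i) {
        out[i] = gain * in[i];
    }
}
```

Passing an array of any other size fails to compile, which is the portable counterpart of the promise the developer makes to the compiler about loop size.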
19
Where am I getting all my info?
20
Can we switch to SIMD mode?
VECTORIZATION MAY NOT BE POSSIBLE IF THE COMPILER DOES NOT KNOW ABOUT THE ALIGNMENT OF ARRAYS (how arrays are placed in memory).
21
Impact of vectorization
Before: loop count was 0x80, with memory operations of the form r2 = dm(i4, m6) where m6 = 1, meaning the code is doing r2 = *i4++;
22
New instructions: SIMD mode
bit set mode1 0x200000 (and bit clr mode1 to switch back).
Processor doing r2 = dm(i5, 2).
Same as r2 = dm(i5, 1) AND s2 = dm(i5, 1): loading two registers.
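The effect of the two-register load can be modelled in portable C++ by handling two adjacent elements per iteration and stepping the index by 2. This is a conceptual model only, not SHARC code: the function name `AddBlocksTwoLanes` and the lane variable names are hypothetical, and a real SIMD build would perform both lanes in a single instruction.

```cpp
#include <cassert>
#include <cstddef>

// Each iteration handles element i (the "r register" lane) and element
// i + 1 (the "s register" lane), then advances by 2, mirroring the
// modify value of 2 in r2 = dm(i5, 2).
void AddBlocksTwoLanes(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % 2 == 0);  // two lanes need an even number of points
    for (std::size_t i = 0; i < n; i += 2) {
        const float x_lane = a[i] + b[i];
        const float y_lane = a[i + 1] + b[i + 1];
        out[i] = x_lane;
        out[i + 1] = y_lane;
    }
}
```

For n points this loop body runs n / 2 times, so a block that needed 0x80 iterations one point at a time needs half as many in this form.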
23
Try using #pragma inline
BEFORE / AFTER (20 cycles faster?)
24
C++ showing out-of-order execution
WARNING
25
Let's do “inline”
ProcessOneBlock( ) is called by four subroutines, so let's inline it.
26
Mixed mode view is interesting
27
Mixed mode
Out-of-order execution with 4 copies of the code for DoCopyBlock( ) (one for each of Process0, Process1, Process2, Process3).
NO CODE OF ProcessOneBlock( ).
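The call structure described here can be sketched with the standard C++ `inline` keyword. The function names come from the slide, but every body and signature below is a hypothetical stand-in, and `inline` is only a request: the slides use the toolchain's own `#pragma inline`, which this portable sketch does not reproduce.

```cpp
#include <cstddef>

// Innermost worker: copies one block of samples.
inline void DoCopyBlock(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i];
}

// Thin wrapper. Once inlined, no separate code for it needs to exist,
// which matches the "NO CODE OF ProcessOneBlock( )" observation.
inline void ProcessOneBlock(const float* in, float* out, std::size_t n) {
    DoCopyBlock(in, out, n);
}

// Four callers; after inlining, each can end up with its own copy of
// the DoCopyBlock loop.
void Process0(const float* in, float* out, std::size_t n) { ProcessOneBlock(in, out, n); }
void Process1(const float* in, float* out, std::size_t n) { ProcessOneBlock(in, out, n); }
void Process2(const float* in, float* out, std::size_t n) { ProcessOneBlock(in, out, n); }
void Process3(const float* in, float* out, std::size_t n) { ProcessOneBlock(in, out, n); }
```

The trade is code size for speed: four copies of the loop in exchange for removing the call and return overhead at each call site.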
28
Speed improvement
Moving from a software loop and using dm and pm memories caused a change from 8 cycles / pt to 2 cycles for two points processed in SIMD (4 CALLS * 7 CYCLES SAVED * N POINTS PROCESSED).
Moving to IN_LINE causes a change of around 120 cycles for each subroutine call (4 CALLS * 120 CYCLES SAVED).
N = 128: (4 * 1800 to 4 * 120). 480 MHz processor: 15 us to 1 us.
LESSON LEARNT: SPEND YOUR TIME OPTIMIZING THE LOOPS. THE REST IS SMALLER AND GETS SMALLER WITH LARGER N.
29
Other improvements depend on the specific characteristics of the code
31
Profile-guided optimization
32
Memory alignment can be important
After the first char fetch, the system can move on to moving 8 chars in SIMD.
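A portable way to reason about this is to count how many bytes must be handled one at a time before a pointer reaches an aligned boundary, after which wide moves become possible. The sketch below uses standard C++ `alignas` rather than the toolchain pragmas; the names `g_text_buffer`, `IsAligned`, and `BytesUntilAligned` and the 8-byte boundary are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Buffer forced onto an 8-byte boundary with standard C++ alignas.
alignas(8) char g_text_buffer[64];

// True when p sits on a multiple of `alignment` bytes.
bool IsAligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Number of leading bytes to process individually before p reaches the
// next `alignment`-byte boundary (0 if it is already aligned).
std::size_t BytesUntilAligned(const void* p, std::size_t alignment) {
    const std::size_t misalign = reinterpret_cast<std::uintptr_t>(p) % alignment;
    return misalign == 0 ? 0 : alignment - misalign;
}
```

When the compiler can prove the start address is aligned, it can skip this scalar preamble entirely and go straight to the wide moves.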
34
Conditional code (manual PGO)
35
Correct ways to process loops
38
#pragma all_aligned
#pragma loop_unroll N
#pragma SIMD_for
#pragma align num
#pragma alignment_region( and #pragma alignment_region_end