1
Divide-and-Conquer Design
Ladner-Fischer construction: $T(n) = T(n/2) + 1 = \log_2 n$, $C(n) = 2C(n/2) + n/2 = (n/2)\log_2 n$
Simple Ladner-Fischer parallel prefix network (its delay is optimal, but it has fan-out issues if implemented directly).
Fig. Prefix sum network built of two n/2-input networks and n/2 adders.
Fall 2008 Parallel Processing, Extreme Models
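The recursive construction above can be sketched in software to check the two recurrences. This is a minimal sketch, not the slide author's code: the function name `ladner_fischer_prefix` is hypothetical, the input length is assumed to be a power of two, and ordinary addition stands in for the prefix operator.

```python
from itertools import accumulate

def ladner_fischer_prefix(xs):
    """Return (prefix_sums, delay, cells) for a power-of-two-length input.

    Assumption: '+' plays the role of the associative prefix operator.
    """
    n = len(xs)
    if n == 1:
        return list(xs), 0, 0
    half = n // 2
    # Two independent n/2-input networks operate side by side.
    lo, t_lo, c_lo = ladner_fischer_prefix(xs[:half])
    hi, t_hi, c_hi = ladner_fischer_prefix(xs[half:])
    # n/2 adders: the last output of the lower half fans out to every
    # output of the upper half (the fan-out issue noted on the slide).
    hi = [lo[-1] + h for h in hi]
    return lo + hi, max(t_lo, t_hi) + 1, c_lo + c_hi + half

sums, delay, cells = ladner_fischer_prefix(list(range(1, 17)))
print(delay, cells)  # log2(16) levels and (16/2) * log2(16) cells
```

For n = 16 the counters agree with $T(n) = \log_2 n$ and $C(n) = (n/2)\log_2 n$.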
2
8.4 Parallel Prefix Networks
$T(n) = T(n/2) + 2 = 2\log_2 n - 1$, $C(n) = C(n/2) + n - 1 = 2n - 2 - \log_2 n$
This is the Brent-Kung parallel prefix network (its delay is actually $2\log_2 n - 2$).
Fig. Prefix sum network built of one n/2-input network and n - 1 adders.
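The construction with one n/2-input network and n - 1 adders can be simulated as follows. This is a sketch under stated assumptions: the names `add` and `brent_kung` are hypothetical, the input length is a power of two, and each value carries its own depth so that the actual delay of the network can be measured rather than taken from the recurrence.

```python
from itertools import accumulate

def add(a, b):
    """One prefix cell: sum the values; depth is one more than the deeper input."""
    return (a[0] + b[0], max(a[1], b[1]) + 1)

def brent_kung(nodes):
    """nodes: list of (value, depth) pairs. Returns (prefix nodes, cells used)."""
    n = len(nodes)
    if n == 1:
        return list(nodes), 0
    # First level: combine adjacent pairs (n/2 cells).
    pairs = [add(nodes[2 * i], nodes[2 * i + 1]) for i in range(n // 2)]
    # One n/2-input network yields the prefixes ending at odd positions.
    inner, cells = brent_kung(pairs)
    out = [nodes[0]]
    for i in range(1, n):
        if i % 2 == 1:
            out.append(inner[i // 2])
        else:
            # Last level: n/2 - 1 cells fill in the even positions.
            out.append(add(inner[i // 2 - 1], nodes[i]))
    return out, cells + (n // 2) + (n // 2 - 1)

nodes, cells = brent_kung([(x, 0) for x in range(1, 17)])
values = [v for v, _ in nodes]
delay = max(d for _, d in nodes)
print(delay, cells)
```

For n = 16 the measured depth is 6, which matches the tighter $2\log_2 n - 2$ figure rather than the $2\log_2 n - 1$ given by the recurrence, and the cell count is $2n - 2 - \log_2 n = 26$.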
3
Example of Brent-Kung Parallel Prefix Network
Originally developed by Brent and Kung as part of a VLSI-friendly carry-lookahead adder.
$T(n) = 2\log_2 n - 2$, $C(n) = 2n - 2 - \log_2 n$
Fig. Brent–Kung parallel prefix graph for n = 16 (figure annotation: one level of latency).
4
Example of Kogge-Stone Parallel Prefix Network
$T(n) = \log_2 n$, $C(n) = (n - 1) + (n - 2) + (n - 4) + \cdots + n/2 = n\log_2 n - n + 1$
Optimal in delay, but too complex in number of cells and wiring pattern.
Fig. Kogge-Stone parallel prefix graph for n = 16.
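The cell count above is a sum over levels, which a short simulation makes concrete. This is an illustrative sketch: the function name `kogge_stone_prefix` is hypothetical, the input length is assumed to be a power of two, and addition stands in for the prefix operator.

```python
from itertools import accumulate

def kogge_stone_prefix(xs):
    """Return (prefix_sums, levels, cells) using distance-doubling levels."""
    vals = list(xs)
    n = len(vals)
    levels = cells = 0
    d = 1
    while d < n:
        # Every position i >= d combines with the value d places to its left,
        # so this level uses n - d cells.
        vals = [vals[i - d] + vals[i] if i >= d else vals[i] for i in range(n)]
        cells += n - d
        levels += 1
        d *= 2
    return vals, levels, cells

ks_sums, ks_levels, ks_cells = kogge_stone_prefix(list(range(1, 17)))
print(ks_levels, ks_cells)
```

For n = 16 the levels use 15, 14, 12, and 8 cells, which sums to $n\log_2 n - n + 1 = 49$.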
5
Comparison and Hybrid Parallel Prefix Networks
Brent-Kung: 6 levels, 26 cells
Kogge-Stone: 4 levels, 49 cells
Han-Carlson: 5 levels, 32 cells
Fig. A hybrid Brent–Kung / Kogge–Stone parallel prefix graph for n = 16.
6
Another Divide-and-Conquer Algorithm
$T(p) = T(p/2) + 1$, so $T(p) = \log_2 p$
Strictly optimal algorithm, but requires commutativity.
Each vertical line represents a location in shared memory.
Fig. Another divide-and-conquer scheme for parallel prefix computation.
Fall 2010 Parallel Processing, Shared-Memory Parallelism
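The slide text does not spell out the scheme, so the sketch below rests on an assumption: that it is the even/odd split, in which prefixes are computed independently on the even-indexed and odd-indexed items and then merged in one step. The function name `even_odd_prefix` is hypothetical and p is assumed to be a power of two. The merge reorders operands, which is why the operator must be commutative.

```python
from itertools import accumulate
import operator

def even_odd_prefix(xs, op):
    """Return (prefixes, steps) under the assumed even/odd scheme."""
    p = len(xs)
    if p == 1:
        return list(xs), 0
    ev, t_ev = even_odd_prefix(xs[0::2], op)  # prefixes over x0, x2, x4, ...
    od, t_od = even_odd_prefix(xs[1::2], op)  # prefixes over x1, x3, x5, ...
    out = []
    for j in range(p // 2):
        # Prefix ending at index 2j: evens up to j, odds up to j - 1.
        out.append(ev[j] if j == 0 else op(ev[j], od[j - 1]))
        # Prefix ending at index 2j + 1: evens up to j, odds up to j.
        out.append(op(ev[j], od[j]))
    return out, max(t_ev, t_od) + 1  # one merge step per recursion level

eo_sums, eo_steps = even_odd_prefix(list(range(1, 9)), operator.add)
# String concatenation is associative but not commutative, so the merge
# produces out-of-order results.
eo_strings, _ = even_odd_prefix(list("abcd"), operator.add)
print(eo_steps, eo_strings)
```

With addition the result matches a sequential prefix sum in $\log_2 p$ steps; with concatenation the third output comes out as "acb" instead of "abc", which illustrates the commutativity requirement.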
7
Applications of the Prefix Sum Network
Numbering (ranking) the 1 bits; detecting the highest-priority 1 bit
8
n-th Root of Unity
$x^n = 1$
$\omega_n^{k+n/2} = -\omega_n^k$
$\omega_{dn}^{dk} = \omega_n^k$
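These identities can be checked numerically. The sketch assumes the convention $\omega_n = e^{2\pi i/n}$, which the slide does not state (the identities hold equally for the conjugate convention); the helper name `omega` and the sample values of n, k, and d are illustrative.

```python
import cmath

def omega(n, k=1):
    """k-th power of the principal n-th root of unity (assumed e^(2*pi*i/n))."""
    return cmath.exp(2j * cmath.pi * k / n)

n, k, d = 8, 3, 4
# x^n = 1 for x = omega_n^k
err_unity = abs(omega(n, k) ** n - 1)
# Halving property: omega_n^(k + n/2) = -omega_n^k
err_half = abs(omega(n, k + n // 2) + omega(n, k))
# Cancellation property: omega_(dn)^(dk) = omega_n^k
err_cancel = abs(omega(d * n, d * k) - omega(n, k))
print(err_unity, err_half, err_cancel)  # all close to zero
```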
9
DFT, DFT$^{-1}$: $O(n^2)$-time naïve algorithm using Horner evaluation
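A minimal sketch of the naive approach: treat the input as polynomial coefficients and evaluate it at each of the n roots of unity with Horner's rule, for n evaluations of n steps each. The function names are hypothetical, and the convention $\omega_n = e^{2\pi i/n}$ for the forward transform is an assumption.

```python
import cmath

def horner(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + ... at x in len(coeffs) steps."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def naive_dft(coeffs):
    """O(n^2) DFT: evaluate the polynomial at each n-th root of unity."""
    n = len(coeffs)
    w = cmath.exp(2j * cmath.pi / n)
    return [horner(coeffs, w ** k) for k in range(n)]

def naive_inverse_dft(values):
    """O(n^2) inverse: evaluate at the inverse roots and divide by n."""
    n = len(values)
    w_inv = cmath.exp(-2j * cmath.pi / n)
    return [horner(values, w_inv ** k) / n for k in range(n)]

a = [1, 2, 3, 4]
y = naive_dft(a)
a_back = naive_inverse_dft(y)
print([round(v.real, 6) for v in a_back])
```

Applying the inverse to the forward result recovers the original coefficients up to rounding error.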
10
Application of DFT to Smoothing or Filtering
11
Application of DFT to Spectral Analysis
12
Multiplying Two Polynomials
13
Multiplying Two Polynomials
15
Fast Fourier Transform (FFT)
16
Fast Fourier Transform (FFT)
$y_i = u_i + \omega_n^i v_i$ $(0 \le i < n/2)$
$y_{i+n/2} = u_i + \omega_n^{i+n/2} v_i = u_i - \omega_n^i v_i$
$T(n) = 2T(n/2) + n = n\log_2 n$ sequentially
$T(n) = T(n/2) + 1 = \log_2 n$ in parallel
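The two butterfly equations translate directly into a recursive routine. This is a sketch under assumptions the slide leaves implicit: u and v are taken to be the transforms of the even-indexed and odd-indexed inputs, $\omega_n = e^{2\pi i/n}$, and the input length is a power of two. The function name `fft` is illustrative.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT following the butterfly equations above."""
    n = len(x)
    if n == 1:
        return list(x)
    u = fft(x[0::2])  # transform of the even-indexed inputs (assumed role of u)
    v = fft(x[1::2])  # transform of the odd-indexed inputs (assumed role of v)
    y = [0] * n
    for i in range(n // 2):
        t = cmath.exp(2j * cmath.pi * i / n) * v[i]  # omega_n^i * v_i
        y[i] = u[i] + t
        y[i + n // 2] = u[i] - t  # uses omega_n^(i + n/2) = -omega_n^i
    return y

x = [1, 2, 3, 4, 5, 6, 7, 8]
y_fft = fft(x)
# Direct O(n^2) evaluation of the same sums, for comparison.
y_direct = [
    sum(x[j] * cmath.exp(2j * cmath.pi * j * k / len(x)) for j in range(len(x)))
    for k in range(len(x))
]
print(max(abs(p - q) for p, q in zip(y_fft, y_direct)))  # close to zero
```

The two recursive calls plus n/2 butterflies give the sequential recurrence $T(n) = 2T(n/2) + n$; when both halves and all butterflies of a level run concurrently, each level costs one step, which is the parallel recurrence $T(n) = T(n/2) + 1$.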
20
Computation Scheme for 16-Point FFT
21
Parallel Architectures for FFT
Fig. Butterfly network for an 8-point FFT.