1
Reconfigurable Computing University of Arkansas
Dr. Christophe Bobda CSCE Department University of Arkansas
2
Chapter 05 Applications
3
Overview
- In the past, FPGAs were used mostly for:
  - Rapid prototyping
  - Infrequently reconfigured systems
- Hardware implementations, sometimes specific to the FPGA architecture, have been built for many applications. The best known are:
  - Searching (text, genetic databases, etc.)
  - Image processing
  - Mechanical control
  - etc.
4
Agenda
- Pattern matching
- Distributed arithmetic
- Video processing
- Mechanical control
- Cryptography
- Software Defined Radio
5
1. Pattern matching
- Pattern matching is the basis of search engines.
- The purpose is to find (and count) the occurrences of a given pattern in a given text.
- Useful in:
  - Dictionaries
  - Document collection indexing
  - Document filtering and classification
  - Spam avoidance
  - Content surveillance
6
1. Pattern matching – The sliding window (Cockshott & Foulk)
- Keywords are kept in registers, one character (byte) per register.
- A set of comparators is used, one comparator per byte.
- A hit signal is set whenever the text segment matches the corresponding word.
- Advantage: old patterns are easy to replace.
- Drawbacks:
  - Not flexible: the register length is fixed.
  - Redundancy: words with the same prefix use more comparators than necessary.
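The sliding-window scheme can be mimicked in software. The following is a minimal sketch, not the original circuit: the function name `sliding_window_hits` and the use of a `deque` as the text shift register are assumptions made for illustration. One comparison per keyword byte stands in for the per-byte comparators.

```python
from collections import deque


def sliding_window_hits(text: bytes, keyword: bytes) -> list[int]:
    """Return the start positions at which keyword occurs in text."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    # The window plays the role of the text shift register; the keyword
    # bytes play the role of the fixed-length keyword registers.
    window = deque(maxlen=len(keyword))
    hits = []
    for pos, byte in enumerate(text):
        window.append(byte)  # shift one new character in
        # One comparator per byte; the hit signal is the AND of all of them.
        if len(window) == len(keyword) and all(
            w == k for w, k in zip(window, keyword)
        ):
            hits.append(pos - len(keyword) + 1)
    return hits
```

Searching for several keywords this way needs one register set and one comparator set per keyword, which is the redundancy listed as a drawback above.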
7
1. Pattern matching – Avoiding redundancy: data folding (Foulk)
- Use only one comparator for characters that are common to different words.
- Data folding (Foulk) folds the data into the circuit:
  - Consider the bit representation of each character.
  - Generate a comparator circuit for each character in the words to be searched.
Figure: 8-bit comparator.
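One way to picture a character-specific comparator is as an 8-input AND gate whose inputs are inverted wherever the constant character has a 0 bit. The sketch below assumes 8-bit character codes; the names `make_comparator` and `shared_comparators` are hypothetical and only illustrate the idea of one comparator per distinct character.

```python
def make_comparator(char: str):
    """Build a comparator specialized for one constant 8-bit character."""
    code = ord(char)
    if code > 0xFF:
        raise ValueError("sketch assumes 8-bit character codes")
    # For every bit position: invert the input bit if the constant bit is 0.
    invert = [((code >> b) & 1) == 0 for b in range(8)]

    def compare(byte: int) -> bool:
        # AND over the eight (possibly inverted) input bits.
        return all((((byte >> b) & 1) == 1) != invert[b] for b in range(8))

    return compare


def shared_comparators(words):
    """One comparator per distinct character, shared by all words."""
    return {ch: make_comparator(ch) for ch in set("".join(words))}
```

For the words "conte" and "context", `shared_comparators` builds six comparators (c, o, n, t, e, x) instead of one per character position.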
8
1. Pattern matching – FSM-based pattern matcher
- Every regular grammar can be recognized by an FSM.
- In pattern matching, the target words define the regular grammar.
- The target words are compiled into the automaton.
- Each word defines a unique path from the start state to an end state.
- While scanning a text, the automaton changes its state as characters appear.
- Reaching a final state corresponds to the appearance of a word.
- Redundancy is avoided by implementing common prefixes only once.
Figure: FSM recognizer and corresponding state transition table for the word "conte".
9
1. Pattern matching – FSM-based pattern matcher: RAM implementation
- Components:
  - One RAM or ROM for storing the states
  - One state register
  - One character register
  - A hit detector
- The input character and the state register are used to determine the next state.
- The hit detector checks whether the current state is equal to a hit state and sets a hit for the corresponding word.
- Advantage: simple to implement.
- Drawback: expensive in terms of flip-flops.
Figure: RAM/ROM implementation of the word recognizer (character stream, character register, RAM/ROM, next state, state register, hit detector).
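A table-driven recognizer of this kind can be sketched in a few lines. This is an illustrative simplification, not the slide's circuit: the dictionary `table` stands in for the RAM/ROM, the function names are hypothetical, and a mismatch simply restarts from the start state (there are no failure links, so some overlapping occurrences can be missed).

```python
def build_table(words):
    """Compile the target words into a transition table (a trie)."""
    table = {}       # (state, char) -> next state; stands in for the RAM/ROM
    hit_states = {}  # final state -> recognized word
    next_free = 1    # state 0 is the start state
    for word in words:
        state = 0
        for ch in word:
            key = (state, ch)
            if key not in table:  # common prefixes reuse existing states
                table[key] = next_free
                next_free += 1
            state = table[key]
        hit_states[state] = word
    return table, hit_states


def scan(text, table, hit_states):
    """Return (end position, word) for every hit found in text."""
    state = 0  # state register
    hits = []
    for pos, ch in enumerate(text):
        # Table lookup addressed by (state register, character register).
        # On a mismatch, retry the character from the start state.
        state = table.get((state, ch), table.get((0, ch), 0))
        if state in hit_states:  # hit detector
            hits.append((pos, hit_states[state]))
    return hits
```

Because "conte", "context" and "cool" share prefixes, compiling them needs only nine transitions rather than one per character of every word.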
10
1. Pattern matching – FSM-based pattern matcher: one-hot implementation
- Each state is coded in one flip-flop.
- The D input of each flip-flop is the AND of the output of the previous flip-flop and the result of the comparator.
- The comparator is character-specific.
- Only n flip-flops are used to implement a word of length n.
- Advantages:
  - Low cost
  - Reflects the structure of the grammar
- Drawbacks:
  - Not easy to build
  - Redundancy in the comparators
Figure: one-hot recognizer for the word "conte" with character-specific comparators.
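The one-hot chain can be simulated clock by clock. In this sketch (the name `one_hot_scan` is hypothetical) the first stage is assumed to be always enabled, so a new match attempt can start on every character; that detail is an assumption, not something stated on the slide.

```python
def one_hot_scan(text, word):
    """Return the end positions of word in text using a one-hot FSM."""
    if not word:
        raise ValueError("word must not be empty")
    ff = [False] * len(word)  # one flip-flop per character of the word
    hits = []
    for pos, ch in enumerate(text):
        # Character-specific comparators, evaluated on the current character.
        cmp = [ch == c for c in word]
        # D input of stage i: output of stage i-1 AND comparator i.
        # Stage 0 is assumed to be always enabled. The right-hand side
        # reads the old flip-flop values, like a synchronous clock edge.
        ff = [
            (True if i == 0 else ff[i - 1]) and cmp[i]
            for i in range(len(word))
        ]
        if ff[-1]:  # last flip-flop set: the whole word has been seen
            hits.append(pos)
    return hits
```

Since several flip-flops can be set at once, overlapping match attempts are tracked in parallel without any extra logic.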
11
1. Pattern matching – FSM-based pattern matcher: exploiting common prefixes
- For words with a common prefix, only one starting path, whose length corresponds to the common prefix, is used.
- Redundancy of comparators can be avoided by implementing only one comparator for each character. The result of the comparison is then provided to all gates that use it.
Figure: words with a common prefix and the corresponding FSM.
12
1. Pattern matching – FSM-based pattern matcher: optimized architecture
- Implement the common prefix.
- Redundancy of comparators is removed: each character in the set is implemented in a position vector, with pos(i) = 1 iff character i is detected.
Figure: block diagram of the optimal pattern matcher.
Figure: detailed structure of the optimal pattern matcher.
13
1. Pattern matching – FSM-based pattern matcher: use of reconfiguration
- Replace the character comparators.
- Replace the FSM for a set of words.
Figure: pattern matcher with a new character comparator and a new set of words.
14
2. Distributed arithmetic
- Signal processing applications (FFT, convolution, filter algorithms) are characterized by MAC-intensive (multiply-accumulate) computations.
- Signal processing functions are usually implemented on special processors:
  - DSPs
  - ASICs
- FPGAs provide the advantage of reconfigurability, but MAC-intensive applications are expensive to implement.
- However, for MAC computations involving one constant vector, FPGAs are one of the best alternatives to DSPs.
15
2. Distributed arithmetic
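Solution of the following equation, with A constant:

$$Z = A \cdot X = \sum_{i=0}^{n} A_i \cdot X_i$$

With the binary representation of $X_i$:

$$X_i = \sum_{j=0}^{W} X_{ij} \, 2^j$$

$$Z = A \cdot X = \sum_{i=0}^{n} A_i \cdot \sum_{j=0}^{W} X_{ij} \cdot 2^j = \sum_{j=0}^{W} 2^j \cdot \sum_{i=0}^{n} A_i \cdot X_{ij}$$

$$Z = \sum_{j=0}^{W} 2^j \cdot \sum_{i=0}^{n} A_i \cdot X_{ij}$$

is the classical form of distributed arithmetic.

- Because the $A_i$ are constant, the inner sum $\sum_{i=0}^{n} A_i \cdot X_{ij}$ can take only a fixed set of values, one per combination of the bits $X_{0j}, \dots, X_{nj}$: $2^{n+1}$ values for the $n+1$ terms written here (equivalently $2^n$ for $n$ inputs).
- We can pre-compute the possible values, store them in a LUT (the DALUT), and retrieve them on demand at run time.
- FPGA advantage: the computation is memory-based (use of LUTs).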
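A small worked example may help. The numbers are chosen here for illustration and do not come from the slides: two constant coefficients $A_0 = 3$, $A_1 = 5$ and two 2-bit inputs $X_0 = 2 = (10)_2$, $X_1 = 3 = (11)_2$.

$$
\begin{array}{cc|c}
X_{1j} & X_{0j} & A_1 X_{1j} + A_0 X_{0j} \\
\hline
0 & 0 & 0 \\
0 & 1 & 3 \\
1 & 0 & 5 \\
1 & 1 & 8
\end{array}
$$

$$
\begin{aligned}
j = 0 &: \; (X_{10}, X_{00}) = (1, 0) \Rightarrow 5 \\
j = 1 &: \; (X_{11}, X_{01}) = (1, 1) \Rightarrow 8 \\
Z &= 2^0 \cdot 5 + 2^1 \cdot 8 = 21 = 3 \cdot 2 + 5 \cdot 3
\end{aligned}
$$

The table holds every value the inner sum can take, so the run-time work reduces to one lookup, one shift and one addition per bit position.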
16
2. Distributed arithmetic
To better understand, we expand the DA equation:

$$
\begin{aligned}
Z ={} & \left[ X_{10} A_1 + X_{20} A_2 + \dots + X_{(n-1)0} A_{n-1} + X_{n0} A_n \right] 2^0 \\
 {}+{} & \left[ X_{11} A_1 + X_{21} A_2 + \dots + X_{(n-1)1} A_{n-1} + X_{n1} A_n \right] 2^1 \\
 {}+{} & \dots \\
 {}+{} & \left[ X_{1W} A_1 + X_{2W} A_2 + \dots + X_{(n-1)W} A_{n-1} + X_{nW} A_n \right] 2^W
\end{aligned}
$$

- The bits of the variables are used to address the memory and retrieve the required values in a bit-serial way.
- The DA datapath implementation is straightforward.
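The bit-serial scheme can be sketched in software as follows. This is a minimal illustration under stated assumptions: inputs are unsigned integers of `width` bits (sign handling is left out), and the names `build_da_lut` and `da_dot` are hypothetical.

```python
def build_da_lut(coeffs):
    """Pre-compute every possible sum of the constant coefficients.

    Entry addr holds the sum of coeffs[i] for every bit i set in addr.
    """
    n = len(coeffs)
    return [
        sum(a for i, a in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << n)
    ]


def da_dot(coeffs, xs, width):
    """Compute sum(coeffs[i] * xs[i]) without any multiplication.

    Assumes every x is an unsigned integer below 2**width.
    """
    lut = build_da_lut(coeffs)
    acc = 0
    for j in range(width):  # one pass per bit position, LSB first
        # Bit j of every input forms the LUT address.
        addr = 0
        for i, x in enumerate(xs):
            addr |= ((x >> j) & 1) << i
        acc += lut[addr] << j  # shift by j, then accumulate
    return acc
```

For example, `da_dot([3, 5], [2, 3], 2)` reads the LUT twice and returns 21, the same as 3*2 + 5*3.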
17
2. Distributed arithmetic
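Figure: bit-serial DA datapath. The input words X_1 ... X_n (bits X_iW ... X_i0) are fed in parallel, one bit position at a time, and form the DA-LUT address. The DA-LUT output is shifted by j and added to or subtracted from (+/-) the accumulated result Z.
DA-LUT contents shown in the figure: A_0, A_1, A_1 + A_0, A_2, A_2 + A_0, A_2 + A_1, A_2 + A_1 + A_0, A_3, ...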
18
2. Distributed arithmetic
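Figure: k-parallel DA datapath. The bits of the input words X_1 ... X_n (bits X_iW ... X_i0) address k lookup tables (DA-LUT 1, DA-LUT 2, ..., DA-LUT k), each followed by its own accumulator (ACC 1, ACC 2, ..., ACC k). An adder tree combines the accumulator outputs into Z.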
19
2. Distributed arithmetic
Application: recursive convolution for the time-domain simulation of optical multimode intrasystem interconnects.

Recursive formula to be implemented on 3 intervals:

$$y(t_n) = f_0 \cdot y(t_{n-1}) + f_4 \cdot x_0 - f_5 \cdot x_1 + f_{24} \cdot x_2 + f_{53} \cdot x_3$$

Table: comparison of different implementations.
Figure: Virtex 2000E implementation on the Celoxica RC1000-PP board.
20
3. Image processing
- Image processing algorithms usually process an image (a set of points with given characteristics such as color, gray level, luminance, etc.) point by point.
- The resulting pixels depend only on the pixels in the original picture.
- A sequential processor needs quadratic run time to process a complete image.
- With parallelism, each pixel can be computed independently.
- Many image processing systems are based on the following operators:
  - Median filtering
  - Basic morphological operations
  - Convolution
  - Edge detection
- The algorithms are usually based on the moving window operator.
21
3. Image processing – The moving window
- The moving (or sliding) window usually processes one pixel of the image at a time.
- The value of the pixel is changed by a function of the local pixel region covered by the window.
- The operator moves over the image to cover all pixels.
- For a pipelined implementation, all the pixels of the window must be accessible at the same time in each clock cycle.
- The FPGA implementation uses FIFO buffers.
Figure: 3x3 and 5x5 moving windows.
22
3. Image processing – FIFO implementation, 3x3 windows
- FIFOs are implemented as circular buffers built from multi-ported RAMs (available in, e.g., Virtex FPGAs).
- Indexes keep track of the front and tail items in the buffer.
- Block RAMs can be read and written in one clock cycle. This allows a throughput of one pixel per cycle.
- 3x3 windows:
  - Two buffers of size W-3 (W = image width) are used.
  - The two FIFO buffers must be full before all the window pixels can be accessed in one cycle.
  - In every clock cycle, a pixel is read from memory and placed into the bottom left corner of the window.
  - The content of the window is shifted to the right, with the rightmost member of each row being added to the tail of the FIFO.
  - The top right pixel is discarded after the computation, since it is not used in any future computation.
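The buffering arithmetic can be checked with a software model. The sketch below is an assumption-laden simplification: it merges the nine window registers and the two FIFOs of W-3 entries into a single delay line of 2*W + 3 pixels, and the generator name `windows_3x3` is hypothetical. It does not model the exact shift direction of the hardware window.

```python
from collections import deque


def windows_3x3(image):
    """Stream the image pixel by pixel and yield (row, col, window).

    (row, col) is the center of the 3x3 window; image is a list of rows.
    """
    height, width = len(image), len(image[0])
    # 9 window registers + 2 FIFOs of (width - 3) entries each
    # = 2 * width + 3 stored pixels in total.
    line = deque(maxlen=2 * width + 3)
    for r in range(height):
        for c in range(width):
            line.append(image[r][c])  # one new pixel per clock cycle
            if r >= 2 and c >= 2:
                # The delay line is full. Tap three runs of three pixels
                # that lie exactly one image row apart.
                base = len(line) - 1
                window = [
                    [line[base - (2 - dr) * width - (2 - dc)] for dc in range(3)]
                    for dr in range(3)
                ]
                yield r - 1, c - 1, window
```

After the initial fill of 2*W + 3 pixels, every valid position produces a complete window from a single new input pixel, which is the one-pixel-per-cycle throughput described above.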
23
3. Image processing – Median filtering
Basics
- Impulse noise (or salt-and-pepper noise) in an image has a gray level that is much higher or lower than that of the neighboring points.
- Linear filters are not able to remove this type of noise.
- Median filters are remarkably good at removing this type of noise.
- Widely used in digital signal and image/video applications.
Implementation
- Slide a window of odd size (e.g. 3x3) over the image.
- At each window position, the median of the sample values replaces the value at the center of the window.
- High computational cost: O(N log N) even with the most efficient sorting algorithms.
- General-purpose processors are not a good solution for real-time implementation, which justifies the use of FPGAs.
24
3. Image processing – Median filtering
Sequential implementation (pseudocode)
  For x = 1 to #rows
    For y = 1 to #cols
      Build window array
      pixel(x, y) = Median(window array)
    End
  End
Figure: hardware sorting implementation.
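The sequential loop can be written as runnable code. This is a minimal sketch for a 3x3 window on a gray-level image stored as a list of rows; copying the border pixels unchanged is an assumption, since the slides do not say how borders are handled, and the name `median_filter_3x3` is hypothetical.

```python
def median_filter_3x3(image):
    """Apply a 3x3 median filter to a gray-level image (list of rows)."""
    rows, cols = len(image), len(image[0])
    # Assumption: border pixels are copied unchanged.
    out = [row[:] for row in image]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            # Build the window array around pixel (x, y).
            window = [
                image[x + dx][y + dy]
                for dx in (-1, 0, 1)
                for dy in (-1, 0, 1)
            ]
            # The median of nine values is the fifth smallest one.
            out[x][y] = sorted(window)[4]
    return out
```

A single bright outlier in an otherwise uniform region never reaches the middle of the sorted window, so it is removed completely.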
25
3. Image processing – Median filtering - result
Figure: original image (left) and filtered image (right).
26
3. Image processing – Basic Morphological Operators
- Morphology in image processing studies the appearance of objects.
- Useful, for example, in:
  - Skeletonization
  - Edge detection
  - Restoration
Processing
- The image is processed pixel by pixel using a structuring element (the sliding window).
- The window may or may not fit the image.
- Most basic building blocks:
  - Erosion (shrinks or erodes an object in the image)
  - Dilation (grows the image)
- Operations such as opening and closing of an image can be derived by performing erosion and dilation in different orders.
27
3. Image processing – Basic Morphological Operators
Erosion
- Replaces the center pixel in the sliding window by the smallest pixel value in the window array.
- The bright area of the image shrinks, or erodes.
Dilation
- Replaces the center pixel in the sliding window by the greatest pixel value in the window array.
- The bright area of the image grows.
Algorithm
- Same as for the median filter.
- Instead of the median element, the minimum is selected for erosion and the maximum for dilation.
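Because only the selection step differs from the median filter, both operators fit in one helper. The sketch below uses a 3x3 window and leaves border pixels unchanged, which is an assumption; the names `morph_3x3`, `erode` and `dilate` are hypothetical.

```python
def morph_3x3(image, select):
    """Replace each interior pixel by select(3x3 window around it)."""
    rows, cols = len(image), len(image[0])
    # Assumption: border pixels are copied unchanged.
    out = [row[:] for row in image]
    for x in range(1, rows - 1):
        for y in range(1, cols - 1):
            window = [
                image[x + dx][y + dy]
                for dx in (-1, 0, 1)
                for dy in (-1, 0, 1)
            ]
            out[x][y] = select(window)
    return out


def erode(image):
    """Erosion: the minimum of the window shrinks bright areas."""
    return morph_3x3(image, min)


def dilate(image):
    """Dilation: the maximum of the window grows bright areas."""
    return morph_3x3(image, max)
```

An opening is then `dilate(erode(image))` and a closing is `erode(dilate(image))`.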
28
3. Image processing – Basic Morphological Operators - result
Figure: original image, erosion result, and dilation result.
29
3. Image processing – Use of reconfiguration
Intelligent image processing system
- According to the input image and other conditions:
  - Some operations are performed to improve the image:
    - Filtering (the correct filter is chosen)
    - Smoothing
    - Segmentation (edge detection)
    - Skeletonization
  - Some adjustments are made to the image input hardware:
    - Calibration
    - Focusing
- Everything is done while the system keeps running:
  - Fixed parts of the system run continuously.
  - Reconfigurable parts must be replaced at run time.
30
3. Image processing – Use of reconfiguration
Figure: system architecture. A processing chain of Capture, Process 1 ... Process N and Rendering, with four RAM blocks and intermediate data exchange between the stages.