Published by Beverly Bond. Modified over 9 years ago.
Advanced Bit Manipulation Instructions for Commodity Processors
Yedidya Hilewitz and Ruby B. Lee
Princeton Architecture Laboratory for Multimedia and Security
Department of Electrical Engineering, Princeton University

Background and Motivation
- Advanced bit manipulations are not well supported by commodity microprocessors
- These operations are performed using “programming tricks” (see Hacker’s Delight)
- Bit manipulations play a role in applications of increasing importance
- We propose adding direct support for a few key bit manipulation operations to accelerate these applications

Example Applications
- Cryptography: permutations in ciphers and hash functions, e.g., TDES
- Random Number Generators: extract bits from a source of entropy
  - Von Neumann Extractor (Intel RNG): given the bit-pair sequence {x_{2i}, x_{2i+1}} from the entropy pool, extract x_{2i} if the bits differ
  - Toeplitz Matrix Multiply Extractor: multiply a bit string from the entropy pool by a binary Toeplitz matrix
- LSB Steganography: embed a secret message in the least significant bits of an image or audio file

New Instructions
- Permutation: Butterfly and Inverse Butterfly
- Bit Gather and Bit Scatter: Parallel Extract and Parallel Deposit
- Bit Matrix Multiply
- Other bit manipulation instructions (not covered here): bit matrix transpose, population count

Butterfly and Inverse Butterfly
- Butterfly: lg(n) stages of n 2:1 MUXes, split into n/2 pairs that pass through or swap inputs
- bfly + ibfly = general permutation network
- Any of the n! permutations of n bits can be done with one pass of both instructions
[Figures: Butterfly network; Inverse Butterfly network]

Parallel Extract and Parallel Deposit
- Parallel Extract (bit gather): extracts the bits of r2 flagged by 1’s in r3, then compresses and right-justifies them in the result register
- Parallel Deposit (bit scatter): deposits the right-justified bits from r2 into the result register at the positions flagged by 1’s in r3
[Figures: pex and pdep examples with registers r1, r2 and r3]
Yedidya Hilewitz and Ruby B. Lee, “Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors,” to appear in Journal of VLSI Signal Processing Systems.
Yedidya Hilewitz and Ruby B. Lee, “Fast Bit Compression and Expansion with Parallel Extract and Parallel Deposit Instructions,” Proceedings of the IEEE 17th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), pp. 65-72, September 11-13, 2006 (Best Paper Award).

Bit Matrix Multiply
bmm.n C = B, A
A, B, C: n × n bit matrices; C = A × B mod 2
for i from 1 to n
  for j from 1 to n
    $c_{i,j} = a_{i,1} b_{1,j} \oplus a_{i,2} b_{2,j} \oplus \dots \oplus a_{i,n} b_{n,j}$
- The bmm.8 unit (pictured on the poster) can be directly incorporated into the ALU (<¼ size)
Yedidya Hilewitz and Ruby B. Lee, “Achieving Very Fast Bit Matrix Multiplication in Commodity Microprocessors,” Princeton University Department of Electrical Engineering Technical Report CE-L2007-006, August 2007.

New Shifter Architecture
- Brand new shifter architecture that replaces the shifter with a new unit that directly supports bit manipulation operations
- The new shifter performs
  - basic shifter operations: shift, rotate, extract and deposit
  - multimedia shift-permute operations: mix
  - advanced bit manipulation operations: bfly, ibfly, pex, pdep
Yedidya Hilewitz and Ruby B. Lee, “A New Basis for Shifters in General-Purpose Processors for Existing and Advanced Bit Manipulations,” to appear in IEEE Transactions on Computers.
Yedidya Hilewitz and Ruby B. Lee, “Performing Advanced Bit Manipulations Efficiently in General-Purpose Processors,” Proceedings of the 18th IEEE Symposium on Computer Arithmetic (ARITH-18), June 2007.

Applications (and Speedup)
- Cryptography
- Random number generation
  - Von Neumann Extractor
  - Toeplitz Matrix Multiply
- Steganography
- Cryptanalysis (Gaussian elimination)
- Speedups, in the order they appear on the poster: up to 2.24×, 9.9×, 14.9×, 2.92×
- Other applications: binary compression, binary image morphology, bioinformatics, communications coding, FFT, finite field arithmetic, integer compression, pattern matching
- Other applications suggested by you!

Summary and Conclusions
- Advanced bit manipulations play an important role in many applications
- We have introduced a few select bit manipulation instructions that speed up these applications
- We have evolved the shifter to a new design using butterfly and inverse butterfly datapaths to support basic and advanced bit manipulation instructions
- Advanced bit manipulations are no longer esoteric “programming tricks” but rather supported directly by microprocessors at only a marginal cost

Ongoing and Future Work
- Identify new applications where bit manipulation instructions are useful (e.g., LFSR and FCSR RNGs, software radio)
- Implementation
  - Refine current circuit implementation
  - Integrate new shifter in scalable crypto co-processor (PAX)
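
The parallel extract and parallel deposit definitions above can be read as a simple bit-by-bit loop. The following is a minimal software sketch of that behaviour, not the hardware design from the poster. The function names, the `width` parameter and the mask construction in `von_neumann_extract` are assumptions made for illustration. Bit 0 is taken to be the least significant bit, and "right justified" is taken to mean packed toward bit 0.

```python
def pex(src: int, mask: int, width: int = 64) -> int:
    """Bit gather: collect the bits of src flagged by 1s in mask,
    packed contiguously starting at bit 0 of the result."""
    result = 0
    out_pos = 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> i) & 1) << out_pos
            out_pos += 1
    return result


def pdep(src: int, mask: int, width: int = 64) -> int:
    """Bit scatter: place the low-order bits of src, in order,
    at the positions flagged by 1s in mask."""
    result = 0
    in_pos = 0
    for i in range(width):
        if (mask >> i) & 1:
            result |= ((src >> in_pos) & 1) << i
            in_pos += 1
    return result


def von_neumann_extract(x: int, width: int = 64) -> tuple[int, int]:
    """Hypothetical use of pex for the Von Neumann extractor:
    for each pair (x_2i, x_2i+1), keep x_2i when the two bits differ.
    Assumes 0 <= x < 2**width. Returns (extracted bits, bit count)."""
    even_positions = sum(1 << i for i in range(0, width, 2))
    # Bit 2i of x ^ (x >> 1) is 1 exactly when x_2i != x_2i+1.
    mask = (x ^ (x >> 1)) & even_positions
    return pex(x, mask, width), bin(mask).count("1")
```

With these definitions, `pdep(pex(x, m), m)` equals `x & m` for any `x` and `m` that fit in `width` bits, which is a convenient sanity check that the two operations are inverses on the masked positions.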
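
The bit matrix multiply loop above (C = A × B mod 2) can also be sketched in software. This is an illustrative reference model under stated assumptions, not the bmm.8 circuit: each matrix is assumed to be stored as a list of n integers, one per row, with bit j of the integer holding column j, and indices start at 0 rather than 1. The function name `bmm` is borrowed from the instruction mnemonic for readability only.

```python
def bmm(a: list[int], b: list[int]) -> list[int]:
    """Multiply two n x n bit matrices over GF(2).

    Each matrix is a list of n row integers; bit j of row i is the
    element in row i, column j. Because c[i][j] is the XOR over k of
    a[i][k] AND b[k][j], row i of C is the XOR of those rows of B
    that are selected by the 1 bits in row i of A.
    """
    n = len(a)
    c = []
    for i in range(n):
        row = 0
        for k in range(n):
            if (a[i] >> k) & 1:
                row ^= b[k]
        c.append(row)
    return c
```

Multiplying by the identity matrix `[1 << i for i in range(n)]` on either side returns the other operand unchanged, which gives a quick check of the row and column convention.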