ECE 448: Spring 11
Lab 3: Sequential Logic for Synthesis
FPGA Design Flow Based on Aldec Active-HDL
Agenda for today
Part 1: Introduction to the new Lab Assignment: Square Root Unit based on CORDIC
Part 2: FPGA Design Flow based on Aldec Active-HDL
- using Xilinx XST
- using Synplify Premier DP
Part 3: Demos of Lab 2
Part 1
Introduction to the new Lab Assignment: Square Root Unit based on CORDIC
CORDIC Algorithms - Motivation
Operations such as trigonometric functions, division, and logarithms are generally not directly synthesizable.
Some alternative methods:
- Lookup tables: can require large amounts of memory.
- Taylor/Maclaurin series: require multipliers.
- CORDIC algorithms: small area (inexpensive in hardware), but high latency.
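CORDIC Algorithm for Square Root
Calculates the integer square root of an N-bit input x: sqrt_x = floor(sqrt(x))
Pseudocode
    y = 0
    for i = N/2-1 downto 0 do
        temp = (y + 2^i)^2
        if temp ≤ x then
            y = y + 2^i
        end if
    end for
    sqrt_x = y
The square can be expanded as:
    (y + 2^i)^2 = y^2 + 2^(i+1)*y + 2^(2i)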
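The pseudocode above can be tried out in software before any hardware is written. The following is a minimal Python sketch, assuming an unsigned input and an even bit width N; the function name isqrt_direct and the parameter n_bits are illustrative and do not come from the lab assignment.

```python
def isqrt_direct(x: int, n_bits: int) -> int:
    """Integer square root of an unsigned n_bits-wide value x.

    Follows the slide pseudocode: one candidate result bit is tested
    per iteration, starting from the most significant bit.
    Assumption: n_bits is even and 0 <= x < 2**n_bits.
    """
    if n_bits % 2 != 0:
        raise ValueError("this sketch assumes an even bit width")
    if not 0 <= x < (1 << n_bits):
        raise ValueError("x does not fit in n_bits")

    y = 0
    for i in range(n_bits // 2 - 1, -1, -1):
        temp = (y + (1 << i)) ** 2   # (y + 2^i)^2, uses a multiplier
        if temp <= x:
            y += 1 << i              # keep bit i of the result
    return y


if __name__ == "__main__":
    print(isqrt_direct(26, 8))
```

This version still squares a number in every iteration, which is exactly the operation a hardware implementation would like to avoid.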
Modified Pseudocode
    y = 0
    y_sq = 0
    for i = N/2-1 downto 0 do
        temp = y_sq + 2^(i+1)*y + 2^(2i)
        if temp ≤ x then
            y = y + 2^i
            y_sq = temp
        end if
    end for
    sqrt_x = y
All computations are performed using only addition, bit shifts, and comparisons.
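As a sanity check on the multiplier-free form, the sketch below implements it in Python using only additions, shifts, and comparisons, and compares the result with Python's built-in math.isqrt. The names isqrt_shift_add and n_bits are illustrative assumptions, as is the restriction to even widths.

```python
import math


def isqrt_shift_add(x: int, n_bits: int) -> int:
    """Integer square root using only add, shift, and compare.

    y_sq always holds y*y, so the candidate square
    (y + 2^i)^2 = y_sq + 2^(i+1)*y + 2^(2i) needs no multiplier.
    Assumption: n_bits is even and 0 <= x < 2**n_bits.
    """
    y = 0
    y_sq = 0
    for i in range(n_bits // 2 - 1, -1, -1):
        temp = y_sq + (y << (i + 1)) + (1 << (2 * i))
        if temp <= x:
            y += 1 << i
            y_sq = temp
    return y


def check_all(n_bits: int) -> bool:
    """Exhaustively compare against math.isqrt for every n_bits-wide input."""
    return all(isqrt_shift_add(x, n_bits) == math.isqrt(x)
               for x in range(1 << n_bits))


if __name__ == "__main__":
    print(check_all(8))
```

A model like this can also serve as the source of expected values in a testbench, although the lab may prescribe a different verification method.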
Example
N = 8, x = 26
    i = 3, temp = 0 + 2(0)(8) + 8^2 = 64,   y = 0, y_sq = 0
    i = 2, temp = 0 + 2(0)(4) + 4^2 = 16,   y = 4, y_sq = 16
    i = 1, temp = 16 + 2(4)(2) + 2^2 = 36,  y = 4, y_sq = 16
    i = 0, temp = 16 + 2(4)(1) + 1^2 = 25,  y = 5, y_sq = 25
Done! sqrt_x = 5
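The hand trace above can be reproduced mechanically, which is handy when comparing against waveforms in simulation. This is a small Python sketch; sqrt_trace is a hypothetical helper name, and each row lists the values of y and y_sq after the iteration's update.

```python
def sqrt_trace(x: int, n_bits: int):
    """Return one (i, temp, y, y_sq) tuple per iteration.

    y and y_sq are reported after the conditional update, which
    matches the way the example on the slide lists them.
    Assumption: n_bits is even and 0 <= x < 2**n_bits.
    """
    rows = []
    y = 0
    y_sq = 0
    for i in range(n_bits // 2 - 1, -1, -1):
        temp = y_sq + (y << (i + 1)) + (1 << (2 * i))
        if temp <= x:
            y += 1 << i
            y_sq = temp
        rows.append((i, temp, y, y_sq))
    return rows


if __name__ == "__main__":
    for i, temp, y, y_sq in sqrt_trace(26, 8):
        print(f"i = {i}, temp = {temp}, y = {y}, y_sq = {y_sq}")
```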
Block Diagram
[Figure: datapath of the square root unit. Labels recoverable from the figure: inputs x and in_valid; outputs sqrt_x and out_valid; a shift register loaded with "1000...000" that provides 2^i; a down counter loaded with N/2 - 1; registers y (N/2 bits) and y_sq (N bits); shifters (A << B) producing 2^(i+1)*y and (2^i)(2^i) = 2^(2i); adders forming temp; a comparator (A ≥ B); control signals load and ld_en.]
L = ceil(log2(N))
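Bonus
Make the output width M a variable. This allows greater output precision.
The output is then a fixed-point number with M - N/2 fractional bits: sqrt(x) is approximated by sqrt_x / 2^(M - N/2).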
Bonus Pseudocode
    y = 0
    y_sq = 0
    x_shifted = x << (2M - N)
    for i = M-1 downto 0 do
        temp = y_sq + 2^(i+1)*y + 2^(2i)
        if temp ≤ x_shifted then
            y = y + 2^i
            y_sq = temp
        end if
    end for
    sqrt_x = y
Example
N = 8, M = 6, x = 42
x_shifted = 42 << (2(6) - 8) = 42 << 4 = 672
    i = 5, temp = 0 + 2(0)(32) + 32^2 = 1024,   y = 0,  y_sq = 0
    i = 4, temp = 0 + 2(0)(16) + 16^2 = 256,    y = 16, y_sq = 256
    i = 3, temp = 256 + 2(16)(8) + 8^2 = 576,   y = 24, y_sq = 576
    i = 2, temp = 576 + 2(24)(4) + 4^2 = 784,   y = 24, y_sq = 576
    i = 1, temp = 576 + 2(24)(2) + 2^2 = 676,   y = 24, y_sq = 576
    i = 0, temp = 576 + 2(24)(1) + 1^2 = 625,   y = 25, y_sq = 625
Done! sqrt_x = 25, and 25/2^2 = 6.25
Check: sqrt(42) = 6.481
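The bonus variant can be modeled the same way. The Python sketch below pre-shifts x by 2M - N bits, runs M iterations, and converts the M-bit result to a real value by dividing by 2^(M - N/2). The names isqrt_fixed, n_bits, and m_bits are illustrative, and the sketch assumes N is even and 2M ≥ N.

```python
def isqrt_fixed(x: int, n_bits: int, m_bits: int):
    """Square root of an n_bits-wide x with an m_bits-wide result.

    Returns (y, frac_bits): y is the raw m_bits-wide result and
    frac_bits = m_bits - n_bits/2 is the number of fractional bits,
    so sqrt(x) is approximated by y / 2**frac_bits.
    Assumptions: n_bits is even and 2*m_bits >= n_bits.
    """
    if n_bits % 2 != 0 or 2 * m_bits < n_bits:
        raise ValueError("sketch assumes even n_bits and 2*m_bits >= n_bits")

    x_shifted = x << (2 * m_bits - n_bits)
    y = 0
    y_sq = 0
    for i in range(m_bits - 1, -1, -1):
        temp = y_sq + (y << (i + 1)) + (1 << (2 * i))
        if temp <= x_shifted:
            y += 1 << i
            y_sq = temp
    return y, m_bits - n_bits // 2


if __name__ == "__main__":
    y, frac_bits = isqrt_fixed(42, 8, 6)
    print(y, y / 2 ** frac_bits)
```

With M = N/2 there are no fractional bits and the model reduces to the plain integer square root.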
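Bonus Diagram
[Figure: datapath of the square root unit with an M-bit output. Labels recoverable from the figure: input x pre-shifted by x << (2M-N); in_valid and out_valid; a shift register loaded with "1000...000" that provides 2^i; a down counter loaded with M - 1; registers y (M bits) and y_sq (2M bits); shifters (A << B) producing 2^(i+1)*y and (2^i)(2^i) = 2^(2i); adders forming temp; a comparator (A ≥ B); control signals load and ld_en.]
L = ceil(log2(2M))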
Part 2
FPGA Design Flow based on Aldec Active-HDL
FPGA Design process (1)

Specification (Lab Assignments)
    Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..

On-paper hardware design (Block diagram & ASM chart)

VHDL description (Your Source Files)
    library IEEE;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity RC5_core is
      port(
        clock, reset, encr_decr : in  std_logic;
        data_input  : in  std_logic_vector(31 downto 0);
        data_output : out std_logic_vector(31 downto 0);
        out_full    : in  std_logic;
        key_input   : in  std_logic_vector(31 downto 0);
        key_read    : out std_logic  -- no semicolon after the last port
      );
    end RC5_core;  -- name must match the entity declaration

Functional simulation
Synthesis
Post-synthesis simulation
FPGA Design process (2)
Implementation
Timing simulation
Configuration
On-chip testing
Design Process control from Active-HDL
Synthesis Tools
- Xilinx XST
- Synplify Premier DP
Logic Synthesis
VHDL description -> Circuit netlist

    architecture MLU_DATAFLOW of MLU is
      signal A1 : STD_LOGIC;
      signal B1 : STD_LOGIC;
      signal Y1 : STD_LOGIC;
      signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
    begin
      A1 <= A  when (NEG_A = '0') else not A;
      B1 <= B  when (NEG_B = '0') else not B;
      Y  <= Y1 when (NEG_Y = '0') else not Y1;

      MUX_0 <= A1 and B1;
      MUX_1 <= A1 or B1;
      MUX_2 <= A1 xor B1;
      MUX_3 <= A1 xnor B1;

      with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
    end MLU_DATAFLOW;
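To see what the MLU dataflow description computes before looking at the synthesized netlist, its behavior can be mirrored in a few lines of Python. This is only a behavioral sketch: the entity declaration is not shown on the slide, so the sketch assumes that A, B, NEG_A, NEG_B, NEG_Y, L1, and L0 are all single-bit inputs with values 0 or 1, and the function name mlu is illustrative.

```python
def mlu(a: int, b: int, neg_a: int, neg_b: int, neg_y: int,
        l1: int, l0: int) -> int:
    """Behavioral model of the MLU_DATAFLOW architecture.

    All arguments are single bits (0 or 1); this is an assumption,
    because the entity ports are not listed on the slide.
    """
    a1 = a ^ neg_a            # A1 <= A when NEG_A = '0' else not A
    b1 = b ^ neg_b            # B1 <= B when NEG_B = '0' else not B

    sel = (l1 << 1) | l0      # L1 & L0 as a 2-bit select value
    if sel == 0b00:
        y1 = a1 & b1          # MUX_0: and
    elif sel == 0b01:
        y1 = a1 | b1          # MUX_1: or
    elif sel == 0b10:
        y1 = a1 ^ b1          # MUX_2: xor
    else:
        y1 = 1 - (a1 ^ b1)    # MUX_3: xnor

    return y1 ^ neg_y         # Y <= Y1 when NEG_Y = '0' else not Y1


if __name__ == "__main__":
    # AND of the plain inputs, then NAND by also negating the output
    print(mlu(1, 1, 0, 0, 0, 0, 0), mlu(1, 1, 0, 0, 1, 0, 0))
```

Negating both inputs and the output of the AND function gives OR (De Morgan), which illustrates how the three negation controls extend the four basic operations.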
Implementation
After synthesis, the entire implementation process is performed by the FPGA vendor tools (Xilinx ISE/WebPACK).