Hardware Description Language: Verilog Example Circuit Designs. Department of Electrical Engineering, National Chung Hsing University. 廖彥璋, 黃穎聰
Introduction
Goal: get familiar with Verilog coding through a set of design examples.
Classification of design examples:
  Combinational logic
  Data storage and memory
  Counter and finite state machine (FSM)
Each design example includes:
  Circuit function description
  Verilog coding
  Synthesis results: design symbol, synthesized circuit schematic, gate count report
  Simulation results
Part A. Combinational Logic Design
Outline
  One-bit full adder
  4-bit full adder design (unsigned)
  4-bit adder/subtractor design
  8-input priority encoder
  BCD to binary converter design
  7-segment LED display decoder
  Odd parity checker
  Absolute value function
  Simplified 8-bit ALU design
1. One Bit Adder (1) A1. A one-bit full adder design, input a, b, cin; output sum, cout; a) Using behavioral modeling b) Using data flow modeling. [Figure: basic circuit structure of a 1-bit adder]
1. One Bit Adder (2) a) Verilog design Using behavioral modeling
1. One Bit Adder (3) Gate-level design of the synthesis result. [Schematic: inputs a, b, cin; two XOR gates and one AO gate]
1. One Bit Adder (4) b) Using data flow modeling. Parentheses indicate the preferred grouping in the logic implementation.
1. One Bit Adder (5) "a" and "b" are XORed first because of the () grouping in the expression. Although behavioral and data flow modeling yield an identical logic structure, i.e., two XOR gates and one AO gate, the input orderings are different.
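The slide code for the two modeling styles is not preserved in this text, so the following is only a minimal sketch of what they might look like. The module name adder_1bit_dataflow matches the instance references in the later synthesis report; adder_1bit_behavioral and the exact expressions are assumptions.

```verilog
// Behavioral model (assumed form): the "+" operator yields carry and sum together.
module adder_1bit_behavioral (
    input  wire a, b, cin,
    output reg  sum, cout
);
    always @(*) begin
        {cout, sum} = a + b + cin;   // 2-bit result: {carry, sum}
    end
endmodule

// Data flow model (assumed form): the parentheses force a ^ b to be formed first.
module adder_1bit_dataflow (
    input  wire a, b, cin,
    output wire sum, cout
);
    assign sum  = (a ^ b) ^ cin;
    assign cout = (a & b) | ((a ^ b) & cin);   // AND-OR form, maps to an AO cell
endmodule
```

Both modules compute the same function; only the way the logic is described differs.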
1. One Bit Adder (6) Simulation results
[Waveforms: sum and cout for the behavioral model; sum and cout for the data flow model]
Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U4      XOR2X1       slow       11.881800
U5      AO22X1       slow       10.184400
U6      XOR2X1       slow       11.881800
Total 3 cells                   33.947999
2. Four Bit Adder (1) A2. A 4-bit full adder design (unsigned) a) Using structural level modeling, constructing the adder from four 1-bit adders b) Using behavioral modeling. Part a) chains four instances of the module obtained in problem A1, connected as shown in the figure. Part b) is written directly at the behavioral level.
2. Four Bit Adder (2) Using structural level modeling and constructing the adder with four 1-bit adders. Ports are connected by name (named port mapping).
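A possible structural version is sketched below, assuming the 1-bit data flow adder from problem A1 and a 4-bit interface with a carry input. The instance names u1 to u4 and the carry names c1 to c3 follow the labels in the synthesis report and schematic; the module name adder_4bit_structural and the port list are assumptions.

```verilog
// 1-bit building block (assumed form of the A1 data flow design).
module adder_1bit_dataflow (input wire a, b, cin, output wire sum, cout);
    assign sum  = (a ^ b) ^ cin;
    assign cout = (a & b) | ((a ^ b) & cin);
endmodule

// Four 1-bit adders chained through c1..c3, connected by port name.
module adder_4bit_structural (
    input  wire [3:0] a, b,
    input  wire       cin,
    output wire [3:0] sum,
    output wire       cout
);
    wire c1, c2, c3;   // internal ripple carries
    adder_1bit_dataflow u1 (.a(a[0]), .b(b[0]), .cin(cin), .sum(sum[0]), .cout(c1));
    adder_1bit_dataflow u2 (.a(a[1]), .b(b[1]), .cin(c1),  .sum(sum[1]), .cout(c2));
    adder_1bit_dataflow u3 (.a(a[2]), .b(b[2]), .cin(c2),  .sum(sum[2]), .cout(c3));
    adder_1bit_dataflow u4 (.a(a[3]), .b(b[3]), .cin(c3),  .sum(sum[3]), .cout(cout));
endmodule
```

With named mapping, the order of the connections in each instance does not matter, which makes wiring mistakes easier to spot than with positional mapping.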
2. Four Bit Adder (3) structural modeling Symbol view Symbol view will show all the interface signals in module declaration
2. Four Bit Adder (4) Synthesis Report by Design Vision
Cell    Reference                Library    Area         Attributes
-------------------------------------------------------------------
u1      adder_1bit_dataflow_0               33.947999    h
u2      adder_1bit_dataflow_3               33.947999    h
u3      adder_1bit_dataflow_2               33.947999    h
u4      adder_1bit_dataflow_1               33.947999    h
Total 4 cells                               135.791996
***** End Of Report *****
Note that the area of the 1-bit full adder is 33.95. The area of the 4-bit adder is four times that (4 x 33.947999 = 135.791996).
2. Four Bit Adder (5) Using behavioral modeling Inputs are 4-bit vectors
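A behavioral counterpart could be as short as the sketch below. The module name and the presence of a carry input are assumptions; the slide only states that the inputs are 4-bit vectors.

```verilog
// Behavioral 4-bit adder: the synthesis tool chooses the adder structure.
module adder_4bit_behavioral (
    input  wire [3:0] a, b,
    input  wire       cin,
    output reg  [3:0] sum,
    output reg        cout
);
    always @(*) begin
        {cout, sum} = a + b + cin;   // 5-bit result: carry out plus 4-bit sum
    end
endmodule
```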
2. Four Bit Adder (6) Synthesized result of behavioral modeling. A ripple carry structure is synthesized. [Schematic labels: carry chain c1, c2, c3, cout; sum logic]
2. Four Bit Adder (7) Simulation results
2. Four Bit Adder (8) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U3      OR2X1        slow       6.789600
U4      XNOR2X1      slow       11.881800
U5      XNOR2X1      slow       11.881800
U6      XNOR2X1      slow       11.881800
U7      XNOR2X1      slow       11.881800
U8      XNOR2X1      slow       11.881800
U9      XNOR2X1      slow       11.881800
U10     XOR2X1       slow       11.881800
U11     XOR2X1       slow       11.881800
U12     OAI2BB1X1    slow       8.487000
U13     OAI21XL      slow       6.789600
U14     AO22X1       slow       10.184400
U15     OAI2BB1X1    slow       8.487000
U16     OAI21XL      slow       6.789600
U17     OAI2BB1X1    slow       8.487000
U18     OAI21XL      slow       6.789600
Total 16 cells                  157.858198
***** End Of Report *****
Structural modeling leads to a smaller area than the synthesis result of behavioral modeling: 4 x 3 = 12 cells versus 16 cells, and an area of 135.79 versus 157.86. This is because explicit structural information is available in structural modeling.
3. Adder & Subtractor (1) A4. A 4-bit adder/subtractor design. Inputs A[3:0], B[3:0]; function select s = 1 (add), s = 0 (subtract); output Y[4:0]. This is one possible form of an adder/subtractor. Symbol view. Input resources can be shared because the add and subtract functions are mutually exclusive.
3. Adder & Subtractor (2) A default option is added to avoid the inference of a latch.
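As an illustration of the case-with-default style mentioned on this slide, a minimal sketch follows. The port names A, B, s, and Y come from the problem statement; the module name addsub_4bit and the value assigned in the default branch are assumptions.

```verilog
// 4-bit adder/subtractor: s = 1 adds, s = 0 subtracts, Y is 5 bits wide.
module addsub_4bit (
    input  wire [3:0] A, B,
    input  wire       s,
    output reg  [4:0] Y
);
    always @(*) begin
        case (s)
            1'b1:    Y = A + B;   // add, Y[4] holds the carry
            1'b0:    Y = A - B;   // subtract, 5-bit two's complement result
            default: Y = 5'b0;    // assumed default value, so Y is assigned on every path
        endcase
    end
endmodule
```

Because Y is assigned in every branch, the always block describes pure combinational logic. For example, A = 3, B = 5, s = 0 gives Y = 5'b11110, which is -2 in 5-bit two's complement.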
3. Adder & Subtractor (3) Synthesized circuit. Exclusive-OR gates provide 1's complement control of the input operand. [Schematic annotations: "-" and "+" markers on the circuit]
3. Adder & Subtractor (4) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U4      XNOR2X1      slow       11.881800
U5      OAI22XL      slow       8.487000
U6      CLKINVX1     slow       3.394800
U7      AND2X1       slow       6.789600
U8      XOR2X1       slow       11.881800
U9      XOR2X1       slow       11.881800
U10     OA21XL       slow       8.487000
U11     OAI2BB1X1    slow       8.487000
U12     XOR2X1       slow       11.881800
U13     XOR2X1       slow       11.881800
U14     XOR2X1       slow       11.881800
U15     OA21XL       slow       8.487000
U16     OAI2BB1X1    slow       8.487000
U17     XOR2X1       slow       11.881800
U18     XOR2X1       slow       11.881800
U19     XOR2X1       slow       11.881800
U20     AOI2BB2X1    slow       10.184400
U21     NAND2X1      slow       5.092200
U22     XOR2X1       slow       11.881800
U23     XOR2X1       slow       11.881800
U24     XOR2X1       slow       11.881800
U25     XOR2X1       slow       11.881800
-------------------------------------------------------
Total 22 cells                  222.359398
***** End Of Report *****
The area is larger than that of a pure adder design.
4. Priority Encoder (1) A6. 8-input priority encoder. Input a[7:0], output q[2:0], y. The 8-bit input has decreasing priority from MSB to LSB. Output q gives the bit position of the highest-priority input bit equal to 1, and y is set to 1. If no input bit equals 1, q is set to 3'b0 and y is set to 0. Symbol view. Truth table.
4. Priority Encoder (2) Coding 1: Use if-else-if sequence
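The if-else-if coding might look like the sketch below, which follows the problem statement (MSB has the highest priority, y flags a valid input). The module name prio_enc_if is an assumption.

```verilog
// 8-input priority encoder, if-else-if style: earlier branches win.
module prio_enc_if (
    input  wire [7:0] a,
    output reg  [2:0] q,
    output reg        y
);
    always @(*) begin
        y = 1'b1;                       // assume a valid input, cleared in the last branch
        if      (a[7]) q = 3'd7;
        else if (a[6]) q = 3'd6;
        else if (a[5]) q = 3'd5;
        else if (a[4]) q = 3'd4;
        else if (a[3]) q = 3'd3;
        else if (a[2]) q = 3'd2;
        else if (a[1]) q = 3'd1;
        else if (a[0]) q = 3'd0;
        else begin
            q = 3'd0;                   // no input bit set
            y = 1'b0;
        end
    end
endmodule
```

For instance, a = 8'b0010_1000 gives q = 5 and y = 1, because bit 5 outranks bit 3.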
4. Priority Encoder (3) Synthesized circuit
4. Priority Encoder (4) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U13     NAND4BBXL    slow       11.881800
U14     NOR2X1       slow       5.092200
U15     CLKINVX1     slow       3.394800
U16     OAI211X1     slow       8.487000
U17     CLKINVX1     slow       3.394800
U18     OR2X1        slow       6.789600
U19     OAI211X1     slow       8.487000
U20     AOI21X1      slow       8.487000
U21     NAND3X1      slow       6.789600
U22     CLKINVX1     slow       3.394800
U23     NOR4X1       slow       8.487000
U24     CLKINVX1     slow       3.394800
Total 12 cells                  78.080401
***** End Of Report *****
4. Priority Encoder (5) Coding 2: Use casex Same result
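A casex version of the same encoder could be written as below. This is a sketch under the same assumptions as the problem statement; the module name prio_enc_casex is hypothetical. The x positions are don't-cares, so each pattern matches once its leading 1 is found.

```verilog
// 8-input priority encoder, casex style: x bits are ignored when matching.
module prio_enc_casex (
    input  wire [7:0] a,
    output reg  [2:0] q,
    output reg        y
);
    always @(*) begin
        y = 1'b1;
        casex (a)
            8'b1xxxxxxx: q = 3'd7;
            8'b01xxxxxx: q = 3'd6;
            8'b001xxxxx: q = 3'd5;
            8'b0001xxxx: q = 3'd4;
            8'b00001xxx: q = 3'd3;
            8'b000001xx: q = 3'd2;
            8'b0000001x: q = 3'd1;
            8'b00000001: q = 3'd0;
            default: begin
                q = 3'd0;               // all inputs are 0
                y = 1'b0;
            end
        endcase
    end
endmodule
```

The patterns are mutually exclusive, which gives the synthesis tool more freedom than a strict if-else-if chain.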
4. Priority Encoder (6) Synthesized circuit: smaller than the one obtained with the if-else-if statement.
5. BCD to Binary Conversion (1) A9. A 2-digit BCD to binary converter design. Input a[3:0] (MSD), b[3:0] (LSD), output y[6:0]
5. BCD to Binary Conversion (2) y = 10*a + b = 8a + 2a + b. Left shift by 3 bits gives 8a; left shift by 1 bit gives 2a. Note: use shifts instead of multiplication to reduce the logic complexity. Modern synthesis tools are capable of implementing constant multiplication with shifters on their own.
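The shift-and-add form of y = 10*a + b can be sketched as follows. The port names follow the problem statement; the module name bcd2bin and the explicit zero extension to 7 bits are assumptions made to keep the widths obvious.

```verilog
// 2-digit BCD to binary: y = 8a + 2a + b, built from shifts and adds.
module bcd2bin (
    input  wire [3:0] a,   // most significant digit
    input  wire [3:0] b,   // least significant digit
    output wire [6:0] y
);
    wire [6:0] a_ext = {3'b000, a};          // widen first so the shifts lose no bits
    assign y = (a_ext << 3) + (a_ext << 1) + b;
endmodule
```

With a = 9 and b = 9 this gives 72 + 18 + 9 = 99.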
5. BCD to Binary Conversion (3) The synthesized circuit is simply too complicated to verify manually! Simulation check: 6 x 16 + 3 = 99 (hexadecimal 63).
5. BCD to Binary Conversion (4) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U7      XOR2X1       slow       11.881800
U8      XNOR2X1      slow       11.881800
U9      NOR2X1       slow       5.092200
U10     NAND2X1      slow       5.092200
U11     XOR2X1       slow       11.881800
U12     NOR2X1       slow       5.092200
U13     XNOR2X1      slow       11.881800
U14     OA21XL       slow       8.487000
U15     OAI2BB1X1    slow       8.487000
U16     XOR2X1       slow       11.881800
U17     AOI21X1      slow       8.487000
U18     OA21XL       slow       8.487000
U19     XOR2X1       slow       11.881800
U20     XNOR2X1      slow       11.881800
U21     NAND2X1      slow       5.092200
U22     XOR2X1       slow       11.881800
U23     XNOR2X1      slow       11.881800
U24     CLKINVX1     slow       3.394800
U25     XOR2X1       slow       11.881800
U26     OAI21XL      slow       6.789600
U27     OAI2BB1X1    slow       8.487000
U28     CLKINVX1     slow       3.394800
U29     XOR2X1       slow       11.881800
U30     XNOR2X1      slow       11.881800
U31     NAND2X1      slow       5.092200
U32     XOR2X1       slow       11.881800
-------------------------------------------------------
Total 26 cells                  235.938597
***** End Of Report *****
6. Seven-Segment Decoder (1) A10. A 7-segment LED display decoder. Input x[3:0], output a,b,c,d,e,f,g. An LED segment turns on if its control signal is equal to 1.
6. Seven-Segment Decoder (2) A case description is equivalent to writing down the truth table. There is no need to derive the Boolean equations of a ~ g yourself. The synthesis tool can perform sophisticated logic minimization efficiently to obtain the Boolean functions. Note: to perform multiple-output Boolean logic minimization by hand, you may resort to the Quine-McCluskey algorithm.
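A case-based decoder in the spirit of this slide is sketched below. The segment patterns use the conventional a-to-g layout for digits 0 to 9 with active-high segments; blanking the display for inputs 10 to 15 and the module name seg7_decoder are assumptions, since the original table is not preserved.

```verilog
// 7-segment decoder: the case statement is the truth table, one row per digit.
module seg7_decoder (
    input  wire [3:0] x,
    output reg        a, b, c, d, e, f, g
);
    always @(*) begin
        case (x)
            4'd0:    {a, b, c, d, e, f, g} = 7'b1111110;
            4'd1:    {a, b, c, d, e, f, g} = 7'b0110000;
            4'd2:    {a, b, c, d, e, f, g} = 7'b1101101;
            4'd3:    {a, b, c, d, e, f, g} = 7'b1111001;
            4'd4:    {a, b, c, d, e, f, g} = 7'b0110011;
            4'd5:    {a, b, c, d, e, f, g} = 7'b1011011;
            4'd6:    {a, b, c, d, e, f, g} = 7'b1011111;
            4'd7:    {a, b, c, d, e, f, g} = 7'b1110000;
            4'd8:    {a, b, c, d, e, f, g} = 7'b1111111;
            4'd9:    {a, b, c, d, e, f, g} = 7'b1111011;
            default: {a, b, c, d, e, f, g} = 7'b0000000;   // assumed: blank for 10..15
        endcase
    end
endmodule
```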
6. Seven-Segment Decoder (3)
6. Seven-Segment Decoder (4)
6. Seven-Segment Decoder (5) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U24     NAND2X1      slow       5.092200
U25     OAI21XL      slow       6.789600
U26     NAND3BX1     slow       8.487000
U27     OA21XL       slow       8.487000
U28     NAND3X1      slow       6.789600
U29     MXI2X1       slow       11.881800
U30     NOR2X1       slow       5.092200
U31     OA21XL       slow       8.487000
U32     OAI211X1     slow       8.487000
U33     CLKINVX1     slow       3.394800
U34     CLKINVX1     slow       3.394800
U35     OAI221XL     slow       11.881800
U36     NOR2X1       slow       5.092200
U37     NOR3X1       slow       6.789600
U38     NAND4X1      slow       8.487000
U39     NAND3BX1     slow       8.487000
U40     CLKINVX1     slow       3.394800
U41     CLKINVX1     slow       3.394800
U42     NOR2X1       slow       5.092200
U43     CLKINVX1     slow       3.394800
U44     NAND2X1      slow       5.092200
U45     NOR2X1       slow       5.092200
U46     NAND2X1      slow       5.092200
U47     NOR2X1       slow       5.092200
U48     CLKINVX1     slow       3.394800
-------------------------------------------------------
Total 25 cells                  156.160800
***** End Of Report *****
7. Odd Parity Checker (1) A7. Odd parity checker. Input x[7:0]; output y = 1 if there is an odd number of 1's in input x. [Figure labels: odd parity, even parity] Note: for an odd parity bit generator, the parity bit should be 1 if there is an even number of 1's.
7. Odd Parity Checker (2) Reduction XOR. Test inputs (number of 1's in parentheses): 0100_0101 (3), 0100_1101 (4), 1111_1101 (7).
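With the reduction XOR operator the whole checker fits in one line, as in the sketch below (the module name odd_parity_checker is an assumption).

```verilog
// Odd parity checker: ^x XORs all 8 bits of x into a single bit.
module odd_parity_checker (
    input  wire [7:0] x,
    output wire       y    // 1 when x contains an odd number of 1's
);
    assign y = ^x;
endmodule
```

For the three test inputs on this slide, y is 1, 0, and 1, matching the counts 3, 4, and 7.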
7. Odd Parity Checker (3)
7. Odd Parity Checker (4) Structure 1: [figure] Structure 2: [figure] Both structures produce the same result, but their delays differ considerably. With the same 7 XOR gates, Structure 1 has only three levels while Structure 2 has seven, so Structure 2 slows the circuit down.
7. Odd Parity Checker (5) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U8      XOR2X1       slow       11.881800
U9      XOR2X1       slow       11.881800
U10     XNOR2X1      slow       11.881800
U11     XNOR2X1      slow       11.881800
U12     XOR2X1       slow       11.881800
U13     XNOR2X1      slow       11.881800
U14     XNOR2X1      slow       11.881800
Total 7 cells                   83.172598
***** End Of Report *****
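The two XOR arrangements can be written out explicitly as below. This is only an illustration of the level count; the module names are hypothetical, and a synthesis tool is free to restructure either expression.

```verilog
// Structure 1 (balanced tree): 7 XOR gates arranged in 3 levels.
module parity_tree (input wire [7:0] x, output wire y);
    assign y = ((x[0] ^ x[1]) ^ (x[2] ^ x[3])) ^
               ((x[4] ^ x[5]) ^ (x[6] ^ x[7]));
endmodule

// Structure 2 (linear chain): the same 7 XOR gates, but 7 levels deep.
module parity_chain (input wire [7:0] x, output wire y);
    assign y = ((((((x[0] ^ x[1]) ^ x[2]) ^ x[3]) ^ x[4]) ^ x[5]) ^ x[6]) ^ x[7];
endmodule
```

Both modules compute the same parity; only the depth of the longest path differs.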
8. Absolute value function (1) A13. ABS function. Input a[7:0]; return the absolute value of a
8. Absolute value function (2) Verilog-1995: the keyword "signed" is not available. Example: 1001_1110 = -98, 0110_0010 = 98.
8. Absolute value function (3) Verilog-2001: the keyword "signed" can be used.
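The two coding styles might be contrasted as in the sketch below. The output name y, its 8-bit width, and both module names are assumptions. In both versions the most negative input, 1000_0000 (-128), has no positive 8-bit counterpart and is returned unchanged.

```verilog
// Verilog-1995 style: test the sign bit and form the two's complement by hand.
module abs_v1995 (a, y);
    input  [7:0] a;
    output [7:0] y;
    assign y = a[7] ? (~a + 8'd1) : a;   // invert and add 1 when negative
endmodule

// Verilog-2001 style: declare the input signed and use arithmetic directly.
module abs_v2001 (
    input  wire signed [7:0] a,
    output wire        [7:0] y
);
    assign y = (a < 0) ? -a : a;
endmodule
```

With a = 1001_1110 (-98), both modules return 0110_0010 (98).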
8. Absolute value function (4) Verilog 1995 coding synthesis result
8. Absolute value function (5) Verilog 2001 coding synthesis result is identical to that of Verilog 1995
8. Absolute value function (7) Synthesis Report by Design Vision
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U5      AO22X4       slow       15.276600
U7      AO22X4       slow       15.276600
U9      AO22X4       slow       15.276600
U11     AO22X4       slow       15.276600
U13     AO22X4       slow       15.276600
U15     AO22X4       slow       15.276600
U17     AO22X4       slow       15.276600
U18     NAND2X8      slow       20.368799
U20     CLKINVX1     slow       3.394800
U21     XOR2X1       slow       11.881800
U22     NAND2BX1     slow       6.789600
U23     CLKINVX1     slow       3.394800
U24     XNOR2X1      slow       11.881800
U25     NOR2BX1      slow       6.789600
U26     XNOR2X1      slow       11.881800
U27     NOR3BXL      slow       8.487000
U28     XOR2X1       slow       11.881800
U29     NAND2BX1     slow       6.789600
U30     XNOR2X1      slow       11.881800
U31     NOR3X1       slow       6.789600
U32     XNOR2X1      slow       11.881800
U33     NOR2X1       slow       5.092200
U34     XOR2X1       slow       11.881800
Total 23 cells                  258.004796
9. ALU design (1) A17. A simplified 8-bit ALU design with cmd[1:0] as a 2-bit OP code, A[7:0] and B[7:0] as two 8-bit input operands, and Y[7:0] as an 8-bit output. It also has 2 flags. Flag z = 1 if Y == 0. Flag c = 1 if a carry out at the MSB occurs when performing the addition.
cmd    operation
00     Y = A + B
01     Y = A - B
10     Y = A or B
11     Y = A and B
9. ALU design (2) Use “case” to describe different functions performed by the ALU Flag update
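A case-based ALU with the flag update could be sketched as follows. Port names follow the problem statement; the module name alu8 is an assumption, and clearing c for the non-add operations is one possible reading of the flag definition.

```verilog
// Simplified 8-bit ALU: cmd selects the operation, z and c are status flags.
module alu8 (
    input  wire [1:0] cmd,
    input  wire [7:0] A, B,
    output reg  [7:0] Y,
    output reg        z, c
);
    always @(*) begin
        c = 1'b0;                        // assumed: carry flag only meaningful for add
        case (cmd)
            2'b00:   {c, Y} = A + B;     // add, carry out of the MSB goes to c
            2'b01:   Y = A - B;          // subtract
            2'b10:   Y = A | B;          // bitwise OR
            2'b11:   Y = A & B;          // bitwise AND
            default: Y = 8'b0;
        endcase
        z = (Y == 8'b0);                 // flag update: zero result
    end
endmodule
```

For example, A = 200, B = 100, cmd = 00 gives Y = 44 with c = 1, since 300 does not fit in 8 bits.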
9. ALU design (3) Synthesized ALU circuit
9. ALU design (4) Synthesis report
Cell    Reference    Library    Area         Attributes
-------------------------------------------------------
U46     NOR2X1       slow       5.092200
U47     NAND4X1      slow       8.487000
U48     NAND4X1      slow       8.487000
U49     NOR3BXL      slow       8.487000
U50     CLKINVX1     slow       3.394800
U51     AOI222XL     slow       13.579200
U52     AO21X1       slow       8.487000
U53     CLKINVX1     slow       3.394800
U54     AOI222XL     slow       13.579200
U55     AO21X1       slow       8.487000
U56     CLKINVX1     slow       3.394800
U57     AOI222XL     slow       13.579200
U58     AO21X1       slow       8.487000
U59     CLKINVX1     slow       3.394800
U60     AOI222XL     slow       13.579200
U61     AO21X1       slow       8.487000
U62     CLKINVX1     slow       3.394800
U63     AOI222XL     slow       13.579200
U64     AO21X1       slow       8.487000
U65     CLKINVX1     slow       3.394800
U66     AOI222XL     slow       13.579200
U67     AO21X1       slow       8.487000
U68     CLKINVX1     slow       3.394800
U69     AOI222XL     slow       13.579200
U70     AO21X1       slow       8.487000
U71     CLKINVX1     slow       3.394800
U72     AOI222XL     slow       13.579200
U73     AO21X1       slow       8.487000
U74     NOR2X1       slow       5.092200
U75     CLKINVX1     slow       3.394800
U76     NOR2BX1      slow       6.789600
r302    addsub                  378.520212
-------------------------------------------------------
Total 32 cells                  628.038015
***** End Of Report *****
Part B. Data Storage and Memory
Outline
  8-bit Shift Register
  Multiply-and-Accumulate module
  256x16 Single Port RAM
1. 8-bit Shift Register (1) B2. 8-bit shift register, positive edge triggered, input din[7:0], cmd[1:0], output q[7:0]
Command    Operation
00         Load register
01         Shift left, LSB takes in a "0"
10         Logical shift right, MSB takes in a "0"
11         Arithmetic shift right, MSB is sign bit extension*
1. 8-bit Shift Register (2) Asynchronous reset, active high. A case statement selects the command. * Verilogger does not support the Verilog-2001 standard, so the arithmetic shift operators cannot be used; only logical shift right is used here. The arithmetic shift operators are >>> and <<<.
1. 8-bit Shift Register (3) Simulation:
(1) Load C0 => 1100_0000
(2) 1100_0000 << 1 => 1000_0000 (80)
(3) 1000_0000 >> 1 => 0100_0000 (40)
(4) 0100_0000 >> 1 => 0010_0000 (20)
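One way to write the register is sketched below. Unlike the slide's code, which falls back to a logical shift for command 11, this sketch builds the arithmetic shift right from a concatenation, which does not need the Verilog-2001 >>> operator. The names clk and rst and the module name shift_reg8 are assumptions.

```verilog
// 8-bit shift register with asynchronous, active-high reset.
module shift_reg8 (
    input  wire       clk, rst,
    input  wire [7:0] din,
    input  wire [1:0] cmd,
    output reg  [7:0] q
);
    always @(posedge clk or posedge rst) begin
        if (rst)
            q <= 8'b0;
        else begin
            case (cmd)
                2'b00:   q <= din;                 // load
                2'b01:   q <= {q[6:0], 1'b0};      // shift left, LSB takes 0
                2'b10:   q <= {1'b0, q[7:1]};      // logical shift right, MSB takes 0
                2'b11:   q <= {q[7], q[7:1]};      // arithmetic shift right, MSB repeats
                default: q <= q;
            endcase
        end
    end
endmodule
```

Loading C0 and then applying commands 01, 10, 10 produces 80, 40, 20 in hexadecimal.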
1. 8-bit Shift Register (4)
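1. 8-bit Shift Register (5) Area Report: Net Report: Fanout: the total number of objects driven by a net. Example: this design has 8 DFFs, so the fanout of clk is 8.
1. 8-bit Shift Register (6) Timing Report:
(1) Start and end points of the critical path; in this example, from q[1] to q[0].
(2) The critical path itself.
(3) Setup time of the flip-flop.
(4) Slack: the larger the better. A positive value means the flip-flop setup/hold time is met. A negative value means it is not met, and the clock rate must be lowered.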
2. Multiply-and-Accumulate module(1) B3. Multiply-and-Accumulate module Perform y[17:0] = a[7:0]*b[7:0] + acc[17:0], a and b are two input operands, and acc is the output of an accumulating register. Y is then loaded to the accumulating register on the rising edge of the clock.
2. Multiply-and-Accumulate module(2) Asynchronous Reset, Active High Feedback
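The feedback structure can be sketched as below: the register output acc is fed back into the adder, and y is loaded on each rising clock edge. The names clk and rst and the module name mac are assumptions.

```verilog
// Multiply-and-accumulate: y = a*b + acc, acc loads y on each rising clock edge.
module mac (
    input  wire        clk, rst,
    input  wire [7:0]  a, b,
    output reg  [17:0] acc
);
    wire [17:0] y = a * b + acc;         // evaluated at 18 bits, acc is the feedback

    always @(posedge clk or posedge rst) begin
        if (rst)
            acc <= 18'b0;                // asynchronous, active-high reset
        else
            acc <= y;
    end
endmodule
```

Feeding the input pairs (10, 20), (15, 25), (8, 6), (1, 2) in consecutive cycles accumulates 200, 575, 623, 625.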
2. Multiply-and-Accumulate module(3) Simulation (hexadecimal, decimal in parentheses):
(1) Start of input data
(2) A(10) * 14(20) = C8(200)
(3) F(15) * 19(25) + C8(200) = 23F(575)
(4) 8(8) * 6(6) + 23F(575) = 26F(623)
(5) 1(1) * 2(2) + 26F(623) = 271(625)
2. Multiply-and-Accumulate module(4) [Schematic labels: Multiplier, Adder, Output Register (acc or y)]
2. Multiply-and-Accumulate module(5) Area Report: Net Report:
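2. Multiply-and-Accumulate module(6) Timing Report: The slack is close to 0, so timing is only just met. The circuit can be adjusted to leave a little more margin, which makes the later stages of the design flow easier.
3. 256x16 Single Port RAM (1) B5. 256x16 single port memory. The memory module is addressed by an 8-bit address addr[7:0] and has a bi-directional data port "data[15:0]". The module has 2 control signals. rw: write if rw = 1, read if rw = 0. cs: active low chip select signal; the RAM functions only if cs = 0. The data port is high impedance if cs = 1. [Figure label: inout port]
3. 256x16 Single Port RAM (2) The code on the left declares a bidirectional inout port. It must be paired with the assign at the lower left, which decides when the port acts as an output and when as an input. When rw = 0 (read) and cs = 0 (chip enabled) the port is an output; otherwise it is high impedance, which is the state used for input. When rw = 0 (read), the data at the addressed location is output. The for loop is synthesizable, but note that its use differs from the C language: it only serves to expand regular, repetitive statements. In this example it expands automatically to: ram[0] <= 8'd0; ram[1] <= 8'd0; ram[2] <= 8'd0; ...
3. 256x16 Single Port RAM (3) When cs = 1, all functions are suspended and no register changes. When rw = 1, data is written to the location given by the address. When rw = 0, every register keeps its current value.
3. 256x16 Single Port RAM (4) Area Report: The total area is much larger than in the previous example. Building a RAM out of registers is usually not a good choice. Taking 1 Kbit as the dividing line, above that size DRAM gives a lower area and is the better choice.
3. 256x16 Single Port RAM (4) Net Report: The three red boxes on the left mark nets with fanout > 1000. Excessive fanout increases the capacitance that must be driven and lowers speed, so it should be avoided where possible. Constraints can be given to the compiler to limit the maximum fanout. The figure on the right shows that nets N527 and N528 are driven by ADDR[0] & ADDR[1].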
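Putting the behavior described on the last two slides together, a register-based RAM might be sketched as follows. The clock and reset names (clk, rst), the asynchronous reset style, and the module name ram_256x16 are assumptions; the array name ram and the tri-state assign follow the slide notes.

```verilog
// 256x16 single-port RAM built from registers, with a bidirectional data port.
module ram_256x16 (
    input  wire        clk, rst,
    input  wire        cs,          // active-low chip select
    input  wire        rw,          // 1 = write, 0 = read
    input  wire [7:0]  addr,
    inout  wire [15:0] data
);
    reg [15:0] ram [0:255];
    integer i;

    // Drive the port only while reading with the chip selected, else high impedance.
    assign data = (!cs && !rw) ? ram[addr] : 16'bz;

    always @(posedge clk or posedge rst) begin
        if (rst) begin
            for (i = 0; i < 256; i = i + 1)   // unrolled by synthesis into 256 assignments
                ram[i] <= 16'd0;
        end else if (!cs && rw) begin
            ram[addr] <= data;                // write; otherwise all words hold
        end
    end
endmodule
```

During a write the external driver puts a value on data while the module keeps its own driver at high impedance, so the two never fight over the bus.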
3. 256x16 Single Port RAM (5) Timing Report:
Part C. Counter And Finite State Machine
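Outline
  Clock Frequency Divider (divide by 4)
  4-bit Universal Counter
  PWM Module
  Debouncing Circuit Module
  ADD-XOR Compute
1. Clock Frequency Divider(x0.25)(1) C4. A clock frequency divider (divide by 4) using a 2-bit counter. Note that the duty cycle is 50%, i.e. the time the divided clock spends at 1 is equal to 50% of the total (divided) clock period. 50% duty cycle: within one clock period, the signal is 1 for half the time and 0 for the other half.
1. Clock Frequency Divider(x0.25)(2) A frequency divider can be built from a counter and a simple comparison. In this example, 4 clocks are merged into 1 clock: a 2-bit counter (4 states) is used, and the output is 0 when counter = 0 or 1, and 1 when counter = 2 or 3.
1. Clock Frequency Divider(x0.25)(3) Simulation (internal signal: counter):
(1) Counting starts.
(2) The counter advances but is still less than 2, so the output is 0.
(3) The counter advances and is now greater than or equal to 2, so the output is 1.
(4) Counting ends and the counter returns to zero.
clk_o has a period of 4 clk_i periods.
1. Clock Frequency Divider(x0.25)(4) The synthesis result resembles example C1: it is a simple 2-bit counter. Analyzing the counter's behavior (table below) shows that outputting reg[1] directly gives the divided clock.
Counter (reg[1] reg[0])    Output
0 0                        0
0 1                        0
1 0                        1
1 1                        1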
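The counter-plus-comparison idea can be sketched as below. The signal names clk_i, clk_o, and counter appear in the simulation slide; the reset input rst and the module name clk_div4 are assumptions.

```verilog
// Divide-by-4 clock divider with 50% duty cycle.
module clk_div4 (
    input  wire clk_i, rst,
    output wire clk_o
);
    reg [1:0] counter;

    always @(posedge clk_i or posedge rst) begin
        if (rst)
            counter <= 2'd0;
        else
            counter <= counter + 2'd1;   // wraps from 3 back to 0
    end

    assign clk_o = (counter >= 2'd2);    // 0 for counts 0 and 1, 1 for counts 2 and 3
endmodule
```

The comparison counter >= 2 is simply the upper counter bit, so a synthesis tool can reduce the output logic to a direct connection.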
1. Clock Frequency Divider(x0.25)(5) Area Report: Net Report:
1. Clock Frequency Divider(x0.25)(6) Timing Report:
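2. 4-bit Universal Counter (1) C6. 4-bit universal counter. The counter has a 4-bit input "data[3:0]" and a 4-bit output "count[3:0]". The counter also has the following control inputs, in decreasing priority:
  Reset: synchronous reset, active high (i.e. reset when 1)
  Load: set output "count[3:0]" to the value of input "data[3:0]"; synchronous load
  Enable: the counter counts only if enable is set to 1
  Up_Down: up counting if set to 1, down counting if set to 0
2. 4-bit Universal Counter (2) The four signals have a priority order: Reset > Load > Enable > Up_Down. This circuit is a simplified version of the homework assignment.
2. 4-bit Universal Counter (3) Synthesis of the Verilog if statement: an if statement is generally synthesized into multiplexers with priority. The Priority Encoder in Part A makes use of exactly this property. The following is the synthesis result of if / else if / else: [figure] In practice, depending on the synthesizer and on the behavior of the code, parallel multiplexers may be synthesized instead; the upper and lower figures achieve the same circuit behavior.
2. 4-bit Universal Counter (4) Simulation:
(1) Load and Enable are 1 and reset is 0, so the value C (12) is loaded.
(2) Enable is 0, so the counter does not change.
(3) Enable and up_down are 1, so the counter counts up: C + 1 = D (13).
(4) Same situation as above, so the counter counts up: 0 + 1 = 1 (1).
(5) Enable is 1 but up_down is 0, so the counter counts down: 1 - 1 = 0 (0).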
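The priority order maps naturally onto an if-else-if chain, as in the sketch below. The port names are lowercase versions of the control names in the problem statement; the module name univ_counter4 is an assumption.

```verilog
// 4-bit universal counter: priority is reset > load > enable > up_down.
module univ_counter4 (
    input  wire       clk,
    input  wire       reset, load, enable, up_down,
    input  wire [3:0] data,
    output reg  [3:0] count
);
    always @(posedge clk) begin              // clock only: reset and load are synchronous
        if (reset)
            count <= 4'd0;
        else if (load)
            count <= data;
        else if (enable)
            count <= up_down ? count + 4'd1  // up_down = 1: count up
                             : count - 4'd1; // up_down = 0: count down
        // no final else: count holds its value when enable is 0
    end
endmodule
```

Leaving out the final else is safe here because the block is clocked, so the missing branch means "keep the stored value" rather than an unintended latch.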
2. 4-bit Universal Counter (5)
2. 4-bit Universal Counter (6) Area Report: Net Report:
2. 4-bit Universal Counter (7) Timing Report:
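3. PWM Module (1) C8. A PWM (pulse width modulation) module to control the brightness level of an LED. Input clk, ctrl[1:0], output y. ctrl is the control signal; y is the output, which takes the waveforms shown below according to the control signal.
3. PWM Module (2) Introduction to PWM: PWM (Pulse Width Modulation) is a technique for converting an analog signal into pulses. The period of the resulting pulses is generally fixed, while their duty cycle changes with the magnitude of the analog signal. In many analog circuits the voltage or current can be used directly for control, for example the volume control of household appliances or the brightness control of LED bulbs. In general the modulation frequency needed by the load must be higher than 10 Hz; in practical applications it is about 1 kHz to 200 kHz.
3. PWM Module (3) This problem can be solved in the same way as the frequency divider. The duty cycles 0%, 25%, 50%, and 75% in the problem divide the whole period into exactly 4 equal parts, so a 2-bit counter is used. The counter can be regarded as the state register and next state logic of an FSM. A case statement builds a multiplexer, with ctrl as the control signal that selects which brightness level (duty cycle) is output. This can be regarded as the output logic of an FSM. The duty cycle is controlled in a way similar to the earlier frequency divider and VGA sync signal generation.
3. PWM Module (4) Each interval is 4 clock cycles long. ctrl = 0: duty cycle = 0%. [Waveform markers (1) to (4)]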
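A sketch of this counter-plus-multiplexer structure follows. The mapping of ctrl values 1, 2, 3 to 25%, 50%, 75% is an assumption (the slides only state ctrl = 0 gives 0%), as are the position of the high phase within the period, the module name pwm, and the initial counter value, which is there only so the simulation starts from a known state because the problem lists no reset input.

```verilog
// PWM with a 4-cycle period: ctrl selects a duty cycle of 0, 25, 50, or 75 percent.
module pwm (
    input  wire       clk,
    input  wire [1:0] ctrl,
    output reg        y
);
    reg [1:0] counter = 2'd0;            // assumed initial value for simulation

    // State register and next state logic: a free-running 2-bit counter.
    always @(posedge clk)
        counter <= counter + 2'd1;

    // Output logic: a multiplexer that picks one comparison per ctrl value.
    always @(*) begin
        case (ctrl)
            2'd0:    y = 1'b0;                 // 0%
            2'd1:    y = (counter == 2'd0);    // 25%: high for 1 of 4 cycles
            2'd2:    y = (counter <  2'd2);    // 50%: high for 2 of 4 cycles
            2'd3:    y = (counter <  2'd3);    // 75%: high for 3 of 4 cycles
            default: y = 1'b0;
        endcase
    end
endmodule
```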
3. PWM Module (5) [Schematic labels: State Register, Next State Logic, Output Logic, PWM output]
3. PWM Module (6) Area Report: Net Report:
3. PWM Module (7) Timing Report:
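4. Debouncing Circuit Module(1) C10. Design a debouncing circuit module. Bouncing is often caused by a mechanical switch that takes time to settle when switching occurs. The debouncing circuit samples the input signal at the rising edges of the clock and changes its output state only when a consistent signal is sampled in 3 consecutive clock cycles. The debounce circuit removes the bouncing produced by a mechanical button. The example shows that the output goes high only after the input has been High for 3 consecutive clock cycles; the bounce at the point marked by the arrow does not affect the output.
4. Debouncing Circuit Module(2) There are 4 states:
1. WAIT: the state when the input is 0.
2. DETECT_1: the input is detected High for the first time.
3. DETECT_2: the second clock with the input held High.
4. DETECT_3: the third clock with the input held High. As long as the input stays High, the FSM does not return to WAIT.
State register: 4 states, hence 2 bits.
4. Debouncing Circuit Module(3) Next state logic: continuously decides, based on in (the input), whether to move to another state. If the input stays HIGH, the FSM ends up holding the DETECT_3 state. Output logic: the output is asserted only in DETECT_3; states that are not listed fall through to the default, which sets the output to 0.
4. Debouncing Circuit Module(4) Simulation:
(1) Even though In is detected High, out is still Low; the state at this point should be DETECT_1.
(2) Same situation as above; the FSM now enters DETECT_2.
(3) The input persists; the FSM enters DETECT_3 and out goes HIGH.
4. Debouncing Circuit Module(6) [Figure labels: output, input] Comparing the figure above with the earlier examples shows that FSM architectures are all much alike: they consist of a State Register, Next State Logic, and Output Logic. The circuit grows as the number of states and outputs increases, and the number of registers also changes with the state encoding. Common encodings are:
One Hot: each state has exactly one bit set to 1; uses the most registers, but the logic is simpler.
Gray Code: a state differs from its preceding and following states by only 1 bit; high stability.
Sequence: states are encoded as ordinary binary numbers; all examples here use this method.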
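The three FSM parts described above can be sketched as three always blocks. The state names come from the slides; the binary state values, the port names clk, rst, in, and out, and the module name debounce are assumptions.

```verilog
// Debouncer FSM: out goes high only after in is sampled high on 3 consecutive clocks.
module debounce (
    input  wire clk, rst, in,
    output reg  out
);
    localparam WAIT     = 2'd0,
               DETECT_1 = 2'd1,
               DETECT_2 = 2'd2,
               DETECT_3 = 2'd3;

    reg [1:0] state, next_state;

    // State register: 4 states, 2 bits.
    always @(posedge clk or posedge rst) begin
        if (rst) state <= WAIT;
        else     state <= next_state;
    end

    // Next state logic: advance while in is high, fall back to WAIT when it drops.
    always @(*) begin
        if (!in)
            next_state = WAIT;
        else begin
            case (state)
                WAIT:     next_state = DETECT_1;
                DETECT_1: next_state = DETECT_2;
                DETECT_2: next_state = DETECT_3;
                default:  next_state = DETECT_3;   // hold while in stays high
            endcase
        end
    end

    // Output logic: asserted only in DETECT_3, the default covers every other state.
    always @(*) begin
        case (state)
            DETECT_3: out = 1'b1;
            default:  out = 1'b0;
        endcase
    end
endmodule
```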
4. Debouncing Circuit Module(5) Area Report: Net Report:
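4. Debouncing Circuit Module(7) Timing Report: In general, the FSM is not the critical path of a circuit; the critical path usually lies in complex arithmetic such as addition, subtraction, multiplication, and division.
5. ADD-XOR Compute(1) C12. Verilog design for the data path shown below. The inputs are added first, then all bits of the sum are XORed together.
5. ADD-XOR Compute(2) The design follows the architecture diagram exactly: the inputs are placed in registers, their sum is placed in a register, and finally the XOR result is placed in a register. This kind of operator (a reduction operator) was used in Part A. The following operators are of the same type, and each needs only one operand.
Operator    Description
^a          XOR of all bits of a
|b          OR of all bits of b
&c          AND of all bits of c
5. ADD-XOR Compute(3) Simulation: 1. A + 8 = 12 (0001_0010), ^(12) = 0  2. 8D + 4D = DA (1101_1010), ^(DA) = 1 [Waveform labels: Reg A, Reg B, Reg C, Reg D]
5. ADD-XOR Compute(4) The XOR arrangement is similar in structure to the Odd Parity Checker (Part A). [Schematic labels: Reg A, Reg B, Adder, Reg C, Reg D]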
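The register-add-register-XOR-register data path might be sketched as below. The 8-bit width is inferred from the simulation values; the signal names reg_a to reg_c, the output name d, the absence of a reset, and the module name add_xor are assumptions.

```verilog
// ADD-XOR data path: inputs -> Reg A/B -> adder -> Reg C -> reduction XOR -> Reg D.
module add_xor (
    input  wire       clk,
    input  wire [7:0] a, b,
    output reg        d              // Reg D: parity of the registered sum
);
    reg [7:0] reg_a, reg_b, reg_c;

    always @(posedge clk) begin
        reg_a <= a;                  // stage 1: capture the inputs
        reg_b <= b;
        reg_c <= reg_a + reg_b;      // stage 2: register the sum
        d     <= ^reg_c;             // stage 3: XOR all bits of the sum
    end
endmodule
```

Because every stage is registered, a result appears three clock edges after its inputs. For inputs 0A and 08 (hexadecimal) the sum is 12 and d becomes 0; for 8D and 4D the sum is DA and d becomes 1.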
5. ADD-XOR Compute(5) Area Report: Net Report:
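5. ADD-XOR Compute(6) Timing Report: The slack is still sufficient, but much smaller than in the previous examples. A close look at the critical path shows that it is concentrated in the adder. Compared with problem 3 of Part B (B3), the structure is similar, but the added registers raise the slack and allow a faster clock. This idea is pipelining.
Thank you for listening!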