1 Lucas-Lehmer Primality Tester Team: W-4 Nathan Stohs W4-1 Brian Johnson W4-2 Joe Hurley W4-3 Marques Johnson W4-4 Design Manager: Prateek Goenka.

Presentation transcript:

1 Lucas-Lehmer Primality Tester Team: W-4 Nathan Stohs W4-1 Brian Johnson W4-2 Joe Hurley W4-3 Marques Johnson W4-4 Design Manager: Prateek Goenka

2 Agenda Background (Marques) Project Description (Marques) Algorithmic Description (Joe) Data Flow/Block Diagram (Joe) Design Process (Nathan) Simulations (Nathan) Floorplan/Layout (Brian) Conclusions (Brian)

3 History of $2^P - 1$. [...]th century: it was believed that $2^P - 1$ was prime for all prime P. 1536: Hudalricus Regius proved [...] was not prime. French monk Marin Mersenne published Cogitata Physica-Mathematica, where he stated that $2^P - 1$ was prime for P = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127 and 257.

4 Lucas-Lehmer. François Édouard Anatole Lucas: in 1876 he proved that the number is prime using his own methods. Derrick Lehmer: in 1930 he refined Lucas's method.

5 Make History. December [...]: [...]rd Known Mersenne Prime Found!! Dr. Curtis Cooper and Dr. Steven Boone, professors at Central Missouri State University: $2^{30,402,457} - 1$

6 Prime Number Competitions Electronic Frontier Foundation $50,000 to the first individual or group who discovers a prime number with at least 1,000,000 decimal digits (awarded Apr. 6, 2000) $100,000 to the first individual or group who discovers a prime number with at least 10,000,000 decimal digits $150,000 to the first individual or group who discovers a prime number with at least 100,000,000 decimal digits $250,000 to the first individual or group who discovers a prime number with at least 1,000,000,000 decimal digits

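7 Table with columns: rank, prime, digits, who, when, reference. Surviving entries (who, when, reference): G9, 2005, Mersenne; G8, 2005, Mersenne; G7, 2004, Mersenne; G6, 2003, Mersenne; G5, 2001, Mersenne; SB; SB; G4, 1999, Mersenne; SB; SB9, 2005.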

8 Mersenne Prime Algorithm. Only used for numbers of the form $2^P - 1$. For $P > 2$, $2^P - 1$ is prime if and only if $S_{P-2}$ is zero in this sequence: $S_0 = 4$, $S_N = (S_{N-1}^2 - 2) \bmod (2^P - 1)$.

9 Example to show $2^7 - 1$ is prime. $2^7 - 1 = 127$. $S_0 = 4$; $S_1 = (4 \cdot 4 - 2) \bmod 127 = 14$; $S_2 = (14 \cdot 14 - 2) \bmod 127 = 67$; $S_3 = (67 \cdot 67 - 2) \bmod 127 = 42$; $S_4 = (42 \cdot 42 - 2) \bmod 127 = 111$; $S_5 = (111 \cdot 111 - 2) \bmod 127 = 0$.
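
The recurrence can be checked in software before any hardware exists. The following is a minimal sketch, not the team's program: the function name `lucas_lehmer` and the 31-bit limit on `p` are assumptions made here so that the square fits in 64 bits, and the reduction uses the ordinary `%` operator rather than the division-free scheme the chip uses.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if 2^p - 1 passes the Lucas-Lehmer test, 0 otherwise.
   Hypothetical helper: assumes 2 < p <= 31 so that s * s fits in 64 bits. */
int lucas_lehmer(unsigned p)
{
    assert(p > 2 && p <= 31);
    uint64_t m = ((uint64_t)1 << p) - 1;   /* candidate 2^p - 1 */
    uint64_t s = 4;                        /* S_0 */
    for (unsigned k = 1; k + 2 <= p; k++)  /* p - 2 rounds: S_1 .. S_(p-2) */
        s = (s * s + m - 2) % m;           /* adding m keeps the value non-negative */
    return s == 0;
}
```

With this sketch, `lucas_lehmer(7)` returns 1 and `lucas_lehmer(11)` returns 0, in line with the worked example for 127.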

10 Algorithmic description. Computations needed: squaring (not a problem…); add/subtract (not a problem…); modulo ($2^n - 1$) multiplication (?). We knew the necessary computations, but how to translate that to gates?

11 Mechanisms behind the math. If done with brute force, modulo $2^n - 1$ could have been ugly: we would need to square and then find the remainder via division. Luckily, for that specific computation, math is on our side: the $2^n - 1$ constraint saves us from division, as will be seen. A quick search on [...] produced inspiration. Reto Zimmermann. Efficient VLSI Implementation of Modulo ($2^n \pm 1$) Addition and Multiplication. Computer Arithmetic, 1999; p
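
Why the $2^n - 1$ modulus avoids division: since $2^n \equiv 1 \pmod{2^n - 1}$, the bits above position n can simply be added back onto the low n bits. The sketch below illustrates that identity in C; the function name `mod_mersenne` and its operand widths are assumptions for illustration and do not come from the slides or the cited paper.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce x modulo 2^n - 1 without dividing, for 1 <= n <= 31.
   Because 2^n is congruent to 1, the bits above position n are
   folded back onto the low n bits by addition. */
uint32_t mod_mersenne(uint64_t x, unsigned n)
{
    assert(n >= 1 && n <= 31);
    uint64_t m = ((uint64_t)1 << n) - 1;
    while (x > m)
        x = (x & m) + (x >> n);          /* low n bits + everything above them */
    return (x == m) ? 0 : (uint32_t)x;   /* all ones is another encoding of 0 */
}
```

For example, 194 splits into a high part of 1 and a low part of 66, giving 67, which is the value of S2 in the p = 7 example.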

12 Useful Math: Multiplication. Just like any other multiplication, a modulo multiplication can be computed by (modulo) summing the partial products. So modulo multiplication is multiplication using a modulo adder. From the Zimmermann paper.
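
As a rough software picture of "multiplication using a modulo adder": each partial product x·2^i modulo $2^n - 1$ is just x rotated left by i within n bits, and the rotated terms are accumulated with an adder that subtracts the modulus once when needed. This is a sketch under assumptions, not the paper's or the team's implementation; the names `mod_add`, `rotl_n` and `mod_mul` and the 16-bit limit are chosen here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Modulo (2^n - 1) adder for operands already in [0, 2^n - 1):
   a plain add followed by one conditional subtract. */
static uint32_t mod_add(uint32_t a, uint32_t b, uint32_t m)
{
    uint32_t s = a + b;
    return (s >= m) ? s - m : s;
}

/* x * 2^i mod (2^n - 1) equals x rotated left by i within n bits. */
static uint32_t rotl_n(uint32_t x, unsigned i, unsigned n, uint32_t m)
{
    i %= n;
    if (i == 0)
        return x;
    return ((x << i) & m) | (x >> (n - i));
}

/* Multiply x and y modulo 2^n - 1 by summing the partial products. */
uint32_t mod_mul(uint32_t x, uint32_t y, unsigned n)
{
    assert(n >= 2 && n <= 16);
    uint32_t m = (1u << n) - 1;
    assert(x < m && y < m);
    uint32_t acc = 0;
    for (unsigned i = 0; i < n; i++)
        if ((x >> i) & 1u)                       /* bit i of x selects the term */
            acc = mod_add(acc, rotl_n(y, i, n, m), m);
    return acc;
}
```

Because a rotated operand is never all ones when the operand itself is below the modulus, each accumulation stays below twice the modulus and a single subtract is enough.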

13 Block Diagram (figure). Blocks: FSM (start, done), Mod Calc, Mod add, Count, Subtract 2, Compare, Counter, Next Partial Product, and 16-bit registers; input P, output Out. The partial-product loop runs 16 times per modular multiplication and the outer loop runs P-2 times, e.g. S1 = 14, S2 = 67, ..., S5 = 0 for mod 127.

14 Design Process. The process so far: found the mathematical means (core algorithm); found the computational means (modulo multiplier, adder). From the above, a high-level C program was written in a manner that would translate easily to Verilog and gates, or at least to more standard operations:

```c
/* Accumulates value * value + offset modulo 2^p - 1 using only shifts,
   adds and one conditional subtract (inputs assumed below 2^p - 1). */
int mod_square_minus(int value, int p, int offset) {
    int acc, i;
    int mod = (1 << p) - 1;
    for (acc = offset, i = 0; i < (int)(sizeof(int) * 8 - 1); i++) {
        int a = (value >> i) & 1;          /* bit i of the multiplier */
        int temp;
        if (a) {
            /* bits of (value << i) that wrap past position p */
            if (i - p > 0)
                temp = value << (i - p);
            else
                temp = value >> (p - i);
            /* wrapped bits plus the bits that stay below position p */
            acc = acc + temp + ((value << i) & ((1 << p) - 1));
        }
        if (acc >= mod)
            acc = acc - mod;               /* keep the running sum reduced */
    }
    return acc;
}
```

This translated easily into behavioral Verilog and was readily turned into a gate-level implementation. Essentially, it was written in a more low-level manner.

15 Design Process. The rest of the design can simply be thought of as a wrapper for the modulo multiplier. The following slides contain Verilog code that was taken directly from the C code above. Top level of multiplier:

```verilog
module mod_mult(out, itrCount, x, y, mod, p, reset, en, clk);
  input  [15:0] x, y, mod, p;
  output [15:0] out;
  input         reset, en, clk;
  wire   [15:0] pp, ma0, temp;
  output [3:0]  itrCount;

  counter         mycount(itrCount, reset, en, clk);    // itrCount is the partial-product index
  partial_product ppg(pp, x, y, itrCount, mod, p);       // one partial product per count value
  mod_add         modAdder(out, pp, temp, mod);          // modulo sum of pp and temp
  dff_16_lp       partial(clk, out, temp, reset, en);    // register between out and temp
endmodule
```

16 Partial Product Unit w/ modulo reduction:

```verilog
module partial_product(out, x, y, i, mod, p);
  output [15:0] out;
  input  [15:0] x, y, mod, p;
  input  [3:0]  i;
  wire   [15:0] diff1, diff2, added, result, corrected, final;
  wire   [15:0] high, low, shifted, toadd;
  wire          cout1, cout2, ithbit, toobig;

  // i - p and p - i: shift amounts for the bits that wrap around
  sub_16 difference1(diff1, cout1, {12'b0, i}, p);
  sub_16 difference2(diff2, cout2, p, {12'b0, i});
  shift_left  shiftL(high, y, diff1[3:0]);
  shift_right shiftR(low, y, diff2[3:0]);
  mux16 choose(high, low, shifted, cout1);

  // (y << i) & mod: bits that stay below position p
  shift_left shiftL2(toadd, y, i);
  and16 bigand(added, toadd, mod);

  fulladder_16 addhighlow(.out(result), .xin(added), .yin(shifted), .cin({1'b0}), .cout(nowhere));
  sub_16 correct(.out(corrected), .cout(toobig), .xin(mod), .yin(result));
  mux16 correctionMux(.out(final), .high(corrected), .low(result), .sel(toobig));

  // bit i of x gates the partial product
  shift_right ibit({15'b0, ithbit}, x, i);
  select16 checkfor0(.out(out), .x(result), .sel(ithbit));
endmodule
```

17 Modulo Adder:

```verilog
module mod_add(out, x, y, mod);
  input  [15:0] x, y, mod;
  output [15:0] out;
  wire          cout, isDouble, cin;
  wire   [15:0] plus, lowbits, done, mod_bar, check;

  fulladder_16 add(.out(plus), .xin(x), .yin(y), .cin(cin), .cout());
  invert_16 inverter(mod_bar, mod);
  and16 hihnbits(check, plus, mod_bar);   // bits of the sum above the modulus width
  and16 lownbits(done, plus, mod);        // bits of the sum within the modulus width
  // any high bit set feeds back as the carry-in
  or8 (cin, check[0], check[1], check[2], check[3], check[4], check[5], check[6], check[7],
       check[8], check[9], check[10], check[11], check[12], check[13], check[14], check[15]);
  compare_16 checkfordouble(isDouble, done, 16'b1111_1111_1111_1111);
  mux16 fixdouble(.out(out), .high(16'b0), .low(done), .sel(isDouble));
endmodule
```

18 Final Design Process Notes. Lessons learned: never tweak the schematics without retesting the Verilog first. Timing issues can be subtle, and Verilog is better than schematics for catching them and for quickly fixing and retesting. Of the total time spent during this phase, roughly half was on the "core" and the FSM, the rest on the "wrapper".

19 Road to verification: C. Two examples of the high-level C implementation:

```
Tyrion:~/Desktop/15525 nstohs$ ./prime4 7
round 1: (4 * 4 - 2) mod 127 = 14
round 2: (14 * 14 - 2) mod 127 = 67
round 3: (67 * 67 - 2) mod 127 = 42
round 4: (42 * 42 - 2) mod 127 = 111
round 5: (111 * 111 - 2) mod 127 = 0
[...] is prime
Tyrion:~/Desktop/15525 nstohs$ ./prime4 11
round 1: (4 * 4 - 2) mod 2047 = 14
round 2: (14 * 14 - 2) mod 2047 = 194
round 3: (194 * 194 - 2) mod 2047 = 788
round 4: (788 * 788 - 2) mod 2047 = 701
round 5: (701 * 701 - 2) mod 2047 = 119
round 6: (119 * 119 - 2) mod 2047 = 1877
round 7: (1877 * 1877 - 2) mod 2047 = 240
round 8: (240 * 240 - 2) mod 2047 = 282
round 9: (282 * 282 - 2) mod 2047 = 1736
[...] is not prime
```

20 Road to verification: Verilog. Samples of Verilog verification output. Tests were either specific tests on important units such as Partial_Product, or top-level tests. Note that these are the same results as those generated from the C code.

```
Partial Product Unit, p = 7
ppOut= 56, x= 14, y= 14, i= 2, mod= 127, p= 7
ppOut= 112, x= 14, y= 14, i= 3, mod= 127, p= 7
ppOut= 0, x= 14, y= 14, i= 4, mod= 127, p= 7
ppOut= 0, x= 14, y= 14, i= 5, mod= 127, p= 7

Top Level, p = 7
itrOut= x
itrOut= 4
itrOut= 14
itrOut= 67
itrOut= 42
itrOut= 111
itrOut= 0

Top Level, p = 11
itrOut= x
itrOut= 4
itrOut= 14
itrOut= 194
itrOut= 788
itrOut= 701
itrOut= 119
itrOut= 1877
…
```
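
The Partial_Product values in the trace above can be cross-checked with a small behavioural model. This is a sketch that mirrors the split into a masked left shift and a right shift seen in the C routine and the Verilog unit; the function name `partial_product` and its argument order are assumptions for illustration, and the model is restricted to i < p <= 15.

```c
#include <assert.h>
#include <stdint.h>

/* i-th partial product of x * y modulo 2^p - 1.
   Hypothetical model: assumes i < p <= 15 and y < 2^p. */
uint16_t partial_product(uint16_t x, uint16_t y, unsigned i, unsigned p)
{
    assert(p >= 1 && p <= 15 && i < p);
    uint32_t mod  = (1u << p) - 1;
    uint32_t low  = ((uint32_t)y << i) & mod;    /* bits that stay below position p */
    uint32_t high = (uint32_t)y >> (p - i);      /* bits that wrap around to the bottom */
    uint32_t sum  = low + high;                  /* the two parts never overlap */
    return ((x >> i) & 1u) ? (uint16_t)sum : 0;  /* gated by bit i of x */
}
```

For x = y = 14 and p = 7 this gives 56 at i = 2, 112 at i = 3 and 0 at i = 4, the same values as the ppOut lines.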

21 Road to verification: Schematic I. Schematic test of our modular adder: [...] mod 127 = 69

22 Road to verification: Schematic II Plot of the top level output after a single iteration, p=7 Output after a single iteration is 14, the expected value.

23 Road to verification: Schematic III

24 Road to verification: Intermission Disk Space required for a full-length schematic test of p=7 : 6 GB Time required for a full-length schematic test of p=7 : 5 hours Disk Space required for a full-length extractedRC test of p=7 : 20 GB Time required for a full-length extractedRC test of p=7 : 8 hours Simulations become lengthy due to tests needing to be “deep” to be useful.

25 Layout: ExtractedRC – Full Run

26 Timing. To determine the bounds of our clock, Pathmill was used once major portions of the schematic were complete. The critical path through our design is one loop through the modular multiplier, which runs through the modular adder and the partial products module. The Pathmill delay was 9 ns through the modular adder and 5.2 ns through the partial products module. This puts our total delay at 14.2 ns, which limits the schematic clock to about 70 MHz. For extractedRC, due in part to simulation issues, a conservative 50 MHz was chosen as the final clock.
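
As a quick check of the arithmetic behind these figures, assuming the two Pathmill delays simply add and ignoring register setup time and clock skew (neither is listed on the slide):

```latex
t_{\text{crit}} = 9\,\text{ns} + 5.2\,\text{ns} = 14.2\,\text{ns}, \qquad
f_{\max} = \frac{1}{t_{\text{crit}}} = \frac{1}{14.2\,\text{ns}} \approx 70.4\,\text{MHz}
```

The chosen 50 MHz clock corresponds to a 20 ns period, which leaves about 5.8 ns of margin over the 14.2 ns path under the same assumptions.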

27 Issues extractedRC of partial_product module Registers switch –Custom design to DFFs with muxes Switching from parallel calculations to series –Transistor count vs. clock cycles Syncing up design between people –Transferring files –Different design styles LONG simulation times Floorplanning –Too much emphasis on aspect ratios and not enough on wiring –Couldn’t decide on one set floorplan

28 Floorplan v1.0

29 Floorplan v2.0

30 Final Floorplan

31 Pin Specifications
Pin      Type      # of Pins
Vdd!     In/Out    1
Gnd!     In/Out    1
p        In        16
clk      In        1
start    In        1
Done     Out       1
out      Out       1
Total    -         22

32 Initial Module Specifications
Module             Transistor Count    Area (µm²)    Transistor Density
FSM                …                   …             …
mod_p              2,440               7,…           …
mod_add            1,282               9,…           …
partial_product    8,676               65,…          …
count              1,656               6,…           …
sub_16             704                 3,…           …
Registers          1,848               6,…           …
compare            …                   …             …
Total              16,942              97,700        .17

33 Final Module Specifications
Module             Transistor Count    Area (µm²)    Transistor Density    Aspect Ratio
FSM                152                 1,…           …                     …
mod_p              1,280               8,…           …                     …
mod_add            1,168               5,…           …                     …
partial_product    7,520               54,…          …                     …
count              1,424               8,…           …                     …
sub_16             576                 2,…           …                     …
Registers          896                 6,…           …                     …
compare            …                   …             …                     …
Total              13,702              86,…          …                     …

34 Chip Specifications Transistor Count: 13,702 Size: µm x µm Area: 86,621µm² Aspect Ratio: 1.01:1 Density: 0.16 transistors/µm²

35 Final Floorplan

36 Final Floorplan

37 Partial Product (layout figure). Labeled blocks: shift_right, shift_left, shift_right, shift_left, adder, 16-bit and, Select 16, Sub_16, mux.

38 Poly Layer Density: 7.14%

39 Active Layer Density: 8.76%

40 Metal1 Layer Density: 23.86%

41 Metal2 Layer Density: 19.97%

42 Metal3 Layer Density: 11.30%

43 Metal4 Layer Density: 10.34%

44 Conclusions Plan for buffers -Will be hard to put them in after the fact Your design will change dramatically from start to finish so be flexible Communication is key Do layout in parallel