1
Random Number Generator
Dmitriy Solmonov (W1-1), David Levitt (W1-2), Jesse Guss (W1-3), Sirisha Pillalamarri (W1-4), Matt Russo (W1-5)
Design Manager – Thiago Hersan
March 1, 2006
Component Layout and Floorplan
Project Objective: Create a Cryptographically Secure Pseudo-Random Number Generator
2
Need for Encryption Explain how a good random number can make data transfer that much more secure.
3
Random Number? Pseudo-random number generator Uses RC4 encryption algorithm –Cryptographically secure Internally Updated Seed –not in programmer's visible state –hacker
4
Usage
5
Demand Potential markets –Defense and Intelligence Organizations –Gambling –Component of future secure mobile communications
6
The IBAA Algorithm
#define ALPHA (8)
#define SIZE (1<<ALPHA)
#define ind(x) ((x)&(0x1F))
#define barrel(a) (((a)<<19)^((a)>>13)) /* beta=32, shift=19 */
…
y=y1+b;
m[ind(i)]=y;
b=m[ind(y>>ALPHA)]+x;
r[ind(i)]=b;
for(i=0;i<SIZE;++i){
  x=m[ind(i)];
  a=barrel(a)+m[ind(i+16)];
  y1=m[ind(x)]+a;
  y=y1+b;
  m[ind(i)]=y;
  b=m[ind(y>>ALPHA)]+x;
  r[ind(i)]=b;
}
7
Algorithm Animation TBC
8
Algorithm to Architecture Explain progression from C code to choice of hardware.
9
Algorithm to Architecture Explain the choice for a 2-Stage Pipeline with multiple cycles per stage.
10
Algorithm to Architecture Explain why 4 cycles per stage yields the best throughput under hardware assumptions
11
Block diagram labels: SRAM (M), SRAM (R), FSM, Adder, Counter, Control Logic, Register, Counter, (M1, M2, M3) Registers, Adder, (X) Reg, (B) Reg, (Y) Reg, Adder, (Y1) Reg, Adder, (A) Reg, (M4) Reg
typedef unsigned int u4; /* unsigned four bytes, 32 bits */
#define ALPHA (8)
#define SIZE (1<<ALPHA)
#define ind(x) ((x)&(SIZE-1))
#define barrel(a) (((a)<<19)^((a)>>13)) /* beta=32,shift=19 */
static void ibaa(m,r,aa,bb)
u4 *m;  /* Memory: array of SIZE ALPHA-bit terms */
u4 *r;  /* Results: the sequence, same size as m */
u4 *aa; /* Accumulator: a single value */
u4 *bb; /* the previous result */
{
  register u4 a,b,x,y,i;
  a = *aa; b = *bb;
  for (i=0; i<SIZE; ++i) {
    x = m[i];
    a = barrel(a) + m[ind(i+(SIZE/2))]; /* set a */
    m[i] = y = m[ind(x)] + a + b;       /* set m */
    r[i] = b = m[ind(y>>ALPHA)] + x;    /* set r */
  }
  *bb = b; *aa = a;
}
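The loop on slide 6 splits the "set m" line into separate additions (y1, then y), which is what lets each addition map onto its own adder. The following is a minimal sketch, not the team's code, that restates the listing above in C99 and checks that the split form produces the same state and results. The names `ibaa_ref`, `ibaa_split`, and `ibaa_forms_agree` are hypothetical, and the LCG used to fill `m` is arbitrary test data, not a secure seeding method.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ALPHA 8
#define SIZE (1u << ALPHA)
#define IND(x) ((x) & (SIZE - 1u))
#define BARREL(a) (((a) << 19) ^ ((a) >> 13))

/* Reference form: same statements as the listing above, C99 prototype. */
static void ibaa_ref(uint32_t *m, uint32_t *r, uint32_t *aa, uint32_t *bb)
{
    uint32_t a = *aa, b = *bb, x, y;
    for (uint32_t i = 0; i < SIZE; ++i) {
        x = m[i];
        a = BARREL(a) + m[IND(i + SIZE / 2)];
        m[i] = y = m[IND(x)] + a + b;
        r[i] = b = m[IND(y >> ALPHA)] + x;
    }
    *aa = a;
    *bb = b;
}

/* Split form: every addition is its own step, as on slide 6. */
static void ibaa_split(uint32_t *m, uint32_t *r, uint32_t *aa, uint32_t *bb)
{
    uint32_t a = *aa, b = *bb;
    for (uint32_t i = 0; i < SIZE; ++i) {
        uint32_t x  = m[i];
        a           = BARREL(a) + m[IND(i + SIZE / 2)];
        uint32_t y1 = m[IND(x)] + a;
        uint32_t y  = y1 + b;
        m[i]        = y;
        b           = m[IND(y >> ALPHA)] + x;
        r[i]        = b;
    }
    *aa = a;
    *bb = b;
}

/* Hypothetical check: run both forms from the same state, compare all outputs.
   Returns 1 if memory, results, a, and b match after every round. */
static int ibaa_forms_agree(uint32_t seed, int rounds)
{
    uint32_t m1[SIZE], m2[SIZE], r1[SIZE], r2[SIZE];
    uint32_t a1 = 0, b1 = 0, a2 = 0, b2 = 0;
    assert(rounds > 0);
    for (uint32_t i = 0; i < SIZE; ++i) {
        seed = seed * 1664525u + 1013904223u; /* arbitrary filler, not secure */
        m1[i] = m2[i] = seed;
    }
    for (int k = 0; k < rounds; ++k) {
        ibaa_ref(m1, r1, &a1, &b1);
        ibaa_split(m2, r2, &a2, &b2);
        if (memcmp(r1, r2, sizeof r1) != 0 || memcmp(m1, m2, sizeof m1) != 0)
            return 0;
    }
    return a1 == a2 && b1 == b2;
}
```

Calling `ibaa_forms_agree(1u, 4)` compares four consecutive passes; the same pattern could be reused to compare a C model against simulation dumps.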
12
Floorplan Evolution: #1
13
Floorplan #2
14
Final Floorplan
15
Animation showing what happens on every cycle of the loop.
16
DFM & ME The Rules –Everything is on a grid –Everything is mono-directional –All metal widths are the same –Contacts same width as metals
17
Why DFM Easier to perform RET Manufacturability A must for the new generation of transistor sizes.
18
Pros Regular Layout Enforced Standardization More Accurate Resolution Contacts match metal widths
19
Example: Group Propagate
20
Cons Harder to cut corners More time-consuming Increased Area Decreased Speed More Metal Layers Learning Curve
21
Adder Four adders execute 256 times each to generate one number. Hybrid carry-skip, carry-lookahead, conditional-sum, … Fast and low power. Chirca, Schulte, Glossner, et al. “A Static Low-Power, High-Performance 32-bit Carry Skip Adder” http://mesa.ece.wisc.edu/publications/cp_2004-12.pdf
22
32-Bit Adder Block Diagram
Blocks (MSB to LSB): CS4, CS18, CS6, CS4
Inputs: A[31:28] B[31:28], A[27:10] B[27:10], A[9:4] B[9:4], A[3:0] B[3:0]
Sums: S[31:28], S[27:10], S[9:4], S[3:0]
Carries: C[0], C’[4], C[10], C’[28], C[32]
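The block widths on this slide (4, 6, 18, and 4 bits, from bits 3:0 up to bits 31:28) can be checked with a small behavioral model. The sketch below is an assumption-laden illustration: it models only the partition and the carry passed between blocks, not the skip logic, the internal structure of each CS block, or the inverted carries C’[4] and C’[28]. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Add one block of `width` bits starting at bit `lo`.
   Returns the block's sum bits in position and updates *carry. */
static uint32_t add_block(uint32_t a, uint32_t b, unsigned lo,
                          unsigned width, unsigned *carry)
{
    assert(width > 0 && width < 32 && lo + width <= 32);
    uint32_t mask = (1u << width) - 1u;
    uint32_t fa = (a >> lo) & mask;
    uint32_t fb = (b >> lo) & mask;
    uint32_t s = fa + fb + *carry;   /* needs width+1 bits, fits in 32 */
    *carry = (s >> width) & 1u;      /* carry out of this block */
    return (s & mask) << lo;
}

/* 32-bit add built from 4-, 6-, 18-, and 4-bit blocks (LSB first).
   Returns the 33-bit result: bits 31:0 are the sum, bit 32 is C[32]. */
static uint64_t add32_blocks(uint32_t a, uint32_t b, unsigned cin)
{
    static const unsigned widths[4] = {4, 6, 18, 4};
    uint32_t sum = 0;
    unsigned carry = cin & 1u;       /* C[0] */
    unsigned lo = 0;
    for (int k = 0; k < 4; ++k) {
        sum |= add_block(a, b, lo, widths[k], &carry);
        lo += widths[k];             /* block boundaries: 4, 10, 28, 32 */
    }
    return ((uint64_t)carry << 32) | sum;
}
```

The block boundaries the loop passes through (bits 4, 10, 28, and 32) line up with the carry names in the diagram.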
23
First CS4 Block 32-Bit Adder
24
CS18 Block 32-Bit Adder
25
32 Bit Fast Adder
26
Adder Performance Discuss trade-offs in speed and power.
27
SRAM Single Bus Cell Double Bus Cell
28
SRAM Single Bus
29
Dual Bus SRAM
30
Discuss Speed and Power SRAM power consumption Why we can’t do better with the SRAM
31
Verification Tested architectural Verilog against C code for matching 1024-bit number results. Tested architectural Verilog against structural Verilog for matching port outputs.
32
Verification Verified schematic against Verilog implementation in Cadence –Made sure that output was the same –Checked delays and voltage levels Verified layout vs. schematic –Checked levels with parasitics –Performed LVS test
34
Poly Density 7.06%
35
Metal Density 19.59%
36
Metal2 Density 18.85%
37
Metal3 Density 19.24%
38
Metal4 Density 8.91%
Metal5 Density 4.75%
39
Critical Delay
40
Specs Pins –40 input pins (including clock, vdd, gnd) –32 output pins (the random number) 475 MHz chip speed 436 kHz throughput
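The two rates on this slide imply a cycle budget per output. The sketch below only does that division; the helper name `cycles_per_output` is hypothetical. The result is close to 256 loop iterations at 4 cycles each (1024 cycles), but treating the remainder as startup or control overhead is an assumption, not something the slides state.

```c
#include <assert.h>

/* Clock cycles per output, rounded to the nearest whole cycle. */
static unsigned long cycles_per_output(unsigned long clock_hz,
                                       unsigned long output_hz)
{
    assert(output_hz > 0);
    return (clock_hz + output_hz / 2) / output_hz;
}
```

With the slide's figures, `cycles_per_output(475000000UL, 436000UL)` evaluates to 1089.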
41
Putting it All Together
Part | Trans Count | Area | Density | Prop Delay (Schematic #s / ExtractRC #s) | Power (1x) @ 500 MHz | Power (Average) @ 500 MHz
Adders (4) | 5,856 (1,464 ea.) | 25,200 um2 (6,300 um2 ea.) | 0.232 | 1.45 ns / 1.56 ns | 600 uW / 620 uW | 140 uW / 148 uW
SRAM (M&R) | 17,736 (M=10,458 R=7,278) | 51,000 um2 (M=35,000 R=16,000) | 0.348 (M=0.293 R=0.456) | 735 ps / 845 ps | W: 510 uW / 3.25 mW; R: 190 uW / 1.40 mW | 270 uW / 1.86 mW
Registers (10) | 6,400 (640 ea.) | 38,400 um2 (3,840 um2 ea.) | 0.167 | 220 ps / 275 ps | 530 uW / 590 uW | 130 uW / 145 uW
Total | 33,225 | 182,000 um2 | 0.183 | 2.1 ns (475 MHz) | ----- | 4.1 mW
42
Questions