ECE 645 Spring 2007 PROJECT 2 Specification. Topic Options.

Presentation transcript:

ECE 645 Spring 2007 PROJECT 2 Specification

Topic Options

Public Key (Asymmetric) Cryptosystems
Alice encrypts using the public key of Bob, $K_B$; the ciphertext travels over the network; Bob decrypts using the private key of Bob, $k_B$.

RSA as a trap-door one-way function
Encryption (PUBLIC KEY): $C = f(M) = M^e \bmod N$
Decryption (PRIVATE KEY): $M = f^{-1}(C) = C^d \bmod N$
$N = P \cdot Q$, where $P$, $Q$ are large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

RSA keys
PUBLIC KEY: $\{e, N\}$
PRIVATE KEY: $\{d, P, Q\}$
$N = P \cdot Q$, where $P$, $Q$ are large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

Early Factoring Device – Lehmer Sieve Bicycle chain sieve [D. H. Lehmer, 1928] Computer Museum, Mountain View, CA

Supercomputer Cray-1 from 1980’s Computer Museum, Mountain View, CA

FPGA based supercomputers
Machines: SRC 6 from SRC Computers; Cray XD1 from Cray; SGI Altix from SGI; SRC 7 from SRC Computers, Inc.

COPACOBANA
Ruhr University, Bochum, and University of Kiel, Germany
Spartan 3 FPGAs
Clock frequency: 100 MHz
Cost: € 8980

Factoring 1024-bit RSA keys using Number Field Sieve (NFS)
Steps: Polynomial Selection; Relation Collection (Sieving, Cofactoring); Linear Algebra; Square Root
Cofactoring: 200 bit & 350 bit numbers; Trial division, ECM, p-1 method, rho method

Topic 1 Trial Division Sieve

Topic 1: Trial Division Sieve (1)
Given:
Inputs:
Variables: 1. Integers $N_1, N_2, N_3, \ldots$, each of the size of $k$ bits
Constants: 2. Factor base = set of all primes smaller than a certain bound $B$ = $\{p_1 = 2, p_2 = 3, p_3 = 5, \ldots, p_t \le B\}$
Parameters of interest: $4 \le k \le \ldots$; $\ldots \le B \le 10^5$

Topic 1: Trial Division Sieve (2)
Required:
Outputs: For each integer $N_i$: a list of primes from the factor base that divide $N_i$, and the number of times each prime divides $N_i$.
For example, if $N_i = p_1^{e_1} \cdot p_2^{e_2} \cdot p_3^{e_3} \cdot M_i$, where $M_i$ is not divisible by any prime belonging to the factor base, then the output is $\{p_1, e_1\}, \{p_2, e_2\}, \{p_3, e_3\}$.

Topic 1: Trial Division Sieve (3)
Example:
Constants: $k = 10$, $B = 5$, Factor base = $\{2, 3, 5\}$
Variables: $N_1 = 408 = 2^3 \cdot 3 \cdot 17$; $N_2 = 630 = 2 \cdot 3^2 \cdot 5 \cdot 7$
Outputs:
For $N_1$: $\{2, 3\}, \{3, 1\}$
For $N_2$: $\{2, 1\}, \{3, 2\}, \{5, 1\}$

Topic 1: Trial Division Sieve (4) Optimization Criteria: Maximum number of integers N i fully processed per unit of time for a given k and B.
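
The trial division task specified above can be sketched in software as a reference model for checking hardware results. This is only an illustrative sketch, not the project's required architecture; the function name `trial_division_sieve` and the choice to also return the cofactor $M_i$ are assumptions made for this example.

```python
def trial_division_sieve(n, factor_base):
    """Divide n by every prime in factor_base as many times as possible.

    Returns (factors, cofactor), where factors is a list of (p, e) pairs
    with e >= 1, and cofactor is the part of n (called M_i in the slides)
    that is not divisible by any prime of the factor base.
    """
    factors = []
    for p in factor_base:
        e = 0
        while n % p == 0:   # p still divides the remaining value
            n //= p
            e += 1
        if e > 0:
            factors.append((p, e))
    return factors, n
```

With the factor base {2, 3, 5}, calling the function on 408 and 630 reproduces the output pairs of the example slide, with cofactors 17 and 7 respectively.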

Topic 2 Greatest Common Divisor & Multiplicative Inverse

Topic 2: Greatest Common Divisor and Multiplicative Inverse (2)
Given:
Inputs: $a$, $N$: $k$-bit integers; $a < N$
Outputs: $y = \gcd(a, N)$; $x = a^{-1} \bmod N$, i.e., integer $1 \le x < N$ such that $a \cdot x \bmod N = 1$
Parameters of interest: $4 \le k \le 1024$

Greatest common divisor
Greatest common divisor of $a$ and $b$, denoted by $\gcd(a, b)$, is the largest positive integer that divides both $a$ and $b$.
$d = \gcd(a, b)$ iff
1) $d \mid a$ and $d \mid b$
2) if $c \mid a$ and $c \mid b$ then $c \le d$

gcd(8, 44) = 4
gcd(-15, 65) = 5
gcd(45, 30) = 15
gcd(31, 15) = 1
gcd(121, 169) = 1

Quotient and remainder
Given integers $a$ and $n$, $n > 0$, $\exists!\ q, r \in \mathbb{Z}$ such that $a = q \cdot n + r$ and $0 \le r < n$
$q$ – quotient, $r$ – remainder (of $a$ divided by $n$)
$q = \lfloor a / n \rfloor = a \ \mathrm{div}\ n$
$r = a - q \cdot n = a - \lfloor a / n \rfloor \cdot n = a \bmod n$

Euclid’s Algorithm for computing gcd(a, b)
$i$: $-2, -1, 0, 1, \ldots, t-1, t$
$r_i$: $r_{-2} = \max(a, b)$, $r_{-1} = \min(a, b)$, $r_0$, $r_1$, $\ldots$, $r_{t-1} = \gcd(a, b)$, $r_t = 0$
$q_i$: $q_{-1}$, $q_0$, $q_1$, $\ldots$, $q_{t-1}$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i = r_{i-1} \bmod r_i$

Euclid’s Algorithm Example: gcd(36, 126)
$r_i$: $r_{-2} = \max(a, b) = 126$, $r_{-1} = \min(a, b) = 36$, $r_0 = 18 = \gcd(36, 126)$, $r_1 = 0$
$q_i$: $q_{-1} = 3$, $q_0 = 2$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i = r_{i-1} \bmod r_i$
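
The recurrence $r_{i+1} = r_{i-1} - q_i \cdot r_i$ can be traced with a short software model. The sketch below is illustrative only; the function name `euclid_gcd` and the decision to return the list of quotients alongside the gcd are assumptions made for this example, and non-negative inputs are assumed.

```python
def euclid_gcd(a, b):
    """Euclid's algorithm for non-negative integers a and b.

    Returns (gcd, quotients), where quotients is the list
    [q_{-1}, q_0, ..., q_{t-1}] in the notation of the slides.
    """
    r_prev, r_cur = max(a, b), min(a, b)   # r_{-2}, r_{-1}
    quotients = []
    while r_cur != 0:
        q = r_prev // r_cur                # q_i = floor(r_{i-1} / r_i)
        quotients.append(q)
        # r_{i+1} = r_{i-1} - q_i * r_i
        r_prev, r_cur = r_cur, r_prev - q * r_cur
    return r_prev, quotients
```

For the inputs 36 and 126 the function returns the gcd 18 together with the quotients 3 and 2, matching the example slide.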

Multiplicative inverse modulo n
The multiplicative inverse of $a$ modulo $n$ is an integer [!!!] $x$ such that $a \cdot x \equiv 1 \pmod{n}$
The multiplicative inverse of $a$ modulo $n$ is denoted by $a^{-1} \bmod n$ (in some books $\bar{a}$ or $a^*$).
According to this notation: $a \cdot a^{-1} \equiv 1 \pmod{n}$

Extended Euclid’s Algorithm (1)
$i$: $-2, -1, 0, 1, \ldots, t-1, t$
$r_i$: $r_{-2} = n$, $r_{-1} = a$, $r_0$, $r_1$, $\ldots$, $r_{t-1}$, $r_t = 0$
$x_i$: $x_{-2} = 0$, $x_{-1} = 1$, $x_0$, $x_1$, $\ldots$, $x_{t-1}$, $x_t$
$y_i$: $y_{-2} = 1$, $y_{-1} = 0$, $y_0$, $y_1$, $\ldots$, $y_{t-1}$, $y_t$
$q_i$: $q_{-1} = \lfloor n / a \rfloor$, $q_0$, $q_1$, $\ldots$, $q_{t-1}$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$
$x_{i+1} = x_{i-1} - q_i \cdot x_i$
$y_{i+1} = y_{i-1} - q_i \cdot y_i$
$r_i = x_i \cdot a + y_i \cdot n$
$r_{t-1} = x_{t-1} \cdot a + y_{t-1} \cdot n$

Extended Euclid’s Algorithm (2)
$r_{t-1} = x_{t-1} \cdot a + y_{t-1} \cdot n \equiv x_{t-1} \cdot a \pmod{n}$
If $r_{t-1} = \gcd(a, n) = 1$ then $x_{t-1} \cdot a \equiv 1 \pmod{n}$ and as a result $x_{t-1} = a^{-1} \bmod n$

Extended Euclid’s Algorithm for computing $z = a^{-1} \bmod n$
$i$: $-2, -1, 0, 1, \ldots, t-1, t$
$r_i$: $r_{-2} = n$, $r_{-1} = a$, $r_0$, $r_1$, $\ldots$, $r_{t-1} = 1$, $r_t = 0$
$x_i$: $x_{-2} = 0$, $x_{-1} = 1$, $x_0$, $x_1$, $\ldots$, $x_{t-1} = a^{-1} \bmod n$, $x_t = \pm n$
$q_i$: $q_{-1} = \lfloor n / a \rfloor$, $q_0$, $q_1$, $\ldots$, $q_{t-1}$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$
$x_{i+1} = x_{i-1} - q_i \cdot x_i$
Note: If $r_{t-1} \ne 1$ the inverse does not exist

Extended Euclid’s Algorithm Example: $z = 20^{-1} \bmod 117$
$r_i$: $r_{-2} = 117$, $r_{-1} = 20$, $r_0 = 17$, $r_1 = 3$, $r_2 = 2$, $r_3 = 1$, $r_4 = 0$
$x_i$: $x_{-2} = 0$, $x_{-1} = 1$, $x_0 = -5$, $x_1 = 6$, $x_2 = -35$, $x_3 = 41 = 20^{-1} \bmod 117$, $x_4 = -117$
$q_i$: $q_{-1} = 5$, $q_0 = 1$, $q_1 = 5$, $q_2 = 1$, $q_3 = 2$
$q_i = \lfloor r_{i-1} / r_i \rfloor$
$r_{i+1} = r_{i-1} - q_i \cdot r_i$
$x_{i+1} = x_{i-1} - q_i \cdot x_i$
Check: $20 \cdot 41 \bmod 117 = 1$
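
The $r_i$ and $x_i$ recurrences of the extended algorithm translate directly into a loop. The following is a minimal sketch for checking results, not a hardware description; the function name `mod_inverse`, the `None` return value for a non-existent inverse, and the final reduction of a negative $x_{t-1}$ into the range $0 \ldots n-1$ are assumptions made for this example.

```python
def mod_inverse(a, n):
    """Return a^{-1} mod n using the extended Euclid recurrences,
    or None when gcd(a, n) != 1 (the inverse does not exist)."""
    r_prev, r_cur = n, a      # r_{-2} = n, r_{-1} = a
    x_prev, x_cur = 0, 1      # x_{-2} = 0, x_{-1} = 1
    while r_cur != 0:
        q = r_prev // r_cur                          # q_i
        r_prev, r_cur = r_cur, r_prev - q * r_cur    # r_{i+1}
        x_prev, x_cur = x_cur, x_prev - q * x_cur    # x_{i+1}
    if r_prev != 1:           # r_{t-1} is the gcd
        return None
    return x_prev % n         # x_{t-1}, reduced to a non-negative value
```

For $a = 20$ and $n = 117$ the loop passes through the same $r_i$ and $x_i$ values as the example slide and returns 41.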

Topic 3 RSA Encryption & Decryption with Montgomery Multipliers based on Carry Save Adders

RSA as a trap-door one-way function
Encryption (PUBLIC KEY): $C = f(M) = M^e \bmod N$
Decryption (PRIVATE KEY): $M = f^{-1}(C) = C^d \bmod N$
$N = P \cdot Q$, where $P$, $Q$ are large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$

Exponentiation: $Y = X^E \bmod N$, $E = (e_{L-1}, e_{L-2}, \ldots, e_1, e_0)_2$

Right-to-left binary exponentiation:
Y = 1; S = X;
for i=0 to L-1 {
    if (e_i == 1) Y = Y · S mod N;
    S = S^2 mod N;
}

Left-to-right binary exponentiation:
Y = 1;
for i=L-1 downto 0 {
    Y = Y^2 mod N;
    if (e_i == 1) Y = Y · X mod N;
}
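
Both exponentiation loops can be modelled in software to generate test vectors. This is a hedged sketch; the function names `exp_right_to_left` and `exp_left_to_right` are assumptions made for this example, and plain modular multiplication stands in for whatever multiplier the hardware design uses.

```python
def exp_right_to_left(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_0 up to e_{L-1}."""
    y, s = 1, x % n
    while e > 0:
        if e & 1:              # e_i == 1
            y = (y * s) % n
        s = (s * s) % n        # S = S^2 mod N
        e >>= 1
    return y


def exp_left_to_right(x, e, n):
    """Y = X^E mod N, scanning exponent bits from e_{L-1} down to e_0."""
    y = 1
    for i in range(e.bit_length() - 1, -1, -1):
        y = (y * y) % n        # Y = Y^2 mod N
        if (e >> i) & 1:       # e_i == 1
            y = (y * x) % n
    return y
```

Both functions can be compared against Python's built-in three-argument `pow(x, e, n)`.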

Montgomery Modular Multiplication (1)
$C = A \cdot B \bmod M$, where $A$, $B$, $M$ are $k$-bit numbers
Integer domain → Montgomery domain:
$A \to A' = A \cdot 2^k \bmod M$
$B \to B' = B \cdot 2^k \bmod M$
$C \to C' = C \cdot 2^k \bmod M$
$C' = MP(A', B', M) = A' \cdot B' \cdot 2^{-k} \bmod M = (A \cdot 2^k) \cdot (B \cdot 2^k) \cdot 2^{-k} \bmod M = A \cdot B \cdot 2^k \bmod M$

Montgomery Modular Multiplication (2)
Integer domain → Montgomery domain: $A' = MP(A, 2^{2k} \bmod M, M)$
Montgomery domain → integer domain: $C = MP(C', 1, M)$
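
The domain conversions above can be checked with a small software model of the Montgomery product. This is a sketch under stated assumptions: it uses a single word-level reduction step with the precomputed constant $-M^{-1} \bmod 2^k$, which is not the carry-save or word-based hardware architecture of Topics 3 and 4; the function names `montgomery_setup`, `mp`, and `mont_modmul` are hypothetical; $M$ must be odd with $M < 2^k$; and `pow(m, -1, r)` requires Python 3.8 or later.

```python
def montgomery_setup(m, k):
    """Precompute constants for an odd modulus m < 2**k."""
    r = 1 << k
    m_neg_inv = (-pow(m, -1, r)) % r   # -M^{-1} mod 2^k
    r2 = (r * r) % m                   # 2^{2k} mod M
    return m_neg_inv, r2


def mp(a, b, m, k, m_neg_inv):
    """Montgomery product MP(a, b, m) = a * b * 2^{-k} mod m, for a, b < m."""
    mask = (1 << k) - 1
    x = a * b
    q = ((x & mask) * m_neg_inv) & mask   # makes x + q*m divisible by 2^k
    c = (x + q * m) >> k                  # exact division by 2^k
    return c - m if c >= m else c


def mont_modmul(a, b, m, k):
    """Compute a * b mod m by going through the Montgomery domain."""
    m_neg_inv, r2 = montgomery_setup(m, k)
    a_m = mp(a, r2, m, k, m_neg_inv)      # A' = MP(A, 2^{2k} mod M, M)
    b_m = mp(b, r2, m, k, m_neg_inv)      # B' = MP(B, 2^{2k} mod M, M)
    c_m = mp(a_m, b_m, m, k, m_neg_inv)   # C' = MP(A', B', M)
    return mp(c_m, 1, m, k, m_neg_inv)    # C  = MP(C', 1, M)
```

In an exponentiation, the two conversions are paid only once, while every intermediate multiplication stays in the Montgomery domain.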

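Montgomery Modular Multiplication (3)
[Diagram: the $2k$-bit product $X = A'B'$ with digits $x_{2n-1}, x_{2n-2}, \ldots, x_1, x_0$; multiples of $M$ ($q_0 M$, $q_1 M b$, ...) are added so that the low-order digits become 0; the remaining upper $k$ bits form $C'$.]
$C' \cdot 2^k = X + zM$
$C' \cdot 2^k \equiv X = A'B' \pmod{M}$
$C' \equiv A'B' \cdot 2^{-k} \pmod{M}$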

Fast modular exponentiation using Chinese Remainder Theorem
Instead of $M = C^d \bmod N$, compute:
$M_P = C_P^{d_P} \bmod P$, where $C_P = C \bmod P$ and $d_P = d \bmod (P-1)$
$M_Q = C_Q^{d_Q} \bmod Q$, where $C_Q = C \bmod Q$ and $d_Q = d \bmod (Q-1)$
$M = M_P \cdot R_Q + M_Q \cdot R_P \bmod N$
where
$R_P = (P^{-1} \bmod Q) \cdot P = P^{Q-1} \bmod N$
$R_Q = (Q^{-1} \bmod P) \cdot Q = Q^{P-1} \bmod N$

Time of exponentiation without and with Chinese Remainder Theorem
SOFTWARE
Without CRT: $t_{EXP}(k) = c_s \cdot k^3$
With CRT: $t_{EXP-CRT}(k) \approx 2 \cdot c_s \cdot (k/2)^3 = \frac{1}{4} t_{EXP}(k)$
HARDWARE
Without CRT: $t_{EXP}(k) = c_h \cdot k^2$
With CRT: $t_{EXP-CRT}(k) \approx c_h \cdot (k/2)^2 = \frac{1}{4} t_{EXP}(k)$
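
The CRT recombination can be checked numerically with a few lines of code. The sketch below is illustrative only; the function name `rsa_decrypt_crt` is an assumption, and the toy key used in the note that follows ($P = 61$, $Q = 53$, $e = 17$, $d = 2753$) is not taken from the slides.

```python
def rsa_decrypt_crt(c, d, p, q):
    """Compute M = C^d mod N, N = P*Q, via two half-size exponentiations."""
    n = p * q
    c_p, c_q = c % p, c % q               # C_P, C_Q
    d_p, d_q = d % (p - 1), d % (q - 1)   # d_P, d_Q
    m_p = pow(c_p, d_p, p)                # M_P = C_P^{d_P} mod P
    m_q = pow(c_q, d_q, q)                # M_Q = C_Q^{d_Q} mod Q
    r_p = pow(p, q - 1, n)                # R_P = P^{Q-1} mod N
    r_q = pow(q, p - 1, n)                # R_Q = Q^{P-1} mod N
    return (m_p * r_q + m_q * r_p) % n
```

With the toy key, encrypting the message 65 as `pow(65, 17, 3233)` and passing the result to `rsa_decrypt_crt` with `d = 2753` recovers 65. In practice $R_P$ and $R_Q$ depend only on the key and would be precomputed.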

Topic 4 RSA Encryption & Decryption with Word-Based Montgomery Multipliers

Data dependency graph of a classical architecture by Tenca & Koc

Data dependency graph of a new design from GWU & GMU

Block diagram of the new architecture

Block diagram of the main Processing Element

Topic 5 p-1 Method of Factoring

p-1 algorithm
Inputs:
$N$ – number to be factored
$a$ – arbitrary integer such that $\gcd(a, N) = 1$
$B_1$ – smoothness bound for Phase 1
Outputs:
$q$ – factor of $N$, $1 < q \le N$, or FAIL

p-1 algorithm – Phase 1
Diagram labels: precomputations; main computations; postcomputations; out of scope for this project

p-1 Phase 1 – Numerical example
$N = 1740719 = 1279 \cdot 1361$, $a = 2$, $B_1 = 20$
$k = 2^4 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11 \cdot 13 \cdot 17 \cdot 19 = 232792560$
$q_0 = a^k \bmod N$
$q = \gcd(q_0 - 1; N) = 1361$
Why did the method work?
$q - 1 = 1360 = 2^4 \cdot 5 \cdot 17 \mid k$
$a^k \bmod q = a^{(q-1) \cdot m} \bmod q = 1$
$q \mid a^k - 1$
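
Phase 1 of the p-1 method can be reproduced in software as a cross-check. This is a minimal sketch under stated assumptions: the function names `primes_up_to` and `p_minus_1_phase1` are hypothetical, $k$ is taken to be the product of the largest power of each prime not exceeding $B_1$ (which agrees with the value of $k$ in the numerical example), $B_1 \ge 2$ is assumed, and `None` stands for FAIL.

```python
from math import gcd


def primes_up_to(b):
    """Sieve of Eratosthenes: all primes <= b, assuming b >= 2."""
    sieve = [True] * (b + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(b ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, b + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def p_minus_1_phase1(n, a, b1):
    """Return a factor q of n with 1 < q <= n, or None (FAIL)."""
    k = 1
    for p in primes_up_to(b1):
        pe = p
        while pe * p <= b1:    # largest power of p not exceeding b1
            pe *= p
        k *= pe
    q0 = pow(a, k, n)          # q_0 = a^k mod N
    q = gcd(q0 - 1, n)
    return q if 1 < q <= n else None
```

For $N = 1279 \cdot 1361$, $a = 2$, $B_1 = 20$ the function returns 1361, as in the numerical example.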

Design Methodology Options

by Mike Babst DSPlogic

Methodology 1 RTL VHDL Classical VHDL-based Design Methodology

Structure of a Typical Digital System
[Block diagram: Execution Unit (Datapath) with Data Inputs and Data Outputs; Control Unit (Control) with Control Inputs and Control Outputs; Control Signals between the two units.]

Hardware Design with RTL VHDL
Diagram labels: Pseudocode; Interface; Execution Unit (block diagram); Control Unit (block diagram, ASM); VHDL code

Steps of the Design Process
1. Text description
2. Interface
3. Pseudocode
4. Block diagram of the Execution Unit
5. Interface with the division into Execution Unit and Control Unit
6. ASM chart and/or block diagram of the Control Unit
7. RTL VHDL code
8. Testbench
9. Debugging
10. Synthesis and implementation
11. Experimental testing (not required in this course)

Project 2 - Platform & tools
Target devices: Xilinx FPGAs
Tools:
VHDL Simulation: Aldec Active HDL or Xilinx ModelSim
VHDL Synthesis: Synplify Pro or Xilinx XST
Implementation: Xilinx ISE or Xilinx WebPack
All tools available in S&T 2, rooms 203 & 265.
Xilinx tools available for free for home use.
Aldec Active HDL student edition available for home use.

Methodology 2 Graphical Data Flow Language DSPlogic RCToolbox

See the presentation by Mike Babst, PhD DSPlogic available through WebCT

Project 2 - Platform & tools
Target devices: Xilinx FPGAs
Tools:
Design Entry & Debugging: DSPlogic RC Toolbox, MathWorks Simulink, MathWorks Matlab
Synthesis and Implementation: Xilinx System Generator, Xilinx ISE
All tools available in S&T 2, room 220.

Two hands-on sessions given by Dr. Babst during the first two weeks after the selection of the project

Reconfigurable computers supported by DSPlogic toolset
Machine (Released): Cray XD1 from Cray; SGI Altix from SGI (2005)

What is a Reconfigurable Computer?
[Diagram: a microprocessor system (μP, μP memory, ..., I/O, Interface) connected through the interfaces to a reconfigurable system (FPGA, FPGA memory, ..., I/O, Interface).]

Methodology 3 HLL Compilers Celoxica Handel C

Design Flow
Executable Specification; Handel-C; Synthesis (VHDL, EDIF); Place & Route

Handel-C / ANSI-C Comparisons
Common to ANSI-C and Handel-C: Preprocessors, i.e. #define; Structures; ANSI-C Constructs (for, while, if, switch); Functions; Arrays; Pointers; Arithmetic operators; Bitwise logical operators; Logical operators
ANSI-C only: ANSI-C Standard Library; Side Effects, i.e. X = Y++; Recursion; Floating Point
Handel-C only: Handel-C Standard Library; Parallelism; Arbitrary width variables; RAM, ROM; Signals; Channels; Interfaces; Enhanced bit manipulation

Handel-C Language (1)
- A subset of ANSI-C
- Sequential software style with a “par” construct to implement parallelism
- A channel “chan” statement allows for communication and synchronization between parallel branches
- Level of design abstraction is above RTL but below behavioral

Handel-C Language (2)
- Each assignment and delay statement takes one clock cycle
- Automatic generation of the state machine from an algorithmic description of the circuit in terms of parallel and sequential blocks
- Automatic scheduling of parallel and sequential blocks, that is, the code following a group is scheduled only after that whole group has completed

Handel-C Language (3)
- Automatic generation of clocks, clock enables and resets
- Combinational logic may be implemented using, for example, bus, port and signal types
- It is possible to design at a level where some Handel-C statements look similar to Verilog, but the overall program structure is different

Platform & tools – HLL Compilers
Target devices: Xilinx FPGAs
Tools:
Design Entry & Debugging: Celoxica DK4 Design Suite (integrated environment providing Handel C compiler, debugging, simulation, and synthesis to EDIF and VHDL)
Synthesis and Implementation: Xilinx ISE
All tools available in S&T 2, rooms 203 & 265.

VHDL macro declaration in Handel-C

VHDL entity:
ENTITY parmult IS
  port (
    clk: IN std_logic;
    a: IN std_logic_VECTOR(7 downto 0);
    b: IN std_logic_VECTOR(7 downto 0);
    q: OUT std_logic_VECTOR(15 downto 0));
END parmult;

Handel-C interface declaration:
interface parmult (unsigned 16 q)
  parmult_instance (unsigned 1 clk, unsigned 8 a, unsigned 8 b)
  with {busformat = "B(I)"};

VHDL macro instantiation in Handel-C

unsigned 8 x1, x2;
unsigned resultX;
interface parmult (unsigned 16 q)
  parmult_instance1 (unsigned 1 clk = __clock,
                     unsigned 8 a = x1,
                     unsigned 8 b = x2)
  with {busformat = "B(I)"};

Celoxica RC10 board supporting Handel C libraries used in the GMU ECE 448 FPGA and ASIC Design with VHDL

Literature Additional literature with the detailed description of all algorithms available for each project.

Project Organization
1-3 person teams allowed; 2 person teams preferred.
By Friday midnight at the latest, please submit your:
- ranking of 4 topics
- ranking of 3 design methodologies