Published by Barnard Gaines. Modified over 9 years ago.
1
Kris Gaj
Office hours: Monday, 7:30-8:30 PM; Thursday, 7:30-8:30 PM
Research and teaching interests: cryptography, computer arithmetic, VLSI design and testing
Contact: Science & Technology II, room 223; kgaj@gmu.edu, kgaj01@yahoo.com; (703) 993-1575
2
ECE 645
Part of:
MS in EE
MS in CpE: Digital Systems Design – required course; other concentration areas – elective course
Certificate in VLSI Design/Manufacturing
PhD in IT
PhD in ECE
3
Spring 2006 enrollment as of January 23, 2006
MS in CpE: 7
MS in EE: 6
BS in CpE: 1
PhD in ECE: 1
PhD in IT: 1
MS in ISA: 1
NDG: 1
4
DIGITAL SYSTEMS DESIGN
Concentration advisor: Ken Hintz
1. ECE 545 Introduction to VHDL – K. Gaj, K. Hintz, project, VHDL, Aldec/Synplicity/Xilinx and ModelSim/Synopsys
2. ECE 645 Computer Arithmetic: HW and SW Implementation – K. Gaj, project, VHDL, Aldec/Synplicity/Xilinx and ModelSim/Synopsys
3. ECE 586 Digital Integrated Circuits – D. Ioannou
4. ECE 681 VLSI Design Automation – T. Storey, project/lab, back-end design with Synopsys tools
5
Design level: algorithmic, register-transfer, gate, transistor, layout, devices
Courses:
ECE 545 Introduction to VHDL
ECE 645 Computer Arithmetic
ECE 586 Digital Integrated Circuits
ECE 681 VLSI Design Automation
ECE 684 MOS Device Electronics
ECE 584 Semiconductor Device Fundamentals
6
Prerequisites
ECE 545 Introduction to VHDL
or
Permission of the instructor, granted assuming that you know VHDL or Verilog and a high-level programming language (preferably C)
7
Course web page
ECE web page → Courses → Course web pages → ECE 645
http://teal.gmu.edu/courses/ECE645/index.htm
8
Computer Arithmetic
Lecture:
Homework 15 %
Midterm exam 1 (in class) 20 %
Midterm exam 2 (take-home) 15 %
Project:
Project 1 20 %
Project 2 30 %
9
Advanced digital circuit design course covering
Efficient: addition and subtraction; multiplication; division and modular reduction; exponentiation
Integers: unsigned and signed
Real numbers: fixed point; single and double precision floating point
Elements of the Galois field GF($2^n$): polynomial base
10
Lecture topics (1)
INTRODUCTION
1. Applications of computer arithmetic algorithms
2. Number representation: unsigned integers; signed integers; fixed-point real numbers; floating-point real numbers; elements of the Galois Field GF($2^n$)
11
ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting
2. Carry-lookahead, carry-select, and hybrid adders
3. Adders based on Parallel Prefix Networks
12
MULTIOPERAND ADDITION 1. Carry-save adders 2. Wallace and Dadda Trees 3. Adding multiple signed numbers
13
MULTIPLICATION 1. Tree and array multipliers 2. Sequential multipliers 3. Multiplication of signed numbers and squaring
14
DIVISION
1. Basic restoring and non-restoring sequential dividers
2. SRT and high-radix dividers
3. Array dividers
15
FLOATING POINT AND GALOIS FIELD ARITHMETIC
1. Floating-point units
2. Galois Field GF($2^n$) units
16
Similar courses at other universities
University of California, Santa Barbara, Behrooz Parhami, ECE252B: Computer Arithmetic.
University of Massachusetts, Amherst, Israel Koren, ECE666: Digital Computer Arithmetic.
Lehigh University, Michael Schulte, ECE496: High-Speed Computer Arithmetic.
Worcester Polytechnic Institute, Berk Sunar, EE-579 V Computer Arithmetic Circuits.
Stanford University, Michael Flynn, EE486: Advanced Computer Arithmetic.
University of California, Davis, Vojin Oklobdzija, ECE278: Computer Arithmetic for Digital Implementation.
17
New in this course real-life project based on VHDL or Verilog HDL operations in the Galois Field (with the application in cryptography and communications)
18
Possible topics for a Scholarly Paper or Research Project for the CpE & EE students
Advanced Computer Arithmetic: square root; exponential and logarithmic functions; trigonometric functions; hyperbolic functions
Fault-Tolerant Arithmetic
Low-Power Arithmetic
High-Throughput Arithmetic
19
Three Curriculum Options
All options: 2 core courses, 4 required courses
MS Thesis Option: 2 elective courses, ECE 799 Master’s Thesis (6 cr. hrs)
Research Project Option: 3 elective courses, ECE 798 Research Project
Scholarly Paper Option: 4 elective courses, Scholarly paper
20
Literature (1)
Required textbook:
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design, Oxford University Press, 2000.
Recommended textbooks:
Milos D. Ercegovac and Tomas Lang, Digital Arithmetic, Morgan Kaufmann Publishers, 2004.
Israel Koren, Computer Arithmetic Algorithms, 2nd edition, A. K. Peters, Natick, MA, 2002.
21
Literature (2)
VHDL books (used in ECE 545 in Fall 2005):
1. Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998.
2. Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004.
22
Literature (3)
Supplementary books:
1. E. E. Swartzlander, Jr., Computer Arithmetic, vols. I and II, IEEE Computer Society Press, 1990.
2. Alfred J. Menezes, Paul C. van Oorschot, and Scott A. Vanstone, Handbook of Applied Cryptography, Chapter 14, Efficient Implementation, CRC Press, Inc., 1998.
3. Christof Paar, Efficient VLSI Architectures for Bit Parallel Computation in Galois Fields, VDI Verlag, 1994.
23
Literature (4)
Proceedings of conferences:
ARITH - International Symposium on Computer Arithmetic
ASIL - Asilomar Conference on Signals, Systems, and Computers
ICCD - International Conference on Computer Design
CHES - Workshop on Cryptographic Hardware and Embedded Systems
Journals and periodicals:
IEEE Transactions on Computers, in particular special issues on computer arithmetic: 8/70, 6/73, 7/77, 4/83, 8/90, 8/92, 8/94
IEEE Transactions on Circuits and Systems
IEEE Transactions on Very Large Scale Integration
IEE Proceedings: Computer and Digital Techniques
Journal of VLSI Signal Processing
24
Homework reading assignments (main textbook + articles) analysis of hardware and software algorithms and implementations design of small hardware units using VHDL or Verilog Optional assignments Possibility of trading analysis vs. design vs. coding
25
Midterm exams
Exam 1: 2 hrs 30 minutes, in class; multiple choice + short problems
Exam 2: 48 hrs, take-home; analysis and design of arithmetic units using VHDL or Verilog HDL
Practice exams on the web
Tentative days of exams:
Exam 1: Monday, March 27
Exam 2: Saturday-Sunday, May 6-7
26
Project (1)
Project I (20% of grade)
Design and comparative analysis of fast adders (several hundred bits long)
Similar for all students
Done individually
Final report due Monday, March 20
Optimization criteria: minimum latency; maximum throughput; minimum area; minimum product latency · area; maximum ratio throughput/area; scalability
27
Project (2)
Project II (30% of grade)
Long unsigned or signed integers: fast multiplication, squaring, division, modular reduction, or modular exponentiation
or
Floating-point numbers: fast addition or multiplication
28
Project II (rules)
Real-life application
Requirements derived from the analysis of the application
Typically both hardware and software design
Several project topics proposed on the web
You can choose the project topic yourself
Can be done in a group of 1-3 students
Written report & oral presentation: Monday, May 15
29
Project II (rules)
Every team works on a slightly different problem
Project topics should be more complex for larger teams
Cooperation (but not exchange of code) between teams is encouraged
30
Project
Hardware: VHDL (or Verilog) code; latency and/or throughput; area
Software: high-level language (C preferred); execution time; memory requirements
Scalability
31
Degrees of freedom and possible trade-offs
speed, area, power, testability
Related courses shown on the slide: ECE 645, ECE 682, ECE 586, 681
32
Degrees of freedom and possible trade-offs
speed (latency, throughput) vs. area
33
Timing parameters (definition; units; effect of pipelining)
latency: time from input to output; ns; bad
throughput: #output bits/time unit; Mbits/s; good
delay: time from point to point; ns
clock period: time from rising edge to rising edge of clock; ns
clock frequency: 1 / clock period; MHz
34
Project technologies semi-custom Application Specific Integrated Circuits and Field Programmable Gate Arrays
35
Levels of design description
Algorithmic level
Register Transfer Level (level of description most suitable for synthesis)
Logic (gate) level
Circuit (transistor) level
Physical (layout) level
36
Register Transfer Logic (RTL) Design Description
[Figure: blocks of combinational logic separated by registers, with the registers driven by the clock]
37
RTL Block Synthesis* *Simplified design flow Estimated Area Estimated Timing
38
VHDL Design Styles
structural: components and interconnects
dataflow: concurrent statements
behavioral (algorithmic): sequential statements; registers, state machines, test benches
Subset most suitable for use in this course
39
CAD software available at GMU (1)
VHDL simulators
Aldec Active-HDL (under Windows): available in the FPGA Lab, S&T II, room 203; student edition can be purchased on an individual basis ($59.95 + S&H)
ModelSim (under Unix): available from all PCs in the ECE educational labs using an X-terminal emulator; available remotely from home using a fast Internet connection
40
CAD software available at GMU (2)
Tools used for logic synthesis
FPGA synthesis: Synplicity Synplify Pro (under Windows), Xilinx XST (under Windows); available in the FPGA Lab, S&T II, room 203
ASIC synthesis: Synopsys Design Compiler (under Unix); available from all PCs in the ECE educational labs using an X-terminal emulator; available remotely from home using a fast Internet connection
41
CAD software available at GMU (3) Xilinx ISE (under Windows) available in the FPGA Lab, S&T II, room 203 Tools used for implementation (mapping, placing & routing) in the FPGA technology
42
How to learn VHDL for synthesis by yourself? Lecture slides for ECE 545 from Fall 2005 Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998. Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004. Individual or small-group hands-on sessions with the TA Practice, Practice, Practice!!!
43
Testbench testbench design entity Architecture 1 Architecture 2 Architecture N.. Non-synthesizable Synthesizable
44
Design Environment
[Figure: test vectors (inputs) feed both the HDL design (VHDL or Verilog) and a reference model (C); the actual results are compared with the expected results]
45
Primary applications (1) Execution units of general purpose microprocessors Integer units Floating point units Integers (8, 16, 32, 64 bits) Real numbers (32, 64 bits)
46
Primary applications (2) Digital signal and digital image processing Real numbers (fixed-point or floating point) e.g., digital filters Discrete Fourier Transform Discrete Hilbert Transform General purpose DSP processors Specialized circuits
47
Primary applications (3)
Coding: error detection codes, error correcting codes
Elements of the Galois fields GF($2^n$) (4-64 bits)
48
Secret-key (Symmetric) Cryptosystems
Key of Alice and Bob: $K_{AB}$
Alice (encryption) → Network → Bob (decryption)
49
Primary applications (4)
Cryptography: secret key cryptography (IDEA, RC6, MARS, Twofish, Rijndael)
Integers (16, 32 bits)
Elements of the Galois field GF($2^n$) (4, 8 bits)
50
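MARS: main operations MUL32, 2 x ROL32, S-box 9x32; auxiliary operations XOR, ADD/SUB32
RC6: main operations 2 x SQR32, 2 x ROL32; auxiliary operations XOR, ADD/SUB32
Twofish: main operations 96 S-box 4x4, 24 MUL GF($2^8$); auxiliary operations XOR, ADD32
Serpent: main operations 8 x 32 S-box 4x4; auxiliary operations XOR
Rijndael: main operations 16 S-box 8x8, 24 MUL GF($2^8$); auxiliary operations XOR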
51
Public Key (Asymmetric) Cryptosystems
Public key of Bob: $K_B$
Private key of Bob: $k_B$
Alice (encryption) → Network → Bob (decryption)
52
RSA as a trap-door one-way function
PUBLIC KEY: $M \rightarrow C = f(M) = M^e \bmod N$
PRIVATE KEY: $C \rightarrow M = f^{-1}(C) = C^d \bmod N$
$N = P \cdot Q$, where $P$, $Q$ are large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
53
RSA keys
PUBLIC KEY: $\{e, N\}$
PRIVATE KEY: $\{d, P, Q\}$
$N = P \cdot Q$, where $P$, $Q$ are large prime numbers
$e \cdot d \equiv 1 \pmod{(P-1)(Q-1)}$
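The key relations on the RSA slides can be checked with a toy example. The sketch below is illustrative only: the function names and the tiny primes 61 and 53 with e = 17 are assumptions chosen for readability, not values from the slides, and real keys use primes hundreds of digits long.

```python
from math import gcd

def make_rsa_keys(p, q, e):
    """Build a toy key pair: public {e, N}, private {d, P, Q}."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (P-1)(Q-1)")
    d = pow(e, -1, phi)  # modular inverse: e*d = 1 mod phi (Python 3.8+)
    return (e, n), (d, p, q)

def rsa_encrypt(m, public_key):
    e, n = public_key
    return pow(m, e, n)  # C = M^e mod N

def rsa_decrypt(c, private_key):
    d, p, q = private_key
    return pow(c, d, p * q)  # M = C^d mod N

# Hypothetical small parameters, for illustration only.
public_key, private_key = make_rsa_keys(61, 53, 17)
message = 65
ciphertext = rsa_encrypt(message, public_key)
recovered = rsa_decrypt(ciphertext, private_key)
```

Decryption recovers the message only because e · d ≡ 1 mod ((P−1)(Q−1)); computing d from {e, N} alone would require factoring N.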
54
Primary applications (5)
Cryptography: public key cryptography
RSA, DSS, Diffie-Hellman: long integers (1000-2000 bits)
Elliptic Curve Cryptosystems: elements of the Galois field GF($2^n$) (150-250 bits)
55
Topic 1
Application: modern secret-key ciphers, candidates for the new Advanced Encryption Standard (AES): MARS, developed by IBM; RC6, developed at MIT
Function: 32-bit unsigned multiplication and squaring modulo $2^{32}$
$C = A \cdot B \bmod 2^{32}$, $C = A^2 \bmod 2^{32}$
Optimization: maximum throughput, minimum latency, minimum area
Environment: hardware, software for 8-bit processors
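As a rough software reference for this topic, multiplication modulo 2^32 keeps only the low 32 bits of the product. The sketch below also builds the same result from 8-bit limbs, which is one way an 8-bit processor might approach it; the function names are hypothetical and the byte-wise scheme is an assumption, not a method given on the slide.

```python
MASK32 = 0xFFFFFFFF

def mul_mod_2_32(a, b):
    """C = A * B mod 2^32: keep the low 32 bits of the product."""
    return (a * b) & MASK32

def sqr_mod_2_32(a):
    """C = A^2 mod 2^32."""
    return mul_mod_2_32(a, a)

def mul_mod_2_32_bytewise(a, b):
    """Same result from 8x8-bit partial products (limb 0 is the least significant byte)."""
    a_bytes = [(a >> (8 * i)) & 0xFF for i in range(4)]
    b_bytes = [(b >> (8 * j)) & 0xFF for j in range(4)]
    acc = 0
    for i in range(4):
        # Partial products with i + j >= 4 are multiples of 2^32 and vanish.
        for j in range(4 - i):
            acc += (a_bytes[i] * b_bytes[j]) << (8 * (i + j))
    return acc & MASK32
```

Because the high half of the product is discarded, only 10 of the 16 byte-by-byte partial products are needed.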
56
Topic 2
Application: digital filters
Function: 64-bit signed multiplier-accumulator (MAC) accumulating at least 256 partial products
$C = \sum_{i=1}^{256} A_i \cdot B_i$
Optimization: Hardware: maximum throughput, limited area; Software: minimum execution time, limited memory
Environment: hardware, software for a general purpose DSP or microprocessor
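One design question behind this topic is how wide the accumulator must be. The following sketch, with hypothetical helper names, models the MAC with unbounded Python integers and checks the worst case for 256 products of 64-bit two's-complement operands; it is a reference model under those assumptions, not a hardware design.

```python
def fits_signed(value, bits):
    """True if value is representable as a two's-complement number of the given width."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1

def mac(a_values, b_values, operand_bits=64):
    """Reference model: C = sum of A_i * B_i over signed operands."""
    acc = 0
    for a, b in zip(a_values, b_values):
        if not (fits_signed(a, operand_bits) and fits_signed(b, operand_bits)):
            raise ValueError("operand does not fit in the operand width")
        acc += a * b
    return acc

# Worst case: every product is (-2^63) * (-2^63) = 2^126, summed 256 times.
most_negative = -(1 << 63)
worst_case = mac([most_negative] * 256, [most_negative] * 256)
```

The worst-case sum is 2^134, so a signed accumulator needs 136 bits to hold 256 such products without overflow.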
57
Topic 3
Application: general purpose microprocessor
Function: multiplication of two 64-bit signed numbers + division of a 128-bit number by a 64-bit number
$C = A \cdot B$, $C = A / B$
Optimization: Hardware: minimum latency, maximum throughput, limited area; Software: minimum execution time, limited memory
Environment: hardware, software for a 64-bit processor without multiplication and division built in
58
Topic 4
Application: modern public-key ciphers: RSA, Diffie-Hellman, Elliptic Curve Cryptosystems
Function: modular exponentiation $C = M^E \bmod N$; $M$, $N$ are arbitrary 768-bit numbers, $E = 2^{16} + 1$
$C = A^E \bmod N$
Optimization: Hardware: minimum latency, limited area; Software: minimum execution time, limited memory
Environment: hardware, software for 32-bit or 8-bit processors
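A common way to compute a modular power is left-to-right binary (square-and-multiply) exponentiation; the slide does not prescribe an algorithm, so this choice is an assumption. The sketch below counts the operations, which shows why an exponent such as 2^16 + 1 is cheap. The function name and the sample 768-bit values are hypothetical.

```python
def mod_exp(base, exponent, modulus):
    """Left-to-right square-and-multiply.

    Returns (result, squarings, multiplications).
    """
    if exponent < 0 or modulus <= 0:
        raise ValueError("exponent must be >= 0 and modulus > 0")
    if exponent == 0:
        return 1 % modulus, 0, 0
    bits = bin(exponent)[2:]
    result = base % modulus  # accounts for the leading 1 bit
    squarings = multiplications = 0
    for bit in bits[1:]:
        result = (result * result) % modulus
        squarings += 1
        if bit == "1":
            result = (result * base) % modulus
            multiplications += 1
    return result, squarings, multiplications

# Illustrative 768-bit operands (not taken from the slides).
N = (1 << 767) + 1
M = (1 << 766) + 7
C, n_squarings, n_multiplications = mod_exp(M, (1 << 16) + 1, N)
```

With E = 2^16 + 1 the exponent has only two 1 bits, so the loop performs 16 modular squarings and a single modular multiplication.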
59
Topic 5
Application: general purpose microprocessor or digital signal processor
Function: floating point addition and multiplication according to ANSI/IEEE 754
$Z = X + Y$, $Z = X \cdot Y$
Optimization: Hardware: minimum latency, maximum throughput, limited area; Software: minimum execution time, limited memory
Environment: hardware, software for a 32-bit processor without floating point operations
60
Famous computer arithmetic bugs and flaws
61
Learn to deal with approximations In digital arithmetic one has to come to grips with approximation and questions like: –When is approximation good enough –What margin of error is acceptable Be aware of the applications you are designing the arithmetic circuit or program for. Analyze the implications of your approximation.
62
Calculators
$u = \sqrt{\sqrt{\cdots\sqrt{2}}}$ (10 times) $= 1.000\,677\,131$
$v = 2^{1/1024} = 1.000\,677\,131$
$x = (((u^2)^2)\cdots)^2$ (10 times) $= 1.999\,999\,963$
$x' = u^{1024} = 1.999\,999\,973$
$y = (((v^2)^2)\cdots)^2$ (10 times) $= 1.999\,999\,983$
$y' = v^{1024} = 1.999\,999\,994$
Hidden digits in the internal representation of numbers
Different algorithms give slightly different results
Very good accuracy
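The calculator experiment can be imitated in software by limiting the number of significant digits. The sketch below uses Python's `decimal` module with an assumed working precision; it will not reproduce the exact digits of any particular calculator, only the effect that a small rounding error in u is amplified by the ten squarings.

```python
from decimal import Decimal, localcontext

def root_then_square(digits, steps=10):
    """Take the square root of 2 `steps` times, then square `steps` times.

    All operations are rounded to `digits` significant digits.
    Returns (u, x) as floats, where u approximates 2^(1/1024) for steps=10.
    """
    with localcontext() as ctx:
        ctx.prec = digits
        value = Decimal(2)
        for _ in range(steps):
            value = value.sqrt()
        u = value
        for _ in range(steps):
            value = value * value
        return float(u), float(value)

u10, x10 = root_then_square(10)  # roughly what a 10-digit display can hold
u16, x16 = root_then_square(16)  # a few extra "hidden" digits
```

Comparing `x10` and `x16` against 2 gives a feel for how much extra guard digits in the internal representation help.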
63
Consequences of bad approximations
Example: Failure of Patriot Missile (1991 Feb. 25)
Source: http://www.math.psu.edu/dna/455.f96/disasters.html
An American Patriot Missile battery in Dhahran, Saudi Arabia, failed to intercept an incoming Iraqi Scud missile
The Scud struck an American Army barracks, killing 28
Cause, per GAO/IMTEC-92-26 report: “software problem” (inaccurate calculation of the time since boot)
Specifics of the problem: time in tenths of a second, as measured by the system’s internal clock, was multiplied by 1/10 to get the time in seconds
Internal registers were 24 bits wide
$1/10 = 0.0001\,1001\,1001\,1001\,1001\,100_2$ (chopped to 24 b)
Error $\approx 0.1100\,1100_2 \times 2^{-23} \approx 9.5 \times 10^{-8}$
Error in 100-hr operation period $\approx 9.5 \times 10^{-8} \times 100 \times 60 \times 60 \times 10 = 0.34$ s
Distance traveled by Scud $= (0.34\ \mathrm{s}) \times (1676\ \mathrm{m/s}) \approx 570$ m
This put the Scud outside the Patriot’s “range gate”
Ironically, the fact that the bad time calculation had been improved in some (but not all) code parts contributed to the problem, since it meant that inaccuracies did not cancel out
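The Patriot numbers can be reproduced with exact rational arithmetic. The sketch below assumes, as the slide's bit pattern suggests, that 1/10 was chopped to 23 fractional bits; the variable names are illustrative.

```python
from fractions import Fraction

FRACTION_BITS = 23  # assumption: 23 bits after the binary point, as in the slide's bit pattern

exact_tenth = Fraction(1, 10)
# Chop (truncate) 1/10 to the available fractional bits.
chopped_tenth = Fraction(int(exact_tenth * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = exact_tenth - chopped_tenth  # about 9.5e-8

ticks_in_100_hours = 100 * 60 * 60 * 10  # clock ticks every tenth of a second
drift_seconds = float(error_per_tick * ticks_in_100_hours)  # about 0.34 s

scud_speed_m_per_s = 1676
distance_m = drift_seconds * scud_speed_m_per_s
```

Carrying the unrounded drift through gives roughly 575 m; the slide's 570 m comes from rounding the drift to 0.34 s before multiplying.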
64
Consequences of bad approximations
Example: Explosion of Ariane Rocket (1996 June 4)
Source: http://www.math.psu.edu/dna/455.f96/disasters.html
Unmanned Ariane 5 rocket launched by the European Space Agency veered off its flight path, broke up, and exploded only 30 seconds after lift-off (altitude of 3700 m)
The $500 million rocket (with cargo) was on its 1st voyage after a decade of development costing $7 billion
Cause: “software error in the inertial reference system”
Specifics of the problem: a 64-bit floating point number relating to the horizontal velocity of the rocket was being converted to a 16-bit signed integer
An SRI (inertial reference system) software exception arose during conversion because the 64-bit floating point number had a value greater than what could be represented by a 16-bit signed integer (max 32 767)
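The conversion hazard described here is easy to demonstrate. The sketch below uses hypothetical function names and contrasts a conversion that raises on out-of-range input with one that saturates; it illustrates the arithmetic issue only and does not model the actual flight software.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_checked(x):
    """Convert a float to a 16-bit signed integer, raising if it does not fit."""
    n = int(x)  # truncates toward zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in a 16-bit signed integer")
    return n

def to_int16_saturating(x):
    """Alternative policy (an assumption for illustration): clamp to the representable range."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))
```

Which policy is right depends on the application: an exception must be handled somewhere, while saturation silently changes the value.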
65
Pentium bug (1)
October 1994: Thomas Nicely, Lynchburg College, Virginia, finds an error in his computer calculations and traces it back to the Pentium processor
November 7, 1994: first press announcement, Electronic Engineering Times
Late 1994: Tim Coe, Vitesse Semiconductor, presents an example with the worst-case error
c = 4 195 835 / 3 145 727
Pentium = 1.333 739 06...
Correct result = 1.333 820 44...
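The size of the error in this example can be checked directly. The sketch below takes the flawed quotient from the slide (truncated to the digits shown there) and compares it with the quotient computed in double precision; the variable names are illustrative.

```python
numerator = 4_195_835
denominator = 3_145_727

correct = numerator / denominator
pentium_reported = 1.33373906  # value quoted on the slide, truncated

absolute_error = correct - pentium_reported
relative_error = absolute_error / correct  # about 6.1e-5
```

A relative error near 6 × 10^-5 means the result is already wrong in the fifth significant digit, far worse than the rounding error expected from double-precision division.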
66
Pentium bug (2)
Intel admits “subtle flaw”
November 30, 1994: Intel’s white paper about the bug and its possible consequences
Intel: average spreadsheet user affected once in 27,000 years
IBM: average spreadsheet user affected once every 24 days
Replacements based on customer needs
December 20, 1994: announcement of no-questions-asked replacements
67
Pentium bug (3)
Error traced back to the look-up table used by the radix-4 SRT division algorithm
2048 cells, 1066 non-zero values {-2, -1, 1, 2}
5 non-zero values not downloaded correctly to the look-up table due to an error in the C script