SCOTT MILLER, AMBROSE CHU, MIHAI SIMA, MICHAEL MCGUIRE ReCoEng Lab DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING UNIVERSITY OF.

Presentation transcript:

SCOTT MILLER, AMBROSE CHU, MIHAI SIMA, MICHAEL MCGUIRE ReCoEng Lab DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING UNIVERSITY OF VICTORIA VICTORIA, B.C., CANADA VLSI Implementation of a Cryptography-Oriented Reconfigurable Array DSD Parma, Italy ReCoEng Lab, University of Victoria, Canada

Outline: Motivation and Problem Statement; Overview of Current FPGAs; Limitations for Cryptography; Carry-Lookahead Addition; CryptoRA Tile Implementation, Split LUT; Simulation Framework; Results; Conclusions

Motivation. Problem: cryptography on mobile, embedded systems; ASICs are expensive (recurring engineering, quick obsolescence); poor long-integer arithmetic support in current FPGAs. Design constraints: low added complexity; no (negligible) impact on reconfigurability; “cheap” solution.

Overview of FPGAs: grid of computing units (CLBs); mesh of configurable interconnection buses; can emulate any digital logic function; global interconnect is slow.

Overview of FPGAs (Xilinx Virtex-II): 4-input LUT; support for ripple-carry and carry-lookahead adders.

Carry-Lookahead Addition: ripple-carry adders have a serial delay; carry-lookahead (CLA) adders calculate the carries in parallel; hierarchies of CLA adders can be used to speed up long-operand calculations. Operand bit pairs for CLA: both bits 1 = Generate; exactly one bit 1 = Propagate; both bits 0 = Nothing.
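
The generate/propagate idea on this slide can be illustrated with a short Python sketch. It is not taken from the presentation: the function names and the bit-list representation are assumptions made for illustration, and the software loop evaluates the carry recurrence one bit at a time, whereas hardware flattens it so that all carries are produced in parallel.

```python
def cla_signals(a, b, width):
    """Return per-bit generate and propagate lists (LSB first)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]  # both bits 1
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]  # exactly one bit 1
    return g, p


def cla_add(a, b, width, carry_in=0):
    """Add two width-bit integers using generate/propagate signals.

    Returns (sum modulo 2**width, carry_out).
    """
    g, p = cla_signals(a, b, width)
    carries = [carry_in]
    for i in range(width):
        # c[i+1] = g[i] OR (p[i] AND c[i]); a hardware CLA expands this
        # recurrence so every carry depends only on g, p and carry_in.
        carries.append(g[i] | (p[i] & carries[i]))
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # sum bit = propagate XOR carry-in
    return total, carries[width]
```

For example, `cla_add(11, 6, 4)` adds 1011 and 0110 in four bits and returns the truncated sum together with the carry out.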

FPGAs: Limitations for Cryptography. Poor support for long-integer arithmetic: long ripple-carry chains (with global interconnects); fast adders still require multiple stages of global interconnects. The same difficulties apply to comparison operations, which are required in most common ECC and RSA algorithms.
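
To see why comparison inherits the difficulties of long addition, consider the following Python sketch. It is an illustration under stated assumptions, not the authors' design: the helper names are hypothetical, and modular addition is used only as a familiar example of a step that needs a full-width comparison. The comparison is done as a subtraction, so its result is the carry out of the entire operand width.

```python
def add_with_carry(a, b, width, carry_in=0):
    """Add two width-bit values; return (sum modulo 2**width, carry_out)."""
    mask = (1 << width) - 1
    total = (a & mask) + (b & mask) + carry_in
    return total & mask, total >> width


def greater_or_equal(a, b, width):
    """Compare unsigned width-bit values via a + NOT(b) + 1.

    The carry out of the full-width addition is 1 exactly when a >= b,
    so the answer depends on the whole carry chain.
    """
    mask = (1 << width) - 1
    _, carry_out = add_with_carry(a, ~b & mask, width, carry_in=1)
    return carry_out == 1


def mod_add(a, b, m, width):
    """Modular addition for a, b < m < 2**width: add, compare, then subtract."""
    s = a + b
    # s can need width + 1 bits, so compare at that width.
    if greater_or_equal(s, m, width + 1):
        s -= m
    return s
```

With operands of 1024 bits or more, as in common RSA settings, each such comparison exercises a carry chain as long as the operand itself.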

FPGAs: Limitations for Cryptography

Proposed Solution: CryptoRA. Based on the Xilinx architecture; an additional fast path is provided for simultaneous Carry and Propagate signals; extends the fast path across rows as well as columns; splits the LUT to handle subtraction, etc.

CryptoRA: Split LUT

VLSI Modeling

Simulation Framework. All designs simulated in 65 nm technology: simulated with the Cadence Spectre simulator; results averaged over 10 Monte Carlo runs with process variation and mismatch included. Simplified CLB models were simulated: many components are outside the scope of this research; respective loads for the omitted modules were included. Timing was simulated at every point of interest in the LUT -> fast-chain path to find all timing trade-offs.

Results: Split LUT

Results: Split LUT

Results - Discussion. The performance boost from the added carry chain and additional fast path cannot be directly quantified: it depends on the physical FPGA itself and on the operand word length. Hierarchical carry-lookahead adders show promise for increased performance with the new chains. Example calculations are given in the paper. The performance comes at a 2.5% area increase over the smallest reference structure.
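
A hierarchical carry-lookahead adder can be sketched in Python as follows. This is a generic two-level illustration, not the CryptoRA circuit: the block size of 4 and the function names are assumptions. Each block reduces its bits to a single (Generate, Propagate) pair, and the carry between blocks is computed from those pairs alone, which is the kind of signal a dedicated Carry/Propagate fast path would transport.

```python
def block_gp(g, p):
    """Combine per-bit generate/propagate lists (LSB first) into one block pair."""
    G, P = 0, 1
    for gi, pi in zip(g, p):
        G = gi | (pi & G)  # block generates if a carry arises and survives upward
        P = P & pi         # block propagates only if every bit propagates
    return G, P


def hierarchical_cla_add(a, b, width, block=4):
    """Two-level CLA model: returns (sum modulo 2**width, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    total, carry = 0, 0
    for start in range(0, width, block):
        end = min(start + block, width)
        # Sum bits inside the block, starting from the block's carry-in.
        c = carry
        for i in range(start, end):
            total |= (p[i] ^ c) << i
            c = g[i] | (p[i] & c)
        # The carry into the next block needs only this block's (G, P) pair.
        G, P = block_gp(g[start:end], p[start:end])
        carry = G | (P & carry)
    return total, carry
```

For long operands the number of block-level carry steps is roughly the word length divided by the block size, so the achievable speed-up depends on the operand word length, consistent with the slide's remark.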

Conclusions. The split-LUT structure enhances performance at a minor (2.5%) area penalty. Increased carry-chain speed and avoidance of the global interconnect improve long-integer operation performance. Line-loading overhead from the extra fast chains is very small. This device shows promise for performing cryptographic operations.

Thank You for Listening. Any Questions? Scott Miller