FPGA Implementation of Multicore AES 128/192/256


By: William Whitehouse

Hi, my name is William Whitehouse. I am presenting my project, an FPGA implementation of multi-core AES 128/192/256.

Objectives
- Design an FPGA that is capable of AES encryption and decryption with 128, 192, and 256 bit keys
- Optimize encryption/decryption throughput by implementing multiple cores
- Synthesize multi-core design and generate a simulated throughput

The purpose of this project was to build a field programmable gate array (FPGA) prototype that implements the Advanced Encryption Standard (AES) with key sizes of 128, 192, and 256 bits. As a proof of concept, multiple AES encryption and decryption cores were used in the design to improve the data throughput. Finally, the multi-core AES design with selectable key size was synthesized to generate a simulated throughput for comparison with a single-core design.

AES Core Modules Used
- AES Core Modules developed by Jerzy Gbur
- Contains:
  key_expansion.vhd – 128, 192, or 256 bit key expansion module
  aes_enc.vhd – AES encryption module
  aes_dec.vhd – AES decryption module

I used the AES core modules developed by Jerzy Gbur that are available on opencores.org. The original modules had a generic parameter that statically selects the key size (0 – 128-bit, 1 – 192-bit, 2 – 256-bit), so I modified the cores to make the key size an input that can be changed whenever a new key is loaded. These cores were chosen because they support all three key sizes and are easy to instantiate multiple times.

Single AES Core Timing

Key Size                       128-bit   192-bit   256-bit
Key Valid (cycles)             16        24        32
Key Expansion (cycles)         133       123       157
Enc/Dec Data Valid (cycles)    16        16        16
Enc/Dec (cycles)               99        119       139

The AES core uses an 8-bit data bus and valid signals to capture the key data. The key is loaded one byte per clock, so the number of key valid clocks depends on the size of the key. After the entire key is received, key expansion is performed, and its duration also depends on the key size. Once key expansion is complete, the encryption and decryption cores are ready. The encryption/decryption cores accept one byte of data per cycle, so a 128-bit block takes 16 cycles to load. Once all 128 bits are received, the encryption/decryption begins. The time to encrypt/decrypt is also dependent on the key size. The cores output the encrypted or decrypted data one byte at a time.
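The single-core timing figures can be tallied with a few lines of Python. This is only a rough sketch: it assumes the key load, key expansion, data load, and enc/dec phases run strictly back to back, and that the selector values 0/1/2 map to 128/192/256 bits as in the original generic. The cycle counts are copied from the timing table; the function names are made up for illustration.

```python
# Cycle counts copied from the single-core timing table.
KEY_BITS = {0: 128, 1: 192, 2: 256}            # key-size selector -> key length
KEY_EXPANSION_CYCLES = {128: 133, 192: 123, 256: 157}
ENC_DEC_CYCLES = {128: 99, 192: 119, 256: 139}
BUS_WIDTH_BITS = 8                             # one byte per clock
BLOCK_BITS = 128                               # AES block size


def key_valid_cycles(key_bits):
    """Clocks needed to shift the key in over the 8-bit bus."""
    return key_bits // BUS_WIDTH_BITS


def cycles_to_first_result(key_sel):
    """Clocks from the first key byte until enc/dec of the first block is done.

    Assumption: the four phases do not overlap.
    """
    bits = KEY_BITS[key_sel]
    data_valid = BLOCK_BITS // BUS_WIDTH_BITS  # 16 clocks to load one block
    return (key_valid_cycles(bits) + KEY_EXPANSION_CYCLES[bits]
            + data_valid + ENC_DEC_CYCLES[bits])


for sel in (0, 1, 2):
    bits = KEY_BITS[sel]
    print(bits, key_valid_cycles(bits), cycles_to_first_result(sel))
```

Under these assumptions the key load takes 16, 24, and 32 clocks, which matches the Key Valid row of the table.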

Single Core Wave

This waveform illustrates the key expansion, encryption, and decryption timing for each of the key sizes for a single-core AES design. Notice that the enc_ready signal is de-asserted as soon as the first byte of encryption data is received. The next 128-bit block cannot be started until this single core has completed the first encryption. The encryption output is looped back into the decryption input, and the decrypted output is checked to verify that it matches the encryption input. The 192-bit and 256-bit key expansion, encryption, and decryption are verified with the same loopback.

Top Level Diagram

This is a top-level diagram of the design. The aes_top.vhd module uses constants to specify how many AES encryption and AES decryption cores are implemented. The encryption and decryption cores use the same key but are separate, so encryption and decryption can be done simultaneously. Combinational and sequential logic in the aes_top module directs the incoming encryption and decryption data to cores that are not busy. Once a core drives its output, the top-level module sends that data on the output bus along with the output valid signal. After the last output byte is driven, the core becomes "not busy" and can then receive new input data to encrypt/decrypt.
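The "send data to a core that is not busy" behaviour can be modelled in software. The sketch below is a behavioural illustration in Python, not the VHDL in aes_top.vhd: the first-idle selection policy, the function names, and the cycle numbers in the usage example are all assumptions made for the example.

```python
def pick_idle_core(busy_until, now):
    """Return the index of the first core that is idle at cycle `now`, or None."""
    for index, free_at in enumerate(busy_until):
        if free_at <= now:
            return index
    return None


def simulate(num_cores, busy_cycles, block_interval, num_blocks):
    """Offer one block every `block_interval` cycles.

    Returns (number of distinct cores used, number of blocks that found
    every core busy). In hardware a busy design would de-assert its ready
    signal instead; here such blocks are simply counted.
    """
    busy_until = [0] * num_cores   # cycle at which each core becomes idle
    used = set()
    stalled = 0
    for block in range(num_blocks):
        now = block * block_interval
        core = pick_idle_core(busy_until, now)
        if core is None:
            stalled += 1
            continue
        busy_until[core] = now + busy_cycles
        used.add(core)
    return len(used), stalled


# Illustrative numbers only: a core stays busy for 50 cycles per block.
print(simulate(num_cores=4, busy_cycles=50, block_interval=17, num_blocks=20))
print(simulate(num_cores=2, busy_cycles=50, block_interval=17, num_blocks=6))
```

With enough cores the input never stalls and some cores may stay unused. With too few, blocks arrive while every core is busy.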

Multi-Core Timing

Key Size                     128-bit   192-bit   256-bit
Initial Enc/Dec (cycles)     99        119       139
Enc/Dec (cycles)             17        17        17
Max Cores Utilized           8         9         11

The multi-core design requires at least one idle clock between successive 16-byte blocks of input data for encryption or decryption. The time to encrypt/decrypt the first 16 bytes is the same as in the single-core design. After the initial encrypt/decrypt, a block of data is output every 17 clock cycles (16 valid data and 1 idle). The 128-bit blocks of encrypted or decrypted data are output in the same order as the data was input. The maximum number of cores utilized depends on the current key size. Since 11 cores are needed to reach the maximum encryption/decryption rate for the 256-bit key size, that is the number used to synthesize the design and generate the simulated throughput.
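The "Max Cores Utilized" row can be reproduced with a simple estimate. The presentation does not spell out how long a core stays busy, so the sketch below assumes a core is occupied for 16 input clocks, the enc/dec latency, and 16 output clocks. That assumption happens to give 8, 9, and 11, but other busy-time models may fit equally well.

```python
import math

ENC_DEC_CYCLES = {128: 99, 192: 119, 256: 139}   # from the timing table
BLOCK_BYTES = 16        # one byte per clock, in and out
INPUT_INTERVAL = 17     # 16 valid clocks plus 1 idle clock per block


def cores_needed(key_bits):
    """Cores required so that an idle core exists whenever a new block arrives.

    Assumption (not stated in the presentation): a core is busy for
    16 input clocks + enc/dec latency + 16 output clocks.
    """
    busy = BLOCK_BYTES + ENC_DEC_CYCLES[key_bits] + BLOCK_BYTES
    return math.ceil(busy / INPUT_INTERVAL)


print({bits: cores_needed(bits) for bits in ENC_DEC_CYCLES})
# {128: 8, 192: 9, 256: 11}
```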

Multi-core Enc/Dec Wave

This waveform illustrates the benefit of using multiple AES cores. The key expansion time is the same regardless of the number of cores. When using a single core, the throughput is limited by the time it takes to encrypt/decrypt one block. With 11 cores, however, the throughput (after the initial encrypt/decrypt) is limited only by the rate at which data is input (one block every 17 cycles if data is input continuously). Notice how the enc_ready and dec_ready signals stay asserted after the key expansion is completed for the 128-bit and 192-bit key sizes. For the 256-bit key the ready signals are de-asserted for 1 clock cycle, but because an idle clock is required between input blocks anyway, this does not slow down the encrypt/decrypt rate.

AES Design Synthesis

Target FPGA: Xilinx Virtex 5 XC5VLX110

                          Single Core     11 Core
Slice Registers           693             7282
Slice LUTs                2416            26121
Block RAMs                4               44
Max Freq. (MHz)           297.95          297.42
Max Throughput (MBps)     39 (128-bit)    266
                          33 (192-bit)
                          29 (256-bit)

The target FPGA for design synthesis was the Xilinx Virtex-5 XC5VLX110 because it is optimized for high-performance logic and has enough flip-flops to be only 40% utilized by an AES design with 11 enc/dec cores. This allowed the maximum frequency (about 297 MHz) of both the single-core and the multi-core design to be limited by the logic and not by the size constraints of the target FPGA.

The throughput of the single-core design is limited by the encryption/decryption timing of a single core. Because the key size determines how long a 128-bit block takes to encrypt or decrypt, the maximum throughput also depends on the key size. The 11-core design has about a 7x improvement over the single-core 128-bit throughput (266 vs. 39 MBps), and more for the larger key sizes.

Each encryption or decryption core adds about 330 slice registers, 1200 slice LUTs, and 2 block RAMs. So it is possible to sacrifice some of the throughput (by using fewer cores) to meet the resource constraints of a smaller FPGA. It is also possible to limit the design to encryption only or decryption only and cut the size of the full 11-core design in half.
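The throughput and resource figures can be sanity-checked with a short calculation. The sketch below is an interpretation rather than the author's stated method. It assumes that "MBps" means 2**20 bytes per second, that a single core needs 16 input clocks plus the enc/dec latency per block, and that the table values are truncated rather than rounded. Under those assumptions the numbers come out as 39, 33, 29, and 266. The resource estimate simply multiplies the approximate per-core figures quoted in the notes.

```python
BLOCK_BYTES = 16
MIB = 2 ** 20
ENC_DEC_CYCLES = {128: 99, 192: 119, 256: 139}   # from the timing table


def throughput_mib_per_s(freq_mhz, cycles_per_block):
    """Throughput at `freq_mhz`, in units of 2**20 bytes per second."""
    return freq_mhz * 1e6 * BLOCK_BYTES / cycles_per_block / MIB


def single_core(key_bits, freq_mhz=297.95):
    # Assumption: 16 input clocks plus the enc/dec latency per block.
    return throughput_mib_per_s(freq_mhz, BLOCK_BYTES + ENC_DEC_CYCLES[key_bits])


def multi_core(freq_mhz=297.42):
    # One block every 17 clocks once enough cores are running.
    return throughput_mib_per_s(freq_mhz, 17)


def estimate_resources(enc_cores, dec_cores):
    """Rough totals from the approximate per-core figures in the notes."""
    cores = enc_cores + dec_cores
    return {
        "slice_registers": 330 * cores,
        "slice_luts": 1200 * cores,
        "block_rams": 2 * cores,
    }


for bits in (128, 192, 256):
    print(bits, int(single_core(bits)))
print("11 cores", int(multi_core()))
print(estimate_resources(11, 11))
```

For 11 encryption and 11 decryption cores the estimate gives 7260 registers, 26400 LUTs, and 44 block RAMs, close to the synthesized 7282, 26121, and 44.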

Growth Opportunities
- Add more Enc/Dec inputs and 11-core modules
- Optimize the design for size
- Allow the key to be different for each core
- Use different AES Core

Some areas for improvement on this multi-core AES design are:

Increase the bandwidth by having multiple encryption/decryption I/O ports and supporting multiple 11-core modules (effectively multiplying the throughput, but also the size).

Each encryption and decryption core has its own key expansion module. That could be extracted and put in the top level to significantly decrease the resource utilization. However, this would require more logic in the top level so that each core could simultaneously access different addresses of the key expansion block RAM.

Each encrypt and decrypt core shares the same key and key size. The design could add inputs to select the key size and key data for each of the cores individually. This may not be a necessary use case, but the ability could be added to the current design.

This multi-core AES design could be improved by using an AES core that needs fewer clock cycles for a single encryption/decryption. However, that may decrease the maximum frequency.

Questions
- Design is located at: http://code.google.com/p/multicore-aes-fpga/
- If you have any questions please email me at: wdwhiteh@iastate.edu

If you have any questions, feel free to email me. I used Google Code to maintain version control during the development of this project, so all the VHDL design files are available online under an open source license.