ECE 656M Embedded Systems Design And Prototyping Term 3, 2011-2012.


Cesar A. Llorente
Research and teaching interests: reconfigurable computing, machine vision, energy systems
Contact: Electronics and Communications Engineering, College of Engineering

ECE 545 Lecture
Projects: Project 1 30%, Project 2 20%
Homework: 10%
Exams: Quiz 20% (in class), Final 20% (take-home)

Lecture (1)
Lecture 1 – Introduction to Embedded Systems
Lecture 2 – Introduction to VHDL Combinational Logic. Packages and Components.
Hands-on Session 1: XST Synthesis and Simulation
Lecture 3 – Behavioral Modeling of Sequential Logic. Registers, Counters, Shift Registers. Simple Testbenches.
Lecture 4 – Introduction to FPGA Devices & Tools
Hands-on Session 2: Tools for FPGA Synthesis and Implementation
Lecture 5 – Finite State Machines
Lecture 6 – Algorithmic State Machines. Memories: RAM, ROM.
Lecture 7 – Advanced Testbenches. File I/O.
Lecture 8 – Mixed Style RTL Modeling
Quiz 1

Lecture (2)

Textbooks
Required Textbooks:
Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004
Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998
Supplementary Textbooks:
Stephen Brown and Zvonko Vranesic, Fundamentals of Digital Logic with VHDL Design, 2nd Edition, McGraw-Hill, 2005
Peter J. Ashenden, The Designer's Guide to VHDL, 2nd Edition, San Francisco: Morgan Kaufmann, 1996, 2002

Quiz
2 hours 30 minutes
in class
design-oriented
open-book, open-notes
Tentative date:

Final Examination
take-home
full design, including logic synthesis and timing analysis for FPGAs
Tentative date:

Project technologies FPGA: Field Programmable Gate Arrays

World of Integrated Circuits
Integrated Circuits: Full-Custom ASICs, Semi-Custom ASICs, User Programmable
User Programmable: PLD, FPGA
PLD: PAL, PLA, PML
FPGA: LUT (Look-Up Table), MUX, Gates

Two competing implementation approaches
ASIC (Application Specific Integrated Circuit):
– designed all the way from behavioral description to physical layout
– designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field Programmable Gate Array):
– no physical layout design; design ends with a bitstream used to configure a device
– bought off the shelf and reconfigured by designers themselves

Which Way to Go?
ASICs: High performance, Low power, Low cost in high volumes
FPGAs: Off-the-shelf, Low development cost, Short time to market, Reconfigurability

What is an FPGA Chip?
Field Programmable Gate Array
A chip that can be configured by the user to implement different digital hardware
Configurable Logic Blocks and Programmable Switch Matrices
I/O Block
Bitstream to configure: function of each block & the interconnection between logic blocks
Source: [Brown99]

CLB Structure

[Figure: CLB slice containing two look-up tables (inputs F1–F4 and G1–G4), carry & control logic (CIN, COUT), and two D flip-flops (D, Q, CK, EC, S, R); other slice signals: F5IN, BY, SR, CLK, CE; outputs X, XB, Y, YB]

LUT (Look-Up Table) Functionality
Look-up tables are the primary elements for logic implementation
Each LUT can implement any function of 4 inputs
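The claim that one LUT can implement any function of 4 inputs follows from the LUT being a 16-entry memory addressed by its inputs. The sketch below is a software model only, written in Python rather than VHDL so it can be run directly; the names `make_lut4` and `lut4_read` and the input-to-address bit order are assumptions for illustration, not part of any vendor tool.

```python
def make_lut4(func):
    """Precompute the 16-entry truth table of a 4-input Boolean function."""
    table = []
    for index in range(16):
        # Bit k of the index is the value on input k (input 0 is the LSB).
        bits = [(index >> k) & 1 for k in range(4)]
        table.append(func(*bits) & 1)
    return table


def lut4_read(table, f1, f2, f3, f4):
    """The four inputs form the address into the stored truth table."""
    return table[f1 | (f2 << 1) | (f3 << 2) | (f4 << 3)]


# Two different functions stored in the same kind of 16-bit table.
xor4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
and_or = make_lut4(lambda a, b, c, d: (a & b) | (c & d))
```

Because the table has 16 one-bit entries, there are 2^16 = 65536 possible contents, one for each Boolean function of 4 inputs.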

Major FPGA Vendors
SRAM-based FPGAs: Xilinx, Inc.; Altera Corp.; Atmel; Lattice Semiconductor
Flash & antifuse FPGAs: Actel Corp.; QuickLogic Corp.
Share over 60% of the market

Xilinx FPGA Families
Old families
– XC3000, XC4000, XC5200
– old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
Low-cost families
– Spartan/XL – derived from XC4000
– Spartan-II – derived from Virtex
– Spartan-IIE – derived from Virtex-E
– Spartan-3
High-performance families
– Virtex (0.22µm)
– Virtex-E, Virtex-EM (0.18µm)
– Virtex-II, Virtex-II PRO (0.13µm)
– Virtex-4 (0.09µm)

Design process (1)
Specification:
Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on the 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..
VHDL description (Your VHDL Source Files):
library IEEE;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity RC5_core is
    port(
        clock, reset, encr_decr: in std_logic;
        data_input: in std_logic_vector(31 downto 0);
        data_output: out std_logic_vector(31 downto 0);
        out_full: in std_logic;
        key_input: in std_logic_vector(31 downto 0);
        key_read: out std_logic
    );
end RC5_core;
Steps: Functional simulation, Synthesis, Post-synthesis simulation

Design process (2)
Implementation (Mapping, Placing & Routing)
Timing simulation
Configuration
On chip testing

Design Process control from Active-HDL

Simulation Tools Many others…

Logic Synthesis: VHDL description → Circuit netlist
architecture MLU_DATAFLOW of MLU is
    signal A1: STD_LOGIC;
    signal B1: STD_LOGIC;
    signal Y1: STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
begin
    A1 <= A when (NEG_A='0') else not A;
    B1 <= B when (NEG_B='0') else not B;
    Y <= Y1 when (NEG_Y='0') else not Y1;
    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;
    with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
end MLU_DATAFLOW;
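A software reference model is one way to produce expected outputs for a testbench of the MLU above. The following is a minimal sketch in Python (not VHDL), assuming all signals are single bits encoded as 0 or 1; the function name `mlu` and the argument order are illustrative choices.

```python
def mlu(a, b, neg_a, neg_b, neg_y, l1, l0):
    """Bit-level model of the MLU dataflow architecture (all arguments 0 or 1)."""
    a1 = a ^ neg_a                    # optional inversion of input A
    b1 = b ^ neg_b                    # optional inversion of input B
    mux = [a1 & b1,                   # "00": AND
           a1 | b1,                   # "01": OR
           a1 ^ b1,                   # "10": XOR
           1 ^ (a1 ^ b1)]             # others: XNOR
    y1 = mux[(l1 << 1) | l0]          # L1 is the most significant select bit
    return y1 ^ neg_y                 # optional inversion of the output


# Exhaustive table of expected outputs, one entry per input combination.
expected = {
    bits: mlu(*bits)
    for bits in [tuple((n >> k) & 1 for k in range(7)) for n in range(128)]
}
```

With only seven one-bit inputs, all 128 input combinations can be checked, so the `expected` dictionary covers the whole truth table.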

Synthesis Tools … and others

Features of synthesis tools
Interpret RTL code
Produce synthesized circuit netlist in a standard EDIF format
Give preliminary performance estimates
Some can display circuit schematics corresponding to EDIF netlist

Implementation After synthesis the entire implementation process is performed by FPGA vendor tools

Mapping
[Figure: logic mapped onto look-up tables LUT0–LUT5 and flip-flops FF1, FF2]

Placing
[Figure: CLB slices placed within the FPGA]

Routing
[Figure: programmable connections routed within the FPGA]

Design Process control from Active-HDL

Top Level ASIC Digital Design Flow
Design Inception → RTL Design → Synthesis + Macro Development → Place+Route → Physical Verification → Design Complete

RTL Design
Flow: Design Inception → RTL Design → Synthesis + Macro Development
Design Function: Digital Tool
RTL Design: Cadence NC Verilog, Mentor Graphics ModelSim
Testbench Development: Cadence NC Verilog, Mentor Graphics ModelSim
Mixed Mode Simulation: Cadence AMS Designer
FPGA Verification (user's discretion): Xilinx ISE
Lint Checking (user's discretion): Cadence Hal
Code Coverage (user's discretion): Cadence ICT
Formal Verification: Cadence Conformal
System Interface Simulation: Agilent ADS, Matlab

Synthesis + Macro Development
Flow: RTL → Synthesis + Macro Development → Place+Route
Design Function: Digital Tool
Synthesis: Synopsys DC, Cadence RC
Static Timing Analysis: Synopsys PrimeTime
Logical Equivalency: Cadence Conformal
DFT: Synopsys DFT Compiler, Cadence RC
Gate-Level Simulation: Cadence NC Verilog, Mentor Graphics ModelSim
Macro development functions: Macro Generation, Macro Verification, Macro Rules Generation / Library Generation
Macro development tools: Mentor Graphics Calibre, Artisan / Cadence DFII, Artisan Verification

Place + Route
Flow: Synthesis → Place + Route → Verification
Design Function: Digital Tool
Floorplan, Macro Placement / Std Cell Placement, -Based Optimization, Clock Tree Synthesis: Cadence Encounter
Route: Cadence NanoRoute
RC Extraction: Cadence Fire&Ice QX
Signal Integrity: Cadence CeltIC / VoltageStorm
Static Timing Analysis: Synopsys PrimeTime
ATPG: Mentor Graphics FastScan
Metal Fill, Spare Cells / Decoupling Cap, Filler Cells: Cadence Encounter

Physical Verification
Flow: Placed+Routed Design → Physical Verification → Design Complete
Design functions: GDSII Preparation / Schematic Preparation, DRC, LVS, ERC, Simulation Preparation, Back Annotated Simulation, Layout Chip Finishing, Top-Level Simulation
Digital tools: Cadence DFII, Cadence NC Verilog, Cadence Virtuoso, Mentor Graphics Calibre, Synopsys Nanosim, Cadence AMS Designer

CAD software available at DLSU (1)
VHDL simulators
Xilinx ISE 12.3 (under Windows), VCS (under Linux): available in the STRC111 Intel Microprocessors Lab
Free Student Edition: ISE WebPack, available in the STRC111 Intel Microprocessors Lab

CAD software available at DLSU (2)
Tools used for logic synthesis
Xilinx XST / EDK / SDK (under Windows): FPGA synthesis, available in the STRC111 Intel Microprocessors Lab

CAD software available at DLSU (3)
Tools used for implementation (mapping, placing & routing) in the FPGA technology
Xilinx XST (under Windows): FPGA synthesis, available in the STRC111 Intel Microprocessors Lab

Projects – Overview
Project 1 (35 points): January – February (~6 weeks)
Project 2 (35 points): March (~4 weeks)
Application: Game Application using Microblaze Processor. Technology: FPGA. Target: synthesizable code, downloadable code
Application: Game Software using state machines. Technology: FPGA. Target: synthesizable code, downloadable code

Projects 1, 2
choice between two project topics
- cryptography (e.g., encryption, authentication, hash)
- digital signal processing (e.g., digital filter, FFT, image processing, etc.)
both topics specified by the instructor
initial specification in the form of
- pseudocode and/or flowchart
- detailed interface
design and source code are required to be scalable, i.e., work for different parameters and operand sizes, specified at the time of synthesis

Example: Last year's project – RC6 cipher

Encryption
Input: (A, B, C, D), Table S[0..2r+3]
B = B + S[0]
D = D + S[1]
for i = 1 to r do {
    t = (B*(2B+1)) <<< log2 w
    u = (D*(2D+1)) <<< log2 w
    A = ((A ⊕ t) <<< u) + S[2i]
    C = ((C ⊕ u) <<< t) + S[2i+1]
    (A, B, C, D) = (B, C, D, A)
}
A = A + S[2r+2]
C = C + S[2r+3]
Output: (A, B, C, D)

Decryption
Input: (A, B, C, D), Table S[0..2r+3]
C = C – S[2r+3]
A = A – S[2r+2]
for i = r downto 1 do {
    (A, B, C, D) = (D, A, B, C)
    u = (D*(2D+1)) <<< log2 w
    t = (B*(2B+1)) <<< log2 w
    C = ((C – S[2i+1]) >>> t) ⊕ u
    A = ((A – S[2i]) >>> u) ⊕ t
}
D = D – S[1]
B = B – S[0]
Output: (A, B, C, D)
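The RC6 pseudocode above can be turned into a short software model, useful for generating reference data before writing VHDL. This is a minimal Python sketch under stated assumptions: word size w = 32, additions and multiplications taken modulo 2^w, rotation amounts reduced modulo w, and a round-key table `S` supplied by the caller (the key schedule is not modeled). The names `rc6_encrypt` and `rc6_decrypt`, and the demo table and plaintext, are hypothetical.

```python
W = 32                      # word size in bits (assumed)
MASK = (1 << W) - 1         # keeps every result within w bits
LG_W = 5                    # log2(W)


def rotl(x, n):
    """Rotate a w-bit word left by n (mod w) positions."""
    n %= W
    return ((x << n) | (x >> (W - n))) & MASK


def rotr(x, n):
    """Rotate a w-bit word right by n (mod w) positions."""
    n %= W
    return ((x >> n) | (x << (W - n))) & MASK


def rc6_encrypt(block, S, r):
    A, B, C, D = block
    B = (B + S[0]) & MASK
    D = (D + S[1]) & MASK
    for i in range(1, r + 1):
        t = rotl((B * (2 * B + 1)) & MASK, LG_W)
        u = rotl((D * (2 * D + 1)) & MASK, LG_W)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    A = (A + S[2 * r + 2]) & MASK
    C = (C + S[2 * r + 3]) & MASK
    return (A, B, C, D)


def rc6_decrypt(block, S, r):
    A, B, C, D = block
    C = (C - S[2 * r + 3]) & MASK
    A = (A - S[2 * r + 2]) & MASK
    for i in range(r, 0, -1):
        A, B, C, D = D, A, B, C
        u = rotl((D * (2 * D + 1)) & MASK, LG_W)
        t = rotl((B * (2 * B + 1)) & MASK, LG_W)
        C = rotr((C - S[2 * i + 1]) & MASK, t) ^ u
        A = rotr((A - S[2 * i]) & MASK, u) ^ t
    D = (D - S[1]) & MASK
    B = (B - S[0]) & MASK
    return (A, B, C, D)


# Demo with a made-up round-key table (NOT the output of a real key schedule).
ROUNDS = 20
demo_S = [(0x9E3779B9 * (i + 1)) & MASK for i in range(2 * ROUNDS + 4)]
demo_plain = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)
demo_cipher = rc6_encrypt(demo_plain, demo_S, ROUNDS)
demo_back = rc6_decrypt(demo_cipher, demo_S, ROUNDS)
```

Decrypting the ciphertext with the same table returns the original block, which is a convenient self-check for a hardware testbench as well.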

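Required interface
[Figure: encryption/decryption unit with control & i/o interface, connected to a key memory unit. Signals: clock, reset, enc_dec, data_in, data_available, data_read, data_out, write, full, key_available, key_read, ready. The unit sends the round number to the key memory unit and receives the round key(s) S_i. Bus widths are labeled m and w.]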

Projects 1, 2 Optimization Criteria
Maximum ratio Throughput / Circuit Area
or
Minimum product Latency · Circuit Area

Primary timing parameters
Latency: time to process a single block of data (Circuit: X_i → Y_i)
Throughput: number of bits processed in a unit of time (Circuit: X_i, X_i+1, X_i+2 → Y_i, Y_i+1, Y_i+2)
Throughput = (Block_size · Number_of_blocks_processed_simultaneously) / Latency
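As a worked example of the timing parameters, the sketch below evaluates the throughput formula together with the throughput-to-area ratio and the latency-area product. It is written in Python for quick calculation; the function names and all numeric values (block size, latency, area in slices) are hypothetical and chosen only to illustrate the arithmetic.

```python
def throughput_bits_per_s(block_size_bits, blocks_in_parallel, latency_s):
    """Throughput = block size * blocks processed simultaneously / latency."""
    return block_size_bits * blocks_in_parallel / latency_s


def throughput_to_area(throughput, area):
    """Criterion to maximize: throughput per unit of circuit area."""
    return throughput / area


def latency_area_product(latency_s, area):
    """Criterion to minimize: latency times circuit area."""
    return latency_s * area


# Hypothetical design: 128-bit blocks, one block at a time, 200 ns latency,
# occupying 1000 slices.
example_latency = 200e-9
example_area = 1000
example_throughput = throughput_bits_per_s(128, 1, example_latency)
example_ratio = throughput_to_area(example_throughput, example_area)
example_product = latency_area_product(example_latency, example_area)
```

In this made-up case the throughput is about 640 Mbit/s; processing two blocks simultaneously at the same latency would double it.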

Infinite Impulse Response (IIR) Filter Equations (1) Transfer function

Two investigated architectures Architecture 1: Direct II Form

Architecture 2: Cascade of second-order systems F_i(z)

Example of coefficients: Butterworth filter
Order O=10, Passband Fp=0.3
Architecture 1: Direct II Form
Architecture 2: Cascade of second-order systems
a[1..10] =
b[1..10] =
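The two architectures can be compared in software before committing to hardware. The following Python sketch assumes the common convention H(z) = (b[0] + b[1] z^-1 + ...) / (1 + a[1] z^-1 + ...), with a[0] = 1 and equal-length coefficient lists; the slides do not show the transfer function, so this sign convention is an assumption. The function names and the small demo coefficients are illustrative, not the Butterworth values from the slide.

```python
def iir_direct_form_2(b, a, samples):
    """Direct Form II: one shared delay line w[n-1..n-N] for both coefficient sets."""
    order = len(a) - 1
    delay = [0.0] * order              # delay[k-1] holds w[n-k]
    out = []
    for x in samples:
        # Feedback part: w[n] = x[n] - sum(a[k] * w[n-k])
        w0 = x - sum(a[k] * delay[k - 1] for k in range(1, order + 1))
        # Feedforward part: y[n] = sum(b[k] * w[n-k])
        y = b[0] * w0 + sum(b[k] * delay[k - 1] for k in range(1, order + 1))
        if order > 0:
            delay = [w0] + delay[:-1]  # shift the delay line by one sample
        out.append(y)
    return out


def iir_cascade(sections, samples):
    """Cascade architecture: feed the output of each section into the next."""
    for b, a in sections:
        samples = iir_direct_form_2(b, a, samples)
    return samples


impulse = [1.0, 0.0, 0.0, 0.0]
# One second-order filter with denominator (1 - 0.5 z^-1)^2 = 1 - z^-1 + 0.25 z^-2 ...
single = iir_direct_form_2([1.0, 0.0, 0.0], [1.0, -1.0, 0.25], impulse)
# ... and the same filter as a cascade of two first-order sections.
cascaded = iir_cascade([([1.0, 0.0], [1.0, -0.5])] * 2, impulse)
```

Both forms realize the same transfer function, so their impulse responses agree; in fixed-point hardware the cascade is often preferred because its sections are less sensitive to coefficient quantization.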

Required interface
[Figure: IIR filter with control unit & i/o interface. Signals: clock, reset, data_in, a_i, b_i, ab_write, process, data_out, ready, valid. Bus widths are labeled wi, wc and wo.]

Project 2b from FALL 2005 to be modified in FALL 2006

Using high-level behavioral VHDL, describe an 8-bit microcontroller MC68HC11E1, working in the expanded mode, with the following simplifications:
1. Inputs and outputs of the microcontroller are reduced to E (clock), RESETn (reset active low), RW (read/write), AS (address strobe), ADDR15..8 (also denoted as PB7..0), ADDR7..0/DATA7..0 (multiplexed address & data, also denoted as PC7..0), PORTD and PORTE.
[Figure: Microcontroller]

2. Internal registers are reduced to the registers A, IX, SP, CC (Condition Codes NZVC), and PC.
3. The only parts of 68HC11E1 implemented in your model are:
a. CPU
b. RAM (512 B in the range $0000-$01FF)
c. parallel I/O (PORTD and PORTE)
4. Internally generated clock E has a frequency of 2 MHz.
5. Internal I/O registers are limited to
PORTD at the memory address $1008
DDRD at the memory address $1009
PORTE at the memory address $100A
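The simplified memory map in items 3 and 5 amounts to a small address decoder. The sketch below models it in Python as a reference for the VHDL bus logic; the function name `decode_address` and the label "external" for every address outside the internal RAM and the three I/O registers are assumptions for illustration.

```python
INTERNAL_RAM = range(0x0000, 0x0200)          # 512 B: $0000-$01FF
IO_REGISTERS = {0x1008: "PORTD", 0x1009: "DDRD", 0x100A: "PORTE"}


def decode_address(addr):
    """Classify a 16-bit address according to the simplified memory map."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if addr in INTERNAL_RAM:
        return "internal RAM"
    if addr in IO_REGISTERS:
        return IO_REGISTERS[addr]
    # Everything else is assumed to go out on the expanded-mode external bus.
    return "external"
```

For example, `decode_address(0x01FF)` falls in the internal RAM, while `decode_address(0x0200)` is already outside it.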

6. Instruction set of the microcontroller is reduced to the following instructions
a. Data transfer instructions: LDAA, LDX, LDS, STAA, STX
b. Arithmetic instructions: CLRA, NEGA, ADDA, SUBA, ASRA, ASLA
c. Logic instructions: ANDA, ORAA, EORA
d. Data test instructions: CMPA, CPX, TSTA
e. Control instructions: BEQ, BGT, BHI, BSR, JSR, RTS, JMP
f. Stack instructions: PSHA, PULA, PSHX, PULX
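Most of the arithmetic instructions listed above update the N, Z, V and C condition codes, and getting these right is a frequent source of bugs in a behavioral model. The Python sketch below shows the flag computation for ADDA and SUBA on 8-bit operands, with C acting as the borrow flag for subtraction; the function names and the dictionary return format are illustrative, and the other instructions are not modeled.

```python
def adda(a, m):
    """8-bit add: returns (result, flags) for accumulator a and operand m."""
    total = a + m
    r = total & 0xFF
    flags = {
        "N": (r >> 7) & 1,                           # sign bit of the result
        "Z": int(r == 0),                            # result is zero
        "V": int(((a ^ r) & (m ^ r) & 0x80) != 0),   # signed overflow
        "C": int(total > 0xFF),                      # carry out of bit 7
    }
    return r, flags


def suba(a, m):
    """8-bit subtract a - m: returns (result, flags)."""
    r = (a - m) & 0xFF
    flags = {
        "N": (r >> 7) & 1,
        "Z": int(r == 0),
        "V": int(((a ^ m) & (a ^ r) & 0x80) != 0),   # signed overflow
        "C": int(m > a),                             # borrow
    }
    return r, flags
```

For instance, adding $01 to $7F gives $80 with N and V set, because the signed result no longer fits in 8 bits.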

7. Addressing modes of the microcontroller are reduced to the following modes
a. immediate
b. extended
c. indexed
d. inherent
e. relative
8. Main program is stored in the external RAM starting at the address $
After reset, PC is set to the address $0000 (internal RAM of MC68HC11) where the instruction JMP $4000 is located.

Microcontroller system
The implemented microcontroller system should consist of:
1. Microcontroller MC68HC11E1
2. 8 kB RAM, such as …
3. 74HC373 8-bit latch
4. 74HC138 decoder chip
5. Auxiliary gates, if needed

Write Cycle

Features of the model
1. Your model should allow cycle accurate modeling of the circuit behavior.
2. Your model should contain debugging features equivalent to the debugging features of the DLX model, discussed in class and described in Ashenden, Chapter …
3. Generic parameters passed to the model should include
a. name of the file with the contents of the external RAM
b. clk-to-output delay
c. debugging mode
4. Your model should report all undefined opcodes, treat them as NOP, and proceed to the next RAM address.

Testing and debugging
The behavior of your model should be carefully verified using a testbench instantiating your model with
a. the external RAM containing a valid program composed of a substantial subset of instructions implemented in the model
b. debugging mode set to the most detailed mode (trace_each_step)

Deliverables
1. All source code files.
2. Contents of the external RAM used for the model verification, in the hexadecimal notation, and expressed using the corresponding 68HC11 assembly language mnemonics.
3. The detailed log/report generated by your model for a given contents of RAM, and with the debugging mode set to trace_each_step.

All Projects - Organization
Projects divided into phases
Intermediate code submitted through WebCT at selected checkpoints and evaluated by the instructor and/or TA
Penalty points for falling behind the schedule (below 50% of the work that is supposed to be done by a certain deadline)
Feedback provided to students on a fair and best effort basis
Final report and code submitted through WebCT and graded using a full scale
Contest for the best results (bonus points awarded to the winners)
Penalty and bonus points added to the final grade

Honor Code Rules
All students are expected to write and debug their code individually
Students are encouraged to help and support each other in all problems related to the
- operation of the CAD tools,
- basic understanding of the problem.