ECE 656M Embedded Systems Design And Prototyping Term 3, 2011-2012.


1 ECE 656M Embedded Systems Design And Prototyping Term 3, 2011-2012

2 Cesar A. Llorente. Research and teaching interests: reconfigurable computing, machine vision, energy systems. Contact: Electronics and Communications Engineering, College of Engineering, cesar.llorente@dlsu.edu.ph

3 ECE 545 Grading
Lecture: Homework 10 %; exams: Quiz 20 % (in class), Final 20 % (take home)
Projects: Project 1 30 %, Project 2 20 %

4 Lecture (1)
Lecture 1 - Introduction to Embedded Systems
Lecture 2 - Introduction to VHDL. Combinational Logic. Packages and Components.
Hands-on Session 1: XST Synthesis and Simulation
Lecture 3 - Behavioral Modeling of Sequential Logic. Registers, Counters, Shift Registers. Simple Testbenches.
Lecture 4 - Introduction to FPGA Devices & Tools
Hands-on Session 2: Tools for FPGA Synthesis and Implementation
Lecture 5 - Finite State Machines
Lecture 6 - Algorithmic State Machines. Memories: RAM, ROM.
Lecture 7 - Advanced Testbenches. File I/O.
Lecture 8 - Mixed Style RTL Modeling
Quiz 1

5 Lecture (2)

6 Textbooks
Required Textbooks:
Volnei A. Pedroni, Circuit Design with VHDL, The MIT Press, 2004
Sundar Rajan, Essential VHDL: RTL Synthesis Done Right, S & G Publishing, 1998
Supplementary Textbooks:
Stephen Brown and Zvonko Vranesic, Fundamentals of Digital Logic with VHDL Design, 2nd Edition, McGraw-Hill, 2005
Peter J. Ashenden, The Designer's Guide to VHDL, 2nd Edition, San Francisco: Morgan Kaufmann, 1996, 2002

7 Quiz: 2 hours 30 minutes, in class, design-oriented, open-books, open-notes. Tentative date:

8 Final Examination: take-home, full design, including logic synthesis and timing analysis for FPGAs. Tentative date:

9 Project technologies FPGA: Field Programmable Gate Arrays

10 World of Integrated Circuits
Integrated Circuits: Full-Custom ASICs, Semi-Custom ASICs, User Programmable
User Programmable: PLD, FPGA
PLD: PAL, PLA, PML
FPGA: LUT (Look-Up Table), MUX, Gates

11 Two competing implementation approaches
ASIC (Application Specific Integrated Circuit): designed all the way from behavioral description to physical layout; designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field Programmable Gate Array): no physical layout design; design ends with a bitstream used to configure a device; bought off the shelf and reconfigured by designers themselves

12 Which Way to Go?
ASICs: High performance, Low power, Low cost in high volumes
FPGAs: Off-the-shelf, Low development cost, Short time to market, Reconfigurability

13 What is an FPGA Chip? Field Programmable Gate Array: a chip that can be configured by the user to implement different digital hardware. It contains Configurable Logic Blocks, Programmable Switch Matrices and I/O Blocks. A bitstream configures the function of each block and the interconnection between logic blocks. Source: [Brown99]

14 CLB Structure

15 CLB Slice [Figure: slice with two look-up tables (inputs G4-G1 and F4-F1), carry & control logic (CIN, COUT), and two flip-flops (D, Q, CK, EC, S, R); other signals: F5IN, BY, SR, CLK, CE; outputs Y, YB, X, XB]

16 LUT (Look-Up Table) Functionality. Look-up tables are the primary elements for logic implementation. Each LUT can implement any function of 4 inputs.
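To illustrate why a LUT can implement any function of 4 inputs, the following is a minimal behavioral sketch: the 16-bit truth table is stored as a constant and the inputs simply select one bit of it. The entity name lut4, the generic INIT and the port names are assumptions made for this example, not part of the course material or of any vendor library.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioral model of a 4-input look-up table.
entity lut4 is
  generic (
    -- Truth table: bit k holds the output for input value k.
    -- The default x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT : std_logic_vector(15 downto 0) := x"8000"
  );
  port (
    i : in  std_logic_vector(3 downto 0);
    o : out std_logic
  );
end entity lut4;

architecture behavioral of lut4 is
begin
  -- The four inputs act as an address into the stored truth table.
  o <= INIT(to_integer(unsigned(i)));
end architecture behavioral;
```

Changing only the INIT constant (for example to x"6996" for a 4-input XOR) changes the implemented function without touching the structure, which is the property the bitstream exploits.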

17 Major FPGA Vendors
SRAM-based FPGAs: Xilinx, Inc. and Altera Corp. (share over 60% of the market), Atmel, Lattice Semiconductor
Flash & antifuse FPGAs: Actel Corp., QuickLogic Corp.

18 Xilinx FPGA Families
Old families
– XC3000, XC4000, XC5200
– Old 0.5µm, 0.35µm and 0.25µm technology. Not recommended for modern designs.
Low-cost families
– Spartan/XL – derived from XC4000
– Spartan-II – derived from Virtex
– Spartan-IIE – derived from Virtex-E
– Spartan-3
High-performance families
– Virtex (0.22µm)
– Virtex-E, Virtex-EM (0.18µm)
– Virtex-II, Virtex-II PRO (0.13µm)
– Virtex-4 (0.09µm)

19 Design process (1)
Specification: Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on the 8031 microcontroller. Unlike in Experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds…..
VHDL description (Your VHDL Source Files):
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity RC5_core is
  port(
    clock, reset, encr_decr: in std_logic;
    data_input: in std_logic_vector(31 downto 0);
    data_output: out std_logic_vector(31 downto 0);
    out_full: in std_logic;
    key_input: in std_logic_vector(31 downto 0);
    key_read: out std_logic
  );
end RC5_core;
Following steps: Functional simulation, Synthesis, Post-synthesis simulation

20 Design process (2) Implementation (Mapping, Placing & Routing) Configuration Timing simulation On chip testing

21 Design Process control from Active-HDL

22 Simulation Tools Many others…

25 Logic Synthesis: VHDL description to circuit netlist
architecture MLU_DATAFLOW of MLU is
  signal A1: STD_LOGIC;
  signal B1: STD_LOGIC;
  signal Y1: STD_LOGIC;
  signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
begin
  A1 <= A when (NEG_A='0') else not A;
  B1 <= B when (NEG_B='0') else not B;
  Y <= Y1 when (NEG_Y='0') else not Y1;
  MUX_0 <= A1 and B1;
  MUX_1 <= A1 or B1;
  MUX_2 <= A1 xor B1;
  MUX_3 <= A1 xnor B1;
  with (L1 & L0) select
    Y1 <= MUX_0 when "00",
          MUX_1 when "01",
          MUX_2 when "10",
          MUX_3 when others;
end MLU_DATAFLOW;
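The slide shows only the architecture. A minimal sketch of a matching entity and a simple self-checking testbench might look as follows; the port list is inferred from the names the architecture reads and drives, and the testbench name MLU_tb and its stimulus are assumptions made for illustration. Some tools accept the expression (L1 & L0) in the selected assignment only in VHDL-2008 mode; with older standards an intermediate two-bit signal may be needed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Assumed entity: ports inferred from the MLU_DATAFLOW architecture.
entity MLU is
  port (
    A, B                : in  std_logic;
    NEG_A, NEG_B, NEG_Y : in  std_logic;
    L1, L0              : in  std_logic;
    Y                   : out std_logic
  );
end entity MLU;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench; it has no ports of its own.
entity MLU_tb is
end entity MLU_tb;

architecture sim of MLU_tb is
  signal A, B, NEG_A, NEG_B, NEG_Y, L1, L0 : std_logic := '0';
  signal Y : std_logic;
begin
  dut : entity work.MLU(MLU_DATAFLOW)
    port map (A => A, B => B, NEG_A => NEG_A, NEG_B => NEG_B,
              NEG_Y => NEG_Y, L1 => L1, L0 => L0, Y => Y);

  stimulus : process
  begin
    -- L1 L0 = "00" selects AND; no inversions are active.
    A <= '1'; B <= '1'; L1 <= '0'; L0 <= '0';
    wait for 10 ns;
    assert Y = '1' report "AND: expected Y = 1" severity error;

    -- L1 L0 = "10" selects XOR; equal inputs give 0.
    L1 <= '1'; L0 <= '0';
    wait for 10 ns;
    assert Y = '0' report "XOR: expected Y = 0" severity error;

    -- NEG_Y inverts the selected result.
    NEG_Y <= '1';
    wait for 10 ns;
    assert Y = '1' report "inverted XOR: expected Y = 1" severity error;

    report "MLU_tb finished";
    wait;  -- stop the stimulus process
  end process stimulus;
end architecture sim;
```

The same testbench can be reused for functional and post-synthesis simulation, since it refers to the design only through its ports.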

26 Synthesis Tools … and others

27 Features of synthesis tools: interpret RTL code; produce a synthesized circuit netlist in the standard EDIF format; give preliminary performance estimates; some can display circuit schematics corresponding to the EDIF netlist.

28 Implementation. After synthesis, the entire implementation process is performed by FPGA vendor tools.

30 Mapping LUT2 LUT3 LUT4 LUT5 LUT1 FF1 FF2 LUT0

31 Placing CLB SLICES FPGA

32 Routing Programmable Connections FPGA

33 Design Process control from Active-HDL

34 Top Level ASIC Digital Design Flow: Design Inception, RTL Design, Synthesis, Macro Development, Place+Route, Physical Verification, Design Complete

35 RTL Design (from Design Inception to Synthesis + Macro Development)
Design Function: RTL Design, Testbench Development, Mixed Mode Simulation, FPGA Verification (user's discretion), Lint Checking (user's discretion), Code Coverage (user's discretion), Formal Verification, System Interface Simulation
Digital Tool: Cadence NC Verilog, Mentor Graphics ModelSim, Cadence AMS Designer, Xilinx ISE, Cadence Hal, Cadence ICT, Agilent ADS, Matlab, Cadence Conformal

36 Synthesis + Macro Development (from RTL to Place+Route)
Design Function: Synthesis, Static Timing Analysis, Logical Equivalency, DFT, Gate-Level Simulation, Macro Generation, Macro Verification, Macro Rules Generation / Library Generation
Digital Tool: Synopsys DC, Cadence RC, Synopsys PrimeTime, Cadence Conformal, Synopsys DFT Compiler, Cadence NC Verilog, Mentor Graphics ModelSim, Mentor Graphics Calibre, Artisan / Cadence DFII, Artisan

37 Place + Route (from Synthesis to Verification)
Design Function: Floorplan, Macro Placement / Std Cell Placement, -Based Optimization, Clock Tree Synthesis, Route, RC Extraction, Signal Integrity, Static Timing Analysis, ATPG, Metal Fill, Spare Cells / Decoupling Cap, Filler Cells
Digital Tool: Cadence Encounter, Cadence NanoRoute, Cadence Fire&Ice QX, Cadence CeltIC / Voltage Storm, Synopsys PrimeTime, Mentor Graphics FastScan

38 Physical Verification (from Placed+Routed Design to Design Complete)
Design Function: GDSII Preparation / Schematic Preparation, DRC, LVS, ERC, Simulation Preparation, Back Annotated Simulation, Layout, Chip Finishing, Top-Level Simulation
Digital Tool: Cadence DFII, Cadence NC Verilog, Cadence Virtuoso, Mentor Graphics Calibre, Synopsys Nanosim, Cadence AMS Designer

39 CAD software available at DLSU (1). VHDL simulators: Xilinx ISE 12.3 (under Windows) and VCS (under Linux), available in the STRC111 Intel Microprocessors Lab. Free Student Edition: ISE WebPack, available in the STRC111 Intel Microprocessors Lab.

40 CAD software available at DLSU (2). Tools used for logic synthesis: Xilinx XST / EDK / SDK (under Windows), FPGA synthesis, available in the STRC111 Intel Microprocessors Lab.

41 CAD software available at DLSU (3). Tools used for implementation (mapping, placing & routing) in the FPGA technology: Xilinx XST (under Windows), FPGA synthesis, available in the STRC111 Intel Microprocessors Lab.

42 Projects – Overview
Project 1 (35 points), January – February (~6 weeks)
Project 2 (35 points), March (~4 weeks)
Application: Game Application using Microblaze Processor. Technology: FPGA. Target: synthesizable code, downloadable code.
Application: Game Software using state machines. Technology: FPGA. Target: synthesizable code, downloadable code.

43 Projects 1, 2
choice between two project topics:
- cryptography (e.g., encryption, authentication, hash)
- digital signal processing (e.g., digital filter, FFT, image processing, etc.)
both topics specified by the instructor
initial specification in the form of
- pseudocode and/or flowchart
- detailed interface
design and source code are required to be scalable, i.e., to work for different parameters and operand sizes, specified at the time of synthesis

44 Example: Last year’s project – RC6 cipher
Encryption
Input: (A, B, C, D), Table S[0..2r+3]
B = B + S[0]
D = D + S[1]
for i = 1 to r do {
  t = (B*(2B+1)) <<< log2 w
  u = (D*(2D+1)) <<< log2 w
  A = ((A ⊕ t) <<< u) + S[2i]
  C = ((C ⊕ u) <<< t) + S[2i+1]
  (A, B, C, D) = (B, C, D, A)
}
A = A + S[2r+2]
C = C + S[2r+3]
Output: (A, B, C, D)
Decryption
Input: (A, B, C, D), Table S[0..2r+3]
C = C – S[2r+3]
A = A – S[2r+2]
for i = r downto 1 do {
  (A, B, C, D) = (D, A, B, C)
  u = (D*(2D+1)) <<< log2 w
  t = (B*(2B+1)) <<< log2 w
  C = ((C – S[2i+1]) >>> t) ⊕ u
  A = ((A – S[2i]) >>> u) ⊕ t
}
D = D – S[1]
B = B – S[0]
Output: (A, B, C, D)
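As an illustration of how one iteration of the encryption loop above could be expressed in synthesizable-style VHDL, here is a minimal sketch of a single round as a pure function. It assumes a word size of w = 32 (so log2 w = 5) and data-dependent rotations that use the low 5 bits of t and u, as in standard RC6. The package name rc6_round_pkg, the type names and the function names are hypothetical, and the pre- and post-whitening steps and the key schedule are not covered.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package rc6_round_pkg is
  constant W     : natural := 32;  -- word size in bits (assumed)
  constant LOG_W : natural := 5;   -- log2(W)
  subtype word_t is unsigned(W-1 downto 0);
  type state_t is record
    a, b, c, d : word_t;
  end record;
  -- f(x) = (x * (2x + 1)) <<< log2 w, arithmetic modulo 2^w
  function f(x : word_t) return word_t;
  -- One encryption round with round keys k0 = S[2i], k1 = S[2i+1]
  function encrypt_round(s : state_t; k0, k1 : word_t) return state_t;
end package rc6_round_pkg;

package body rc6_round_pkg is
  function f(x : word_t) return word_t is
    variable two_x_plus_1 : word_t;
    variable prod         : unsigned(2*W-1 downto 0);
  begin
    two_x_plus_1 := shift_left(x, 1) + 1;  -- wraps modulo 2^w
    prod         := x * two_x_plus_1;      -- full 2w-bit product
    -- Keep the low w bits (mod 2^w), then rotate left by log2 w.
    return rotate_left(prod(W-1 downto 0), LOG_W);
  end function f;

  function encrypt_round(s : state_t; k0, k1 : word_t) return state_t is
    variable t, u, a_new, c_new : word_t;
  begin
    t := f(s.b);
    u := f(s.d);
    -- Rotation amounts are the low log2 w bits of u and t.
    a_new := rotate_left(s.a xor t, to_integer(u(LOG_W-1 downto 0))) + k0;
    c_new := rotate_left(s.c xor u, to_integer(t(LOG_W-1 downto 0))) + k1;
    -- (A, B, C, D) = (B, C, D, A), using the updated A and C.
    return (a => s.b, b => c_new, c => s.d, d => a_new);
  end function encrypt_round;
end package body rc6_round_pkg;
```

Writing the round as a function keeps it usable both in an iterative architecture (one round per clock cycle) and in an unrolled or pipelined one. Making W and LOG_W generics of an enclosing entity instead of package constants would be one way to approach the scalability requirement.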

45 Required interface [Figure: encryption/decryption unit with control & i/o interface, connected to a key memory unit. Signals: clock, reset, enc_dec, data_in, data_available, data_read, data_out, write, full, ready, key_available, key_read, round number, round key(s) S_i; bus widths m and w]

46 Projects 1, 2 Optimization Criteria: Maximum ratio Throughput / Circuit Area, or Minimum product Latency × Circuit Area

47 Primary timing parameters
Latency: time to process a single block of data
Throughput: number of bits processed in a unit of time
Throughput = (Block_size · Number_of_blocks_processed_simultaneously) / Latency
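A small worked example of the two timing parameters, using hypothetical numbers that do not come from the slides: a circuit that needs 20 clock cycles at 100 MHz per 128-bit block and processes one block at a time. Throughput is taken as block size times the number of blocks processed simultaneously, divided by latency.

```latex
\[
\text{Latency} = \frac{20\ \text{cycles}}{100\ \text{MHz}} = 200\ \text{ns},
\qquad
\text{Throughput} = \frac{128\ \text{bits} \cdot 1}{200\ \text{ns}} = 640\ \text{Mbit/s}
\]
```

If the same circuit were pipelined so that 4 blocks are in flight at once with unchanged latency, the same relation would give 4 · 128 bits / 200 ns = 2.56 Gbit/s.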

48 Infinite Impulse Response (IIR) Filter Equations (1) Transfer function

49 Two investigated architectures Architecture 1: Direct II Form

50 Architecture 2: Cascade of second-order systems F_i(z)

51 Example of coefficients: Butterworth filter Order O=10, Passband Fp=0.3 Architecture 1: Direct II Form Architecture 2: Cascade of second-order systems a[1..10] = b[1..10] =

52 Required interface [Figure: IIR filter with control unit & i/o interface. Signals: clock, reset, data_in, a_i, b_i, ab_write, process, data_out, ready, valid; bus widths wi, wc, wo]

53 Project 2b from FALL 2005 to be modified in FALL 2006

54 Microcontroller
Using high-level behavioral VHDL, describe an 8-bit microcontroller MC68HC11E1, working in the expanded mode, with the following simplifications:
1. Inputs and outputs of the microcontroller are reduced to E (clock), RESETn (reset active low), RW (read/write), AS (address strobe), ADDR15..8 (also denoted as PB7..0), ADDR7..0/DATA7..0 (multiplexed address & data, also denoted as PC7..0), PORTD and PORTE.

55
2. Internal registers are reduced to the registers A, IX, SP, CC (Condition Codes NZVC), and PC.
3. The only parts of 68HC11E1 implemented in your model are:
a. CPU
b. RAM (512 B in the range $0000-$01FF)
c. parallel I/O (PORTD and PORTE)
4. Internally generated clock E has a frequency of 2 MHz.
5. Internal I/O registers are limited to
PORTD at the memory address $1008
DDRD at the memory address $1009
PORTE at the memory address $100A

56
6. Instruction set of the microcontroller is reduced to the following instructions:
a. Data transfer instructions: LDAA, LDX, LDS, STAA, STX
b. Arithmetic instructions: CLRA, NEGA, ADDA, SUBA, ASRA, ASLA
c. Logic instructions: ANDA, ORAA, EORA
d. Data test instructions: CMPA, CPX, TSTA
e. Control instructions: BEQ, BGT, BHI, BSR, JSR, RTS, JMP
f. Stack instructions: PSHA, PULA, PSHX, PULX

57
7. Addressing modes of the microcontroller are reduced to the following modes:
a. immediate
b. extended
c. indexed
d. inherent
e. relative
8. Main program is stored in the external RAM starting at the address $4000.
9. After reset, PC is set to the address $0000 (internal RAM of MC68HC11) where the instruction JMP $4000 is located.

58 Microcontroller system
The implemented microcontroller system should consist of:
1. Microcontroller MC68HC11E1
2. 8 kB RAM, such as 6164
3. 74HC373 8-bit latch
4. 74HC138 decoder chip
5. Auxiliary gates, if needed
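In a system with a multiplexed address/data bus, the 74HC373 is typically used to hold the low address byte while the bus carries data. A minimal behavioral sketch of such a transparent latch is shown below; the entity and port names are assumptions for this example, propagation delays are ignored, and connecting the latch enable to the AS (address strobe) signal is the usual arrangement rather than something stated on the slide.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical behavioral model of a 74HC373 octal transparent latch.
entity latch_74hc373 is
  port (
    le   : in  std_logic;                     -- latch enable (assumed driven by AS)
    oe_n : in  std_logic;                     -- output enable, active low
    d    : in  std_logic_vector(7 downto 0);  -- ADDR7..0/DATA7..0 bus
    q    : out std_logic_vector(7 downto 0)   -- latched ADDR7..0
  );
end entity latch_74hc373;

architecture behavioral of latch_74hc373 is
  signal stored : std_logic_vector(7 downto 0);
begin
  -- Transparent while le is high; holds the last value when le is low.
  latch : process (le, d)
  begin
    if le = '1' then
      stored <= d;
    end if;
  end process latch;

  -- Three-state outputs: high impedance when oe_n is not asserted.
  q <= stored when oe_n = '0' else (others => 'Z');
end architecture behavioral;
```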

60 Write Cycle

61 Features of the model
1. Your model should allow cycle accurate modeling of the circuit behavior.
2. Your model should contain debugging features equivalent to the debugging features of the DLX model, discussed in class and described in Ashenden, Chapter 15.
3. Generic parameters passed to the model should include
a. name of the file with the contents of the external RAM
b. clk-to-output delay
c. debugging mode
4. Your model should report all undefined opcodes, treat them as NOP, and proceed to the next RAM address.
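The generic parameters listed above could be declared along the following lines. This is only a sketch of an interface: the package name, the enumeration literals other than trace_each_step, the generic names, their default values and the port widths of PORTD and PORTE are all assumptions, and no architecture is given.

```vhdl
-- Hypothetical package holding the debugging-mode type.
package mc68hc11_debug_pkg is
  -- Only trace_each_step is named on the slides; the other
  -- literals are placeholders for less detailed modes.
  type debug_mode_t is (none, trace_each_instruction, trace_each_step);
end package mc68hc11_debug_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.mc68hc11_debug_pkg.all;

entity mc68hc11_model is
  generic (
    RAM_FILE : string       := "program.hex";   -- contents of the external RAM (assumed name)
    TCO      : time         := 10 ns;           -- clk-to-output delay (assumed value)
    DEBUG    : debug_mode_t := trace_each_step  -- debugging mode
  );
  port (
    E         : out   std_logic;                      -- internally generated clock
    RESETn    : in    std_logic;                      -- reset, active low
    RW        : out   std_logic;                      -- read/write
    AS        : out   std_logic;                      -- address strobe
    ADDR      : out   std_logic_vector(15 downto 8);  -- ADDR15..8 (PB7..0)
    ADDR_DATA : inout std_logic_vector(7 downto 0);   -- ADDR7..0/DATA7..0 (PC7..0)
    PORTD     : inout std_logic_vector(5 downto 0);   -- width assumed
    PORTE     : in    std_logic_vector(7 downto 0)    -- width assumed
  );
end entity mc68hc11_model;
```

A testbench can then override the generics in a generic map, for example to point RAM_FILE at a different program or to lower the debugging detail for long runs.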

62 Testing and debugging
The behavior of your model should be carefully verified using a testbench instantiating your model with
a. the external RAM containing a valid program composed of a substantial subset of instructions implemented in the model
b. debugging mode set to the most detailed mode (trace_each_step)

63 Deliverables
1. All source code files.
2. Contents of the external RAM used for the model verification, in the hexadecimal notation, and expressed using the corresponding 68HC11 assembly language mnemonics.
3. The detailed log/report generated by your model for the given contents of RAM, and with the debugging mode set to trace_each_step.

64 All Projects - Organization
Projects divided into phases
Intermediate code submitted through WebCT at selected checkpoints and evaluated by the instructor and/or TA
Penalty points for falling behind the schedule (below 50% of the work that is supposed to be done by a certain deadline)
Feedback provided to students on a fair and best effort basis
Final report and code submitted through WebCT and graded using a full scale
Contest for the best results (bonus points awarded to the winners)
Penalty and bonus points added to the final grade

65 Honor Code Rules
All students are expected to write and debug their code individually.
Students are encouraged to help and support each other in all problems related to the
- operation of the CAD tools,
- basic understanding of the problem.

