EE3A1 Computer Hardware and Digital Design. Lecture 10: Hardware Design Flows
Introduction: We want to turn a customer requirement into an electronic system. Two approaches: (1) Hardwired algorithms: customised hardware that solves only one problem, i.e. an Application Specific Integrated Circuit (ASIC). (2) Computation in software: general-purpose hardware (a microprocessor) that can solve any problem and is customised through software.
Computing in hardware and software: A simple example, a shopping list. Total bill = P1 x Q1 + P2 x Q2 + P3 x Q3, where Pi is the price and Qi the quantity of item i.
Turn our problem into silicon: How? What are the choices?
What types of silicon chip? Microprocessor: serial operation. ASIC (Application Specific Integrated Circuit): special-purpose silicon chip, parallel operation.
Solve in ASIC: Build a special-purpose circuit. Needs: 3 multipliers, 1 adder, 1 time step. Calculated in one go, in parallel. (Figure: six input values feed three multipliers; the products 21, 20 and 18 are added to give 59.)
Solve on a microprocessor: Break the problem into a sequence of simple steps (a program): result = 0; for ( i=1 to 3 ) { result = result + pi * qi }. The micro performs the steps one after the other: slow, but it can solve any problem.
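The loop on this slide can be sketched in Python as follows. This is a minimal sketch rather than the lecture's own code: the function name `total_bill`, the step counter and the price and quantity values are illustrative assumptions.

```python
def total_bill(prices, quantities):
    """Accumulate price * quantity one item at a time, as a microprocessor would."""
    result = 0
    steps = 0
    for p, q in zip(prices, quantities):
        result = result + p * q  # one multiply-accumulate per loop iteration
        steps += 1
    return result, steps

# Illustrative values only (an assumption, not read from the slides).
prices = [7, 5, 3]
quantities = [3, 4, 6]
total, steps = total_bill(prices, quantities)
print(total, steps)  # 59 3
```

The step count grows with the number of items, which is the serial behaviour the slide describes.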
Solve on a microprocessor: Uses a simple circuit (arithmetic logic unit) that does only one simple operation at a time. To compute P1 x Q1, the micro issues the address of the instruction and memory returns the instruction. It then issues the address of P1 and memory returns the value of P1 (7 in the figure), so the operation becomes 7 x Q1. And so on: it takes many cycles even for one instruction.
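The address-then-data round trips described above can be counted with a toy model. Everything here is a hypothetical sketch: the `Memory` class, the `MUL` instruction format, the addresses and the value 3 for Q1 are assumptions made for illustration, not a real instruction set.

```python
class Memory:
    """Toy memory that counts how many reads the processor makes."""
    def __init__(self, contents):
        self.contents = dict(contents)
        self.reads = 0

    def read(self, address):
        self.reads += 1  # each read is one address/data round trip
        return self.contents[address]

def run_multiply(memory, pc):
    """Fetch and execute one hypothetical 'MUL a, b' instruction."""
    op, addr_a, addr_b = memory.read(pc)  # issue instruction address, instruction returned
    assert op == "MUL"
    a = memory.read(addr_a)               # issue address of first operand, value returned
    b = memory.read(addr_b)               # issue address of second operand, value returned
    return a * b                          # the ALU performs the single multiply

# Address 0 holds the instruction; addresses 100 and 101 hold P1 and Q1 (assumed layout).
mem = Memory({0: ("MUL", 100, 101), 100: 7, 101: 3})
product = run_multiply(mem, pc=0)
print(product, mem.reads)  # 21 3
```

Even in this simplified model, one multiply needs three memory reads before the arithmetic logic unit can do any work.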
What if our problem is large? Suppose we change from 3 items to 300 items. The microprocessor needs the same sized circuit but 100 times more time. The ASIC needs the same length of time (very fast) but a 100 times bigger circuit (needs many transistors). Can we make a circuit that big?
Moore’s law: Until recently, such big circuits were not possible. Now they are possible, and ASICs have become a big business.
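The trade-off on this slide can be put into a crude cost model. This is a sketch under simplifying assumptions: it counts only multipliers and time steps, and ignores the adder and the instruction-fetch overhead; the function names are hypothetical.

```python
def microprocessor_cost(n_items):
    """Serial: one arithmetic unit, one multiply-accumulate step per item."""
    return {"multipliers": 1, "time_steps": n_items}

def asic_cost(n_items):
    """Parallel: one multiplier per item, all products formed in one time step."""
    return {"multipliers": n_items, "time_steps": 1}

for n in (3, 300):
    print(n, microprocessor_cost(n), asic_cost(n))

# Going from 3 to 300 items: the micro takes 100 times longer,
# the ASIC needs 100 times as many multipliers.
time_ratio = microprocessor_cost(300)["time_steps"] // microprocessor_cost(3)["time_steps"]
area_ratio = asic_cost(300)["multipliers"] // asic_cost(3)["multipliers"]
print(time_ratio, area_ratio)  # 100 100
```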
What if our problem changes? Suppose we add a fourth item to our shopping list. Micro: re-write the program, changing for ( i=1 to 3 ) { result = result + pi * qi } into for ( i=1 to n ) { result = result + pi * qi }. Quick and cheap to do. ASIC: must build a completely new circuit. Slow and expensive to do. What if we find a bug? Micro: download a patch to customers. ASIC: disaster, a product recall.
What types of silicon chip? Microprocessor is: programmable (easy to upgrade, bug fix); flexible (can be modified to do anything); slow. ASIC (Application Specific Integrated Circuit) is: a special-purpose silicon chip; fast (because parallel); fixed purpose (cannot be modified). Must choose flexibility or speed.
Reconfigurable hardware: FPGA (Field Programmable Gate Array). A new type of hardware. Function is programmed by sending bits to it. Function is easily and quickly modified. Can be modified just like software: can fix bugs; can update function during product lifetime.
Design flows: (Figure: design-flow diagram in which some steps are labelled "normally done by skilled human" and others "normally done by computer program".)
ASIC: VHDL is converted to gates. Each gate has a mask design (standard cell).
ASIC: A CAD tool stitches together the gate definitions to give the mask definition of the whole chip.
Programmable Logic Devices: Sum-of-products devices, i.e. an OR of AND terms. Customise by setting the state of the switch boxes. The list of switch box states is the fuse map.
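A sum-of-products device driven by a fuse map can be modelled in a few lines. This is a minimal sketch, not the layout of any real PLD family: the fuse-map encoding (1, 0 or None per input, one flag per product term) and the XOR example are assumptions chosen for illustration.

```python
from itertools import product

def pla_output(inputs, and_fuses, or_fuses):
    """Evaluate a sum-of-products array.

    and_fuses[t][i] is 1 to connect input i, 0 to connect its complement,
    or None to leave input i out of product term t.
    or_fuses[t] is True if product term t feeds the OR gate.
    """
    terms = []
    for row in and_fuses:
        term = all(bit == want for bit, want in zip(inputs, row) if want is not None)
        terms.append(term)
    return int(any(t for t, used in zip(terms, or_fuses) if used))

# Hypothetical fuse map for XOR: a.(not b) + (not a).b
and_fuses = [(1, 0), (0, 1)]
or_fuses = [True, True]
table = [pla_output(bits, and_fuses, or_fuses) for bits in product((0, 1), repeat=2)]
print(table)  # [0, 1, 1, 0]
```

Changing the function means changing only `and_fuses` and `or_fuses`; the evaluation structure (AND terms feeding an OR) stays fixed, which is why such devices are quick to design for but inflexible.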
Programmable Logic Devices: Various families: PEEL, PLA, PAL. All are: cheap; very quick to design; inflexible and slow.
Complex PLDs (CPLD): Improve flexibility by using many PLDs with programmable connections between them.
FPGA architecture: Configurable logic gates. Configurable wiring. Change the chip’s function by changing the configuration data.
Reconfigurable logic gates: Gate function is determined by data stored in memory. One bit is selected by the inputs.
Reconfigurable logic gates: If In1=0 and In2=0, the 0th memory bit is output. If In1=0 and In2=1, the 1st memory bit is output. If In1=1 and In2=0, the 2nd memory bit is output. If In1=1 and In2=1, the 3rd memory bit is output.
Reconfigurable logic gates: Change the gate function by changing the stored memory bits. The truth table of the required function is stored in memory.
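The memory-based gate can be sketched as a four-entry look-up table. This is an illustrative model only: the function name `lut2` and the choice to write the memory as a string with the 0th bit first are assumptions.

```python
def lut2(memory_bits, in1, in2):
    """Return the stored bit selected by the two inputs (In1 is the high bit)."""
    index = in1 * 2 + in2  # In1=0,In2=0 -> bit 0 ... In1=1,In2=1 -> bit 3
    return int(memory_bits[index])

AND_TABLE = "0001"  # only the 3rd bit (In1=1, In2=1) is 1
OR_TABLE = "0111"   # every bit except the 0th is 1

for in1 in (0, 1):
    for in2 in (0, 1):
        print(in1, in2, lut2(AND_TABLE, in1, in2), lut2(OR_TABLE, in1, in2))
```

The same `lut2` circuit becomes an AND gate or an OR gate purely by changing the four stored bits.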
Field Programmable Gate Arrays: A better gate: the output can be registered if required. Configuration data is scanned in during boot or reset. Example: configuration data 100000 gives an AND gate with no flip-flop at the output. Change the function by giving new configuration data: 100011 gives an AND gate with a flip-flop at the output.
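A gate whose output can optionally be registered might be modelled as below. This is a sketch under stated assumptions: the class `LogicBlock`, its `registered` flag and the "0th bit first" truth-table string are hypothetical, and the model does not reproduce the 6-bit configuration words shown on the slides.

```python
class LogicBlock:
    """2-input look-up table with an optional flip-flop on the output."""
    def __init__(self, truth_table, registered):
        self.truth_table = truth_table  # e.g. "0001" for AND
        self.registered = registered    # True: output is taken from the flip-flop
        self.flip_flop = 0              # state held between clock edges

    def evaluate(self, in1, in2):
        """Output as seen before the next clock edge."""
        combinational = int(self.truth_table[in1 * 2 + in2])
        return self.flip_flop if self.registered else combinational

    def clock(self, in1, in2):
        """Clock edge: the flip-flop captures the look-up table output."""
        self.flip_flop = int(self.truth_table[in1 * 2 + in2])

plain = LogicBlock("0001", registered=False)
reg = LogicBlock("0001", registered=True)
print(plain.evaluate(1, 1))  # 1: combinational AND responds immediately
print(reg.evaluate(1, 1))    # 0: registered output has not been clocked yet
reg.clock(1, 1)
print(reg.evaluate(0, 0))    # 1: flip-flop holds the value captured at the clock edge
```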
Example: Carry unit of a full adder. Synthesise the VHDL to basic logic gates.
Example: Technology mapping. The output of the synthesis tool may use resources we don’t have available: our simple logic gates have only 2 inputs. Transform to an equivalent circuit that uses only the resources that we have available, then compute the values for the truth tables (0001 for AND and 0111 for OR, writing the 0th memory bit first).
Example: Put the function into the FPGA by loading the truth tables 0001 and 0111.
Example: Three gates, g3, g4 and g5, are configured as AND gates (truth table 0001). Two gates, g6 and g7, are configured as OR gates (truth table 0111).
Example: Configure the switch boxes to give the required wiring. (Figure: the inputs x, y and cin are routed through the switch boxes to gates g3, g4 and g5, and the final output is cout.)
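The mapped carry circuit can be checked against the carry function by trying all eight input combinations. This is a sketch: the slides name the gates g3 to g7, but which inputs feed which AND gate, and the order of the two OR gates, are assumptions made here.

```python
from itertools import product

def lut2(table, a, b):
    """2-input look-up table; the table string is written 0th bit first."""
    return int(table[a * 2 + b])

AND, OR = "0001", "0111"

def carry_mapped(x, y, cin):
    """Carry built only from 2-input gates (assumed wiring)."""
    g3 = lut2(AND, x, y)
    g4 = lut2(AND, x, cin)
    g5 = lut2(AND, y, cin)
    g6 = lut2(OR, g3, g4)
    g7 = lut2(OR, g6, g5)  # g7 drives cout
    return g7

def carry_reference(x, y, cin):
    """Carry out of a full adder: 1 when at least two inputs are 1."""
    return int(x + y + cin >= 2)

ok = all(carry_mapped(*bits) == carry_reference(*bits)
         for bits in product((0, 1), repeat=3))
print(ok)  # True
```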
FPGA Configuration: Scan in a serial bitstream at boot or reset time to give the chip its function, e.g. 1000XXXX0001011110001110 for gates g3 to g7 (inputs x, y and cin; output cout). The bitstream would also need to scan in configuration data for the switch boxes.
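Serial configuration can be pictured as shifting bits along a chain of memory cells, one bit per clock. The sketch below is hypothetical: the chain order, the 4 bits per gate, and the helper names `scan_in` and `split_tables` are assumptions, it omits the switch-box data, and it does not decode the bitstream shown on the slide.

```python
def scan_in(bitstream, chain_length):
    """Shift bits one per clock into a serial configuration chain."""
    chain = ["0"] * chain_length
    for bit in bitstream:
        chain = [bit] + chain[:-1]  # new bit enters at the front, the rest move along
    return "".join(chain)

def split_tables(chain, names, bits_per_gate=4):
    """Cut the loaded chain into one truth table per gate."""
    return {name: chain[i * bits_per_gate:(i + 1) * bits_per_gate]
            for i, name in enumerate(names)}

names = ["g3", "g4", "g5", "g6", "g7"]
wanted = {"g3": "0001", "g4": "0001", "g5": "0001", "g6": "0111", "g7": "0111"}

# The first bit scanned in ends up furthest along the chain,
# so the configuration image is sent in reverse order.
image = "".join(wanted[n] for n in names)
bitstream = image[::-1]
chain = scan_in(bitstream, len(image))
print(split_tables(chain, names) == wanted)  # True
```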
Configuration bitstream: The bitstream determines the gate function and the switch box routing. Routing is slow and inefficient: many more wire segments are available than any one design will actually use, which wastes space, and passing a signal through switch boxes is slow.
Better CLB: If the FPGA has a bigger CLB (configurable logic block) with more inputs and more memory, the design is more efficient (uses less wiring). Example: the whole carry unit fits (easily) into one 4-input CLB.
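To see why the whole carry unit fits into one larger look-up table, its truth table can be computed directly. This is a sketch: the input ordering (x as the high bit) and the choice to leave the fourth input unused are assumptions.

```python
from itertools import product

def carry(x, y, cin):
    """Carry out of a full adder: 1 when at least two inputs are 1."""
    return int(x + y + cin >= 2)

# 3-input table: entry index is x*4 + y*2 + cin.
table3 = "".join(str(carry(x, y, cin)) for x, y, cin in product((0, 1), repeat=3))
print(table3)  # 00010111

# 4-input table with the fourth input ignored: 16 memory bits in total.
table4 = "".join(str(carry(x, y, cin))
                 for x, y, cin, _unused in product((0, 1), repeat=4))
print(table4)  # 0000001100111111

def lut(table, *inputs):
    """Look up the stored bit selected by the inputs (first input is the high bit)."""
    index = 0
    for bit in inputs:
        index = index * 2 + bit
    return int(table[index])

print(lut(table4, 1, 0, 1, 0))  # 1
```

One 16-bit memory replaces the five 2-input gates and the wiring between them.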
Intellectual Property (IP) Cores: Designs for sub-systems, sold to chip designers. Exist as intellectual property. Make FPGA design very easy. Also available for ASICs. Two types: commonly used sub-systems (e.g. multipliers, FIFOs, …); complicated and high-value sub-systems (e.g. video decoder, network interface).
Intellectual Property (IP) Cores: Same business model as software. An individual can set up a small company to produce cores. Very low start-up costs. No manufacturing costs. Upgrades and bug-fixes can be downloaded across the Web. (But piracy is a problem.)
Protection of IP: Can we encrypt this bitstream for distribution, and then decrypt it as it enters the chip? (Figure: the bitstream 1000XXXX0001011110001110 is distributed in encrypted form and passes through a decrypt block on its way into gates g3 to g7.)
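The encrypt-then-decrypt idea can be illustrated with a toy stream cipher. This is only a sketch to show the data flow: the keystream construction, the key and the placeholder configuration bytes are assumptions, and it should not be read as a secure design or as the scheme used by any FPGA vendor.

```python
import hashlib

def keystream(key, n_bytes):
    """Derive n_bytes of keystream by hashing key + counter (toy construction)."""
    out = b""
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n_bytes]

def xor_with_keystream(data, key):
    """XOR is its own inverse, so the same function encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = b"key stored inside the chip"   # hypothetical secret
config = bytes([0x17, 0x8E, 0x11])    # placeholder configuration bytes

encrypted = xor_with_keystream(config, key)       # done by the core vendor
decrypted = xor_with_keystream(encrypted, key)    # done as the data enters the chip
print(decrypted == config)  # True
```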
Evaluation of Hardware Technologies: ASICs are very high performance, but very costly to prototype. A mask set costs £500,000: if I sell 1,000 chips the cost price is £500 each; if I sell 1,000,000 chips the cost price is 50p each. ASICs are only suitable for huge production runs (consumer mass market). ASICs cannot be bug-fixed or upgraded. PLDs are very cheap and quick, but slow and inflexible. FPGAs are very good, but quite expensive.
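The mask-set arithmetic above can be written as a one-line function. This is a minimal sketch: it spreads only the mask-set cost across the chips sold and ignores the per-chip silicon, packaging and test costs; the function name is hypothetical.

```python
def cost_per_chip(mask_set_cost, chips_sold):
    """Mask-set cost in pounds shared equally across every chip sold."""
    return mask_set_cost / chips_sold

print(cost_per_chip(500_000, 1_000))      # 500.0 (pounds per chip)
print(cost_per_chip(500_000, 1_000_000))  # 0.5 (i.e. 50p per chip)
```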