Download presentation
Presentation is loading. Please wait.
Published byArthur Wells Modified over 9 years ago
2
EE141 1 Electronic Design Automation [ Adopted from Jan M. Rabaey, Alessandra Nardi, Abhay Dixit, Meeta Bhate, Kedar Rajpathak, Tulika Mitra]
3
EE141 2 Electronic Design Automation Design Analysis Structural and Behavioral Modeling in HDL Design Verification Cell Based Design Programmable Logic Devices and FPGA Design Synthesis
4
EE141 3 Design methodologies analysis and verification - simulation is input dependent - verification requires understanding of circuit operation implementation and synthesis testability tools and techniques
5
EE141 4 Terminology HDL: Hardware Description Language E.g.: Verilog, VHDL RTL: Register Transfer Level. It is HDL written for logic synthesis: clock cycle to clock cycle operations and events are explicitly defined architecture of the design is implicit in RTL code Behavioral HDL It is HDL written for behavioral synthesis: it describes the intent and the algorithm behind the design Behavioral synthesis It tries to optimize the design at architectural level with constraints for clock, latency and throughput. The output is RTL level design with clocks, registers and buses. Logic synthesis It automatically converts RTL code into gates without modifying the implied architecture. It tries to optimize the gate-level implementation to meet performance (timing, area….) goals
6
EE141 5 Verification at different levels of abstraction Goal: Goal: Ensure the design meets its functional and timing requirements at each of these levels of abstraction In general this process consists of the following conceptual steps: 1. Creating the design at a higher level of abstraction 2. Verifying the design at that level of abstraction 3. Translating the design to a lower level of abstraction 4. Verifying the consistency between steps 1 and 3 5. Steps 2, 3, and 4 are repeated until tapeout
7
EE141 6 Verification at different levels of abstraction Behavior al HDL System Simulators HDL Simulators Code Coverage Gate-level Simulators Static Timing Analysis Layout vs Schematic (LVS) RTLGate- level Physical Domain Verification
8
EE141 7 Verification Techniques functionaltiming Simulation (functional and timing) Behavioral RTL Gate-level (pre-layout and post-layout) Switch-level Transistor-level functional Formal Verification (functional) timing Static Timing Analysis (timing) Goal: Goal: Ensure the design meets its functional and timing requirements at each of these levels of abstraction
9
EE141 8 Classification of Simulators Logic Simulators Emulator-basedSchematic-basedHDL-based Event-drivenCycle-basedGateSystem
10
EE141 9 Classification of Simulators HDL-based HDL-based: Design and testbench described using HDL Event-driven Cycle-based Schematic-based Schematic-based: Design is entered graphically using a schematic editor Emulators Emulators: Design is mapped into FPGA hardware for prototype simulation. Used to perform hardware/software co-simulation.
11
EE141 10 Instruction Set Architecture Design (Microarchitecture Design-I) System-Level Design RTL Level Design (Microarchitecture Design II) Compiler Design Code Optimizer Hardware Design Switch Level Design Circuit Level design ISA Simulator System Level Simulator RTL Level Simulator Switch Level Simulator Circuit Level Simulator Arch./Compiler Design Toolset Processor Architecture Simulation and Design Flow Diagram HDL (VHDL or Verilog) Code Generator
12
EE141 11 (Some) EDA Tools and Vendors Behavioral synthesis Behavioral compiler Synopsys Logic Synthesis Design Compiler Synopsys BuildGates Ambit Design Systems Galileo (FPGA) Examplar (Mentor Graphics) FPGAExpress (FPGA) Synopsys Synplify (FPGA) Synplicity
13
EE141 12 (Some) EDA Tools and Vendors Logic Simulation Scirocco (VHDL) Synopsys Verilog-XL (Verilog) Cadence Design Systems Leapfrog (VHDL) Cadence Design Systems VCS (Verilog) Chronologic (Synopsys) Cycle-based simulation SpeedSim (VHDL) Quickturn PureSpeed (Verilog) Viewlogic (Synopsys) Cobra Cadence Design Systems Cyclone Synopsys
14
EE141 13 Circuit Simulation Formulation of circuit equations STA, MNA Solution of linear equations LU factorization, QR factorization, Krylov Methods Solution of nonlinear equations Newton’s method Solution of ordinary differential equations One-step and Multi-step methods AC Analysis and Noise Simulation Techniques for RF Shooting-Newton Harmonic-Balance
15
EE141 14 Analog Circuits – A World Apart Analog circuits’ behavior specified in terms of complex functions: time-domain, frequency-domain, distortion, noise, power spectra…. Required accuracy of models much higher than digital …emerging paradigm: Field Programmable Analog Array for prototyping (and more)
16
EE141 15 Design analysis and simulation Spice - exact but time consuming discrete time steps circuit models timing simulation with partitioning and relaxation method
17
EE141 16 Timing Simulation
18
EE141 17 Switch-Level Simulation Linear Switch-Level Simulation RSIM (Terman), nRSIM (Chu), IRSIM (Horowitz) Model transistor as switched, linear resistor Ternary (0, 1, X) node states Elmore (RC product) model of circuit delay a a 1 X 0 Voltage Logic Value
19
EE141 18 Switch-Level Simulation Switch level model of inverter
20
EE141 19 Event-driven Simulation Event: change in logic value at a node, at a certain instant of time (V,T) Event-driven: only considers active nodes Efficient Performs both timing and functional verification All nodes are visible Glitches are detected Most heavily used and well-suited for all types of designs
21
EE141 20 Event-driven Simulation Event: change in logic value, at a certain instant of time (V,T) 1 0 1 0 1 0 1 D=2 a b c Events: Input: b(1)=1 Output: none 1 0 1 0 1 0 1 D=2 a b c Events: Input: b(1)=1 Output: c(3)=0 3
22
EE141 21 Event-driven Simulation Uses a timewheel to manage the relationship between components Timewheel Timewheel = list of all events not processed yet, sorted in time (complete ordering) When event is generated, it is put in the appropriate point in the timewheel to ensure causality
23
EE141 22 Event-driven Simulation d b(1)=1 d(5)=1 D=1 1 0 1 0 1 D=2 a b c 5 0 1 e 0 1 3 c(3)=0 d(5)=1 0 1 4 e(4)=0 6 e(6)=1
24
EE141 23 Cycle-based Simulation Take advantage of the fact that most digital designs are largely synchronous Synchronous circuit: state elements change value on active edge of clock Only boundary nodes are evaluated Internal Node Boundary Node LatchesLatches LatchesLatches
25
EE141 24 Cycle-based Simulation Compute steady-state response of the circuit at each clock cycle at each boundary node LatchesLatches LatchesLatches Internal Node
26
EE141 25 Cycle-based versus Event-driven Cycle-based: Only boundary nodes No delay information Event-driven: Each internal node Need scheduling and functions may be evaluated multiple times Cycle-based is 10x-100x faster than event-driven (and less memory usage) Cycle-based does not detect glitches and setup/hold time violations, while event-driven does
27
EE141 26 Simulation: Perfomance vs Abstraction.001x SPICE Event-driven Simulator Cycle-based Simulator 1x10x Performance and Capacity Abstraction Timing Simulator
28
EE141 27 Perspective on accuracy and speed
29
EE141 28 Gate level simulation faster than switch level functional simulation VHDL description used
30
EE141 29 Epics PowerMill *** Current information is calculated from 5.00 ns to 400.00 ns *** average current on VDD : -0.442845 mA peak current on VDD : -10.269000 mA at 211.10 ns rms current on VDD : +0.877633 mA average current on GND : +0.364263 mA peak current on GND : +6.720000 mA at 49.50 ns rms current on GND : +0.645861 mA
31
EE141 30 PowerMill Performance
32
EE141 31 Avant!s STAR-ADM (previously Anagram) Mixed analog/digital simulation Claims to be more accurate for deep sub-micron design than switch-level simulation
33
EE141 32 Performance Benchmarks
34
EE141 33 Structural model in VHDL
35
EE141 34 Behavioral model in VHDL
36
EE141 35 High level behavioral VHDL
37
EE141 36
38
EE141 37 Digital Systems Verification Overview Cycle-based and event-driven simulation Formal methods Timing Analysis Hardware Description Languages (Verilog- VHDL) System C
39
EE141 38 Verification - Simulation Consistency: same testbench at each level of abstraction Behavioral Gate-level Design (Post-layout) Gate-level Design (Pre-layout) RTL Design TestbenchSimulation
40
EE141 39 Formal Verification Can be used to verify a design against a reference design as it progresses through the different levels of abstraction Verifies functionality without test vectors Three main categories: Model Checking: compare a design to an existing set of logical properties (that are a direct representation of the specifications of the design). Properties have to be specified by the user (far from a “push-button” methodology) Theorem Proving: requires that the design is represented using a “formal” specification language. Present-day HDL are not suitable for this purpose. Equivalence Checking: it is the most widely used. It performs an exhaustive check on the two designs to ensure they behave identically under all possible conditions.
41
EE141 40 Digital Systems Verification Timing Analysis Not only has the design to “function properly”….it also has always tighter timing constraints Design timing properties have to be verified Static Timing Analysis is the main method
42
EE141 41 Static Timing Analysis Suitable for synchronous design Verify timing without testvectors Conservative with respect to dynamic timing analysis Latches Combinational Logic
43
EE141 42 Static Timing Analysis Inputs: Netlist, library models of the cells and constraints (clock period, skew, setup and hold time…) Outputs: Delay through the combinational logic Basic concepts: Look for the longest topological path Discard it if it is false
44
EE141 43 Conventional Simulation Methodology Limitations Increase in size of design significantly impact the verification methodology in general Simulation requires a very large number of test vectors for reasonable coverage of functionality Test vector generation is a significant effort Simulation run-time starts becoming a bottleneck New techniques: Static Timing Analysis Cycle-based simulation Formal Verification
45
EE141 44 New Verification Paradigm Functional: cycle-based simulation and/or formal verification Timing: Static Timing Analysis Gate-level netlist RTL Testbench Logic Synthesis Cycle-based Sim. Event-driven Sim. Static Timing Analysis Formal Verification
46
EE141 45 More issues System Level Simulation (Hardware/Software Codesign) CoCentric System Studio Synopsys Virtual Component Co-Design (VCC) Cadence Design Syst. Mixed-Signal Simulation Verilog-AMS
47
EE141 46 Design verification checking number of inversions between two C 2 MOS gates checking pull-up and pull down ratio in pseudo-NMOS gates checking minimum driver size to maintain rise and fall times checking charge sharing to satisfy noise- margins Electrical verification
48
EE141 47 Design verification Spice too long simulation time RC delay estimated using Penfield- Rubinstein-Horowitz method identification of critical path (avoid false paths) Timing verification
49
EE141 48 Design verification components described behaviorally circuit model obtained from component models resulting circuit behavior computed with design specifications no generally acceptable verifier exists Formal verification
50
EE141 49 Physical issues verification (DSM) Interconnects Signal Integrity P/G integrity Substrate coupling Crosstalk Parasitic Extraction Reduced Order Modeling Manufacturability and Reliability Power Estimation
51
EE141 50 (Some) EDA Tools and Vendors Formal Verification Formality Synopsys FormalCheck Cadence Design Systems DesignVerifyer Chrysalis Static Timing Analysis PrimeTime Synopsys (gate-level) PathMill Synopsys (transistor-level) Pearl Cadence Design Systems
52
EE141 51 Summary Conventional design and verification flow review Verification Techniques Simulation –Behavioral, RTL, Gate-level, Switch-level, Transistor- level Formal Verification Static Timing Analysis Emerging verification paradigm Functional: cycle-based, formal verification Timing: Static Timing Analysis
53
EE141 52 “Prototyping” Techniques in Design Stages time HardwareDesignChanges SoftwareSimulation Emulation PrototypeReplication Flexibility Performance Cost
54
EE141 53 Implementation approaches
55
EE141 54 Custom circuit design labor intensive high time-to-market cost amortized over a large volume reuse as a library cell was popular in early designs layout editor, DRC, circuit extraction
56
EE141 55 Layout editor transistor symbols relative positioning compaction stick diagram description design rules automatically satisfied automatic pitch matching 1. Polygon based (Magic) 2. Symbolic layout
57
EE141 56 Automatic pitch matching
58
EE141 57 Design rule checking on-line DRC - rules checked and errors flagged during layout batch DRC - post design verification
59
EE141 58 Circuit extraction Circuit schematic derived from layout Transistors are build with proper geometry Parasitic capacitances and resistances evaluated Extraction of inductance requires 3D analysis
60
EE141 59
61
EE141 60 Cell-based design reduced cost reduced time reduced integration density reduced performance standard cell compiled cells module generators macro cell place and route
62
EE141 61 Standard cell library contains basic logic cells - inverter, AND/NAND, OR/NOR, XOR/NXOR, flip-flop, AOI, MUX, adder, compactor, counter, decoder, encoder, fan-in and fan-out specified schematic uses cells from library layout automatically generated
63
EE141 62 Standard cell cells have equal heights cell rows separated by routing channels
64
EE141 63 Standard cell design
65
EE141 64 Standard cell layout and description
66
EE141 65 An Example Cell A 2-input dynamic C with a clearing input and an inverting output (C_dic2)
67
EE141 66 Single vs Double Height Cells arbiter
68
EE141 67 Single vs Double Height Cells… arbiter_dblht
69
EE141 68 Existing Design Flow Synopsys Synthesis Cadence Silicon Ensemble Cadence Composer Schematic Cadence Virtuoso Layout Existing Datapath Layout Behavioral Verilog Code Your Libraries LVS Verilog-XL Behavioral Verilog Structural Verilog Circuit Layout
70
EE141 69 New Design Flow Cadence to Synopsys Interface Cadence Silicon Ensemble Cadence Composer Schematic Cadence Virtuoso Layout Existing Datapath Layout Standard Cell Libraries LVS Verilog-XL Structural Verilog Circuit Layout
71
EE141 70 Placed and Routed
72
EE141 71 What are Standard Cell Libraries Standard-cell libraries are fixed set of well-characterized logic blocks. Basic logic functions are used several times on the same integrated circuit. It will have leaf cells ranging from simple gates to latches and flip-flops. These can then be used to build arithmetic blocks like adders and multipliers. ASIC designers commonly employ the use of standard cell libraries due to their robustness and flexibility resulting in quick turnaround times.
73
EE141 72 Advantages of standard cell libraries Designers save time and money by reducing the product development cycle time. Reduce risk by using predesigned, pretested and precharacterised standard cell libraries. Optimisation is possible.
74
EE141 73 Disadvantages of standard cell libraries Time and expenses of designing or buying the standard cell library. Time needed to fabricate all layers of ASIC for each new design when the standard cell library must be ported to a new fabrication process, the physical layout of all the cells need to be changed. There are no naming conventions There are no standards for cell behavior
75
EE141 74 Standard cell large design cost amortized over a large number of designs large number of different cells with different fan-ins large fan-out for cells to be used in different designs synthesis tools made standard cell design popular standard cell design outperform PLA in area and speed standard cell benefit from multi level logic synthesis
76
EE141 75 Classification of standard cell libraries Classical Libraries: --- Theses are the most common logic elements like gates, flip-flops, multiplexers, PAL, memories etc. IP (Intellectual Property) offerings: --- These include products like gate arrays and CPLDs which are IP offerings by many companies. Each one providing its own features and facilities in the product.
77
EE141 76 Fragment of an ASIC Library
78
EE141 77 MOSIS compatible cell libraries are provided by many organisations, commercial and non-commercial both Commercial organisations are: Mentor Graphics, Cadence, Artisan, Avant, Barcelona Design, Tanner Research, LEDA systems etc. Non Commercial Organisations are:MSU’s SCMOS Library, LASI, Ballistic, Magic etc. MOSIS compatible design tools and cell libraries
79
EE141 78 Standard Cells Provided by Mentor Graphics There are over 200 standard cells available for the 2.0 um, 1.2 um, and 0.8 um technology. -- 2, 3, and 4-input AND, NAND, OR, NOR, AO -- 2-input XOR and XNOR gates -- 2-1 MUX gate -- multiple drive strength buffers, inverters and tri-state buffers -- four D-type flip-flops: dff, dffs, dffr, and dffsr -- four D-type latches: latch, latchs, latchr, latchsr All of these cells have quickpart models with timing for full, backannotated simulation after layout
80
EE141 79 MGC Digital Libraries
81
EE141 80 MGC Digital Libraries (Contd..)
82
EE141 81 MGC Digital Libraries (Contd..)
83
EE141 82 MGC Digital Libraries (Contd..)
84
EE141 83 MGC Digital Libraries (Contd..)
85
EE141 84 New Trends in Standard Cell Libraries In a bid to improve the performance of standard-cell designs, vendors of place-and-route and synthesis tools and cell libraries are teaming up to develop a technique that is likely to lead to the death of the standard cell itself. Prolific Inc. (Newark, Calif.) has launched a tool called Liquid Libraries that will create tuned cells on the fly and insert them into the libraries used by place and route tools Hot on the heels of Prolific's launch, Cadabra Design Automation (Santa Clara, Calif.) is working on a new flow that would ultimately move library generation as far forward as the synthesis phase, giving logic designers the ability to tune parts of a design for low power consumption or speed
86
EE141 85 New Trends Contd.. Prolific is working with Cadence Design Systems, Magma Design Automation, Monterey Design Systems and Sapphire Design Automation.. Cadabra is working with Avanti, Cadence, Synopsis and Magma. The first fruit of the Cadabra project will be the power and performance optimization (PPO) flow.
87
EE141 86 Important Links to Standard Cell Libraries a) Tanner Inc. www.tanner.com. b) Prolific Inc. www. prolificinc.com c) MOSIS organisation www.mosis.org MOSIS/Tech Support/MOSIS Compatible Design Tools d) Cadabra Design www.cadabratech.com Automation Inc. e) Cadence Design Systems www.cadence.com Inc.
88
EE141 87 Compiled cell cell layout generated on the fly transistor or gate level netlist used with transistor size specified layout densities approach that of human designers Circuit schematics with transistor sizing
89
EE141 88 Compiled cell Generated layout
90
EE141 89 Module generators logic level cells not efficient for subcircuit design - shifters, adders, multipliers, data paths, PLAs, counters, memories Macrocell generators - use design parameters like number of bits data path compilers - use bit slice modules and repeat them N times - generate interconnections between modules
91
EE141 90 Datapath compilers Feedtroughs used to improve routing
92
EE141 91 Datapath compilers Datapath compiler results
93
EE141 92 Macrocell place and route channel routing - metal 2 horizontal segments metal 1 vertical segments over the block routing (3-6 metal layers used)
94
EE141 93
95
EE141 94 Array-based design implementation mask programmable arrays fuse based FPGAs nonvolatile FPGAs RAM based FPGAs To avoid slow fabrication process which takes 3-4 weeks :
96
EE141 95 Mask programmable arrays gate-array - similar to standard cell sea-of-gate - routed over the cells (high density) - wires added to make logic gates challenge in design is to utilize the maximum cell capacity utilization < 75% for random logic design
97
EE141 96 Fuse-based PLD’s
98
EE141 97 Fuse-based FPGA’s Actel sea-of-gate and standard cell approach
99
EE141 98 Fuse-based FPGA’s Example : XOR gate obtained by setting : A=1, B=0, C=0, D=1, SA=SB=In1, S0=S1=In2
100
EE141 99 Fuse-based FPGA’s Anti-fuse provides short (low resistance) when blown out
101
EE141 100 Nonvolatile FPGA’s programming similar to PROM erasable programmable logic devices - EPLD electrically erasable - EEPLD design partitioned into macrocells flip-flops used to make sequential circuits software used to program interconnections to optimize use of hardware input specified from schematics, truth tables, state graphs, VHDL code
102
EE141 101
103
EE141 102 RAM based (volatile) FPGA’s programming is fast and can be repeated many times no high voltage needed integration density is high information lost when the power goes off
104
EE141 103 XILINX FPGA’s configurable logic blocks CLBs used five input two output combinational blocks two D flip flops are edge or level triggered functionality and multiplexers controlled by RAM RAM can be used as look-up table or a register file
105
EE141 104 XILINX FPGA’s
106
EE141 105 XILINX FPGA’s each cell connected to 4 neighbors routing channels provide local or global connections switching matrices(RAM controlled) are used for switching between channels
107
EE141 106 XILINX FPGA’s
108
EE141 107 XILINX FPGA’s (XC4025) 32 × 32 CLBs 25000 gates 422 k bites of RAM operates at 250 MHz 32 kbit adder uses 62 CLBs
109
EE141 108 XILINX FPGA’s (XC4025)
110
EE141 109 Universal Nanoscale Architecture Beyond lithographic limits Crossed Wire nanoarrays Implement PLAs, memory, and xbars NSC’02: to appear IEEE TR Nano
111
EE141 110 Design synthesis
112
EE141 111 Circuit synthesis derivation of the transistors schematics from logic functions - complementary CMOS - pass transistor - dynamic - DCVSL (differential cascode voltage switch logic) transistor sizing - performance modeling using RC equivalent circuits- layout generation synthesis not popular due to designers reluctance
113
EE141 112 Logic synthesis state transition diagrams, FSM, schematics, Boolean equations, truth tables, and HDL used synthesis - combinational or sequential - multi level, PLA, or FPGA logic optimization for - area, speed, power - technology mapping
114
EE141 113 Logic optimization Expresso - two level minimization tool (UCB) state minimization and state encoding MIS - multilevel logic synthesis (UCB) Example : S = (A B) C i C o = AB + AC i + BC i
115
EE141 114 Architecture synthesis behavioral or high level synthesis optimizing translation e.g. pipelining Cathedral and HYPER tools HYPER tutorial and synthesis example: http://infopad.eecs.berkeley.edu/~hyper
116
EE141 115 Architecture Synthesis Automatic Tool Application Customized Architecture Power Size Performance Timing
117
EE141 116 Architecture synthesis example
118
EE141 117 Architecture synthesis
119
EE141 118 Tensilica: Xtensa Architecture Copyright: Tensilica
120
EE141 119 Design Framework Copyright: Tensilica
121
EE141 120 Silicon Choices ASIC implementation of Processor Fast but not flexible Time intensive design process Example: Tensilica, HP-STMicroelectronics Processor core in ASIC but instruction set extensions in reconfigurable logic Medium speed but flexible Fast design process Example: Triscend Configurable System-on-Chip
122
EE141 121 Triscend Configurable SoC Copyright: Triscend
123
EE141 122 Reconfigurable Computing 101 Higher performance than software with higher level of flexibility than hardware e.g. Field Programmable Gate Arrays (FPGA) Logic Blocks –Array of computational elements whose functionality is determined through multiple SRAM configuration bits Interconnection –Logic blocks are connected using programmable routing resources Any custom circuit can be mapped to FPGA by computing logic functions within logic blocks and using configurable routing to connect the logic blocks together Dynamically reconfigurable Logic Logic reconfiguration during application execution Temporal partitioning of software reduces logic area Overhead for reconfiguration
124
EE141 123 Use of Reconfigurable Computing Two choices Map both control and datapath to RC Map only datapath to RC Granularity of reconfigurable logic Bit Multiple bits ALU
125
EE141 124 RC Coupled to I/O System Bus Most common form of commercial RC Overhead of data transfer between CPU and RC Requires large granularity of computation on RC CPU RC I/O Memory Local Bus PCI Bus Local Bus
126
EE141 125 RC Coupled to Local Bus Pilchard from Chinese University of Hong Kong Still requires large granularity of computation CPU RC I/O Memory Local Bus PCI Bus Local Bus
127
EE141 126 RC Coupled to CPU as Coprocessor Tight coupling between CPU and RC RC can execute ISA extensions CPU and RC cannot share register file CPU RC I/O Memory Local Bus PCI Bus Local Bus
128
EE141 127 PICO Architecture Copyright: Bob Rau et. al.
129
EE141 128 Design Framework Copyright: Bob Rau et. al.
130
EE141 129 Design Flow Copyright: Bob Rau et. al.
131
EE141 130 Hardware/software Co-design Well studied problem. Then what’s new? High Level Synthesis (HLS) Time-to-market constraint forces automated generation of reconfigurable bitstream from high level specification or software Automated generation of interface Spatial and temporal partitioning Partitioning among multiple configurable devices Map a function that exceeds the available space of reconfigurable device using time sharing Requires new compilation techniques
132
EE141 131 High Level Synthesis-1 High level hardware description language Start from software programming language and add support for Parallelism via threads Message passing Examples: Handel-C, SystemC Make current HDL more abstract Superlog, System Verilog Still requires user to find parallelism
133
EE141 132 High Level Synthesis-2 High Level Synthesis-2 Combine research in two different fields: compiler and design automation Traditional HLS techniques target ASIC implementation RC does not have the layout freedom Objective of RC is to minimize execution time Temporal partitioning if insufficient area Hardware library of operators or structures commonly used by software programs
134
EE141 133 High Level Synthesis-3 Concentrate on loops Leverage parallelizing compiler technology combined with high level synthesis Parallelize computation Optimize external memory access Loop transformation: Area versus Performance –Unroll and Jam –Loop unrolling –Software pipelining –Loop-invariant code motion –Data layout Hardware specific optimizations Bitwidth reduction
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.