1 MSc - Microprocessors Dr. Konstantinos Tatas


1 MSc - Microprocessors. Dr. Konstantinos Tatas, com.tk@fit.ac.cy

2 Useful Information
Instructor: Lecturer K. Tatas
  – Office hours: TBA
  – E-mail: com.tk@fit.ac.cy
  – http://staff.fit.ac.cy/com.tk
Lecture periods/week: 4
Duration: 10 weeks
ECTS: 7 (175 hours)

3 Course Objectives
By the end of the course students should be able to:
  – Evaluate the complex trade-offs involved in embedded system design
  – Write detailed embedded system requirements and specification documents
  – Write executable specifications using UML/SystemC
  – Develop applications using ARM Developer Suite
  – Write efficient ARM assembly and C programs in ARM and Thumb mode
  – Analyze program performance using traces
  – Use code transformations to improve performance, code size, and power consumption

4 Course Outline (1/2)
Week 1: Introduction to microprocessors for general-purpose and embedded systems – Embedded microprocessor evolution – Design metrics and constraints (performance, power, cost, time-to-market) and design optimization challenges – Key embedded system technologies – Integrated circuit technology – Microprocessor technology – CAD tool technology
Week 2: Embedded system specification and modeling – Object-oriented specification (UML/C++/SystemC) – Assignment 1
Week 3: Computer architecture – Instruction sets – RISC vs. CISC – Pipelining – The ARM microprocessor architecture – ARM assembly – ARM mode – Thumb mode – ARM and Thumb instruction set – ARM conditional execution
Week 4: Processor I/O – Serial I/O – Busy/wait I/O – Interrupts – Exceptions – Traps – ARM memory-mapped I/O – Caches – Memory Management Units – Protection Units – ARM cache and MMU – Assignment 2

5 Course Outline (2/2)
Week 5: Program design and analysis – DFGs – CDFGs – Compilers – Assemblers – Linkers – Basic compiler optimizations/code transformations – Measuring program speed – Trace-driven performance analysis – Energy optimization – Program size optimization
Week 6: Code transformations – Loop unrolling – Loop merging – Loop tiling – Performance-optimizing transformations
Week 7: Test
Week 8: Assignment 3
Week 9:
Week 10: Revision

6 Course Assessment
Final exam: 60%
Coursework: 40%
  – Assignment 1: 8%
  – Assignment 2: 8%
  – Assignment 3: 8%
  – Test: 10%
  – Lab exercises: 6%

7 Outline
VLSI and microprocessor evolution
Microprocessors in embedded systems
Design challenge – optimizing design metrics
Technologies
  – Processor technologies
  – IC technologies
  – Design technologies

8 ENIAC – The first electronic computer (1946)

9 Cross-Section of CMOS Technology

10 Evolution in Complexity (1)

11 Evolution in Complexity (2)

12 Moore’s Law

13 Moore’s Law

14 Intel 4004 Micro-Processor


19 Cell Processor for PlayStation 3

20 IBM Power5

21 IBM PowerPC history

22 Technology Process Evolution. Node years: 2007/65nm, 2010/45nm, 2013/33nm, 2016/23nm

23 Technology Process Evolution

24 Intel’s Technology Roadmap (Mark Bohr: Intel 04)

25 Raising the Level of Abstraction for Design
[Figure labels: Transistor, Gate, RTL, IP Blocks & NoC, Software, Functional Differentiation. Example system blocks: performance-driven HW, 3G engine, Bluetooth controller, IR & RS232, compression & encryption engine, CPU core, DSP core, FPGA, RTOS, Bluetooth driver, Comp/Enc driver, IR/RS232 driver.]
Courtesy of Walden C. Rhines, Mentor Graphics Corporation, DAC 2004

26 Bibliography
Books
  – W. Wolf, “Computers as Components”
  – S. Furber, “ARM System-on-Chip Architecture”
  – P. Panda, “Memory Issues in Embedded Systems-on-Chip”
  – F. Vahid and T. Givargis, “Embedded System Design: A Unified Hardware/Software Introduction”
  – F. Catthoor, “Data Access and Storage Management for Embedded Programmable Processors”

27 Microprocessors for Embedded Systems
Computing systems are everywhere
Most of us think of “desktop” computers
  – PCs
  – Laptops
  – Mainframes
  – Servers
But there’s another type of computing system
  – Far more common...

28 Embedded systems overview
Embedded computing systems
  – Computing systems embedded within electronic devices
  – Hard to define: nearly any computing system other than a desktop computer
  – Billions of units produced yearly, versus millions of desktop units
  – Perhaps 50 per household and per automobile
[Figure captions: Computers are in here... and here... and even here... Lots more of these, though they cost a lot less each.]

29 A “short list” of embedded systems
Anti-lock brakes, auto-focus cameras, automatic teller machines, automatic toll systems, automatic transmission, avionic systems, battery chargers, camcorders, cell phones, cell-phone base stations, cordless phones, cruise control, curbside check-in systems, digital cameras, disk drives, electronic card readers, electronic instruments, electronic toys/games, factory control, fax machines, fingerprint identifiers, home security systems, life-support systems, medical testing systems, modems, MPEG decoders, network cards, network switches/routers, on-board navigation, pagers, photocopiers, point-of-sale systems, portable video games, printers, satellite phones, scanners, smart ovens/dishwashers, speech recognizers, stereo systems, teleconferencing systems, televisions, temperature controllers, theft tracking systems, TV set-top boxes, VCRs, DVD players, video game consoles, video phones, washers and dryers
And the list goes on and on

30 Some common characteristics of embedded systems
Single-functioned
  – Executes a single program, repeatedly
Tightly-constrained
  – Low cost, low power, small, fast, etc.
Reactive and real-time
  – Continually reacts to changes in the system’s environment
  – Must compute certain results in real-time without delay

31 An embedded system example – Digital camera
Single-functioned: always a digital camera
Tightly-constrained: low cost, low power, small, fast
Reactive and real-time: only to a small extent
[Figure: digital camera chip block diagram with lens, CCD, CCD preprocessor, pixel coprocessor, A2D, D2A, microcontroller, JPEG codec, DMA controller, memory controller, ISA bus interface, UART, LCD ctrl, display ctrl, multiplier/accum]

32 Embedded Software Development Requires as Much/More Design Effort Than Hardware

33 A System-on-a-Chip: Example (Courtesy: Philips)

34 Design at a crossroad: System-on-a-Chip
Embedded applications where cost, performance, and energy are the real issues!
DSP and control intensive
Mixed-mode
Combines programmable and application-specific modules
Software plays crucial role
[Figure: example SoC with RAM; 500 k gates FPGA + 1 Gbit DRAM (preprocessing); multi-spectral imager; µC system + 2 Gbit DRAM (recognition); analog; 64 SIMD processor array + SRAM (image conditioning, 100 GOPS)]

35 Disciplines involved in Embedded System Design
Digital System Design
Software Design
Analog/Mixed-Signal/RF System Design
Operating Systems
Microprocessors/Computer Architecture
Verification
Testing
etc.

36 Languages traditionally used in Embedded System Design
Specification/modeling
  – UML
  – SDL
  – C/C++
Hardware design
  – VHDL
  – Verilog
Software design
  – C/C++
  – Java
  – Assembly
Verification
  – VHDL/Verilog
  – SystemVerilog
  – Tcl/Tk
  – Vera

37 Design Challenges
How much hardware do we need?
How do we meet (system) deadlines?
  – Faster clock?
How do we minimize power consumption?
  – Slower clock?
How do we design for upgradeability?
How do you know it really works?
  – Complex testing
  – Limited observability and controllability

38 Design challenge – optimizing design metrics
Obvious design goal:
  – Construct an implementation with desired functionality
Key design challenge:
  – Simultaneously optimize numerous design metrics
Design metric
  – A measurable feature of a system’s implementation
  – Optimizing design metrics is a key challenge

39 Design challenge – optimizing design metrics
Common metrics
  – Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  – NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  – Size: the physical space required by the system
  – Performance: the execution time or throughput of the system
  – Power: the amount of power consumed by the system
  – Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost

40 Design challenge – optimizing design metrics
Common metrics (continued)
  – Time-to-prototype: the time needed to build a working version of the system
  – Time-to-market: the time required to develop a system to the point that it can be released and sold to customers
  – Maintainability: the ability to modify the system after its initial release
  – Correctness, safety, many more

41 Design metric competition: improving one may worsen others
Expertise with both software and hardware is needed to optimize design metrics
  – Not just a hardware or software expert, as is common
  – A designer must be comfortable with various technologies in order to choose the best for a given application and constraints
[Figure: competing metrics (size, performance, power, NRE cost) shown next to the digital camera chip block diagram from slide 31]

42 Time-to-market: a demanding design metric
Time required to develop a product to the point it can be sold to customers
Market window
  – Period during which the product would have highest sales
Average time-to-market constraint is about 8 months
Delays can be costly
[Figure: revenues ($) versus time (months)]

43 Losses due to delayed market entry
Simplified revenue model
  – Product life = 2W, peak at W
  – Time of market entry defines a triangle, representing market penetration
  – Triangle area equals revenue
Loss
  – The difference between the on-time and delayed triangle areas
[Figure: revenues ($) versus time; on-time and delayed-entry triangles, with market rise to peak revenue at W, market fall to 2W, and delayed entry at time D]

44 Losses due to delayed market entry (cont.)
Area = 1/2 * base * height
  – On-time = 1/2 * 2W * W
  – Delayed = 1/2 * (W - D + W) * (W - D)
Percentage revenue loss = (D(3W - D) / (2W^2)) * 100%
Try some examples
  – Lifetime 2W = 52 wks, delay D = 4 wks
  – (4 * (3*26 - 4)) / (2 * 26^2) = 22%
  – Lifetime 2W = 52 wks, delay D = 10 wks
  – (10 * (3*26 - 10)) / (2 * 26^2) = 50%
  – Delays are costly!
[Figure: revenues ($) versus time; on-time and delayed-entry triangles, with market rise to peak revenue at W, market fall to 2W, and delayed entry at time D]
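The triangle model above can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `revenue_loss_fraction` is hypothetical, and it computes the loss directly from the two triangle areas rather than from the closed-form expression, so the two can be compared.

```python
def revenue_loss_fraction(w: float, d: float) -> float:
    """Fraction of revenue lost under the simplified triangle model.

    w: half of the product lifetime (the market peaks at W, ends at 2W)
    d: delay in market entry, in the same time unit as w
    """
    if not 0 <= d <= w:
        raise ValueError("the model assumes 0 <= D <= W")
    on_time_area = 0.5 * (2 * w) * w            # base 2W, height W
    delayed_area = 0.5 * (2 * w - d) * (w - d)  # base 2W - D, height W - D
    return (on_time_area - delayed_area) / on_time_area


if __name__ == "__main__":
    # The two examples from the slide: lifetime 2W = 52 weeks.
    for delay in (4, 10):
        loss = revenue_loss_fraction(26, delay)
        print(f"D = {delay:2d} wks -> loss = {loss:.0%}")
```

Working from the areas keeps the code tied to the model's definition; the result agrees with D(3W - D) / (2W^2).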

45 The performance design metric
Widely-used measure of system, widely-abused
  – Clock frequency, instructions per second: not good measures
  – Digital camera example: a user cares about how fast it processes images, not clock speed or instructions per second
Latency (response time)
  – Time between task start and end
  – e.g., Cameras A and B process images in 0.25 seconds
Throughput
  – Tasks per second, e.g. Camera A processes 4 images per second
  – Throughput can be more than latency seems to imply due to concurrency, e.g. Camera B may process 8 images per second (by capturing a new image while the previous image is being stored)
Speedup of B over A = B’s performance / A’s performance
  – Throughput speedup = 8/4 = 2
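The difference between latency and throughput can be illustrated with a small Python sketch. It is only an illustration under an assumption that is not in the slides: that processing an image consists of two stages (capture and store) taking 0.125 s each, so that overlapping them reproduces the slide's figures. The function names are hypothetical.

```python
def sequential_throughput(stage_times):
    """Tasks per second when each task finishes before the next one starts."""
    return 1.0 / sum(stage_times)


def pipelined_throughput(stage_times):
    """Tasks per second when stages overlap; the slowest stage sets the rate."""
    return 1.0 / max(stage_times)


# Assumed stage split (capture, store); the slides only give the 0.25 s total.
stages = [0.125, 0.125]

latency = sum(stages)                     # same for both cameras
camera_a = sequential_throughput(stages)  # no overlap between images
camera_b = pipelined_throughput(stages)   # captures while storing
speedup = camera_b / camera_a

if __name__ == "__main__":
    print(f"latency: {latency} s")
    print(f"camera A: {camera_a} images/s, camera B: {camera_b} images/s")
    print(f"throughput speedup of B over A: {speedup}")
```

Both cameras have the same latency, yet B's throughput is twice A's, which is why a single clock-rate number says little about delivered performance.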

46 Three key embedded system technologies
Technology
  – A manner of accomplishing a task, especially using technical processes, methods, or knowledge
Three key technologies for embedded systems
  – Processor technology
  – IC technology
  – Design technology

47 Processor technology
The architecture of the computation engine used to implement a system’s desired functionality
Processor does not have to be programmable
  – “Processor” not equal to general-purpose processor
[Figure: three processor types side by side. General-purpose (“software”): controller with control logic and state register, IR, PC; datapath with register file and general ALU; program memory holding assembly code for the loop “total = 0; for i = 1 to ...”; data memory. Application-specific: same structure, with registers and a custom ALU in the datapath. Single-purpose (“hardware”): controller with control logic and state register; datapath with index, total, and an adder; data memory; no program memory.]

48 Processor technology
Processors vary in their customization for the problem at hand
Desired functionality:
  total = 0
  for i = 1 to N loop
    total += M[i]
  end loop
[Figure: the desired functionality mapped onto a general-purpose processor, an application-specific processor, and a single-purpose processor]
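For reference, the desired functionality on this slide is a plain array summation. A minimal runnable Python rendering is sketched below; the function name `sum_memory` is hypothetical, and Python sequences are indexed from 0 rather than from 1 as in the slide's pseudocode.

```python
def sum_memory(m):
    """Accumulate all elements of m, mirroring the slide's pseudocode loop."""
    total = 0
    for value in m:      # corresponds to: for i = 1 to N
        total += value   # corresponds to: total += M[i]
    return total


if __name__ == "__main__":
    print(sum_memory([3, 1, 4, 1, 5]))
```

On a general-purpose processor this loop becomes instructions in program memory; a single-purpose processor would instead hard-wire the index register, the accumulator, and the adder.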

49 General-purpose processors
Programmable device used in a variety of applications
  – Also known as “microprocessor”
Features
  – Program memory
  – General datapath with large register file and general ALU
User benefits
  – Low time-to-market and NRE costs
  – High flexibility
“Pentium” is the most well-known, but there are hundreds of others
[Figure: general-purpose processor with controller (control logic and state register, IR, PC), datapath (register file, general ALU), program memory, and data memory]

50 Single-purpose processors
Digital circuit designed to execute exactly one program
  – a.k.a. coprocessor, accelerator or peripheral
Features
  – Contains only the components needed to execute a single program
  – No program memory
Benefits
  – Fast
  – Low power
  – Small size
[Figure: single-purpose processor with controller (control logic, state register), datapath (index, total, adder), and data memory]

51 Application-specific processors
Programmable processor optimized for a particular class of applications having common characteristics
  – Compromise between general-purpose and single-purpose processors
Features
  – Program memory
  – Optimized datapath
  – Special functional units
Benefits
  – Some flexibility, good performance, size and power
[Figure: application-specific processor with controller (control logic and state register, IR, PC), datapath (registers, custom ALU), program memory, and data memory]

52 IC technology
The manner in which a digital (gate-level) implementation is mapped onto an IC
  – IC: integrated circuit, or “chip”
  – IC technologies differ in their customization to a design
  – ICs consist of numerous layers (perhaps 10 or more)
IC technologies differ with respect to who builds each layer and when
[Figure: IC package and IC; transistor cross-section showing source, drain, channel, oxide, gate, and silicon substrate]

53 IC technology
[Figure: IC technology implementation approaches (design approaches)]
Custom
Semicustom
  – Cell-based: standard cells, compiled cells, macro cells
  – Array-based: pre-diffused (gate arrays), pre-wired (FPGAs)

54 Full-custom design
All layers are optimized for an embedded system’s particular digital implementation
  – Placing transistors
  – Sizing transistors
  – Routing wires
Benefits
  – Excellent performance, small size, low power
Drawbacks
  – High NRE cost (e.g., $300k), long time-to-market

55 The Custom Approach: Intel 4004 (Courtesy Intel)

56 Transition to Automation and Regular Structures: Intel 4004 (’71), Intel 8080, Intel 8085, Intel 80286, Intel 80486 (Courtesy Intel)


58 IC technology
[Figure: IC technology implementation approaches, repeated from slide 53]

59 Semi-custom
Lower layers are fully or partially built
  – Designers are left with routing of wires and maybe placing some blocks
Benefits
  – Good performance, good size, less NRE cost than a full-custom implementation (perhaps $10k to $100k)
Drawbacks
  – Still requires weeks to months to develop

60 Cell-based Design (or standard cells)
Routing channel requirements are reduced by presence of more interconnect layers

61 Standard Cell - Example [Brodersen92]

62 Standard Cell - Example
3-input NAND cell (from ST Microelectronics): C = load capacitance, T = input rise/fall time

63 IC technology
[Figure: IC technology implementation approaches, repeated from slide 53]

64 Programmable Logic Devices
All layers (diffusion, polysilicon, [multi-]metal) may exist
  – Designers can purchase an IC
  – Connections on the IC are either created or destroyed to implement desired functionality
  – Field-Programmable Gate Array (FPGA) and recently Gate Arrays are very popular
Benefits
  – Low NRE costs, almost instant IC availability
Drawbacks
  – Bigger, expensive (perhaps $30 per unit), power hungry, slower

65 Gate Array - Sea-of-gates
[Figure: uncommitted cell and committed cell (4-input NOR)]

66 Sea-of-gate Primitive Cells
[Figure: cells using oxide-isolation and using gate-isolation]

67 Sea-of-gates
[Figure: random logic and memory subsystem; LSI Logic LEA300K (0.6 µm CMOS)]

68 Prewired Arrays
Classification of prewired arrays (or field-programmable devices):
Based on Programming Technique
  – Fuse-based (program-once)
  – Non-volatile EPROM based
  – RAM based
Programmable Logic Style
  – Array-Based
  – Look-up Table
Programmable Interconnect Style
  – Channel-routing
  – Mesh networks

69 Altera MAX (From Smith97)

70 Altera MAX Interconnect Architecture
[Figure: LABs with row channels and column channels; array-based (MAX 3000-7000) and mesh-based (MAX 9000)]

71 LUT-Based Logic Cell
[Figure: Xilinx 4000 Series logic cell with inputs C1...C4, D1-D4, and F1-F4; lookup tables implementing logic functions of their inputs; multiplexers controlled by the configuration program]

72 Array-Based Programmable Wiring
[Figure: cells, horizontal tracks, vertical tracks, input/output pins, interconnect points, programmed interconnections]

73 Transistor Implementation of Mesh (Courtesy DeHon and Wawrzynek)

74 RAM-based FPGA: Xilinx XC4000ex

75 Design Technology
The manner in which we convert our concept of desired system functionality into an implementation
Abstraction levels: system specification, behavioral specification, RT specification, logic specification, to final implementation
Compilation/Synthesis: automates exploration and insertion of implementation details for lower level
  – System synthesis, behavior synthesis, RT synthesis, logic synthesis
Libraries/IP: incorporates pre-designed implementation from lower abstraction level into higher level
  – Hw/Sw/OS, cores, RT components, gates/cells
Test/Verification: ensures correct functionality at each level, thus reducing costly iterations between levels
  – Model simulators/checkers, Hw-Sw cosimulators, HDL simulators, gate simulators

76 The co-design ladder
In the past:
  – Hardware and software design technologies were very different
  – Recent maturation of synthesis enables a unified view of hardware and software
Hardware/software “codesign”
[Figure: ladder starting from sequential program code (e.g., C, VHDL). Software side: compilers (1960s, 1970s), assembly instructions, assemblers and linkers (1950s, 1960s), machine instructions; result: microprocessor plus program bits, “software”. Hardware side: behavioral synthesis (1990s), register transfers, RT synthesis (1980s, 1990s), logic equations/FSMs, logic synthesis (1970s, 1980s), logic gates; result: VLSI, ASIC, or PLD implementation, “hardware”.]
The choice of hardware versus software for a particular function is simply a tradeoff among various design metrics, like performance, power, size, NRE cost, and especially flexibility; there is no fundamental difference between what hardware or software can implement.

77 Independence of processor and IC technologies
Basic tradeoff
  – General vs. custom
  – With respect to processor technology or IC technology
  – The two technologies are independent
[Figure: processor technologies (general-purpose processor, ASIP, single-purpose processor) against IC technologies (PLD, semi-custom, full-custom)]
General, providing improved: flexibility, maintainability, NRE cost, time-to-prototype, time-to-market, cost (low volume)
Customized, providing improved: power efficiency, performance, size, cost (high volume)

78 Design Decision Trade-offs

79 Generalised Design Flow

80 Architecture ReUse
Silicon System Platform
  – Flexible architecture for hardware and software
  – Specific (programmable) components
  – Network architecture
  – Software modules
  – Rules and guidelines for design of HW and SW
Has been successful in PCs
  – Dominance of a few players who specify and control architecture
Application-domain specific (difference in constraints)
  – Speed (compute power)
  – Dissipation
  – Costs
  – Real / non-real time data

81 Platform-Based Design
A platform is a restriction on the space of possible implementation choices, providing a well-defined abstraction of the underlying technology for the application developer
New platforms will be defined at the architecture/micro-architecture boundary
They will be component-based, and will provide a range of choices from structured-custom to fully programmable implementations
Key to such approaches is the representation of communication in the platform model
“Only the consumer gets freedom of choice; designers need freedom from choice” (Orfali, et al, 1996, p.522)
Source: R. Newton

82 Platform-based Design – System-on-Chip
Use of predefined Intellectual Property (IP)
A platform-based system consists of a RISC processor, memories, busses and a common language
Platform-based design poses the problem of partitioning a solution between hardware (HDL) and software (programming processors)

83 Platforms Enable Simplified SoC Design
Customer demands
  – Fast turn-around time
  – Easy access to pre-qualified building blocks
  – Web enabled
Design technology
  – Core platforms
  – “Big” IP
  – Emerging SoC bus standards
  – Embedded software
  – HW/SW co-verification
[Figure: core, near peripherals, far peripherals]

84 And Automation of IP Selection & Integration

85 Heterogeneous Programmable Platforms
Xilinx Virtex-II Pro: high-speed I/O, embedded PowerPC, embedded memories, hardwired multipliers, FPGA fabric

86 Xilinx’s products

87 Xilinx’s products

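88 Comparison of CMOS design methods
Columns: Design Method, NRE, Unit Cost, Power Dissipation, Complexity of Implementation, Time-to-Market, Performance, Flexibility
Rows (values in extracted order; repeated adjacent entries appear to have been merged, so only the last row shows all seven columns):
  – μProcessor/DSP: low, medium, high, low, high
  – PLA: low, medium, low, medium, low
  – FPGA: low, high, medium
  – Gate Array: medium, low, medium
  – Cell Based: high, low, high, low
  – Custom Design: high, low, high, very high, low
  – Platform Based: high, low/medium, low, high, medium/low, high, medium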

89 Impact of Implementation Choices
[Figure: energy efficiency (in MOPS/mW; ranges 0.1-1, 1-10, 10-100, 100-1000) versus flexibility or application scope (none, somewhat flexible, fully flexible). Implementation styles shown: hardwired custom, configurable/parameterizable, domain-specific processor (e.g. DSP), embedded microprocessor.]

90 Design Economics (1)
The selling price of an IC: S_total = C_total / (1 - m), where C_total is the manufacturing cost for a single IC and m is the desired profit margin
Costs to produce an IC
  – Non-recurring engineering costs (NREs)
  – Recurring engineering costs
  – Fixed costs
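The selling-price relation can be written as a one-line function. The Python sketch below is illustrative only; the function name and the example numbers (a $10 cost and a 20% margin) are assumptions, not values from the slides.

```python
def selling_price(c_total: float, margin: float) -> float:
    """S_total = C_total / (1 - m), with m the profit margin as a fraction of price."""
    if not 0 <= margin < 1:
        raise ValueError("margin must satisfy 0 <= m < 1")
    return c_total / (1 - margin)


if __name__ == "__main__":
    # Hypothetical example: $10 manufacturing cost, 20% desired margin.
    print(f"${selling_price(10.0, 0.20):.2f}")
```

Because m is a fraction of the selling price rather than of the cost, the price grows without bound as m approaches 1.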

91 Design Economics (2)
Non-recurring engineering costs (NREs)
  – Engineering design cost
  – Prototype manufacturing cost
Recurring costs
  – Process
  – Package
  – Test

92 NRE and unit cost metrics
Costs:
  – Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  – NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  – total cost = NRE cost + unit cost * # of units
  – per-product cost = total cost / # of units = (NRE cost / # of units) + unit cost
Example
  – NRE = $2000, unit = $100
  – For 10 units
  – total cost = $2000 + 10 * $100 = $3000
  – per-product cost = $2000/10 + $100 = $300
Amortizing NRE cost over the units results in an additional $200 per unit
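The two cost formulas are easy to reproduce in code. This is a minimal Python sketch using the slide's example numbers; the function names are hypothetical.

```python
def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """total cost = NRE cost + unit cost * number of units"""
    return nre + unit_cost * units


def per_product_cost(nre: float, unit_cost: float, units: int) -> float:
    """per-product cost = total cost / number of units"""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost(nre, unit_cost, units) / units


if __name__ == "__main__":
    # Slide example: NRE = $2000, unit cost = $100, 10 units.
    print(total_cost(2000, 100, 10))
    print(per_product_cost(2000, 100, 10))
    # The amortized NRE share per unit is the per-product cost minus the unit cost.
    print(per_product_cost(2000, 100, 10) - 100)
```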

93 NRE and unit cost metrics
Compare technologies by costs: best depends on quantity
  – Technology A: NRE = $2,000, unit = $100
  – Technology B: NRE = $30,000, unit = $30
  – Technology C: NRE = $100,000, unit = $2
But, must also consider time-to-market
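To see how the best technology depends on quantity, the three options can be compared at several volumes and their break-even points computed. The Python sketch below uses the slide's NRE and unit costs; the function names and the sample volumes are assumptions chosen for illustration.

```python
# (NRE cost, unit cost) in dollars, taken from the slide.
TECHNOLOGIES = {
    "A": (2_000, 100),
    "B": (30_000, 30),
    "C": (100_000, 2),
}


def total_cost(name: str, units: int) -> int:
    nre, unit_cost = TECHNOLOGIES[name]
    return nre + unit_cost * units


def cheapest(units: int) -> str:
    """Technology with the lowest total cost at the given volume."""
    return min(TECHNOLOGIES, key=lambda name: total_cost(name, units))


def break_even_units(low_nre: str, high_nre: str) -> float:
    """Volume at which two technologies have equal total cost."""
    nre_lo, unit_lo = TECHNOLOGIES[low_nre]
    nre_hi, unit_hi = TECHNOLOGIES[high_nre]
    return (nre_hi - nre_lo) / (unit_lo - unit_hi)


if __name__ == "__main__":
    for volume in (100, 1_000, 10_000):  # illustrative volumes
        print(volume, cheapest(volume))
    print("A/B break-even:", break_even_units("A", "B"))
    print("B/C break-even:", break_even_units("B", "C"))
```

Under these figures, A is cheapest below 400 units, B between 400 and 2,500 units, and C above 2,500 units.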

94 Wafer and die cost
Die yield: number of good dies / total number of dies

95 Example
Assuming:
  – 20 engineers are employed full-time for a year with a $50,000/year average salary
  – $200,000 in additional overhead costs, of which $100,000 is for testing
  – A wafer cost of $200 per wafer
  – A $2 packaging cost per chip
  – 10 dies/wafer
  – 70% die yield
  – 98% final test yield
  – A market for 100,000 items
Calculate the minimum shelf price of the chip
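One possible way to approach this exercise is sketched below in Python. It is not the official solution and depends on several assumptions the slide does not state: the minimum price is taken as the break-even price (zero profit margin), the $100,000 testing cost is already included in the $200,000 overhead, every good die is packaged before final test (so package cost is also lost on chips that fail the test), and the NRE is spread evenly over the 100,000 units. All variable names are hypothetical.

```python
# Figures from the slide.
ENGINEERS = 20
SALARY_PER_YEAR = 50_000
OVERHEAD = 200_000          # assumed to already include the 100,000 for testing
WAFER_COST = 200.0
DIES_PER_WAFER = 10
DIE_YIELD = 0.70
PACKAGE_COST = 2.0
FINAL_TEST_YIELD = 0.98
MARKET_VOLUME = 100_000

# Non-recurring cost: one year of salaries plus overhead.
nre_cost = ENGINEERS * SALARY_PER_YEAR + OVERHEAD

# Recurring cost per sellable chip.
cost_per_good_die = WAFER_COST / (DIES_PER_WAFER * DIE_YIELD)
# Assumption: dies are packaged before final test, so both costs scale by 1/yield.
unit_cost = (cost_per_good_die + PACKAGE_COST) / FINAL_TEST_YIELD

# Break-even price with the NRE amortized over the whole market.
minimum_price = unit_cost + nre_cost / MARKET_VOLUME

if __name__ == "__main__":
    print(f"NRE cost:      ${nre_cost:,}")
    print(f"unit cost:     ${unit_cost:.2f}")
    print(f"minimum price: ${minimum_price:.2f}")
```

A different reading of the problem (for example, testing before packaging, or adding a profit margin m) would change the result, so the numbers should be treated as one consistent interpretation rather than the expected answer.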

96 Design productivity exponential increase
Exponential increase over the past few decades
[Figure: designer productivity in K transistors per staff-month (log scale, 0.01 to 100,000) versus year, 1981 to 2009]

97 The growing design-productivity gap
Design Productivity Crisis (SRC 1997): potential design complexity and designer productivity
  – Complexity growth rate: 58% / yr compounded (logic transistors per chip)
  – Productivity growth rate: 21% / yr compounded (transistors per staff-month)
[Figure: logic transistors per chip (M) and productivity (K transistors per staff-month), log scale, versus year, 1981 to 2009. Inset, Moore’s Law for standard cell density and speed: density (Kgates/mm^2) and ASIC clock (MHz) versus year, 2001 to 2015.]

98 Design productivity gap
1981 leading edge chip required 100 designer months
  – 10,000 transistors / 100 transistors/month
2002 leading edge chip requires 30,000 designer months
  – 150,000,000 transistors / 5000 transistors/month
Designer cost increase from $1M to $300M
While designer productivity has grown at an impressive rate over the past decades, the rate of improvement has not kept pace with chip capacity
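The designer-month figures follow from dividing chip size by productivity. The Python sketch below reproduces them; the cost per designer-month of $10,000 is not stated on the slide but is implied by its own numbers ($1M for 100 designer months), and the names used are hypothetical.

```python
def designer_months(transistors: int, transistors_per_month: int) -> float:
    """Effort needed to design a chip at a given per-designer productivity."""
    return transistors / transistors_per_month


# Implied by the slide: $1M / 100 designer months.
COST_PER_DESIGNER_MONTH = 1_000_000 / 100

effort_1981 = designer_months(10_000, 100)
effort_2002 = designer_months(150_000_000, 5_000)

if __name__ == "__main__":
    for year, effort in ((1981, effort_1981), (2002, effort_2002)):
        cost = effort * COST_PER_DESIGNER_MONTH
        print(f"{year}: {effort:,.0f} designer months, about ${cost / 1e6:,.0f}M")
```

Productivity rose by a factor of 50 over the period while chip size rose by a factor of 15,000, which is the source of the 300-fold increase in effort and cost.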

99 The mythical man-month
The situation is even worse than the productivity gap indicates
In theory, adding designers to a team reduces project completion time
In reality, productivity per designer decreases due to complexities of team management and communication
In the software community, known as “the mythical man-month” (Brooks 1975)
At some point, adding designers can actually lengthen project completion time! (“Too many cooks”)
Example
  – 1M transistors, 1 designer = 5000 trans/month
  – Each additional designer reduces per-designer productivity by 100 trans/month
  – So 2 designers produce 4900 trans/month each
[Figure: team and individual productivity (trans/month, 10,000 to 60,000) and months until completion (43, 24, 19, 16, 15, 16, 18, 23) versus number of designers (0 to 40)]
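The slide's model can be evaluated directly. The Python sketch below is an illustration; the function name is hypothetical, and the team sizes 5, 10, ..., 40 are an assumption chosen because they reproduce the completion times listed in the figure.

```python
def months_to_complete(designers: int,
                       transistors: int = 1_000_000,
                       base_rate: int = 5_000,
                       penalty: int = 100) -> float:
    """Completion time when each added designer lowers everyone's rate by `penalty`."""
    per_designer = base_rate - penalty * (designers - 1)
    if designers < 1 or per_designer <= 0:
        raise ValueError("team size outside the range where the model is meaningful")
    team_rate = designers * per_designer  # transistors per month for the whole team
    return transistors / team_rate


if __name__ == "__main__":
    for n in range(5, 45, 5):  # assumed team sizes
        print(f"{n:2d} designers: {months_to_complete(n):5.1f} months")
```

Team output peaks at roughly 25 designers in this model; beyond that, each new designer costs more in lost individual productivity than they contribute, so completion time rises again.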

100 Summary
Embedded systems are everywhere
Key challenge: optimization of design metrics
  – Design metrics compete with one another
A unified view of hardware and software is necessary to improve productivity
Three key technologies
  – Processor: general-purpose, application-specific, single-purpose
  – IC: full-custom, semi-custom, PLD
  – Design: compilation/synthesis, libraries/IP, test/verification

