Instructor: Dr. Phillip Jones

Presentation transcript:

CPRE 583 Reconfigurable Computing Lecture 26: Fri 12/9/2011 (Dr. Jones: PhD research) Instructor: Dr. Phillip Jones (phjones@iastate.edu) Reconfigurable Computing Laboratory Iowa State University Ames, Iowa, USA http://class.ee.iastate.edu/cpre583/

Announcements/Reminders Presentation and demo: Friday 12/16 (9:00am – 11:30am); 20 minutes for presentation, 5 minutes Q&A. Each team: send me 2 questions about their project. Class will be held in Howe 1324!! Final write up and submission due Monday after demos at midnight. HW3: will be assigned as extra credit. Weekly Project Updates due: Fridays (midnight)

Project Grading Breakdown 50% Final Project Demo. 30% Final Project Report (20% of your project report grade will come from your 5-6 project updates, due Fridays at midnight). 20% Final Project Presentation.

Project Ideas: Relevant conferences: FPL, FPT, FCCM, FPGA, DAC, ICCAD, Reconfig, RTSS, RTAS, ISCA, Micro, Super Computing, HPCA, IPDPS

Projects: Target Timeline Teams Formed and Topic: Mon 10/10 Project idea in Power Point 3-5 slides Motivation (why is this interesting, useful) What will be the end result High-level picture of final product Project team list: Name, Responsibility High-level Plan/Proposal: Fri 10/14 Power Point 5-10 slides (presentation to class Wed 10/19) System block diagrams High-level algorithms (if any) Concerns Implementation Conceptual Related research papers (if any)

Projects: Target Timeline Work on projects: 10/19 - 12/9 Weekly update reports More information on updates will be given Presentations: Finals week Present / Demo what is done at this point 15-20 minutes (depends on number of projects) Final write up and Software/Hardware turned in: Day of final (TBD)

Initial Project Proposal Slides (5-10 slides) Project team list: Name, Responsibility (who is project leader) Team size: 3-4 (5 case-by-case) Project idea Motivation (why is this interesting, useful) What will be the end result High-level picture of final product High-level Plan Break project into milestones Provide initial schedule: I would initially schedule aggressively to have project complete by Thanksgiving. Issues will pop up to cause the schedule to slip. System block diagrams High-level algorithms (if any) Concerns Implementation Conceptual Research papers related to your project idea

Weekly Project Updates The current state of your project write up Even in the early stages of the project you should be able to write a rough draft of the Introduction and Motivation section The current state of your Final Presentation Your Initial Project proposal presentation (Due Wed 10/19) should make for a starting point for your Final presentation What things are working & not working What roadblocks are you running into

Adaptive Thermoregulation for Applications on Reconfigurable Devices Phillip Jones Applied Research Laboratory Washington University Saint Louis, Missouri, USA http://www.arl.wustl.edu/arl/~phjones Iowa State University Seminar April 2008 Funded by NSF Grant ITR 0313203

What are FPGAs? FPGA: Field Programmable Gate Array Sea of general purpose logic gates CLB Configurable Logic Block

FPGA Usage Models: Fast Prototyping; Experimental ISA; Experimental Micro Architectures; Partial Reconfiguration; Run-time adaptation; Run-time Customization; System on Chip (SoC); CPU + Specialized HW (Sparc-V8 Leon); Parallel Applications; Image Processing; Computational Biology; Full Reconfiguration; Remote Update; Fault Tolerance

Some FPGA Details [Figure: array of CLBs]

Some FPGA Details 4 input Look Up Table (LUT): inputs A, B, C, D; output Z. The truth table lists Z for each ABCD value from 0000 to 1111.

Some FPGA Details 4 input Look Up Table configured as an AND gate: Z = 1 only for ABCD = 1111.

Some FPGA Details 4 input Look Up Table configured as an OR gate: Z = 0 only for ABCD = 0000.

Some FPGA Details 4 input Look Up Table configured as a 2:1 Mux on inputs B, C, D: input A is a don't-care (X000, X001, ..., X110, X111).
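
The LUT slides above can be mimicked in software. This is a minimal sketch, not FPGA tooling: a 4-input LUT is modeled as a 16-entry table addressed by ABCD. The helper names `make_lut` and `lut_read` are hypothetical, and the mux select polarity (B selects between C and D) is an assumption, since the slide text does not say which input is the select.

```python
from itertools import product


def make_lut(func):
    """Build the 16-entry truth table of a 4-input function of (a, b, c, d)."""
    return [func(a, b, c, d) for a, b, c, d in product((0, 1), repeat=4)]


def lut_read(table, a, b, c, d):
    """Read output Z: the inputs ABCD form a 4-bit address (A is the MSB)."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


# Same LUT hardware, three different configurations.
and_lut = make_lut(lambda a, b, c, d: a & b & c & d)
or_lut = make_lut(lambda a, b, c, d: a | b | c | d)
# 2:1 mux: A is a don't-care; B is assumed to be the select line.
mux_lut = make_lut(lambda a, b, c, d: d if b else c)

if __name__ == "__main__":
    for name, table in (("AND", and_lut), ("OR", or_lut), ("MUX", mux_lut)):
        print(name, "".join(str(bit) for bit in table))
```

Only the table contents change between the AND, OR, and mux cases, which is the point of the slides: the logic function is configuration data, not wiring.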

Some FPGA Details [Figure: CLB containing a LUT with inputs A, B, C, D and output Z]

Some FPGA Details PIP: Programmable Interconnection Point [Figure: CLBs and PIPs; CLB contents shown as a LUT (inputs B, C, D) and a DFF with output Z]

Outline Why Thermal Management? Measuring Temperature Thermally Driven Adaptation Experimental Results Temperature-Safe Real-time Systems Future Directions

Why Thermal Management?

Why Thermal Management? Location? Hot Cold Regulated

Why Thermal Management? Mobile? Hot Cold Regulated

Why Thermal Management? Reconfigurability FPGA Plasma Physics Microcontroller

Why Thermal Management? Exceptional Events

Local Experience Thermally aggressive application Disruption of air flow

Damaged Board (bottom view) Thermally aggressive application Disruption of air flow

Damaged Board (side view) Thermally aggressive application Disruption of air flow

Response to catastrophic thermal events Easy Fix Not Feasible!! Very Inconvenient

Solutions Over provision: Large heat sinks and fans. Restrict performance: Limiting operating frequency; Limit amount of chip utilization. Use thermal feedback (my approach): Dynamic operating frequency; Adaptive Computation; Shutdown device.

Outline Why Thermal Management? Measuring Temperature Thermally Driven Adaptation Experimental Results Temperature-Safe Real-time Systems Future Directions

Measuring Temperature FPGA

Measuring Temperature FPGA A/D 60 C

Background: Measuring Temperature FPGA [Figure: Temperature vs. Period] S. Lopez-Buedo, J. Garrido, and E. Boemo, “Thermal testing on reconfigurable computers,” IEEE Design and Test of Computers, vol. 17, pp. 84-91, 2000.

Background: Measuring Temperature FPGA [Figure: Temperature vs. Period, with Voltage]

Background: Measuring Temperature FPGA [Figure: Temperature vs. Period, with Voltage] “Adaptive Thermoregulation for Applications on Reconfigurable Devices”, by Phillip H. Jones, James Moscola, Young H. Cho, and John W. Lockwood; Field Programmable Logic and Applications (FPL’07), Amsterdam, Netherlands

Background: Measuring Temperature FPGA Mode 1 Core 1 Core 2 Temperature Core 3 Core 4 Period Frequency: High

Background: Measuring Temperature FPGA Mode 1 Mode 2 Core 1 Core 2 Temperature Core 3 Core 4 Period Frequency: High

Background: Measuring Temperature FPGA Mode 3 Mode 1 Mode 2 Core 1 Core 2 70C Temperature 40C Core 3 Core 4 Period 8,000 8,300 Frequency: Low Frequency: High

Background: Measuring Temperature FPGA Mode 3 Mode 1 Mode 2 Pause Sample Controller Core 1 Core 2 Temperature Core 3 Core 4 Period Frequency: High

Background: Measuring Temperature FPGA Mode 3 Mode 1 Mode 2 Pause Time out Counter Core 1 Core 2 Temperature Core 3 Core 4 Period Frequency: High

Background: Measuring Temperature FPGA Mode 1 Mode 2 Mode 3 Sample Mode Pause Time out Counter Core 1 Core 2 Core 3 Core 4 [Figure: Temperature vs. Period] Frequency: Low Frequency: High

Temperature Benchmark Circuits Desired Properties: Scalable (work over a wide range of frequencies; can easily increase or decrease circuit size). Simple to analyze (regular structure). Distributes evenly over chip (helps reduce thermal gradients that may cause damage to the chip). May serve as standard (further experimentation; repeatability of results). “A Thermal Management and Profiling Method for Reconfigurable Hardware Applications”, by Phillip H. Jones, John W. Lockwood, and Young H. Cho; Field Programmable Logic and Applications (FPL’06), Madrid, Spain

Temperature Benchmark Circuits Core Block (CB): Array of 48 LUTs and 48 DFFs [Figure: LUT/DFF grid with corners 00, 70, 05, 75]

Temperature Benchmark Circuits Core Block (CB): Array of 48 LUTs and 48 DFFs, RLOC: Row, Col from 0,0 to 7,5. Each LUT configured to be a 4-input AND gate. Thermal workload unit (Computation Row): Input Gen (1 LUT, 1 DFF), 8-bit bus, array of 18 core blocks CB 0 to CB 17 (864 LUTs, 864 DFFs).

Temperature Benchmark Circuits Core Block (CB): Array of 48 LUTs and 48 DFFs, RLOC: Row, Col from 0,0 to 7,5. Each LUT configured to be a 4-input AND gate. RLOC_ORIGIN: Row, Col. 100% Activation Rate. Thermal workload unit (Computation Row): Input Gen (1 LUT, 1 DFF; outputs 00, 01), 8-bit bus, array of 18 core blocks CB 0 to CB 17 (864 LUTs, 864 DFFs).
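
The resource counts on these slides follow from the block geometry. The sketch below just redoes that arithmetic, assuming the RLOC corners 0,0 and 7,5 mean 8 rows by 6 columns with one LUT and one DFF per position; the constant names are illustrative only.

```python
# Core Block (CB) geometry, read from the RLOC corners on the slide (assumed inclusive).
ROWS = 8          # rows 0..7
COLS = 6          # columns 0..5
CORE_BLOCKS = 18  # CB 0 .. CB 17 in one thermal workload unit

luts_per_core_block = ROWS * COLS                  # one LUT (and one DFF) per position
luts_in_array = CORE_BLOCKS * luts_per_core_block  # the core-block array
luts_per_workload_unit = luts_in_array + 1         # plus the Input Gen (1 LUT, 1 DFF)

if __name__ == "__main__":
    print("LUTs per core block:", luts_per_core_block)
    print("LUTs in 18-block array:", luts_in_array)
    print("LUTs per workload unit incl. Input Gen:", luts_per_workload_unit)
```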

Example Circuit Layout (Configuration 1x, 9% LUTs and DFFs) RLOC_ORIGIN: Row, Col (27,6) Thermal Workload Unit

Example Circuit Layout (Configuration 4x, 36% LUTs and DFFs)

Observed Temperature vs. Frequency T ~ P, P ~ F*C*V^2 [Figure: Steady-State Temperatures for Cfg1x, Cfg2x, Cfg4x, Cfg10x]

Observed Temperature vs. Active Area T ~ P, P ~ F*C*V^2 Max rated Tj 85 C [Figure: Steady-State Temperatures at 10 MHz, 25 MHz, 50 MHz, 100 MHz, 200 MHz]

Projecting Thermal Trajectories Estimate Steady State Temperature: Tj_ss = Power * θjA + TA, where θjA is the FPGA Thermal resistance (ºC/W); use measured power at t=0. Exponential specific equation: Temperature(t) = ½*(-41*e^(-t/20) + 71) + ½*(-41*e^(-t/180) + 71) (plot annotation: 5.4±.5)

Projecting Thermal Trajectories How long until 60 C? Exploit this phase for performance. Estimate Steady State Temperature: Tj_ss = Power * θjA + TA, where θjA is the FPGA Thermal resistance (ºC/W); use measured power at t=0. Exponential specific equation: Temperature(t) = ½*(-41*e^(-t/20) + 71) + ½*(-41*e^(-t/180) + 71) (plot annotation: 5.4±.5)
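
The "How long until 60 C?" question can be answered numerically from the fitted curve on the slide. The following is a sketch under stated assumptions: the exponents are read as e^(-t/20) and e^(-t/180), t is taken to be in the time unit of the original plot (presumably seconds), and the function names and the bisection search are illustrative choices, not the method used in the original work.

```python
import math


def temperature(t):
    """Two-exponential fit from the slide; starts at 30 C and approaches 71 C."""
    return 0.5 * (-41 * math.exp(-t / 20) + 71) + 0.5 * (-41 * math.exp(-t / 180) + 71)


def time_to_reach(target_c, lo=0.0, hi=2000.0, tol=1e-6):
    """Bisection on the monotonically rising curve; None if target is never reached."""
    if temperature(lo) >= target_c:
        return lo
    if temperature(hi) < target_c:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if temperature(mid) < target_c:
            lo = mid
        else:
            hi = mid
    return hi


def steady_state_tj(power_w, theta_ja_c_per_w, t_ambient_c):
    """Tj_ss = Power * theta_jA + TA, as stated on the slide."""
    return power_w * theta_ja_c_per_w + t_ambient_c


if __name__ == "__main__":
    print("T(0) =", temperature(0.0))
    print("time to reach 60 C:", time_to_reach(60.0))
    print("time to reach 75 C:", time_to_reach(75.0))  # above steady state
```

With this fit, 60 C is reached a little after t = 113, which is the window the slide suggests exploiting for performance before a thermal limit is hit.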

Thermal Shutdown Max Tj (70C)

Outline Why Thermal Management? Measuring Temperature Thermally Driven Adaptation Experimental Results Temperature-Safe Real-time Systems Future Directions

Image Correlation Application Template

Image Correlation Application Heats FPGA a lot! (> 85 C) Virtex-4 100FX Resource Utilization: Lookup Tables (LUTs) 57,461 (68%); D Flip Flops (DFFs) 49,148 (58%); Occupied Slices 32,868 (77%); Block RAM 44 (11%); Max Frequency 200 MHz

Application Infrastructure Temperature Sample Controller Thermoregulation Controller Pause 65 C Application Mode “Adaptive Thermoregulation for Applications on Reconfigurable Devices”, by Phillip H. Jones, James Moscola, Young H. Cho, and John W. Lockwood; Field Programmable Logic and Applications (FPL’07), Amsterdam, Netherlands

Application Specific Adaptation [Figure: Temperature Sample Controller and Thermoregulation Controller (65 C; Pause and Mode signals), Image Buffer, Image Processor Cores 1-4 each with Mask 1 and Mask 2, Score Out]

Application Specific Adaptation [Figure: same architecture with Frequency and Quality settings from the Thermoregulation Controller; Frequency steps 200, 180, 150, 100, 75, 50 MHz; Quality steps 8, 7, 6, 5, 4; masks grouped into High Priority Features and Low Priority Features]

Thermally Adaptive Frequency [Figure: Junction Temperature, Tj (C) vs. Time (s); High Frequency and Low Frequency phases; Thermal Budget = 72 C; Low Threshold = 67 C] “An Adaptive Frequency Control Method Using Thermal Feedback for Reconfigurable Hardware Applications”, by Phillip H. Jones, Young H. Cho, and John W. Lockwood; Field Programmable Technology (FPT’06), Bangkok, Thailand

Thermally Adaptive Frequency [Figure: Junction Temperature, Tj (C) vs. Time (s); High Frequency and Low Frequency phases; Thermal Budget = 72 C; Low Threshold = 67 C] S. Wang (“Reactive Speed Control”, ECRTS06)
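
One way to read the two thresholds on these slides is as a hysteresis band: drop to the low frequency when Tj reaches the thermal budget, and return to the high frequency once Tj falls to the low threshold. The sketch below encodes that reading. The switching rule is an assumption drawn from the plot labels, and `next_mode` and `run` are hypothetical names, not the controller from the cited paper.

```python
THERMAL_BUDGET_C = 72.0  # from the slide
LOW_THRESHOLD_C = 67.0   # from the slide


def next_mode(current_mode, tj_c):
    """Hysteresis rule (assumed): switch only when a threshold is crossed."""
    if current_mode == "high" and tj_c >= THERMAL_BUDGET_C:
        return "low"
    if current_mode == "low" and tj_c <= LOW_THRESHOLD_C:
        return "high"
    return current_mode


def run(samples_c, mode="high"):
    """Apply the rule to a sequence of junction temperature samples."""
    modes = []
    for tj in samples_c:
        mode = next_mode(mode, tj)
        modes.append(mode)
    return modes


if __name__ == "__main__":
    samples = [60, 70, 72, 70, 68, 67, 69]
    for tj, mode in zip(samples, run(samples)):
        print(f"Tj = {tj} C -> {mode} frequency")
```

The gap between the two thresholds keeps the clock from toggling on every sample when Tj hovers near the budget.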

Outline Why Thermal Management? Measuring Temperature Thermally Driven Adaptation Experimental Results Temperature-Safe Real-time Systems Future Directions

Platform Overview Virtex-4 FPGA Temperature Probe

Thermal Budget Efficiency [Figure: Junction Temperature (C), 30 to 70, vs. Thermal Condition (40 C, 35 C, 30 C, 25 C with 0 Fans; 25 C with 1 Fan; 25 C with 2 Fans); Adaptive vs. Fixed; Adaptive Thermal Budget (65 C); Adaptive operating points labeled by features and frequency, from 4 Features at 50 MHz up to 8 Features at 184 MHz (50, 65, 106, 184 MHz shown); Fixed: 8 Features at 200 MHz; annotation: 25 C Unused]

Conclusions Motivated the need for thermal management. Measuring temperature: application-dependent voltage variation effects; temperature benchmark circuits. Examined application specific adaptation for improving performance in dynamic thermal environments.

Outline Why Thermal Management? Measuring Temperature Thermally Driven Adaptation Experimental Results Temperature-Safe Real-time Systems Future Directions

Thermally Constrained Systems Space Craft Sun Earth

Thermally Constrained Systems

Temperature-Safe Real-time Systems Task scheduling is a concern in many embedded systems Goal: Satisfy thermal constraints without violating real-time constraints

How to manage temperature? Static frequency scaling. Sleep while idle. [Figure: timelines of tasks T1, T2, T3]

How to manage temperature? Static frequency scaling. Sleep while idle. Too hot? Deadlines could be missed. [Figure: timelines of tasks T1, T2, T3 with an Idle slot]

How to manage temperature? Static frequency scaling. Sleep while idle. Deadlines could be missed. Generalization: Idle task insertion. [Figure: timelines of tasks T1, T2, T3 with Idle slots inserted]

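Idle Task Insertion More Powerful
a. No idle task inserted. Tasks scheduled at F_max (100 MHz):
Period 30 s, Cost 10.0 s, Deadline 10.0 s, Utilization 33.33% (deadline equals cost, so frequency cannot be scaled or the task schedule becomes infeasible)
Period 120 s, Cost 30.0 s, Deadline 120 s, Utilization 25.00%
Period 480 s, Cost 30.0 s, Deadline 480 s, Utilization 6.25%
Period 960 s, Cost 20.0 s, Deadline 960 s, Utilization 2.08%
Total Utilization 66.66%
b. 1 idle task inserted. Tasks scheduled at F_max (100 MHz), 1 Idle Task:
Period 30 s, Cost 10.0 s, Deadline 10.0 s, Utilization 33.33%
Period 60 s, Cost 20.0 s, Deadline 60.0 s, Utilization 33.33% (idle task)
Period 120 s, Cost 30.0 s, Deadline 120 s, Utilization 25.00%
Period 480 s, Cost 30.0 s, Deadline 480 s, Utilization 6.25%
Period 960 s, Cost 20.0 s, Deadline 960 s, Utilization 2.08%
Total Utilization 99.99%
Idle task insertion: No impact on tasks’ cost. Higher priority task response times unaffected. Allows control over distribution of idle time.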
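
The utilization columns in the two task tables can be rechecked as cost divided by period. This is a minimal sketch: the four regular tasks are taken from table a, while the idle task's parameters (period 60 s, cost 20.0 s) are an assumption inferred from the 99.99% total in table b, whose layout did not survive extraction. The slide totals (66.66, 99.99) appear to come from summing the truncated per-task percentages.

```python
# (period_s, cost_s) pairs for the tasks scheduled at F_max (100 MHz).
TASKS = [(30, 10.0), (120, 30.0), (480, 30.0), (960, 20.0)]
IDLE_TASK = (60, 20.0)  # assumed values; see the note above


def utilization(task_set):
    """Processor utilization: sum of cost / period over all tasks."""
    return sum(cost / period for period, cost in task_set)


if __name__ == "__main__":
    for period, cost in TASKS:
        print(f"period {period:4d} s, cost {cost:4.1f} s -> {100 * cost / period:.2f}%")
    print(f"no idle task:  {100 * utilization(TASKS):.2f}%")
    print(f"one idle task: {100 * utilization(TASKS + [IDLE_TASK]):.2f}%")
```

The idle task soaks up the remaining third of the processor, so the idle time is placed by the scheduler at chosen points instead of accumulating wherever the task set happens to leave gaps.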

Sleep when idle is insufficient Temperature constraint = 65 C Peak Temperature = 70 C

Idle-task inserted Temperature constraint = 65 C Peak Temperature = 61 C

Idle-Task Insertion [Figure: System (task set) and Idle tasks feed a Scheduler (e.g. RMS); checks: Deadlines met? Temperature met? (Yes / No)] a. Original schedule does not meet temperature constraints. b. Use idle tasks to redistribute device idle time in order to reduce peak device temperature.

Related Research Power Management: EDF, Dynamic Frequency Scaling, Yao (FOCS’95); Worst Case Execution Time, Shin (DAC’99). Thermal Management: EDF, Minimize Temperature, Bansal (FOCS’04); RMS, Reactive Frequency, CIA, Wang (RTSS’06, ECRTS’06).

Outline Why Thermal Management? Measuring Temperature Thermally Driven Adaptation Experimental Results Conclusions Temperature-Safe Real-time Systems Future Directions

Research Fronts Near term: Exploration of adaptation techniques (Advanced FPGA reconfiguration capabilities; Other frequency adaptation techniques); Integration of temperature into real-time systems. Longer term: Cyber physical systems (NSF initiative).

Questions/Comments?

Temperature per Processing Core [Figure: Temperature vs. Number of Processing Cores; Junction Temperature (C), 35 to 70, vs. Number of Processing Cores, 1 to 4] Linear fits: S1: y = 2.21x + 60.1; S2: y = 2.24x + 57.1; S3: y = 2.23x + 52.1; S4: y = 2.07x + 44.2; S5: y = 1.43x + 37.5; S6: y = 1.22x + 34.0

Temperature Sample Mode

Ring Oscillator Thermometer Characteristics Thermometer size: ~100 LUTs. Ring oscillator size: 48 LUTs (47 NOT + 1 OR). Oscillation period: ~40 ns. Incrementer Cycle Period: ~.16 ms (40ns * 4096). Temperature resolution: .1ºC/count, or .1ºC/20ns.
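
The numbers on this slide fit together as follows. The sketch assumes the 4096 factor is the rollover of a 12-bit incrementer clocked by the ring oscillator, and that the resulting interval is measured with a 20 ns reference clock (the "20ns/count" unit used elsewhere in the deck). The function `delta_temperature_c` and its example counts are hypothetical; converting a count to an absolute temperature would additionally need a calibration point that the slides do not give.

```python
RING_PERIOD_NS = 40     # ~40 ns oscillation period (slide value)
INCREMENTER_BITS = 12   # 2**12 = 4096 ring cycles per measurement (assumed mapping)
REF_CLOCK_NS = 20       # reference clock period used for counting
DEG_C_PER_COUNT = 0.1   # resolution stated on the slide

incrementer_cycle_ns = RING_PERIOD_NS * 2 ** INCREMENTER_BITS
nominal_count = incrementer_cycle_ns // REF_CLOCK_NS


def delta_temperature_c(count_before, count_after):
    """Temperature change implied by a change in the measured count."""
    return (count_after - count_before) * DEG_C_PER_COUNT


if __name__ == "__main__":
    print("incrementer cycle period (ms):", incrementer_cycle_ns / 1e6)
    print("nominal count at 20 ns/count:", nominal_count)
    print("50 extra counts ->", delta_temperature_c(8200, 8250), "C")
```

The nominal count lands in the same range as the incrementer periods plotted on the temperature-vs-period slides (roughly 8100 to 8700 counts).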

Application Mode [Figure: Temperature vs. Incrementer Period (Measuring Temperature while Application Active); Temperature (C), 10 to 90, vs. Incrementer Period (20ns/count), 8100 to 8700; curves for Application Modes A, B, C; counts marked: 8235, 8425, 8620]

Application implementation statistics Virtex-4 100FX Resource Utilization: Lookup Tables (LUTs) 57,461 (68%); D Flip Flops (DFFs) 49,148 (58%); Occupied Slices 32,868 (77%); Block RAM 44 (11%); Max Frequency 200 MHz. Image Correlation Characteristics: Image Size (# pixels) 320x480; Pixel Resolution 8-bit (grey scale); # of Features 1 - 8; Image Processing Rate (Frames per second) 40.6 (at 200 MHz).

Application implementation statistics a.) VirtexE 2000 Resource Utilization: Lookup Tables (LUTs) 57,461 (68%); D Flip Flops (DFFs) 49,148 (58%); Occupied Slices 32,868 (15,808); Block RAM 26% (43); Max Frequency 125 MHz. b.) Image Correlation Characteristics: Image Size (# pixels) 640x480; Pixel Resolution 8-bit (grey scale); # of Mask Patterns 1 - 4; # of Templates 10 (in parallel); Image Processing Rate 12.7/second (at 125 MHz).

Scenario Descriptions (Scenario: Ambient Temperature, # of Fans) S1: 40 C (104 F), 0 fans. S2: 35 C (95 F), 0 fans. S3: 30 C (86 F), 0 fans. S4: 25 C (77 F), 0 fans. S5: 25 C (77 F), 1 fan. S6: 25 C (77 F), 2 fans.

High Level Architecture Application Pause Thermal Manager Frequency & Quality Controller Frequency mode Quality Temperature

Periodic Temperature Sampling Application Pause Thermal Manager 50 ms Event Counter Event Ring Oscillator Based Thermometer ready Sample Mode Controller Temperature Frequency & Quality capture Frequency mode Quality

Ring Oscillator Based Thermometer Reset 12-bit incrementer ring_clk MSB Edge Detect 14-bit Clk DFF reset 14 Temperature sel Ready mux

ASIC, GPP, FPGA Comparison Cost Performance Power Flexibility

Frequency Multiplexing Circuit Frequency Control Clk Multiplier (DLLs) clk clk to global clock tree 2:1 MUX 4xclk BUFG Current Virtex-4 platform uses glitch free BUFGMUX component

Thermally Adaptive Frequency High Frequency Thermal Budget = 72 C Junction Temperature, Tj (C) Low Frequency Low Threshold = 67 C Time (s)

Worst Case Thermal Condition Thermal Budget = 70 C. Thermally Safe Frequency: 50 MHz. Adaptive Frequency (30/120 MHz): 48.5 MHz.

Typical Thermal Condition Thermal Budget = 70 C. Thermally Safe Frequency: 50 MHz. Adaptive Frequency (30/120 MHz): 95 MHz.

Best Case Thermal Condition Thermal Budget = 70 C. Thermally Safe Frequency: 50 MHz. Adaptive Frequency (30/120 MHz): 119 MHz. 2.4x Factor Performance Increase.