Adaptive Thermoregulation for Applications on Reconfigurable Devices

1 Adaptive Thermoregulation for Applications on Reconfigurable Devices
Phillip Jones, Applied Research Laboratory, Washington University, Saint Louis, Missouri, USA
Iowa State University Seminar, April 2008
Funded by NSF Grant ITR

2 What are FPGAs?
FPGA: Field Programmable Gate Array
A sea of general purpose logic gates, arranged as an array of CLBs (Configurable Logic Blocks)

5 FPGA Usage Models
Fast Prototyping; System on Chip (SoC); Partial Reconfiguration; Full Reconfiguration; Parallel Applications
Examples shown: Experimental ISA; Experimental Micro Architectures; Run-time adaptation; Run-time Customization; CPU + Specialized HW (Sparc-V8 Leon); Image Processing; Computational Biology; Remote Update; Fault Tolerance

6 Some FPGA Details
[Figure: array of CLBs]

7 Some FPGA Details
4 input Look Up Table (LUT): inputs A, B, C, D; output Z
Truth table: one Z entry for each ABCD value, 0000, 0001, ..., 1110, 1111

8 Some FPGA Details
LUT configured as a 4-input AND: Z = 1 only for ABCD = 1111

9 Some FPGA Details
LUT configured as a 4-input OR: Z = 0 only for ABCD = 0000

10 Some FPGA Details
LUT configured as a 2:1 Mux on inputs B, C, D; input A is a don't-care (truth table rows X000, X001, ..., X110, X111)
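The LUT slides above can be illustrated with a small software model. This is a minimal sketch, not taken from the talk: the names `make_lut` and `lut_eval` are hypothetical, and the choice of B as the mux select line (picking D when 0 and C when 1) is an assumption, since the transcript does not say which input is the select.

```python
from itertools import product

def make_lut(func):
    """Build the 16-entry truth table of a 4-input function of (a, b, c, d)."""
    return [func(a, b, c, d) & 1 for a, b, c, d in product((0, 1), repeat=4)]

def lut_eval(table, a, b, c, d):
    """Read output Z; ABCD forms the 4-bit table address with A as the MSB."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]

# The same "hardware" (a 16-entry table) becomes a different gate
# depending only on the stored contents.
AND4 = make_lut(lambda a, b, c, d: a & b & c & d)
OR4 = make_lut(lambda a, b, c, d: a | b | c | d)
# Assumption: B selects between C (B=1) and D (B=0); A is ignored.
MUX2 = make_lut(lambda a, b, c, d: c if b else d)

if __name__ == "__main__":
    print("AND4:", AND4)
    print("OR4: ", OR4)
    print("MUX2:", MUX2)
```

Reconfiguring the device amounts to rewriting table contents such as these, which is why one LUT can serve as an AND, an OR, or a mux.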

11 Some FPGA Details
[Figure: CLB containing a LUT with inputs A, B, C, D and output Z]

12 Some FPGA Details
[Figure: CLB containing a LUT and a DFF; CLBs are connected through PIPs (Programmable Interconnection Points)]

14 Outline
Why Thermal Management?
Measuring Temperature
Thermally Driven Adaptation
Experimental Results
Temperature-Safe Real-time Systems
Future Directions

16 Why Thermal Management?
Location? Hot, cold, or regulated

17 Why Thermal Management?
Mobile? Hot, cold, or regulated

18 Why Thermal Management?
Reconfigurability: one FPGA, different applications (e.g. Plasma Physics, Microcontroller)

19 Why Thermal Management?
Exceptional Events

21 Local Experience
Thermally aggressive application
Disruption of air flow

22 Damaged Board (bottom view)

23 Damaged Board (side view)

24 Response to catastrophic thermal events
[Figure labels: Easy Fix; Not Feasible!!; Very Inconvenient]

25 Solutions
Over provision: large heat sinks and fans
Restrict performance: limit operating frequency; limit amount of chip utilization
Use thermal feedback (my approach): dynamic operating frequency; adaptive computation; shutdown device

27 Measuring Temperature
[Figure: FPGA temperature read through an A/D converter; example reading 60 C]

29 Background: Measuring Temperature
[Figure: ring oscillator on the FPGA; oscillation period plotted against temperature]
S. Lopez-Buedo, J. Garrido, and E. Boemo, "Thermal testing on reconfigurable computers," IEEE Design and Test of Computers, vol. 17, 2000.

32 Background: Measuring Temperature
[Figure: the oscillation period depends on voltage as well as temperature]

34 Background: Measuring Temperature
"Adaptive Thermoregulation for Applications on Reconfigurable Devices", by Phillip H. Jones, James Moscola, Young H. Cho, and John W. Lockwood; Field Programmable Logic and Applications (FPL'07), Amsterdam, Netherlands

35 Background: Measuring Temperature
[Figure: FPGA running four cores (Core 1 to Core 4); temperature vs. period curves per application mode (Mode 1, then Mode 2 and Mode 3) and per clock frequency (High, Low); axis labels: temperature 40C to 70C, period 8,000 to 8,300]

38 Background: Measuring Temperature
[Figure: Sample Controller with a Pause signal to the cores]

39 Background: Measuring Temperature
[Figure: Time out Counters added alongside the Sample Controller and Pause signal]

42 Background: Measuring Temperature
[Figure: Sample Mode added alongside Modes 1 to 3]

43 Temperature Benchmark Circuits
Desired properties:
Scalable: work over a wide range of frequencies; can easily increase or decrease circuit size
Simple to analyze: regular structure
Distributes evenly over chip: helps reduce thermal gradients that may cause damage to the chip
May serve as a standard: further experimentation; repeatability of results
"A Thermal Management and Profiling Method for Reconfigurable Hardware Applications", by Phillip H. Jones, John W. Lockwood, and Young H. Cho; Field Programmable Logic and Applications (FPL'06), Madrid, Spain

44 Temperature Benchmark Circuits
Core Block (CB): array of 48 LUTs and 48 DFFs, placed with RLOC (Row, Col) from 0,0 to 7,5

45 Temperature Benchmark Circuits
Each LUT is configured to be a 4-input AND gate
Thermal workload unit (Computation Row): Input Gen (1 LUT, 1 DFF) feeding an array of 18 core blocks, CB 0 to CB 17 (864 LUTs, 864 DFFs)

46 Temperature Benchmark Circuits
Placement of each thermal workload unit is set with RLOC_ORIGIN (Row, Col)
100% Activation Rate

47 Example Circuit Layout (Configuration 1x, 9% LUTs and DFFs)
RLOC_ORIGIN: Row, Col (27,6) Thermal Workload Unit

48 Example Circuit Layout (Configuration 4x, 36% LUTs and DFFs)

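49 Observed Temperature vs. Frequency
$T \propto P$, $P \propto F \cdot C \cdot V^2$
[Figure: steady-state temperatures vs. frequency for configurations Cfg1x, Cfg2x, Cfg4x, Cfg10x]

50 Observed Temperature vs. Active Area
$T \propto P$, $P \propto F \cdot C \cdot V^2$
[Figure: steady-state temperatures vs. active area at 10 MHz, 25 MHz, 50 MHz, 100 MHz, and 200 MHz; max rated Tj 85 C]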
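The proportionality on these two slides (dynamic power scales with frequency, switched capacitance, and the square of voltage) can be made concrete with a ratio calculation. This is only an illustrative sketch; the function name `relative_dynamic_power` and its parameters are hypothetical, and it models dynamic power alone.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio=1.0, cap_ratio=1.0):
    """Dynamic power relative to a baseline, using P ~ F * C * V^2.

    Each argument is new_value / baseline_value. Static (leakage) power
    is not modeled here.
    """
    return freq_ratio * cap_ratio * voltage_ratio ** 2

if __name__ == "__main__":
    # Halving the clock at fixed voltage and active area halves dynamic power.
    print(relative_dynamic_power(0.5))
    # Doubling active area (switched capacitance) at the same clock doubles it.
    print(relative_dynamic_power(1.0, cap_ratio=2.0))
```

Under this model, frequency and active area are the two knobs the benchmark configurations vary, which matches the two plots.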

51 Projecting Thermal Trajectories
Estimate steady-state temperature: $T_{j,ss} = \text{Power} \cdot \theta_{jA} + T_A$
$\theta_{jA}$ is the FPGA thermal resistance (ºC/W)
Use measured power at t=0
Exponential specific equation: $T(t) = \tfrac{1}{2}\left(-41\,e^{-t/20} + 71\right) + \tfrac{1}{2}\left(-41\,e^{-t/180} + 71\right)$
[Figure annotation: 5.4±.5]

52 Projecting Thermal Trajectories
How long until 60 C?
Exploit this phase for performance
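The "How long until 60 C?" question can be answered numerically from the fitted equation on the slide. The sketch below uses bisection, which is my choice of method and not something the talk specifies; the function names are hypothetical, and the constants (-41, 71, and the 20 s and 180 s time constants) are copied from the slide.

```python
import math

def temperature(t):
    """Junction temperature (C) at time t (s), per the slide's fitted equation."""
    fast = -41 * math.exp(-t / 20) + 71
    slow = -41 * math.exp(-t / 180) + 71
    return 0.5 * fast + 0.5 * slow

def steady_state(power_w, theta_ja, t_ambient_c):
    """Tj_ss = Power * theta_jA + T_A."""
    return power_w * theta_ja + t_ambient_c

def time_to_reach(target_c, lo=0.0, hi=2000.0, tol=1e-6):
    """Bisection on the rising curve; returns None if target is not reached by hi."""
    if temperature(lo) >= target_c:
        return lo
    if temperature(hi) < target_c:
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if temperature(mid) < target_c:
            lo = mid
        else:
            hi = mid
    return hi

if __name__ == "__main__":
    print(f"T(0) = {temperature(0):.1f} C")
    print(f"time to 60 C = {time_to_reach(60):.1f} s")
```

With these constants the curve starts at 30 C, approaches 71 C, and crosses 60 C after roughly 113 s, which is the window the slide suggests exploiting for performance.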

53 Thermal Shutdown Max Tj (70C)

55 Image Correlation Application
Template

56 Image Correlation Application
Heats FPGA a lot! (> 85 C)
Virtex-4 100FX Resource Utilization
Lookup Tables (LUTs): 57,461 (68%)
D Flip Flops (DFFs): 49,148 (58%)
Occupied Slices: 32,868 (77%)
Block RAM: 44 (11%)
Max Frequency: 200 MHz

57 Application Infrastructure
[Block diagram: Temperature Sample Controller and Thermoregulation Controller (threshold 65 C) drive the Pause and Mode inputs of the Application]
"Adaptive Thermoregulation for Applications on Reconfigurable Devices", by Phillip H. Jones, James Moscola, Young H. Cho, and John W. Lockwood; Field Programmable Logic and Applications (FPL'07), Amsterdam, Netherlands

58 Application Specific Adaptation
[Block diagram: Temperature Sample Controller and Thermoregulation Controller (65 C) with Pause and Mode signals; Image Buffer feeding Image Processor Cores 1 to 4, each with Mask 1 and Mask 2; Score Out]

59 Application Specific Adaptation
The Thermoregulation Controller sets Frequency (200 MHz) and Quality (8)

60 Application Specific Adaptation
Masks are split into High Priority Features and Low Priority Features

61 Application Specific Adaptation
Frequency: 200, 180, 150, 100 MHz; Quality: 8

62 Application Specific Adaptation
Frequency: 100, 75, 50 MHz; Quality: 8

63 Application Specific Adaptation
Frequency: 50 MHz; Quality: 8, 7, 6, 5, 4

64 Application Specific Adaptation
Frequency: 200, 180, 150, 100, 75, 50 MHz; Quality: 8, 7, 6, 5, 4
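The slide sequence suggests a ladder of operating points: frequency steps down from 200 to 50 MHz first, then quality (number of features) steps down from 8 to 4. The sketch below encodes that reading. It is an assumption drawn from the order of the slides, not the controller from the paper; `LADDER`, `next_level`, and the `margin_c` hysteresis parameter are hypothetical.

```python
FREQS_MHZ = [200, 180, 150, 100, 75, 50]
QUALITIES = [8, 7, 6, 5, 4]

# Assumed ordering: lower frequency at full quality first, then shed
# low-priority features once the lowest frequency is reached.
LADDER = [(f, QUALITIES[0]) for f in FREQS_MHZ] + \
         [(FREQS_MHZ[-1], q) for q in QUALITIES[1:]]

def next_level(level, tj_c, threshold_c=65.0, margin_c=2.0):
    """Move one rung down the ladder when hot, one rung up when clearly cool."""
    if tj_c > threshold_c:
        return min(level + 1, len(LADDER) - 1)
    if tj_c < threshold_c - margin_c:  # hypothetical margin to avoid chatter
        return max(level - 1, 0)
    return level

if __name__ == "__main__":
    level = 0
    for tj in [60, 66, 67, 68, 66, 64, 62, 61]:
        level = next_level(level, tj)
        freq, quality = LADDER[level]
        print(f"Tj={tj} C -> {freq} MHz, {quality} features")
```

Ordering frequency before quality keeps results complete for as long as possible; an application that values throughput over completeness could reorder the ladder.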

65 Thermally Adaptive Frequency
[Figure: junction temperature, Tj (C), vs. time (s); the clock alternates between High Frequency and Low Frequency; Thermal Budget = 72 C; Low Threshold = 67 C]
"An Adaptive Frequency Control Method Using Thermal Feedback for Reconfigurable Hardware Applications", by Phillip H. Jones, Young H. Cho, and John W. Lockwood; Field Programmable Technology (FPT'06), Bangkok, Thailand

67 Thermally Adaptive Frequency
Related: S. Wang ("Reactive Speed Control", ECRTS06)
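The two labeled levels on this plot (Thermal Budget = 72 C, Low Threshold = 67 C) read like a two-threshold hysteresis rule: drop to the low frequency at the budget, return to the high frequency at the low threshold. The following is a minimal sketch of that rule under that assumption; the function names are hypothetical and the paper's actual controller may differ.

```python
def next_frequency_mode(tj_c, mode, budget_c=72.0, low_threshold_c=67.0):
    """Two-threshold (hysteresis) rule for choosing the clock mode."""
    if tj_c >= budget_c:
        return "low"       # at or over budget: cool down
    if tj_c <= low_threshold_c:
        return "high"      # cooled enough: resume full speed
    return mode            # between thresholds: keep the current mode

def run(samples, mode="high"):
    """Apply the rule to a sequence of junction temperature samples (C)."""
    trace = []
    for tj in samples:
        mode = next_frequency_mode(tj, mode)
        trace.append(mode)
    return trace

if __name__ == "__main__":
    print(run([60, 70, 72, 70, 68, 67, 69]))
```

The 5 C gap between the two thresholds keeps the clock from toggling on every sample when the temperature hovers near the budget.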

69 Platform Overview Virtex-4 FPGA Temperature Probe

70 Thermal Budget Efficiency
[Figure: junction temperature (C), axis 30 to 70, for Adaptive vs. Fixed operation under six thermal conditions: 40 C, 35 C, 30 C, and 25 C with 0 fans, 25 C with 1 fan, 25 C with 2 fans. Adaptive thermal budget: 65 C; the gap below the budget in the fixed case is marked Unused. Operating points labeled on the plot: 50 MHz with 4 features, 50 MHz, 65 MHz, 106 MHz, 184 MHz, 200 MHz.]

71 Conclusions
Motivated the need for thermal management
Measuring temperature: effects of application-dependent voltage variations; temperature benchmark circuits
Examined application-specific adaptation for improving performance in dynamic thermal environments

73 Thermally Constrained Systems
Space Craft Sun Earth

74 Thermally Constrained Systems

75 Temperature-Safe Real-time Systems
Task scheduling is a concern in many embedded systems Goal: Satisfy thermal constraints without violating real-time constraints

76 How to manage temperature?
Static frequency scaling
Sleep while idle
[Figure: timelines of tasks T1, T2, T3 under each approach]

77 How to manage temperature?
Concerns: Too hot? Deadlines could be missed

78 How to manage temperature?
Generalization: Idle task insertion
[Figure: timeline with idle periods placed between T1, T2, and T3]

79 Idle Task Insertion More Powerful
a. No idle task inserted: tasks scheduled at F_max (100 MHz)
Period (s)   Cost (s)   Deadline (s)   Utilization (%)
30           10.0       10.0           33.33
120          30.0       120            25.00
480          30.0       480            6.25
960          20.0       960            2.08
Total                                  66.66
For the first task the deadline equals the cost, so the frequency cannot be scaled or the task schedule becomes infeasible.
b. 1 idle task inserted: tasks scheduled at F_max (100 MHz), 1 idle task
Period (s)   Cost (s)   Deadline (s)   Utilization (%)
30           10.0       10.0           33.33
60           20.0       60.0           33.33   (idle task)
120          30.0       120            25.00
480          30.0       480            6.25
960          20.0       960            2.08
Total                                  99.99
Idle task insertion: no impact on tasks' cost; higher priority task response times unaffected; allows control over distribution of idle time
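The utilization columns on this slide follow from cost divided by period for each task. The short sketch below reproduces the totals; the variable and function names are hypothetical, and the idle task parameters (period 60 s, cost 20 s) are read from the slide's second table.

```python
# (period_s, cost_s) pairs from the slide, tasks scheduled at F_max (100 MHz)
TASKS = [(30, 10.0), (120, 30.0), (480, 30.0), (960, 20.0)]
IDLE_TASK = (60, 20.0)  # the single inserted idle task

def utilization(tasks):
    """Total processor utilization as a fraction: sum of cost / period."""
    return sum(cost / period for period, cost in tasks)

if __name__ == "__main__":
    for period, cost in TASKS + [IDLE_TASK]:
        print(f"period={period:4d} s  cost={cost:4.1f} s  U={100 * cost / period:5.2f}%")
    print(f"no idle task:  {100 * utilization(TASKS):.2f}%")
    print(f"1 idle task:   {100 * utilization(TASKS + [IDLE_TASK]):.2f}%")
```

The exact totals are two thirds and one; the slide's 66.66 and 99.99 come from summing the per-task percentages after truncation. The idle task therefore absorbs all remaining processor time, so where it sits in the schedule controls when the device cools.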

80 Sleep when idle is insufficient
Temperature constraint = 65 C; Peak Temperature = 70 C

81 Idle-task inserted
Temperature constraint = 65 C; Peak Temperature = 61 C

82 Idle-Task Insertion
[Flowchart: System (task set) + idle tasks -> Scheduler (e.g. RMS) -> Deadlines met? (Yes/No) -> Temperature met? (Yes/No)]
a. Original schedule does not meet temperature constraints
b. Use idle tasks to redistribute device idle time in order to reduce peak device temperature

83 Related Research
Power Management:
EDF, Dynamic Frequency Scaling: Yao (FOCS'95)
Worst Case Execution Time: Shin (DAC'99)
Thermal Management:
EDF, Minimize Temperature: Bansal (FOCS'04)
RMS, Reactive Frequency, CIA: Wang (RTSS'06, ECRTS'06)

84 Outline
Why Thermal Management?
Measuring Temperature
Thermally Driven Adaptation
Experimental Results
Conclusions
Temperature-Safe Real-time Systems
Future Directions

85 Research Fronts
Near term:
Exploration of adaptation techniques (advanced FPGA reconfiguration capabilities; other frequency adaptation techniques)
Integration of temperature into real-time systems
Longer term:
Cyber physical systems (NSF initiative)

86 Questions/Comments?

87 Temperature per Processing Core
[Figure: junction temperature (C), axis 35 to 70, vs. number of processing cores (1 to 4) for scenarios S1 to S6. Fitted line slopes: S1 y = 2.21x, S2 y = 2.24x, S3 y = 2.23x, S4 y = 2.07x, S5 y = 1.43x, S6 y = 1.22x]

88 Temperature Sample Mode

89 Ring Oscillator Thermometer Characteristics
Thermometer size: ~100 LUTs
Ring oscillator size: 48 LUTs (47 NOT + 1 OR)
Oscillation period: ~40 ns
Incrementer cycle period: ~.16 ms (40 ns * 4096)
Temperature resolution: .1ºC/count, or .1ºC/20 ns
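The figures on this slide fit together arithmetically: 4096 ring cycles of about 40 ns each take about 0.16 ms, and measuring that interval in 20 ns counts gives a nominal count of 8192. The sketch below works through this and adds a linear count-to-temperature conversion. The conversion is an assumption for illustration: `count_to_temp` and its reference point are hypothetical, since the slide gives only the resolution, not a calibration offset.

```python
RING_PERIOD_NS = 40.0        # approximate oscillation period from the slide
INCREMENTER_CYCLES = 4096    # ring cycles per measurement (40 ns * 4096)
COUNT_PERIOD_NS = 20.0       # one count = 20 ns
DEG_C_PER_COUNT = 0.1        # temperature resolution from the slide

incrementer_period_ns = RING_PERIOD_NS * INCREMENTER_CYCLES
nominal_count = incrementer_period_ns / COUNT_PERIOD_NS

def count_to_temp(count, ref_count, ref_temp_c):
    """Linear conversion around a calibration point (hypothetical values)."""
    return ref_temp_c + DEG_C_PER_COUNT * (count - ref_count)

if __name__ == "__main__":
    print(f"incrementer period: {incrementer_period_ns / 1e6:.3f} ms")
    print(f"nominal count:      {nominal_count:.0f}")
    # Example calibration point chosen only for illustration.
    print(f"count 8300 -> {count_to_temp(8300, 8000, 40.0):.1f} C")
```

A hotter die slows the ring oscillator, so the incrementer takes longer to wrap and the count rises, which is why the count can stand in for temperature once a reference point is known.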

90 Temperature vs. Incrementer Period (Measuring Temperature while Application Active)
[Figure: temperature (C), axis 10 to 90, vs. incrementer period (20 ns/count), axis 8100 to 8700, for Application Modes A, B, and C; annotated counts: 8235, 8425, 8620]

91 Application implementation statistics
Virtex-4 100FX Resource Utilization
Lookup Tables (LUTs): 57,461 (68%)
D Flip Flops (DFFs): 49,148 (58%)
Occupied Slices: 32,868 (77%)
Block RAM: 44 (11%)
Max Frequency: 200 MHz
Image Correlation Characteristics
Image Size (# pixels): 320x480
Pixel Resolution: 8-bit (grey scale)
# of Features: 1 - 8
Image Processing Rate (Frames per second): 40.6 (at 200 MHz)

92 Application implementation statistics
a.) VirtexE 2000 Resource Utilization
Lookup Tables (LUTs): 57,461 (68%)
D Flip Flops (DFFs): 49,148 (58%)
Occupied Slices: 32,868 (15,808)
Block RAM: 26% (43)
Max Frequency: 125 MHz
b.) Image Correlation Characteristics
Image Size (# pixels): 640x480
Pixel Resolution: 8-bit (grey scale)
# of Mask Patterns: 1 - 4
# of Templates: 10 (in parallel)
Image Processing Rate: 12.7/second (at 125 MHz)

93 Scenario Descriptions
Scenario   Ambient Temperature   # of Fans
S1         40 C (104 F)          0
S2         35 C (95 F)           0
S3         30 C (86 F)           0
S4         25 C (77 F)           0
S5         25 C (77 F)           1
S6         25 C (77 F)           2

94 High Level Architecture
[Block diagram: Thermal Manager and Frequency & Quality Controller; signals to the Application: Pause, Frequency mode, Quality; input: Temperature]

95 Periodic Temperature Sampling
[Block diagram: a 50 ms Event Counter triggers the Sample Mode Controller, which pauses the Application and captures a reading from the Ring Oscillator Based Thermometer (ready, Temperature); the Frequency & Quality Controller then sets Frequency mode and Quality]

96 Ring Oscillator Based Thermometer
[Block diagram: ring_clk drives a 12-bit incrementer; an Edge Detect on its MSB captures a 14-bit counter clocked by Clk into a DFF; outputs: 14-bit Temperature and Ready; inputs: Reset, sel (mux)]

97 ASIC, GPP, FPGA Comparison
Cost Performance Power Flexibility

98 Frequency Multiplexing Circuit
[Block diagram: a Clk Multiplier (DLLs) produces 4xclk from clk; Frequency Control selects clk or 4xclk through a 2:1 MUX, which feeds a BUFG to the global clock tree]
Current Virtex-4 platform uses glitch free BUFGMUX component

102 Worst Case Thermal Condition
Thermal Budget = 70 C
Thermally Safe Frequency: 50 MHz

104 Worst Case Thermal Condition
Thermal Budget = 70 C
Adaptive Frequency (30/120 MHz): 48.5 MHz
Thermally Safe Frequency: 50 MHz

106 Typical Thermal Condition
Thermal Budget = 70 C
Adaptive Frequency (30/120 MHz): 95 MHz
Thermally Safe Frequency: 50 MHz

108 Best Case Thermal Condition
Thermal Budget = 70 C
Adaptive Frequency (30/120 MHz): 119 MHz
Thermally Safe Frequency: 50 MHz
2.4x Factor Performance Increase
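The reported adaptive frequencies can be related to the 30/120 MHz clock pair with a little arithmetic. The sketch below assumes the reported figure is a time-weighted average of the two clocks, which the slides do not state explicitly; the function names are hypothetical.

```python
def high_clock_duty(avg_mhz, low_mhz=30.0, high_mhz=120.0):
    """Fraction of time at the high clock that yields the given average."""
    return (avg_mhz - low_mhz) / (high_mhz - low_mhz)

def speedup(adaptive_mhz, safe_mhz=50.0):
    """Performance factor relative to the fixed thermally safe frequency."""
    return adaptive_mhz / safe_mhz

if __name__ == "__main__":
    for label, avg in [("worst", 48.5), ("typical", 95.0), ("best", 119.0)]:
        print(f"{label:8s} avg={avg:5.1f} MHz  "
              f"high-clock duty={100 * high_clock_duty(avg):5.1f}%  "
              f"speedup={speedup(avg):.2f}x")
```

Under this reading the best case spends nearly all of its time at 120 MHz, and 119 / 50 = 2.38 matches the roughly 2.4x increase on the slide, while the worst case gives up only about 3% against the fixed 50 MHz design.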

