Published by Mercy Maxwell. Modified over 8 years ago.
1
Adaptive-Latency DRAM
Optimizing DRAM Timing for the Common-Case
Donghyuk Lee, Yoongu Kim, Gennady Pekhimenko, Samira Khan, Vivek Seshadri, Kevin Chang, Onur Mutlu
2
[Figure: x86 CPU with memory controller (MemCtrl) connected to a DDR3 1600MT/s DRAM module. Workloads shown: SPEC mcf, Parsec, GUPS, Memcached, Apache]
Timing parameters (11-11-28): SPEC mcf runtime 527min
Timing parameters (8-8-19): SPEC mcf runtime 477min, -10.5% (no error)
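To make the two timing sets concrete, the sketch below converts them from clock cycles to nanoseconds. It assumes the three numbers are read as tRCD-tRP-tRAS (the slide does not label them) and uses the 800MHz clock that the deck gives for DDR3-1600 in its evaluation setup. The function and variable names are illustrative.

```python
# Sketch: convert DDR3 timing parameters from clock cycles to nanoseconds.
# Assumption: the slide's three numbers are (tRCD, tRP, tRAS), in that order.

TRANSFER_RATE_MT_S = 1600
CLOCK_MHZ = TRANSFER_RATE_MT_S / 2   # double data rate: 800 MHz clock
CYCLE_NS = 1000 / CLOCK_MHZ          # 1.25 ns per clock cycle


def cycles_to_ns(cycles):
    """Return the duration of `cycles` clock cycles in nanoseconds."""
    return cycles * CYCLE_NS


standard = {"tRCD": 11, "tRP": 11, "tRAS": 28}
reduced = {"tRCD": 8, "tRP": 8, "tRAS": 19}

for name in standard:
    before = cycles_to_ns(standard[name])
    after = cycles_to_ns(reduced[name])
    saving = 100 * (before - after) / before
    print(f"{name}: {before:.2f} ns -> {after:.2f} ns ({saving:.1f}% lower)")
```

Under these assumptions, one cycle is 1.25 ns, so 11 cycles correspond to 13.75 ns and 28 cycles to 35 ns.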
3
Reducing DRAM Timing
Why can we reduce DRAM timing parameters without any errors?
4
Executive Summary
Observations
–DRAM timing parameters are dictated by the worst-case cell (smallest cell across all products at highest temperature)
–DRAM operates at lower temperature than the worst case
Idea: Adaptive-Latency DRAM
–Optimizes DRAM timing parameters for the common case (typical DIMM operating at low temperatures)
Analysis: Characterization of 115 DIMMs
–Great potential to lower DRAM timing parameters (17–54%) without any errors
Real System Performance Evaluation
–Significant performance improvement (14% for memory-intensive workloads) without errors (33 days)
5
1. DRAM Operation Basics
2. Reasons for Timing Margin in DRAM
3. Key Observations
4. Adaptive-Latency DRAM
5. DRAM Characterization
6. Real System Performance Evaluation
6
DRAM Stores Data as Charge
[Figure: DRAM cell and sense-amplifier]
Three steps of charge movement
1. Sensing
2. Restore
3. Precharge
7
DRAM Charge over Time
[Figure: charge of the cell and the sense-amplifier over time for data 0 and data 1, through sensing and restore; timing parameters in theory vs. in practice, with the difference marked as margin]
Why does DRAM need the extra timing margin?
8
1. DRAM Operation Basics
2. Reasons for Timing Margin in DRAM
3. Key Observations
4. Adaptive-Latency DRAM
5. DRAM Characterization
6. Real System Performance Evaluation
9
Two Reasons for Timing Margin
1. Process Variation
–DRAM cells are not equal
–Leads to extra timing margin for a cell that can store a large amount of charge
2. Temperature Dependence
–DRAM leaks more charge at higher temperature
–Leads to extra timing margin when operating at low temperature
10
DRAM Cells are Not Equal
Ideal: same size, same charge, same latency
Real: different size, different charge, different latency (from the largest cell to the smallest cell)
Large variation in cell size
Large variation in charge
Large variation in access latency
11
Process Variation
[Figure: DRAM cell with capacitor, contact, access transistor and bitline]
❶ Cell Capacitance: small cell capacitance (a small cell can store only a small charge)
❷ Contact Resistance: high contact resistance
❸ Transistor Performance: slow access transistor
High access latency
12
Two Reasons for Timing Margin
1. Process Variation
–DRAM cells are not equal
–Leads to extra timing margin for a cell that can store a large amount of charge
2. Temperature Dependence
–DRAM leaks more charge at higher temperature
–Leads to extra timing margin for cells that operate at low temperature
13
Room temperature: small leakage
Hot temperature (85°C): large leakage
Cells store small charge at high temperature and large charge at low temperature
Large variation in access latency
14
DRAM Timing Parameters
DRAM timing parameters are dictated by the worst-case
–The smallest cell with the smallest charge in all DRAM products
–Operating at the highest temperature
Large timing margin for the common-case
15
Our Approach
We optimize DRAM timing parameters for the common-case
–The smallest cell with the smallest charge in a DRAM module
–Operating at the current temperature
The common-case cell has more charge than the worst-case cell
Can lower latency for the common-case
16
1. DRAM Operation Basics
2. Reasons for Timing Margin in DRAM
3. Key Observations
4. Adaptive-Latency DRAM
5. DRAM Characterization
6. Real System Performance Evaluation
17
Key Observations
1. Sensing: sense cells with extra charge faster → lower sensing latency
2. Restore: no need to fully restore cells with extra charge → lower restore latency
3. Precharge: no need to fully precharge bitlines for cells with extra charge → lower precharge latency
18
Observation 1. Faster Sensing
Typical DIMM at low temperature: more charge → strong charge flow → faster sensing
115 DIMM characterization: timing (tRCD) 17% ↓, no errors
19
Observation 2. Reducing Restore Time
Typical DIMM at low temperature: larger cell & less leakage → extra charge → no need to fully restore charge
115 DIMM characterization: read (tRAS) 37% ↓, write (tWR) 54% ↓, no errors
20
Observation 3. Reducing Precharge Time
[Figure: bitline and sense-amplifier during sensing and precharge; bitline level shown between empty (0V), half and full (Vdd)]
Precharge? –Setting bitline to half-full charge
Typical DIMM at lower temperature
21
Observation 3. Reducing Precharge Time
[Figure: bitline not fully precharged, shown between empty (0V), half and full (Vdd), when accessing an empty cell and when accessing a full cell]
Typical DIMM at lower temperature: more charge → strong sensing → precharge time reduction
115 DIMM characterization: timing (tRP) 35% ↓, no errors
22
Key Observations
1. Sensing: sense cells with extra charge faster → lower sensing latency
2. Restore: no need to fully restore cells with extra charge → lower restore latency
3. Precharge: no need to fully precharge bitlines for cells with extra charge → lower precharge latency
23
1. DRAM Operation Basics
2. Reasons for Timing Margin in DRAM
3. Key Observations
4. Adaptive-Latency DRAM
5. DRAM Characterization
6. Real System Performance Evaluation
24
Adaptive-Latency DRAM
Key idea
–Optimize DRAM timing parameters online
Two components
–DRAM manufacturer profiles multiple sets of reliable DRAM timing parameters at different temperatures for each DIMM
–System monitors DRAM temperature & uses appropriate DRAM timing parameters
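The two components can be sketched as a lookup: a per-DIMM table of timing sets indexed by temperature, and a selection function the system calls with the measured temperature. This is a minimal illustration, not the authors' implementation. The table layout, the `select_timings` name, the 55°C bound and the timing values attached to it are all assumptions made for the example.

```python
from bisect import bisect_left

# Hypothetical per-DIMM profile: (upper temperature bound in °C, timings in cycles).
# The values are illustrative only, not measured data.
PROFILE = [
    (55, {"tRCD": 8, "tRP": 8, "tRAS": 19}),
    (85, {"tRCD": 11, "tRP": 11, "tRAS": 28}),
]

# Standard timings, used whenever the temperature is outside the profiled range.
STANDARD = {"tRCD": 11, "tRP": 11, "tRAS": 28}


def select_timings(temp_c, profile=PROFILE, fallback=STANDARD):
    """Return the timing set profiled for the lowest bound that is >= temp_c."""
    bounds = [bound for bound, _ in profile]
    index = bisect_left(bounds, temp_c)
    if index == len(profile):
        return fallback  # hotter than anything profiled: stay conservative
    return profile[index][1]


print(select_timings(40))
print(select_timings(70))
```

Rounding the temperature up to the next profiled bound keeps the choice conservative: a DIMM is never given timings that were profiled at a lower temperature than it is currently running at.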
25
1. DRAM Operation Basics
2. Reasons for Timing Margin in DRAM
3. Key Observations
4. Adaptive-Latency DRAM
5. DRAM Characterization
6. Real System Performance Evaluation
26
DRAM Temperature
DRAM temperature measurement
–Server cluster: operates at under 34°C
–Desktop: operates at under 50°C
–DRAM standard optimized for 85°C
Previous works: DRAM temperature is low
–El-Sayed+ SIGMETRICS 2012
–Liu+ ISCA 2007
Previous works: maintain low DRAM temperature
–David+ ICAC 2011
–Liu+ ISCA 2007
–Zhu+ ITHERM 2008
DRAM operates at low temperatures in the common-case
27
DRAM Testing Infrastructure
[Figure: test setup with temperature controller, PC, heater and FPGAs]
28
Test Pattern
Single cache line test (Read/Write)
[Timeline: Write, refresh interval of 64–512ms, Access, Verify]
Overlapping multiple single cache line tests to simulate power noise and coupling
[Timeline: Write, refresh interval of 64–512ms, Access, Verify, Access, Verify, ...]
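The single cache line test can be sketched in software as write, wait, access with the timing under test, then verify. The real tests run on FPGA-based infrastructure; `FakeDimm` below is a purely hypothetical stand-in that flips one bit when tRCD drops below a made-up limit, so the loop has something to detect. None of these names come from the original work.

```python
class FakeDimm:
    """Hypothetical stand-in for hardware: corrupts reads below a made-up tRCD limit."""

    def __init__(self, min_safe_trcd=8):
        self.min_safe_trcd = min_safe_trcd
        self.cells = {}

    def write(self, addr, data):
        self.cells[addr] = bytes(data)

    def wait_ms(self, interval_ms):
        pass  # real hardware would idle here so that the cells leak charge

    def read(self, addr, trcd):
        data = bytearray(self.cells[addr])
        if trcd < self.min_safe_trcd:
            data[0] ^= 0x01  # model a sensing failure as a single bit flip
        return bytes(data)


def cache_line_test(dimm, addr, pattern, trcd, refresh_ms):
    """Write, wait one refresh interval, access with the timing under test, verify.

    Returns the number of bits that differ from the written pattern.
    """
    dimm.write(addr, pattern)
    dimm.wait_ms(refresh_ms)
    observed = dimm.read(addr, trcd)
    return sum(bin(a ^ b).count("1") for a, b in zip(pattern, observed))


pattern = bytes([0xAA] * 64)  # one 64-byte cache line
print(cache_line_test(FakeDimm(), 0, pattern, trcd=8, refresh_ms=64))
print(cache_line_test(FakeDimm(), 0, pattern, trcd=7, refresh_ms=64))
```

Counting differing bits rather than returning a pass/fail flag matches the way the characterization results are reported, as error counts per configuration.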
29
Control Factors
Timing parameters
–Sensing: tRCD
–Restore: tRAS (read), tWR (write)
–Precharge: tRP
Temperature: 55–85°C
Refresh interval: 64–512ms
–Longer refresh interval leads to smaller charge
–Standard refresh interval: 64ms
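The control factors define a grid of test configurations. The sketch below enumerates such a grid for one timing parameter. The temperature and refresh-interval steps are the ones shown in the characterization plots of this deck; the tRCD sweep range is a hypothetical choice for illustration.

```python
from itertools import product

TEMPERATURES_C = [55, 65, 75, 85]
REFRESH_INTERVALS_MS = [64, 128, 256, 512]
# Hypothetical sweep: start at 11 cycles and step down to 5 cycles.
TRCD_CYCLES = list(range(11, 4, -1))


def sweep_points():
    """Return every (temperature, refresh interval, tRCD) combination as a dict."""
    return [
        {"temp_c": temp, "refresh_ms": refresh, "tRCD": trcd}
        for temp, refresh, trcd in product(
            TEMPERATURES_C, REFRESH_INTERVALS_MS, TRCD_CYCLES
        )
    ]


points = sweep_points()
print(len(points), "configurations; first:", points[0])
```

Each configuration would then be handed to a test routine that records the error count, which gives one point on an errors-versus-timing curve.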
30
1. Timings ↔ Charge
[Figure: errors (axis from 0 to 10^5) vs. timing for sensing, restore (read), restore (write) and precharge. Temperature: 85°C; refresh interval: 64, 128, 256, 512ms]
More charge enables more timing parameter reduction
31
2. Timings ↔ Temperature
[Figure: errors (axis from 0 to 10^5) vs. timing for sensing, restore (read), restore (write) and precharge. Temperature: 55, 65, 75, 85°C; refresh interval: 512ms]
Lower temperature enables more timing parameter reduction
32
3. Summary of 115 DIMMs
Latency reduction for read & write (55°C)
–Read latency: 32.7%
–Write latency: 55.1%
Latency reduction for each timing parameter (55°C)
–Sensing: 17.3%
–Restore: 37.3% (read), 54.8% (write)
–Precharge: 35.2%
33
1. DRAM Operation Basics
2. Reasons for Timing Margin in DRAM
3. Key Observations
4. Adaptive-Latency DRAM
5. DRAM Characterization
6. Real System Performance Evaluation
34
Real System Evaluation Method
System
–CPU: AMD 4386 (8 cores, 3.1GHz, 8MB LLC)
–DRAM: 4GB DDR3-1600 (800MHz clock)
–OS: Linux
–Storage: 128GB SSD
Workload
–35 applications from SPEC, STREAM, Parsec, Memcached, Apache, GUPS
35
Single-Core Evaluation
[Figure: performance improvement per workload and average improvement, including all-35-workload; labeled values: 1.4%, 6.7%, 5.0%]
AL-DRAM improves performance on a real system
36
Multi-Core Evaluation
[Figure: performance improvement per workload and average improvement, including all-35-workload; labeled values: 14.0%, 2.9%, 10.4%]
AL-DRAM provides higher performance for multi-programmed & multi-threaded workloads
37
Conclusion
Observations
–DRAM timing parameters are dictated by the worst-case cell (smallest cell across all products at highest temperature)
–DRAM operates at lower temperature than the worst case
Idea: Adaptive-Latency DRAM
–Optimizes DRAM timing parameters for the common case (typical DIMM operating at low temperatures)
Analysis: Characterization of 115 DIMMs
–Great potential to lower DRAM timing parameters (17–54%) without any errors
Real System Performance Evaluation
–Significant performance improvement (14% for memory-intensive workloads) without errors (33 days)