Enabling Big Memory with Emerging Technologies. Manjunath Shevgoor
Big Memory. DRAM needs are increasing rapidly: increased data gathering, data analytics, in-memory databases.
Need more capacity. Core count doubles roughly every 2 years. DIMM capacity doubles roughly every 3 years. Memory capacity per core is expected to drop every year [Source: Memory Scaling: Systems Architecture Perspective, O Mutlu]. Source: Kevin Lim et al., Disaggregated Memory for Expansion and Sharing in Blade Servers, ISCA’09.
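A minimal sketch of the arithmetic behind this trend, assuming both quantities follow clean exponential doubling at the periods quoted on the slide. The function name and the normalization to 1.0 at year 0 are illustrative choices, not taken from the cited sources.

```python
def capacity_per_core(years, core_doubling=2.0, dimm_doubling=3.0):
    """Relative memory capacity per core after `years` (1.0 at year 0).

    Assumes core count doubles every `core_doubling` years and DIMM
    capacity doubles every `dimm_doubling` years, as quoted on the slide.
    """
    cores = 2.0 ** (years / core_doubling)
    capacity = 2.0 ** (years / dimm_doubling)
    return capacity / cores


if __name__ == "__main__":
    for y in (0, 2, 4, 6):
        print(y, round(capacity_per_core(y), 3))
```

Under these assumptions the ratio halves every 6 years, since cores grow by 8x and capacity by 4x over that span.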
Possible Solutions. 3D Stacking: Increased Current Draw [MICRO’13]. Many Rank: High Refresh Power [Under Submission]. Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, HPCA’12, NVMW’11,15].
Thesis Statement. Memory capacity requirements are increasing at a very fast rate. Management of high currents is crucial for effective deployment of new technologies. This thesis hypothesizes that architecture/OS policies for data placement can help manage some of the problems posed by high currents.
Talk Outline. Current Constraints in 3D DRAM; Addressing Refresh Overheads in DRAM; Improving Memristor Memory by Re-using Sneak Currents; Conclusion and Future Work.
IR-Drop in 3D DRAM [MICRO’13]
What is a power delivery network (PDN)? A grid of wires that connects the power supply to the circuits. Voltage drops across every PDN; the voltage lost on the PDN is the IR drop. Goal: explore architectural policies to manage IR drop. Source: Sani R. Nassif, Power Grid Analysis Benchmarks.
IR Drop in 3D DRAM. 3D stacking increases current density (increased ‘I’). TSVs add resistance to the PDN (increased ‘R’). Current must navigate 8 TSV layers to reach the top die. Insufficient voltage leads to incorrect operation. [Figure: regions of high IR drop and low IR drop]
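A rough sketch of why dies higher in the stack see a larger IR drop, assuming a lumped series model in which every layer is fed through one TSV segment of equal resistance. The function name and the uniform values used in the example are hypothetical and do not come from the talk.

```python
def ir_drop_per_die(die_currents, r_tsv):
    """IR drop in volts at each die, bottom die first.

    die_currents[k] is the current (amps) drawn by die k. r_tsv is the
    assumed resistance (ohms) of the TSV segment feeding each layer.
    Segment j carries the current of die j and of every die above it,
    so the drop accumulates toward the top of the stack.
    """
    drops = []
    total = 0.0
    for j in range(len(die_currents)):
        segment_current = sum(die_currents[j:])
        total += segment_current * r_tsv  # V = I * R for this segment
        drops.append(total)
    return drops


if __name__ == "__main__":
    # Hypothetical stack: 8 dies drawing 0.1 A each, 10 mOhm per segment.
    for layer, drop in enumerate(ir_drop_per_die([0.1] * 8, 0.01)):
        print(layer, round(drop, 4))
```

In this simplified model the drop grows monotonically with height in the stack, which is the effect the slide attributes to the added TSV resistance. It ignores the in-plane variation shown in the floor plan figures.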
Floor Plan and Quality of Power Delivery. Banks that are farther away from the TSVs suffer higher IR drop. [Figure: voltage on M1 on Layer 9 versus X and Y coordinates]
IR Drop Varies along a Die and across the Stack. [Figure: IR drop maps for Layers 2 through 9]
[Figure: stack with top 4 dies, bottom 4 dies, and logic layer.] Create constraints for iso-IR-drop regions. Place critical pages in IR-drop-resistant regions. IR-drop-oblivious page placement leads to 47% performance degradation.
Region Based Constraints. Top region: 1-2 reads allowed per region. Bottom region: 4 reads allowed per region. Spatio-Temporal Constraints. At least 1 top read: 8 reads allowed per stack. No top reads: 16 reads allowed per stack.
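The constraints on this slide can be sketched as an admission check in a memory scheduler. This is only an illustration: the names are hypothetical, region identifiers are assumed to be unique across the stack, and the top-region limit is fixed at 2 even though the slide gives a range of 1-2.

```python
TOP_READS_PER_REGION = 2        # slide says "1-2"; 2 is an assumed choice
BOTTOM_READS_PER_REGION = 4
STACK_LIMIT_WITH_TOP_READ = 8
STACK_LIMIT_NO_TOP_READ = 16


def can_issue(active, region, is_top):
    """Return True if one more read to `region` respects all limits.

    active: list of (region_id, is_top) tuples for reads already in flight.
    """
    candidate = active + [(region, is_top)]

    # Per-region limit depends on whether the region is in the top dies.
    in_region = sum(1 for r, _ in candidate if r == region)
    region_limit = TOP_READS_PER_REGION if is_top else BOTTOM_READS_PER_REGION
    if in_region > region_limit:
        return False

    # Per-stack limit tightens as soon as any top-region read is in flight.
    any_top = any(top for _, top in candidate)
    stack_limit = STACK_LIMIT_WITH_TOP_READ if any_top else STACK_LIMIT_NO_TOP_READ
    return len(candidate) <= stack_limit
```

For example, with eight bottom-region reads in flight, a ninth bottom read to a different region passes the check, while a read to any top region is rejected because it would lower the stack limit to 8.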
Dynamic Page Placement. Pages with the highest total queuing delay are moved to bottom regions. Using page access count to promote pages can starve threads; the scheduler ensures fairness. Page migration is limited by the migration penalty (10k/15M cycles).
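A minimal sketch of the selection step, assuming the controller keeps a per-page total of queuing delay for the current epoch and caps the number of migrations. The function name, the dictionary-based bookkeeping, and the `max_migrations` cap are hypothetical stand-ins for whatever mechanism the thesis uses.

```python
def pages_to_promote(queuing_delay, in_bottom, max_migrations):
    """Pick the pages to move into bottom (low IR drop) regions.

    queuing_delay: dict mapping page -> total queuing delay this epoch.
    in_bottom: set of pages already placed in bottom regions.
    max_migrations: assumed cap standing in for the migration penalty.
    """
    candidates = [p for p in queuing_delay if p not in in_bottom]
    # Rank by accumulated queuing delay rather than by access count.
    candidates.sort(key=lambda p: queuing_delay[p], reverse=True)
    return candidates[:max_migrations]


if __name__ == "__main__":
    delays = {"a": 50, "b": 900, "c": 300, "d": 10}
    print(pages_to_promote(delays, in_bottom={"c"}, max_migrations=2))
```

Ranking by queuing delay instead of access count reflects the slide's point that access counts alone can starve threads.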
Results. Within 20% of ideal.
Overview. 3D Stacking: Increased Current Draw [MICRO’13]. Many Rank: Refresh Overhead [Under Submission]. Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, HPCA’12, NVMW’11,15].
Re-Thinking Data Placement in Highly Ranked DRAM Systems
Refresh Power in DRAM
Command   Current (mA)
Act       67
Read      125
Write     125
Refresh   245
Refresh consumes 96% more power than read. Source: Micron 8GB DDR3L data sheet. There can be up to 4 ranks in a DIMM.
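The 96% figure follows directly from the currents in the table, assuming the same supply voltage for both commands so that power scales with current. A short check, with a hypothetical helper name:

```python
# Currents in mA as listed on the slide.
CURRENT_MA = {"Act": 67, "Read": 125, "Write": 125, "Refresh": 245}


def percent_more(a, b):
    """How much larger a is than b, in percent."""
    return (a / b - 1.0) * 100.0


if __name__ == "__main__":
    extra = percent_more(CURRENT_MA["Refresh"], CURRENT_MA["Read"])
    print(round(extra))  # 245 / 125 = 1.96
```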
[Figure: 8-core CMP with a memory controller (MC), Channels 1-2, and Ranks 1-4.] Stagger refresh to reduce peak power.
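One simple way to stagger refresh is to space the ranks' refresh commands evenly across the refresh interval. The sketch below assumes even spacing and a 7.8 µs interval; the function name is hypothetical and the talk does not state the exact schedule.

```python
def staggered_refresh_offsets(num_ranks, t_refi_us=7.8):
    """Start offset (µs) of each rank's refresh within one interval.

    Assumes refreshes are spaced evenly so that rank i starts at
    i * tREFI / num_ranks. Whether two refreshes still overlap depends
    on tRFC, which is not modeled here.
    """
    return [i * t_refi_us / num_ranks for i in range(num_ranks)]


if __name__ == "__main__":
    print(staggered_refresh_offsets(4))
```

With 4 ranks, each rank begins its refresh 1.95 µs after the previous one instead of all ranks drawing refresh current at the same time.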
Increase in Refresh Time. [Table: tRFC, tRFC_2X, and tRFC_4X (ns) by chip capacity (GB); cell values not recoverable.] Fine-grained refresh: the refresh interval is 7.8 µs at 1X, 3.9 µs at 2X, and 1.95 µs at 4X.
Effect of Staggered Refresh
[Figure: 8-core CMP, MC, Channels 1-2, Ranks 1-4; queued requests from threads T1-T3 are spread across ranks R1-R3, and requests to the refreshing rank are stalled.] Each staggered refresh stalls many cores.
Limit the Spread: Address Mapping
[Figure: 8-core CMP, MC, Channels 1-2, Ranks 1-4; the current request queue, with each thread's requests spread across ranks and stalled, next to the ideal queue, where each thread's requests go to a single rank (T1 to R1, T2 to R2, T3 to R3).]
Rank Assigned Page Mapping. [Figure: 8-core CMP, MC, Channels 1-2, Ranks 1-4, Threads 1-8.] (a) Strict mapping of threads to ranks.
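A small sketch of strict rank assignment, assuming 8 threads, 4 ranks, and a modulo assignment of threads to ranks. The figure's actual assignment is not recoverable from the slide text, so the mapping function and the zero-based numbering here are hypothetical.

```python
def assigned_rank(thread_id, num_ranks=4):
    """Rank that holds all pages of `thread_id` (assumed modulo policy)."""
    return thread_id % num_ranks


def threads_stalled_by_refresh(rank, num_threads=8, num_ranks=4):
    """Threads whose pages live in `rank` and so wait while it refreshes."""
    return [t for t in range(num_threads)
            if assigned_rank(t, num_ranks) == rank]


if __name__ == "__main__":
    for r in range(4):
        print(r, threads_stalled_by_refresh(r))
```

Under this assumed policy, a refresh to one rank can stall only the two threads mapped to it, rather than every thread that happens to have a request queued for that rank.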
% better than Staggered Refresh
Limit the Spread: Page Mapping. [Figure: 8-core CMP, MC, Channels 1-2, Ranks 1-4, Threads 1-8.]
Relaxing Rank Assignment
Data Mapping. Address Mapping; Page Mapping. 18.6% better than Staggered Refresh.
Overview. 3D Stacking: Increased Current Draw [MICRO’13]. Many Rank: Refresh Overhead [Under Submission]. Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, HPCA’12, NVMW’11,15].
Designing a Fast and Reliable Memory with Memristor Technology [ICCD’15, NVMW’15]
Background. Memristors store data in the form of resistance. A metal oxide is sandwiched between two electrodes and is inherently non-conducting. Creation of conductive filaments of oxygen vacancies reduces resistance. Source: Cong Xu et al., Modeling and Design Analysis of 3D Vertical Resistive Memory - A Low Cost Cross-Point Architecture, ASPDAC 2014.
Voltage Dependent Resistance. The resistance of a ReRAM cell is not constant but varies with the applied voltage: resistance decreases with increasing voltage. Each cell is a combination of a selector in series with a memristor device.
[Figure: DRAM cell, PCM cell, and memristor cell, each shown with its bit line and word line.] Cell size of 4F².
Cross Point Structure. Because of non-linearity, it is possible to select a cell without an access transistor. Arrays can be layered vertically without resorting to 3D stacking. [Figure: memristor cell made of a memristor and a selector]
Reading and Writing. [Figure: crossbar showing the selected cell, half-selected cells, and sneak current, with line voltages 0 V, V/2, and V]
Effects of I_leak
Effects of I_leak. Decreases voltage at the selected cell: increases write latency and can cause write failure. Distorts bit line current: increases read complexity and decreases read margin. Limits array size.
Reading from the crossbar array. Step 1: Read background current (I_leak). [Figure labels: V_read/2, 0, I_leak]
Reading from the crossbar array. Step 2: Read total current at V_read (I_read). [Figure labels: V_read, V_read/2, 0, I_read, I_leak]
State of selected cell determines I_read ~ I_leak. [Timeline: read latency spans tBG_READ followed by tREAD]
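The two-step read can be sketched as follows, assuming the sensing circuit compares the difference between the total current and the background current against a threshold. The function name, the threshold, and the choice of which state reads as 1 are hypothetical; the slides do not give these details.

```python
def read_cell(i_leak_ua, i_read_ua, threshold_ua):
    """Two-step crossbar read under an assumed difference-based sensing.

    Step 1 measures the background (sneak) current i_leak_ua.
    Step 2 measures the total current i_read_ua at the full read voltage.
    The selected cell's own contribution is taken as the difference.
    """
    cell_current = i_read_ua - i_leak_ua
    # Assumption: a low-resistance cell conducts more and is read as 1.
    return 1 if cell_current > threshold_ua else 0


if __name__ == "__main__":
    print(read_cell(100.0, 130.0, 15.0))
    print(read_cell(100.0, 105.0, 15.0))
```

Because the background current is measured separately, the same threshold can be applied even when the sneak current differs from one part of the array to another.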
Proposal 1: Re-use the value in the sample and hold circuit. [Figure: sensing circuit with sample and hold; labels V_read, V_read/2, V_r, P_acc, P_prech, S1, S2, and sneak current]
Reusing Sneak Current Read. [Figure: sneak current (µA) by row and column]
Re-use the sneak current reading for the same column. [Timeline: the latency of read 1 spans tBG_READ followed by tREAD; the latency of read 2 is tREAD only]
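The latency benefit of re-using the background reading can be sketched with a simple model, assuming one held value per array that stays valid until a read targets a different column. The function name and latency values are hypothetical, and effects such as writes or drift invalidating the held value are not modeled.

```python
def read_latencies(columns, t_bg_read, t_read):
    """Latency of each read in a sequence of column accesses to one array.

    A read to the column whose background current is already held in the
    sample and hold circuit skips the background read step.
    """
    latencies = []
    held_column = None
    for col in columns:
        if col == held_column:
            latencies.append(t_read)              # re-use held I_leak
        else:
            latencies.append(t_bg_read + t_read)  # full two-step read
            held_column = col
    return latencies


if __name__ == "__main__":
    print(read_latencies([3, 3, 7, 3], t_bg_read=10, t_read=20))
```

In this model, back-to-back reads to the same column pay tBG_READ once, which is why column locality matters for this proposal.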
Impact of Cell Location
[Figure: array with bit line mux and word line drivers, marking the cells with increased error rates]
[Figure: a cache line's Bits 1-512 are spread across Arrays 1-512, one bit per array.] Default mapping leads to some lines with high error rate.
Proposal 2: Stagger the array mapping. [Figure: location of the Nth bit of Cachelines 1, 2, 3, ... in Arrays 0-3 under the default mapping and the proposed mapping.] 30X reduction in probability of a single bit error.
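One way to picture the staggering is sketched below. It assumes the default mapping stores every bit of a cache line at the same location in each array, and that the proposed mapping rotates the location by the array index. The rotation rule and function names are assumptions made for illustration; the slide does not spell out the exact mapping.

```python
def default_location(line, array, num_locations):
    """Default: a line occupies the same location in every array."""
    return line % num_locations


def staggered_location(line, array, num_locations):
    """Assumed stagger: rotate the location by the array index."""
    return (line + array) % num_locations


if __name__ == "__main__":
    arrays, locations = 4, 4
    for line in range(2):
        print(line,
              [default_location(line, a, locations) for a in range(arrays)],
              [staggered_location(line, a, locations) for a in range(arrays)])
```

With the default mapping, a line that lands on a weak location has all of its bits exposed to the higher error rate. With the rotation, each line's bits are spread over different locations, while every array still holds each line at a distinct location.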
Performance Vs Baseline. Improving Memristor Memory with Sneak Current Sharing
Exploring Address Mapping
Summary of Dissertation. 3D Stacking: Increased Current Draw [MICRO’13]; Spatio-Temporal Constraints. Many Rank: Refresh Overhead [Under Submission]; Re-Thinking Data Placement. Non-Volatile Memory (Memristor): Memory Latencies [ICCD’15, HPCA’12, NVMW’11,15]; Re-use Sneak Currents.
Conclusions. 3D Stacking: IR Drop Constraints. Many Rank: Rank Assignment. Memristor: Re-Use Sneak Currents.
Future Work. Mitigating the Rising Cost of Process Variation in 3D DRAM; PDN Aware Refresh Cycle Time for 3D DRAM; Addressing Long Write Latencies in Memristor based Memory.
Other Projects and Publications. Efficiently Prefetching Complex Address Patterns (MICRO’15). USIMM: The Utah Simulated Memory Module, used for the Memory Scheduling Championship. Efficient Scrub Mechanisms for Error-Prone Emerging Memories (HPCA’12). Accelerating Critical Word Access using Heterogeneous Memory (MICRO’12). Avoiding Information Leakage in the Memory Controller (MICRO’15).
Acknowledgements. Rajeev; Ashwini, Parents; Al, Erik, Naveen, Ken; Chris Wilkerson, Zeshan Chishti; Utah Arch team-mates; Karen, Ann.
Thank You
Thesis Overview. 3D Stacking: Increased Current Density [MICRO’13]. Many Rank: High Refresh Current [Under Submission]. Non-Volatile Memory (Memristor): Sneak Currents [ICCD’15, NVMW’11,15]. Analyze Impact of Currents + Performance Loss. Data Placement.
Comparisons to Prior Work
[Figure: crossbar array of cells with bit lines, word lines, and a bit line mux; line voltages 0, V/2, and V; word line voltages V_W1, V_W2, ..., V_WN, V_WN1, ..., V_WNM.] Bit line and word line resistances eat into the cell voltage.
Percentage of refreshes stalling a thread
Memory Latency
Memristor Read Power
[Figure: Cores 1-8, last-level cache miss stream, delta history tables, delta prediction tables, prediction, and prediction feedback.] See a delta? Predict a delta!
Sneak path currents can distort I_read. [Figure labels: V_read, V_read/2, 0, I_read, I_leak]
Sneak Currents
Compress to reduce write latency. [Figure: a cache line's bits spread across Arrays 1-512 under the proposed mapping with 50% compression]
Summary. With great density come a few challenges. Sneak currents limit array size, complicate reads, delay writes, and affect reliability. Background current can be reused. Reliability can be improved at the cost of write latency. Compression can reduce write latency. 8.3% performance improvement. 30X reduction in multi bit error probability.
Column Hit Rate