1
Phase Change Memory
What to wear out today?
Chris Craik, Aapo Kyrola, Yoshihisa Abe
2
Memory Technologies
Concerns
– Density
– Latency
– Energy
Off Chip Technologies
– DRAM: moderately dense, but not very fast
– Flash: fairly dense, but near-disk slowness
3
Evaluation of Technologies
                DRAM            NAND Flash    NOR Flash
Density         1               4             0.25
Read Latency    60ns            25,000ns      300ns
Write Speed     1000MB/s        2.4MB/s       0.5MB/s
Endurance       Eff. Infinite   10^4
Retention?      Refresh         10 Years
4
Phase Change Memory
Bit recorded in ‘Phase Change Material’
– SET to 1 by heating to crystallization point
– RESET to 0 by heating to melting point
– Resistance indicates state
5
Phase Change Memory
Density
– 4x increase over DRAM
Latency
– 4x increase over DRAM
Energy
– No leakage
– Reads are worse (2x), writes much worse (40x)
Wear out
– Limited number of writes (but better than Flash)
Non-volatile
– Data persists in memory
6
Evaluation of Technologies
                DRAM            NAND Flash    NOR Flash    PCM
Density         1               4             0.25         2-4
Read Latency    60ns            25,000ns      300ns        200-300ns
Write Speed     1000MB/s        2.4MB/s       0.5MB/s      100MB/s
Endurance       Eff. Infinite   10^4                       10^6 to 10^8
Retention?      Refresh         10 Years
7
Solutions to wearing & energy
Writes cause thermal expansion and contraction that wear the material, and they require a strong current. Unlike DRAM, however, PCM does not leak energy.
Partial writes = write only the bits that have changed
a) Caches keep track of written bytes/words per cache line (Lee et al.)
– storage overhead vs. accuracy
b) When writing a row to memory, first read the old row and compare => write only the modified bits (Zhou et al.)
Most written bits are redundant!
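The read-before-write idea in (b) can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the mechanism from Zhou et al.: the function name `partial_write` and the byte-string row representation are hypothetical, and the sketch simply counts how many bit cells would need programming if unchanged bits were skipped.

```python
def partial_write(old_row: bytes, new_row: bytes) -> tuple[bytes, int]:
    """Toy model: return the row to store and the number of bits that differ.

    Hypothetical helper for illustration; a real controller would do the
    comparison in hardware and only pulse the cells that changed.
    """
    if len(old_row) != len(new_row):
        raise ValueError("rows must have the same length")
    # XOR marks the bit positions whose value differs between old and new data.
    changed = bytes(a ^ b for a, b in zip(old_row, new_row))
    # Count the set bits: these are the only cells that need to be programmed.
    bits_written = sum(bin(byte).count("1") for byte in changed)
    return new_row, bits_written


old = bytes([0b11110000, 0b00001111, 0xFF, 0x00])
new = bytes([0b11110001, 0b00001111, 0xFF, 0x80])
stored, flips = partial_write(old, new)
total_bits = 8 * len(new)
print(f"programmed {flips} of {total_bits} bits")
```

In this made-up example only 2 of the 32 bits differ, so a full-row write would have programmed 30 cells redundantly.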
8
Solutions to wearing & energy (cont.)
Buffer organisation (Lee et al.)
– DRAM uses one row buffer (2048B)
– propose using up to 32 * 64B narrow buffers, each with own association
– capture coalescing writes: temporal locality more important than spatial locality
– find 4 * 512B to be the most effective area-neutral configuration
– also helps decrease latency
Small DRAM buffer for PCM (Qureshi et al.)
– combine low latency of DRAM with high capacity of PCM
– similarly use Flash cache for Disk
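Why buffering reduces PCM wear can be seen in a minimal sketch of write coalescing. It is an illustration only, not the design of Lee et al. or Qureshi et al.: the class `WriteBuffer`, its LRU eviction policy, and the dictionary standing in for PCM are all assumptions made for the example.

```python
from collections import OrderedDict


class WriteBuffer:
    """Toy write-back buffer in front of a simulated PCM (hypothetical model)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.lines = OrderedDict()   # address -> data, oldest entry first
        self.pcm = {}                # stand-in for the PCM array
        self.pcm_writes = 0          # writes that actually reach PCM

    def write(self, addr: int, data: int) -> None:
        if addr in self.lines:
            # Repeated write to a buffered address is absorbed (coalesced).
            self.lines.move_to_end(addr)
        self.lines[addr] = data
        if len(self.lines) > self.capacity:
            # Evict the least recently written line to PCM.
            victim, victim_data = self.lines.popitem(last=False)
            self._write_pcm(victim, victim_data)

    def flush(self) -> None:
        # Write every buffered line back to PCM.
        while self.lines:
            addr, data = self.lines.popitem(last=False)
            self._write_pcm(addr, data)

    def _write_pcm(self, addr: int, data: int) -> None:
        self.pcm[addr] = data
        self.pcm_writes += 1


buf = WriteBuffer(capacity=4)
for i in range(100):
    buf.write(i % 3, i)      # 100 writes concentrated on 3 hot addresses
buf.flush()
print(f"{buf.pcm_writes} PCM writes for 100 buffered writes")
```

With strong temporal locality, as in this contrived access pattern, the 100 writes collapse into 3 PCM writes; a pattern with no reuse would see no benefit.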
9
Solutions to wearing & energy
Wear leveling (Zhou et al.)
– row shifting: even out writes among cells in a row
  needs extra hardware
– segment swapping: even out writes between pages
  implemented in memory controller
Spatial locality is now a problem!
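Row shifting can be sketched as a rotating logical-to-physical offset within a row. This is a simplified illustration, not the hardware scheme from Zhou et al.: the class `ShiftedRow`, the cell granularity, and the `shift_interval` parameter are hypothetical, and the wear caused by the shift itself is deliberately not counted.

```python
class ShiftedRow:
    """Toy model of one memory row with row shifting (hypothetical interface)."""

    def __init__(self, n_cells: int, shift_interval: int):
        self.cells = [0] * n_cells       # physical cell contents
        self.wear = [0] * n_cells        # demand writes seen by each physical cell
        self.offset = 0                  # current logical-to-physical rotation
        self.shift_interval = shift_interval
        self.writes_since_shift = 0

    def _phys(self, logical: int) -> int:
        return (logical + self.offset) % len(self.cells)

    def read(self, logical: int) -> int:
        return self.cells[self._phys(logical)]

    def write(self, logical: int, value: int) -> None:
        p = self._phys(logical)
        self.cells[p] = value
        self.wear[p] += 1
        self.writes_since_shift += 1
        if self.writes_since_shift >= self.shift_interval:
            self._shift()

    def _shift(self) -> None:
        # Rotate the row by one cell so hot logical cells move to new
        # physical cells. Assumption: the cost of this copy is ignored.
        n = len(self.cells)
        data = [self.read(i) for i in range(n)]
        self.offset = (self.offset + 1) % n
        for i, value in enumerate(data):
            self.cells[self._phys(i)] = value
        self.writes_since_shift = 0


row = ShiftedRow(n_cells=8, shift_interval=4)
for i in range(32):
    row.write(0, i)          # always the same logical cell
print(row.wear)
```

In this example all 32 writes target logical cell 0, yet each physical cell absorbs only 4 of them; without shifting, a single cell would have taken all 32.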
10
PCM as On-chip Cache
Hybrid on-chip cache architecture consisting of multiple memory technologies
– PCM, SRAM, embedded DRAM (eDRAM), and Magnetic RAM (MRAM)
PCM is slow compared to SRAM etc.
– But high density, non-volatility etc. help
Use as complement to faster memory technologies
– As “slow” L2 cache, as L3 cache etc.
11
Cache Structure Example
Use PCM as huge L3 cache
SRAM and eDRAM both as L2
– Faster and smaller SRAM region
– Slower and larger eDRAM region
[Figure: two cache hierarchies with the same footprint. Hybrid: Core w/ L1, L2 SRAM (Fast: 256KB), L2 eDRAM (Slow: <4MB), L3 PCM (32MB). SRAM-only: Core w/ L1, L2 SRAM 256KB, L3 SRAM 1MB.]
Compared to 3-level SRAM cache model:
– 18% improvement in instructions per cycle
– Comparable power consumption, despite additional layer of PCM and its large capacity
Various design possibilities
– PCM as “third” L2 cache etc.
12
Summary
PCM can be a viable approach towards next-generation memory architecture
– High density, non-volatility
– Various techniques to overcome shortcomings: short endurance, high-energy writes, latencies
– Could be used as main memory or in on-chip cache hierarchy
13
Questions
How well do results obtained on benchmark apps translate to real usage?
Variance of endurance of memory cells?
– may some cells wear out very quickly?
Possibilities of PCM non-volatility
– instant wake-up from hibernation etc.