1
Information Storage and Spintronics 14
Atsufumi Hirohata
Department of Electronic Engineering
09:00 Tuesday, 13/November/2018 (J/Q 004)
2
Quick Review over the Last Lecture
FeRAM, PRAM, ReRAM
3
14 Cache Memory
Level 1
Level 2
Level 3
Racetrack memory
Register
4
Cache Memory
In a PC, a cache is used to speed up data processing.
It helps to overcome the von Neumann bottleneck :
  Access speed : processor ≫ memories
5
Roles of Cache
To hold the instructions and data that the computer uses most frequently.
To read ahead the data that are most likely to be needed in the near future.
6
Cache Types
7
Level 1 Cache
Level 1 / primary cache (L1 cache) :
Static memory integrated with a processor core
To store information recently accessed by the processor
To improve data access speed when the CPU accesses the same data multiple times
Access speed : L1 cache > system memory
In a modern PC, split into two caches of equal size :
  One for storing programme data
  Another for storing microprocessor instructions
8
Level 2 Cache
Level 2 / secondary cache (L2 cache) :
Large static memory, which may be integrated with a processor core
To store recently accessed information
To reduce data access time when the same data has already been accessed
Access speed : L1 cache > L2 cache
In a modern PC :
  Data pre-fetching feature to buffer programme instructions and data before they are requested
  Inclusive cache : requested data stays in the L2 cache
  Exclusive cache : requested data is removed after transfer to the L1 cache
  Unified for storing both programme data and microprocessor instructions
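The difference between an inclusive and an exclusive cache can be sketched in a few lines. This is a toy model only: the function name `promote_to_l1` is hypothetical, the caches are plain Python sets and capacity limits are ignored.

```python
def promote_to_l1(block, l1, l2, inclusive):
    """Copy `block` from L2 into L1 after an L1 miss that hits in L2.

    l1 and l2 are sets of block identifiers (capacity is ignored here).
    """
    if block not in l2:
        raise KeyError("block is not held in L2")
    l1.add(block)
    if not inclusive:
        # Exclusive policy: L2 gives up its copy once L1 holds the block.
        l2.discard(block)


# Inclusive: the block ends up in both levels.
l1, l2 = set(), {"A", "B"}
promote_to_l1("A", l1, l2, inclusive=True)

# Exclusive: the block is held by L1 only.
x1, x2 = set(), {"A", "B"}
promote_to_l1("A", x1, x2, inclusive=False)
```

In the exclusive case the two levels never duplicate a block, so their combined capacity is used more fully; in the inclusive case a block evicted from L1 can still be found in L2.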
9
Level 3 Cache
Level 3 cache (L3 cache) :
Very large static memory outside the processor cores and shared by them
To store copies of requested items in case a different core makes a subsequent request
Access speed : L1 cache > L2 cache > L3 cache > DRAM
In a modern PC :
  Inclusive cache : requested data stays
  Exclusive cache : requested data is removed after transfer to the L1 cache
  Unified for storing both programme data and microprocessor instructions
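The ordering of the cache levels and DRAM can be turned into a rough estimate of the average memory access time. The sketch below assumes a simple model in which every access that reaches a level pays that level's latency and only the misses go on to the next level. The function name, latencies and hit rates are hypothetical and are not taken from the lecture.

```python
def average_access_time(levels, memory_latency):
    """Average memory access time for a chain of cache levels.

    levels: list of (hit_latency, hit_rate) pairs ordered L1, L2, L3.
    memory_latency: latency of main memory (DRAM) in the same time unit.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for latency, hit_rate in levels:
        total += reach * latency
        reach *= 1.0 - hit_rate  # only misses continue to the next level
    return total + reach * memory_latency


# Hypothetical figures in clock cycles: (latency, hit rate) per level.
levels = [(4, 0.9), (12, 0.8), (40, 0.5)]
amat = average_access_time(levels, memory_latency=200)
```

With these assumed figures the average comes to about 8 cycles, against 200 cycles if every access went to DRAM.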
10
Data Associativity
Cache memory stores data in blocks called lines (64 bytes for the Intel Pentium 4 L1 cache) :
Direct mapped : fastest hit times and best trade-off for large caches
2-way set / skewed associative : best trade-off for 4 ~ 8 kbyte caches
4-way set associative
Fully associative : lowest miss rates and best trade-off when the miss penalty is very high
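The mapping schemes differ only in how many sets the cache lines are divided into. The sketch below splits a byte address into tag, set index and line offset. The function name `split_address` and the 8 kbyte cache size are assumptions for illustration; the 64-byte line size follows the slide.

```python
def split_address(addr, cache_bytes, line_bytes=64, ways=1):
    """Split a byte address into (tag, set_index, offset).

    ways=1 gives a direct-mapped cache; ways equal to the total number
    of lines gives a fully associative cache (a single set).
    """
    num_lines = cache_bytes // line_bytes
    num_sets = num_lines // ways
    offset = addr % line_bytes      # byte position inside the line
    block = addr // line_bytes      # line-sized block number in memory
    set_index = block % num_sets    # which set the block must go to
    tag = block // num_sets         # identifies the block within its set
    return tag, set_index, offset


# Hypothetical 8 kbyte cache with 64-byte lines (128 lines in total).
addr = 0x12345
direct = split_address(addr, 8192, ways=1)     # 128 sets of 1 line
four_way = split_address(addr, 8192, ways=4)   # 32 sets of 4 lines
fully = split_address(addr, 8192, ways=128)    # 1 set of 128 lines
```

A direct-mapped cache needs only one tag comparison per access, which is why its hit time is the shortest. A fully associative cache must compare the tag against every line, but a block can be placed anywhere, which keeps the miss rate low.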
11
Cache Miss
SPEC CPU2000 benchmark test carried out by Hill and Cantin :
A refill is performed whenever a cache miss occurs :
Round robin : refill lines in a fixed cyclic order
Least Recently Used (LRU) : replace the line that was accessed least recently
Random : replace a randomly chosen line
→ Hit rate : LRU > Random > Round robin
→ Complexity : LRU > Random > Round robin
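The three refill policies can be compared on a small access trace. This is a minimal sketch assuming a fully associative cache, in which round robin is equivalent to replacing lines in the order they were filled. The function name `hit_rate` and the trace are hypothetical, and a toy trace does not reproduce the benchmark ranking in general.

```python
import random
from collections import OrderedDict


def hit_rate(trace, capacity, policy, seed=0):
    """Return the hit rate of a fully associative cache on `trace`.

    policy is one of "lru", "round_robin" or "random".
    """
    rng = random.Random(seed)
    lines = OrderedDict()  # key order: first key is the next victim
    hits = 0
    for block in trace:
        if block in lines:
            hits += 1
            if policy == "lru":
                lines.move_to_end(block)  # mark as most recently used
            continue
        if len(lines) >= capacity:
            if policy == "random":
                victim = rng.choice(list(lines))
            else:
                # "lru": first key is the least recently used line.
                # "round_robin": first key is the oldest filled line.
                victim = next(iter(lines))
            del lines[victim]
        lines[block] = True
    return hits / len(trace)


# Hypothetical trace in which block 1 is reused often.
trace = [1, 2, 3, 1, 4, 1, 5, 1, 2]
lru = hit_rate(trace, 3, "lru")
rr = hit_rate(trace, 3, "round_robin")
rnd = hit_rate(trace, 3, "random")
```

On this trace LRU keeps the frequently reused block 1 in the cache, whereas round robin evicts it as soon as its turn comes. The extra bookkeeping LRU needs on every hit is the source of its higher complexity.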
12
Example : Cache Sizes
Intel Nehalem (2008) :
13
Example : Cache Architecture
Intel Nehalem (2008) :
14
Memory Development
Deeper memory hierarchy :
15
Racetrack Memory
In 2008, 3-bit racetrack memory was demonstrated by Stuart S. P. Parkin (IBM) :
Utilises domain-wall motion driven by spin-transfer torque (STT)
* S. S. P. Parkin, Sci. Am. 300, 76 (2009).
16
Domain Wall Displacement
For fast operation, spin transfer torque (STT) will be used to move the walls.
This requires a narrow track, possibly down to 100 nm, so that the STT dominates the Lorentz field.
* A. Yamaguchi et al., Science 97, (2004).
17
Read / Write Operation
Fully electrical read-out / write-in :
* S. S. P. Parkin, Sci. Am. 300, 76 (2009).
18
Racetrack-Memory Properties
Racetrack memory architecture :
Utilises magnetic domain walls
  “1” : head-to-head wall
  “0” : tail-to-tail wall
CMOS process compatible
3-dimensional (3D) structure
Reproducible domain-wall trapping
3D fabrication
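From the point of view of data access, a racetrack behaves like a shift register: the stored pattern is moved along the wire until the wanted bit sits at a fixed read / write element. The sketch below is a toy model under strong assumptions. The class `Racetrack` is hypothetical, bits are stored as abstract 0 / 1 values rather than as wall types, and the track is treated as circular so that no bits are lost at the ends.

```python
from collections import deque


class Racetrack:
    """Toy shift-register model of a single racetrack wire."""

    def __init__(self, bits, head=0):
        self.track = deque(bits)
        self.head = head   # fixed position of the read / write element
        self.shifts = 0    # number of single-step shift pulses applied

    def shift(self, steps):
        """Move the whole bit pattern by `steps` positions.

        Negative values bring bits at higher indices towards index 0.
        """
        self.track.rotate(steps)
        self.shifts += abs(steps)

    def read(self):
        """Return the bit currently at the read / write element."""
        return self.track[self.head]

    def write(self, bit):
        """Overwrite the bit currently at the read / write element."""
        self.track[self.head] = bit


# Reading the bit stored three positions away costs three shift pulses.
rt = Racetrack([1, 0, 1, 1, 0])
rt.shift(-3)
bit = rt.read()
```

The access time therefore depends on how far the pattern has to be shifted, which is one reason fast and reproducible domain-wall motion matters for this architecture.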
19
Racetrack Memory Demonstration
MRAM cell structure :
150 nm wide, 20 nm thick and 10 µm long ferromagnetic wires
CMOS implementation
20
Information Technology Pyramid
Layered structures between the CPU and storage :
21
Information Technology Pyramid
Future replacements :
22
Register
A register is a very fast memory directly attached to a processor :
23
Integrated Fan-Out Wafer-Level-Package (InFO-WLP)
Allowing more chips rather than more pins on a wafer :
24
Low-Voltage-in-Package-Interconnect (LIPINCON)
Width and separation of 5 µm, and thickness of 0.6 mm :
0.3 V signals can be sent.
25
25-Core Piton
460M transistors using the 32 nm IBM SOI process :
26
Piton Kilo-Core Chip
621M transistors using the 32 nm IBM SOI process :