Information Storage and Spintronics 14 Atsufumi Hirohata Department of Electronic Engineering 09:00 Tuesday, 13/November/2018 (J/Q 004)
Quick Review of the Last Lecture
FeRAM : PRAM : ReRAM :
* http://loto.sourceforge.net/feram/doc/film.xhtml; ** http://www.wikipedia.org/; *** http://phys.nsysu.edu.tw/ezfiles/85/1085/img/588/Oxide-basedResistiveMemoryTechnology_CHLien.pdf
14 Cache Memory
Level 1
Level 2
Level 3
Racetrack memory
Register
Cache Memory
In a PC, cache memory is used to speed up data processing : *
To overcome the von Neumann bottleneck :
Access speed : Processor ≫ memories
* http://www.engineersgarage.com/mygarage/how-cache-memory-works?page=3
Roles of Cache
To hold the instructions / data that the computer uses most frequently.
To read ahead the data that are most likely to be needed in the near future.
* http://www.engineersgarage.com/mygarage/how-cache-memory-works?page=3
Cache Types
* http://www.engineersgarage.com/mygarage/how-cache-memory-works?page=3
Level 1 Cache
Level 1 / primary cache (L1 cache) : *
Static memory integrated with a processor core
To store information recently accessed by a processor
To improve data access speed in cases when the CPU accesses the same data multiple times
Access speed : L1 cache > system memory
In a modern PC,
Split into two caches of equal size
  One for storing programme data
  Another for storing microprocessor instructions
* http://www.cpu-world.com/Glossary/L/Level_1_cache.html; ** http://wccftech.com/review/intel-core-i7-975-extreme-edition/
Level 2 Cache
Level 2 / secondary cache (L2 cache) : *
Larger static memory, which may be integrated with a processor core
To store recently accessed information
To reduce data access time when the same data have been accessed before
Access speed : L1 cache > L2 cache
In a modern PC,
Data pre-fetching feature to buffer programme instructions and data that are about to be requested
Inclusive cache : requested data stays in the L2 cache
Exclusive cache : requested data is removed after transfer to the L1 cache
Unified for storing both programme data and microprocessor instructions
* http://www.cpu-world.com/Glossary/L/Level_1_cache.html; ** http://wccftech.com/review/intel-core-i7-975-extreme-edition/
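The inclusive / exclusive distinction above can be sketched with a toy two-level model. The class name TwoLevelCache, the FIFO eviction order, the line counts and the move of an evicted L1 line back into L2 in exclusive mode are all illustrative assumptions, not a description of any specific processor. A real inclusive cache would also invalidate the L1 copy when L2 evicts a line, which this sketch omits.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy L1/L2 model (hypothetical): FIFO eviction, sizes counted in lines."""

    def __init__(self, l1_size, l2_size, inclusive):
        self.l1 = OrderedDict()
        self.l2 = OrderedDict()
        self.l1_size = l1_size
        self.l2_size = l2_size
        self.inclusive = inclusive

    @staticmethod
    def _put(level, size, addr):
        """Insert addr into a level; return the evicted address or None."""
        if addr in level:
            return None
        victim = None
        if len(level) >= size:
            victim, _ = level.popitem(last=False)  # evict the oldest entry
        level[addr] = True
        return victim

    def read(self, addr):
        """Return the level that served the request: 'L1', 'L2' or 'DRAM'."""
        if addr in self.l1:
            return "L1"
        if addr in self.l2:
            source = "L2"
            if not self.inclusive:
                del self.l2[addr]  # exclusive: line leaves L2 on transfer to L1
        else:
            source = "DRAM"
            if self.inclusive:
                self._put(self.l2, self.l2_size, addr)  # inclusive: copy kept in L2
        victim = self._put(self.l1, self.l1_size, addr)
        if victim is not None and not self.inclusive:
            # exclusive (assumed behaviour): the line evicted from L1 goes to L2
            self._put(self.l2, self.l2_size, victim)
        return source
```

With a one-line L1, reading address A, then B, then A again is served from L2 in both modes, but only the inclusive model still holds A in L2 afterwards.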
Level 3 Cache
Level 3 cache (L3 cache) : *
Very large static memory outside a processor core and shared by the cores
To store copies of requested items in case a different core makes a subsequent request
Access speed : L1 cache > L2 cache > L3 cache > DRAM
In a modern PC,
Inclusive cache : requested data stays in the L3 cache
Exclusive cache : requested data is removed after transfer to the L1 cache
Unified for storing both programme data and microprocessor instructions
* http://www.wisegeek.com/what-is-l3-cache.htm; ** http://wccftech.com/review/intel-core-i7-975-extreme-edition/
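The speed ordering above can be turned into a rough estimate of the average access time seen by the processor. The sketch below assumes that each level is probed in turn and that hit rates are independent. The function name and all latencies and hit rates are placeholder values, not figures from this lecture.

```python
def average_access_time(levels, memory_latency):
    """Expected access latency for a cache hierarchy.

    levels: list of (hit_latency, hit_rate) pairs, ordered from L1 outward.
    memory_latency: latency of main memory (DRAM), same units as hit_latency.
    """
    total = 0.0
    reach_prob = 1.0  # probability that a request gets as far as this level
    for hit_latency, hit_rate in levels:
        total += reach_prob * hit_latency  # every request reaching a level pays its latency
        reach_prob *= 1.0 - hit_rate       # only the misses continue outward
    return total + reach_prob * memory_latency


# Assumed example values in clock cycles: L1, L2, L3, then DRAM.
example = average_access_time([(4, 0.9), (12, 0.8), (40, 0.5)], 200)
```

With these assumed numbers the estimate comes to roughly 8 cycles, far closer to the L1 latency than to the DRAM latency, which is the purpose of the hierarchy.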
Data Associativity
Cache memory stores data in blocks called cache lines (64 bytes for the Intel Pentium 4 L1 cache) : *
Direct mapped : Fastest hit times and best trade-off for large caches
2-way set / skewed associative : Best trade-off for 4 ~ 8 kbyte caches
4-way set associative
Fully associative : Lowest miss rates and best trade-off when the miss penalty is very high
* http://www.wikipedia.org/
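The mapping schemes above differ only in how many sets the cache is divided into. A minimal sketch, assuming a byte-addressed cache with power-of-two sizes: the function name split_address and the 8 kbyte example size are assumptions for illustration, and the 64-byte line follows the slide.

```python
def split_address(addr, cache_bytes, line_bytes=64, ways=1):
    """Split a byte address into (tag, set_index, offset).

    ways=1 is direct mapped, ways=N is N-way set associative, and
    ways=cache_bytes // line_bytes is fully associative (a single set).
    """
    num_sets = cache_bytes // (line_bytes * ways)
    offset = addr % line_bytes         # byte position inside the cache line
    line_number = addr // line_bytes   # which memory line the address belongs to
    set_index = line_number % num_sets # the set the line is allowed to occupy
    tag = line_number // num_sets      # remaining bits, stored to identify the line
    return tag, set_index, offset


CACHE_BYTES = 8 * 1024  # assumed 8 kbyte cache
direct = split_address(0x12345, CACHE_BYTES, ways=1)
four_way = split_address(0x12345, CACHE_BYTES, ways=4)
fully = split_address(0x12345, CACHE_BYTES, ways=CACHE_BYTES // 64)
```

Higher associativity means fewer sets and a longer tag: a line can then sit in more places, which lowers the miss rate but requires more tag comparisons per access.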
Cache Miss
SPEC CPU2000 benchmark test carried out by Hill and Cantin :
A refill process is performed once a cache miss occurs :
Round robin : Refill lines in a fixed cyclic order
Least Recently Used (LRU) : Refill the line that was accessed least recently
Random : Refill a randomly chosen line
→ Hit rate : LRU > Random > Round robin
→ Complexity : LRU > Random > Round robin
* http://www.wikipedia.org/
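The three refill policies can be compared on a short access trace with a toy fully associative cache. The function name hit_rate, the trace and the capacity below are illustrative assumptions. Round robin is modelled as replacement in fill order, which is equivalent to a cyclic pointer for a cache that fills its lines in order. The ranking on a real workload depends on the access pattern.

```python
import random
from collections import OrderedDict


def hit_rate(trace, capacity, policy, seed=0):
    """Hit rate of a fully associative cache holding `capacity` lines.

    policy is 'lru', 'round_robin' or 'random'.
    """
    rng = random.Random(seed)
    lines = OrderedDict()  # key order: oldest (or least recently used) first
    hits = 0
    for addr in trace:
        if addr in lines:
            hits += 1
            if policy == "lru":
                lines.move_to_end(addr)  # mark as most recently used
            continue
        if len(lines) >= capacity:
            if policy == "random":
                del lines[rng.choice(list(lines))]  # evict a random line
            else:
                lines.popitem(last=False)  # evict the line at the front
        lines[addr] = True
    return hits / len(trace)


trace = [1, 2, 1, 3, 1, 2]  # assumed toy trace that reuses address 1
lru_rate = hit_rate(trace, 2, "lru")
rr_rate = hit_rate(trace, 2, "round_robin")
```

On this trace LRU keeps the frequently reused address 1 in the cache and scores two hits out of six, while round robin evicts it in turn and scores one.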
Example : Cache Sizes Intel Nehalem (2008) : * http://pc.watch.impress.co.jp/docs/2008/0321/kaigai427.htm
Example : Cache Architecture Intel Nehalem (2008) : * http://pc.watch.impress.co.jp/docs/2008/0321/kaigai427.htm
Memory Development Deeper memory hierarchy : * http://pc.watch.impress.co.jp/docs/2008/0321/kaigai427.htm
Racetrack Memory
In 2008, 3-bit racetrack memory was demonstrated by Stuart S. P. Parkin (IBM) : *
Utilise domain-wall motion driven by spin transfer torque (STT)
* http://www.i-micronews.com/news/IBM-moves-closer-class-memory,1231.html; * S. S. P. Parkin, Sci. Am. 300, 76 (2009).
Domain Wall Displacement
For fast operation, spin transfer torque (STT) will be used to move the walls. This requires a narrow track, possibly down to 100 nm, so that the STT dominates over the Oersted field generated by the current.
* A. Yamaguchi et al., Phys. Rev. Lett. 92, 077205 (2004).
Read / Write Operation Fully electrical read-out / write-in : * S. S. P. Parkin, Sci. Am. 300, 76 (2009).
Racetrack-Memory Properties
Racetrack memory architecture :
Utilise magnetic domain walls
  “1” : head-to-head wall
  “0” : tail-to-tail wall
CMOS process compatible
3-dimensional (3D) structure
Reproducible domain-wall trapping
3D fabrication
* http://www.ibm.com/
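From the point of view of access, a racetrack behaves like a shift register: current pulses move the whole bit pattern along the wire past a fixed read / write position. A minimal sketch of that behaviour follows, with each bit reduced to a single value rather than a modelled domain wall. The class name RacetrackSketch is hypothetical, and treating the track as a ring (bits wrap around instead of moving into a reservoir region) is a simplifying assumption.

```python
from collections import deque


class RacetrackSketch:
    """Toy shift-register view of one racetrack (hypothetical model)."""

    def __init__(self, n_bits, head_pos=0):
        self.track = deque([0] * n_bits)  # one entry per stored bit
        self.head_pos = head_pos          # fixed read / write position
        self.shift_count = 0              # number of pulses applied so far

    def shift(self, steps):
        """Apply `steps` current pulses; the sign selects the direction."""
        self.track.rotate(steps)          # ring assumption: bits wrap around
        self.shift_count += abs(steps)

    def read(self):
        """Read the bit currently located at the head."""
        return self.track[self.head_pos]

    def write(self, bit):
        """Write a bit at the head position."""
        self.track[self.head_pos] = bit


rt = RacetrackSketch(4)
rt.write(1)                    # store a "1" under the head
rt.shift(1)                    # one pulse moves it away from the head
bit_after_shift = rt.read()
rt.shift(-1)                   # a reverse pulse brings it back
bit_after_return = rt.read()
```

The access time therefore depends on how far the requested bit is from the head, which is why wall velocity and reproducible wall trapping matter for this memory.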
Racetrack Memory Demonstration
MRAM cell structure :
150 nm wide, 20 nm thick and 10 µm long ferromagnetic wires
CMOS implementation
* http://www.ibm.com/
Information Technology Pyramid
Layered structures between the CPU and storage : *
* http://www.howstuffworks.com/computer-memory1.htm
Information Technology Pyramid Future replacements : * * http://www.imec.be/
Register
A register is a very fast memory directly attached to a processor : *
* http://withfriendship.com/user/levis/processor-register.php; ** http://www.wikipedia.org/
Integrated Fan-Out Wafer-Level-Package (InFO-WLP) Allowing more chips rather than more pins on a wafer : * * http://www.tsmc.com/
Low-Voltage-in-Package-Interconnect (LIPINCON) Width and Separation of 5 µm, and thickness of 0.6 mm : * 0.3 V signals can be sent. * http://www.tsmc.com/
25-Core Piton 460M transistors using the 32 nm IBM SOI process : * * http://parallel.princeton.edu/openpiton/
Piton Kilo-Core Chip 621M transistors using the 32 nm IBM SOI process : * * http://parallel.princeton.edu/openpiton/