ICC Module 3 Lesson 2 – Memory Hierarchies 1 / 25 © 2015 Ph. Janson
Information, Computing & Communication
Memory Hierarchies – Clip 8 – Example
School of Computer Science & Communications
B. Falsafi (charts), Ph. Janson (commentary)

Outline
► Clip 1 – Technologies
► Clip 2 – Concept
► Clip 3 – Principle
► Clip 4 – Implementation
► Clip 5 – Reading memory
► Clip 6 – Writing memory
► Clip 7 – Cache management – the Least Recently Used algorithm
► Clip 8 – A simulated example
► Clip 9 – Locality

Example from the preceding lesson: sum of integers up to n

Sum of the first n integers
input: n
output: m
    s ← 0
    while n > 0
        s ← s + n
        n ← n – 1
    m ← s
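
As a quick check of the algorithm on this slide, here is a minimal Python sketch of the same loop. The function name `sum_first_integers` is an illustrative choice and does not appear in the slides; the variables `n`, `s` and `m` are the ones from the pseudocode.

```python
def sum_first_integers(n: int) -> int:
    """Return n + (n - 1) + ... + 1, following the loop on the slide."""
    s = 0
    while n > 0:
        s = s + n  # s <- s + n
        n = n - 1  # n <- n - 1
    m = s          # m <- s
    return m


print(sum_first_integers(2))  # prints 3
```

With n initially 2, as in the rest of this clip, the loop body runs twice.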

Focus on the loop

    while n > 0
        s ← s + n
        n ← n – 1

Imagine a hypothetical computer
► Cache with 2 blocks
► Main memory with 4 blocks
► Blocks of 4 words each
► Only one register in the processor
Diagram: processor, cache (contents unknown: ? ? ? ?), main memory.
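
The block structure of this hypothetical computer can be sketched in a few lines of Python. The sizes come from the slide; numbering the 16 words from address 0 to 15 is an assumption, and the helper names `block_base` and `offset_in_block` are hypothetical. Under that assumption, address 13 (which the later slides use for n) falls in the block starting at address 12, which matches the label 12 shown next to the cached block.

```python
WORDS_PER_BLOCK = 4  # from the slide: blocks of 4 words each
MEMORY_BLOCKS = 4    # from the slide: main memory with 4 blocks
CACHE_BLOCKS = 2     # from the slide: cache with 2 blocks

# ASSUMPTION: words are addressed 0 .. 15.
MEMORY_WORDS = WORDS_PER_BLOCK * MEMORY_BLOCKS


def block_base(address: int) -> int:
    """Return the first address of the block that contains `address`."""
    if not 0 <= address < MEMORY_WORDS:
        raise ValueError(f"address {address} is outside main memory")
    return (address // WORDS_PER_BLOCK) * WORDS_PER_BLOCK


def offset_in_block(address: int) -> int:
    """Return the position of `address` inside its block (0 .. 3)."""
    return address % WORDS_PER_BLOCK


print(block_base(13), offset_in_block(13))  # prints: 12 1
```

Because the cache moves whole blocks, loading one word brings its three neighbours along with it.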

Assume the following memory layout
► s and m initially 0
► n initially 2
Diagram: cache contents unknown (? ? ? ?); main memory holds m=0, n=2, s=0.

Execution: first iteration of the loop

while n > 0
► read 13 (n): not in cache ((de)fault, i.e. a cache miss)
► load the block containing n and s from main memory
► place it in the cache, labelled 12 (n=2, s=0)
► return 2 to the processor
s ← s + n
► read s: in cache, return 0
► read n: in cache, return 2
► add s, n (0 + 2)
► write s: in cache (the cache now holds s=2)
n ← n – 1
► read n: in cache, return 2
► add n, -1 (2 – 1)
► write 13 (n): in cache (the cache now holds n=1)

State after the first iteration: the cached block 12 holds n=1, s=2; main memory still shows m=0, n=2.

How many accesses to the cache and to the main memory?
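
One way to answer the question is to replay the first iteration and label each memory access. The sketch below is a simplified model, not the slides' own simulator: address 13 for n comes from the slides, while address 14 for s is an assumption (the slides only show that s sits in the same block as n). The name `classify` is hypothetical, and eviction is deliberately left out because the loop touches a single block and the 2-block cache never fills up.

```python
WORDS_PER_BLOCK = 4
N_ADDR = 13  # address of n, as shown in the slides ("write 13")
S_ADDR = 14  # ASSUMPTION: s is in the same block as n; exact address not given

# Memory accesses of one loop iteration, in program order.
FIRST_ITERATION = [
    ("read", N_ADDR),   # while n > 0
    ("read", S_ADDR),   # s <- s + n : read s
    ("read", N_ADDR),   #              read n
    ("write", S_ADDR),  #              write s
    ("read", N_ADDR),   # n <- n - 1 : read n
    ("write", N_ADDR),  #              write n
]


def classify(accesses, resident_blocks=None):
    """Label each access 'hit' or 'miss'; a miss loads the whole block."""
    resident = set() if resident_blocks is None else set(resident_blocks)
    labels = []
    for _operation, address in accesses:
        base = (address // WORDS_PER_BLOCK) * WORDS_PER_BLOCK
        if base in resident:
            labels.append("hit")
        else:
            labels.append("miss")
            resident.add(base)  # the block is now in the cache
    return labels


labels = classify(FIRST_ITERATION)
print(labels.count("hit"), labels.count("miss"))  # prints: 5 1
```

Starting from an empty cache, only the very first read of n goes to main memory; the other five accesses are served by the cache.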

For all subsequent iterations of the loop
► The same sequence of reads and writes is issued, but the block holding n and s is already in the cache, so every access is served from the cache.
► 6 cache accesses, 0 (de)faults (cache misses) per iteration
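
The per-iteration counts can be checked with a small cache model that uses the Least Recently Used policy from Clip 7. This is a sketch under stated assumptions: the class `LRUCache` and the function `per_iteration_counts` are hypothetical names, address 13 for n is taken from the slides, and address 14 for s is assumed.

```python
from collections import OrderedDict

WORDS_PER_BLOCK = 4
CACHE_BLOCKS = 2         # from the slides: cache with 2 blocks
N_ADDR, S_ADDR = 13, 14  # 13 is from the slides; 14 is an ASSUMPTION


class LRUCache:
    """Tracks which blocks are cached, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block base -> None, oldest first

    def access(self, address):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        base = address - address % WORDS_PER_BLOCK
        if base in self.blocks:
            self.blocks.move_to_end(base)    # mark as most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used
        self.blocks[base] = None
        return False


def per_iteration_counts(n):
    """Return one (hits, misses) pair per loop iteration."""
    cache = LRUCache(CACHE_BLOCKS)
    # read n, read s, read n, write s, read n, write n
    one_iteration = [N_ADDR, S_ADDR, N_ADDR, S_ADDR, N_ADDR, N_ADDR]
    counts = []
    for _ in range(n):
        hits = sum(cache.access(address) for address in one_iteration)
        counts.append((hits, len(one_iteration) - hits))
    return counts


print(per_iteration_counts(3))  # prints: [(5, 1), (6, 0), (6, 0)]
```

The eviction branch is never taken in this example, since the loop only ever touches one block; it is included so the model stays valid for programs that touch more blocks than the cache can hold.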

Overall
For a total of 6n memory accesses:
► 6(n-1) + 5 in cache (at 1 ns / access)
► 1 cache (de)fault (miss) => main memory access (at 100 ns)
Overall performance => (6(n-1) + 5) × 1 ns + 1 × 100 ns = (6n + 99) ns
Not bad at all if n is large, which is typically the case in real programs: at 100 ns each, 6n accesses straight to main memory would take 600n ns.
Final state in the diagram: the cached block 12 holds n=1, s=2; main memory still shows m=0, n=2.
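
The arithmetic on this slide can be worked through numerically. The access counts and latencies below are the ones given on the slide; the function names are hypothetical, and `time_without_cache_ns` is an added point of comparison that assumes every one of the 6n accesses goes to main memory.

```python
CACHE_NS = 1     # from the slide: 1 ns per cache access
MEMORY_NS = 100  # from the slide: 100 ns per main-memory access


def time_with_cache_ns(n):
    """Total access time for n >= 1 iterations, with the cache."""
    cache_accesses = 6 * (n - 1) + 5  # every access except the first read
    memory_accesses = 1               # the single cache (de)fault
    return cache_accesses * CACHE_NS + memory_accesses * MEMORY_NS


def time_without_cache_ns(n):
    """ASSUMPTION for comparison: all 6n accesses go to main memory."""
    return 6 * n * MEMORY_NS


for n in (1, 10, 1000):
    print(n, time_with_cache_ns(n), time_without_cache_ns(n))
# prints:
# 1 105 600
# 10 159 6000
# 1000 6099 600000
```

The fixed 100 ns cost of the single miss is spread over more and more accesses as n grows, so the average time per access approaches the 1 ns cache latency.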