ICC Module 3 Lesson 2 – Memory Hierarchies 1 / 13 © 2015 Ph. Janson Information, Computing & Communication Memory Hierarchies – Clip 9 – Locality School of Computer Science & Communications B. Falsafi (charts), Ph. Janson (commentary)
Outline
► Clip 1 – Technologies
► Clip 2 – Concept
► Clip 3 – Principle
► Clip 4 – Implementation
► Clip 5 – Reading memory
► Clip 6 – Writing memory
► Clip 7 – Cache management – the Least Recently Used algorithm
► Clip 8 – A simulated example
► Clip 9 – Locality

Have a closer look at cache accesses
While n > 0
    s ← s + n
    n ← n – 1
[Figure: processor, cache and main memory holding m, n and s. Annotated steps of the loop: load, place, return 2; (in cache) return 0; (in cache) return 2; add s, n (0 + 2); write (in cache); (in cache) return 2; add n, -1 (2 – 1); write (in cache). Annotation on the loop test: (in cache in n-1 cases).]
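
The loop on this slide can be written in C as follows. This is only a minimal sketch: the function name `sum_down` is hypothetical, and the comments list the reads and writes that each statement performs in the simple model of this lesson, where every variable lives in memory.

```c
#include <assert.h>

/* Hypothetical helper: adds n + (n-1) + ... + 1 with the loop of the slide. */
int sum_down(int n)
{
    assert(n >= 0);      /* the slide's example starts with n = 2 */
    int s = 0;           /* the slide's example starts with s = 0 */
    while (n > 0) {      /* reads n */
        s = s + n;       /* reads s and n, writes s */
        n = n - 1;       /* reads n, writes n */
    }
    return s;
}
```

Calling `sum_down(2)` follows the trace of the slide: s goes from 0 to 2 and then to 3, while n goes from 2 to 1 and then to 0.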

Two things are happening here:
Fact 1: identical addresses are re-accessed over time
[Figure: the same cache-access trace of the loop (While n > 0: s ← s + n, n ← n – 1), with processor, cache and main memory holding m, n and s.]

That is what is called temporal locality
► The data is found in the cache because identical addresses are accessed several times within a short period of time
► This is typical of practical reality
All “interesting” algorithms include loops that re-access the same variables many times
(During a week of winter sports, you reuse your snowboard every day)

Two things are happening here:
Fact 2: addresses in the same block are re-accessed over time
[Figure: the same cache-access trace of the loop (While n > 0: s ← s + n, n ← n – 1), with n and s held in the same cache block.]

This is what is called spatial locality
► The data is found in the cache because different addresses within a short range of space (a block) are accessed within a short period of time
► This is typical of practical reality
All “interesting” algorithms work with related and closely located variables
(When you go skiing, you need your left ski and your right ski … and your ski boots)
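
Both kinds of locality can be seen in a single small loop. The sketch below is an illustration only: the function name `sum_array` is hypothetical, and the comments assume the simple model of this lesson, where every variable lives in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sums the len elements of array a. */
long sum_array(const int *a, size_t len)
{
    assert(a != NULL || len == 0);
    long acc = 0;                       /* acc and i are re-accessed at every  */
    for (size_t i = 0; i < len; i++) {  /* iteration: temporal locality        */
        acc += a[i];                    /* a[0], a[1], ... are neighbours in   */
    }                                   /* memory: spatial locality            */
    return acc;
}
```

Each block loaded for one element of `a` also brings in its neighbours, which the following iterations then find in the cache.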

What if blocks were smaller?
► 2 words instead of 4
► More, smaller blocks
► n and s would be in different blocks
► Which would cause 2 cache misses instead of 1
► Less spatial locality
► But performance is maintained with a smaller cache
[Figure: processor, cache and main memory divided into 2-word blocks (block addresses 0, 2, …, 12, 14), with n and s in separate blocks.]

What if blocks were larger?
► 8 words instead of 4
► Fewer blocks in the cache at the same time
► m, n, and s no longer fit in the cache at the same time
► Which would cause 2 cache misses instead of 1 at each execution of the program
► Less temporal locality
► But this can be compensated by a larger cache
[Figure: processor, cache and main memory divided into 8-word blocks holding m, n and s.]

Spatial vs. temporal locality & block size
► Fewer large blocks: better spatial locality, worse temporal locality
► More small blocks: better temporal locality, worse spatial locality
[Figure: cache misses plotted against block size.]
The optimum depends on the cache size and on the number and usage of the variables of the program
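
The trade-off can be tried out with a small simulation. The sketch below rests on several assumptions that are not in the slides: the word addresses of m, n and s (0, 12 and 14), an 8-word cache, and a program that copies m into n before the loop and s into m after it. The cache is fully associative and replaced with the Least Recently Used algorithm; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { M_ADDR = 0, N_ADDR = 12, S_ADDR = 14 };  /* assumed word addresses */
enum { MAX_BLOCKS = 16 };

/* Assumed program: n <- m; s <- 0; while n > 0 { s <- s + n; n <- n - 1 }; m <- s.
   Writes the sequence of accessed word addresses and returns its length. */
size_t build_trace(int *trace, size_t cap, int n_value)
{
    assert(n_value >= 0 && cap >= 5 + 6 * (size_t)n_value);
    size_t k = 0;
    trace[k++] = M_ADDR; trace[k++] = N_ADDR;    /* n <- m */
    trace[k++] = S_ADDR;                         /* s <- 0 */
    for (int n = n_value; n > 0; n--) {
        trace[k++] = N_ADDR;                     /* test n > 0 */
        trace[k++] = S_ADDR; trace[k++] = N_ADDR;
        trace[k++] = S_ADDR;                     /* s <- s + n */
        trace[k++] = N_ADDR; trace[k++] = N_ADDR;/* n <- n - 1 */
    }
    trace[k++] = S_ADDR; trace[k++] = M_ADDR;    /* m <- s */
    return k;
}

/* Counts the misses of a fully associative LRU cache of cache_words words,
   organised in blocks of block_words words, on a trace of word addresses. */
int count_misses(const int *trace, size_t len, int cache_words, int block_words)
{
    assert(block_words > 0 && cache_words >= block_words);
    int num_blocks = cache_words / block_words;
    assert(num_blocks <= MAX_BLOCKS);

    int tags[MAX_BLOCKS];        /* block number held by each cache slot */
    long last_used[MAX_BLOCKS];  /* time of last access, -1 if slot is empty */
    for (int b = 0; b < num_blocks; b++) { tags[b] = -1; last_used[b] = -1; }

    int misses = 0;
    for (size_t t = 0; t < len; t++) {
        int block = trace[t] / block_words;
        int slot = -1, victim = 0;
        for (int b = 0; b < num_blocks; b++) {
            if (last_used[b] >= 0 && tags[b] == block) slot = b;   /* hit */
            if (last_used[b] < last_used[victim]) victim = b;      /* oldest */
        }
        if (slot < 0) {          /* miss: replace the least recently used slot */
            misses++;
            slot = victim;
            tags[slot] = block;
        }
        last_used[slot] = (long)t;
    }
    return misses;
}
```

Under these assumptions, with n initially 2, the trace causes 3 misses with 2-word blocks (m, n and s each need their own block), 2 misses with 4-word blocks (n and s share a block) and 3 misses with 8-word blocks (the single block holding n and s evicts m, which must be reloaded at the end).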

Programmers cannot ignore locality

Adding the elements of a matrix by rows or by columns

/* Variant 1: row by row */
for (i = 0; i < n; i++) {        /* for each row ... */
    for (j = 0; j < n; j++) {    /* ... visit every column */
        acc += m[i][j];
    }
}

/* Variant 2: column by column */
for (j = 0; j < n; j++) {        /* for each column ... */
    for (i = 0; i < n; i++) {    /* ... visit every row */
        acc += m[i][j];
    }
}
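
Why the order of the two loops matters can be estimated without any timing. The sketch below is a deliberately crude model and not a measurement: it assumes a cache that holds a single block and an n x n matrix stored row by row, as C does, so that element m[i][j] sits at word i*n + j. The function name `single_block_misses` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Misses of an assumed one-block cache when an n x n row-major matrix is
   traversed by rows (by_rows != 0) or by columns (by_rows == 0). */
size_t single_block_misses(size_t n, size_t block_words, int by_rows)
{
    assert(block_words > 0);
    size_t misses = 0;
    size_t current = (size_t)-1;               /* no block loaded yet */
    for (size_t outer = 0; outer < n; outer++) {
        for (size_t inner = 0; inner < n; inner++) {
            size_t i = by_rows ? outer : inner;    /* row index */
            size_t j = by_rows ? inner : outer;    /* column index */
            size_t block = (i * n + j) / block_words;
            if (block != current) {            /* the block must be loaded */
                misses++;
                current = block;
            }
        }
    }
    return misses;
}
```

For an 8 x 8 matrix and 4-word blocks, this model counts 16 misses by rows (one per block) and 64 misses by columns (one per element), because consecutive accesses in a column are 8 words apart and never fall in the same block.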

Results