Cache Performance Analysis of Traversals and Random Accesses R. E. Ladner, J. D. Fix, and A. LaMarca Presented by Tomer Shiran.


1 Cache Performance Analysis of Traversals and Random Accesses R. E. Ladner, J. D. Fix, and A. LaMarca Presented by Tomer Shiran

2 The Model A large memory: M blocks. A smaller cache: C blocks. We examine only direct-mapped caches. Each memory block x maps to exactly one cache block y, where y = x mod C.

3 The Model (2)

4 The Model (3) There are n different memory blocks that map to each cache block. Thus, M=nC

5 Algorithms and Cache An algorithm is simply a sequence of accesses to blocks in memory. We assume that initially none of the blocks to be accessed are in the cache. A read or write to a variable that is part of a block is modeled as one access to the block. We do not distinguish between reads and writes, since a copy-back architecture with a write buffer is assumed. An access to a memory block x is a hit if x is in the cache and a miss otherwise. The cache performance of an algorithm is measured by the number of misses it incurs.
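The model above can be sketched as a small miss counter. This is a minimal illustration, assuming memory blocks are identified by integers and the cache starts empty; the function name `count_misses` is hypothetical and not from the presentation.

```python
def count_misses(accesses, cache_blocks):
    """Count misses for a sequence of memory-block numbers in a direct-mapped cache.

    accesses     -- iterable of integer memory-block numbers (the "algorithm")
    cache_blocks -- C, the number of blocks in the cache
    """
    # cache[y] holds the memory block currently resident in cache block y.
    cache = [None] * cache_blocks
    misses = 0
    for x in accesses:
        y = x % cache_blocks      # direct mapping: y = x mod C
        if cache[y] != x:         # x is not resident, so this access is a miss
            misses += 1
            cache[y] = x          # x evicts whatever was in cache block y
    return misses


# With C = 4, memory blocks 0 and 4 conflict in cache block 0.
example = count_misses([0, 0, 4, 0, 1, 1], cache_blocks=4)
```

Reads and writes are treated identically here, matching the slide's simplification.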

6 Traversals A traversal with block access rate K accesses each block of a contiguous array of N/K blocks exactly K times (we always assume that K divides N). There are a total of N accesses in a traversal. Two types of traversals: scan traversal and permutation traversal.

7 Scan Traversals A scan traversal accesses the first block K times, then the second block K times, and so forth (for a total of N/K blocks and N accesses). Scan traversals are extremely common in algorithms that manipulate arrays: if B array elements fit in a block, then a left-to-right traversal of the array is a scan traversal with block access rate B. [P-5.1] A scan traversal with block access rate K has 1/K cache misses per access.
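Proposition 5.1 can be checked with a short sketch. It assumes an initially empty direct-mapped cache and integer block numbers; `scan_traversal` and `miss_rate` are illustrative names, not part of the presentation.

```python
def scan_traversal(n_accesses, k):
    """Return the access sequence of a scan traversal with block access rate k."""
    if n_accesses % k != 0:
        raise ValueError("K must divide N")
    # Block 0 is accessed k times, then block 1 k times, and so on.
    return [block for block in range(n_accesses // k) for _ in range(k)]


def miss_rate(accesses, cache_blocks):
    """Misses per access in a direct-mapped cache with cache_blocks blocks."""
    cache = [None] * cache_blocks
    misses = 0
    for x in accesses:
        y = x % cache_blocks
        if cache[y] != x:
            misses += 1
            cache[y] = x
    return misses / len(accesses)


# N = 64 accesses, K = 4, C = 8: only the first access to each block misses.
rate = miss_rate(scan_traversal(64, 4), cache_blocks=8)
```

Each block is touched K times in a row, so only the first of those K accesses can miss, which gives the 1/K rate regardless of the cache size.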

8 Permutation Traversals Consider the multiset S that contains K copies of each x with 0 ≤ x < N/K. Let $\sigma = \sigma_1 \sigma_2 \dots \sigma_N$ be a permutation of S, chosen uniformly at random. If $\sigma_i = x$, then the i-th access (out of N) in the permutation traversal is to x. At any point in the permutation traversal, if there are k accesses remaining and memory block x has j accesses remaining, then memory block x is chosen for the next access with probability j/k.

9 Hit Rate of Permutation Traversals [T-5.1] Assuming all permutations are equally likely, a permutation traversal with block access rate K of N/K contiguous memory blocks has the following number of misses per access:

10 Hit Rate of Permutation Traversals (2) x is a particular cache block. $m_1, m_2, \dots, m_n$ are the memory blocks in the region accessed by the traversal that map to cache block x (N = nCK). During the traversal, nK accesses will be made to x. $B_i = j$ whenever the i-th access that maps to x is to location $m_j$ (1 ≤ i ≤ nK).

11 Hit Rate of Permutation Traversals (3) $X_{ij}$ is a random variable that indicates whether the i-th access that maps to x is a hit to location $m_j$. The first access to x is always a miss, so $X_{1j} = 0$ for all j. For i > 1 (and i ≤ nK) we have the following:

12 Hit Rate of Permutation Traversals (4) For a traversal, the expected number of hits at x is then: For the expected number of hits incurred by the traversal for all cache blocks, we need to multiply the result by the number of cache blocks:
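A small Monte Carlo sketch can check the argument above. Under the stated model, two consecutive accesses that map to the same cache block x hit exactly when they go to the same memory block. For a uniformly random permutation this happens with probability (K-1)/(nK-1), and summing over the nK-1 accesses after the first gives K-1 expected hits per cache block, hence 1 - C(K-1)/N misses per access. This closed form is derived here from the slides' definitions rather than copied from the presentation, and all function names are illustrative.

```python
import random


def permutation_traversal(n_accesses, k, rng):
    """Uniformly random permutation of the multiset with k copies of each block."""
    blocks = [b for b in range(n_accesses // k) for _ in range(k)]
    rng.shuffle(blocks)
    return blocks


def miss_rate(accesses, cache_blocks):
    """Misses per access in an initially empty direct-mapped cache."""
    cache = [None] * cache_blocks
    misses = 0
    for x in accesses:
        y = x % cache_blocks
        if cache[y] != x:
            misses += 1
            cache[y] = x
    return misses / len(accesses)


def predicted_miss_rate(n_accesses, k, cache_blocks):
    """Derived estimate 1 - C(K-1)/N; assumes N = n*C*K for an integer n >= 1."""
    return 1 - cache_blocks * (k - 1) / n_accesses


def average_miss_rate(n_accesses, k, cache_blocks, trials=400, seed=1):
    """Average the simulated miss rate over several random permutations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += miss_rate(permutation_traversal(n_accesses, k, rng), cache_blocks)
    return total / trials


# Example parameters: C = 16, K = 4, n = 8, so N = n*C*K = 512.
simulated = average_miss_rate(512, 4, 16)
predicted = predicted_miss_rate(512, 4, 16)
```

For K = 1 the estimate reduces to one miss per access, and for n = 1 it reduces to the scan-traversal rate 1/K, which is a useful consistency check on the derivation.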

13 Tree Traversals – An Example The nodes of the tree are allocated contiguously in memory. L is the number of tree nodes that fit in a single cache block, so K = 3L (each node is accessed three times: its key and its two child pointers). Even if the tree is arbitrary, the permutation traversal that arises from a preorder traversal is not completely arbitrary: when the key of a node is visited, the next access will always be to pL (the left child pointer); pR (the right child pointer) will be accessed next for the majority of nodes (the leaves), or may be accessed soon after. Therefore, we model the accesses to the keys as a permutation traversal with K = L, and the remaining accesses to the child pointers as hits.

14 Tree Traversals – An Example (2) The total number of misses in a preorder traversal is: This result was validated with an implementation in C on a DEC Alpha (the memory access was monitored using Atom), and was found to be extremely accurate!

15 Random Access In a random access pattern, each block x of memory is accessed statistically (in other words, on a given access, x is accessed with some probability). We make the independent reference assumption. The analysis of a set of random access patterns is called collective analysis.

16 Collective Analysis The cache is partitioned into a set R of regions. The accesses are partitioned into a set P of processes. The processes are used to model accesses to different portions of memory that map to the same portion of the cache (a single process doesn't access different data items that conflict in the cache). $\lambda_{ij}$ is the probability that region i is accessed by process j. $r_i$ is the size of region i in blocks. $\lambda_i$ is the probability that region i is accessed.

17 Collective Analysis (2) [P-6.1] In a system of random accesses, in the limit as the number of accesses goes to infinity, the expected number of misses per access is: We define the following quantities:
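The steady-state quantities can be sketched from the model's hit rule. In collective analysis an access by process j to a cache block is a hit when the previous access to that block was also by process j. The sketch below assumes independent references and uniform access within a region, and takes the region probability as $\lambda_i = \sum_j \lambda_{ij}$, the per-region hit rate as $\eta_i = \sum_j (\lambda_{ij}/\lambda_i)^2$, and the overall hit rate as $\eta = \sum_i \lambda_i \eta_i$, so the expected misses per access are 1 - η. These expressions are a reconstruction from that rule, not a transcription of the slide, and `hit_rates` is a hypothetical name.

```python
def hit_rates(lam):
    """Steady-state hit rates for a system of random accesses.

    lam[i][j] is the probability that a given access is made to region i
    by process j; the entries of lam are assumed to sum to 1.
    Returns (region_probs, region_hit_rates, overall_hit_rate).
    """
    # lambda_i: probability that region i is accessed by any process.
    region_probs = [sum(row) for row in lam]

    # eta_i: probability that two consecutive accesses to the same block of
    # region i come from the same process (assumed hit condition).
    region_hit_rates = []
    for row, lam_i in zip(lam, region_probs):
        if lam_i == 0:
            region_hit_rates.append(0.0)
        else:
            region_hit_rates.append(sum((p / lam_i) ** 2 for p in row))

    # eta: hit rate averaged over regions, weighted by access probability.
    overall = sum(l * e for l, e in zip(region_probs, region_hit_rates))
    return region_probs, region_hit_rates, overall


# One region, two equally likely processes.
probs, etas, eta = hit_rates([[0.5, 0.5]])
```

With one region and two equally likely processes this yields η = 1/2, and with a single process it yields η = 1.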

18 Random Access for a Finite Period Proposition 6.1 gives the expected miss ratio if we think of a system of random accesses running forever In some cases we are interested in the number of misses that occur in N accesses [L-6.1] In a system of random accesses, for each block in region i, the expected number of misses in N accesses is:

19 Random Access for a Finite Period (2) x is a particular block in region i. $\rho_{ik}$ is the probability that the k-th access is a miss at block x. $q_{ik}$ is the probability that the k-th access is a hit to x given that it is an access to x (i.e., $q_{ik}$ is the hit ratio of x at access k).

20 Random Access for a Finite Period (3) From Lemma 6.1 (which we just proved), we can find the expected number of misses in all the N accesses [T-6.1] In a system of random accesses, the expected number of misses per access in N accesses is: As N goes to infinity the expected number of misses per access goes to 1-η, the expected miss rate from Proposition 6.1

21 Random Access for a Finite Period (4) In the simplest case, there is only one process and one region. In the collective analysis model, an access to a block in a direct-mapped cache by process j will be a hit if no other process has accessed the block since the last access by process j. When there is only one process, every access to a block after the first is a hit, so η = 1. As a consequence the expected number of misses per access simplifies to:
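The single-process case can be worked out directly. With one process and one region, only the first access to each block misses, so the number of misses in N accesses equals the number of distinct blocks touched. Assuming the region has C blocks and each access picks one uniformly at random (uniformity is an assumption of this sketch), a given block is untouched after N accesses with probability $(1 - 1/C)^N$, giving $\frac{C}{N}\left(1 - (1 - 1/C)^N\right)$ expected misses per access. The expression is derived here rather than taken from the slide, and the function names are illustrative.

```python
import random


def expected_misses_per_access(cache_blocks, n_accesses):
    """Closed form (C/N) * (1 - (1 - 1/C)**N) under the uniform-access assumption."""
    c = cache_blocks
    return c * (1 - (1 - 1 / c) ** n_accesses) / n_accesses


def simulated_misses_per_access(cache_blocks, n_accesses, trials=500, seed=2):
    """Estimate the same quantity by counting distinct blocks touched."""
    rng = random.Random(seed)
    total_misses = 0
    for _ in range(trials):
        touched = set()
        for _ in range(n_accesses):
            touched.add(rng.randrange(cache_blocks))
        # Only the first access to each block misses.
        total_misses += len(touched)
    return total_misses / (trials * n_accesses)


# Example parameters: C = 64 blocks, N = 128 accesses.
closed_form = expected_misses_per_access(64, 128)
estimate = simulated_misses_per_access(64, 128)
```

As N grows the closed form tends to 0, which agrees with a steady-state miss rate of 1 - η = 0.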

22 Interaction of a Scan Traversal with a System of Random Access Suppose we have a system of accesses that consists of a scan traversal with block access rate K to some segment of memory, interleaved with a system of random accesses to another segment of memory that makes L accesses per traversal access. The pattern of access is described by the regular expression $(t_1 r^L\, t_2 r^L \cdots t_K r^L)^*$, where a sequence $t_1 t_2 \dots t_K$ indicates K accesses to the same block and r represents a random access. We assume that the system of random accesses has regions R and processes P, and that the probability that process j accesses region i is $\lambda_{ij}$. As before, region i has $r_i$ blocks.

23 Scan Traversal with Access Rate 1 In this case K = 1 and we are analyzing the access pattern described by the regular expression $(t\,r^L)^*$, where t indicates a traversal access and r indicates a random access. N is the total number of accesses, and we assume that (1+L)C divides N. A traversal access is always a miss, because K = 1 and the traversal accesses and random accesses are to different memory segments. The number of traversal misses is N/(1+L).

24 Scan Traversal with Access Rate 1 (2) Consider a block x in region i. Once every C traversal accesses, the traversal captures block x (i.e., the traversal accesses a memory block that maps to x). During the next C-1 traversal accesses, a random access might be made to the block that was evicted from x by the traversal. By Lemma 6.1 (with N = LC), the expected number of misses per block of region i in the random accesses during C traversal accesses is: The expected number of misses, both traversal and random accesses, during C traversal accesses is:

25 Scan Traversal with Access Rate 1 (3) [T-7.1] In a system consisting of a scan traversal with access rate 1 and a system of random accesses with L accesses per traversal access, the expected number of misses per access is:

26 Scan Traversal with Access Rate 1 (4)

27 Scan Traversal with Access Rate 1 (5) Assume there is one region of size C and two processes, each equally likely to access a given block: $r_1 = C$, $\lambda_1 = 1$, and $\eta = \eta_1 = 1/2$. For large C, the previous formula (Theorem 7.1) evaluates to approximately: For L = 1 (creating the access pattern $(tr)^*$) this formula evaluates to approximately 0.91 misses per access. As L grows, the number of misses per access approaches 0.5, which is what one would expect from the system of random accesses without any interaction with a traversal.
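The two numeric claims in this example can be reproduced with a simulation sketch. It assumes each of the two processes owns one memory block per cache block, random accesses pick a cache block and a process uniformly, and the scan always touches a fresh memory block. The closed form `approx_misses_per_access` is a large-C approximation, $(L + 3 - e^{-L}) / (2(L + 1))$, derived here under those assumptions so that it matches the stated values; it is not transcribed from the slide, and all names are illustrative.

```python
import math
import random


def approx_misses_per_access(l):
    """Large-C approximation (L + 3 - exp(-L)) / (2 (L + 1)), derived for this sketch."""
    return (l + 3 - math.exp(-l)) / (2 * (l + 1))


def simulate(cache_blocks, l, passes, seed=0):
    """Simulate the pattern (t r^L)* on a direct-mapped cache and return misses per access."""
    rng = random.Random(seed)
    cache = [None] * cache_blocks
    misses = 0
    accesses = 0
    scan_block = 0
    for _ in range(passes * cache_blocks):
        # Traversal access: a never-before-seen block, so it always misses.
        cache[scan_block % cache_blocks] = ("scan", scan_block)
        scan_block += 1
        misses += 1
        accesses += 1
        # L random accesses: uniform cache block, one of two processes.
        for _ in range(l):
            y = rng.randrange(cache_blocks)
            resident = ("rand", rng.randrange(2))
            if cache[y] != resident:
                misses += 1
                cache[y] = resident
            accesses += 1
    return misses / accesses


# L = 1 gives the access pattern (tr)*.
simulated = simulate(cache_blocks=128, l=1, passes=200)
approximate = approx_misses_per_access(1)
```

For L = 1 both the simulation and the approximation land near 0.91, and for large L the approximation tends to 0.5, the miss rate of the random accesses alone.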

28 Any Questions?

