Computer Structure 2012 – VM 1 Computer Structure: X86 Virtual Memory and TLB. Franck Sala. Slides from Lihu and Adi's lecture.

Computer Structure 2012 – VM 2 Virtual Memory
• Provides the illusion of a large memory
• Different machines have different amounts of physical memory
  – Allows programs to run regardless of the actual physical memory size
• The amount of memory consumed by each process is dynamic
  – Allows adding memory as needed
• Many processes can run on a single machine
  – Provides each process with its own memory space
  – Prevents a process from accessing the memory of other processes running on the same machine
  – Allows the sum of the memory spaces of all processes to be larger than physical memory
• Basic terminology
  – Virtual address space: the address space used by the programmer
  – Physical address: the actual physical memory address space

Computer Structure 2012 – VM 3 Virtual Memory: Basic Idea
• Divide memory (virtual and physical) into fixed-size blocks
  – Pages in virtual space, frames in physical space
  – Page size = frame size
  – Page size is a power of 2: page size = 2^k
• All pages in the virtual address space are contiguous
• Pages can be mapped into physical frames in any order
• Some of the pages are in main memory (DRAM), some of the pages are on disk
• All programs are written using the virtual memory address space
• The hardware does on-the-fly translation between the virtual and physical address spaces
  – Uses a page table to translate between virtual and physical addresses

Computer Structure 2012 – VM 4 Virtual to Physical Address Translation
• Page size: 2^12 bytes = 4KB, so the page offset is bits [11:0] and is copied unchanged from the virtual address to the physical address
• Virtual address = virtual page number [47:12] + page offset [11:0]
• Physical address = physical page number [38:12] + page offset [11:0]
• The page table (located via the page table base register) maps each virtual page number to a page table entry holding:
  – Valid (V) and dirty (D) bits
  – Access control (AC): memory type (WB, WT, UC, WP, …), user / supervisor
  – Physical page number
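A minimal Python sketch of this translation, assuming 4KB pages and a flat page table modeled as a dictionary from virtual page number to (valid, dirty, physical page number); all values below are made up for illustration.

```python
PAGE_SHIFT = 12                       # 4KB pages: offset is bits [11:0]
PAGE_SIZE  = 1 << PAGE_SHIFT

# Hypothetical page table: virtual page number -> (valid, dirty, physical page number)
page_table = {0x2_3456: (1, 0, 0x1ABC)}

def translate(vaddr):
    vpn    = vaddr >> PAGE_SHIFT      # virtual page number, bits [47:12]
    offset = vaddr & (PAGE_SIZE - 1)  # page offset, bits [11:0], copied unchanged
    valid, dirty, ppn = page_table[vpn]   # a missing entry would mean a page fault
    if not valid:
        raise RuntimeError("page fault")
    return (ppn << PAGE_SHIFT) | offset

print(hex(translate(0x2_3456_789)))   # -> 0x1abc789
```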

Computer Structure 2012 – VM 5 Translation Lookaside Buffer (TLB)
• The page table resides in memory, so each translation would require an extra memory access
• The TLB caches recently used PTEs
  – Speeds up translation
  – Typically 128 to 256 entries, 4- to 8-way set associative
• TLB indexing: the virtual page number is split into a set index and a tag; the page offset is not used in the lookup
• On a TLB hit, the physical address is formed directly from the cached PTE
• On a TLB miss, the Page Miss Handler (hardware PMH) gets the PTE from the page table in memory
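A minimal Python sketch of such a set-associative TLB lookup, assuming a 256-entry, 4-way organization (64 sets) and splitting the VPN into set index and tag as in the slide's diagram; the data structures and the simple FIFO replacement are illustrative, not the hardware's.

```python
WAYS, SETS = 4, 64                    # 256-entry, 4-way TLB -> 64 sets

# tlb[set] is a small list of (tag, pte) pairs; contents are illustrative
tlb = [[] for _ in range(SETS)]

def tlb_lookup(vpn):
    idx = vpn % SETS                  # low VPN bits select the set
    tag = vpn // SETS                 # remaining VPN bits are the tag
    for t, pte in tlb[idx]:
        if t == tag:
            return pte                # TLB hit: PTE without a memory access
    return None                       # TLB miss: the PMH walks the page table

def tlb_fill(vpn, pte):
    idx, tag = vpn % SETS, vpn // SETS
    ways = tlb[idx]
    if len(ways) == WAYS:
        ways.pop(0)                   # simple FIFO replacement for the sketch
    ways.append((tag, pte))
```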

Computer Structure 2012 – VM 6 Virtual Memory and Cache
• TLB access is serial with cache access
• Page table entries are cached in the L1 D$, L2$ and L3$ as data
• Flow: the virtual address accesses the TLB (and STLB); on a hit, the physical address accesses the L1 cache, then L2, then memory until the data is found; on a TLB/STLB miss, a page walk gets the PTE from the memory hierarchy

Computer Structure 2012 – VM 7 Glossary
• Page fault
  – The page is not in main memory
  – Context switch
  – Demand-load policy
• Access control / protection fault
• Context switch
• Page aliasing
  – DLLs, shared code, large malloc
• Page swap-out
  – Snoop-invalidate the page's data in the caches (both L1 and L2)
    • All data in the caches which belongs to that page is invalidated; the page on the disk is up to date
  – If the TLB hits for the swapped-out page, the TLB entry is invalidated
  – In the page table
    • The valid bit in the PTE of the swapped-out page is set to 0
    • All the remaining bits in the PTE may be used by the operating system for keeping the location of the page on the disk

Computer Structure 2012 – VM 8 32-bit Mode: 4KB / 4MB Page Mapping
• 2-level hierarchical mapping: page directory and page tables
  – Tables are 4KB aligned
• PDE
  – Present (0 = page fault)
  – Page size (4KB or 4MB)
• CR4.PSE=1: both 4MB and 4KB pages are supported
  – Separate TLBs
• 4KB page: the linear address is split into DIR [31:22], TABLE [21:12] and OFFSET [11:0]; CR3 (PDBR) points to a 1K-entry page directory, the PDE points to a 1K-entry page table, and the PTE supplies the 20-bit (4KB-aligned) frame address, which is combined with the 12-bit offset into the 32-bit physical address (20 + 12 = 32)
• 4MB page: the linear address is split into DIR [31:22] and OFFSET [21:0]; the PDE points directly to the 4MB page
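The field boundaries above can be checked with a short sketch; the shifts and masks below follow the DIR / TABLE / OFFSET split on this slide (the contents of the page directory and page tables are not modeled).

```python
def split_32bit_4k(lin):
    dir_idx   = (lin >> 22) & 0x3FF   # DIR, bits [31:22] -> 1K-entry page directory
    table_idx = (lin >> 12) & 0x3FF   # TABLE, bits [21:12] -> 1K-entry page table
    offset    =  lin        & 0xFFF   # OFFSET, bits [11:0] -> byte within the 4KB page
    return dir_idx, table_idx, offset

def split_32bit_4m(lin):
    # A 4MB page is mapped directly by the PDE, so bits [21:0] become the offset
    return (lin >> 22) & 0x3FF, lin & 0x3FFFFF

d, t, o = split_32bit_4k(0x1234_5678)
print(hex(d), hex(t), hex(o))         # 0x48 0x345 0x678
```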

Computer Structure 2012 – VM 9 32-bit Mode: PDE and PTE Format
• 20-bit pointer to a 4KB-aligned address (page frame address, bits [31:12])
• Virtual memory bits
  – Present
  – Accessed
  – Dirty (in the PTE only)
  – Page size (in the PDE only)
  – Global
• Protection bits
  – Writable (R/W)
  – User / Supervisor – only 2 privilege levels
• Caching bits
  – Page Write-Through (PWT)
  – Page Cache Disable (PCD)
  – PAT – page table attribute index
• 3 bits available for OS use (AVAIL)
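A small sketch that decodes the flag bits listed above from a 32-bit PTE; the bit positions follow the standard IA-32 4KB-PTE layout (Present = bit 0, Writable = bit 1, and so on), and the sample value is made up.

```python
# Low 12 bits of a 4KB PTE; the page frame address occupies bits [31:12]
FLAGS = {0: "Present", 1: "Writable", 2: "User/Supervisor", 3: "Write-Through",
         4: "Cache Disable", 5: "Accessed", 6: "Dirty", 7: "PAT", 8: "Global"}

def decode_pte(pte):
    frame = pte & 0xFFFFF000                  # 20-bit frame address, bits [31:12]
    flags = [name for bit, name in FLAGS.items() if pte & (1 << bit)]
    return hex(frame), flags

print(decode_pte(0x0012_3067))   # frame 0x123000; Present, Writable, User/Supervisor, Accessed, Dirty
```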

Computer Structure 2012 – VM 10 4KB Page Mapping with PAE
• Physical addresses are extended to M bits; the linear address remains 32 bits (1995: Pentium Pro and later)
• The linear address is split into Dir-ptr [31:30], DIR [29:21], TABLE [20:12] and OFFSET [11:0]
• CR3 (PDPTR, 32-byte aligned) points to a 4-entry page directory pointer table; each of its entries points to a 512-entry page directory; each PDE points to a 512-entry page table; the PTE supplies the (M−12)-bit frame address of the 4KB page
• Table entries are widened to 64 bits to hold the extended physical address

Computer Structure 2012 – VM 11 2MB Page Mapping with PAE
• Physical addresses are extended to M bits; the linear address remains 32 bits
• The linear address is split into Dir-ptr [31:30], DIR [29:21] and OFFSET [20:0]
• CR3 (PDPTR, 32-byte aligned) points to a 4-entry page directory pointer table; each entry points to a 512-entry page directory; the PDE supplies the (M−21)-bit frame address of the 2MB page
• (For comparison, without PAE a 4MB page uses DIR [31:22] and OFFSET [21:0], with CR3 as the PDBR)

Computer Structure 2012 – VM 12 4KB Page Mapping in 64-bit Mode (2003: AMD Opteron and later)
• The linear address is sign-extended above bit 47 and split into PML4 [47:39], PDP [38:30], DIR [29:21], TABLE [20:12] and OFFSET [11:0]
• CR3 (4KB aligned, 40 bits) points to a 512-entry PML4 table; the PML4 entry points to a 512-entry page directory pointer table; the PDP entry points to a 512-entry page directory; the PDE points to a 512-entry page table; the PTE supplies the (M−12)-bit frame address of the 4KB page
• Table entries are 64 bits wide
• 256 TB of virtual memory (2^48), 1 TB of physical memory (2^40)
• PML4: Page Map Level 4; PDP: Page Directory Pointer
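A sketch of the 4-level field split; the boundaries are those given on this slide (each table has 512 entries, i.e. 9 index bits), and the example address is arbitrary.

```python
def split_64bit_4k(lin):
    return {
        "PML4":  (lin >> 39) & 0x1FF,   # bits [47:39]
        "PDP":   (lin >> 30) & 0x1FF,   # bits [38:30]
        "DIR":   (lin >> 21) & 0x1FF,   # bits [29:21]
        "TABLE": (lin >> 12) & 0x1FF,   # bits [20:12]
        "OFFSET": lin        & 0xFFF,   # bits [11:0]
    }

print(split_64bit_4k(0x0000_7F8A_1234_5678))
```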

Computer Structure 2012 – VM 13 2MB Page Mapping in 64-bit Mode
• The linear address is sign-extended and split into PML4 [47:39], PDP [38:30], DIR [29:21] and OFFSET [20:0]
• CR3 (4KB aligned) points to a 512-entry PML4 table; the PML4 entry points to a 512-entry page directory pointer table; the PDP entry points to a 512-entry page directory; the PDE supplies the (M−21)-bit frame address of the 2MB page

Computer Structure 2012 – VM 14 1GB Page Mapping in 64-bit Mode
• The linear address is sign-extended and split into PML4 [47:39], PDP [38:30] and OFFSET [29:0]
• CR3 (4KB aligned) points to a 512-entry PML4 table; the PML4 entry points to a 512-entry page directory pointer table; the PDP entry supplies the (M−30)-bit frame address of the 1GB page

Computer Structure 2012 – VM 15 TLBs
• The processor saves the most recently used PDEs and PTEs in TLBs
  – Separate TLBs for data and instructions
  – Separate TLBs for 4KB and 2/4MB page sizes

Computer Structure 2012 – VM 16 Question 1
• We have a core similar to x86
  – 64-bit mode
  – Supports small pages (mapped by a PTE) and large pages (mapped by a DIR entry)
  – The page table size at each level of the hierarchy is the size of a small page
  – The entry size in the page tables is 16 bytes, at all levels
• The virtual address is sign-extended and split into PML4 [N4:N3+1], PDP [N3:N2+1], DIR [N2:N1+1], TABLE [N1:12] and OFFSET [11:0]
• What is the size of a small page?
  – 12 bits in the offset field → 2^12 B = 4KB
• How many entries are in each page table?
  – Page table size = page size = 4KB; entry size = 16B → 4KB / 16B = 2^12 / 2^4 = 2^8 = 256 entries in each page table

Computer Structure 2012 – VM 17 Question 1 (cont.)
(Recall: 64-bit mode with large and small pages, page table size = page size = 4KB, 16B entries, 256 entries per table.)
• What are the values of N1, N2, N3 and N4?
  – Since we have 256 entries in each table, we need 8 bits to address them
  – TABLE [19:12] → N1 = 19
  – DIR [27:20] → N2 = 27
  – PDP [35:28] → N3 = 35
  – PML4 [43:36] → N4 = 43
• What is the size of a large page?
  – Large pages are mapped by a DIR entry, so the large-page offset is 20 bits [19:0] → large page size: 2^20 B = 1MB
  – Equivalently: a DIR entry covers 256 pages of 4KB = 1MB
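The arithmetic above can be reproduced directly; the sketch below assumes only the numbers given in the question (4KB tables, 16-byte entries) and prints the resulting field boundaries.

```python
PAGE_SIZE  = 4096          # small page: 2**12 bytes -> 12 offset bits
ENTRY_SIZE = 16            # 16-byte entries at every level

entries_per_table = PAGE_SIZE // ENTRY_SIZE        # 256 -> 8 index bits per level
index_bits = entries_per_table.bit_length() - 1    # 8

# Field boundaries: OFFSET [11:0], then 8 bits per level going up
boundaries, low = [], 12
for name in ("TABLE", "DIR", "PDP", "PML4"):
    boundaries.append((name, low + index_bits - 1, low))
    low += index_bits
print(entries_per_table, boundaries)   # 256 [('TABLE', 19, 12), ('DIR', 27, 20), ('PDP', 35, 28), ('PML4', 43, 36)]

# A large page is mapped by a DIR entry, so its offset is TABLE+OFFSET = 20 bits
print("large page size:", 1 << 20, "bytes")        # 1 MB
```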

Computer Structure 2012 – VM 18 Question 1 (cont.)
(Recall: 64-bit mode with large and small pages, page table size = page size = 4KB, large page size = 1MB, 16B entries, 256 entries per table.)
• We access a sequence of virtual addresses. For each address, what is the minimal number of tables that were added, over all levels of the hierarchy?
• See the next foil (in presentation mode)

Computer Structure 2012 – VM 19 Question 1: Sequence of Allocations
• [Diagram: a worked page walk through the PML4 table, PDP table, page directory and page table (rooted at CR3) for a sequence of virtual addresses; 8-bit indices are used instead of 9 for example purposes only. The exact addresses are not legible in this transcript.]
• Structures added per access in the example sequence: 3 new tables + 1 page; 0 new tables; 0 new tables + 1 page; 1 new table + 1 page; 3 new tables + 1 page
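Since the slide's own address sequence is not legible here, the sketch below only illustrates the counting rule with made-up addresses, assuming the Question 1 layout (8 index bits per level above a 12-bit offset): an access allocates a new PDP table, page directory or page table whenever the corresponding index path has not been seen before (the PML4 table itself always exists, via CR3).

```python
LEVELS = ("PML4", "PDP", "DIR", "TABLE")

def indices(vaddr):
    out, shift = [], 12 + 8 * len(LEVELS)   # indexed bits end at bit 43 (below sign ext.)
    for _ in LEVELS:
        shift -= 8
        out.append((vaddr >> shift) & 0xFF)
    return tuple(out)

existing = set()                             # index paths of tables already allocated

def new_tables(vaddr):
    count, path = 0, ()
    for idx in indices(vaddr)[:-1]:          # the last index selects a PTE, not a table
        path += (idx,)                       # path of length 1/2/3 -> PDP / DIR / page table
        if path not in existing:
            existing.add(path)
            count += 1
    return count

# Illustrative sequence (not the slide's): 3, 0, then 1 new tables
for va in (0x0102_0304_0500, 0x0102_0304_0600, 0x0102_0A0B_0C00):
    print(hex(va), "->", new_tables(va), "new tables")
```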

Computer Structure 2012 – VM 20 Caches and Translation Structures
• [Diagram: the on-die core contains the L1 data and instruction caches, the instruction and data TLBs, the shared STLB, the L2 and L3 caches, and the Page Miss Handler (PMH) with its page-walk logic; instruction bytes, data and PTEs flow between them and memory]
• The PMH holds a PDE cache, a PDP cache and a PML4 cache; the TLBs/STLB are looked up with VA[47:12], the PDE cache with VA[47:21], the PDP cache with VA[47:30] and the PML4 cache with VA[47:39]; each cache returns the corresponding entry, and a miss loads the entry from the memory hierarchy

Computer Structure 2012 – VM 21 PMH Translation Caches
• Each translation cache is accessed with virtual address bits and, on a hit, returns the corresponding entry:
  – PDE cache: tagged with VA[47:21], returns the PDE
  – PDP cache: tagged with VA[47:30], returns the PDP entry
  – PML4 cache: tagged with VA[47:39], returns the PML4 entry
• (Linear address fields: sign ext. | PML4 | PDP | DIR | TABLE | OFFSET)

Computer Structure 2012 – VM 26 Question 2
• Processor similar to x86, 64-bit
• Pages of 4KB
• The processor has a TLB
  – TLB hit: we get the translation with no need to access the translation tables
  – TLB miss: the processor has to do a page walk
• The hardware that does the page walk (the PMH) contains a cache for each of the translation tables
• All caches and the TLB are empty on reset
• For the sequence of memory accesses below, how many memory accesses are needed for the translations?
  (The virtual address is split into sign ext. | PML4 | PDP | DIR | TABLE | OFFSET; the three accessed addresses are not legible in this transcript.)
  – Access 1: 4 memory accesses – we need to access memory for each of the 4 translation tables
  – Access 2: 0 memory accesses – same page as above → TLB hit → no memory access
  – Access 3: 3 memory accesses – we hit in the PML4 cache, then miss in the PDP cache, so we need to access memory 3 times: PDP, DIR, PTE
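A rough Python model of this counting, assuming the standard 9-bit fields of the 64-bit 4KB layout, a TLB keyed by the full VPN, and one PMH cache per upper level (PML4, PDP, PDE), with the deepest hitting cache shortening the walk. The addresses are illustrative, since the slide's own values were lost in transcription.

```python
def fields(va):
    return [(va >> s) & 0x1FF for s in (39, 30, 21, 12)]   # PML4, PDP, DIR, TABLE indices

tlb  = set()                              # translations already cached, keyed by VPN
pmh  = [set(), set(), set()]              # pmh[0]=PML4 cache, pmh[1]=PDP cache, pmh[2]=PDE cache

def translation_accesses(va):
    vpn = va >> 12
    if vpn in tlb:
        return 0                          # TLB hit: no page-walk memory access
    f = fields(va)
    start = 0                             # deepest level already covered by a PMH cache
    for level in (2, 1, 0):               # try the PDE cache first, then PDP, then PML4
        if tuple(f[:level + 1]) in pmh[level]:
            start = level + 1
            break
    accesses = 4 - start                  # one memory access per remaining level
    for level in range(3):                # refill the PMH caches and the TLB
        pmh[level].add(tuple(f[:level + 1]))
    tlb.add(vpn)
    return accesses

# Illustrative sequence: 4, 0 and 3 memory accesses respectively
for va in (0x7F8A_1234_5678, 0x7F8A_1234_5ABC, 0x7F8A_4000_0000):
    print(hex(va), "->", translation_accesses(va), "memory accesses")
```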

Computer Structure 2012 – VM 27 Question 2 (cont.)
• L1 data cache: 32KB, 2-way set associative, 64B lines
• How can we access this cache before we get the physical address?
  – 64B lines → 6 offset bits [5:0]
  – 32KB: 2^15 / (2 ways × 2^6 bytes) = 2^8 = 256 sets → set index bits [13:6]
  – 12 bits are not translated: [11:0] → we lack 2 bits, [13:12], of the set address
  – So we do the lookup using 2 untranslated (virtual) bits for the set address
  – Those bits can differ from the PFN obtained after translation, therefore we need to compare the whole PFN to the tag stored in the cache tag array
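The set-index arithmetic can be written out as below; the constants come from the slide (32KB, 2 ways, 64B lines) and the sample address is arbitrary.

```python
LINE = 64                   # 64B lines -> offset bits [5:0]
WAYS = 2
SIZE = 32 * 1024
SETS = SIZE // (WAYS * LINE)              # 256 sets -> set index bits [13:6]

def split_for_lookup(vaddr):
    offset  =  vaddr       & (LINE - 1)   # bits [5:0]
    set_idx = (vaddr >> 6) & (SETS - 1)   # bits [13:6]; bits [13:12] are still virtual
    return set_idx, offset

# The lookup can start before translation finishes because bits [11:0] are the same
# in the virtual and physical address; only set bits [13:12] are "guessed" from the
# virtual address, so the tag must hold the full physical frame number.
print(SETS, split_for_lookup(0x1F3A_8C40))   # 256 (49, 0)
```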

Computer Structure 2012 – VM 28 Question 2: Read Access
• [Diagram: the virtual address is split into the translated VPN [47:12] and the untranslated bits [11:0]; the set index is bits [13:6] and the line offset is bits [5:0]. The set index selects a set in each of the 2 ways, and the PFN [40:12] produced by translation is compared against the tag [40:12] stored in each way to detect a tag match.]

Computer Structure 2012 – VM 29 Question 2: Example
• [Diagram: a write to virtual address A followed by a read from virtual address B, both mapped to the same physical page. Both addresses have set index [13:6] = 0, but the tag differs (0 for A, 3 for B; it would be 0 for both if bits [13:12] were not included). The exact VA/PA values are not legible in this transcript.]

Computer Structure 2012 – VM 30 Question 2: Virtual Alias
• L1 data cache: 32KB, 2-way set associative, 64B lines
• What happens when we access a given offset in virtual page A, and afterwards access the same offset in virtual page B, which the OS has mapped to the same physical page as A?
  – Two virtual pages (VPN = xxxx01 and VPN = yyyy00, where 01 and 00 are bits [13:12]) map to the same physical frame (PFN = zzzz)
  – In a physically addressed cache, only the untranslated set bits [11:6] are used (at most 64 sets), so the data exists only once
  – In a virtually addressed cache, the lines land in sets 01.set[11:6] and 00.set[11:6], so the same data may exist twice in the cache – AVOID THIS!

Computer Structure 2012 – VM 31 Question 2: Virtual Alias (cont.)
• To avoid having the same data twice in the cache (at xxxx01.set.offset and at yyyy00.set.offset):
  – Check the 4 sets that share bits [11:6] when we allocate a new entry and see if the same tag appears
  – If yes, evict the other occurrence of the data (the alias)
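A sketch of that alias check, assuming the cache is modeled as 256 sets of 2 ways that each store a physical tag (or None): on a fill, the 4 sets sharing bits [11:6] are probed and a line with a matching physical tag is evicted. Names and structure here are illustrative.

```python
cache = [[None, None] for _ in range(256)]      # 256 sets x 2 ways, holding PFN tags

def candidate_sets(vaddr):
    base = (vaddr >> 6) & 0x3F                  # untranslated set bits [11:6]
    return [(hi << 6) | base for hi in range(4)]   # all 4 values of virtual bits [13:12]

def alias_check(vaddr, pfn_tag):
    # Called when allocating a new line for (vaddr, pfn_tag): evict any alias
    for s in candidate_sets(vaddr):
        for way, tag in enumerate(cache[s]):
            if tag is not None and tag == pfn_tag:
                cache[s][way] = None            # evict the aliased copy of the data
```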

Computer Structure 2012 – VM 32 Question 2: Snoop
• L1 data cache: 32KB, 2-way set associative, 64B lines
• What happens in case of a snoop in the cache?
  – The cache is snooped with a physical address [40:0]
  – Since the 2 MSBs of the set address are virtual, a given physical address can map to 4 different sets in the cache (depending on the virtual page that is mapped to it)
  – So we must snoop 4 sets × 2 ways in the cache

Computer Structure 2012 – VM 33 Question 3
• Core similar to x86 in 64-bit mode
• Supports small pages (mapped by a PTE) and large pages (mapped by a DIR entry)
• The size of an entry in all the different page tables is 8 bytes
• PMH caches at all the levels
  – 4 entries, direct mapped
  – Access time on a hit: 2 cycles
  – A miss is known after 1 cycle
• The PMH caches of all the levels are accessed in parallel
• At each level, when there is a hit, the PMH cache provides the relevant entry of the page table at that level
• At each level, when there is a miss, the core accesses the relevant page table in main memory
• Access time to main memory is 100 cycles, not including the time needed to detect the PMH cache miss
• The virtual address is split into sign ext. | PML4 | PDP | DIR | TABLE | OFFSET

Computer Structure 2012 – VM 34 Question 3 (cont.)
• What is the size of the large pages?
  – A large page is mapped by a DIR entry; therefore all the bits below the DIR field are an offset inside the large page: 2^24 B = 16 MB
• How many entries are there in each page table?
  – PTE: 2^12, DIR: 2^12, PDP: 2^12, PML4: 2^8
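The sizes above follow from simple arithmetic; a short sketch using only the entry counts and the 8-byte entry size given in the question:

```python
ENTRY = 8                                        # 8-byte entries at every level
entries = {"PML4": 1 << 8, "PDP": 1 << 12, "DIR": 1 << 12, "PTE": 1 << 12}
print({k: f"{v} entries, {v * ENTRY // 1024} KB" for k, v in entries.items()})

# A large page is mapped directly by a DIR entry, so the TABLE and OFFSET bits
# together form its offset: 12 + 12 = 24 bits -> 16 MB
print("large page:", 1 << 24, "bytes")
```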

Computer Structure 2012 – VM Question 3: Answer
(TLB: 4 entries, direct mapped; TLB hit: 2 cycles; TLB miss: 1 cycle; memory access: 100 cycles. The full virtual addresses are only partly legible in this transcript, e.g. "…FF ABCD", "…FF A0CD".)
• Access 1 (…FF ABCD): 401 cycles – miss and a memory access at each level: 1 + 4 × 100 = 401
• Access 2 (…FF ABCD): 202 cycles – (PML4, PDP, DIR, sTLB) = (H, H, M, M): 1 cycle to detect the miss + 1 more cycle for the PDP cache read, then DIR: miss, 100 cycles, and PTE: miss, 100 cycles
• Access 3 (…FF ABCD): 401 cycles – PMH cache miss at all the levels: 1 + 4 × 100 = 401
• Access 4 (…FF ABCD): 302 cycles – (PML4, PDP, DIR, sTLB) = (H, M, M, M): the PDP entry (234) that was filled for the second access was replaced by the entry filled in access 3, as it maps to the same set → miss; 2 + 3 × 100 = 302
• Access 5 (…FF A0CD): 2 cycles – hit in the TLB, no need to go to the PMH
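A rough Python model of this cycle accounting, assuming 4-entry direct-mapped structures, hit = 2 cycles, miss detected after 1 cycle, 100 cycles per memory access, and the field widths derived on the previous slide (8-bit PML4 index, 12-bit PDP/DIR/TABLE indices, 12-bit offset for a 4KB small page). The example addresses are illustrative, not the slide's.

```python
MEM = 100

def fields(va):                      # (PML4, PDP, DIR, TABLE) indices
    return ((va >> 48) & 0xFF, (va >> 36) & 0xFFF,
            (va >> 24) & 0xFFF, (va >> 12) & 0xFFF)

tlb = [None] * 4                              # 4-entry direct-mapped TLB, indexed by low VPN bits
pmh = [[None] * 4 for _ in range(3)]          # PML4 / PDP / DIR caches, 4 entries each

def translate_cycles(va):
    vpn = va >> 12
    if tlb[vpn % 4] == vpn:
        return 2                              # TLB hit
    f = fields(va)
    start, cost = 0, 1                        # all PMH caches miss: known after 1 cycle
    for level in (2, 1, 0):                   # probe DIR cache first, then PDP, then PML4
        key = f[:level + 1]
        if pmh[level][f[level] % 4] == key:
            start, cost = level + 1, 2        # hit in that cache: 2 cycles
            break
    cost += (4 - start) * MEM                 # one memory access per remaining level
    for level in range(3):                    # refill the caches and the TLB
        pmh[level][f[level] % 4] = f[:level + 1]
    tlb[vpn % 4] = vpn
    return cost

va = 0x12_3456_789A_BCDE                      # illustrative address
print(translate_cycles(va))                   # 401: all levels miss -> 1 + 4*100
print(translate_cycles(va))                   # 2: the TLB now hits
```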

Computer Structure 2012 – VM 41 Backup Slides

Computer Structure 2012 – VM 42 [Backup diagram: the virtual address is split into the translated VPN (up to bit 47), the set index [13:6] and the offset [5:0] (not translated); the PFN is compared against the tag field of each way (ways 0 … 3) to detect a tag match]

Computer Structure 2012 – VM 43 [Backup diagram: the same cache lookup, with the tag field taken from the translated PFN (bit 40 and below); ways 0 … 3 are compared for a tag match]

Computer Structure 2012 – VM 44 Question 3
• We have a sequence of accesses to virtual addresses. For each address, how many cycles are needed to translate the address?
• Assume that we use small pages only and that the TLBs are empty upon reset
• TLBs: 4 entries, direct mapped; TLB hit: 2 cycles; TLB miss: 1 cycle; memory access: 100 cycles
• The virtual address is split into sign ext. | PML4 | PDP | DIR | TABLE | OFFSET
• The sequence of virtual addresses is only partly legible in this transcript (e.g. "…FF ABCD", "…FF A0CD"); the cycle counts are given on the next foil

Computer Structure 2012 – VM Question 3: Answer (each level's TLB counted separately)
(TLBs: 4 entries, direct mapped; TLB hit: 2 cycles; TLB miss: 1 cycle; memory access: 100 cycles, so a miss at a level costs 1 + 100 = 101 cycles.)
• Access 1 (…FF ABCD): 404 cycles – miss and a memory access at each level: 4 × 101 = 404
• Access 2 (…FF ABCD): 206 cycles – PML4 TLB: hit, 2 cycles; PDP TLB: hit, 2 cycles; DIR TLB: miss, 101 cycles; PTE TLB: miss, 101 cycles
• Access 3 (…FF ABCD): 404 cycles – TLB miss at all the levels: 4 × 101 = 404
• Access 4 (…FF ABCD): 305 cycles – PML4 TLB: hit, 2 cycles; PDP: the entry (234) that was filled for the second access was replaced by the entry filled in access 3, as it maps to the same set → miss; DIR and PTE: miss; (1 × 2) + (3 × 101) = 305
• Access 5 (…FF A0CD): 8 cycles – hit in all the TLBs: 4 × 2 = 8

Computer Structure 2012 – VM sign ext.DIRTABLEOFFSETPDPPML Virtual Addr. Cycles Comment FF ABCD404Miss and memory access at each level: 4 *101 = 404 cycles FF ABCD206PML4 TLB: Hit 2 cycles PDP TLB: Hit 2 cycles DIR TLB: Hit 101 cycles PTE TLB: Hit 101 cycles FF ABCD404TLB miss in all the levels: 4 *101 = 404 cycles FF ABCD305PML4 TLB: 2 cycles PDP: the entry 234 that was filled for the second access was replaced by the entry that was filled in access 3, as it is in the same set  miss DIR & PTE: miss (1 × 2) + (3 × 101) = 305 FF A0CD8Hit in all TLBs: 4 * 2 = 8 TLB 4entries Direct mapped TLB hit: 2 cycles TLB miss: 1 cycle Memory access: 100 cycle