© 2008 Wayne Wolf, Overheads for Computers as Components, 2nd ed.

CPUs
- Caches.
- Memory management.

Caches and CPUs
[Figure: the CPU exchanges addresses and data with the cache controller; the cache sits between the CPU and main memory, which also exchange addresses and data.]

Cache operation
- Many main memory locations are mapped onto one cache entry.
- May have caches for:
  - instructions;
  - data;
  - data + instructions (unified).
- Memory access time is no longer deterministic.

Terms
- Cache hit: required location is in the cache.
- Cache miss: required location is not in the cache.
- Working set: set of locations used by the program in a time interval.

Types of misses
- Compulsory (cold): location has never been accessed.
- Capacity: working set is too large.
- Conflict: multiple locations in the working set map to the same cache entry.

Memory system performance
- h = cache hit rate.
- t_cache = cache access time, t_main = main memory access time.
- Average memory access time (worked example below):
  - t_av = h * t_cache + (1 - h) * t_main
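
A minimal worked example of this formula; the hit rate and access times are assumed numbers chosen only for illustration, not figures from the slides.

```c
#include <stdio.h>

int main(void) {
    double h = 0.9;        /* cache hit rate (assumed)                     */
    double t_cache = 1.0;  /* cache access time in cycles (assumed)        */
    double t_main = 50.0;  /* main memory access time in cycles (assumed)  */

    double t_av = h * t_cache + (1.0 - h) * t_main;
    printf("t_av = %.1f cycles\n", t_av);   /* 0.9*1 + 0.1*50 = 5.9 */
    return 0;
}
```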

Multiple levels of cache
[Figure: the CPU is served by the L1 cache, which is backed by the L2 cache.]

Multi-level cache access time
- h1 = L1 cache hit rate.
- h2 = combined hit rate of L1 and L2 (fraction of all accesses satisfied by either level).
- Average memory access time (worked example below):
  - t_av = h1 * t_L1 + (h2 - h1) * t_L2 + (1 - h2) * t_main
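
A worked example of the two-level formula, again with assumed hit rates and access times; note that the three coefficients h1, (h2 - h1), and (1 - h2) sum to 1.

```c
#include <stdio.h>

int main(void) {
    double h1 = 0.90;                                /* L1 hit rate (assumed)             */
    double h2 = 0.98;                                /* combined L1+L2 hit rate (assumed) */
    double t_L1 = 1.0, t_L2 = 10.0, t_main = 50.0;   /* cycles (assumed)                  */

    double t_av = h1 * t_L1 + (h2 - h1) * t_L2 + (1.0 - h2) * t_main;
    printf("t_av = %.1f cycles\n", t_av);   /* 0.9 + 0.8 + 1.0 = 2.7 */
    return 0;
}
```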

Replacement policies
- Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location.
- Two popular strategies:
  - Random.
  - Least-recently used (LRU); a sketch follows below.
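
A minimal sketch of LRU victim selection for one cache set, using a per-line timestamp; real hardware approximates this with a few ordering bits rather than a full counter per line. The struct fields and set size are illustrative assumptions.

```c
#include <stdint.h>

#define WAYS 4

struct line {
    int      valid;
    uint32_t tag;
    uint64_t last_used;   /* timestamp of the most recent access */
};

/* Return the way to evict: an invalid line if one exists, otherwise
 * the line with the oldest last_used timestamp. */
int lru_victim(const struct line set[WAYS]) {
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (!set[w].valid)
            return w;                                   /* free line: no eviction needed */
        if (set[w].last_used < set[victim].last_used)
            victim = w;                                 /* older than current candidate  */
    }
    return victim;
}
```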

Cache organizations
- Fully-associative: any memory location can be stored anywhere in the cache (almost never implemented).
- Direct-mapped: each memory location maps onto exactly one cache entry.
- N-way set-associative: each memory location can go into one of n sets (see the address-splitting sketch below).
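
A sketch of how an address is split into offset, set index, and tag, assuming an illustrative 16 KB, 4-way cache with 32-byte blocks (128 sets); a direct-mapped cache is simply the 1-way case.

```c
#include <stdint.h>
#include <stdio.h>

#define BLOCK_SIZE   32u
#define NUM_SETS     128u
#define OFFSET_BITS  5u    /* log2(BLOCK_SIZE) */
#define INDEX_BITS   7u    /* log2(NUM_SETS)   */

int main(void) {
    uint32_t addr = 0x2000A4u;                           /* arbitrary example address */
    uint32_t offset = addr & (BLOCK_SIZE - 1u);          /* byte within the block     */
    uint32_t index  = (addr >> OFFSET_BITS) & (NUM_SETS - 1u);  /* which set          */
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);       /* stored and compared */

    printf("tag=0x%x index=%u offset=%u\n", tag, index, offset);
    return 0;
}
```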

Cache performance benefits
- Keep frequently-accessed locations in the fast cache.
- The cache retrieves more than one word at a time:
  - sequential accesses are faster after the first access.

Direct-mapped cache
[Figure: the address is split into tag, index, and offset fields; the index selects a cache block, whose valid bit and stored tag are compared with the address tag to generate a hit; the offset selects a byte from the block's data.]

Write operations
- Write-through: immediately copy the write to main memory.
- Write-back: write to main memory only when the location is removed from the cache (contrasted in the sketch below).
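
A sketch contrasting the two policies for a single cache line; main memory is modeled here as a plain array, and the structure fields are illustrative assumptions rather than any particular cache's layout.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint8_t main_mem[1024];        /* toy model of main memory */

struct cache_line {
    int     valid, dirty;             /* dirty bit matters only for write-back */
    uint8_t data[32];
};

/* Write-through: update the cached copy and main memory immediately. */
static void write_through(struct cache_line *l, unsigned addr, uint8_t byte) {
    l->data[addr % 32] = byte;
    main_mem[addr] = byte;
}

/* Write-back: update only the cached copy and mark the line dirty. */
static void write_back(struct cache_line *l, unsigned addr, uint8_t byte) {
    l->data[addr % 32] = byte;
    l->dirty = 1;
}

/* On eviction, a write-back cache must flush a dirty line to memory. */
static void evict(struct cache_line *l, unsigned block_addr) {
    if (l->valid && l->dirty)
        memcpy(&main_mem[block_addr], l->data, sizeof l->data);
    l->valid = l->dirty = 0;
}

int main(void) {
    struct cache_line line = { .valid = 1 };
    write_through(&line, 100, 0xAA);  /* memory updated right away           */
    write_back(&line, 101, 0xBB);     /* memory updated only on eviction     */
    evict(&line, 96);                 /* 96 = block-aligned base of the line */
    printf("mem[100]=%02x mem[101]=%02x\n", main_mem[100], main_mem[101]);
    return 0;
}
```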

Direct-mapped cache locations
- Many locations map onto the same cache block.
- Conflict misses are easy to generate:
  - array a[] uses locations 0, 1, 2, ...;
  - array b[] uses locations 1024, 1025, 1026, ...;
  - operation a[i] + b[i] generates conflict misses (see the sketch below).
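
A sketch of the access pattern described on this slide. It assumes, as the slide does, that a[] and b[] end up exactly 1024 words apart and that the direct-mapped cache holds 1024 words; in practice the linker chooses the placement.

```c
#define N 1024

int a[N];   /* assumed to occupy word addresses 0, 1, 2, ...     */
int b[N];   /* assumed to occupy word addresses 1024, 1025, ...  */
int c[N];

void add_arrays(void) {
    for (int i = 0; i < N; i++)
        /* a[i] and b[i] map to the same cache line, so each iteration
         * evicts the other operand: two conflict misses per pass. */
        c[i] = a[i] + b[i];
}
```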

Set-associative cache
- A set of direct-mapped caches accessed in parallel:
[Figure: banks Set 1, Set 2, ..., Set n are probed in parallel; their outputs are combined to produce the hit signal and the data.]

Example: direct-mapped vs. set-associative

Direct-mapped cache behavior
[Tables: cache contents (block, tag, data) after the 001 access and after the 010 access.]

Direct-mapped cache behavior, cont'd.
[Tables: cache contents (block, tag, data) after the 011 access and after the 100 access.]

Direct-mapped cache behavior, cont'd.
[Tables: cache contents (block, tag, data) after the 101 access and after the 111 access.]

2-way set-associative cache behavior
- Final state of the cache (twice as big as the direct-mapped cache):
[Table: for each set, the block 0 tag/data and block 1 tag/data.]

2-way set-associative cache behavior
- Final state of the cache (same size as the direct-mapped cache):
[Table: for each set, the block 0 tag/data and block 1 tag/data.]

Example caches
- StrongARM:
  - 16 Kbyte, 32-way, 32-byte block instruction cache;
  - 16 Kbyte, 32-way, 32-byte block data cache (write-back).
- C55x:
  - various models have 16 KB or 24 KB of cache;
  - can be used as scratch pad memory.

Scratch pad memories
- Alternative to a cache:
  - software determines what is stored in the scratch pad (see the sketch below).
- Provides predictable behavior at the cost of software control.
- The C55x cache can be configured as a scratch pad.
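
A hypothetical sketch of how software might place data in an on-chip scratch pad with a GCC-style toolchain: the variable is assigned to a linker section, and the section name ".scratchpad" and its address range are assumptions that would have to be defined in the board's linker script, not something the C language or the slides specify.

```c
/* Assumed: the linker script maps the ".scratchpad" section onto the
 * on-chip scratch pad memory. */
static int coeffs[64] __attribute__((section(".scratchpad")));

int filter_tap(int x, int i) {
    /* Accesses to coeffs[] now have a fixed, predictable latency,
     * unlike a cached access that may hit or miss. */
    return x * coeffs[i & 63];
}
```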

Memory management units
- The memory management unit (MMU) translates addresses:
[Figure: the CPU issues logical addresses to the memory management unit, which sends physical addresses to main memory.]

Memory management tasks
- Allows programs to move in physical memory during execution.
- Allows virtual memory:
  - memory images kept in secondary storage;
  - images returned to main memory on demand during execution.
- Page fault: request for a location not resident in memory.

Address translation
- Requires some sort of register/table to allow arbitrary mappings of logical to physical addresses.
- Two basic schemes:
  - segmented;
  - paged.
- Segmentation and paging can be combined (x86).

Segments and pages
[Figure: a memory map containing two large segments (segment 1, segment 2) and two small pages (page 1, page 2).]

Segment address translation
[Figure: the segment base address is added to the logical address to form the physical address; a range check against the segment's lower and upper bounds raises a range error if the result falls outside the segment.]
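
A sketch of the segment-translation step in the figure above: add the segment base to the logical address, then check the result against the segment bounds. The struct and field names are illustrative, not a particular MMU's register layout.

```c
#include <stdint.h>

struct segment {
    uint32_t base;          /* segment base address              */
    uint32_t lower, upper;  /* physical bounds of the segment    */
};

/* Returns the physical address; on a range error sets *err and
 * returns 0 (a real MMU would raise an exception instead). */
uint32_t seg_translate(const struct segment *s, uint32_t logical, int *err) {
    uint32_t phys = s->base + logical;
    if (phys < s->lower || phys > s->upper) {
        *err = 1;
        return 0;
    }
    *err = 0;
    return phys;
}
```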

Page address translation
[Figure: the logical address is split into page number and offset; the page number selects the page i base address, which is concatenated with the offset to form the physical address.]
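
A sketch of single-level page translation, assuming 4 KB pages and a flat page table covering a small 4 MB logical space (both sizes chosen only for illustration).

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                        /* assumed 4 KB pages                    */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1u)
#define NUM_PAGES  1024u                      /* flat table for a 4 MB space (assumed) */

static uint32_t page_table[NUM_PAGES];        /* entry i = physical base of logical page i */

/* Assumes the logical address lies within the NUM_PAGES-page space. */
uint32_t page_translate(uint32_t logical) {
    uint32_t page   = logical >> PAGE_SHIFT;  /* upper bits: page number */
    uint32_t offset = logical & PAGE_MASK;    /* lower bits: page offset */
    return page_table[page] | offset;         /* concatenate base and offset */
}
```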

Page table organizations
[Figure: flat and tree-structured page table organizations, each ultimately holding the page descriptors.]

Caching address translations
- Large translation tables require main memory accesses.
- TLB: a cache for address translations (sketch below).
  - Typically small.
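
A sketch of a tiny fully-associative TLB: a handful of cached page-number to page-base translations that are searched before falling back to the page-table walk. The entry count and field names are illustrative assumptions.

```c
#include <stdint.h>

#define TLB_ENTRIES 8

struct tlb_entry {
    int      valid;
    uint32_t page;   /* logical page number */
    uint32_t base;   /* physical page base  */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Returns 1 and fills *base on a TLB hit, 0 on a miss
 * (a miss falls back to the page-table walk). */
int tlb_lookup(uint32_t page, uint32_t *base) {
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].page == page) {
            *base = tlb[i].base;
            return 1;
        }
    }
    return 0;
}
```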

ARM memory management
- Memory region types:
  - section: 1 Mbyte block;
  - large page: 64 kbytes;
  - small page: 4 kbytes.
- An address is marked as section-mapped or page-mapped.
- Two-level translation scheme.

ARM address translation
[Figure: the virtual address is split into a 1st-level index, a 2nd-level index, and an offset; the translation table base register locates the 1st-level table, whose descriptor points to a 2nd-level table; the 2nd-level descriptor is concatenated with the offset to form the physical address.]
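
A hedged sketch of a generic two-level table walk in the spirit of the figure above, assuming 1 MB first-level granules and 4 KB small pages; real ARM descriptors carry type and permission bits, and have exact formats, that are deliberately omitted here.

```c
#include <stdint.h>

#define L1_ENTRIES 4096u   /* one entry per 1 MB of a 4 GB virtual space       */
#define L2_ENTRIES 256u    /* one entry per 4 KB small page within that 1 MB   */
#define NUM_L2     16u     /* second-level tables actually allocated (assumed) */

static uint32_t l1_table[L1_ENTRIES];          /* value = index of a 2nd-level table  */
static uint32_t l2_pool[NUM_L2][L2_ENTRIES];   /* value = physical page base address  */

/* Assumes every l1_table entry has been initialized to a valid
 * index into l2_pool. */
uint32_t translate(uint32_t vaddr) {
    uint32_t i1     = vaddr >> 20;             /* 1st-level index (1 MB granule) */
    uint32_t i2     = (vaddr >> 12) & 0xFFu;   /* 2nd-level index (4 KB page)    */
    uint32_t offset = vaddr & 0xFFFu;          /* offset within the page         */

    const uint32_t *l2 = l2_pool[l1_table[i1] % NUM_L2];  /* % keeps the sketch in bounds */
    return l2[i2] | offset;                    /* concatenate page base and offset */
}
```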