Overheads for Computers as Components 2nd ed.


CPUs
Caches.
Memory management.
© 2008 Wayne Wolf, Overheads for Computers as Components 2nd ed.

Caches and CPUs
[Figure: the CPU sends an address to the cache controller; on a hit the cache returns the data, otherwise the controller fetches the data from main memory over the address/data buses.]

Cache operation
Many main memory locations are mapped onto one cache entry.
May have caches for: instructions; data; data + instructions (unified).
Memory access time is no longer deterministic.

Terms
Cache hit: required location is in cache.
Cache miss: required location is not in cache.
Working set: set of locations used by a program in a time interval.

Types of misses
Compulsory (cold): location has never been accessed.
Capacity: working set is too large.
Conflict: multiple locations in working set map to same cache entry.

Memory system performance
h = cache hit rate.
tcache = cache access time; tmain = main memory access time.
Average memory access time: tav = h·tcache + (1 − h)·tmain.
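The single-level formula is easy to check numerically. A minimal sketch; the hit rate and latencies below are assumed example numbers, not values from the slides:

```python
def avg_access_time(h, t_cache, t_main):
    """Average memory access time for a single-level cache (in cycles)."""
    return h * t_cache + (1 - h) * t_main

# Assumed example: 95% hit rate, 1-cycle cache, 50-cycle main memory.
print(avg_access_time(0.95, 1, 50))  # 0.95*1 + 0.05*50 = 3.45 cycles
```

Even a small drop in hit rate is expensive because tmain dominates the miss term.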

Multiple levels of cache
[Figure: the CPU connects to the L1 cache, which connects to the L2 cache, which connects to main memory.]

Multi-level cache access time
h1 = hit rate in L1.
h2 = hit rate in L1 or L2 combined (so h2 ≥ h1).
Average memory access time: tav = h1·tL1 + (h2 − h1)·tL2 + (1 − h2)·tmain.
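The two-level formula can be sketched the same way; the rates and latencies are again assumed example numbers:

```python
def avg_access_time_2level(h1, h2, t_l1, t_l2, t_main):
    """h1: fraction of accesses hitting L1; h2: fraction hitting L1 or L2
    combined (h2 >= h1). Returns average access time in cycles."""
    return h1 * t_l1 + (h2 - h1) * t_l2 + (1 - h2) * t_main

# Assumed example: 90% hit in L1, 98% hit in L1 or L2, 1/10/50-cycle latencies.
print(avg_access_time_2level(0.90, 0.98, 1, 10, 50))  # 0.9 + 0.8 + 1.0 = 2.7 cycles
```

Note the middle term uses h2 − h1: it counts only the accesses that miss L1 but hit L2.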

Replacement policies
Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location.
Two popular strategies: random; least-recently used (LRU).
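LRU behavior within one set can be sketched with a recency-ordered list. This is an illustration of the policy only, not any particular CPU's hardware implementation:

```python
class LRUSet:
    """One n-way cache set with LRU replacement (a behavioral sketch)."""
    def __init__(self, ways):
        self.ways = ways
        self.order = []            # resident tags, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, insert tag, evicting the LRU entry."""
        if tag in self.order:
            self.order.remove(tag)
            self.order.append(tag) # move to most-recently-used position
            return True
        if len(self.order) == self.ways:
            self.order.pop(0)      # evict least recently used
        self.order.append(tag)
        return False

s = LRUSet(2)
for t in ["A", "B", "A", "C"]:     # "C" evicts "B", not "A"
    s.access(t)
print(s.order)  # ['A', 'C']
```

A random policy would just pick the victim with a random index instead of position 0.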

Cache organizations
Fully-associative: any memory location can be stored anywhere in the cache (almost never implemented).
Direct-mapped: each memory location maps onto exactly one cache entry.
N-way set-associative: each memory location can go into one of n sets.

Cache performance benefits
Keep frequently-accessed locations in fast cache.
Cache retrieves more than one word at a time.
Sequential accesses are faster after first access.

Direct-mapped cache
[Figure: the address is split into tag | index | offset; the index selects one cache line (valid bit, stored tag such as 0xabcd, and a multi-byte data block); the stored tag is compared with the address tag to generate the hit signal, and the offset selects the byte within the block.]
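The tag/index/offset split is pure bit slicing. A sketch for an assumed geometry (32-byte blocks giving 5 offset bits, 256 lines giving 8 index bits, in a 32-bit address):

```python
# Field widths for the assumed cache geometry above.
OFFSET_BITS, INDEX_BITS = 5, 8

def split_address(addr):
    """Split an address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print([hex(f) for f in split_address(0x12345678)])  # ['0x91a2', '0xb3', '0x18']
```

Two addresses conflict exactly when they agree on the index bits but differ in the tag bits.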

Write operations
Write-through: immediately copy write to main memory.
Write-back: write to main memory only when location is removed from cache.

Direct-mapped cache locations
Many locations map onto the same cache block. Conflict misses are easy to generate:
Array a[] uses locations 0, 1, 2, …
Array b[] uses locations 1024, 1025, 1026, …
Operation a[i] + b[i] generates conflict misses.
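The pathology above can be reproduced with a tiny miss counter. The geometry is an assumption chosen to match the example: a direct-mapped cache with 1024 one-word lines, a[] at word address 0 and b[] at word address 1024, so a[i] and b[i] always map to the same line and evict each other:

```python
LINES = 1024  # assumed: 1024 one-word lines, so line = address mod 1024

def count_misses(trace):
    """Count misses of a word-granularity direct-mapped cache on a trace."""
    tags = [None] * LINES
    misses = 0
    for addr in trace:
        line, tag = addr % LINES, addr // LINES
        if tags[line] != tag:
            misses += 1
            tags[line] = tag
    return misses

trace = []
for i in range(100):
    trace += [i, 1024 + i]      # the accesses made by a[i] + b[i]
print(count_misses(trace))      # 200 -- every single access misses
```

Accessing a[] alone would miss only once per line; interleaving the two arrays turns every access into a conflict miss.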

Set-associative cache
A set of direct-mapped caches:
[Figure: n direct-mapped banks (Set 1, Set 2, …, Set n) are probed in parallel; a hit in any bank supplies the data.]

Example: direct-mapped vs. set-associative

Direct-mapped cache behavior
After 001 access:
block  tag  data
00     -    -
01     0    1111
10     -    -
11     -    -
After 010 access:
block  tag  data
00     -    -
01     0    1111
10     0    0000
11     -    -

Direct-mapped cache behavior, cont'd.
After 011 access:
block  tag  data
00     -    -
01     0    1111
10     0    0000
11     0    0110
After 100 access:
block  tag  data
00     1    1000
01     0    1111
10     0    0000
11     0    0110

Direct-mapped cache behavior, cont'd.
After 101 access:
block  tag  data
00     1    1000
01     1    0001
10     0    0000
11     0    0110
After 111 access:
block  tag  data
00     1    1000
01     1    0001
10     0    0000
11     1    0100

2-way set-associative cache behavior
Final state of cache (twice as big as direct-mapped):
set  blk 0 tag  blk 0 data  blk 1 tag  blk 1 data
00   1          1000        -          -
01   0          1111        1          0001
10   0          0000        -          -
11   0          0110        1          0100

2-way set-associative cache behavior
Final state of cache (same size as direct-mapped):
set  blk 0 tag  blk 0 data  blk 1 tag  blk 1 data
0    01         0000        10         1000
1    10         0111        11         0100

Example caches
StrongARM:
16 Kbyte, 32-way, 32-byte-block instruction cache.
16 Kbyte, 32-way, 32-byte-block data cache (write-back).
C55x:
Various models have 16 Kbyte or 24 Kbyte of cache.
Can be used as scratch pad memory.

Scratch pad memories
Alternative to cache:
Software determines what is stored in scratch pad.
Provides predictable behavior at the cost of software control.
C55x cache can be configured as scratch pad.

Memory management units
Memory management unit (MMU) translates addresses:
[Figure: the CPU issues a logical address to the MMU, which outputs the corresponding physical address to main memory.]

Memory management tasks
Allows programs to move in physical memory during execution.
Allows virtual memory: memory images kept in secondary storage; images returned to main memory on demand during execution.
Page fault: request for location not resident in memory.

Address translation
Requires some sort of register/table to allow arbitrary mappings of logical to physical addresses.
Two basic schemes: segmented; paged.
Segmentation and paging can be combined (x86).

Segments and pages
[Figure: a memory map containing two large, variable-sized segments (segment 1, segment 2) alongside small, fixed-size pages (page 1, page 2).]

Segment address translation
[Figure: the segment base address is added to the logical address to form the physical address; the result is range-checked against the segment's lower and upper bounds, and an out-of-range access signals a range error.]
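The datapath in the figure reduces to an add plus a bounds check. A sketch; the base and bound values are made-up illustrations, and real hardware performs the check in parallel with the add:

```python
def translate_segment(logical, base, lower, upper):
    """Segmented translation: physical = base + logical, range-checked
    against the segment's lower and upper bounds."""
    physical = base + logical
    if not (lower <= physical <= upper):
        raise ValueError("range error")
    return physical

# Assumed segment: base 0x40000, bounds [0x40000, 0x4FFFF].
print(hex(translate_segment(0x100, 0x40000, 0x40000, 0x4FFFF)))  # 0x40100
```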

Page address translation
[Figure: the logical address is split into page number and offset; the page number selects page i's base address from the page table, and the base is concatenated with the offset to form the physical address.]
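Because pages are power-of-two sized, the concatenation is a shift and an OR. A sketch assuming 4 KB pages and a made-up page-table entry:

```python
PAGE_BITS = 12                      # assumed 4 KB pages
page_table = {0x4: 0x12345}         # page number -> page base (assumed entry)

def translate_page(logical):
    page = logical >> PAGE_BITS
    offset = logical & ((1 << PAGE_BITS) - 1)
    return (page_table[page] << PAGE_BITS) | offset  # concatenate base | offset

print(hex(translate_page(0x4ABC)))  # page 0x4, offset 0xABC -> 0x12345abc
```

Unlike segmentation, no adder is needed: the offset bits pass through unchanged.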

Page table organizations
[Figure: a page table may be flat (a single array of page descriptors indexed by page number) or a tree (multi-level tables of descriptors).]

Caching address translations
Large translation tables require main memory access.
TLB: cache for address translations. Typically small.
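The point of the TLB is that repeated touches to the same page skip the table walk. A behavioral sketch with an assumed FIFO replacement policy and a stand-in walk function (real TLBs and their policies vary):

```python
def walk_page_table(page):
    """Stand-in for a slow main-memory table walk (mapping is assumed)."""
    walk_page_table.count += 1
    return page + 0x100
walk_page_table.count = 0

class TLB:
    """Tiny fully-associative TLB sketch with FIFO replacement."""
    def __init__(self, entries):
        self.entries, self.map, self.fifo = entries, {}, []

    def lookup(self, page):
        if page in self.map:           # TLB hit: no memory access needed
            return self.map[page]
        frame = walk_page_table(page)  # TLB miss: walk the table
        if len(self.fifo) == self.entries:
            del self.map[self.fifo.pop(0)]
        self.map[page] = frame
        self.fifo.append(page)
        return frame

tlb = TLB(entries=2)
for p in [1, 2, 1, 1]:
    tlb.lookup(p)
print(walk_page_table.count)  # 2 -- pages 1 and 2 each walked only once
```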

ARM memory management
Memory region types:
section: 1 Mbyte block;
large page: 64 Kbytes;
small page: 4 Kbytes.
An address is marked as section-mapped or page-mapped.
Two-level translation scheme.

ARM address translation
[Figure: the translation table base register is concatenated with the first index to fetch a first-level table descriptor; that descriptor is concatenated with the second index to fetch a second-level table descriptor, which supplies the page base combined with the offset to form the physical address.]
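The two-level small-page walk can be sketched as two dictionary lookups standing in for the two descriptor fetches. Field widths follow the ARM scheme for small pages (12-bit first index, 8-bit second index, 12-bit offset); the table contents are assumptions for illustration, not real descriptors:

```python
first_level = {0x123: 0x40000}              # 1st index -> 2nd-level table base
second_level = {(0x40000, 0x45): 0x9A000}   # (table, 2nd index) -> page base

def arm_translate(va):
    idx1 = va >> 20                     # bits [31:20] index the 1st-level table
    idx2 = (va >> 12) & 0xFF            # bits [19:12] index the 2nd-level table
    offset = va & 0xFFF                 # bits [11:0] pass through unchanged
    table = first_level[idx1]           # 1st-level descriptor fetch
    page = second_level[(table, idx2)]  # 2nd-level descriptor fetch
    return page | offset

print(hex(arm_translate(0x12345678)))  # 0x9a678
```

Each walk costs two memory accesses, which is exactly why the TLB on the previous slide matters.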