Microprocessor-based systems. Course 7: Memory hierarchies.


Microprocessor-based systems Course 7: Memory hierarchies

Performance features of memories

                SRAM               DRAM                   HD, CD
Capacity        small (1-64 KB)    medium (256 MB-2 GB)   big (GB range)
Access time     small (1-10 ns)    medium (15-70 ns)      big (1-10 ms)
Cost            high               medium                 low

Memory hierarchies

Processor
  -> Cache (SRAM)
  -> Internal (operative) memory (DRAM)
  -> Virtual memory (HD, CD, DVD)

Principles in favor of memory hierarchies  Temporal locality – if a location is accessed at a given time, it has a high probability of being accessed again in the near future; examples: execution of loops (for, while, etc.), repeated processing of some variables  Spatial locality – if a location is accessed, then its neighbors have a high probability of being accessed in the near future; examples: loops, processing of vectors and records  The 90/10 rule – 90% of the time the processor executes 10% of the program  The idea: bring the memory zones with the highest probability of future access closer to the processor
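As an illustrative sketch (not part of the original slides), the loop below exhibits both kinds of locality: the elements of each row sit at consecutive addresses (spatial locality), and the accumulator variable is reused on every iteration (temporal locality).

```python
def row_major_sum(matrix):
    """Sum a matrix row by row, i.e. in the order it is laid out in memory."""
    total = 0                   # 'total' is touched on every iteration: temporal locality
    for row in matrix:          # rows are contiguous blocks of elements
        for x in row:           # consecutive elements share cache lines: spatial locality
            total += x
    return total
```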

Cache memory  High-speed, low-capacity memory  The memory closest to the processor  Organization: lines of cache memory  Keeps copies of zones (lines) from the main (internal) memory  The cache memory is not visible to the programmer  The transfer between the cache and the internal memory is made automatically, under the control of the Memory Management Unit (MMU)

Typical cache memory parameters

Parameter               Value
Memory size             32 KB - 16 MB
Size of a cache line    bytes
Access time             ns
Speed (bandwidth)       MB/s
Circuit types           Processor-internal RAM or external static RAM

Design of cache memory  Design problems: 1. What is the optimal length of a cache line? 2. Where should a new line be placed? 3. How do we find a location in the cache memory? 4. Which line should be replaced when the memory is full and new data is requested? 5. How are the "write" operations handled?  Cache memory architectures: cache memory with direct mapping, associative cache memory, set-associative cache memory, cache memory organized on sectors

Cache memory with direct mapping
[Figure: direct-mapped cache organization, showing the tag field of the address]

Cache memory with direct mapping  Principle: the address of the line in the cache memory is determined directly from the location's physical address – direct mapping; the tag is used to distinguish lines that map to the same position in the cache memory  Advantages: simple to implement; easy to place, find and replace a cache line  Drawbacks: in some cases, lines are repeatedly replaced even if the cache memory is not full; inefficient use of the cache memory space
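The direct-mapping principle can be sketched in a few lines. The line size (64 bytes) and number of lines (256) below are illustrative assumptions, not values from the slides; the tag/index/offset split itself is the standard direct-mapped scheme.

```python
LINE_SIZE = 64    # assumed line size, in bytes
NUM_LINES = 256   # assumed number of lines in the cache

def split_address(addr):
    """Split a physical address into (tag, index, offset) for a direct-mapped cache.

    offset selects the byte inside the line, index selects the cache line
    directly, and the tag distinguishes addresses that map to the same line.
    """
    offset = addr % LINE_SIZE
    index = (addr // LINE_SIZE) % NUM_LINES
    tag = addr // (LINE_SIZE * NUM_LINES)
    return tag, index, offset
```

Two addresses whose `index` fields are equal but whose `tag` fields differ compete for the same cache line, which is exactly the "repeated replacement even when the cache is not full" drawback mentioned above.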

Associative cache memory

 Principle: a line is placed in any free zone of the cache memory; a location is found by comparing its descriptor (tag) with the descriptors of the lines present in the cache memory  hardware comparison – (too) many compare circuits  sequential comparison – too slow  Advantage: efficient use of the cache memory's capacity  Drawback: limited number of cache lines, and therefore limited cache capacity – because of the comparison operation

Set associative cache memory

 Principle: combination of the associative and direct-mapping designs: lines are organized in blocks (sets); block identification through direct mapping; line identification (inside the block) through the associative method  Advantages: combines the advantages of the two techniques:  many lines are allowed, no capacity limitation  efficient use of the whole cache capacity  Drawback: more complex implementation
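A toy model of the set-associative scheme, assuming (my parameters, not the slides') 4 sets, 2 ways per set, 64-byte lines and LRU replacement: the set is chosen by direct mapping, and the tag is searched associatively inside the set.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy set-associative cache: tracks only which lines are present (LRU)."""

    def __init__(self, num_sets=4, ways=2, line_size=64):
        self.num_sets, self.ways, self.line_size = num_sets, ways, line_size
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        line = addr // self.line_size
        idx = line % self.num_sets        # set chosen by direct mapping
        tag = line // self.num_sets       # tag compared associatively in the set
        s = self.sets[idx]
        if tag in s:
            s.move_to_end(tag)            # hit: mark as most recently used
            return True
        if len(s) >= self.ways:
            s.popitem(last=False)         # set full: evict the LRU line
        s[tag] = True
        return False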

Cache memory organized on sectors

 Principle: similar to the set-associative cache, but the order is reversed: the sector (block) is identified through the associative method and the line inside the sector through direct mapping  Advantages and drawbacks: similar to the previous method

Writing operation in the cache memory  The problem: writing in the cache memory creates inconsistency between the main memory and the copy in the cache  Two techniques: Write back – writes the data to the internal memory only when the line is evicted (replaced) from the cache memory  Advantage: write operations are made at the speed of the cache memory – high efficiency  Drawback: temporary inconsistency between the two memories – it may be critical in multi-master (e.g. multi-processor) systems, because it may generate errors Write through – writes the data to the cache and to the main memory at the same time  Advantage: no inconsistency  Drawback: write operations are made at the speed of the internal memory (much lower speed); however, write operations are not that frequent (roughly 1 write per 10 read/write operations)
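The efficiency difference between the two policies can be made concrete by counting main-memory writes for one cached line. This is a simplified sketch of the idea, not an implementation from the course:

```python
def writes_to_main_memory(ops, policy):
    """Count main-memory writes for one cached line.

    ops: a sequence of 'r' / 'w' accesses to the same line, after which the
    line is evicted from the cache.
    """
    if policy == "through":
        # write-through: every single write also goes to main memory
        return sum(1 for op in ops if op == "w")
    # write-back: one write at eviction, and only if the line was modified
    return 1 if "w" in ops else 0
```

For a line written three times before eviction, write-through pays three slow memory writes while write-back pays one, at the cost of the temporary inconsistency described above.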

The efficiency of the cache memory

  t_a = t_c + (1 - R_s) * t_i

where:
  t_a – average access time
  t_c – access time of the cache memory
  t_i – access time of the internal memory
  R_s – success (hit) rate
  (1 - R_s) – miss rate

Virtual memory  Objectives: Extension of the internal memory over the external memory Protection of memory zones from unauthorized accesses  Implementation techniques: Paging Segmentation

Segmentation  Divide the memory into blocks (segments)  A location is addressed with: Segment_address + Offset_address = Physical_address  Attributes attached to a segment control the operations allowed in the segment and describe its content  Advantages: the access of a program or task is limited to the locations contained in the segments allocated to it memory zones may be separated according to their content or destination: code, data, stack a location address inside a segment requires fewer address bits – it is only a relative/offset address  consequence: shorter instructions, less memory required segments may be placed in different memory zones  changing the location of a program does not require changing the relative addresses (e.g. label addresses, variable addresses)

Segmentation for Intel Processors
[Figure: address computation in Real mode]
[Figure: address computation in Protected mode]
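The Real-mode computation shown in the figure is simple enough to state directly: the 16-bit segment register is shifted left by 4 bits and added to the 16-bit offset, giving a 20-bit physical address (this is the standard 8086 scheme).

```python
def real_mode_physical(segment, offset):
    """8086 Real mode: physical = segment * 16 + offset, truncated to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF
```

For example, segment 0x1234 with offset 0x0010 addresses physical location 0x12350.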

Segmentation for Intel Processors  Details about segmentation in Protected mode: Selector:  contains: Index – the place of a segment descriptor in a descriptor table TI – table identification bit: GDT or LDT RPL – requested privilege level – the privilege level a task needs in order to access the segment Segment descriptor:  controls the access to the segment through: the address of the segment the length of the segment access rights (privileges) flags Descriptor tables:  Global Descriptor Table (GDT) – for common segments  Local Descriptor Tables (LDT) – one for each task; contains the descriptors of segments allocated to that task Descriptor types:  descriptors for Code or Data segments  system descriptors  gate descriptors – controlled entry points into the operating system

Protection mechanisms assured through segmentation (Intel processors)  Access to the memory (only) through descriptors stored in the GDT and LDTs: the GDT keeps the descriptors of segments accessible to several tasks an LDT keeps the descriptors of segments allocated to just one task => protected segments  Read and write operations are allowed in accordance with the type of the segment (code or data) and with some flags (contained in the descriptor): for Code segments: instruction fetch and, optionally, data read for Data segments: read and, optionally, write operations  Privilege levels: 4 levels, 0 most privileged, 3 least privileged levels 0, 1 and 2 are allocated to the operating system, the last one to user programs a less privileged task cannot access a more privileged segment (e.g. a segment belonging to the operating system)

Paging  The internal and external memory is divided into blocks (pages) of fixed length  The internal memory is virtually extended over the external memory (e.g. hard disk)  Only those pages that have a high probability of being used in the future are brought into the internal memory justified by the temporal and spatial locality and the 90/10 principles  Implementation – similar to the cache memory  Design issues: Optimal dimension of a page Placement of a new page in the internal memory Finding a page in the memory Selecting the page to be evicted, in case the internal memory is full Implementation of "write" operations
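The translation step can be sketched with a flat page table; the 4 KB page size is an assumption (it matches common x86 practice, but is not stated on the slides), and the dictionary stands in for the hardware page-table walk.

```python
PAGE_SIZE = 4096  # assumed page size: 4 KB

def translate(virt, page_table):
    """Translate a virtual address using a flat page table (dict: VPN -> frame).

    The virtual page number selects a physical frame; the offset inside the
    page is carried over unchanged. A missing entry means the page is not in
    internal memory and must be fetched from the external (virtual) memory.
    """
    vpn, offset = divmod(virt, PAGE_SIZE)
    frame = page_table.get(vpn)
    if frame is None:
        raise LookupError("page fault")   # page must be loaded from disk first
    return frame * PAGE_SIZE + offset
```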

Paging – implementation through associative technique

Paging implemented in Intel processors

Paging – Write operation  Problem: inconsistency between the internal memory and the virtual one it is critical in multi-master (multi-processor) systems  Solution: Write back the write-through technique is not feasible because of the very long access time (very low speed) of the virtual (external) memory