Computer Organization


Computer Organization 2015 - 2016

MEMORY

Review: Major Components of a Computer
[Figure: a processor (control + datapath) connected to memory (cache, main memory, secondary memory/disk) and to input/output devices.]

Basic concepts
The maximum size of the memory depends on the addressing scheme: a 16-bit computer generates 16-bit addresses and can address up to 2^16 memory locations. Most modern computers are byte-addressable. Memory is designed to store and retrieve data in word-length quantities. The word length of a computer is commonly defined as the number of bits stored and retrieved in one memory operation.
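
As a quick illustration of the relationship between address width and addressable locations, here is a minimal C sketch (the function name and parameters are illustrative, not from the slides):

```c
#include <stdio.h>
#include <stdint.h>

/* Number of distinct locations a k-bit address can name: 2^k. */
static uint64_t addressable_locations(unsigned address_bits) {
    return 1ULL << address_bits;
}

int main(void) {
    /* A 16-bit computer generates 16-bit addresses. */
    printf("16-bit addresses: %llu locations\n",
           (unsigned long long)addressable_locations(16)); /* 65536 */
    return 0;
}
```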

[Figure: processor-memory connection. A k-bit address bus connects the processor's MAR to the memory (up to 2^k addressable locations); an n-bit data bus connects the MDR (word length = n bits); control lines carry R/W and other control signals.]
Recall that data transfers between a processor and memory involve two registers, MAR and MDR. If the address bus is k bits wide, then the length of MAR is k bits. If the word length is n bits, then the length of MDR is n bits. The control lines include R/W and MFC (Memory Function Complete). For a Read operation R/W = 1, and for a Write operation R/W = 0.

Basic concepts (cont.)
Two measures for the speed of a memory:
- Memory access time: the elapsed time between the initiation of an operation and its completion (e.g., the time between the Read signal and the MFC signal).
- Memory cycle time: the minimum time between the initiation of two successive memory operations (e.g., the time between two successive Read operations).
The memory cycle time is slightly longer than the memory access time.

Basic concepts (cont.)
In general, the faster a memory system is, the costlier and the smaller it is. An important design issue is to provide a computer system with as large and fast a memory as possible within a given cost target. Several techniques are used to increase the effective size and speed of the memory: cache memory (to increase the effective speed) and virtual memory (to increase the effective size).

RAM memories
A Random Access Memory (RAM) is a memory unit in which any location can be accessed in a fixed amount of time, independent of the location's address.
Internal organization of memory chips: each memory cell can hold one bit of information, and the cells are organized in the form of an array. One row of the array is one memory word; all cells of a row are connected to a common line known as the "word line".

RAM memories
[Figure: internal organization of a memory chip with 16 words of 8 bits each, i.e., a total size of 128 bits (16 x 8).]

RAM memories (cont.)
Static RAMs (SRAMs):
- Consist of circuits that are capable of retaining their state as long as power is applied.
- Volatile memories: their contents are lost when power is interrupted.
- Access times are in the range of a few nanoseconds.
- Capacity is low: each cell uses six transistors, making it impossible to pack a large number of cells onto a single chip.
- Cost is usually high.

RAM memories (cont.)
Dynamic RAMs (DRAMs):
- Also volatile memories.
- Do not retain their state indefinitely: contents must be periodically refreshed. Contents may be refreshed while they are being accessed for reading.
- Access time is longer than SRAM.
- Capacity is higher than SRAM (one-transistor cells).
- Cost is lower than SRAM.

RAM memories (cont.)
Various factors, such as cost, speed, power consumption, and chip size, determine how a RAM is chosen for a given application.
Static RAMs: chosen when speed is the primary concern. The circuit implementing the basic cell is complex, which affects cost and size. Used mostly in cache memories.
Dynamic RAMs: predominantly used for implementing computer main memories. High capacities are available in these chips, making them economically suitable for implementing large memories.

Read-Only Memories (ROMs)
Many applications need memory devices that retain their contents after the power is turned off. For example, when a computer is turned on, the operating system must be loaded from the disk into memory. The instructions that load the OS from the disk must be stored so that they are not lost when the power is turned off; that is, they must be stored in a non-volatile memory.

Read-Only Memories (cont.)
Read-Only Memory (ROM): data are written into a ROM when it is manufactured.
Programmable Read-Only Memory (PROM): allows the data to be loaded by the user; the process of inserting the data is irreversible. Storing user-specific information in a factory-programmed ROM is expensive, so providing programming capability to the user may be better.

Read-Only Memories (cont.)
Erasable Programmable Read-Only Memory (EPROM): allows stored data to be erased and new data to be loaded. This flexibility is useful during the development phase of digital systems. Erasure requires exposing the chip to ultraviolet light.
Electrically Erasable Programmable Read-Only Memory (EEPROM): the contents can be written and erased electrically.

Secondary Memory
Secondary memory (magnetic disks): the storage provided by DRAMs is larger than that of SRAMs, but it is still less than what is needed. Secondary storage such as magnetic disks provides a large amount of storage, but it is much slower than DRAM.

The Memory Hierarchy: Goal
Fact: large memories are slow, and fast memories are small. How do we create a memory that gives the illusion of being large, cheap, and fast (most of the time)? With hierarchy and with parallelism.

Memory Hierarchy
[Figure: the memory hierarchy, from on-chip components (register file, instruction and data caches, control, datapath) through a second-level cache (SRAM) and main memory (DRAM) down to secondary memory (disk).]
Speed (cycles): 1/2 | 1 | 10 | 100 | 10,000
Size (bytes): 100s | 10KB | MB | GB | TB
Cost per byte: highest at the top, lowest at the bottom
The memory system of a modern computer consists of a series of levels ranging from the fastest to the slowest. Besides speed, these levels also vary in size (smallest to biggest) and in cost. What makes this arrangement work is one of the most important principles in computer design, the principle of locality: programs access a relatively small portion of the address space at any instant of time. The design goal is to present the user with as much memory as is available in the cheapest technology while, by taking advantage of the principle of locality, providing an average access speed close to that offered by the fastest technology. (Caches are covered in detail in the following slides.)

The Memory Hierarchy: How Does it Work?
Cache memory is based on the concept of locality of reference: if the active segments of a program (instructions or operands) are placed in a fast cache memory, the execution time can be reduced.
Temporal locality (locality in time): if a memory location is referenced, it will tend to be referenced again soon. Therefore, keep the most recently accessed data items closer to the processor.
Spatial locality (locality in space): if a memory location is referenced, locations with nearby addresses will tend to be referenced soon. Therefore, move blocks consisting of contiguous words closer to the processor.
In other words, to exploit temporal locality the hierarchy keeps recently accessed items in the upper levels, because the processor is likely to access them again soon; to exploit spatial locality, not only the item that has just been accessed is moved to the upper level, but also the data items adjacent to it.
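
As a small illustration of these two kinds of locality (a hedged sketch, not part of the original slides), consider summing a 2-D array in C. Traversing row by row touches contiguous addresses (spatial locality), while the loop variables and the accumulator are reused on every iteration (temporal locality):

```c
#include <stdio.h>

#define ROWS 256
#define COLS 256

int main(void) {
    static int a[ROWS][COLS];  /* stored row-major: a[i][j] and a[i][j+1] are adjacent */
    long sum = 0;

    /* Row-by-row traversal: consecutive iterations access consecutive
       addresses (spatial locality), so each fetched cache block is fully
       used. sum, i, and j are touched every iteration (temporal locality). */
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += a[i][j];

    /* Swapping the loops (column-by-column) would stride COLS*sizeof(int)
       bytes between accesses, wasting most of each fetched block. */
    printf("%ld\n", sum);
    return 0;
}
```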

The Memory Hierarchy: Terminology
Block: the minimum unit of information that can be present in a cache.
Hit: the data the processor wants to access is found in the upper level.
Hit rate: the fraction of memory accesses found in a level of the memory hierarchy.
Hit time: the time to access that level, which consists of the time to access the block plus the time to determine hit/miss.
Miss: the data cannot be found in the upper level, and the block must be retrieved from the lower level.

The Memory Hierarchy: Terminology (cont.)
Miss rate: the fraction of memory accesses not found in a level of the memory hierarchy; miss rate = 1 - (hit rate).
Miss penalty: the time to replace a block in that level with the corresponding block from a lower level, which consists of the time to access the block in the lower level + the time to transmit that block to the level that experienced the miss + the time to insert the block in that level + the time to pass the block to the requestor.
The hit time must be much smaller than the miss penalty (Hit Time << Miss Penalty); otherwise there would be no reason to build a memory hierarchy.
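
These quantities combine into the standard average memory access time formula, AMAT = hit time + miss rate x miss penalty. The slides do not compute it, but a minimal C sketch with made-up example numbers looks like this:

```c
#include <stdio.h>

/* Average memory access time: hit_time + miss_rate * miss_penalty.
   All times are in processor cycles; miss_rate is a fraction in [0, 1]. */
static double amat(double hit_time, double miss_rate, double miss_penalty) {
    return hit_time + miss_rate * miss_penalty;
}

int main(void) {
    /* Hypothetical numbers: 1-cycle hit, 5% miss rate, 100-cycle penalty. */
    printf("AMAT = %.1f cycles\n", amat(1.0, 0.05, 100.0)); /* 6.0 cycles */
    return 0;
}
```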

How is the Hierarchy Managed?
- registers <-> memory: by the compiler (and the programmer?)
- cache <-> main memory: by the cache controller hardware
- main memory <-> disks: by the operating system (virtual memory) and by the programmer (files)

Cache Basics
[Figure: processor connected to main memory through a cache.]
When the processor issues a Read request, a block of words is transferred from the main memory to the cache, one word at a time.

Cache Basics (cont.)
Which blocks of the main memory are in the cache is determined by a "mapping function". When the cache is full and a new block of words needs to be transferred from the main memory, another block of words in the cache must be replaced; which one is determined by a "replacement algorithm".

Mapping functions
Mapping functions determine how memory blocks are placed in the cache. A simple example, used in the following slides:
- Cache consisting of 128 blocks of 16 words each, so the total cache size is 2048 (2K) words.
- Main memory addressable by a 16-bit address, so it holds 64K words, i.e., 4K blocks of 16 words each.
- Consecutive addresses refer to consecutive words.
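
To make the arithmetic explicit, here is a small C sketch that derives these numbers from the basic parameters (the variable names are illustrative, not from the slides):

```c
#include <stdio.h>

int main(void) {
    const unsigned address_bits    = 16;   /* 16-bit memory address */
    const unsigned words_per_block = 16;
    const unsigned cache_blocks    = 128;

    unsigned memory_words  = 1u << address_bits;              /* 2^16 = 65536 (64K) */
    unsigned memory_blocks = memory_words / words_per_block;  /* 4096 (4K)          */
    unsigned cache_words   = cache_blocks * words_per_block;  /* 2048 (2K)          */

    printf("main memory: %u words, %u blocks\n", memory_words, memory_blocks);
    printf("cache:       %u words, %u blocks\n", cache_words, cache_blocks);
    return 0;
}
```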

Mapping functions (cont.)
Three mapping functions:
- Direct mapping
- Associative mapping
- Set-associative mapping

Direct mapping
[Figure: direct-mapped cache of 128 tagged blocks in front of a main memory of 4096 blocks. The 16-bit main memory address is divided into a 5-bit tag, a 7-bit block field, and a 4-bit word field.]
Block j of the main memory maps to cache block j % 128. So memory blocks 0, 128, 256, ... map to cache block 0; memory blocks 1, 129, 257, ... map to cache block 1; and so on. More than one memory block is mapped onto the same position in the cache, which may lead to contention for cache blocks even if the cache is not full. The contention is resolved by allowing the new block to replace the old block, leading to a trivial replacement algorithm.

Direct mapping (cont.)
The memory address is divided into three fields:
- The low-order 4 bits determine one of the 16 words in a block.
- When a new block is brought into the cache, the next 7 bits determine which cache block this new block is placed in.
- The high-order 5 bits determine which of the 32 possible memory blocks that map to this cache position is currently present in the cache; they are stored as the tag.
Tag size = number of memory blocks / number of cache blocks = 4096 / 128 = 32 = 2^5, i.e., 5 tag bits.
Simple to implement, but not very flexible.
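
The field extraction is plain shifting and masking. Here is a hedged C sketch of the decode for this 16-bit address format (5-bit tag, 7-bit block, 4-bit word; the function and variable names are illustrative):

```c
#include <stdio.h>
#include <stdint.h>

/* Decode a 16-bit address for the direct-mapped cache of the example:
   [ tag:5 | block:7 | word:4 ]  (high bits to low bits) */
static void direct_decode(uint16_t addr,
                          unsigned *tag, unsigned *block, unsigned *word) {
    *word  =  addr        & 0xF;   /* low 4 bits: word within the block */
    *block = (addr >> 4)  & 0x7F;  /* next 7 bits: cache block number   */
    *tag   = (addr >> 11) & 0x1F;  /* high 5 bits: tag                  */
}

int main(void) {
    unsigned tag, block, word;
    direct_decode(0xABCD, &tag, &block, &word);
    /* memory block number = addr / 16; it maps to cache block (addr/16) % 128 */
    printf("tag=%u block=%u word=%u\n", tag, block, word); /* 21, 60, 13 */
    return 0;
}
```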

Associative mapping
[Figure: fully associative cache of 128 tagged blocks. The 16-bit main memory address is divided into a 12-bit tag and a 4-bit word field.]
A main memory block can be placed into any cache position. The memory address is divided into two fields:
- The low-order 4 bits identify the word within a block.
- The high-order 12 bits (the tag bits) identify a memory block when it is resident in the cache.
Flexible, and uses the cache space efficiently. Replacement algorithms can be used to replace an existing block in the cache when the cache is full. The cost is higher than for a direct-mapped cache because of the need to search the tags of all 128 cache blocks to determine whether a given block is in the cache.
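
In hardware the 128 tag comparisons happen in parallel; in software the lookup can be sketched as a linear search over the tags (a hedged illustration, not the slides' material; the struct and function names are hypothetical):

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define CACHE_BLOCKS 128

/* One cache block: a valid bit plus the 12-bit tag (block data omitted). */
struct line { bool valid; uint16_t tag; };

/* Fully associative lookup: the block may reside anywhere, so every
   tag must be compared (in parallel in hardware, serially here). */
static int assoc_lookup(const struct line cache[], uint16_t addr) {
    uint16_t tag = addr >> 4;                 /* high 12 bits of the address */
    for (int i = 0; i < CACHE_BLOCKS; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return i;                         /* hit: matching block index */
    return -1;                                /* miss */
}

int main(void) {
    struct line cache[CACHE_BLOCKS] = {0};
    cache[5] = (struct line){ true, 0xABC };     /* pretend block 0xABC is cached */
    printf("%d\n", assoc_lookup(cache, 0xABCD)); /* tag 0xABC -> hit at index 5 */
    return 0;
}
```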

Set-Associative mapping
[Figure: 2-way set-associative cache: 128 tagged blocks grouped into 64 sets of two blocks each. The 16-bit main memory address is divided into a 6-bit tag, a 6-bit set field, and a 4-bit word field.]
Set-associative mapping is a combination of direct and associative mapping. The blocks of the cache are grouped into sets, and the mapping function allows a block of the main memory to reside in any block of a specific set. Here the cache is divided into 64 sets with two blocks per set; memory blocks 0, 64, 128, etc. map to set 0, and each can occupy either of the two positions in that set.

Set-Associative mapping (cont.)
The memory address is divided into three fields:
- The low-order 4 bits identify the word within a block.
- The next 6 bits determine the set number.
- The high-order 6 bits are compared to the tag fields of the two blocks in the selected set.
Tag size = number of memory blocks / number of cache sets = 4096 / 64 = 64 = 2^6, i.e., 6 tag bits.
The number of blocks per set is a design parameter. One extreme is to have all the blocks in one set, requiring no set bits (fully associative mapping); the other extreme is to have one block per set, which is the same as direct mapping.
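
Putting the three fields together, here is a hedged C sketch of a 2-way set-associative lookup for this address format (6-bit tag, 6-bit set, 4-bit word; the names are illustrative, not from the slides). Only the two tags of the selected set need to be compared, which is the compromise between direct and fully associative mapping:

```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define SETS 64
#define WAYS 2

struct line { bool valid; uint16_t tag; };    /* block data omitted */

/* 2-way set-associative lookup for the example address format:
   [ tag:6 | set:6 | word:4 ] within a 16-bit address. */
static int set_assoc_lookup(struct line cache[SETS][WAYS], uint16_t addr) {
    unsigned set = (addr >> 4)  & 0x3F;       /* middle 6 bits select the set */
    unsigned tag = (addr >> 10) & 0x3F;       /* high 6 bits are the tag      */
    for (int way = 0; way < WAYS; way++)      /* only two tags to compare     */
        if (cache[set][way].valid && cache[set][way].tag == tag)
            return way;                       /* hit: which block of the set  */
    return -1;                                /* miss: replacement needed     */
}

int main(void) {
    static struct line cache[SETS][WAYS];
    uint16_t addr = 0xABCD;                   /* set = 60, tag = 42 */
    cache[(addr >> 4) & 0x3F][1] = (struct line){ true, (addr >> 10) & 0x3F };
    printf("%d\n", set_assoc_lookup(cache, addr)); /* -> 1 (hit in way 1) */
    return 0;
}
```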