1
Memory Organization
2
Data Organization
Big endian
- Most significant byte stored in the first memory location
- Each additional n bytes stored in the next n locations
Little endian
- Least significant byte stored in the first memory location
- Each additional n bytes stored in the next n locations
Alignment

- Data often requires more than one byte to represent a value.
- Memory is byte addressed, so such values must be stored in more than one location.
- Neither format is better than the other. The CPU expects data to be stored in one or the other.
- Problems arise when data is transferred between computers that use different organizations.
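The two byte orders can be illustrated with a short sketch. The value 0x12345678 and the helper name show are assumptions chosen only for illustration; int.to_bytes and int.from_bytes are standard Python.

```python
value = 0x12345678  # a 4-byte value chosen only for illustration

# Lay the value out as it would sit in byte-addressed memory.
big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first


def show(label, data):
    """Print each byte next to the memory offset it would occupy."""
    cells = "  ".join(f"[{offset}]={byte:02X}" for offset, byte in enumerate(data))
    print(f"{label:<14}{cells}")


show("big endian", big)
show("little endian", little)

# Interpreting the bytes with the wrong order silently produces a different
# number, which is the problem that arises when machines exchange raw data.
misread = int.from_bytes(big, byteorder="little")
print(f"written as {value:#010x}, misread as {misread:#010x}")
```

Both layouts hold the same four bytes; only the order of the memory locations differs.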
3
Memory Organization/Interfacing
Types
- ROM
  - Masked ROM
  - PROM
  - EPROM
  - EEPROM
4
Memory…
- RAM
  - DRAM
  - SRAM
Organization
- Linear
- Two-dimensional
5
Memory Configuration
Single chip
- Address bus, data bus, and control bus are connected to the memory chip
Multiple chips
- Address bus and control bus are connected to the chips
- Different bits of the data bus are connected to the data pins
6
Computer Architectures
von Neumann
- Instructions and data stored in the same memory module
Harvard
- Separate memory modules for instructions and data
Modern PCs
- Harvard architecture used in cache memory
7
Memory Hierarchy
Hierarchical memory system
- Registers
- Cache
- Main memory
- Secondary memory

- The memory hierarchy is one of the most important considerations in understanding the performance capabilities of a processor.
- Some types of memory are far less efficient (and cheaper) than others.
- Computer systems use a combination of memory types to provide the best performance at the best cost (the hierarchical memory approach).
- In general, the faster the memory, the more expensive it is per bit of storage.
- By using a hierarchy of memories, each with different access speeds and storage capacities, a computer system can exhibit performance above what would be possible without a combination of the various types.
- Memory is classified based on its distance from the processor.
- Distance is measured in the number of machine cycles it takes to access the memory (closer means faster).
8
Memory Hierarchy Terminology
- Hit: requested data resides in a given level of memory
- Miss: requested data not found in the given level of memory
- Hit rate: percentage of memory accesses found in a given level of memory
- Miss rate: percentage of memory accesses not found in a given level of memory (1 - hit rate)
- Hit time: time required to access the requested data in a given level of memory
- Miss penalty: time required to process a miss

- Typically, we are concerned with the hit rate only for upper levels of memory.
- The miss penalty includes the time to replace a block in upper-level memory plus the additional time to deliver the requested data to the processor. The time to process a miss is typically significantly larger than the time to process a hit.
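The terms above can be combined into an average access time for one level of the hierarchy. The sketch below assumes a common textbook model in which every access pays the hit time and a miss additionally pays the miss penalty; the slides do not state this formula, and the function name and sample numbers are illustrative only.

```python
def average_access_time(hit_rate, hit_time_ns, miss_penalty_ns):
    """Average time per access for one memory level, in nanoseconds.

    Assumed model: every access pays the hit time; a miss additionally
    pays the miss penalty.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_time_ns + miss_rate * miss_penalty_ns


# Illustrative numbers only: a 10 ns cache in front of memory with a
# 100 ns miss penalty, at several hit rates.
for rate in (0.80, 0.95, 0.99):
    print(f"hit rate {rate:.2f}: {average_access_time(rate, 10, 100):.1f} ns on average")
```

Because the miss penalty is much larger than the hit time, small changes in hit rate move the average noticeably.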
9
Memory Hierarchy Access Time
Level           Access time         Category
Registers       1 ns to 2 ns        System
L1 Cache        3 ns to 10 ns       System
L2 Cache        25 ns to 50 ns      System
Main Memory     30 ns to 90 ns      System
Fixed disk      5 ms to 29 ms       Online
Optical disk    100 ms to 5 s       Near line
Magnetic        10 s to 3 min       Offline
10
Locality of Reference
- Temporal locality: recently accessed items tend to be accessed again in the near future
- Spatial locality: accesses tend to be clustered in the address space (arrays or loops)
- Sequential locality: instructions tend to be accessed sequentially

- Processors access memory in a patterned way. If memory location X is accessed at time t, there is a high probability that location X+1 will be accessed in the near future.
- Locality of reference can be exploited by implementing the memory as a hierarchy: when a miss is processed, instead of transferring only the requested data to the higher level, the entire block containing the data is transferred.
- It is likely that the additional data in the block will be needed in the near future; if so, it can be loaded quickly from the faster memory.
- This principle lets a system use a small amount of very fast memory to accelerate the majority of memory accesses.
11
Cache
- Small
- High speed
- Temporarily stores data from frequently used memory locations
- Connected to main memory

- Very high speed, small amount of memory
- L2: typically 256K or 512K; resides between the CPU and main memory
- L1: smaller (8K or 16K); resides on the processor
- Purpose is to speed up memory accesses by storing recently used data closer to the CPU instead of storing it in main memory.
- Cache is composed of SRAM.
- Cache is not accessed by address; it is accessed by content (content addressable memory).
12
Cache Mapping Schemes
The mapping scheme determines where the data is placed when it is originally copied into cache and provides a method for the CPU to find previously copied data when searching cache.
- Direct mapped cache
- Fully associative cache
- Set associative cache

- For cache to be functional it must store useful data. The data isn't useful, though, if the CPU can't find it.
- When accessing data or instructions, the CPU first generates a main memory address. If the data has been copied to cache, the address of the data in cache is not the same as the main memory address.
- How does the CPU find the data when it is in cache? It uses a specific mapping scheme that "converts" the main memory address into a cache location by giving special significance to the bits in the main memory address.
- The bits are divided into distinct groups called fields. Depending on the mapping scheme there may be 2 or 3 fields. How the fields are used depends on the mapping scheme as well.
13
Direct Mapped Cache
Modular approach
- Block X of main memory is mapped to block Y of cache, where Y = X mod N and N is the total number of blocks in cache.
- In direct mapping the binary main memory address is partitioned into three fields: Tag | Block | Word
- There are more main memory blocks than cache blocks, so main memory blocks compete for cache locations.
- Inexpensive, restrictive approach: a given block of memory can only be placed in one particular block in cache.
14
Example
- Small system with 16 words of main memory divided into 8 blocks (each block has 2 words).
- Assume cache is 4 blocks in size (total of 8 words).
- Main memory address has 4 bits (2^4 = 16 words).
- The 4-bit address is divided into three fields: word field 1 bit, block field 2 bits, tag field 1 bit.
Mapping:
Main memory                     Maps to cache
Block 0 (addresses 0, 1)        Block 0
Block 1 (addresses 2, 3)        Block 1
Block 2 (addresses 4, 5)        Block 2
Block 3 (addresses 6, 7)        Block 3
Block 4 (addresses 8, 9)        Block 0
Block 5 (addresses 10, 11)      Block 1
Block 6 (addresses 12, 13)      Block 2
Block 7 (addresses 14, 15)      Block 3
15
Main Memory Address 9 = 1001 in binary
Split into fields:
- tag = 1 (1 bit)
- block = 00 (2 bits)
- word = 1 (1 bit)
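The field split in this example can be sketched in a few lines. The field widths come from the example (1 tag bit, 2 block bits, 1 word bit); the function and constant names are assumptions made for illustration.

```python
WORD_BITS = 1     # 2 words per block
BLOCK_BITS = 2    # 4 cache blocks
ADDRESS_BITS = 4  # 16 words of main memory


def split_direct_mapped(address):
    """Return the (tag, block, word) fields of a main memory address."""
    word = address & ((1 << WORD_BITS) - 1)
    block = (address >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
    tag = address >> (WORD_BITS + BLOCK_BITS)
    return tag, block, word


for address in range(1 << ADDRESS_BITS):
    tag, block, word = split_direct_mapped(address)
    memory_block = address >> WORD_BITS  # main memory block number
    print(f"address {address:2d} = {address:04b}: memory block {memory_block} "
          f"-> cache block {block}, tag {tag}, word {word}")
```

The block field is the main memory block number mod 4, which reproduces the mapping listed in the example, and the tag distinguishes the two memory blocks that share each cache block.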
16
Fully Associative Cache
Built from associative memory so it can be searched in parallel. A single search must compare the requested tag to all tags in cache to determine if the block is present. Special hardware required to allow associative searching (expensive). Block of memory can be placed in any block in cache. Not as restrictive as direct mapping. Requires a larger tag to be stored, which results in a larger cache.
17
Set Associative Cache
N-way set associative cache mapping
- Combination of direct mapped and fully associative
- The address maps the block to a set of cache blocks
- Address is partitioned into three fields: tag, set, and word

- This scheme sits between fully associative and direct mapped.
- All sets in cache must be the same size: in a 2-way set associative cache each set holds two blocks, in an 8-way cache each set holds 8 blocks, and so on.
- Tag and word fields are the same as in direct mapped. The set field indicates into which cache set the main memory block maps.
- Example: 2-way set associative mapping with a main memory of 2^14 words and a cache of 16 blocks of 8 words each gives 8 sets in cache. The main memory address has to be 14 bits long, the set field has to be 3 bits, the word field is 3 bits, and the tag field is the remaining 8 bits.
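The field widths in the 2-way example can be checked with a small calculation. This is a sketch that assumes all sizes are powers of two; the function name field_widths and its parameters are hypothetical.

```python
def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def field_widths(address_bits, cache_blocks, words_per_block, ways):
    """Return (tag_bits, set_bits, word_bits) for an N-way set associative cache."""
    for size in (cache_blocks, words_per_block, ways):
        if not is_power_of_two(size):
            raise ValueError("sizes are assumed to be powers of two")
    sets = cache_blocks // ways
    word_bits = (words_per_block - 1).bit_length()  # log2 of words per block
    set_bits = (sets - 1).bit_length()              # log2 of number of sets
    tag_bits = address_bits - set_bits - word_bits  # whatever remains
    return tag_bits, set_bits, word_bits


# The example from the slide: 2^14 words, 16 blocks of 8 words, 2-way.
print(field_widths(address_bits=14, cache_blocks=16, words_per_block=8, ways=2))

# The same cache organised as direct mapped (1-way) and fully associative (16-way).
print(field_widths(14, 16, 8, ways=1))
print(field_widths(14, 16, 8, ways=16))
```

With one block per set the set field plays the role of the direct mapped block field, and with a single set it disappears and the tag grows, which matches the larger tag noted for fully associative cache.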
18
Main Memory Medium speed Much larger than cache
Complemented by a very large secondary memory Composed of DRAM
19
Secondary Memory Very large Slower access Hard disk Removable media
20
Virtual Memory
- Virtual Address
- Physical Address
- Mapping
- Page Frames
- Pages
- Paging
- Fragmentation
- Page Fault

- Virtual memory uses hard disk space as an extension of RAM and increases the available address space a process can use.
- This allows a program to run when only specific pieces are present in memory. Parts not currently being used are stored in the page file on disk.
- Even 512 MB of RAM is not enough memory to hold the OS and multiple applications concurrently.
- The area on the hard drive used for virtual memory is called a page file.
- The most common way to implement virtual memory is paging.

- Virtual address: the logical or program address that the process uses. Whenever the CPU generates an address, it is always in terms of virtual address space.
- Physical address: the real address in physical memory.
- Mapping: the mechanism by which virtual addresses are translated into physical ones (similar to cache mapping).
- Page frames: the equal-size chunks or blocks into which main memory is divided.
- Pages: the chunks or blocks into which virtual memory (the logical address space) is divided, each equal in size to a page frame. Virtual pages are stored on disk until needed.
- Paging: the process of copying a virtual page from disk to a page frame in main memory. This is the most popular implementation of virtual memory, which can also be implemented with segmentation or a combination of paging and segmentation.
- Fragmentation: memory that becomes unusable (for example, the system allocates more memory to a process than it needs because it has to allocate a whole page).
- Page fault: a requested page is not in main memory and must be copied into memory from disk.
- The success of paging is very dependent on the locality principle, just like caching.
21
Paging
- Allocate physical memory to processes in fixed-size chunks
- Page table in main memory (typically)
  - N rows (N = number of virtual pages in the process)
  - Valid bit: 0 means the page is not in main memory, 1 means the page is in main memory

- Every process has its own page table, which resides in main memory.
- The page table stores the physical location of each virtual page of the process.
- Additional fields can be added to the page table to provide more information:
  - Dirty bit (also known as modify bit): indicates whether the page has been changed. If the page has not been modified, it does not need to be rewritten to disk.
  - Usage bit: indicates page usage. Set to 1 whenever the page is accessed; set to 0 after a certain period of time.
22
How Paging Works
- Extract the page number
- Extract the offset
- Translate the page number into a physical page frame number using the page table

- The OS dynamically translates a virtual address generated by a process into the physical address in main memory where the data actually resides.
- To convert the address, the virtual address is divided into two fields: page and offset.
- The offset field represents the offset within the page where the data is located.
23
How Paging Works cont'd
- Look up the page number in the page table
- Check the valid bit
- Valid bit = 0: the system generates a page fault and the OS must intervene
  1. Locate the page on disk
  2. Find a free page frame
  3. Copy the page into the free page frame
  4. Update the page table
  5. Resume execution of the process

- If a process has free frames in main memory when a page fault occurs, the newly retrieved page can be placed in any of those free frames.
- If the memory allocated to the process is full, a victim page must be selected.
- Replacement algorithms used to select a victim page include FIFO, Random, and LRU (least recently used).
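Victim selection can be sketched by counting page faults for a reference string under FIFO and LRU. This is a simplified model that tracks only which pages are resident; the function names and the reference string are assumptions for illustration, not part of the slides.

```python
from collections import OrderedDict, deque


def count_faults_fifo(references, frame_count):
    """Count page faults when the oldest loaded page is always the victim."""
    load_order = deque()  # resident pages, oldest load first
    resident = set()
    faults = 0
    for page in references:
        if page in resident:
            continue  # hit: FIFO order is not affected by use
        faults += 1
        if len(load_order) == frame_count:
            resident.remove(load_order.popleft())  # evict the oldest page
        load_order.append(page)
        resident.add(page)
    return faults


def count_faults_lru(references, frame_count):
    """Count page faults when the least recently used page is the victim."""
    recency = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in recency:
            recency.move_to_end(page)  # hit: mark as most recently used
            continue
        faults += 1
        if len(recency) == frame_count:
            recency.popitem(last=False)  # evict the least recently used page
        recency[page] = True
    return faults


# Illustrative reference string of virtual page numbers.
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for frames in (3, 4):
    print(f"{frames} frames: FIFO {count_faults_fifo(refs, frames)} faults, "
          f"LRU {count_faults_lru(refs, frames)} faults")
```

On this particular string FIFO incurs more faults with 4 frames than with 3, while LRU improves, which is one reason the choice of replacement algorithm matters.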
24
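How Paging Works cont'd
- Valid bit = 1: the page is in memory
  - Replace the virtual page number with the actual frame number
  - Access the data at the offset in the physical page frame

- The physical address is formed by appending the offset to the frame number for the given virtual page (equivalently, frame number x page size + offset).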
25
Example
- Process has a virtual address space of 2^8 words
- Physical memory of 4 page frames, each 32 words in length
- Virtual address is 8 bits
- Physical address is 7 bits

- In this example the system has no cache.
- The physical address is 7 bits because 4 frames of 32 words each is 128 words = 2^7.
26
Example cont'd
- 2 fields of the virtual address: page (3 bits) and offset (5 bits)
- System generates virtual address 13 (00001101 in binary)
  - Page 000
  - Offset 01101
- Physical address = 1001101 (frame 10, offset 01101)

- The offset must be 5 bits because 2^5 = 32: 5 bits are needed to address 32 words.
- Use the page field as an index into the page table. The 0th entry in the page table says virtual page 0 maps to physical page frame 2 (10 in binary).
- The translated physical address becomes page frame 2, offset 13: combine the page frame (10) and the offset (01101) for the physical address.
- The physical address only has 7 bits (2 for the frame, since there are 4 frames, and 5 for the offset).
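The translation in this example can be sketched as follows. Only the entry mapping virtual page 0 to frame 2 comes from the example; every other page table entry, and the names translate and PageFault, are hypothetical and exist only to make the sketch runnable.

```python
OFFSET_BITS = 5               # 32 words per page
PAGE_SIZE = 1 << OFFSET_BITS

# Page table indexed by virtual page number: (valid bit, frame number).
# Entry 0 is from the example; the remaining entries are made up.
page_table = [
    (1, 2), (0, None), (1, 0), (0, None),
    (1, 1), (0, None), (0, None), (1, 3),
]


class PageFault(Exception):
    """Raised when the valid bit is 0; the OS would load the page from disk."""


def translate(virtual_address):
    """Translate an 8-bit virtual address into a 7-bit physical address."""
    page = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    valid, frame = page_table[page]
    if not valid:
        raise PageFault(f"virtual page {page} is not in main memory")
    # Appending the offset to the frame number is frame * PAGE_SIZE + offset.
    return (frame << OFFSET_BITS) | offset


physical = translate(13)
print(f"virtual {13:08b} -> physical {physical:07b} ({physical})")

try:
    translate(40)  # falls in virtual page 1, marked invalid in this table
except PageFault as fault:
    print("page fault:", fault)
```

Virtual address 13 lands in frame 2 at offset 13, giving 1001101 in binary, as in the example.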
27
Access Time
- There is a time penalty associated with virtual memory.
- Each memory access the processor generates requires two physical memory accesses:
  1. the page table
  2. the actual data
28
Disadvantages
- Extra resource consumption
- Memory overhead for storing page tables
- Special hardware and OS support required
29
Advantages
- Programs are no longer restricted by the amount of physical memory available
- Easier to write programs: no need to worry about physical address space limitations
- Allows multitasking

- Virtual memory allows us to run programs whose virtual address space is larger than physical memory.
30
Segmentation
- Virtual address space is divided into logical, variable-length units (segments)
- To copy a segment into memory, the OS looks for a chunk of free memory large enough
- Segment
  - Base address: where the segment is located in memory
  - Bounds limit: indicates its size
- Segment table: base/bounds pairs

- Physical memory isn't really divided into anything.
- Memory accesses are translated into a segment number and an offset within the segment.
- A check is performed to make sure the offset is within the segment.
- If the offset is within bounds, the base value for the segment (from the segment table) is added to the offset to yield the physical address.
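The base/bounds check described above can be sketched as follows. The segment table contents and the names translate_segment and SegmentBoundsError are hypothetical values chosen for illustration.

```python
# Hypothetical segment table: segment number -> (base address, bounds limit).
segment_table = {
    0: (1000, 400),
    1: (5000, 120),
    2: (200, 50),
}


class SegmentBoundsError(Exception):
    """Raised when an offset falls outside its segment."""


def translate_segment(segment, offset):
    """Return the physical address for (segment, offset) after a bounds check."""
    base, bounds = segment_table[segment]
    if not 0 <= offset < bounds:
        raise SegmentBoundsError(
            f"offset {offset} is outside segment {segment} (size {bounds})"
        )
    return base + offset  # base from the segment table plus the offset


print(translate_segment(1, 100))

try:
    translate_segment(2, 50)  # one past the end of a 50-word segment
except SegmentBoundsError as error:
    print("rejected:", error)
```

Unlike paging, the offset is added to the base rather than appended, because segments can start at any address and have any length.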
31
Segmentation
External fragmentation
- As segments are copied into and out of memory, free chunks of memory are broken up
- Eventually there are many small chunks, none big enough for any segment
Garbage collection combats external fragmentation

- Enough total memory may exist, but it exists as a large number of small, unusable holes.
- Garbage collection shuffles occupied chunks of memory to collect the smaller, fragmented chunks into fewer, larger, usable chunks. This is similar to defragmenting a disk drive.
32
Paging and Segmentation
Systems can use a combination
- Virtual address space is divided into segments of variable length
- Segments are divided into fixed-size pages
- Main memory is divided into frames of the same size
- Each segment has a page table
- Virtual address is divided into 3 fields: segment, page number, offset

- Paging is easier to manage: allocation, freeing, swapping, and relocating are easy when everything is the same size.
- Segmentation has less overhead: segments are usually larger than pages.
- Segmentation eliminates internal fragmentation.
- Paging eliminates external fragmentation.
- Segmentation has the ability to support sharing and protection (difficult with paging).
- The segment field points the system to the correct page table, the page number is used as an offset into the page table, and the offset is the offset within the page.
- The combination is advantageous because it allows for segmentation from the user's point of view and paging from the system's point of view.
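The three-field lookup (segment, page number, offset) can be sketched as below. The field widths, the per-segment page tables, and the function name translate_combined are all assumptions chosen to keep the example small; bounds and valid-bit checks are omitted.

```python
COMBINED_PAGE_BITS = 2    # assumed: 4 pages per segment
COMBINED_OFFSET_BITS = 4  # assumed: 16 words per page

# Hypothetical page tables, one per segment: page number -> frame number.
segment_page_tables = {
    0: [3, 7, 1, 0],
    1: [2, 5, 4, 6],
}


def translate_combined(virtual_address):
    """Split an address into segment, page, and offset, then look up the frame."""
    offset = virtual_address & ((1 << COMBINED_OFFSET_BITS) - 1)
    page = (virtual_address >> COMBINED_OFFSET_BITS) & ((1 << COMBINED_PAGE_BITS) - 1)
    segment = virtual_address >> (COMBINED_OFFSET_BITS + COMBINED_PAGE_BITS)
    page_table = segment_page_tables[segment]  # segment field selects the page table
    frame = page_table[page]                   # page number indexes into that table
    return (frame << COMBINED_OFFSET_BITS) | offset


# Segment 1, page 2, offset 5.
address = (1 << 6) | (2 << 4) | 5
print(f"virtual {address:07b} -> physical {translate_combined(address):07b}")
```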