Modern Computer Architecture. Instructor: Prof. Zhang Gang, School of Computer Science, Tianjin University. Contact: gzhang@tju.edu.cn; homework submission: tju_arch@163.com. 2013
The Main Contents (Course Outline)
Chapter 1. Fundamentals of Quantitative Design and Analysis
Chapter 2. Memory Hierarchy Design
Chapter 3. Instruction-Level Parallelism and Its Exploitation
Chapter 4. Data-Level Parallelism in Vector, SIMD, and GPU Architectures
Chapter 5. Thread-Level Parallelism
Chapter 6. Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism
Appendix A. Pipelining: Basic and Intermediate Concepts
Main Memory
Some definitions:
– Bandwidth (BW): bytes read or written per unit time.
– Latency: described by two measures. Access time: the delay between access initiation and completion (for a read: from when the address is presented until the result is ready). Cycle time: the minimum interval between separate requests to memory.
– Address lines: a separate bus from CPU to memory that carries addresses (not usually counted in bandwidth figures).
– RAS (Row Access Strobe): accompanies the first half of the address, sent first.
– CAS (Column Access Strobe): accompanies the second half of the address, sent second.
RAS vs. CAS (saves address pins)
DRAM bit-cell array:
1. RAS selects a row
2. All data in that row is read out in parallel
3. CAS selects a column to read
4. The selected bit is written to the memory bus
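The address multiplexing above can be sketched in a few lines. This is a hypothetical helper, not a real DRAM-controller API, and it assumes the 14-row/14-column split of a 256 Mbit part:

```python
# Sketch of RAS/CAS address multiplexing: a 28-bit address is sent over
# 14 shared pins in two halves, the row half with RAS, then the column
# half with CAS. Assumed geometry: 14 row bits, 14 column bits.

ROW_BITS = 14
COL_BITS = 14

def split_address(addr: int) -> tuple[int, int]:
    """Split a full address into the (row, column) halves that share
    the same address pins."""
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)  # high 14 bits, sent with RAS
    col = addr & ((1 << COL_BITS) - 1)                # low 14 bits, sent with CAS
    return row, col

row, col = split_address(0xABCDEF)
print(row, col)  # 687 3567
```

Halving the pin count this way is the reason the row and column strobes exist at all: the package only exposes enough pins for one half of the address at a time.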
Types of Memory
SRAM (Static Random Access Memory)
– Cell voltages are statically (unchangingly) tied to power-supply references. No drift, no refresh.
– But each bit needs 4-6 transistors.
DRAM (Dynamic Random Access Memory)
– Each stored bit needs only 1 transistor.
– Cell charges leak away and may dynamically (over time) drift from their initial levels.
– Requires periodic refreshing to correct the drift, e.g. every 8 ms.
– Time spent refreshing is kept to <5% of bandwidth.
DRAM vs. SRAM: 4-8x denser, 8-16x slower, 8-16x cheaper per bit.
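The "<5% of bandwidth" refresh budget is easy to check with back-of-envelope arithmetic. The row count and per-row refresh time below are illustrative assumptions, not figures from the slide:

```python
# Back-of-envelope DRAM refresh overhead, with assumed numbers:
# 8192 rows, ~40 ns to refresh one row, full refresh every 8 ms.

rows = 8192
refresh_time_per_row_ns = 40
refresh_interval_ms = 8

busy_ns = rows * refresh_time_per_row_ns        # time the array is busy refreshing
interval_ns = refresh_interval_ms * 1_000_000   # 8 ms expressed in ns
overhead = busy_ns / interval_ns

print(f"refresh overhead: {overhead:.1%}")  # prints "refresh overhead: 4.1%"
```

With these numbers the refresh cost lands just under the 5% budget; a controller designer would tune row count and refresh scheduling to stay within it.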
Typical DRAM Organization (256 Mbit): the high 14 address bits select the row, the low 14 bits select the column.
Amdahl/Case Rule
Memory size (and I/O bandwidth) should grow linearly with CPU speed
– Typical: 1 MB of main memory and 1 Mbit/s of I/O bandwidth per MIPS of CPU performance.
It then takes a fairly constant ~8 seconds to scan all of memory (if memory bandwidth = I/O bandwidth, 4 bytes per load, 1 load per 4 instructions, and latency is not a problem).
Moore's Law:
– DRAM size doubles every 18 months (up 60%/yr)
– Tracks processor speed improvements
Unfortunately, DRAM latency has decreased by only 7%/yr. Latency is a big deal.
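The ~8-second figure falls straight out of the rule's ratios. A quick check, using an arbitrary assumed CPU speed to show that the speed cancels:

```python
# Worked version of the "~8 seconds to scan memory" figure from the
# Amdahl/Case rule: 1 MB of memory and 1 Mbit/s of I/O bandwidth per
# MIPS, so the scan time is independent of the absolute CPU speed.

mips = 1_000                      # assumed CPU speed; any value gives the same answer
memory_bytes = mips * 1_000_000   # 1 MB per MIPS
io_bits_per_s = mips * 1_000_000  # 1 Mbit/s per MIPS

scan_seconds = (memory_bytes * 8) / io_bits_per_s  # 8 bits per byte
print(scan_seconds)  # 8.0
```

Because both memory size and I/O bandwidth scale with MIPS, the ratio (and thus the scan time) is constant: 8 Mbit of contents per Mbit/s of bandwidth, for every MIPS of speed.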
Memory Optimizations: Memory Technology
Memory Optimizations
SDRAM – Synchronous DRAM
– Adds a clock signal to the DRAM interface
– Repeated (burst) transfers do not bear the per-access overhead
DDR – Double Data Rate
– Transfers data on both the rising and the falling clock edge
– Doubles the peak data rate
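The "doubled peak rate" is just one extra factor in the usual peak-bandwidth product. A sketch with illustrative numbers for a DDR3-1600 DIMM (800 MHz bus clock, 64-bit data bus; these parameters are assumptions, not from the slide):

```python
# Peak transfer rate of a DDR interface: one transfer per clock edge,
# so peak = clock * 2 * bus width. Assumed example: DDR3-1600 DIMM.

clock_hz = 800_000_000   # bus clock
bus_bytes = 8            # 64-bit data bus
transfers_per_cycle = 2  # rising and falling edge (the "double" in DDR)

peak_bytes_per_s = clock_hz * transfers_per_cycle * bus_bytes
print(peak_bytes_per_s / 1e9)  # 12.8 (GB/s)
```

The same formula with `transfers_per_cycle = 1` gives the single-data-rate SDRAM peak, which is exactly half.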
Memory Optimizations
DDR generations:
– DDR2: lower power (2.5 V -> 1.8 V); higher clock rates (266 MHz, 333 MHz, 400 MHz)
– DDR3: 1.5 V; 800 MHz
– DDR4: 1-1.2 V; 1600 MHz
Memory Optimizations
Graphics Data RAMs – GDRAM/GSDRAM – Graphics or Graphics Synchronous DRAM
GDDR5 is graphics memory based on DDR3
– Achieves 2-5x the bandwidth per DRAM chip vs. DDR3:
wider interfaces (32 vs. 16 bits) and higher clock rates
– Possible because the chips are attached by soldering rather than in socketed DIMM modules
Reducing power in SDRAMs:
– Lower voltage
– Low-power mode (ignores the clock, but continues to refresh)
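The 2-5x range follows from the two factors named above. A rough check, where the interface widths come from the slide but the clock values are assumed for illustration:

```python
# Rough per-chip bandwidth ratio, GDDR5 vs. DDR3, from the two factors
# the slide names: a 2x wider interface and a higher clock rate.
# Clock values below are illustrative assumptions.

ddr3_width_bits, gddr5_width_bits = 16, 32
ddr3_clock_mhz, gddr5_clock_mhz = 800, 1500   # assumed clocks

ratio = (gddr5_width_bits * gddr5_clock_mhz) / (ddr3_width_bits * ddr3_clock_mhz)
print(ratio)  # 3.75, within the quoted 2-5x range
```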
Memory Power Consumption
Flash Memory
A type of EEPROM
Must be erased (in blocks) before being overwritten
Non-volatile
Supports only a limited number of write cycles
Cheaper than SDRAM, more expensive than disk
Slower than SDRAM, faster than disk
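The erase-before-overwrite constraint can be captured in a toy model. This is a minimal sketch of the programming rule (program operations can only clear bits; a whole-block erase sets them back), not a real flash interface:

```python
# Toy model of NAND/NOR flash write semantics: programming can only
# change bits 1 -> 0; restoring 1s requires erasing the whole block.
# Block size and byte-granularity programming are simplifying assumptions.

BLOCK_SIZE = 4

class FlashBlock:
    def __init__(self):
        self.cells = [0xFF] * BLOCK_SIZE   # erased state: all bits set to 1
        self.erase_count = 0               # wear: write/erase cycles are limited

    def program(self, i: int, value: int) -> None:
        # Legal only if the new value does not need any 0 -> 1 transition.
        if value & ~self.cells[i]:
            raise ValueError("must erase block before setting bits back to 1")
        self.cells[i] = value

    def erase(self) -> None:
        self.cells = [0xFF] * BLOCK_SIZE   # whole-block operation
        self.erase_count += 1

blk = FlashBlock()
blk.program(0, 0x0F)   # fine: only clears bits
blk.erase()            # required before re-setting any cleared bit
blk.program(0, 0xF0)   # fine again after the erase
```

The `erase_count` field hints at why wear matters: every overwrite of already-programmed data costs a block erase, and each block tolerates only a limited number of them.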
Memory Dependability
Memory is susceptible to cosmic rays
Soft errors: dynamic (transient) errors
– Detected and fixed by error-correcting codes (ECC)
Hard errors: permanent errors
– Use spare rows to replace defective rows
Chipkill: a RAID-like error-recovery technique
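The single-error correction that ECC DRAM performs can be illustrated with the classic Hamming(7,4) code. Real DIMMs use a wider SECDED code over 64-bit words, so this is a minimal sketch of the mechanism, not the actual DRAM code:

```python
# Hamming(7,4): 4 data bits + 3 parity bits. Any single flipped bit in
# the 7-bit codeword is located by the syndrome and corrected, which is
# the mechanism behind fixing DRAM soft errors with ECC.

def encode(d):                       # d: list of 4 data bits
    c = [0] * 8                      # positions 1..7 used; parity at 1, 2, 4
    c[3], c[5], c[6], c[7] = d
    c[1] = c[3] ^ c[5] ^ c[7]        # covers positions whose index has bit 0 set
    c[2] = c[3] ^ c[6] ^ c[7]        # covers positions whose index has bit 1 set
    c[4] = c[5] ^ c[6] ^ c[7]        # covers positions whose index has bit 2 set
    return c[1:]                     # 7-bit codeword

def correct(word):                   # word: 7-bit codeword with at most 1 flip
    c = [0] + list(word)
    syndrome = 0
    for p in (1, 2, 4):
        s = 0
        for i in range(1, 8):
            if i & p:
                s ^= c[i]
        if s:
            syndrome += p
    if syndrome:                     # syndrome == position of the flipped bit
        c[syndrome] ^= 1
    return [c[3], c[5], c[6], c[7]]

data = [1, 0, 1, 1]
cw = encode(data)
cw[4] ^= 1                           # simulate a cosmic-ray bit flip
assert correct(cw) == data           # the flip is located and undone
```

The clever part is that the three parity checks are arranged so their failures spell out the binary index of the bad position, so correction is a single XOR.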
Review: Virtual Memory
Physical memory? Virtual memory? Physical address? Logical address? Paged virtual memory? Segmented virtual memory? How is a logical address translated to a physical address? What is the write strategy in virtual memory?
Review: Virtual Memory
The mapping of a virtual address to a physical address is done via a page table.
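That mapping can be sketched in a few lines. The page size and the page-table contents below are made-up examples for illustration:

```python
# Minimal sketch of virtual-to-physical translation through a page
# table. Assumed: 4 KB pages, so the low 12 bits are the page offset
# and pass through unchanged; only the page number is translated.

PAGE_SIZE = 4096

# virtual page number -> physical frame number (None = not mapped)
page_table = {0: 7, 1: 3, 2: None}

def translate(vaddr: int) -> int:
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # split into page number + offset
    frame = page_table.get(vpn)
    if frame is None:
        raise MemoryError(f"page fault at {hex(vaddr)}")
    return frame * PAGE_SIZE + offset        # reattach the offset

print(hex(translate(0x1ABC)))  # 0x3abc: VPN 1 maps to frame 3, offset kept
```

An unmapped page raising a fault here stands in for the hardware trap that lets the OS bring the page in from disk, which is also where the write strategy (write-back, since disk is so slow) comes in.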
Protection via Virtual Memory
Protection via virtual memory
– Keeps processes in their own memory space
Role of the architecture:
– Provide user mode and supervisor mode
– Protect certain aspects of CPU state
– Provide mechanisms for switching between user mode and supervisor mode
– Provide mechanisms to limit memory accesses
– Provide a TLB to translate addresses
Virtual Memory and Virtual Machines
Protection via Virtual Machines
Supports isolation and security
Allows sharing a computer among many unrelated users
Enabled by the raw speed of processors, which makes the overhead more acceptable
Allows different ISAs and operating systems to be presented to user programs
– "System Virtual Machines"
– The SVM software is called a "virtual machine monitor (VMM)" or "hypervisor"
– Individual virtual machines running under the monitor are called "guest VMs"
Impact of VMs on Virtual Memory
Each guest OS maintains its own set of page tables
– The VMM adds a level of memory between physical and virtual memory, called "real memory"
– The VMM maintains a shadow page table that maps guest virtual addresses directly to physical addresses
This requires the VMM to detect the guest's changes to its own page table, which occurs naturally if accessing the page-table pointer is a privileged operation.
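The shadow page table is just the composition of the two mappings involved. A minimal sketch, where all the mappings are made-up example values:

```python
# Why a VMM keeps a shadow page table: the guest maps virtual pages to
# what it believes is physical memory ("real memory"), and the VMM maps
# those real pages to actual machine frames. The shadow table is the
# composition, so hardware can translate in a single step.

guest_pt = {0: 5, 1: 2}   # guest virtual page -> guest "real" page
vmm_pt = {5: 9, 2: 4}     # guest "real" page -> actual machine frame

# The VMM rebuilds shadow entries whenever the guest edits guest_pt,
# which it can detect because loading the page-table pointer is a
# privileged operation that traps to the VMM.
shadow_pt = {vpn: vmm_pt[rpn] for vpn, rpn in guest_pt.items()}

print(shadow_pt)  # {0: 9, 1: 4}
```

Without the shadow table, every guest memory access would need two translations; with it, the hardware TLB walks one table as usual and the double mapping is invisible.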
Reading Assignment (5th edition)
2.6 Putting It All Together: Memory Hierarchies in the ARM Cortex-A8 and Intel Core i7, pp. 113-135
http://www.doc88.com/p-112663203506.html