Published by Lambert Hawkins. Modified over 8 years ago.
1
CSE 8383 - Advanced Computer Architecture Week-1 Week of Jan 12, 2004 engr.smu.edu/~rewini/8383
2
Contents: Course Outline; Review of Main Concepts in Computer Architecture; Instruction Set Architecture; Flynn’s Taxonomy; Layers of Computer System Development; Performance
3
Course Contents 1. Review of Main Concepts 2. Memory System Design 3. Pipeline Design Techniques 4. Multiprocessors 5. Shared Memory Systems 6. Message Passing Systems 7. Network Computing
4
Course Resources: Lecture slides on the web; student presentations on the web. Books: Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability, McGraw-Hill; Abd-El-Barr and El-Rewini, Computer Design and Architecture, to be published by John Wiley and Sons in 2004 (selected chapters will be made available on the web).
5
Student Work: Class Participation; Assignments; Presentations; Project; Midterm; Final
6
Review
7
Memory Locations and Operations: Memory Addressing; Memory Data Register (MDR); Memory Address Register (MAR); three steps for read and write
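The slide does not spell out the three steps. The sketch below assumes one common breakdown (address into the MAR, data through the MDR, and the memory operation itself). The `Memory` class and its method names are hypothetical, chosen only for illustration.

```python
class Memory:
    """Toy word-addressable memory with a MAR and an MDR (illustrative only)."""

    def __init__(self, size):
        self.cells = [0] * size
        self.mar = 0  # Memory Address Register
        self.mdr = 0  # Memory Data Register

    def read(self, address):
        self.mar = address               # step 1: place the address in the MAR
        self.mdr = self.cells[self.mar]  # step 2: issue the read; memory fills the MDR
        return self.mdr                  # step 3: the word is taken from the MDR

    def write(self, address, value):
        self.mar = address               # step 1: place the address in the MAR
        self.mdr = value                 # step 2: place the data in the MDR
        self.cells[self.mar] = self.mdr  # step 3: issue the write; memory stores the MDR


mem = Memory(16)
mem.write(5, 42)
print(mem.read(5))
```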
8
Addressing: Instruction Format (op-code, address fields). Number of address fields: Three (memory locations or registers); Two (memory locations or registers); One-and-a-half (memory location and register); One (accumulator); Zero (stack operations)
9
Addressing Modes: Immediate (the operand is in the instruction); Direct (the address of the operand is in the instruction); Indirect (the instruction holds the address of the operand's address); Indexed (a constant is added to the index register); other modes
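A minimal sketch of how the four listed modes locate an operand, assuming a toy memory modeled as a Python list. The function `fetch_operand` and its parameter names are hypothetical and not part of the course material.

```python
def fetch_operand(mode, field, memory, index_register=0):
    """Return the operand selected by an addressing mode (illustrative only)."""
    if mode == "immediate":
        return field                           # the operand is the field itself
    if mode == "direct":
        return memory[field]                   # the field is the operand's address
    if mode == "indirect":
        return memory[memory[field]]           # the field points to the address
    if mode == "indexed":
        return memory[field + index_register]  # constant plus index register
    raise ValueError(f"unknown addressing mode: {mode}")


memory = [0] * 16
memory[3] = 7
memory[5] = 11
memory[7] = 99

for mode in ("immediate", "direct", "indirect"):
    print(mode, fetch_operand(mode, 3, memory))
print("indexed", fetch_operand("indexed", 3, memory, index_register=2))
```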
10
Instruction Types: Data Movement; Arithmetic and Logical; Sequencing; Input/Output
11
Flynn’s Classification: SISD (single instruction stream over a single data stream); SIMD (single instruction stream over multiple data streams); MIMD (multiple instruction streams over multiple data streams); MISD (multiple instruction streams and a single data stream)
12
SISD (single instruction stream over a single data stream) [Figure: SISD uniprocessor architecture, with blocks CU, PU, MU, and I/O connected by IS and DS.] Captions: CU = control unit; PU = processing unit; MU = memory unit; IS = instruction stream; DS = data stream; PE = processing element; LM = local memory
13
SIMD (single instruction stream over multiple data streams) [Figure: SIMD architecture, with one CU issuing the IS to PE 1 through PE n, each paired with a local memory LM 1 through LM n and a DS; program loaded from host; data sets loaded from host.]
14
MIMD (multiple instruction streams over multiple data streams) [Figure: MIMD architecture (with shared memory), with CU 1 and PU 1 through PU n, each carrying its own IS and DS, connected to Shared Memory and I/O.]
15
MISD (multiple instruction streams and a single data stream) [Figure: MISD architecture (the systolic array), with Memory (program and data), CU 1 through CU n each issuing an IS to PU 1 through PU n, a single DS, and I/O.]
16
Layers for computer system development: Applications; Programming Environment; Languages Supported; Communication Model; Addressing Space; Hardware Architecture. The layers run from machine independent (Applications) to machine dependent (Hardware Architecture).
17
System Attributes to Performance: Clock Rate and CPI (clock cycles per instruction). Performance factors: $T = I_c \times \mathrm{CPI} \times \tau$, where $T$ is the CPU time, $I_c$ the instruction count, and $\tau$ the clock cycle time. System attributes: instruction-set architecture; compiler technology; CPU implementation and control; cache and memory hierarchy
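A minimal sketch of the performance equation, assuming the usual form in which CPU time is the product of the instruction count, the CPI, and the clock cycle time (called `tau` below, the reciprocal of the clock rate). The function name and the sample figures are illustrative assumptions, not values from the course.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds: instruction count x CPI x cycle time."""
    tau = 1.0 / clock_rate_hz  # clock cycle time in seconds
    return instruction_count * cpi * tau


# Hypothetical program: 200,000 instructions, CPI of 2.5, 40 MHz clock.
t = cpu_time(200_000, 2.5, 40e6)
print(f"T = {t * 1e3:.3f} ms")
```

Halving the CPI or doubling the clock rate each halves the result, which is why the listed system attributes are grouped by the factor they influence.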
18
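MIPS & Throughput: $f = 1/\tau$ (clock rate, the reciprocal of the cycle time $\tau$); $C$ = total number of cycles. MIPS rate: $\mathrm{MIPS} = I_c/(T \times 10^6) = f/(\mathrm{CPI} \times 10^6) = (f \times I_c)/(C \times 10^6)$. Throughput rate: $W_p = f/(I_c \times \mathrm{CPI})$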
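The three expressions for the MIPS rate can be checked against each other numerically. This is a sketch with made-up figures (200,000 instructions, CPI of 2.5, 40 MHz clock); the function names are hypothetical.

```python
def mips_rate(clock_rate_hz, cpi):
    """MIPS = f / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


def throughput(clock_rate_hz, instruction_count, cpi):
    """W_p = f / (I_c x CPI), in programs per second."""
    return clock_rate_hz / (instruction_count * cpi)


f, i_c, cpi = 40e6, 200_000, 2.5
cycles = i_c * cpi   # C, total number of cycles
t = cycles / f       # T, CPU time in seconds

print(mips_rate(f, cpi))         # f / (CPI x 10^6)
print(i_c / (t * 1e6))           # I_c / (T x 10^6)
print((f * i_c) / (cycles * 1e6))  # (f x I_c) / (C x 10^6)
print(throughput(f, i_c, cpi))   # programs per second
```

With these figures all three MIPS expressions evaluate to 16, and the throughput is 80 programs per second.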
19
Memory System Design
20
Contents (Memory): Memory Hierarchy; Cache Memory; Placement Policies (Direct Mapping, Fully Associative, Set Associative); Replacement Policies
21
Memory Hierarchy: CPU Registers; Cache; Main Memory; Secondary Storage. Attributes that vary across the levels: latency, bandwidth, speed, cost per bit
22
Sequence of events: 1. The processor makes a request for X. 2. X is sought in the cache. 3. If X is there, the request is a hit (hit ratio h); otherwise it is a miss (miss ratio m = 1 - h). 4. On a miss, X is sought in main memory. The scheme can be generalized to more levels.
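The sequence of events can be sketched as a loop that counts hits and misses. The cache here is a plain Python set with no capacity limit or eviction, which is a simplifying assumption; the function name is hypothetical.

```python
def run_requests(requests, cache_contents):
    """Return (hit ratio h, miss ratio m) for a stream of requests."""
    hits = 0
    for x in requests:
        if x in cache_contents:    # X is sought in the cache
            hits += 1              # hit
        else:
            cache_contents.add(x)  # miss: X is fetched from main memory
    h = hits / len(requests)
    return h, 1 - h


h, m = run_requests(["A", "B", "A", "C", "A", "B"], set())
print(h, m)
```

The first touch of A, B, and C misses and the three repeats hit, so h and m are both 0.5 for this stream.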
23
Cache Memory: The idea is to keep the information expected to be used more frequently in the cache. Locality of Reference: Temporal Locality; Spatial Locality. Placement Policies; Replacement Policies
24
Placement Policies: how to map memory blocks (lines) to cache block frames (line frames). [Figure: blocks (lines) in memory mapped to block frames (line frames) in the cache.]
25
Placement Policies: Direct Mapping; Fully Associative; Set Associative
26
Direct Mapping: Simplest. A memory block is mapped to a fixed cache block frame (many-to-one mapping): $J = I \bmod N$, where $J$ is the cache block frame number, $I$ is the memory block number, and $N$ is the number of cache block frames.
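The mapping rule is a single modulo operation. A short sketch, assuming a cache of 128 block frames (the size used in the example that follows in the deck); the function name is hypothetical.

```python
def direct_mapped_frame(memory_block, num_frames):
    """Cache block frame J for memory block I: J = I mod N."""
    return memory_block % num_frames


N = 128
for block in (0, 1, 127, 128, 129, 255, 3968, 4095):
    print(block, "->", direct_mapped_frame(block, N))
```

Blocks 0, 128, and 3968 all compete for frame 0, which is the many-to-one property named on the slide.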
27
Address Format: Memory has $M$ blocks; block size is $B$ words; cache has $N$ blocks; address size is $\log_2(M \times B)$ bits. Fields: Tag (the remaining bits, $\log_2(M/N)$) | Block frame ($\log_2 N$ bits) | Word ($\log_2 B$ bits)
28
Example: Memory has 4K blocks; block size is 16 words; address size is $\log_2(4\mathrm{K} \times 16) = 16$ bits; cache has 128 blocks. Fields: Tag (5 bits) | Block frame (7 bits) | Word (4 bits)
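The field widths for this example can be derived and applied with a few shifts and masks. This sketch assumes all sizes are powers of two and that the fields are laid out tag, block frame, word from most to least significant bit; the function names are hypothetical.

```python
def field_widths(memory_blocks, block_words, cache_frames):
    """Return (tag_bits, frame_bits, word_bits) for a direct-mapped cache."""
    word_bits = (block_words - 1).bit_length()     # log2 B
    frame_bits = (cache_frames - 1).bit_length()   # log2 N
    address_bits = (memory_blocks * block_words - 1).bit_length()  # log2(M x B)
    tag_bits = address_bits - frame_bits - word_bits  # remaining bits
    return tag_bits, frame_bits, word_bits


def split_address(address, frame_bits, word_bits):
    """Split a word address into (tag, block frame, word)."""
    word = address & ((1 << word_bits) - 1)
    frame = (address >> word_bits) & ((1 << frame_bits) - 1)
    tag = address >> (word_bits + frame_bits)
    return tag, frame, word


tag_bits, frame_bits, word_bits = field_widths(4096, 16, 128)
print(tag_bits, frame_bits, word_bits)

# Word 3 of memory block 129 sits at word address 129 * 16 + 3.
print(split_address(129 * 16 + 3, frame_bits, word_bits))
```

Block 129 lands in frame 1 with tag 1, which agrees with 129 mod 128 = 1.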
29
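Example (cont.) [Figure: direct mapping of memory blocks 0 to 4095 onto cache block frames 0 to 127. Memory blocks 0, 128, ..., 3968 share frame 0; blocks 1, 129, ... share frame 1; blocks 127, 255, ..., 4095 share frame 127. Each frame stores a 5-bit tag with values 0 to 31.]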
30
Fully Associative: Most flexible. A memory block is mapped to any available cache block frame (many-to-many mapping). Lookup requires an associative search.
31
Address Format: Memory has $M$ blocks; block size is $B$ words; cache has $N$ blocks; address size is $\log_2(M \times B)$ bits. Fields: Tag (the remaining bits, $\log_2 M$) | Word ($\log_2 B$ bits)
32
Example: Memory has 4K blocks; block size is 16 words; address size is $\log_2(4\mathrm{K} \times 16) = 16$ bits; cache has 128 blocks. Fields: Tag (12 bits) | Word (4 bits)
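With no block frame field, the tag is the whole memory block number and every frame has to be checked for it. The sketch below models the associative search as a sequential scan, which real hardware performs in parallel; the function names and the frame chosen for the demonstration are hypothetical.

```python
def split_fully_associative(address, word_bits=4):
    """Split a 16-bit word address into (tag, word); the tag is the block number."""
    return address >> word_bits, address & ((1 << word_bits) - 1)


def associative_search(frame_tags, tag):
    """Return the frame holding `tag`, or None on a miss."""
    for frame, stored_tag in enumerate(frame_tags):
        if stored_tag == tag:
            return frame
    return None


frame_tags = [None] * 128  # 128 cache block frames, all empty
frame_tags[40] = 129       # memory block 129 placed in an arbitrary frame

tag, word = split_fully_associative(129 * 16 + 3)
print(tag, word)
print(associative_search(frame_tags, tag))
print(associative_search(frame_tags, 130))
```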
33
Example (cont.) [Figure: fully associative mapping. Any of memory blocks 0 to 4095 may occupy any of cache block frames 0 to 127. Each frame stores a 12-bit tag.]
34
Set Associative: A compromise between the other two. The cache is divided into a number of sets, and each set holds a number of blocks. A memory block is mapped to any available cache block frame within a specific set. The associative search is confined to that set.
35
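Address Format: Memory has $M$ blocks; block size is $B$ words; cache has $N$ blocks; number of sets is $S = N / (\text{number of blocks per set})$; address size is $\log_2(M \times B)$ bits. Fields: Tag (the remaining bits, $\log_2(M/S)$) | Set ($\log_2 S$ bits) | Word ($\log_2 B$ bits)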
36
Example: Memory has 4K blocks; block size is 16 words; address size is $\log_2(4\mathrm{K} \times 16) = 16$ bits; cache has 128 blocks; number of blocks per set = 4; number of sets = 128/4 = 32. Fields: Tag (7 bits) | Set (5 bits) | Word (4 bits)
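The same shift-and-mask approach applies once the set count is known. A sketch for this example, assuming power-of-two sizes and a tag, set, word layout from most to least significant bit; the function names are hypothetical.

```python
def set_associative_widths(memory_blocks, block_words, cache_frames, blocks_per_set):
    """Return (tag_bits, set_bits, word_bits) for a set-associative cache."""
    num_sets = cache_frames // blocks_per_set   # S = N / blocks per set
    word_bits = (block_words - 1).bit_length()  # log2 B
    set_bits = (num_sets - 1).bit_length()      # log2 S
    address_bits = (memory_blocks * block_words - 1).bit_length()  # log2(M x B)
    return address_bits - set_bits - word_bits, set_bits, word_bits


def split_set_associative(address, set_bits, word_bits):
    """Split a word address into (tag, set, word)."""
    word = address & ((1 << word_bits) - 1)
    set_index = (address >> word_bits) & ((1 << set_bits) - 1)
    tag = address >> (word_bits + set_bits)
    return tag, set_index, word


tag_bits, set_bits, word_bits = set_associative_widths(4096, 16, 128, 4)
print(tag_bits, set_bits, word_bits)

# Word 3 of memory block 129: set 129 mod 32 = 1, tag 129 // 32 = 4.
print(split_set_associative(129 * 16 + 3, set_bits, word_bits))
```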
37
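Example (cont.) [Figure: set-associative mapping. Cache block frames 0 to 127 are grouped four per set, from Set 0 (frames 0 to 3) to Set 31 (frames 124 to 127). Memory blocks 0, 32, ... map to Set 0; blocks 31, 63, ..., 4095 map to Set 31. Each frame stores a 7-bit tag.]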
38
Comparison: Simplicity; Associative Search; Cache Utilization; Replacement
39
Replacement Techniques: FIFO; LRU; MRU; Random; Optimal
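Two of the listed techniques, FIFO and LRU, can be compared with a small simulation. This is a sketch for a fully associative cache with a fixed number of frames; the reference string is invented, and the function name is hypothetical.

```python
from collections import OrderedDict


def count_misses(references, frames, policy):
    """Count misses for a block reference string under FIFO or LRU."""
    if policy not in ("FIFO", "LRU"):
        raise ValueError(f"unsupported policy: {policy}")
    cache = OrderedDict()  # insertion order doubles as the eviction order
    misses = 0
    for block in references:
        if block in cache:
            if policy == "LRU":
                cache.move_to_end(block)  # a hit refreshes recency under LRU only
            continue
        misses += 1
        if len(cache) == frames:
            cache.popitem(last=False)     # evict the oldest (FIFO) or least recent (LRU)
        cache[block] = True
    return misses


refs = [1, 2, 3, 1, 4, 1, 5]
print("FIFO misses:", count_misses(refs, 3, "FIFO"))
print("LRU misses:", count_misses(refs, 3, "LRU"))
```

The only difference between the two policies in this sketch is whether a hit moves the block to the back of the queue. On this reference string that is enough for LRU to keep block 1 resident while FIFO evicts it.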