1
The CPU and Memory
2
Introduction
In a computer:
Memory is separate from the CPU (central processing unit)
Data is in binary (not decimal)
3
System Block Diagram
(Figure: the CPU, containing the ALU, control unit, program counter, and input/output interface, connected to memory, whose addresses run from lowest to highest.)
4
The Little Man Computer
5
Components of a CPU
ALU (arithmetic and logic unit): performs arithmetic and logic operations
  Arithmetic: add, subtract, multiply, divide, etc.
  Logic: AND, OR, NOT, shift, etc.
Control unit: interprets instructions and controls the flow of information within the CPU
  Works with a “program counter” (address of next instruction)
Input/output interface: provides a mechanism for input and output of data
  Many variations possible
6
Registers
A register is a single storage location within the CPU, unlike memory, which is “outside” the CPU
Examples of registers: accumulator (ACC), program counter (PC), instruction register (IR), memory address register (MAR), memory data register (MDR), status register
General purpose registers (R0, R1, …): included on some CPUs; used for high-speed temporary storage
7
Memory Unit (p. 160)
(Figure: an n-bit memory address register, bits 0 to n - 1, feeds an address decoder that selects one of the memory cells at addresses 0 to 2^n - 1; an m-bit memory data register, bits 0 to m - 1, holds the data.)
8
MAR-MDR example
9
A visual analogy for memory
10
Memory Implementations (p. 165)
RAM (random access memory): static RAM, dynamic RAM
ROM (read-only memory)
11
RAM: Random Access Memory
DRAM (dynamic RAM)
  Most common; cheap, with less electrical power, less heat, and smaller space
  Volatile: must be refreshed (recharged with power) thousands of times each second
SRAM (static RAM)
  Faster and more expensive than DRAM
  Volatile
  Small amounts are often used in cache memory for high-speed memory access
12
Nonvolatile Memory
ROM (read-only memory)
  Holds software that is not expected to change over the life of the system, such as firmware used for the system BIOS
Flash memory
  Inexpensive nonvolatile secondary storage
  Useful for nonvolatile portable computer storage, digital cameras, tablets, smartphones
  Slower rewrite time compared to RAM
13
Memory capacity
Determined by:
  Number of bits in MAR
  Number of bits in the instruction address field
Types of memory: physical, logical/virtual
Memory mapping. Ex: 1000 locations; 2 instruction addressing bits; one more digit for mapping
Memory content manipulation: 4 or 8 bytes
14
Memory Capacity: 2^n x m
n address bits = 2^n addresses
m data bits; usually m = 8 (for one byte)
m is the “width” of the data path
Typical values: n: 16, 17, 18, 19, 20, 21, 22, etc.; m: 8, 16, 32, 64
15
Question
Q: How many bits of memory are contained in a memory unit with 512 KB of memory?
A: 512 = 2^9, K = 2^10, B = byte = 8 bits = 2^3, so 2^9 x 2^10 x 2^3 = 2^22 = 4,194,304 bits
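The arithmetic in this question can be checked with a short Python sketch. The function name `memory_bits` is made up for illustration; it simply evaluates the 2^n x m capacity formula, here with n = 19 address bits (512 K addresses) and m = 8 data bits.

```python
def memory_bits(address_bits: int, width_bits: int) -> int:
    """Total bits in a memory with 2**address_bits cells of width_bits each."""
    return (2 ** address_bits) * width_bits

# 512 KB = 2**9 * 2**10 bytes, so 19 address bits at 8 bits per cell.
total = memory_bits(19, 8)
print(total)             # 4194304
print(total == 2 ** 22)  # True
```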
16
Memory Maps
The usage of memory space on a system is commonly depicted in a “memory map”
The height of the map is determined by the number of addresses
The width of the map is usually 8 bits
E.g., a system with a capacity of 2^16 bytes…
17
Memory Map
(Figure: a column of memory locations labeled with hexadecimal addresses, from 0000, 0001, 0002 at the “bottom” of memory up to FFFF at the top; the horizontal axis is the data bit position.)
18
Use of Memory Maps
Memory maps are usually drawn to show “what is where” on a system
The possibilities for “what”: RAM, ROM, I/O, nothing
The possibilities for “where”: determined by the starting/ending addresses for each “block” of RAM, ROM, I/O, nothing
E.g., a memory map for a system with a capacity of 2^24 bytes with two 1 MB RAM modules residing consecutively at the bottom of memory….
19
Memory Map (2^24 bytes = 16 MB “capacity”)
FFFFFF to 200000: 14 MB empty
1FFFFF to 100000: 1 MB RAM
0FFFFF to 000000: 1 MB RAM
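The block boundaries in this map follow from each block's start address and size. A minimal Python sketch, assuming each block is described by a made-up (name, start, size) triple and that the helper `block_range` is purely illustrative:

```python
MB = 2 ** 20  # one megabyte in bytes

def block_range(start: int, size: int) -> tuple[str, str]:
    """Return the first and last address of a block as 6-digit hex strings."""
    end = start + size - 1
    return f"{start:06X}", f"{end:06X}"

capacity = 2 ** 24  # 16 MB address space
blocks = [
    ("1 MB RAM", 0, MB),
    ("1 MB RAM", MB, MB),
    ("empty", 2 * MB, capacity - 2 * MB),
]

memory_map = [(name, *block_range(start, size)) for name, start, size in blocks]
for name, low, high in memory_map:
    print(f"{low} to {high}: {name}")
```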
20
Program Counter (PC)
A dedicated register in the CPU.
Contains the address in memory of the current instruction being executed.
Incremented automatically after each instruction.
May be forced to change, e.g., by a “jump” instruction.
Usually initialized to zero when the machine starts or is reset.
21
Instruction Register (IR)
A dedicated register in the CPU which contains the actual current instruction.
Simple 16-bit example: an op code field (what to do) followed by an address field (location of data).
22
Memory registers
Memory Address Register (MAR): contains the address in memory at which to find or place data.
Memory Data Register (MDR): contains the actual data to be placed in the location given in the MAR, or which has been retrieved from the location given in the MAR.
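The division of labor between the two registers can be sketched in Python. This is only an illustrative model: the class `MemoryUnit` and its method names are assumptions, not part of the slides.

```python
class MemoryUnit:
    """Toy memory accessed only through a MAR and an MDR (illustrative model)."""

    def __init__(self, address_bits: int, width_bits: int):
        self.cells = [0] * (2 ** address_bits)  # 2**n addressable cells
        self.mask = (1 << width_bits) - 1       # keep values to m bits
        self.mar = 0  # memory address register
        self.mdr = 0  # memory data register

    def read(self) -> None:
        # Copy the cell selected by the MAR into the MDR.
        self.mdr = self.cells[self.mar]

    def write(self) -> None:
        # Copy the MDR into the cell selected by the MAR.
        self.cells[self.mar] = self.mdr & self.mask

mem = MemoryUnit(address_bits=5, width_bits=8)
mem.mar, mem.mdr = 15, 10   # place the value ten at location 15
mem.write()
mem.mdr = 0                 # clear the MDR to show the read really happens
mem.read()
print(mem.mdr)              # 10
```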
23
Accumulator A dedicated register (or set of registers) in the CPU used for the actual manipulation of data. Default source (or destination) register. Usually contains results of arithmetic or logical operations.
24
Generic CPU With Registers
Program Counter ( PC ) Instruction Register ( IR ) Memory Memory Address Register ( MAR ) Memory Data Register ( MDR ) Accumulator ( A or Acc )
25
Fetch-Execute Cycle
Two steps, or cycles, in the execution of every instruction:
Fetch: fetch the code for the instruction from memory and place it in the IR (instruction register)
Execute: execute the instruction
(Timeline: Fetch, then Execute.)
26
The LOAD Instruction
Fetch: PC -> MAR; MDR -> IR
Execute: IR[address] -> MAR; MDR -> A; PC + 1 -> PC
27
The Add Instruction
Fetch: PC -> MAR; MDR -> IR
Execute: IR[address] -> MAR; A + MDR -> A; PC + 1 -> PC
28
Fetch-Execute Example: Load Accumulator
Assume:
Simple eight-bit system.
Thirty-two memory locations (0 to 31).
“Load” instruction is 010.
Value in location 15 is ten (i.e., binary 00001010).
PC contains 6 (00110).
The instruction, 01001111, is in location 6.
Then ...
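With thirty-two locations, an address needs 5 bits, which leaves 3 bits of the eight-bit word for the op code. The Python sketch below splits an instruction word on that assumption; the function name `decode` is illustrative, and the sample word 01001111 is the IR value shown in the register listings later in this example.

```python
def decode(instruction: int) -> tuple[int, int]:
    """Split an 8-bit word into a 3-bit op code and a 5-bit address."""
    opcode = instruction >> 5          # top three bits
    address = instruction & 0b11111    # bottom five bits
    return opcode, address

opcode, address = decode(0b01001111)
print(f"{opcode:03b} {address:05b}")   # 010 01111
print(address)                         # 15
```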
29
PC: 00110 IR: (previous) MAR: (previous) MDR: (previous) A: (previous)
Memory (locations 0 to 31): 06: 01001111, 15: 00001010
30
MAR loaded with PC: PC -> MAR
PC: 00110 IR: (previous) MAR: 00110 MDR: (previous) A: (previous)
31
Memory Location 00110 Accessed and Contents Placed in MDR:
PC: 00110 IR: (previous) MAR: 00110 MDR: (previous) A: (previous)
32
Memory Location 00110 Accessed and Contents Placed in MDR:
PC: 00110 IR: (previous) MAR: 00110 MDR: 01001111 A: (previous)
33
MDR copied to IR: MDR -> IR
PC: 00110 IR: 01001111 MAR: 00110 MDR: 01001111 A: (previous)
34
IR [address part] -> MAR
PC: 00110 IR: 01001111 MAR: 01111 MDR: 01001111 A: (previous)
35
Location in MAR (01111) Accessed
PC: 00110 IR: 01001111 MAR: 01111 MDR: 01001111 A: (previous)
36
Contents of 01111 loaded into MDR
PC: 00110 IR: 01001111 MAR: 01111 MDR: 00001010 A: (previous)
37
IR [op code] executed: MDR -> A
PC: 00110 IR: 01001111 MAR: 01111 MDR: 00001010 A: 00001010
38
Increment PC: PC = PC + 1
PC: 00111 IR: 01001111 MAR: 01111 MDR: 00001010 A: 00001010
39
Finished
PC: 00111 IR: 01001111 MAR: 01111 MDR: 00001010 A: 00001010
40
Now assume:
Value in location 7 is 10110010.
“Add” instruction is 101.
Value in location 18 is seventy-one (i.e., binary 01000111).
Everything else is as we left it!
Then ...
41
PC -> MAR
PC: 00111 IR: 01001111 MAR: 00111 MDR: 00001010 A: 00001010
Memory (locations 0 to 31): 06: 01001111, 07: 10110010, 15: 00001010, 18: 01000111
42
MAR Accesses Location 00111
PC: 00111 IR: 01001111 MAR: 00111 MDR: 00001010 A: 00001010
43
Contents of 00111 -> MDR
PC: 00111 IR: 01001111 MAR: 00111 MDR: 10110010 A: 00001010
44
MDR -> IR
PC: 00111 IR: 10110010 MAR: 00111 MDR: 10110010 A: 00001010
45
IR [address] -> MAR
PC: 00111 IR: 10110010 MAR: 10010 MDR: 10110010 A: 00001010
46
Location 10010 [MAR] Accessed
PC: 00111 IR: 10110010 MAR: 10010 MDR: 10110010 A: 00001010
47
Contents of [10010] -> MDR
PC: 00111 IR: 10110010 MAR: 10010 MDR: 01000111 A: 00001010
48
IR [opcode] executed: A = A + MDR
PC: 00111 IR: 10110010 MAR: 10010 MDR: 01000111 A: 01010001
49
Increment PC: PC = PC + 1
PC: 01000 IR: 10110010 MAR: 10010 MDR: 01000111 A: 01010001
50
To Continue:
If the next instruction were to store the Accumulator contents into an area of memory reserved for screen output (for example), then the number “81” (ten plus seventy-one) should appear on the screen.
The process continues in the same fashion, more or less, until a stop or halt instruction is encountered.
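The two walk-throughs can be replayed with a small Python simulation. This is a sketch under the example's assumptions (8-bit words, 5-bit addresses, op codes 010 for load and 101 for add); the `step` function and the dictionary-based register file are illustrative choices, not something the slides define.

```python
LOAD, ADD = 0b010, 0b101

def step(cpu: dict, memory: list) -> None:
    """Run one fetch-execute cycle using the register transfers listed above."""
    # Fetch: PC -> MAR, memory[MAR] -> MDR, MDR -> IR
    cpu["MAR"] = cpu["PC"]
    cpu["MDR"] = memory[cpu["MAR"]]
    cpu["IR"] = cpu["MDR"]
    # Execute: IR[address] -> MAR, memory[MAR] -> MDR
    opcode = cpu["IR"] >> 5
    cpu["MAR"] = cpu["IR"] & 0b11111
    cpu["MDR"] = memory[cpu["MAR"]]
    if opcode == LOAD:
        cpu["A"] = cpu["MDR"]                      # MDR -> A
    elif opcode == ADD:
        cpu["A"] = (cpu["A"] + cpu["MDR"]) & 0xFF  # A + MDR -> A, kept to 8 bits
    else:
        raise ValueError(f"unsupported op code {opcode:03b}")
    cpu["PC"] = (cpu["PC"] + 1) & 0b11111          # PC + 1 -> PC

memory = [0] * 32
memory[6] = 0b01001111   # load from location 15
memory[7] = 0b10110010   # add from location 18
memory[15] = 10
memory[18] = 71

cpu = {"PC": 6, "IR": 0, "MAR": 0, "MDR": 0, "A": 0}
step(cpu, memory)
after_load = dict(cpu)
step(cpu, memory)
print(after_load["A"], cpu["A"], cpu["PC"])  # 10 81 8
```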
51
From our first lecture…
Buses
Definition: a collection of electrical conductors (e.g., wires, traces) with a common purpose
Each wire or trace is called a line
Typically, buses carry information from one place to another
52
(Figure: a computer whose components are joined by a bus. Labels: CPU, RAM, disk controller, graphics card, sound card, network card, ports, printer, mouse, keyboard, modem, monitor, speakers.)
53
Bus
The physical connection that makes it possible to transfer data from one location in the computer system to another
Group of electrical or optical conductors for carrying signals from one location to another
  Wires or conductors printed on a circuit board
Line: each conductor in the bus
4 kinds of signals: data, addressing, control signals, power (sometimes)
54
Bus Characteristics
Number of separate wires or conductors
Data width in bits carried simultaneously
Addressing capacity
Whether lines on the bus carry a single type of signal or are shared
Throughput: data transfer rate in bits per second
Distance between two endpoints
Number and type of attachments supported
Type of control required
Defined purpose
Features and capabilities
55
Types of Buses (1 of 3)
Point-to-point
(Figure: serial port to modem; control unit to ALU.)
56
Types of Buses (2 of 3)
Multipoint
(Figure: several computers on one shared bus; CPU, memory, disk controller, and video controller on one shared bus.)
57
Types of Buses (3 of 3)
Daisy chain
(Figure: a device controller, then a chain of devices, ending in a terminator.)
58
Buses Inside a Computer
Motherboard: many configurations possible
(Figure labels: CPU, data bus, address bus, control bus, memory, I/O module, I/O device.)
59
Bus Categorizations
Parallel vs. serial buses
Direction of transmission
  Simplex: unidirectional
  Half duplex: bidirectional, one direction at a time
  Full duplex: bidirectional simultaneously
Method of interconnection
  Point-to-point: single source to single destination
  Cables: point-to-point buses that connect to an external device
  Multipoint bus (also broadcast bus or multidrop bus): connects multiple points to one another
60
Parallel vs. Serial Buses
Parallel
  High throughput because all bits of a word are transmitted simultaneously
  Expensive and require a lot of space
  Subject to radio-generated electrical interference, which limits their speed and length
  Generally used for short distances, such as CPU buses and on computer motherboards
Serial
  1 bit transmitted at a time
  Single data line pair and a few control lines
  For many applications, throughput is higher than for parallel because of the lack of electrical interference
61
Data Bus
Carries data between the CPU and memory or I/O devices
Bi-directional
  Data transferred “out of” the CPU for write operations
  Data transferred “into” the CPU for read operations
Typical sizes: 8, 16, 32, 64 lines
Signal names: D0, D1, D2, D3, etc.
62
Address Bus
Carries an address from the CPU to memory or I/O devices
Unidirectional
  The address is always supplied by the CPU (there is one exception to this, which we’ll discuss later)
Typical sizes: 16, 20, 24 lines
Signal names: A0, A1, A2, A3, etc.
63
Control Bus
Collection of signals for coordinating CPU activities
Each signal has a unique purpose
Typical sizes: lines
Signals are output, input, or bi-directional
Typical signals: /RD (read), /WR (write), CLK (clock), /IRQ (interrupt request), etc.
64
Figure 7.11 Alternative bus notations
65
Classification of Instructions
Data movement (load, store)
  Most common, greatest flexibility
  Involve memory and registers
  What is the size of a word? 16? 32? 64 bits?
Arithmetic
  Operators + - / * ^
  Integers and floating point
Boolean logic
  Often includes at least AND, XOR, and NOT
Single operand manipulation instructions
  Negating, decrementing, incrementing, set to 0
66
More Instruction Classifications
Bit manipulation instructions
  Flags to test for conditions
Shift and rotate
Program control
Stack instructions
Multiple data instructions
I/O and machine control
67
Register Shifts and Rotates
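The difference between a shift and a rotate can be sketched in Python for an 8-bit register. The width, the logical (zero-fill) shift behavior, and the function names are assumptions for illustration; real instruction sets also offer arithmetic shifts and rotates through a carry flag.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1  # 0xFF keeps results to eight bits

def shift_left(value: int, count: int = 1) -> int:
    """Logical shift left: bits leaving the top are lost, zeros enter at the bottom."""
    return (value << count) & MASK

def shift_right(value: int, count: int = 1) -> int:
    """Logical shift right: bits leaving the bottom are lost, zeros enter at the top."""
    return (value & MASK) >> count

def rotate_left(value: int, count: int = 1) -> int:
    """Rotate left: bits leaving the top re-enter at the bottom."""
    value &= MASK
    count %= WIDTH
    return ((value << count) | (value >> (WIDTH - count))) & MASK

def rotate_right(value: int, count: int = 1) -> int:
    """Rotate right: bits leaving the bottom re-enter at the top."""
    value &= MASK
    count %= WIDTH
    return ((value >> count) | (value << (WIDTH - count))) & MASK

word = 0b10110010
print(f"{shift_left(word):08b}")   # 01100100
print(f"{rotate_left(word):08b}")  # 01100101
```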
68
Program Control Instructions
Jump and branch Subroutine call and return
69
Stack Instructions
LIFO method for organizing information
Items removed in the reverse order from how they are added
Push
Pop
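The LIFO behavior of push and pop can be shown with a Python list standing in for the stack. This only models the ordering; a hardware stack would normally live in memory and be addressed through a stack pointer register.

```python
stack = []

# Push three items in order.
for item in (1, 2, 3):
    stack.append(item)

# Pop them all: they come back in reverse order.
popped = [stack.pop() for _ in range(3)]
print(popped)  # [3, 2, 1]
```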
70
Cache Memory
Blocks: between 8 and 64 bytes
Cache line: unit of transfer between storage and cache memory
Tags: pointer to location in main memory
Cache controller: hardware that checks tags to determine if in cache
Hit ratio: ratio of hits out of total requests
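The tag check and the hit ratio can be sketched with a toy cache in Python. The slides do not say how lines are selected, so this sketch assumes a direct-mapped organization with 4 lines of 16-byte blocks; the class `DirectMappedCache` and its parameters are illustrative only.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only (no data), for counting hits."""

    def __init__(self, num_lines: int = 4, block_size: int = 16):
        self.num_lines = num_lines
        self.block_size = block_size
        self.tags = [None] * num_lines  # one tag per cache line
        self.hits = 0
        self.requests = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, load the block's tag into its line."""
        block = address // self.block_size
        line = block % self.num_lines
        tag = block // self.num_lines
        self.requests += 1
        if self.tags[line] == tag:
            self.hits += 1
            return True
        self.tags[line] = tag
        return False

    def hit_ratio(self) -> float:
        return self.hits / self.requests if self.requests else 0.0

cache = DirectMappedCache()
results = [cache.access(a) for a in (0, 4, 8, 64, 0)]
print(results)            # [False, True, True, False, False]
print(cache.hit_ratio())  # 0.4
```

Addresses 0, 4, and 8 share one block, so only the first is a miss; address 64 maps to the same line with a different tag and evicts it, which is why the final access to 0 misses again.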
71
Step-by-Step Use of Cache
72
Step-by-Step Use of Cache
73
Traditional Modern Architectures
Problems with early CPU architectures and solutions:
Large number of specialized instructions were rarely used but added hardware complexity and slowed down other instructions
Slow data memory accesses could be reduced by increasing the number of general purpose registers
Using general registers to hold addresses could reduce the number of addressing modes and simplify architecture design
Fixed-length, fixed-format instruction words would allow instructions to be fetched and decoded independently and in parallel
Copyright 2013 John Wiley & Sons, Inc.
74
Performance Advantages
Hit ratios of 90% and above are common
  50%+ improved execution speed
Locality of reference is why caching works
  Most memory references are confined to a small region of memory at any given time
  Well-written program in small loop, procedure, or function
  Data likely in array
  Variables stored together
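The effect of locality on the hit ratio can be illustrated with a small Python experiment. It assumes a made-up direct-mapped cache of 4 lines with 16-byte blocks and two synthetic access patterns: a loop that repeatedly walks a 64-byte array, and accesses spaced 64 bytes apart. The numbers are properties of this toy setup, not measurements of a real system.

```python
def hit_ratio(addresses, num_lines: int = 4, block_size: int = 16) -> float:
    """Fraction of accesses that hit in a toy direct-mapped, tag-only cache."""
    tags = [None] * num_lines
    hits = 0
    total = 0
    for address in addresses:
        block = address // block_size
        line = block % num_lines
        tag = block // num_lines
        total += 1
        if tags[line] == tag:
            hits += 1
        else:
            tags[line] = tag  # miss: replace the line's contents
    return hits / total if total else 0.0

# Ten passes over a 64-byte array: references stay in a small region.
loop_accesses = [a for _ in range(10) for a in range(64)]
# Ten passes over addresses 64 bytes apart: every access lands in a new block.
scattered_accesses = [i * 64 for _ in range(10) for i in range(64)]

loop_ratio = hit_ratio(loop_accesses)
scattered_ratio = hit_ratio(scattered_accesses)
print(loop_ratio)       # 0.99375
print(scattered_ratio)  # 0.0
```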
75
Multiprocessing
Reasons
  Increase the processing power of a system
  Parallel processing through threads: independent segments of a program that can be executed concurrently
Multiprocessor system
  Tightly coupled
  Multicore processors: when CPUs are on a single integrated circuit
76
Multiprocessor Systems
Identical access to programs, data, shared memory, I/O, etc.
Easily extends multi-tasking and redundant program execution
Each CPU in a processor is called a core
Two ways to configure
  Master-slave multiprocessing
  Symmetrical multiprocessing (SMP)
77
Typical Multiprocessing System Configuration
78
Master-Slave Multiprocessing
Master CPU
  Manages the system
  Controls all resources and scheduling
  Assigns tasks to slave CPUs
Advantages
  Simplicity
  Protection of system and data
Disadvantages
  Master CPU becomes a bottleneck
  Reliability issues: if the master CPU fails, the entire system fails
79
Symmetrical Multiprocessing
Each CPU has equal access to resources
Each CPU determines what to run using a standard algorithm
Disadvantages
  Resource conflicts: memory, I/O, etc.
  Complex implementation
Advantages
  High reliability
  Fault tolerant support is straightforward
  Balanced workload
80
Multiple, Parallel Execution Units
Different instructions have different numbers of steps in their cycle
Differences in each step
Each execution unit is optimized for one general type of instruction
Multiple execution units permit simultaneous execution of several instructions
81
Superscalar Processing
Process more than one instruction per clock cycle
Separate fetch and execute cycles as much as possible
Buffers for fetch and decode phases
Parallel execution units
82
Scalar vs. Superscalar Processing
83
Intel Core i7 2-4 Cores 2.2 – 4.0 GHz