Computer Organization (Review of Prerequisite Material)
Computer Architecture Processes are an abstraction of the operation of computers. So, to understand operating systems, one must have a basic knowledge about how computer hardware is organized. The von Neumann architecture forms the basis for most contemporary computer systems.
Historical Perspective
- Babbage designed the Difference Engine and, later, the Analytical Engine (1822-1857), based on the notion of a program-controlled computer (but with mechanical, not electronic, parts)
- Used an idea from the Jacquard loom to store computational patterns
- The ideas he developed were reinvented, extended, and implemented by Zuse (1936), Atanasoff (1940), Bell Labs (1945), Aiken and Hopper (1946), and others in the 1930s and 1940s
Stored Program Computers and Electronic Devices
[Figure: a Jacquard loom driven by a pattern, a fixed electronic device, and a stored program device driven by a variable program]
Historical Perspective – cont.
- The first true computer to use the stored program concept was EDVAC (Electronic Discrete Variable Automatic Computer), designed in 1945 (but not completed until 1951)
- The stored program concept is often attributed to von Neumann (hence the name von Neumann architecture), but it was not entirely due to him; his backing did increase government and academic support
von Neumann Architecture
[Figure: the Central Processing Unit (CPU), containing the Control Unit (CU) and the Arithmetical Logical Unit (ALU), connected by the address bus and data bus to the Primary Memory Unit (executable memory) and to each device through its device controller]
von Neumann Architecture – cont. The crucial difference between computers and other electronic devices is the variable program
Central Processing Unit
- Datapath
  - ALU (Arithmetic/Logic Unit)
  - Registers
    - General-purpose registers
    - Control registers
  - Communication paths between them
- Control
  - Controls the data flow and operations of the ALU
The ALU
[Figure: registers R1, R2, . . ., Rn supply the left and right operands to the functional unit, which produces the result and updates the status registers; registers are loaded from and stored to primary memory]
Example instruction sequence:
    load R3,b
    load R4,c
    add R3,R4
    store R3,a
Program Specification
Source:
    int a, b, c, d;
    . . .
    a = b + c;
    d = a - 100;
Assembly Language:
    ; Code for a = b + c
    load R3,b
    load R4,c
    add R3,R4
    store R3,a
    ; Code for d = a - 100
    load R4,=100
    subtract R3,R4
    store R3,d
Machine Language
Assembly Language:
    ; Code for a = b + c
    load R3,b
    load R4,c
    add R3,R4
    store R3,a
    ; Code for d = a - 100
    load R4,=100
    subtract R3,R4
    store R3,d
Machine Language:
    10111001001100…1
    10111001010000…0
    10100111001100…0
    10111010001100…1
    10100110001100…0
    10111001101100…1
Control Unit
[Figure: the control unit (fetch unit, decode unit, execute unit, PC, and IR) connected to primary memory. Memory addresses 3046, 3050, 3054, and 3058 hold load R3,b (10111001001100…1), load R4,c (10111001010000…0), add R3,R4 (10100111001100…0), and store R3,a (10111010001100…1); the IR holds load R4,c]
Control Unit Operation
- Fetch phase: instruction retrieved from memory
- Execute phase: ALU op, memory data reference, I/O, etc.
    PC = <machine start address>;
    IR = memory[PC];
    haltFlag = CLEAR;
    while(haltFlag not SET) {
        execute(IR);
        PC = PC + sizeof(INSTRUCT);
        IR = memory[PC]; // fetch phase
    };
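The cycle above can be sketched as a small simulator. This is a minimal illustration, not a model of any real machine: the tuple instruction encoding, the `run` function, the one-word instruction size, and the `halt` opcode are assumptions made for the example. Only the `load`/`add`/`store` program comes from the slides.

```python
# Toy fetch-execute loop. The tuple instruction format and the
# "halt" opcode are assumptions for illustration only.

def run(memory, data, start=0):
    """Execute instructions from memory until a halt instruction is reached."""
    regs = {}                      # general-purpose registers, e.g. "R3"
    pc = start                     # PC = <machine start address>
    ir = memory[pc]                # IR = memory[PC] (initial fetch)
    halt_flag = False              # haltFlag = CLEAR
    while not halt_flag:
        op, *args = ir             # decode
        if op == "load":           # load Rn, var
            regs[args[0]] = data[args[1]]
        elif op == "add":          # add Rd, Rs  ->  Rd = Rd + Rs
            regs[args[0]] += regs[args[1]]
        elif op == "store":        # store Rn, var
            data[args[1]] = regs[args[0]]
        elif op == "halt":
            halt_flag = True
            continue
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1                    # PC = PC + sizeof(INSTRUCT), one word here
        ir = memory[pc]            # fetch phase
    return data


# Code for a = b + c, as on the slides, followed by an assumed halt.
program = [
    ("load", "R3", "b"),
    ("load", "R4", "c"),
    ("add", "R3", "R4"),
    ("store", "R3", "a"),
    ("halt",),
]
variables = run(program, {"a": 0, "b": 2, "c": 3})
print(variables["a"])
```

Note that the loop executes the instruction already in IR and only then fetches the next one, which mirrors the order of the pseudocode.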
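[Figure: datapath in which the S1, S2, and Dest buses connect the control unit, the ALU with its A, B, and C latches, the registers (R0, R1, ...), ia (PC), psw, MAR, IR, and MDR; MAR and MDR connect to memory]
- MAR: memory address register
- MDR: memory data register
- IR: instruction register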
Primary Memory Unit
[Figure: a memory of cells 0 through n-1 with MAR, MDR, and Command registers; cell 1234 holds 98765]
Read Op:
1. Load MAR with address (1234)
2. Load Command with “read”
3. Data (98765) will then appear in the MDR
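The three-step read can be sketched with a tiny memory model. The `MemoryUnit` class, its method names, and the `write` command are assumptions for illustration; the slide itself only shows the read operation, with address 1234 and value 98765.

```python
class MemoryUnit:
    """Primary memory accessed only through MAR, MDR, and a command register."""

    def __init__(self, size):
        self.cells = [0] * size   # cells 0 .. size-1
        self.mar = 0              # memory address register
        self.mdr = 0              # memory data register

    def command(self, name):
        if name == "read":        # data appears in the MDR
            self.mdr = self.cells[self.mar]
        elif name == "write":     # assumed counterpart, not shown on the slide
            self.cells[self.mar] = self.mdr
        else:
            raise ValueError(f"unknown command: {name}")


memory = MemoryUnit(2048)
memory.cells[1234] = 98765        # contents used in the slide's example
memory.mar = 1234                 # 1. load MAR with the address
memory.command("read")            # 2. load Command with "read"
print(memory.mdr)                 # 3. the data is now in the MDR
```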
Instruction Execution
- Instruction Fetch (IF): MAR ← PC; IR ← M[MAR]
- Instruction Decode (ID): A ← Rs1; B ← Rs2; PC ← PC + 4
- Execution (EXE): depends on the instruction
- Memory Access (MEM)
- Write-back (WB)
Arithmetic Instruction Example
r3 ← r1 + r2
- IF: MAR ← PC; IR ← M[MAR]
- ID: A ← r1; B ← r2; PC ← PC + 4
- EXE: ALUoutput ← A + B
- MEM: (nothing to do)
- WB: r3 ← ALUoutput
Memory Instruction Example
load 30(r1), r2
- IF: MAR ← PC; IR ← M[MAR]
- ID: A ← r1; PC ← PC + 4
- EXE: MAR ← A + #30
- MEM: MDR ← M[MAR]
- WB: r2 ← MDR
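The five phases of the load example can be traced one register transfer at a time. This is only a sketch: the `execute_load` function, the tuple encoding of the instruction, and the use of separate dictionaries for instruction and data memory are assumptions chosen to keep the example short.

```python
def execute_load(regs, pc, imem, dmem):
    """Run one 'load offset(base), dest' instruction through the five phases."""
    # IF: MAR <- PC; IR <- M[MAR]
    mar = pc
    ir = imem[mar]
    # ID: A <- Rs1; PC <- PC + 4
    _, offset, base, dest = ir
    a = regs[base]
    pc = pc + 4
    # EXE: MAR <- A + #offset
    mar = a + offset
    # MEM: MDR <- M[MAR]
    mdr = dmem[mar]
    # WB: Rd <- MDR
    regs[dest] = mdr
    return pc


regs = {"r1": 1000, "r2": 0}
imem = {0: ("load", 30, "r1", "r2")}   # load 30(r1), r2
dmem = {1030: 42}                      # hypothetical contents of address 1030
next_pc = execute_load(regs, 0, imem, dmem)
print(regs["r2"], next_pc)
```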
Branch/jump Instruction Example
bnez r1, -16
- IF: MAR ← PC; IR ← M[MAR]
- ID: A ← r1; PC ← PC + 4
- EXE: ALUoutput ← PC + #-16; cond ← (A op 0)
- MEM: if (cond) PC ← ALUoutput
- WB: (nothing to do)
Example: what is r4 after the loop?
        r1 = 100
        r4 = 0
        r3 = 1
    L1: r4 = r4 + r3
        r3 = r3 + 2
        r1 = r1 - 1
        if (r1 != 0) goto L1
        // Outside loop
        // r4 ?
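The question posed by the loop on this slide (the value of r4 once control leaves the loop) can be checked by transcribing the loop directly. The function name `run_loop` and its `count` parameter are invented for this sketch.

```python
def run_loop(count=100):
    """Transcription of the slide's loop; returns r4 after the loop exits."""
    r1, r4, r3 = count, 0, 1
    while True:
        r4 = r4 + r3          # L1: r4 = r4 + r3
        r3 = r3 + 2
        r1 = r1 - 1
        if r1 == 0:           # bnez r1 falls through when r1 reaches 0
            break
    return r4


print(run_loop())  # 10000
```

The loop adds the first 100 odd numbers (1 + 3 + ... + 199), and the sum of the first n odd numbers is n², so r4 ends at 100² = 10000.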
Devices
- I/O devices are used to place data into primary memory and to store its contents on a more permanent medium
- Each device uses a device controller to connect it to the computer’s address and data bus
  - Device controller: logic to control detailed operation
  - Device: the physical device itself
- Many types of I/O devices
Device Organization
[Figure: layers from top to bottom: application program; abstract I/O machine; device manager (a program to manage the device controller, supervisor mode software running in the CPU); device controller; device]
Device Characteristics
- Block or character oriented
  - Depends on number of bytes transferred in one operation
- Input or output (or both)
- Storage or communication
- Handled by device controller
Device Controllers
- A hardware component that controls the detailed operations of a device
- Interface between controllers and devices
- Interface between software and the controller
  - Through the controller’s registers
Device Controller Interface
[Figure: the controller's registers as seen by software: Command; Status (busy, done, error code, . . .); Data 0, Data 1, . . ., Data n-1; and the controller logic]
busy  done  state
0     0     idle
0     1     finished
1     0     working
1     1     (undefined)
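The busy/done encoding is small enough to express as a lookup. The function name `controller_state` is an assumption for this sketch; the four states are the ones listed on the slide.

```python
def controller_state(busy, done):
    """Map the busy and done status bits to the controller state they encode."""
    states = {
        (0, 0): "idle",
        (0, 1): "finished",
        (1, 0): "working",
    }
    # busy = 1 and done = 1 together is not a defined state.
    return states.get((int(busy), int(done)), "undefined")


for bits in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(bits, controller_state(*bits))
```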
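Direct Memory Access
[Figure: (left) conventional devices, where data moves between primary memory and the device controller through the CPU; (right) DMA controllers, where the controller transfers data to and from primary memory directly]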
Addressing Devices
- Instructions to access the device controller’s registers
  - Special I/O instructions
  - Memory-mapped I/O
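Addressing Devices
[Figure: two address layouts, each showing primary memory and devices up to device n-1: one with separate memory addresses and device addresses, and one in which device registers share the memory address space]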
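Memory-mapped I/O can be sketched as a bus that routes each address either to memory or to a device register. Everything named here (`MemoryMappedBus`, the placement of the device registers directly above memory, the register count) is an assumption for illustration and does not describe any particular machine.

```python
class MemoryMappedBus:
    """Route ordinary load/store addresses to memory or to device registers."""

    def __init__(self, memory_size, device_register_count):
        self.memory = [0] * memory_size
        self.device_base = memory_size            # assumed: devices sit just above memory
        self.device_registers = [0] * device_register_count

    def store(self, address, value):
        if address < self.device_base:
            self.memory[address] = value
        else:                                     # same instruction, but hits a device
            self.device_registers[address - self.device_base] = value

    def load(self, address):
        if address < self.device_base:
            return self.memory[address]
        return self.device_registers[address - self.device_base]


bus = MemoryMappedBus(memory_size=1024, device_register_count=4)
bus.store(10, 7)        # ordinary memory write
bus.store(1024, 1)      # write to the first device register
print(bus.load(10), bus.load(1024))
```

With special I/O instructions, by contrast, the device registers would be reached through separate operations rather than through `load` and `store`.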
Communication Between CPU and Devices
- Through the busy and done flags
- Called polling
- A busy-waiting implementation
- Not efficient
Polling I/O
[Figure: software and hardware communicate through the busy and done flags]
Software:
    …
    // Start the device
    while(busy == 1)
        wait();
    // Device I/O complete
    done = 0;
Hardware:
    while((busy == 0) && (done == 1))
        // Do the I/O operation
    busy = 1;
Polling I/O – cont.
    while(deviceNo.busy || deviceNo.done) <waiting>;
    deviceNo.data[0] = <value to write>;
    deviceNo.command = WRITE;
    while(deviceNo.busy) <waiting>;
    deviceNo.done = FALSE; // return the controller to the idle state
- It introduces busy-waiting
  - The CPU is busy, but is effectively waiting
- Devices are much slower than the CPU
  - The CPU waits while the device operates
- We would like to multiplex the CPU to a different process while I/O is in progress
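The cost of busy-waiting can be made visible with a simulated device. This is a sketch under stated assumptions: `SimulatedDevice`, its `latency` parameter, and the `tick` method stand in for real hardware, and the final step clears `done` so that the controller returns to the idle state (busy = 0, done = 0), which is a choice made for this example.

```python
class SimulatedDevice:
    """A stand-in controller that finishes a write after `latency` ticks."""

    def __init__(self, latency=3):
        self.busy = False
        self.done = False
        self.data = [0]
        self.written = []          # values the simulated device has accepted
        self._latency = latency
        self._remaining = 0

    def command_write(self):
        self.busy = True
        self._remaining = self._latency

    def tick(self):
        """Advance simulated time by one step."""
        if self.busy:
            self._remaining -= 1
            if self._remaining == 0:
                self.written.append(self.data[0])
                self.busy = False
                self.done = True


def polled_write(device, value):
    """Write one value by polling; return how many loop iterations were spent waiting."""
    wasted = 0
    while device.busy or device.done:   # wait for the controller to be idle
        device.tick()
        wasted += 1
    device.data[0] = value
    device.command_write()
    while device.busy:                  # busy-wait until the operation completes
        device.tick()
        wasted += 1
    device.done = False                 # assumed: clear done to return to idle
    return wasted


device = SimulatedDevice(latency=3)
print(polled_write(device, 7), device.written)
```

Every iteration counted in `wasted` is CPU time that did no useful work, which is the motivation for letting another process run during the I/O.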
A More Efficient Approach
- While a process waits for its I/O to complete, it is more efficient to let another process run so that the CPU is fully utilized
- This requires a way for the device to inform the CPU when it has completed its I/O
Better Utilization of CPU
[Figure: while a device carries out one process's I/O operation, another of the ready processes uses the CPU]
Determining When I/O is Complete
[Figure: several devices connected to the CPU's “interrupt pending” flag]
- The CPU incorporates an “interrupt pending” flag
- When device.busy becomes FALSE, the interrupt pending flag is set
- Hardware “tells” the OS that the interrupt occurred
- The interrupt handler, part of the OS, makes the process ready to run
Interrupts
- An interrupt is an immediate (asynchronous) transfer of control caused by an event in the system, used to handle real-time events and run-time errors
- An interrupt can be either software or hardware
  - I/O device request (hardware)
  - System call (software)
  - Signal (software)
  - Page fault (software)
  - Arithmetic overflow
  - Memory-protection violation
  - Power failure
Interrupts – cont.
Causes of interrupts:
- System call (syscall instruction)
- Timer expires (value of timer register reaches 0)
- I/O completed
- Program performed an illegal operation:
  - Divide by zero
  - Address out of bounds while in user mode
  - Segmentation fault
Interrupts – cont.
[Figure: an interrupt transfers control from the running program to the interrupt handler]
Synchronous vs. Asynchronous
- Synchronous
  - Events occur at the same place every time the program is executed with the same data and memory
  - Can be predicted
- Asynchronous
  - Caused by devices external to the processor or memory
Interrupt Handling
When an interrupt occurs, the following steps are taken:
- Save the current program state
  - Context switch to save all the general and status registers of the interrupted process
- Find out the interrupt source
- Go to the interrupt handler
Control Unit with Interrupt (Hardware)
    PC = <machine start address>;
    IR = memory[PC];
    haltFlag = CLEAR;
    while(haltFlag not SET) {
        execute(IR);
        PC = PC + sizeof(INSTRUCT);
        if(InterruptRequest) {
            memory[0] = PC;   // save the return address
            PC = memory[1];   // jump to the interrupt handler
        }
        IR = memory[PC];      // fetch phase
    };
- memory[1] contains the address of the interrupt handler
Interrupt Handler (Software)
    saveProcessorState();
    for(i=0; i<NumberOfDevices; i++)
        if(device[i].done) goto deviceHandler(i);
    /* something wrong if we get to here … */

    deviceHandler(int i) {
        finishOperation();
        returnToScheduler();
    }

    saveProcessorState() {
        for(i=0; i<NumberOfRegisters; i++)
            memory[K+i] = R[i];
        for(i=0; i<NumberOfStatusRegisters; i++)
            memory[K+NumberOfRegisters+i] = StatusRegister[i];
    }
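The two jobs of the handler (saving state, then finding which device raised the interrupt) can be sketched as plain functions. The names `save_processor_state` and `find_interrupt_source`, and the representation of each device as a dictionary with a `done` flag, are assumptions for this example only.

```python
def save_processor_state(registers, status_registers):
    """Copy general registers, then status registers, into one save area.

    The returned list plays the role of memory[K], memory[K+1], ... on the slide.
    """
    return list(registers) + list(status_registers)


def find_interrupt_source(devices):
    """Return the index of the first device whose done flag is set, else None."""
    for i, device in enumerate(devices):
        if device["done"]:
            return i
    return None   # corresponds to "something wrong if we get to here"


saved = save_processor_state([11, 22, 33], [1, 0])
devices = [{"done": False}, {"done": True}, {"done": False}]
print(saved, find_interrupt_source(devices))
```

Scanning the devices in index order also fixes a priority among them: if two devices are done at once, the lower-numbered one is served first.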
Interrupt Handling – cont.
- Problem when two or more devices finish during the same instruction cycle
  - Race condition between interrupts
  - The interrupt handler gets interrupted
- To avoid race conditions, implement an InterruptEnable flag
  - If FALSE, no interrupts are allowed
  - Reset to TRUE when the handler exits its “critical code” section
Fetch/Execute Cycle w/Interrupts
    PC = <machine start address>;
    IR = memory[PC];
    haltFlag = CLEAR;
    while(haltFlag not SET) {
        execute(IR);
        PC = PC + sizeof(INSTRUCT);
        if(InterruptRequest && InterruptEnabled) {
            disableInterrupts();
            memory[0] = PC;   // save the return address
            PC = memory[1];   // jump to the interrupt handler
        }
        IR = memory[PC];      // fetch phase
    };
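The interrupt-aware cycle can be exercised with a toy machine. This is a sketch, not a description of real hardware: the `work`, `rti` (return from interrupt), and `halt` instructions, the one-word instruction size, and the `interrupt_cycles` argument that says when a device raises a request are all assumptions. As on the slides, `memory[0]` holds the saved PC and `memory[1]` the handler address.

```python
def run_with_interrupts(memory, interrupt_cycles, start):
    """Run until halt; return the labels of the 'work' instructions executed."""
    trace = []
    pc = start
    ir = memory[pc]                      # initial fetch
    interrupt_enabled = True
    interrupt_request = False
    cycle = 0
    while ir[0] != "halt":
        if cycle in interrupt_cycles:    # a device finishes during this cycle
            interrupt_request = True
        pc = pc + 1                      # PC = PC + sizeof(INSTRUCT)
        if ir[0] == "work":              # execute(IR)
            trace.append(ir[1])
        elif ir[0] == "rti":             # assumed return-from-interrupt
            pc = memory[0]               # resume the interrupted program
            interrupt_enabled = True
        if interrupt_request and interrupt_enabled:
            interrupt_enabled = False    # disableInterrupts()
            interrupt_request = False
            memory[0] = pc               # save the return address
            pc = memory[1]               # jump to the handler
        ir = memory[pc]                  # fetch phase
        cycle += 1
    return trace


layout = [
    0,                    # memory[0]: saved PC
    6,                    # memory[1]: address of the interrupt handler
    ("work", "A"),        # program starts at address 2
    ("work", "B"),
    ("work", "C"),
    ("halt",),
    ("work", "handler"),  # handler starts at address 6
    ("rti",),
]
print(run_with_interrupts(list(layout), {1}, start=2))
```

A request that arrives while interrupts are disabled stays pending and is taken as soon as the handler re-enables them, so the handler itself is never interrupted.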
Revisiting the trap Instruction (Hardware)
    executeTrap(argument) {
        setMode(supervisor);
        switch(argument) {
            case 1: PC = memory[1001]; break; // Trap handler 1
            case 2: PC = memory[1002]; break; // Trap handler 2
            . . .
            case n: PC = memory[1000+n]; break; // Trap handler n
        }
    };
- The trap instruction dispatches a trap handler routine atomically
- The trap handler performs the desired processing
- “A trap is a software interrupt”
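Because every case has the same shape, the switch amounts to a branch-table lookup. In this sketch the `execute_trap` function, the dictionary used for the CPU state, and the handler addresses 5000 and 6000 are hypothetical; only the table location (entries at 1000 + n) follows the slide.

```python
def execute_trap(cpu, memory, argument, table_base=1000):
    """Switch to supervisor mode and branch through the trap table."""
    handler_address = memory[table_base + argument]  # branch-table lookup
    cpu["mode"] = "supervisor"
    cpu["pc"] = handler_address


memory = {1001: 5000, 1002: 6000}        # hypothetical handler addresses
cpu = {"mode": "user", "pc": 40}
execute_trap(cpu, memory, 2)
print(cpu)
```

User code chooses only the table index, never the destination address, so it can enter supervisor mode only at the entry points the table provides.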
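The Trap Instruction Operation
[Figure: a trap instruction executed in user mode (1) sets the mode bit to supervisor (S), (2) indexes the branch table, and (3) branches to trusted code that runs in supervisor mode]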
Intel System Initialization
[Figure: hardware (ROM, CMOS, RAM, boot device) and the sequence of processes started at power up (POST, BIOS, boot program, loader, OS, …), with the data flow between them]
Bootstrapping
[Figure, built up in steps: primary memory holding the BIOS loader at 0x0000100, the bootstrap loader (“boot sector”) at 0x0001000, the Loader at 0x0008000, and the OS at 0x000A000, alongside the control unit's fetch, decode, and execute units, PC, and IR]
1. PC = 0x0000100: the BIOS loader runs and loads the bootstrap loader (“boot sector”) at 0x0001000
2. PC = 0x0001000: the bootstrap loader runs and loads the Loader at 0x0008000
3. PC = 0x0008000: the Loader runs and loads the OS at 0x000A000
4. PC = 0x000A000: the OS runs and initializes hardware
5. Create user environment
6. …
A Bootstrap Loader Program
    FIXED_LOC: // Bootstrap loader entry point
        load R1, =0
        load R2, =LENGTH_OF_TARGET
        // The next instruction is really more like
        // a procedure call than a machine instruction
        // It copies a block from FIXED_DISK_ADDRESS
        // to BUFFER_ADDRESS
        read BOOT_DISK, BUFFER_ADDRESS
    loop:
        load R3, [BUFFER_ADDRESS, R1]
        store R3, [FIXED_DEST, R1]
        incr R1
        bleq R1, R2, loop
        br FIXED_DEST
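The loader's behavior (read a block into a buffer, copy it word by word to its final location, then branch there) can be mimicked in a few lines. The `bootstrap` function, the list used as memory, and the example addresses are assumptions. The sketch copies exactly as many words as the block contains; whether the slide's `bleq` test copies one more depends on how `LENGTH_OF_TARGET` is defined, which the slide does not say.

```python
def bootstrap(disk_block, memory, buffer_address, fixed_dest):
    """Copy a boot block into place and return the address to branch to."""
    length = len(disk_block)
    # read BOOT_DISK, BUFFER_ADDRESS: block transfer into the buffer
    memory[buffer_address:buffer_address + length] = disk_block
    r1 = 0                                    # load R1, =0
    while r1 < length:
        r3 = memory[buffer_address + r1]      # load R3, [BUFFER_ADDRESS, R1]
        memory[fixed_dest + r1] = r3          # store R3, [FIXED_DEST, R1]
        r1 += 1                               # incr R1
    return fixed_dest                         # br FIXED_DEST


memory = [0] * 64
entry = bootstrap([7, 8, 9], memory, buffer_address=8, fixed_dest=32)
print(entry, memory[32:35])
```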
Mobile Computers
- Use the von Neumann architecture, BUT
  - Physically very small and lightweight
  - Severe constraints on power consumption
  - Limited memory size
  - May use removable devices for storage, networking, etc.
- System-on-a-chip (SoC)
  - Most components are integrated into the same chip as the CPU
- Power management is a critical issue
Speeding Up Computers
- How can we make computers faster while using the same clock speed?
- Break the computer into multiple units, each working on a different part of the same problem
  - Divide the CPU into functional units => pipelining
  - Use multiple ALUs => SIMD machines (single instruction, multiple data)
  - Use multiprocessors => parallel computers
A Pipelined Function Unit
[Figure: (a) monolithic unit: Operand 1 and Operand 2 enter a single function unit that produces the result; (b) pipelined unit: the same operands pass through a sequence of stages to produce the result]
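The benefit of pipelining at a fixed clock speed can be estimated with simple arithmetic. The sketch below assumes an idealized unit: every stage takes one time unit, the operations are independent, and the pipeline never stalls. The function names are invented for the example.

```python
def monolithic_time(n_ops, n_stages):
    """Each operation occupies the whole unit for n_stages time units."""
    return n_ops * n_stages


def pipelined_time(n_ops, n_stages):
    """The first result takes n_stages units; each later one takes 1 more."""
    if n_ops == 0:
        return 0
    return n_stages + (n_ops - 1)


ops, stages = 100, 4
print(monolithic_time(ops, stages), pipelined_time(ops, stages))  # 400 103
```

For long runs of operations the speedup approaches the number of stages, which is why the clock rate does not need to change.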
A SIMD Machine
[Figure: (a) conventional architecture: one control unit driving one ALU; (b) SIMD architecture: one control unit driving multiple ALUs]
Multiprocessor Machines Shared memory multiprocessors Distributed memory multiprocessors
Summary
- The von Neumann architecture is used in most computers
- To manage I/O devices more effectively, interrupts are used
- Interrupt handling involves hardware and software support
- There are also machines that use a different architecture
  - Array processors; multiprocessors