CS2100 Computer Organisation Input/Output (AY2010/2011) Semester 2 Adapted from David Patterson’s lecture slides:
CS2100 Input/Output 2 THE BIG PICTURE [Figure: two computers, each with Processor (Control, Datapath), Memory, Input and Output, connected by a Network]
CS2100 Input/Output 3 INPUT/OUTPUT DEVICES
CS2100 Input/Output 4 WHY I/O MATTERS? CPU performance increase ~ 60% per year I/O performance increase < 10% per year Limited by mechanical delays Amdahl’s Law: system speedup is limited by the slowest part Example: Suppose 1 sec I/O + 4 sec CPU => 5 seconds Increase CPU performance by 100% => 1 sec I/O + 2 sec CPU = 3 seconds We only get about 67% speedup (5/3 ≈ 1.67) => I/O bottleneck “I think Silicon Valley was misnamed. If you look back at the dollars shipped in products in the last decade, there has been more revenue from magnetic disks than from silicon. They ought to rename the place Iron Oxide Valley.” -- Al Hoagland, one of the pioneers of magnetic disks, 1982
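The Amdahl's Law example on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `total_time` is an illustrative assumption, not something from the slides.

```python
def total_time(io_s, cpu_s, cpu_speedup):
    """Elapsed time when only the CPU portion is sped up by cpu_speedup."""
    return io_s + cpu_s / cpu_speedup

before = total_time(1.0, 4.0, 1.0)   # original system: 1 s I/O + 4 s CPU
after = total_time(1.0, 4.0, 2.0)    # CPU performance increased by 100%
speedup = before / after             # overall speedup, limited by the I/O part

print(f"{before:.0f} s -> {after:.0f} s, overall speedup {speedup:.2f}x")
```

Even an infinitely fast CPU would leave the 1 second of I/O, so the overall speedup in this example can never exceed 5x.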
CS2100 Input/Output 5 TYPES AND CHARACTERISTICS OF I/O DEVICES Behavior Input: read once Output: write only, cannot be read Storage: can be reread and usually rewritten Partner What’s on the other end? Human or machine Data Rate Peak rate of transfer between I/O and memory/CPU
CS2100 Input/Output 6 I/O DEVICE EXAMPLES
Device            Behavior          Partner   Data Rate (KB/sec)
Keyboard          Input             Human     0.01
Mouse             Input             Human     0.02
Line Printer      Output            Human     1.00
Floppy disk       Storage           Machine   50.00
Laser Printer     Output            Human
Optical Disk      Storage           Machine
Magnetic Disk     Storage           Machine   5,
Network-LAN       Input or output   Machine   20 – 1,
Graphics Display  Output            Human     30,000.00
CS2100 Input/Output 7 MOUSE Invented by Douglas C. Engelbart (jointly with Bill English) in the 1960s; the patent was granted in 1970 “SRI patented the mouse, but they really had no idea of its value. Some years later I learned that they had licensed it to Apple for something like $40,000.” -- Douglas C. Engelbart Engelbart was a pioneer of human-computer interaction whose team developed hypertext, networked computers, and precursors to GUIs Mouse uses optical or mechanical means to determine the X-Y coordinates Bandwidth requirement limited by human hand coordination We are too slow relative to the rate of reading mouse status First computer mouse
CS2100 Input/Output 8 MAGNETIC DISK Purpose: Long term, nonvolatile storage Large, inexpensive, and slow Lowest level in the memory hierarchy Basic Idea: Rely on a rotating platter coated with a magnetic surface Use a moveable read/write head to access the disk Registers Cache Memory Disk
CS2100 Input/Output 9 MAGNETIC DISK HISTORY Al Hoagland stands with the RAMAC he helped create five decades ago. A 2.5-inch laptop drive with the cover removed, a standard size in most laptops today, shown in the center of a 14-inch magnetic oxide coated disk, which was the standard size in the 1960’s and 1970’s.
CS2100 Input/Output 10 HARD DISK DRIVE EVOLUTION
CS2100 Input/Output 11 MAGNETIC DISK Typical numbers (depending on the disk size): 500 to 2,000 tracks per surface 32 to 128 sectors per track A sector is the smallest unit that can be read or written Traditionally all tracks have the same number of sectors, so bit density is lower on the longer outer tracks Recently relaxed: constant bit density records more sectors on the outer tracks, so the transfer rate varies with track location Platters Track Sector
CS2100 Input/Output 12 MAGNETIC DISK CHARACTERISTICS Cylinder: all the tracks under the heads at a given point on all surfaces Reading/writing data is a three-stage process: Seek time: position the arm over the proper track Rotational latency: wait for the desired sector to rotate under the read/write head Transfer time: transfer a block of bits (sector) under the read/write head Average seek time as reported by the industry: Typically in the range of 8 ms to 12 ms (Sum of the times for all possible seeks) / (total # of possible seeks) Due to locality of disk references, the actual average seek time may be only 25% to 33% of the advertised number Sector Track Cylinder Head Platter
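The industry averaging formula above (sum over all possible seeks divided by the number of possible seeks) can be illustrated with a small sketch. It measures seek distance in tracks as a stand-in for seek time, which is a simplifying assumption: real seek time is not strictly proportional to distance. The function name and the 500-track surface are illustrative choices.

```python
def average_seek_distance(num_tracks):
    """Mean |src - dst| over every ordered pair of tracks (all possible seeks)."""
    total = sum(abs(src - dst)
                for src in range(num_tracks)
                for dst in range(num_tracks))
    return total / (num_tracks * num_tracks)

tracks = 500  # low end of the "500 to 2,000 tracks per surface" range
avg = average_seek_distance(tracks)
print(f"average seek distance: {avg:.1f} tracks "
      f"({avg / tracks:.1%} of the full stroke)")
```

Under this uniform assumption the average seek covers about one third of the tracks. Locality means real workloads seek far shorter distances than that, which is why the observed average can be well below the advertised one.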
CS2100 Input/Output 13 MAGNETIC DISK: TYPICAL NUMBERS Rotational Latency: Most disks rotate at 3,600 to 7,200 RPM Approximately 16.7 ms to 8.3 ms per revolution, respectively The average latency to the desired information is halfway around the disk: about 8.3 ms at 3,600 RPM, 4.2 ms at 7,200 RPM Transfer Time is a function of: Transfer size (usually a sector): 1 KB / sector Rotation speed: 3,600 RPM to 7,200 RPM Recording density: bits per inch on a track Diameter: typically ranges from 2.5 to 5.25 in Typical values: 2 to 12 MB per second Sector Track Cylinder Head Platter
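Putting the three stages together, the time to access one sector is seek time plus rotational latency plus transfer time. The sketch below is a rough calculator under stated assumptions: the function names are illustrative, 1 MB is taken as 1024 KB, and the example inputs (10 ms seek, 7,200 RPM, 1 KB sector, 4 MB/sec) are picked from within the ranges quoted on the slides rather than taken from a specific drive.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2

def transfer_time_ms(size_kb, rate_mb_per_s):
    """Time to move size_kb at rate_mb_per_s (assuming 1 MB = 1024 KB)."""
    return size_kb / (rate_mb_per_s * 1024) * 1000

def access_time_ms(seek_ms, rpm, size_kb, rate_mb_per_s):
    """Seek + average rotational latency + transfer, in milliseconds."""
    return (seek_ms
            + rotational_latency_ms(rpm)
            + transfer_time_ms(size_kb, rate_mb_per_s))

example = access_time_ms(seek_ms=10, rpm=7200, size_kb=1, rate_mb_per_s=4)
print(f"latency at 3600 RPM: {rotational_latency_ms(3600):.2f} ms")
print(f"latency at 7200 RPM: {rotational_latency_ms(7200):.2f} ms")
print(f"example access time: {example:.2f} ms")
```

With these inputs the mechanical terms (seek and rotation) dominate, and the transfer of a single sector contributes only a fraction of a millisecond.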
CS2100 Input/Output 14 NETWORKS Medium to communicate between computers Characteristics Distance: 0.01 to 10,000 km Speed: to 100 MB/sec Topology: Bus, Ring, Star, Tree Examples RS232 standard – star topology, slow LAN – bus topology, 10 Mbit/sec
CS2100 Input/Output 15 I/O SYSTEM Bus is the connection between Processor, Memory and I/O Communication between Processor and devices is via bus protocols and interrupts Processor Cache Memory - I/O Bus Main Memory I/O Controller Disk I/O Controller I/O Controller Graphics Network interrupts
CS2100 Input/Output 16 BUSES Consists of control and data lines Control lines: Signal requests and acknowledgments Data lines: Carry information between the source and the destination Bus Transactions Sending the address Receiving or sending the data Advantages Versatility: single connection scheme for easy add-ons Low cost: single set of wires shared in multiple ways Disadvantages Communication bottleneck: bandwidth limits the maximum I/O throughput Devices may not always be able to use the bus when they need to
CS2100 Input/Output 17 TYPES OF BUSES Processor-Memory Bus (design specific) Short and high speed Only needs to match the memory system Maximizes memory-to-processor bandwidth Connects directly to the processor Optimized for cache block transfers I/O Bus (industry standard) Usually longer and slower Needs to match a wide range of I/O devices Connects to the processor-memory bus or backplane bus Backplane Bus (standard or proprietary) Backplane: an interconnection structure within the chassis Allows processors, memory, and I/O devices to coexist Cost advantage: one bus for all components
CS2100 Input/Output 18 A THREE-BUS SYSTEM A small number of backplane buses tap into the processor-memory bus Processor-memory bus is used for processor-memory traffic I/O buses are connected to the backplane bus Advantage: loading on the processor bus is greatly reduced [Figure: Processor and Memory on the Processor-Memory Bus; a Bus Adaptor connects it to the Backplane Bus; further Bus Adaptors connect the Backplane Bus to the I/O Buses]
CS2100 Input/Output 19 EXAMPLE: PENTIUM SYSTEM ORGANISATION Processor/Memory Bus PCI Bus [Backplane] I/O Busses [IDE, SCSI]
CS2100 Input/Output 20 OBTAINING ACCESS TO BUS Bus Master – Processor Controls access to bus Must initiate and control all bus requests Slave Responds to read and write requests Drawback of using single master Processor is involved in all requests Alternative schemes Multiple bus masters Mechanism for arbitrating access to the bus needed
CS2100 Input/Output 21 BUS ARBITRATION Bus arbitration scheme: A bus master wanting to use the bus asserts the bus request A bus master cannot use the bus until its request is granted A bus master must release the bus back to the arbiter after finishing the transaction Bus arbitration schemes usually try to balance two factors: Bus priority: the highest priority device should be serviced first Fairness: Even the lowest priority device should never be completely locked out from the bus
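The request, grant and release steps, and the tension between priority and fairness, can be sketched with a toy arbiter. This is one possible policy (fixed priority plus aging) chosen for illustration; the slides do not prescribe it. The class name `Arbiter`, the master names and the priority values are all hypothetical.

```python
class Arbiter:
    """Toy bus arbiter: fixed priority plus aging, so no master is starved."""

    def __init__(self, priorities):
        self.base = dict(priorities)                # master -> base priority
        self.waiting = {m: 0 for m in priorities}   # rounds spent waiting
        self.owner = None                           # current bus master

    def grant(self, requests):
        """Grant the bus to one requester; None if the bus is busy or idle."""
        if self.owner is not None or not requests:
            return None
        # Effective priority grows while a master waits (aging);
        # ties go to the master with the higher base priority.
        winner = max(requests,
                     key=lambda m: (self.base[m] + self.waiting[m], self.base[m]))
        for master in requests:
            self.waiting[master] += 1
        self.waiting[winner] = 0
        self.owner = winner
        return winner

    def release(self, master):
        """The current master hands the bus back after its transaction."""
        if self.owner != master:
            raise ValueError(f"{master} does not own the bus")
        self.owner = None


arbiter = Arbiter({"disk": 3, "net": 2, "kbd": 0})
grants = []
for _ in range(6):
    master = arbiter.grant(["disk", "net", "kbd"])  # everyone requests each round
    grants.append(master)
    arbiter.release(master)
print(grants)
```

The high-priority master wins most rounds, but the waiting counters eventually lift the lowest-priority master above the others, so it is never locked out completely.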
CS2100 Input/Output 22 GIVING COMMANDS TO I/O DEVICES Two methods are used to address the device: Special I/O instructions Memory-mapped I/O Special I/O instructions specify: Both the device number and the command word Device number: the processor communicates this via a set of wires normally included as part of the I/O bus Command word: this is usually sent on the bus’s data lines Memory-mapped I/O: Portions of the address space are assigned to I/O devices Reads and writes to those addresses are interpreted as commands to the I/O devices User programs are prevented from issuing I/O operations directly: The I/O address space is protected by the address translation
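Memory-mapped I/O can be pictured as address decoding: the same load and store operations reach either RAM or a device register, depending on the address. The sketch below simulates this in software. The address map (`RAM_SIZE`, `UART_BASE`), the `ToyUart` device and its two registers are hypothetical, made up for illustration only.

```python
RAM_SIZE = 0x1000    # hypothetical: addresses 0x0000-0x0FFF are ordinary RAM
UART_BASE = 0x8000   # hypothetical: two device registers start here

class ToyUart:
    """Hypothetical device: data register at offset 0, status at offset 1."""

    def __init__(self):
        self.sent = []

    def read(self, offset):
        return 1 if offset == 1 else 0   # status register always reports "ready"

    def write(self, offset, value):
        if offset == 0:                  # a write to the data register is a command
            self.sent.append(value)

class Bus:
    """Decodes each address to RAM or to the device."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.uart = ToyUart()

    def store(self, addr, value):
        if 0 <= addr < RAM_SIZE:
            self.ram[addr] = value
        elif UART_BASE <= addr < UART_BASE + 2:
            self.uart.write(addr - UART_BASE, value)
        else:
            raise ValueError(f"unmapped address {addr:#x}")

    def load(self, addr):
        if 0 <= addr < RAM_SIZE:
            return self.ram[addr]
        if UART_BASE <= addr < UART_BASE + 2:
            return self.uart.read(addr - UART_BASE)
        raise ValueError(f"unmapped address {addr:#x}")


bus = Bus()
bus.store(0x0010, 65)        # ordinary memory write
bus.store(UART_BASE, 72)     # same operation, but interpreted by the device
print(bus.load(0x0010), bus.uart.sent, bus.load(UART_BASE + 1))
```

In a real system the protection mentioned on the slide would come from the address translation hardware: user pages would simply never map the device range.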
CS2100 Input/Output 23 I/O DEVICE NOTIFYING THE OS The OS needs to know when: The I/O device has completed an operation The I/O operation has encountered an error This can be accomplished in two different ways: Polling: The I/O device puts information in a status register The OS periodically checks the status register I/O Interrupt: Whenever an I/O device needs attention from the processor, it interrupts the processor from what it is currently doing.
CS2100 Input/Output 24 POLLING: PROGRAMMED I/O Advantage: Simple: the processor is totally in control and does all the work Disadvantage: Polling overhead can consume a lot of CPU time CPU IOC device Memory busy wait loop not an efficient way to use the CPU unless the device is very fast! but checks for I/O completion can be dispersed among computation intensive code Is the data ready? read data store data yes no done? no yes
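The busy-wait loop in the flowchart can be sketched as follows. The device is simulated: `SlowDevice` is a hypothetical stand-in whose data becomes ready only after a fixed number of status checks, so the count of wasted polls is visible.

```python
class SlowDevice:
    """Hypothetical device: data is ready only after `delay` status checks."""

    def __init__(self, data, delay):
        self.data = data
        self.remaining = delay

    def status_ready(self):
        """Models reading the status register once."""
        if self.remaining > 0:
            self.remaining -= 1
            return False
        return True

    def read_data(self):
        return self.data

def polled_read(device):
    """Busy-wait until the device is ready, then read the data."""
    wasted_polls = 0
    while not device.status_ready():   # "Is the data ready?" -> no
        wasted_polls += 1              # the CPU does no useful work here
    return device.read_data(), wasted_polls


value, wasted = polled_read(SlowDevice(data=0x41, delay=1000))
print(f"read {value:#x} after {wasted} unsuccessful polls")
```

Every unsuccessful poll is processor time that could have gone to computation, which is the overhead the slide warns about for slow devices.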
CS2100 Input/Output 25 INTERRUPT DRIVEN DATA TRANSFER Advantage: User program progress is only halted during actual transfer Disadvantage: special hardware is needed to: Cause an interrupt (I/O device) Detect an interrupt (processor) Save the proper states to resume after the interrupt (processor) CPU IOC device Memory add sub and or nop read store... rti memory user program (1) I/O interrupt (2) save PC (3) interrupt service addr interrupt service routine (4) :
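The four numbered steps in the figure (interrupt arrives, PC is saved, control jumps to the service routine, `rti` returns) can be traced with a tiny simulation. This is a sketch only: instructions are just names in a list, and the `run` function and its `interrupt_at` parameter are illustrative assumptions.

```python
def run(program, interrupt_at, isr):
    """Run `program`; one interrupt is raised while instruction
    number `interrupt_at` is executing. Returns the execution trace."""
    trace = []
    pc = 0
    pending = False
    while pc < len(program):
        trace.append(program[pc])      # the current instruction always completes
        if pc == interrupt_at:
            pending = True             # (1) device raises the interrupt
        pc += 1
        if pending:
            saved_pc = pc              # (2) save the PC of the next instruction
            trace.extend(isr)          # (3) run the interrupt service routine
            pc = saved_pc              # (4) rti restores the saved PC
            pending = False
    return trace


user_program = ["add", "sub", "and", "or", "nop"]
service_routine = ["read", "store", "rti"]
trace = run(user_program, interrupt_at=2, isr=service_routine)
print(trace)
```

The user program loses only the time spent inside the service routine and then continues exactly where it left off.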
CS2100 Input/Output 26 I/O INTERRUPT An I/O interrupt is asynchronous with respect to instruction execution: I/O interrupt is not associated with any instruction I/O interrupt does not prevent any instruction from completing The processor can pick a convenient point to take an interrupt I/O interrupt is complicated: Needs to convey the identity of the device generating the interrupt Interrupt requests can have different urgencies: Interrupt requests need to be prioritized
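Prioritizing pending requests while carrying the identity of the requesting device can be sketched with a priority queue. The helper names, device names and priority values below are illustrative assumptions; real hardware typically does this with an interrupt controller rather than software.

```python
import heapq

pending = []  # min-heap of (-priority, arrival order, device)
_arrivals = 0

def raise_interrupt(priority, device):
    """Record a request; a larger `priority` means more urgent."""
    global _arrivals
    heapq.heappush(pending, (-priority, _arrivals, device))
    _arrivals += 1

def next_interrupt():
    """Return the identity of the most urgent pending device."""
    _, _, device = heapq.heappop(pending)
    return device


raise_interrupt(1, "keyboard")
raise_interrupt(5, "disk")
raise_interrupt(3, "network")
order = [next_interrupt() for _ in range(3)]
print(order)
```

Priorities are negated because `heapq` is a min-heap, and the arrival counter keeps requests of equal urgency in first-come order.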
CS2100 Input/Output 27 DELEGATING I/O RESPONSIBILITY FROM THE CPU: DMA Direct Memory Access (DMA): External to the CPU Acts as a master on the bus Transfers blocks of data to or from memory without CPU intervention CPU IOC device Memory DMAC CPU sends a starting address, direction, and length count to DMAC. Then issues “start”. DMAC provides handshake signals for Peripheral Controller, and Memory Addresses and handshake signals for Memory.
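The programming sequence described above (starting address, direction, length count, then "start") can be sketched as a small simulation. The `DMAController` class, its method names and the `done` flag are hypothetical; the block copy stands in for the bus transfers a real DMAC would perform on its own.

```python
class DMAController:
    """Toy DMAC: the CPU programs it, then it moves the whole block itself."""

    def __init__(self, memory):
        self.memory = memory
        self.start_addr = 0
        self.to_memory = True
        self.length = 0
        self.done = False     # stands in for the completion interrupt

    def program(self, start_addr, to_memory, length):
        """CPU writes the starting address, direction and length count."""
        if start_addr < 0 or start_addr + length > len(self.memory):
            raise ValueError("transfer falls outside memory")
        self.start_addr = start_addr
        self.to_memory = to_memory
        self.length = length
        self.done = False

    def start(self, device_buffer):
        """CPU issues "start"; the transfer then needs no CPU involvement."""
        if len(device_buffer) < self.length:
            raise ValueError("device buffer shorter than transfer length")
        end = self.start_addr + self.length
        if self.to_memory:
            self.memory[self.start_addr:end] = device_buffer[:self.length]
        else:
            device_buffer[:self.length] = self.memory[self.start_addr:end]
        self.done = True


memory = bytearray(16)
dmac = DMAController(memory)
dmac.program(start_addr=4, to_memory=True, length=3)
dmac.start(bytearray(b"abc"))     # device -> memory
print(bytes(memory[4:7]), dmac.done)
```

The CPU's work is limited to the two calls at the end; in a real system it would run other code until the controller signals completion.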
CS2100 Input/Output 28 SUMMARY I/O performance is limited by the weakest link in the chain between OS and device Wide range of devices Bus hierarchy and arbitration I/O device notifying the operating system: Polling: it can waste a lot of processor time I/O interrupt: similar to an exception except that it is asynchronous Delegating I/O responsibility from the CPU: DMA
CS2100 Input/Output 29 END