Presentation is loading. Please wait.

Presentation is loading. Please wait.

Adapted from Computer Organization and Design, Patterson & Hennessy ECE232: Hardware Organization and Design Part 17: Input/Output Chapter 6

Similar presentations


Presentation on theme: "Adapted from Computer Organization and Design, Patterson & Hennessy ECE232: Hardware Organization and Design Part 17: Input/Output Chapter 6"— Presentation transcript:

1 Adapted from Computer Organization and Design, Patterson & Hennessy ECE232: Hardware Organization and Design Part 17: Input/Output Chapter 6 http://www.ecs.umass.edu/ece/ece232/

2 ECE232: I/O-1 2 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Anatomy: 5 components of any Computer Memory Devices Input Output Keyboard, Mouse Display, Printer Disk Processor Control Datapath Processor Cache Memory - I/O Bus Main Memory I/O Controller Disk I/O Controller I/O Controller Graphics Network interrupts

3 ECE232: I/O-1 3 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Handling IO  Users like to connect devices to their computers Keyboard, mouse, printer…  External devices may require attention from processor at unpredictable times CPU doesn’t know when you’re about to hit a key  IO devices can be very fast or very slow  Need to have a flexible way to control all devices

4 ECE232: I/O-1 4 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 I/O Device Examples and Speeds  I/O Speed: bytes transferred per second (from mouse to display: million-to-1) DeviceBehaviorPartner Data Rate (Mbit/sec) KeyboardInputHuman0.0001 MouseInputHuman0.0038 Laser PrinterOutput Human 3.2000 Magnetic DiskStorageMachine240-2560 ModemI or O Machine0.016-0.064 Network-LANI or OMachine100-1000 Graphics Display OutputHuman800-8000 See Fig. 6.2 Text

5 ECE232: I/O-1 5 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Hardware Solution (875 Chipset)

6 ECE232: I/O-1 6 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Disk Device Terminology  Several platters, with information recorded magnetically on both surfaces (usually)  Bits recorded in tracks, which in turn are divided into sectors (e.g., 512 Bytes)  Actuator moves head (end of arm, 1/surface) over track (“seek”), select surface, wait for sector rotate under head, then read or write “Cylinder”: all tracks under heads Platter Outer Track Inner Track SectorHeadArm Actuator

7 ECE232: I/O-1 7 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Disk Device Performance  Disk Latency = Seek Time + Rotation Time + Transfer Time + Controller Overhead  Seek Time - depends on no. tracks arm moves, seek speed  Average no. tracks arm moves? Sum all possible seek distances from all possible tracks / total # Assumes average seek distance is random Disk industry standard benchmark  Rotation Time - depends on rotation speed, how far sector is from head  1/2 time of a rotation Example: 7200 Revolutions Per Minute  120 Rev/sec 1 revolution = 1/120 sec  8.33 milliseconds 1/2 rotation (revolution)  4.16 ms  Transfer Time - depends on data rate (bandwidth) of disk (bit density), size of request

8 ECE232: I/O-1 8 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Disk Performance Model /Trends  Capacity + 100%/year (2X/1 yr)  Transfer rate (BW) + 40%/year (2X/2 yrs)  Rotation + Seek time – 8%/year (1/2 in 10 yrs)  MB/$ > 100%/yr (2X/<1.5 yr)

9 ECE232: I/O-1 9 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Disk Performance  Calculate time to read 1 sector (512B) for UltraStar 72 using advertised performance; sector is on outer track  Disk latency = average seek time + average rotational delay + transfer time + controller overhead = 5.3 ms + 0.5 * 1/(10000 RPM) + 0.5 KB / (50 MB/s) + 0.15 ms = 5.3 + 3.0 + 0.10 + 0.15 ms = 8.55 ms

10 ECE232: I/O-1 10 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Instruction Set Architecture for I/O  Some machines have special input and output instructions  Alternative model (used by MIPS): Input: ~ reads a sequence of bytes Output: ~ writes a sequence of bytes  Memory also a sequence of bytes, so use loads for input, stores for output Called “Memory Mapped Input/Output”  A portion of the address space dedicated to communication paths to Input or Output devices (no memory there)  These addresses are not regular memory, instead, they correspond to registers in I/O devices 0 0xFFFFFFFF 0xFFFF0000 cmd reg. data reg. address

11 ECE232: I/O-1 11 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Memory Mapped IO  Make control registers and I/O device data registers appear to be part of the system’s main memory Reads and writes to the mapped region of the memory are translated by memory controller hardware into accesses of hardware device Makes it easy to support variable numbers/types of devices – just map them onto different regions of memory  Accessing I/O device registers and memory can be done by accessing data structures via the device pointers Most device drivers are now written in C/C++. Memory mapped I/O makes this feasible without any changes to the way a CPU is programmed

12 ECE232: I/O-1 12 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Processor-I/O Speed Mismatch  1 GHz microprocessor can execute 1000 million load or store instructions per second, or 4 million KB/s data rate I/O devices from 0.01 KB/s to 30,000 KB/s  Input: device may not be ready to send data as fast as the processor loads it Also, might be waiting for human to act  Output: device may not be ready to accept data as fast as processor stores it  What to do?

13 ECE232: I/O-1 13 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Processor Checks Status before Acting: Polling  Path to device generally has 2 registers: 1 register says it’s OK to read/write (I/O ready), often called Control Register 1 register that contains data, often called Data Register  Processor reads from Control Register in loop, waiting for device to set Ready bit in Control reg to say its OK (0  1)  Processor then loads from (input) or writes to (output) data register Load from device/Store into Data Register resets Ready bit (1  0) of Control Register

14 ECE232: I/O-1 14 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Cost of Polling?  Assume: a 1 GHz processor takes 400 clock cycles for a polling operation (call polling routine, accessing the device, and returning). Determine % of processor time for polling Mouse: polled 30 times/sec - not to miss user movement Hard disk: transfers data in 16-byte chunks and can transfer at 8 MB/second. No transfer can be missed  Mouse Polling Clocks/sec = 30 * 400 = 12000 clocks/sec  % Processor for polling = 12*10 3 /1*10 9 = 0.0012%  Polling mouse has little impact on processor  Times Polling Disk/sec = 8 MB/s /16B = 500K polls/sec  Disk Polling Clocks/sec = 500K * 400 = 200,000,000 clocks/sec  % Processor for polling: 2*10 8 /1*10 9 = 20%  Unacceptable

15 ECE232: I/O-1 15 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 What is the alternative to polling? Interrupt  Wasteful to have processor spend most of its time “spin- waiting” for I/O to be ready  Wish we could have an unplanned procedure call that would be invoked only when I/O device is ready  Solution: use exception mechanism to help I/O. Interrupt program when I/O ready, return when done with data transfer Polling is like picking up the phone every few seconds to see if you have a call. Interrupt is like letting the phone ring

16 ECE232: I/O-1 16 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 I/O Interrupt  Controller sends interrupt to the processor along with additional information which device nature of interrupt: error, no paper, no ink,…  Processor halts execution of current program  Saves State  Processor looks up which handler to start from the interrupt information  When interrupt is handled, returns to program state and resumes

17 ECE232: I/O-1 17 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Interrupt Driven Data Transfer (1) I/O interrupt (2) save PC (3) interrupt service addr Memory add sub and or user program read store... jr interrupt service routine (4) (5)  

18 ECE232: I/O-1 18 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Benefit of Interrupt-Driven I/O  500 clock cycle overhead for each transfer, including interrupt. Find the % of processor consumed if the hard disk is only active 5% of the time  If interrupt rate = polling rate Disk Interrupts/sec = 8 MB/s /16B = 500K interrupts/sec Disk Polling Clocks/sec = 500K * 500 = 250,000,000 clocks/sec % Processor used during transfers: 250*10 6 /1*10 7 = 25%  If disk active 5%  5% * 25%  1.25% busy

19 ECE232: I/O-1 19 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Interrupts – Multiple devices  Aggregates interrupts  Prioritization (network, keyboard,..) Processor Advanced Priority Interrupt Controller (APIC) Device 1 Device 2 Device n Device i

20 ECE232: I/O-1 20 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Interrupt vs. Polling  Which is better: Interrupts or Polling? Interrupts are better if the processor has something else to do and the time-to-response is not critical  Polling is better if the processor has to respond to an event ASAP Polling is also used when data is expected at regular intervals such as in a modem Modem typically connects to a “com” port The “com” port can be polled at expected intervals

21 ECE232: I/O-1 21 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 Direct Memory Access (DMA)  How to transfer large amounts of data between a Device and Memory? Waste of CPU cycles if done through CPU  Let the device controller transfer data directly to and from memory => DMA  The CPU sets up the DMA transfer by supplying the type of operation, memory address and number of bytes to be transferred  The DMA controller contacts the bus directly, provides memory address and transfers the data  Once the DMA transfer is complete, the controller interrupts the CPU to inform completion  Cycle Stealing – Bus gives priority to DMA controller thus stealing cycles from the CPU

22 ECE232: I/O-1 22 Adapted from Computer Organization and Design, Patterson&Hennessy, Kundu,UMass Koren, 2011 OS control of I/O operations  Low-level control of I/O device is complex because it requires managing a set of concurrent events and because requirements for correct device control are often very detailed  I/O systems often use interrupts to communicate information about I/O operations and these can occur at a random time  The I/O system is shared by multiple programs using the processor  Would like I/O services for all user programs under safe control


Download ppt "Adapted from Computer Organization and Design, Patterson & Hennessy ECE232: Hardware Organization and Design Part 17: Input/Output Chapter 6"

Similar presentations


Ads by Google