D75P 34 – HNC Computer Architecture Week 8 Direct Memory Access. © C Nyssen/Aberdeen College 2003 All images © C Nyssen/Aberdeen College except where stated Prepared 20/10/03
Direct Memory Access (DMA) is an I/O technique enabling high-speed data transfers. A device is allowed to take over the main computer bus from the CPU and transfer bytes directly to/from main memory. This is also called Autonomous Data Transfer.
Normally the CPU would make such a transfer in a two step process:
1. reading from the I/O memory space of the device and putting these bytes into the CPU itself
2. writing these bytes from the CPU to main memory
With DMA it's usually a one step process of sending the bytes directly from the device to memory.
A transfer using DMA is set up by a DMAC (Direct Memory Access Controller). Modern PC systems which still utilise ISA busses require DMAC chips to use this system.
The device must have such capabilities built into its hardware and thus not all devices can use DMA. Devices do not normally install to use DMA by default, and have to be manually set to utilise it.
Because the data can pass directly between the RAM and the device, this both saves time and frees the processor to get on with other things. The DMAC will either stop the CPU and access the RAM (cycle stealing DMA) or use the bus when the CPU does not require it (hidden cycle DMA). One of the limitations of DMA is that the DMAC still uses the same busses as the CPU, and only one of them can use a bus at a time.
1. The DMAC makes a request to the CPU
2. The CPU hands over control of the bus
3. The DMAC reads the device and transfers the data to memory
4. The DMAC hands back control to the CPU at the end of the operation
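The four steps above can be sketched as a toy simulation. This is only an illustration of the hand-over sequence, not a model of real hardware: the `Bus` class, the `dma_transfer` function and the sample data are all hypothetical names made up for this example.

```python
class Bus:
    """Toy model of a shared bus: only one owner at a time."""

    def __init__(self):
        self.owner = "CPU"


def dma_transfer(bus, device_data, memory, start):
    """Copy device_data into memory at start, logging each step."""
    log = []
    # Step 1: the DMAC makes a request to the CPU.
    log.append("DMAC requests the bus")
    # Step 2: the CPU hands over control of the bus.
    bus.owner = "DMAC"
    log.append("CPU hands over the bus")
    # Step 3: the DMAC reads the device and writes straight to memory,
    # without the bytes passing through the CPU.
    for offset, byte in enumerate(device_data):
        memory[start + offset] = byte
    log.append(f"DMAC copied {len(device_data)} bytes")
    # Step 4: the DMAC hands control back to the CPU.
    bus.owner = "CPU"
    log.append("DMAC hands back the bus")
    return log


bus = Bus()
memory = bytearray(16)          # hypothetical 16-byte main memory
steps = dma_transfer(bus, b"\x01\x02\x03\x04", memory, start=4)
for step in steps:
    print(step)
```

Note that the bus has a single owner throughout, which reflects the limitation mentioned earlier: the CPU and the DMAC cannot both use the bus at once.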
The IBM series (successors and clones) all use 2 x Intel 8237 chips, mounted on one circuit, as a DMAC. Each chip provides 4 separate “lines”, or channels, giving a total of 8. DMA channels are numbered 0 – 7. Channel 4 is never used by devices as it is used by the DMAC itself!
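The channel numbering can be illustrated with a short sketch. It assumes that channels 0 – 3 belong to the first 8237 and channels 4 – 7 to the second; the function and constant names are hypothetical and exist only for this example.

```python
CHANNELS_PER_CHIP = 4   # each 8237 provides 4 channels
TOTAL_CHANNELS = 8      # two chips, channels numbered 0 - 7
RESERVED_CHANNEL = 4    # used by the DMAC itself, so not for devices


def chip_for_channel(channel):
    """Return (chip index, line on that chip) for a channel number.

    Assumes channels 0-3 sit on chip 0 and channels 4-7 on chip 1.
    """
    if not 0 <= channel < TOTAL_CHANNELS:
        raise ValueError("DMA channels are numbered 0 to 7")
    return channel // CHANNELS_PER_CHIP, channel % CHANNELS_PER_CHIP


# Channels a device could be given: everything except channel 4.
usable_channels = [ch for ch in range(TOTAL_CHANNELS) if ch != RESERVED_CHANNEL]
print(usable_channels)
```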
Like IRQ lines, each DMA-enabled device should have its own channel. It is sometimes possible (but risky) to force two devices to share a channel, but only one can use it at a time.
Devices on a PCI bus don’t need the DMAC. Instead they use a system called Bus Mastering. This is, confusingly, sometimes called DMA as well - for example, hard disk drives may be marketed as "UDMA".
Photo © ExHardware, with permission
Bus Mastering transfers bytes in exactly the same way as DMA, but doesn’t need the channel numbers. This is why you won’t see any resource allocation of DMA channels for items on a PCI bus. This controller card fits into a PCI slot, and lets you add more devices if you have run out of IDE sockets. It is marketed as an ATA-133 standard card, another form of DMA that can transfer data at up to 133 MB/second.
Photo © Alchemy, with permission
As part of your assessment, you will have to draw a graph showing the difference in efficiency between DMA-enabled and non-enabled devices. “A system can write from RAM to a hard drive at a rate of 10 bytes/ms where DMA cannot be used.” How long will it take to write a typical 512-byte sector?
512 bytes / 10 bytes per ms = 51.2 ms. We can plot this on a graph like this…
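The points for such a graph can be worked out with a few lines of code. This is a minimal sketch using the 10 bytes/ms rate from the question; the sample sizes chosen for the x-axis and the function name are assumptions made for illustration.

```python
# Rate from the worked example: 10 bytes per millisecond, no DMA.
RATE_BYTES_PER_MS = 10


def transfer_time_ms(num_bytes, rate=RATE_BYTES_PER_MS):
    """Time in ms to move num_bytes at a constant rate (bytes per ms)."""
    return num_bytes / rate


# Hypothetical sample sizes for the x-axis of the graph.
sizes = [0, 512, 1024, 2048]
points = [(n, transfer_time_ms(n)) for n in sizes]

for n, t in points:
    print(f"{n:5d} bytes -> {t:6.1f} ms")
```

Because the time is simply the size divided by a constant rate, the points lie on a straight line through the origin.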
You will then be given a breakdown of the time the data spends passing through each component buffer. “The data spends 25% of this time passing through RAM, 25% of the time through the hard drive buffer and the remaining 50% of time through the CPU.” Of the 51.2 ms total, the data would spend 25.6 ms in the CPU, 12.8 ms in the RAM buffer and 12.8 ms in the hard drive buffer.
At the moment, the total time taken is in direct proportion to the amount of data transferred. So 2 KB (2048 bytes) of data would take 204.8 milliseconds to be written. The proportions would be the same – 102.4 ms in the CPU, 51.2 ms each in the RAM and Hard Drive buffers. Because the amount of time taken will always vary in this way, this is a “Variable Cost”.
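The breakdown scales with the amount of data, which a short sketch can confirm. It uses the percentages and the 10 bytes/ms rate quoted in the example; the dictionary and function names are hypothetical.

```python
RATE_BYTES_PER_MS = 10

# Share of the total time spent in each component, from the example.
SHARES = {"CPU": 0.50, "RAM": 0.25, "HDD buffer": 0.25}


def breakdown_ms(num_bytes, rate=RATE_BYTES_PER_MS):
    """Split the total transfer time between the components."""
    total = num_bytes / rate
    return {part: total * share for part, share in SHARES.items()}


for size in (512, 2048):
    parts = breakdown_ms(size)
    summary = ", ".join(f"{part}: {ms:.1f} ms" for part, ms in parts.items())
    print(f"{size} bytes -> {summary}")
```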
If we were to use DMA or Bus Mastering, we would save the time that the data would normally spend passing through the CPU. Our original 512 bytes of data would now only take 25.6ms to transfer – 12.8ms in the RAM and 12.8ms in the HDD buffer. We have dispensed with the CPU time completely, thus saving 25.6ms.
This is still totally variable – the 2kB of data would take 102.4ms (51.2ms in the RAM, 51.2ms in the HDD buffer, no time in the CPU). By dispensing with the CPU, we can transfer the same amount of data in 50% of the time. Or alternatively, double the amount of data in the same amount of time!
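The saving can be checked with a small calculation. This sketch assumes, as the example does, that DMA removes the whole 50% CPU share and leaves the RAM and hard drive buffer times unchanged; the function names are made up for illustration.

```python
RATE_BYTES_PER_MS = 10
CPU_SHARE = 0.50   # fraction of the time spent in the CPU without DMA


def time_without_dma_ms(num_bytes):
    """Variable cost when every byte passes through the CPU."""
    return num_bytes / RATE_BYTES_PER_MS


def time_with_dma_ms(num_bytes):
    """Variable cost when the CPU share is skipped (setup time ignored)."""
    return time_without_dma_ms(num_bytes) * (1 - CPU_SHARE)


for size in (512, 2048):
    print(f"{size} bytes: {time_without_dma_ms(size):.1f} ms without DMA, "
          f"{time_with_dma_ms(size):.1f} ms with DMA")
```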
There is, however, something to pay for this extra speed. Some time has to be allowed to set up the DMAC or Bus Mastering so that the CPU can be bypassed. This is a “Fixed Cost” because it always takes the same amount of time, no matter how much data is being transferred.
“The DMA controller requires 2ms setup time”. This will always take 2ms, whether 512 bytes or 512 gigabytes are being transferred.
Our original 512 bytes will now take 27.6 ms to transfer – (12.8ms in HDD buffer) + (12.8ms in RAM) + (2.0ms for DMAC setup). The 2kB will take 104.4 milliseconds - (51.2ms in HDD buffer) + (51.2ms in RAM) + (2.0ms for DMAC setup). The total time taken consists of two separate elements – Variable time-cost (depending on amount of data being transferred through buffers) plus Fixed time-cost (to set up the transfer).
To show the whole measurement on the graph, we have to add the 2ms on to every single point on the variable cost line. This has the effect of a vertical shift on the graph of 2 ms.
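Both lines of the graph can be generated from the figures in the example. This is a sketch only: it assumes the 10 bytes/ms rate, the 50% CPU share and the 2 ms setup time quoted above, and the function names and sample sizes are hypothetical.

```python
RATE_BYTES_PER_MS = 10
CPU_SHARE = 0.50      # share of time saved by bypassing the CPU
DMA_SETUP_MS = 2.0    # fixed cost, the same for any amount of data


def no_dma_ms(num_bytes):
    """Variable cost only: every byte passes through the CPU."""
    return num_bytes / RATE_BYTES_PER_MS


def dma_ms(num_bytes):
    """Fixed setup cost plus the reduced variable cost."""
    variable = (num_bytes / RATE_BYTES_PER_MS) * (1 - CPU_SHARE)
    return DMA_SETUP_MS + variable


# Hypothetical sample sizes for the x-axis.
for size in (0, 40, 512, 2048):
    print(f"{size:5d} bytes: no DMA {no_dma_ms(size):6.1f} ms, "
          f"DMA {dma_ms(size):6.1f} ms")
```

With these particular figures the DMA line starts at 2 ms rather than at zero, and the two lines cross at 40 bytes (4 ms each). Below that size the setup time outweighs the saving; above it, DMA is faster.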
Summary
DMA stands for "Direct Memory Access". This is where a device is allowed to take over the main computer bus from the CPU and transfer bytes directly to main memory.
There are two “sorts” of DMA –
- DMAC chip control for ISA busses
- Bus Mastering for PCI busses
Only DMA-enabled devices can use this system.
Some time has to be allocated to set up the DMAC or Bus Mastering, but the data saves time by not having to pass through the CPU.