Using Direct Memory Access to Improve Performance

ARM University Program, Copyright © ARM Ltd 2013

Overview
- Basic Concepts
- DMA Peripherals in STM32F4
- DMA Applications
  - Data Transfer
  - Replacing ISRs

Basic Concepts
- Special hardware reads data from a source and writes it to a destination
- Various configurable options
  - Number of data items to copy
  - Source and destination addresses, each either fixed or changing (e.g. increment, decrement)
  - Size of data item
  - When the transfer starts
- Operation
  - Initialization: configure the controller
  - Transfer: data is copied
  - Termination: the channel indicates that the transfer has completed
  - Post-transfer operation (e.g. assert an interrupt)
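The configurable options above can be pictured with a small software model. This is only a host-side sketch of what a DMA controller does in hardware; the names `dma_model_t`, `addr_mode_t` and `dma_model_run` are made up for illustration and do not belong to any vendor library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical address update modes for the model. */
typedef enum { ADDR_DECREMENT = -1, ADDR_FIXED = 0, ADDR_INCREMENT = 1 } addr_mode_t;

/* Hypothetical descriptor holding the options listed on the slide. */
typedef struct {
    const uint8_t *src;       /* source address                       */
    uint8_t       *dst;       /* destination address                  */
    addr_mode_t    src_mode;  /* how the source address changes       */
    addr_mode_t    dst_mode;  /* how the destination address changes  */
    size_t         item_size; /* size of one data item in bytes       */
    size_t         count;     /* number of data items still to copy   */
    int            done;      /* set to 1 at termination              */
} dma_model_t;

/* Copies count items, updating the addresses after each item. */
static void dma_model_run(dma_model_t *d)
{
    assert(d->item_size == 1 || d->item_size == 2 || d->item_size == 4);
    const uint8_t *s = d->src;
    uint8_t *t = d->dst;
    while (d->count > 0) {
        memcpy(t, s, d->item_size);          /* transfer one data item */
        d->count--;
        if (d->count > 0) {                  /* move only if another item follows */
            s += (ptrdiff_t)d->src_mode * (ptrdiff_t)d->item_size;
            t += (ptrdiff_t)d->dst_mode * (ptrdiff_t)d->item_size;
        }
    }
    d->done = 1;                             /* termination: signal completion */
}
```

With both modes set to `ADDR_INCREMENT` the model behaves like a block copy; with a fixed source it mimics reading the same peripheral data register repeatedly into a buffer.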

DMA Controller Features
- Two DMA controllers
- 16 streams in total (8 for each controller)
- Up to 8 channels (requests) per stream
- Each channel is responsible for specific peripheral or memory requests
- An arbiter handles the priority between DMA requests
- 4 levels of software-programmable priority
- A four-word-deep, 32-bit first-in, first-out memory buffer (FIFO) per stream

Channels and Streams
- Streams operate concurrently
- Each stream can be triggered by only one of its 8 channels at a time

DMA Controller

Registers
- DMA_SxCR: stream x configuration register
- DMA_SxPAR: 32-bit stream x peripheral address register
- DMA_SxM0AR: 32-bit stream x memory address register
- Both source and destination transfers can address the entire 4 GB area, at addresses between 0x0000 0000 and 0xFFFF FFFF

DMA Stream x Configuration Register DMA_SxCR (x = 0..7)
- Configures the stream concerned
- CHSEL[2:0]: channel selection; can be written only if EN is cleared
- PL[1:0]: priority level, with 11 being very high and 00 being low
- MSIZE[1:0]: memory data size
- PSIZE[1:0]: peripheral data size
- DIR[1:0]: data transfer direction
  - 00: peripheral to memory; 01: memory to peripheral; 10: memory to memory; 11: reserved
- EN: stream enable; reads as 0 when the stream is ready to be configured

DMA Stream x Configuration Register DMA_SxCR (x = 0..7), continued
- MSIZE[1:0] and PSIZE[1:0] data size encoding
  - 00: byte; 01: half-word; 10: word; 11: reserved
- TCIE: transfer complete interrupt enable
- Other configuration options include:
  - Direct mode
  - Error interrupt
  - Circular mode
  - Burst transfer
  - Increment
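As an illustration of how these fields combine into one register value, here is a minimal sketch. The macro and function names are local to this example, and the bit positions are the ones I understand the STM32F4 reference manual to give for DMA_SxCR; check them against the manual before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed DMA_SxCR bit positions (verify against the reference manual). */
#define CR_EN         (1u << 0)   /* stream enable                       */
#define CR_TCIE       (1u << 4)   /* transfer complete interrupt enable  */
#define CR_DIR_POS    6u          /* DIR[1:0]                            */
#define CR_PSIZE_POS  11u         /* PSIZE[1:0]                          */
#define CR_MSIZE_POS  13u         /* MSIZE[1:0]                          */
#define CR_PL_POS     16u         /* PL[1:0]                             */
#define CR_CHSEL_POS  25u         /* CHSEL[2:0]                          */

/* Encodings taken from the slides. */
enum { DIR_P2M = 0, DIR_M2P = 1, DIR_M2M = 2 };
enum { SIZE_BYTE = 0, SIZE_HALFWORD = 1, SIZE_WORD = 2 };
enum { PL_LOW = 0, PL_VERY_HIGH = 3 };

/* Builds a DMA_SxCR value with EN left cleared, so it can be written safely. */
static uint32_t dma_cr_build(unsigned chsel, unsigned pl, unsigned msize,
                             unsigned psize, unsigned dir, int tcie)
{
    assert(chsel <= 7u && pl <= 3u);
    assert(msize <= 2u && psize <= 2u && dir <= 2u);   /* 3 is reserved */
    uint32_t cr = 0u;
    cr |= (uint32_t)chsel << CR_CHSEL_POS;
    cr |= (uint32_t)pl    << CR_PL_POS;
    cr |= (uint32_t)msize << CR_MSIZE_POS;
    cr |= (uint32_t)psize << CR_PSIZE_POS;
    cr |= (uint32_t)dir   << CR_DIR_POS;
    if (tcie) {
        cr |= CR_TCIE;
    }
    return cr;
}
```

A call such as `dma_cr_build(0, PL_VERY_HIGH, SIZE_WORD, SIZE_WORD, DIR_M2M, 1)` describes a word-wide memory-to-memory transfer on channel 0; setting `CR_EN` is deliberately left as a separate, final step.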

DMA Stream x Number of Data Register DMA_SxNDTR (x = 0..7)
- NDT[15:0]: number of data items to transfer, from 0 up to 65535
- Writable only when the stream is disabled
- When the stream is enabled, it is read-only and indicates the remaining data items to be transmitted
- When the value is zero, no transaction will be served
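Because NDT is a 16-bit field, a buffer longer than 65535 items cannot be described by one programmed transfer. The following sketch shows the arithmetic for splitting a longer job into chunks; the helper names are hypothetical and the code only computes counts, it does not touch hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NDT_MAX 65535u   /* largest value that fits in NDT[15:0] */

/* Item count to program into NDTR for the next chunk of a longer job. */
static uint16_t ndt_next_chunk(size_t remaining)
{
    size_t chunk = (remaining > NDT_MAX) ? NDT_MAX : remaining;
    assert(chunk <= NDT_MAX);
    return (uint16_t)chunk;
}

/* Number of separately programmed transfers needed for total items. */
static size_t ndt_chunks_needed(size_t total)
{
    return total / NDT_MAX + ((total % NDT_MAX) != 0u ? 1u : 0u);
}
```

For example, 100000 items need two transfers: one of 65535 items followed by one of 34465 items.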

Basic Use of DMA
1. Wait for any ongoing stream operation to finish
2. Clear the flags reported in DMA_LISR and DMA_HISR (by writing to DMA_LIFCR and DMA_HIFCR)
3. Set the source address
4. Set the destination address
5. Specify the number of data items and the width of each item
6. Configure the channel and stream
7. Configure the priority
8. Configure the FIFO usage
9. Configure the data transfer direction
10. Enable the DMA stream
11. Start the transfer
12. Wait for the end of the transfer
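The sequence above might look roughly like this for a word-wide memory-to-memory copy. This is a host-side sketch, not vendor code: `stream_regs_t` and `ctrl_regs_t` are plain structs standing in for the memory-mapped registers, the function name is invented, and the bit positions and the stream 0 flag mask are assumptions based on my reading of the STM32F4 reference manual. On a real device the structs would be replaced by the device header's register definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the memory-mapped registers (assumed layout). */
typedef struct { volatile uint32_t LISR, HISR, LIFCR, HIFCR; } ctrl_regs_t;
typedef struct { volatile uint32_t CR, NDTR, PAR, M0AR, M1AR, FCR; } stream_regs_t;

#define CR_EN          (1u << 0)
#define CR_DIR_M2M     (2u << 6)    /* DIR = 10: memory to memory    */
#define CR_PINC        (1u << 9)    /* increment PAR side (source)   */
#define CR_MINC        (1u << 10)   /* increment M0AR side (dest.)   */
#define CR_PSIZE_WORD  (2u << 11)
#define CR_MSIZE_WORD  (2u << 13)
#define CR_PL_VHIGH    (3u << 16)
#define CR_CHSEL(n)    ((uint32_t)(n) << 25)
#define FCR_FIFO_FULL  0x7u         /* assumed: direct mode off, full threshold */
#define S0_FLAGS       0x3Du        /* assumed: all stream 0 flag bits          */

static void dma_m2m_setup(ctrl_regs_t *c, stream_regs_t *s,
                          const uint32_t *src, uint32_t *dst, uint16_t n_words)
{
    assert(c && s && src && dst);
    s->CR &= ~CR_EN;                       /* request stop of any running transfer */
    while (s->CR & CR_EN) { }              /* 1. wait until the stream is idle     */
    c->LIFCR = S0_FLAGS;                   /* 2. clear stale stream 0 flags        */
    s->PAR  = (uint32_t)(uintptr_t)src;    /* 3. source (truncated on 64-bit host) */
    s->M0AR = (uint32_t)(uintptr_t)dst;    /* 4. destination                       */
    s->NDTR = n_words;                     /* 5. number of items                   */
    s->CR   = CR_CHSEL(0) | CR_PL_VHIGH    /* 6, 7. channel and priority           */
            | CR_MSIZE_WORD | CR_PSIZE_WORD /* 5. width of each item               */
            | CR_MINC | CR_PINC
            | CR_DIR_M2M;                  /* 9. direction                         */
    s->FCR  = FCR_FIFO_FULL;               /* 8. FIFO usage                        */
    s->CR  |= CR_EN;                       /* 10, 11. enable; M2M starts at once   */
}
```

In memory-to-memory mode the peripheral address register holds the source and the memory address register holds the destination, which is why the source goes into `PAR` here. Step 12, waiting for completion, is left to the caller.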

End of Transfer
- An interrupt can be generated on events such as transfer complete (enabled by TCIE)
- If the interrupt is disabled, there are still two ways to check whether the transfer has ended (in normal mode):
  - The EN bit of DMA_SxCR is cleared by hardware at the end of the transfer
  - The value of the DMA_SxNDTR register is 0, indicating that no items remain to be transmitted
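Both polling checks can be expressed in a few lines. The sketch below assumes EN is bit 0 of DMA_SxCR and that the item count sits in the low 16 bits of DMA_SxNDTR; the function names and the bounded poll count are illustrative choices, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CR_EN    0x1u      /* assumed position of the EN bit */
#define NDT_MASK 0xFFFFu   /* NDT[15:0]                      */

/* Check 1: hardware clears EN when a normal-mode transfer ends. */
static bool dma_done_by_en(uint32_t cr)     { return (cr & CR_EN) == 0u; }

/* Check 2: no items remain to be transferred. */
static bool dma_done_by_ndtr(uint32_t ndtr) { return (ndtr & NDT_MASK) == 0u; }

/* Polls both conditions a bounded number of times; false means timeout. */
static bool dma_wait_done(const volatile uint32_t *cr,
                          const volatile uint32_t *ndtr,
                          uint32_t max_polls)
{
    assert(cr != NULL && ndtr != NULL);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (dma_done_by_en(*cr) || dma_done_by_ndtr(*ndtr)) {
            return true;
        }
    }
    return false;
}
```

The poll limit keeps a misconfigured stream from hanging the program. Note that a stream which was never enabled also reads EN as 0, so these checks are only meaningful after a transfer has been started.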

Demonstration: Flash to SRAM Transfer
- Software-triggered by enabling the stream
- Transfers a buffer of word data from Flash memory to embedded SRAM
- DMA2 Stream 0 Channel 0 is configured to transfer the 32-word data buffer
- Only DMA2 streams are able to perform memory-to-memory transfers
- Could be used as a fast version of the memcpy function, performed by the DMA instead of the CPU

DMA vs. ISR
- There are basically four approaches to communication between the CPU and peripherals:
  - Polling
  - Interrupt
  - DMA
  - Channel I/O
- Interrupts improve CPU performance in many ways compared to polling. However, interrupt overhead may grow as the number of peripherals increases.
- DMA, on the other hand, takes care of the peripheral once the CPU has configured it, which spares the CPU from being interrupted too frequently.

Triggering DMA Activity Using Peripherals
- In general, the DMA and the peripheral have to be configured together so that the DMA can be triggered by that specific peripheral
- Many peripherals have DMA support (e.g. ADC), but it must at least be enabled
- Uses memory-to-peripheral or peripheral-to-memory mode

DMA1 Requests
Peripheral requests for each channel (C0 to C7), listed in stream order (S0 to S7) with empty table cells omitted:
- C0: SPI3_RX, SPI2_TX, SPI3_TX
- C1: I2C1_RX, TIM7_UP, I2C1_TX
- C2: TIM4_CH1, I2S3_EXT_RX, TIM4_CH2, I2S2_EXT_TX, I2S3_EXT_TX, TIM4_UP, TIM4_CH3
- C3: TIM2_UP, TIM2_CH3, I2C3_RX, I2S2_EXT_RX, I2C3_TX, TIM2_CH1, TIM2_CH2, TIM2_CH4
- C4: UART5_RX, USART3_RX, UART4_RX, USART3_TX, UART4_TX, USART2_RX, USART2_TX, UART5_TX
- C5: TIM3_CH4, TIM3_UP, TIM3_CH1, TIM3_TRIG, TIM3_CH2, TIM3_CH3
- C6: TIM5_CH3, TIM5_UP, TIM5_CH4, TIM5_TRIG, TIM5_CH1, TIM5_CH2
- C7: TIM6_UP, I2C2_RX, USART3_TX, DAC1, DAC2, I2S2_TX
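Firmware often encodes part of such a request map as a lookup table. The sketch below covers only DMA1 channel 4, the one row that lists a request for every stream S0 to S7, so the stream positions are unambiguous. The table and function names are hypothetical, and the S3 entry is written as USART3_TX, the usual name of that peripheral.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DMA1_STREAM_COUNT 8

/* DMA1 channel 4 requests, indexed by stream number S0..S7. */
static const char *const dma1_ch4_request[DMA1_STREAM_COUNT] = {
    "UART5_RX", "USART3_RX", "UART4_RX", "USART3_TX",
    "UART4_TX", "USART2_RX", "USART2_TX", "UART5_TX"
};

/* Returns the DMA1 stream that serves the request on channel 4, or -1. */
static int dma1_ch4_stream_for(const char *request)
{
    assert(request != NULL);
    for (int s = 0; s < DMA1_STREAM_COUNT; s++) {
        if (strcmp(dma1_ch4_request[s], request) == 0) {
            return s;
        }
    }
    return -1;   /* request is not served by DMA1 channel 4 */
}
```

A driver for USART2 transmit would, under this table, select DMA1 Stream 6 with CHSEL set to 4.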

DMA2 Requests
Peripheral requests for each channel (C0 to C7), listed in stream order (S0 to S7) with empty table cells omitted:
- C0: ADC1, TIM8_CH1, TIM8_CH2, TIM8_CH3
- C1: DCMI, ADC2
- C2: ADC3, CRYP_OUT, CRYP_IN, HASH_IN
- C3: SPI1_RX, SPI1_TX
- C4: USART1_RX, SDIO, USART1_TX
- C5: USART6_RX, USART6_TX
- C6: TIM1_TRIG, TIM1_CH1, TIM1_CH2, TIM1_CH4, TIM1_COM, TIM1_UP, TIM1_CH3
- C7: TIM8_UP, TIM8_CH4, TIM8_TRIG, TIM8_COM

Demonstration of ISR Replacement: ADC to DMA
- Recall that the ADC can be read either by polling or by interrupt (EOC: end of conversion; EOCIE: end of conversion interrupt enable)
- With DMA, the converted ADC result is read automatically, leaving the CPU free to perform other tasks
- The DMA can interrupt the CPU whenever necessary, or a periodic interrupt can check the collected data at regular intervals
- The best way to make full use of both the CPU and the DMA is application-specific and has to be decided case by case
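For the periodic-check approach, one common technique is to derive the DMA write position from the remaining item count. The sketch below assumes the stream runs in circular mode over a buffer of `buf_len` samples, so that DMA_SxNDTR counts down from `buf_len` and reloads automatically; the helper names are hypothetical and nothing here touches real registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the next sample the DMA will write, derived from NDTR. */
static size_t dma_write_index(size_t buf_len, uint16_t ndtr)
{
    assert(buf_len > 0 && ndtr <= buf_len);
    return (buf_len - ndtr) % buf_len;
}

/* Samples written since the CPU last read up to read_idx.
 * A full wrap of the buffer is indistinguishable from zero new samples,
 * so the check must run more often than the buffer fills. */
static size_t samples_pending(size_t read_idx, size_t write_idx, size_t buf_len)
{
    assert(buf_len > 0 && read_idx < buf_len && write_idx < buf_len);
    return (write_idx + buf_len - read_idx) % buf_len;
}
```

A timer ISR could call `dma_write_index` with the current NDTR value, process `samples_pending` new samples, and then advance its own read index, without ever being interrupted per sample.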

Performance Comparison (test results from MKL25Z4)
- Traces
  - Yellow: ISR is executing when the trace is low
  - Blue: DAC output
- Without DMA: one interrupt per sample
  - 4.7 microseconds per 620 microseconds
  - 0.758% of the processor's time
- With DMA: one interrupt per cycle
  - 5.0 microseconds per 20 milliseconds
  - 0.025% of the processor's time
- How is this useful?
  - Saves CPU time
  - Reduces timing vulnerability to interrupts being disabled
  - Enables the CPU to sleep longer and wake up less often (every 20 milliseconds vs. every 620 microseconds)
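The percentages above follow directly from dividing the ISR execution time by the interval between interrupts. A small sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>

/* Share of processor time spent in the ISR, in percent. */
static double cpu_load_percent(double isr_time_us, double period_us)
{
    assert(period_us > 0.0);
    return 100.0 * isr_time_us / period_us;
}

/* How many times smaller the ISR load becomes when moving to DMA. */
static double load_reduction_factor(double load_without_dma, double load_with_dma)
{
    assert(load_with_dma > 0.0);
    return load_without_dma / load_with_dma;
}
```

Calling `cpu_load_percent(4.7, 620.0)` reproduces the roughly 0.758% figure without DMA, and `cpu_load_percent(5.0, 20000.0)` the 0.025% figure with DMA, a reduction of about thirty times.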