4-1 Section 3 Data Address Generators (DAGs), DSP Technology and Applications

4-2 ADSP-219x Block Diagram

4-3 Data Address Generator (DAG) Functions
Functions:
- DAGs fetch from and store to data memory and program memory
- Register indirect addressing
- Automatic pre-modify and post-modify of addresses
- Modify address
- Circular buffering
- Bit reversal for FFT support (DAG1 only)
Features:
- Single-cycle context switch (SEC_DAG)
- DMPGx page registers extend the address range to 24 bits
- Dual data fetch from memory

4-4 Data Address Generator Block Diagram
Bit reversal is available only on DAG1.

4-5 Data Address Generators (DAGs) Features
- 4 Index registers (Ireg) per DAG: each holds the address of the data to be accessed, that is, a memory pointer.
- 4 Modify registers (Mreg) per DAG: each holds the modify value for pre- or post-modification of the address pointer.
- 4 Length registers (Lreg) per DAG: each holds the length of a circular buffer.
- 4 Base registers (Breg) per DAG: each holds the base address of a circular buffer.
Notes:
1. All DAG registers have a secondary register set (ENA SEC_DAG or ENA SD).
2. Within a DAG, any Modify register can be used with any Index register.
3. Length registers are tied to their corresponding Index and Base registers.
4. Length registers are not initialized at power-up and must be set before the corresponding Index register is used.
5. Length registers must be set to 0 if circular buffers are not used.

4-6 DAG Registers

4-7 DAG Memory Page Registers (DMPGx)
- DMPG1
  - This page register is associated with DAG1.
  - It supports indirect memory accesses using DAG1.
- DMPG2
  - This page register is associated with DAG2.
  - It supports indirect memory accesses using DAG2.
- Direct addressing uses the page information (the 8 MSBs of the address) from DMPG1.
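The role of the page registers can be illustrated with a small host-side model. This is only a sketch in Python, not DSP code: the function name `full_address` is hypothetical, and it assumes the DAG supplies a 16-bit address that the 8-bit page value extends to 24 bits.

```python
def full_address(dmpg: int, offset16: int) -> int:
    """Combine an 8-bit page value with a 16-bit DAG address (assumed widths)."""
    if not 0 <= dmpg <= 0xFF:
        raise ValueError("page register holds 8 bits")
    if not 0 <= offset16 <= 0xFFFF:
        raise ValueError("DAG address holds 16 bits")
    # The page value supplies the 8 most significant bits of the 24-bit address.
    return (dmpg << 16) | offset16


print(hex(full_address(0x00, 0x3800)))  # 0x3800
print(hex(full_address(0x01, 0x3800)))  # 0x13800
```

With the page register set to 0, the 24-bit address equals the 16-bit DAG address, which matches the slides' use of page 0 for internal memory.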

4-8 Data Address Generators (DAGs)
[Figure: DAG1 holds I0-I3, M0-M3, L0-L3, B0-B3 and uses page register DMPG1; DAG2 holds I4-I7, M4-M7, L4-L7, B4-B7 and uses page register DMPG2. Each 8-bit page register supplies the upper bits of the 24-bit address placed on the DM and PM address buses.]

4-9 Data Address Generators (DAGs)
Pre-modify with Mreg register, no update. The address I + M is output only; Ireg is not changed.
    Dreg = DM(Ireg + Mreg);
    Dreg = PM(Ireg + Mreg);
    DM(Ireg + Mreg) = Dreg;
    PM(Ireg + Mreg) = Dreg;
Post-modify with Mreg register, update Ireg register. The address I is output first (1. output), then Ireg is updated to I + M (2. update).
    Dreg = DM(Ireg += Mreg);
    Dreg = PM(Ireg += Mreg);
    DM(Ireg += Mreg) = Dreg;
    PM(Ireg += Mreg) = Dreg;
Dreg = AX0, AX1, AY0, AY1, AR, MX0, MX1, MY0, MY1, MR0, MR1, MR2, SR0, SR1, SR2, SI
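The difference between the two modes can be sketched with a host-side Python model. It is illustrative only: the register file is a plain dictionary, memory is a dictionary keyed by address, the helper names are hypothetical, and circular buffering is ignored (as if the Length register were 0).

```python
def pre_modify(regs, memory, ireg, mreg):
    """Read memory at I + M; the index register is left unchanged."""
    address = (regs[ireg] + regs[mreg]) & 0xFFFF
    return memory.get(address, 0)


def post_modify(regs, memory, ireg, mreg):
    """Read memory at I, then update I to I + M."""
    address = regs[ireg]
    regs[ireg] = (regs[ireg] + regs[mreg]) & 0xFFFF
    return memory.get(address, 0)


memory = {0x1000: 11, 0x1002: 22}
regs = {"I0": 0x1000, "M0": 2}

a = pre_modify(regs, memory, "I0", "M0")   # reads 0x1002, I0 stays 0x1000
b = post_modify(regs, memory, "I0", "M0")  # reads 0x1000, I0 becomes 0x1002
print(a, b, hex(regs["I0"]))               # 22 11 0x1002
```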

4-10 Data Address Generators (DAGs)
Modify:
    MODIFY(Ireg += Mreg);
Ireg is updated with Mreg (I = I + M). Only the update takes place: no address is output and no memory access is performed.

4-11 Data Address Generators (DAGs) Examples
    AX0 = PM(i5 + m7);      // Pre-modify with M register, no update
    MR1 = DM(i5 + 0x11);    // Pre-modify with immediate modifier, no update
    DM(i5 + 27) = SR1;      // Pre-modify with immediate modifier, no update
    AY1 = DM(i3 += m1);     // Post-modify with M register, i3 gets updated
    PM(i4 += m6) = MR0;     // Post-modify with M register, i4 gets updated
    MODIFY(i6 += m6);       // Update the index register i6 without memory access
    DM(0x123) = MR0;        // Direct memory access
    DM(i2 += m5) = MR0;     // ILLEGAL: mixing DAG1 (i2) and DAG2 (m5) registers is not allowed
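The same-DAG rule behind the last example can be checked mechanically. The following Python sketch is a hypothetical helper, not part of any Analog Devices tool; it assumes the register split shown elsewhere in these slides, with registers 0 to 3 in DAG1 and registers 4 to 7 in DAG2.

```python
def dag_of(register: str) -> int:
    """Return 1 for I0-I3, M0-M3, L0-L3, B0-B3 and 2 for the 4-7 registers."""
    name = register.upper()
    if len(name) != 2 or name[0] not in "IMLB" or name[1] not in "01234567":
        raise ValueError(f"not a DAG register: {register!r}")
    return 1 if int(name[1]) < 4 else 2


def same_dag(ireg: str, mreg: str) -> bool:
    """True when the index and modify registers belong to the same DAG."""
    return dag_of(ireg) == dag_of(mreg)


print(same_dag("i5", "m7"))  # True: both registers are in DAG2
print(same_dag("i3", "m1"))  # True: both registers are in DAG1
print(same_dag("i2", "m5"))  # False: i2 is in DAG1, m5 is in DAG2
```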

4-12 Example DAG Instructions
    DMPG1 = 0x00;          // load page 0 to access internal memory
    DMPG2 = 0x00;          // load page 0 to access internal memory
    AX0 = DM(0x3800);      // load AX0 with the contents of address 0x3800
                           // (a data memory read with a direct address)
    I0 = 0x3800;           // set up the I, M, and L registers of DAG1
    L0 = 0;                // L0 = 0, so this buffer is NOT circular
    M0 = 1;
    AX0 = I0;              // base register setup is optional because L0 = 0
    reg(B0) = AX0;
    AX0 = DM(I0 += M0);    // DM bus read (post-modify)
    AY1 = DM(I4 + M7);     // DM bus read (pre-modify) using DAG2
    AX1 = PM(I4 += 5);     // PM bus read (immediate modify value)
    MODIFY(I4 += M5);      // add the value in M5 to I4
The upper 8 bits of the 24-bit address are held in the DMPG1 (DAG1) and DMPG2 (DAG2) registers.

4-13 Data Move Instructions, Post Modify
In each form below, reg is any Dreg, G1reg, G2reg, or G3reg.
Indirect 16-bit memory read, post-modify:
    reg = DM(Ireg += Mreg);
Indirect 16-bit memory write, post-modify:
    DM(Ireg += Mreg) = reg;
Indirect 24-bit memory read, post-modify:
    reg = PM(Ireg += Mreg);
Indirect 24-bit memory write, post-modify:
    PM(Ireg += Mreg) = reg;
Register classes:
    Dreg  = AX0, AX1, MX0, MX1, AY0, AY1, MY0, MY1, MR2, SR2, AR, SI, MR1, SR1, MR0, SR0
    G1reg = I0, I1, I2, I3, M0, M1, M2, M3, L0, L1, L2, L3, IMASK, IRPTL, ICNTL, STACKA
    G2reg = I4, I5, I6, I7, M4, M5, M6, M7, L4, L5, L6, L7, CNTR, LPSTACKA
    G3reg = ASTAT, MSTAT, SSTAT, LPSTACKP, CCODE, SE, SB, PX, DMPG1, DMPG2, IOPG, IJPG, STACKP
    Ireg  = I0, I1, I2, I3, I4, I5, I6, I7
    Mreg  = M0, M1, M2, M3, M4, M5, M6, M7

4-14 Data Move Instructions, Pre Modify
In each form below, reg is any Dreg, G1reg, G2reg, or G3reg.
Indirect 16-bit memory read, pre-modify:
    reg = DM(Ireg + Mreg);
Indirect 16-bit memory write, pre-modify:
    DM(Ireg + Mreg) = reg;
Indirect 24-bit memory read, pre-modify:
    reg = PM(Ireg + Mreg);
Indirect 24-bit memory write, pre-modify:
    PM(Ireg + Mreg) = reg;
The register classes (Dreg, G1reg, G2reg, G3reg, Ireg, Mreg) are the same as on slide 4-13.

4-15 Data Move Instructions, Immediate Values
Indirect memory read/write, immediate post-modify value:
    Dreg = DM(Ireg += <immediate modify value>);
    DM(Ireg += <immediate modify value>) = Dreg;
Indirect memory read/write, immediate pre-modify value:
    Dreg = DM(Ireg + <immediate modify value>);
    DM(Ireg + <immediate modify value>) = Dreg;
Indirect 16-bit memory write, immediate data (this opcode is two words long):
    DM(Ireg += Mreg) = <immediate data>;
Indirect 24-bit memory write, immediate data (this opcode is two words long):
    PM(Ireg += Mreg) = <immediate data>:24;
The register classes (Dreg, G1reg, G2reg, G3reg, Ireg, Mreg) are the same as on slide 4-13.

4-16 Data Move Instructions
In the direct forms below, reg is any Dreg, Ireg, or Mreg.
Direct memory read, immediate address:
    reg = DM(<immediate address>);
Direct memory write, immediate address:
    DM(<immediate address>) = reg;
Modify address register, indirect:
    MODIFY(Ireg += Mreg);
Modify address register, direct:
    MODIFY(Ireg += <immediate modify value>);
The register classes (Dreg, G1reg, G2reg, G3reg, Ireg, Mreg) are the same as on slide 4-13.

4-17 Data Move Instructions
In each form below, reg is any Dreg, G1reg, G2reg, or G3reg.
Register-to-register move:
    reg = reg;
Direct register load:
    reg = <immediate data>;
The register classes (Dreg, G1reg, G2reg, G3reg, Ireg, Mreg) are the same as on slide 4-13.

4-18 Data Address Generators (DAGs)
Indirect DAG register write (pre- or post-modify), with DAG register move. In the forms below, reg2 is Ireg2, Mreg2, or Lreg2.
    DM(Ireg1 + Mreg1) = reg2, reg2 = Ireg1;     // pre-modify
    DM(Ireg1 += Mreg1) = reg2, reg2 = Ireg1;    // post-modify
Register restrictions for this instruction:
- Ireg1 must be the same register in both places.
- Mreg1 must come from the same DAG as Ireg1.
- Ireg2, Mreg2, or Lreg2 must be the same register in both places.
- Ireg2, Mreg2, or Lreg2 must come from the same DAG as Ireg1, but may not be Ireg1.
Example:
    DM(I4 += M5) = I5, I5 = I4;
Here I4 appears as the same register in both places, I5 appears as the same register in both places and is not the same register as I4, and all registers come from the same DAG.

4-19 Circular Data Buffer Addressing
    DMPG1 = page(number);        // set the memory page
    I0 = data_buffer;            // I0 = current address
    M2 = 1;                      // M2 = modify value
    L0 = Length(data_buffer);    // L0 = buffer length; |M| < L (M must be smaller than L)
    AX0 = I0;
    reg(B0) = AX0;               // reg(B0) = 0x0030
    AX0 = DM(I0 += M2);          // load data
[Figure: the buffer occupies memory addresses 0x0030 to 0x0037, with I0 pointing into it.]
- Circular buffering works with post-modify addressing only.
- The Lreg register must be set up in every case.

4-20 Circular Data Buffer Addressing: Address Sequence
    AX0 = 0x0030;
    reg(B0) = AX0;
    I0 = AX0;
    M0 = 3;
    L0 = 8;                // |M| < L
    AX0 = DM(I0 += M0);
[Figure: the buffer occupies memory addresses 0x0030 to 0x0037. Repeating the read gives fetch 1 at 0x0030, fetch 2 at 0x0033, fetch 3 at 0x0036, fetch 4 at 0x0031, and fetch 5 at 0x0034.]
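The address sequence for this setup can be reproduced with a short Python model. It is a simplified sketch of the wrap-around rule, assuming |M| < L and post-modify addressing as stated on the slides; the function names are hypothetical and the model does not claim to mirror the hardware's internal logic.

```python
def circular_post_modify(index, modify, base, length):
    """Return (address used, next index) for one post-modify access."""
    if length == 0:
        # Length 0 means linear (non-circular) addressing.
        return index, (index + modify) & 0xFFFF
    if abs(modify) >= length:
        raise ValueError("circular buffering requires |M| < L")
    nxt = index + modify
    if nxt >= base + length:
        nxt -= length          # ran past the end of the buffer: wrap to the start
    elif nxt < base:
        nxt += length          # ran before the base: wrap to the end
    return index, nxt


def fetch_sequence(start, modify, base, length, count):
    """Addresses used by `count` consecutive post-modify accesses."""
    addresses = []
    index = start
    for _ in range(count):
        used, index = circular_post_modify(index, modify, base, length)
        addresses.append(used)
    return addresses


print([hex(a) for a in fetch_sequence(0x0030, 3, 0x0030, 8, 5)])
# ['0x30', '0x33', '0x36', '0x31', '0x34']
```

The third update would give 0x0039, which lies past the buffer end at 0x0037, so the model subtracts the length and the fourth fetch uses 0x0031.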

4-21 Bit Reversal
- Mostly used in FFT routines
- Only available with DAG1
- Enabled by setting bit 1 of the MSTAT register: ENA BIT_REV; or ENA BR;
- Reverses all 16 bits of the address (A15 to A0). For example, the normal-order address 0x0028 becomes the bit-reversed address 0x1400.
For a buffer of size 2^N, set the M register to 2^(16-N). For example, for a buffer of 8 = 2^3 locations, M = 2^(16-3) = 2^13 = 8192 = 0x2000.
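Both numbers on this slide can be checked with a few lines of Python. This is a host-side sketch with hypothetical function names; it only models the 16-bit reversal and the 2^(16-N) rule described above.

```python
def bit_reverse16(value: int) -> int:
    """Reverse the order of the 16 address bits A15..A0."""
    value &= 0xFFFF
    result = 0
    for _ in range(16):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reverse_modify(n_bits: int) -> int:
    """Modify value for a buffer of 2**n_bits locations: 2**(16 - n_bits)."""
    if not 0 <= n_bits <= 16:
        raise ValueError("buffer size must be between 2**0 and 2**16")
    return 1 << (16 - n_bits)


print(hex(bit_reverse16(0x0028)))   # 0x1400
print(hex(bit_reverse_modify(3)))   # 0x2000
```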

4-22 Bit Reversal
The I register must be initialized with the bit-reversed value of the buffer's starting address (calculate the value yourself or use the simulator to determine it).
The starting address of the data array must be an integer multiple of the FFT size (0, N, 2N, ...).
    .section/dm dm_data;         // address 0x8000
    .var destination[8];
    .var read_in[8];
    .section/pm program;
    start:
        i4 = read_in;            // load the address of read_in
        i0 = 0x01;               // I0 must be calculated: bit-reversed 0x8000
        M4 = 1;
        M0 = 0x2000;             // calculated value of M0
        L4 = 0;
        L0 = 0;
        CNTR = 8;
        ENA BIT_REV;
        DO brev UNTIL CE;
            AY1 = DM(I4 += M4);  // load the data
    brev:   DM(I0 += M0) = AY1;  // store the data in bit-reversed order
        DIS BIT_REV;
[Figure: memory map with address and data columns. destination occupies 0x8000 to 0x8007 and read_in occupies 0x8008 to 0x800F.]
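The order in which the loop writes into the destination buffer can be traced with a Python model. It is a sketch under the slide's stated setup (start address 0x8000, 8 locations, I0 starting at the bit-reversed start address and M0 = 0x2000); the function names are hypothetical and the model assumes the DAG outputs the bit-reversed value of the index register on each access.

```python
def bit_reverse16(value: int) -> int:
    """Reverse the order of the 16 address bits A15..A0."""
    value &= 0xFFFF
    result = 0
    for _ in range(16):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_addresses(start_address: int, n_bits: int):
    """Addresses written by 2**n_bits post-modify stores with bit reversal on."""
    index = bit_reverse16(start_address)   # value loaded into the I register
    modify = 1 << (16 - n_bits)            # value loaded into the M register
    addresses = []
    for _ in range(1 << n_bits):
        addresses.append(bit_reverse16(index))   # address placed on the bus
        index = (index + modify) & 0xFFFF        # post-modify update
    return addresses


print([hex(a) for a in bit_reversed_addresses(0x8000, 3)])
# ['0x8000', '0x8004', '0x8002', '0x8006', '0x8001', '0x8005', '0x8003', '0x8007']
```

The offsets 0, 4, 2, 6, 1, 5, 3, 7 are the 3-bit reversals of 0 through 7, which is the ordering an 8-point FFT expects.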

4-23 PM Bus Exchange (PX) Register
- A Type 32 instruction reads 24 bits from address 0x2000. The upper 16 bits are stored in AR; the lower 8 bits go to the PX register.
    AR = PM(I4 += M5);     // I4 = 0x2000
- 24-bit indirect store
    PX = AX0;              // lower 8 bits
    PM(I4 + M5) = AY0;     // upper 16 bits
- Hidden 24-bit copy
    AR = PM(I4 + M5);      // writes the lower 8 bits to PX
    PM(I5 + M5) = AR;      // reads the lower 8 bits from PX
- With 16-bit on-chip memory, PX is filled with zeroes.
    AR = PM(I4 += M5);     // I4 = 0x8000
[Figure: data path between the PX register, the PM bus, and the DM bus.]
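The split between the data register and PX can be sketched in Python. This is an illustrative model with hypothetical function names; it assumes only what the slide states, namely that a 24-bit program memory word is divided into an upper 16-bit part and a lower 8-bit part held in PX.

```python
def split_pm_word(word24: int):
    """Split a 24-bit word into (upper 16 bits, lower 8 bits for PX)."""
    if not 0 <= word24 <= 0xFFFFFF:
        raise ValueError("program memory word holds 24 bits")
    return word24 >> 8, word24 & 0xFF


def join_pm_word(upper16: int, px: int) -> int:
    """Rebuild the 24-bit word from a 16-bit register value and PX."""
    return ((upper16 & 0xFFFF) << 8) | (px & 0xFF)


upper, px = split_pm_word(0x123456)
print(hex(upper), hex(px))            # 0x1234 0x56
print(hex(join_pm_word(upper, px)))   # 0x123456
```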

4-24 DAG Latency: Memory Pipeline Stalls
- Using a DAG register immediately (or within 2 cycles) after initializing it causes a stall:
    I2 = 0x1234;
    AX0 = DM(I2 += M2);
- This applies to the I, M, L, B, and DMPG registers, and to the MODIFY() instruction.
- Avoid the stall by inserting meaningful instructions between the initialization and the use:
    I2 = 0x1234;
    AY0 = 0;
    AR = AX0 + AY0;
    AX0 = DM(I2 += M2);
- DAG bank switching does not cause any stalls.
[Figure: pipeline stages in order: Look-ahead, Pre-fetch, Fetch, Address Generation, Decode (stalls), Execute (stalls).]

4-25 Go To DAGS Exercises