Memory Arithmetic Unit Interface Jason M. Meier Justin S. Teller Tom J. Keeley.


1 Memory Arithmetic Unit Interface Jason M. Meier Justin S. Teller Tom J. Keeley

2 Current Paradigm. [Diagram: CPU, Memory Controller, and DRAM System. Timeline rows: CPU, MEMORY CTRL, MEMORY, showing Task 1, Task 2, and Done: Task 1.]

3 Active Pages Implementation. Used configurable DRAM (RADRAM). Reconfigurable logic implements various memory functions. An “Active Page” consists of a page of data and a set of associated functions. Works on individual DRAM chips. Processor-centric and memory-centric partitioning. * Active Pages: Oskin, Chong, Sherwood, ISCA ’98

4 MAUI Implementation. [Diagram: CPU, MAUI Memory Controller containing the MAU, and DRAM System. Timeline rows: CPU, MEMORY CTRL/MAUI, MEMORY, showing Task 1, Task 2, and Done: Task 1.]

5 MAUI Instruction Set: LOAD REG. 1) The CPU sends an MAU_LOAD register command to the MC (along with the register number and the address to read) across the front-side bus. 2) The MC interprets the command and places a Read command in the transaction queue. 3) The DRAM performs the read. 4) The result is stored in the appropriate register in the MAUI register file. Instruction format: MAUI_LD,offset( ). [Diagram: CPU, MAUI Memory Controller with MAU, and DRAM System, with steps 1 to 4 marked. Timeline rows: CPU, MC/MAUI, DRAM (R).]

6 MAUI Instruction Set II: LOADI REG. 1) The CPU sends an MAU_LOADI register command to the MC (along with the register number and the integer to save) across the front-side bus. 2) The MC interprets the command and places the integer in the appropriate register in the MAUI register file. Instruction format: MAUI_LDI,. [Diagram: CPU, MAUI Memory Controller with MAU, and DRAM System, with steps 1 and 2 marked. Timeline rows: CPU, MC/MAUI, DRAM.]
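
The two load commands on slides 5 and 6 can be sketched as a small software model. This is only an illustration, not the authors' hardware: the class name `MauiController`, the register-file size, the tuple layout of queue entries, and the use of a Python list as the DRAM are all assumptions made for the example.

```python
from collections import deque


class MauiController:
    """Hypothetical model of the memory controller with a MAUI register file."""

    def __init__(self, dram, num_regs=8):
        self.dram = dram              # list standing in for the DRAM system
        self.regs = [0] * num_regs    # MAUI register file (size is an assumption)
        self.queue = deque()          # memory controller transaction queue

    def mau_load(self, reg, address):
        # LOAD steps 1-2: the command arrives and a Read is queued for `address`.
        self.queue.append(("R", address, reg))

    def mau_loadi(self, reg, value):
        # LOADI: the integer goes straight into the register file, no DRAM access.
        self.regs[reg] = value

    def drain(self):
        # LOAD steps 3-4: DRAM performs each read and the result lands in the register.
        while self.queue:
            _op, address, reg = self.queue.popleft()
            self.regs[reg] = self.dram[address]


mc = MauiController(dram=[10, 20, 30, 40])
mc.mau_load(1, 2)    # register 1 <- DRAM[2]
mc.mau_loadi(2, 7)   # register 2 <- 7
mc.drain()
print(mc.regs[1], mc.regs[2])
```

The point of the sketch is the difference in path: LOAD goes through the transaction queue and DRAM, while LOADI touches only the register file.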

7 MAUI Instruction Set III: MAU_ADD. 1) The CPU invalidates addresses in the cache that fall within the range of the destination array. Addresses within the range of the source arrays are written back if dirty. 2) The CPU sends an MAUI_ADD command to the MC (along with the register numbers) across the front-side bus. 3) The MC interprets the command; the MAUI adds the appropriate registers and places a Write command and the next two Read commands in the transaction queue. 4) Step 3 repeats for the length of the array. Instruction format: MAUI_ADD,,,. [Diagram: CPU, MAUI Memory Controller with MAU, and DRAM System, with steps 1 to 4 marked. Timeline rows: CPU, MC/MAUI, DRAM (R R W).]
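
The order of DRAM transactions implied by steps 3 and 4 can be sketched as a generator. This is a rough illustration under assumptions: the function name `maui_add_transactions` is hypothetical, addresses are counted in array elements, and each element is assumed to need two Reads followed by one Write.

```python
def maui_add_transactions(dst, src1, src2, length):
    """Yield (op, address) pairs in the order R, R, W for each array element.

    Hypothetical helper: addresses are element indices, not bytes.
    """
    for i in range(length):
        yield ("R", src1 + i)   # read element i of the first source array
        yield ("R", src2 + i)   # read element i of the second source array
        yield ("W", dst + i)    # write the sum to element i of the destination


transactions = list(maui_add_transactions(dst=10, src1=0, src2=5, length=2))
for op, address in transactions:
    print(op, address)
```

In the slide's description the Write for one element is queued together with the two Reads for the next, which produces the same overall stream.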

8 Issues: Read & Write Locks

9 Issues: Address Mapping. [Diagram: TLB mapping Virtual Space to Physical Space.] Memory that is contiguous in virtual space may not be contiguous in physical space. MAUI assumes consecutive addressing (size register). MAUI operations that cross page boundaries must be split into separate operations for each page. The programmer will not know the mapping scheme. Result: all MAUI operations will need to be privileged instructions, accessed by programs through a system call.
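
Splitting an operation at page boundaries can be sketched as follows. This is a minimal example under assumptions: the 4096-byte page size and the function name `split_at_page_boundaries` are illustrative, and the translation of each chunk to a physical address (which the slide leaves to privileged code) is not modeled.

```python
PAGE_SIZE = 4096  # bytes; an assumed page size, not taken from the slides


def split_at_page_boundaries(start, length, page_size=PAGE_SIZE):
    """Split [start, start + length) into (start, length) chunks within one page each."""
    chunks = []
    addr = start
    remaining = length
    while remaining > 0:
        room = page_size - (addr % page_size)  # bytes left in the current page
        n = min(room, remaining)
        chunks.append((addr, n))
        addr += n
        remaining -= n
    return chunks


chunks = split_at_page_boundaries(start=4000, length=5000)
print(chunks)
```

Each resulting chunk lies inside a single virtual page, so it maps to one contiguous physical range and can be issued as its own MAUI operation.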

10 Issues: Compiler Issues. The compiler will be responsible for deciding when MAUI instructions should be used. This decision will be based on the size of the array, whether it is likely to be in the cache, and whether it is likely to be used by an instruction that isn’t implemented in the MAUI.
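
One way such a compiler decision could look is sketched below. Everything here is an assumption made for illustration: the function name `should_use_maui`, the set of supported operations, the cache-size threshold, and the rule that larger arrays favor the MAUI are not stated in the slides.

```python
MAUI_OPS = {"add"}  # assumed: only the add operation is described in the slides


def should_use_maui(op, array_bytes, likely_cached, used_by_non_maui_op,
                    cache_bytes=512 * 1024):
    """Hypothetical heuristic combining the three criteria named on the slide."""
    if op not in MAUI_OPS:
        return False   # the MAUI cannot execute this operation at all
    if likely_cached:
        return False   # the CPU can work on cached data directly
    if used_by_non_maui_op:
        return False   # the data would have to come back to the CPU anyway
    return array_bytes > cache_bytes  # assumed threshold: array exceeds the cache


print(should_use_maui("add", 4 * 1024 * 1024, False, False))
print(should_use_maui("add", 4 * 1024, True, False))
```

A real compiler would need profile or static analysis data to estimate the two boolean inputs; here they are simply passed in.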

11 Issues: Task Interrupts. [Diagram: CPU, MAUI Memory Controller with MAU, and DRAM System. Timeline rows: CPU, MEMORY CTRL/MAUI, MEMORY, showing Task 1 and Task 2.]

12 Example: maui_add I. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: maui_ld r1, 0]

13 Example: maui_add II. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: maui_ld r2, 5]

14 Example: maui_add III. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: maui_ld r3, 10]

15 Example: maui_add IV. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: maui_ld r4, 2]

16 Example: maui_add V. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: maui_add r3, r1, r2; R, 0; R, 5]

17 Example: maui_add VI. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: Read 10; D1[0]; maui_add r3, r1, r2*]

18 Example: maui_add VII. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: D2[0]; Read 10; maui_add r3, r1, r2*]

19 Example: maui_add VIII. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: R, 1; R, 6; W,10, D1[0]+D2[0]; Read 10; maui_add r3, r1, r2*]

20 Example: maui_add IX. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: Write 6, D; D1[1]; maui_add r3, r1, r2*]

21 Example: maui_add X. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: D2[1]; Write 6, D; maui_add r3, r1, r2*]

22 Example: maui_add XI. [Diagram: Memory, Transaction Queue, Memory Controller, BIU. Labels: Next Instruction; W,10, D1[1]+D2[1]]
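
The walk-through on slides 12 to 22 can be replayed in software. This is a sketch under assumptions, not the authors' simulator: the function name `run_maui_add` is hypothetical, r4 is assumed to act as the size register, the destination address is assumed to advance with each element, and the array contents are made-up test data.

```python
def run_maui_add(memory, regs, dst, src1, src2, size):
    """Add two arrays element by element, recording each DRAM transaction.

    `regs` maps register numbers to values; dst, src1, src2, and size name
    the registers holding the three base addresses and the element count.
    """
    trace = []
    for i in range(regs[size]):
        a_addr = regs[src1] + i
        b_addr = regs[src2] + i
        d_addr = regs[dst] + i
        a = memory[a_addr]
        trace.append(("R", a_addr))
        b = memory[b_addr]
        trace.append(("R", b_addr))
        memory[d_addr] = a + b          # the MAU adds, then the result is written
        trace.append(("W", d_addr))
    return trace


# Register values as set by the maui_ld commands on slides 12 to 15.
regs = {1: 0, 2: 5, 3: 10, 4: 2}
memory = [0] * 16
memory[0:2] = [3, 4]      # first source array (made-up data)
memory[5:7] = [10, 20]    # second source array (made-up data)

trace = run_maui_add(memory, regs, dst=3, src1=1, src2=2, size=4)
print(memory[10:12])
print(trace)
```

The CPU is not involved after the command is issued, which is what allows the concurrent execution listed among the advantages.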

23 Advantages & Disadvantages. Advantages: better performance for DRAM-latency-bound computations; lower latency to DRAM compared to the CPU; reduced traffic on the front-side bus; concurrent execution. Disadvantages: MAUI operates at a lower clock frequency; increased compiler complexity; increased fabrication costs (More Logic = More $$); recently used data may not be cached.

24 Alternative Implementation: MAUI Occupies Its Own Read & Write Bus. [Diagram: CPU, Memory Controller, MAUI with MAU, MAUI Read & Write Bus, DRAM System.] Good: eliminates contention with the CPU for DRAM system resources; creates a circular data flow, resulting in increased performance. Bad: needs a specialized triple-ported DRAM system, leading to increased production costs.

25 Test Setup. Simulated on SimpleScalar version 4.0. One set of test benches with dual-array operations running in both the MAUI and the CPU, with four different array sizes. This trial was repeated for both shared and independent memory access buses. Found up to a 43% speedup!

26 Results. [Chart: Total CPU Cycles.]

27 Future Enhancements I. Multi-tasking; larger register file; more MAUs for parallelism; small cache. [Diagram: DRAM System and MAUI Memory Controller with multiple MAUs. Timeline rows: CPU, MEMORY CTRL/MAUI, MEMORY, showing Task 1, Task 2, and Task 3.]

28 Future Enhancements II. Better pipelining; larger register file to hold intermediate results. [Diagram: DRAM System and MAUI Memory Controller with MAU. Timeline rows: CPU, MC/MAUI, DRAM. Labels: MAU_ADD; W; RRWRRRRRRWW]

