1  Array Allocation Taking into Account SDRAM Characteristics
Hong-Kai Chang and Youn-Long Lin
Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C.
2  Outline
- Introduction
- Related Work
- Motivation
- Problem Definition
- Proposed Algorithms
- Experimental Results
- Conclusions and Future Work
3  Introduction
- Performance gap between memory and processor
- Systems without a cache: application-specific designs, embedded DRAM
- Optimize DRAM performance by exploiting its special characteristics
- SDRAM's multi-bank architecture enables new optimizations in scheduling
- We assign arrays to different SDRAM banks to increase the data access rate
4  Related Work
- Previous research eliminates the memory bottleneck by:
  - Using local memory (cache)
  - Prefetching data as early as possible
- Panda, Dutt, and Nicolau utilize page-mode access of EDO DRAM to improve scheduling
- Research on mapping arrays to physical memories for lower power, lower cost, and better performance
5  Motivation
- DRAM operations: row decode, column decode, precharge
- SDRAM characteristics: multiple banks, burst transfer, synchronous operation
(Figure: access timing of a traditional DRAM vs. a 2-bank SDRAM)
6  Address Mapping Table
- Host address: [a16:a0]; memory address: [BA, A7:A0]
- Page size for the host: 128 words (a6:a0)
- Page size for the DRAM: 256 words (A7:A0)
- If we exchange the mapping of a0 and a7...
7-8  Motivational Example
Legend: BA = Bank Active (row decode), R/W = Read/Write (column decode), BP = Precharge
(Figures: scheduling timing diagrams; not recoverable from the transcript)
9  Assumptions
- Harvard architecture: separate program and data memories
- Paging policy of the DRAM controller:
  - No precharge is performed after a read/write
  - If the next access references a different page, precharge is performed, followed by bank activate, before the read/write
  - As many pages can be open at once as there are banks
- Resource constraints
10  Problem Definition
- Input: a data-flow graph, the resource constraints, and the memory configuration
- Perform our bank allocation algorithm
- Schedule the operations with a static list-scheduling algorithm that considers SDRAM timing constraints
- Output: a schedule of operations, a bank allocation table, and the total cycle count
11  Bank Allocation Algorithm
1. Calculate node distances
2. Calculate array distances
3. Give arrays with shorter distances higher priority
4. Allocate arrays to different banks if possible
12  Example: SOR
main() {
  float a[N][N], b[N][N], c[N][N], d[N][N], e[N][N], f[N][N];
  float omega, resid, u[N][N];
  int j, l;
  for (j = 2; j < N; j++)
    for (l = 1; l < N; l += 2) {
      resid = a[j][l]*u[j+1][l] + b[j][l]*u[j-1][l] +
              c[j][l]*u[j][l+1] + d[j][l]*u[j][l-1] +
              e[j][l]*u[j][l] - f[j][l];
      u[j][l] -= omega*resid/e[j][l];
    }
}
13  Node Distance
- The distance between the current node and the nearest node that accesses each of the arrays a, b, c, ..., shown in { }
- Ex.: {1,-,-,-,-,-,-,1,-} means the distances to the nodes that access a[j] and u[j-1] are both 1
- '-' means the distance is still unknown
- When propagated downstream, the distance increases
14  Array Distance
- The distance between the nodes that access two arrays
- Calculated from the node distances of the corresponding arrays by taking the minimum value
- Ex.: AD(a[j], u[j-1]) = min(2, 4) = 2
15  Example: SOR
Bank allocation:
- Bank 0: c, d, e, f
- Bank 1: a, b, u
16  Experimental Characteristics
- We divided our benchmarks into two groups:
  - The first group accesses multiple 1-D arrays; we apply our algorithm to arrays
  - The second group accesses a single 2-D array; we apply our algorithm to array rows
- Memory configurations:
  - Multi-bank configuration: 2 banks / 4 banks
  - Multi-chip configuration: 2 chips / 4 chips (relieves bus contention compared with multi-bank)
  - With or without page-mode access
17-19  (Experimental result charts; not recoverable from the transcript)
20  Experimental Results
- From the average results, we can see that:
  - Scheduling for SDRAM with our bank allocation algorithm does improve performance
  - Utilizing page-mode access relieves address-bus traffic, so using multiple chips brings no obvious further improvement
21  Conclusions
- We presented a bank allocation algorithm, incorporated in our scheduler, to take advantage of SDRAM
- The scheduling results improve greatly on the coarse schedule and beat Panda's work in some cases
- Our work is based on a common paging policy
- Several different memory configurations are explored
- The scheduling results are verified and meet Intel's PC SDRAM specification
22  Future Work
- Extending our research to Rambus DRAM
- Grouping arrays to incorporate burst transfer
- Integration with other scheduling/allocation techniques