Data Access Profiling & Improved Structure Field Regrouping in Pegasus Vas Chellappa & Matt Moore May 2, 2005 / Optimizing Compilers / Project Poster Session.


1 Data Access Profiling & Improved Structure Field Regrouping in Pegasus Vas Chellappa & Matt Moore May 2, 2005 / Optimizing Compilers / Project Poster Session

2 Introduction
- Structure definitions group fields by semantics, not by access contemporaneity
- Data access profiling can be used to improve cache performance by reordering fields for contemporaneity
- In this context, contemporaneity is a measure of how close in time two accesses to structure fields occur

3 Problem Statement
- Obtaining contemporaneity information for structure fields
- Exploiting this information to improve the ordering of the fields
- Doing this within the CASH/Pegasus environment

4 Approach
Pegasus Implementation
- Data Access Profiling to track contemporaneous field accesses to build the Field Affinity Graphs
- Modify Simulator interface to SimpleScalar (third-party cache simulator) to achieve this
Regrouping Algorithm
- Field Affinity Graphs built by the modified Simulator are then used to recommend reorderings based on a new regrouping algorithm

5 Project Design

6 Design Overview
1. Build stage: Tag structure field accesses in the Pegasus IR
2. Simulation stage: Propagate tag information through SimpleScalar to the new regroup library
3. Final stage: Invoke regrouping algorithm to calculate reordering recommendations

7 Build Stage, Tagging Accesses
- Objective: Identify and tag structure field accesses in the Pegasus IR
- Not trivial, since SUIF/C2DIL do not preserve required type information during transformation to IR
- Need to identify patterns that indicate structure field accesses

8 Field Accesses in Pegasus

9 Actual Pegasus Illustration

    int foo(struct my_t stestfoo) {
        int retval = stestfoo.f2;
        return(retval);
    }

Which wire here should have struct type?

    int foo(struct my_t* stestfoo) {
        return(stestfoo->f2);
    }

Which wire here has struct type?

10 Simulation Process
- Tag info on loads and stores is propagated through SimpleScalar to the regrouping library
- The regrouping library builds the field affinity graph online, during simulation
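
The online graph construction could look roughly like the sketch below. The slides do not say how contemporaneity is quantified, so this assumes a fixed time window: two accesses to different fields within `window` cycles of each other add one to the edge between them. All names (`affinity_t`, `affinity_record`, `HISTORY`, `MAX_FIELDS`) are hypothetical and are not part of Pegasus or SimpleScalar.

```c
#include <assert.h>
#include <string.h>

#define MAX_FIELDS 8   /* hypothetical cap on fields per structure */
#define HISTORY    16  /* hypothetical number of recent accesses remembered */

/* One tagged load/store as a simulator might report it (assumed shape). */
typedef struct {
    unsigned long time;  /* simulated cycle of the access */
    int field;           /* index of the structure field touched */
} access_t;

/* Field affinity graph for one structure: symmetric weight matrix. */
typedef struct {
    unsigned long weight[MAX_FIELDS][MAX_FIELDS];
    access_t recent[HISTORY];  /* ring buffer of the latest accesses */
    int count;                 /* total accesses recorded so far */
    unsigned long window;      /* max cycle distance counted as contemporaneous */
} affinity_t;

void affinity_init(affinity_t *g, unsigned long window)
{
    memset(g, 0, sizeof *g);
    g->window = window;
}

/* Called online, once per tagged access, in nondecreasing time order. */
void affinity_record(affinity_t *g, unsigned long time, int field)
{
    assert(field >= 0 && field < MAX_FIELDS);
    int stored = g->count < HISTORY ? g->count : HISTORY;
    for (int i = 0; i < stored; i++) {
        const access_t *a = &g->recent[i];
        /* A recent access to a different field strengthens the edge. */
        if (a->field != field && time - a->time <= g->window) {
            g->weight[a->field][field]++;
            g->weight[field][a->field]++;
        }
    }
    /* Overwrite the oldest slot once the ring buffer is full. */
    g->recent[g->count % HISTORY] = (access_t){ time, field };
    g->count++;
}
```

A simulator hook would call `affinity_record` for every tagged load or store. Once simulation ends, `weight` holds the edge weights of the field affinity graph for that structure.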

11 Regrouping Stage
- After simulation, analyze collected profiling data to produce a reordering recommendation
- Can be done better than the greedy approach used in previous work
- Cannot be done optimally (NP-hard)
Field Affinity Graph (one per structure):
- Vertices: fields in a structure
- Edge weights: represent degree of contemporaneity of accesses between the fields

12 Matching Heuristic
- Find a maximum weight matching in the field affinity graph
- Fields that will not fit into a cache line together anyway are identified and ignored
- Structure is reordered by placing matched fields together
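
The difference between a greedy pairing and a maximum weight matching can be illustrated with a small sketch. It is not the project's implementation: the greedy routine is a generic heaviest-edge-first pairing, not necessarily the algorithm of previous work, and the exact matching uses an exhaustive subset search that is only practical for a handful of fields, where a real tool would use a dedicated weighted matching library. All names here are hypothetical, and `prune_edges` compares field sizes only, ignoring alignment.

```c
#include <assert.h>

#define MAX_FIELDS 8  /* hypothetical cap; the exact search is exponential in it */

/* Zero the edges between fields too large to share one cache line.
   Simplification: compares sizes only and ignores alignment padding. */
void prune_edges(int n, long w[MAX_FIELDS][MAX_FIELDS],
                 const int size[MAX_FIELDS], int line_size)
{
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j && size[i] + size[j] > line_size)
                w[i][j] = 0;
}

/* Greedy pairing: repeatedly take the heaviest edge between two unmatched
   fields. mate[i] is the partner of field i, or -1. Returns total weight. */
long greedy_pairs(int n, long w[MAX_FIELDS][MAX_FIELDS], int mate[MAX_FIELDS])
{
    assert(n >= 0 && n <= MAX_FIELDS);
    long total = 0;
    for (int i = 0; i < n; i++) mate[i] = -1;
    for (;;) {
        int bi = -1, bj = -1;
        long best = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (mate[i] < 0 && mate[j] < 0 && w[i][j] > best) {
                    best = w[i][j]; bi = i; bj = j;
                }
        if (bi < 0) return total;  /* no positive edge left */
        mate[bi] = bj; mate[bj] = bi;
        total += best;
    }
}

/* Exact maximum weight matching by dynamic programming over subsets.
   best[used] is the best weight obtainable from fields not yet in 'used'. */
long max_weight_matching(int n, long w[MAX_FIELDS][MAX_FIELDS],
                         int mate[MAX_FIELDS])
{
    assert(n >= 0 && n <= MAX_FIELDS);
    static long best[1u << MAX_FIELDS];
    static int choice[1u << MAX_FIELDS];
    unsigned full = (1u << n) - 1u;
    best[full] = 0;
    for (unsigned used = full; used-- > 0; ) {
        int i = 0;
        while (used & (1u << i)) i++;           /* lowest unused field */
        best[used] = best[used | (1u << i)];    /* option: leave i unmatched */
        choice[used] = -1;
        for (int j = i + 1; j < n; j++) {
            if (used & (1u << j)) continue;
            long cand = w[i][j] + best[used | (1u << i) | (1u << j)];
            if (cand > best[used]) { best[used] = cand; choice[used] = j; }
        }
    }
    /* Walk the recorded choices to recover the matched pairs. */
    for (unsigned used = 0; used != full; ) {
        int i = 0;
        while (used & (1u << i)) i++;
        int j = choice[used];
        mate[i] = j;
        used |= 1u << i;
        if (j >= 0) { mate[j] = i; used |= 1u << j; }
    }
    return best[0];
}
```

On a four-field chain with edge weights 4, 5, 4, the greedy pairing takes the middle edge and strands the two outer fields (total 5), while the matching pairs the outer edges (total 8). Fields with the same `mate` would then be placed next to each other in the reordered structure.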

13 Greedy vs. Matching

14 NP-Hardness
- NP-hardness is shown by reducing the graph coloring problem to the regrouping problem

15 Results
- Implemented successfully to handle structure field accesses done through pointers (ptr->fld)
- So far, only small programs have been tested
- Reordering is done manually and fed into simulator again to obtain the number of cycles for comparison

16 Results - Example

Original (750 cycles per call):

    struct my_t {
        int f1;
        int f2;
        char nu[4096];
        int f3;
        int f4;
    };

    int foo(struct my_t *elt) {
        int i;
        elt->f1 = 2;
        elt->f4 = 100;
        for(i=0; i < 50; i++) {
            elt->f1++;
            elt->f4--;
        }
        return elt->f1+elt->f4;
    }

Modified (745 cycles per call, one fewer cache miss):

    struct my_t {
        int f1;
        int f4;
        int f2;
        char nu[4096];
        int f3;
    };

    int foo(struct my_t *elt) {
        int i;
        elt->f1 = 2;
        elt->f4 = 100;
        for(i=0; i < 50; i++) {
            elt->f1++;
            elt->f4--;
        }
        return elt->f1+elt->f4;
    }
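
One way to see why the modified layout can save a miss is to compare which cache line each field falls in. The sketch below assumes a 64-byte line and a line-aligned structure. Neither value is stated in the slides, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_SIZE 64  /* assumed cache line size in bytes; not given in the slides */

/* The two layouts from the example, renamed so both can coexist. */
struct original_t { int f1; int f2; char nu[4096]; int f3; int f4; };
struct modified_t { int f1; int f4; int f2; char nu[4096]; int f3; };

/* Index of the cache line holding a byte offset, for a line-aligned object. */
size_t line_of(size_t offset, size_t line_size)
{
    assert(line_size > 0);
    return offset / line_size;
}

/* Nonzero when f1 and f4 fall in the same cache line in the original layout. */
int original_shares_line(void)
{
    return line_of(offsetof(struct original_t, f1), LINE_SIZE)
        == line_of(offsetof(struct original_t, f4), LINE_SIZE);
}

/* Nonzero when f1 and f4 fall in the same cache line in the modified layout. */
int modified_shares_line(void)
{
    return line_of(offsetof(struct modified_t, f1), LINE_SIZE)
        == line_of(offsetof(struct modified_t, f4), LINE_SIZE);
}
```

In the original layout the 4096-byte array separates f1 from f4, so the two fields the loop touches together sit in different lines. In the modified layout they are adjacent and share one line under these assumptions.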

17 Conclusion
- Performance improvements are achievable even on simple programs using the reorganization recommendations
- Propagating full type information from source through SUIF/c2dil would be required to optimize non-pointer accesses
- Languages that expose less of the memory layout would allow the reordering recommendation to be implemented easily and quickly

18 References
- Trishul M. Chilimbi, Bob Davidson, and James R. Larus, "Cache-Conscious Structure Definition," in Proceedings of the ACM SIGPLAN '99 Conference on Programming Language Design and Implementation, pages 13-24, May 1999.
- Mathprog (Weighted Matching Algorithm): http://elib.zib.de/pub/Packages/mathprog/matching/weighted/
- Pegasus: http://www-2.cs.cmu.edu/~phoenix/
- SUIF: http://suif.stanford.edu/
- SimpleScalar Tool set: http://www.cs.wisc.edu/~mscalar/simplescalar.html

