Instruction Selection Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.


1 Instruction Selection Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.

2 The Problem
Writing a compiler is a lot of work
  Would like to reuse components whenever possible
  Would like to automate construction of components
Front end construction is largely automated
Middle is largely hand crafted
(Parts of) back end can be automated
[Diagram: Front End, Middle End, Back End, on a shared Infrastructure]
Today's lecture: Automating Instruction Selection

3 Definitions
Instruction selection
  Mapping IR into assembly code
  Assumes a fixed storage mapping & code shape
  Combining operations, using address modes
Instruction scheduling
  Reordering operations to hide latencies
  Assumes a fixed program (set of operations)
  Changes demand for registers
Register allocation
  Deciding which values will reside in registers
  Changes the storage mapping, may add false sharing
  Concerns about placement of data & memory operations

4 The Problem
Modern computers (still) have many ways to do anything
Consider register-to-register copy in ILOC
  Obvious operation is i2i ri ⇒ rj
  Many others exist:
    addI    ri,0 ⇒ rj
    subI    ri,0 ⇒ rj
    lshiftI ri,0 ⇒ rj
    multI   ri,1 ⇒ rj
    divI    ri,1 ⇒ rj
    rshiftI ri,0 ⇒ rj
    orI     ri,0 ⇒ rj
    xorI    ri,0 ⇒ rj
    … and others …
Human would ignore all of these
Algorithm must look at all of them & find low-cost encoding
  Take context into account (busy functional unit?)
And ILOC is an overly-simplified case
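
A minimal sketch of what "find low-cost encoding, take context into account" could look like for the copy example. The latencies, functional-unit names, and stall penalty are made-up illustration values, not properties of ILOC or of any real machine, and `pick_copy` is a hypothetical helper.

```python
# Hypothetical cost table: encoding -> (latency in cycles, functional unit).
CANDIDATES = {
    "i2i r_i => r_j": (1, "alu"),
    "addI r_i,0 => r_j": (1, "alu"),
    "multI r_i,1 => r_j": (3, "mul"),
    "lshiftI r_i,0 => r_j": (1, "shift"),
}


def pick_copy(busy_units=frozenset(), stall=2):
    """Return the cheapest copy encoding, charging `stall` cycles for a busy unit."""
    def cost(item):
        _, (latency, unit) = item
        return latency + (stall if unit in busy_units else 0)

    # min() keeps the first entry among ties, so i2i wins when nothing is busy.
    return min(CANDIDATES.items(), key=cost)[0]


print(pick_copy())                      # nothing busy
print(pick_copy(busy_units={"alu"}))    # ALU busy: the shifter becomes cheaper
```

With an idle machine the obvious `i2i` is chosen; once the ALU is marked busy, the shift-by-zero encoding has the lower cost, which is the kind of context a human would not bother to consider.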

5 The Goal
Want to automate generation of instruction selectors
Machine description should also help with scheduling & allocation
[Diagram: description-based retargeting. Machine description → Back-end Generator → Tables → Pattern Matching Engine (in the Back End); Front End, Middle End, and Back End sit on a shared Infrastructure]

7 The Big Picture
Need pattern matching techniques
  Must produce good code (some metric for good)
  Must run quickly
A treewalk code generator runs quickly
How good was the code?
Tree: x (IDENT, IDENT)
Treewalk code:
  loadI  4       ⇒ r5
  loadAO rarp,r5 ⇒ r6
  loadI  8       ⇒ r7
  loadAO rarp,r7 ⇒ r8
  mult   r6,r8   ⇒ r9
Desired code:
  loadAI rarp,4 ⇒ r5
  loadAI rarp,8 ⇒ r6
  mult   r5,r6  ⇒ r7
Pretty easy to fix. See 1st digression in Ch. 7

9 The Big Picture
Need pattern matching techniques
  Must produce good code (some metric for good)
  Must run quickly
A treewalk code generator runs quickly
How good was the code?
Tree: x (IDENT, NUMBER)
Treewalk code:
  loadI  4       ⇒ r5
  loadAO rarp,r5 ⇒ r6
  loadI  2       ⇒ r7
  mult   r6,r7   ⇒ r8
Desired code:
  loadAI rarp,4 ⇒ r5
  multI  r5,2   ⇒ r7
Must combine these: the loadI 2 and the mult become one multI
This is a nonlocal problem

11 The Big Picture
Need pattern matching techniques
  Must produce good code (some metric for good)
  Must run quickly
A treewalk code generator can meet the second criterion
How did it do on the first?
Tree: x (IDENT, IDENT)
Treewalk code:
  loadI  @G     ⇒ r5
  loadI  4      ⇒ r6
  loadAO r5,r6  ⇒ r7
  loadI  @H     ⇒ r8
  loadI  4      ⇒ r9
  loadAO r8,r9  ⇒ r10
  mult   r7,r10 ⇒ r11
Desired code:
  loadI  4      ⇒ r5
  loadAI r5,@G  ⇒ r6
  loadAI r5,@H  ⇒ r7
  mult   r6,r7  ⇒ r8
Common offset (4 is loaded once, into r5)
Again, a nonlocal problem

12 How do we perform this kind of matching?
Tree-oriented IR suggests pattern matching on trees
  Tree-patterns as input, matcher as output
  Each pattern maps to a target-machine instruction sequence
  Use dynamic programming or bottom-up rewrite systems
Linear IR suggests using some sort of string matching
  Strings as input, matcher as output
  Each string maps to a target-machine instruction sequence
  Use text matching or peephole matching
In practice, both work well; matchers are quite different

13 Peephole Matching
Basic idea
  Compiler can discover local improvements locally
    Look at a small set of adjacent operations
    Move a "peephole" over code & search for improvement
  Classic example: store followed by load
Original code:
  storeAI r1     ⇒ rarp,8
  loadAI  rarp,8 ⇒ r15
Improved code:
  storeAI r1 ⇒ rarp,8
  i2i     r1 ⇒ r15

14 Peephole Matching
Basic idea
  Compiler can discover local improvements locally
    Look at a small set of adjacent operations
    Move a "peephole" over code & search for improvement
  Classic example: store followed by load
  Simple algebraic identities
Original code:
  addI r2,0  ⇒ r7
  mult r4,r7 ⇒ r10
Improved code:
  mult r4,r2 ⇒ r10

15 Peephole Matching
Basic idea
  Compiler can discover local improvements locally
    Look at a small set of adjacent operations
    Move a "peephole" over code & search for improvement
  Classic example: store followed by load
  Simple algebraic identities
  Jump to a jump
Original code:
       jumpI → L10
  L10: jumpI → L11
Improved code:
       jumpI → L11
  L10: jumpI → L11
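
The three examples above can be sketched as one pass with a two-operation window. This is an illustrative toy, not the deck's implementation: the tuple layout `(label, opcode, sources, destination)`, the function name `peephole`, and the `dead` set (registers assumed unused after the window, which a real compiler would get from liveness information) are all assumptions.

```python
def peephole(code, dead=frozenset()):
    """One left-to-right pass with a two-operation window.

    Each operation is a (label, opcode, sources, destination) tuple.
    `dead` names registers assumed to be unused after the window.
    """
    out, i = [], 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        label, opcode, srcs, dst = cur
        if nxt is not None:
            n_label, n_opcode, n_srcs, n_dst = nxt
            # Store followed by load of the same address: copy the stored register.
            if (opcode == "storeAI" and n_opcode == "loadAI"
                    and n_label is None and dst == n_srcs):
                out += [cur, (None, "i2i", (srcs[0],), n_dst)]
                i += 2
                continue
            # Adding zero: forward the source register if the result is dead afterwards.
            if (opcode == "addI" and srcs[1] == 0 and dst in dead
                    and n_label is None and dst in n_srcs):
                new_srcs = tuple(srcs[0] if s == dst else s for s in n_srcs)
                out.append((label, n_opcode, new_srcs, n_dst))
                i += 2
                continue
            # Jump to a jump: retarget the first jump, keep the labeled one.
            if opcode == "jumpI" and n_opcode == "jumpI" and n_label == dst:
                out.append((label, "jumpI", (), n_dst))
                i += 1
                continue
        out.append(cur)
        i += 1
    return out


code = [
    (None, "storeAI", ("r1",), ("rarp", 8)),
    (None, "loadAI", ("rarp", 8), "r15"),
    (None, "addI", ("r2", 0), "r7"),
    (None, "mult", ("r4", "r7"), "r10"),
    (None, "jumpI", (), "L10"),
    ("L10", "jumpI", (), "L11"),
]
improved = peephole(code, dead={"r7"})
for op in improved:
    print(op)
```

The labeled jump at `L10` is kept because other code may still branch to it; removing it would need a separate unreachable-code check.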

16 Peephole Matching
Implementing it
  Early systems used limited set of hand-coded patterns
  Window size ensured quick processing
Modern peephole instruction selectors (Davidson)
  Break problem into three tasks
Pipeline: IR → Expander (IR → LLIR) → LLIR → Simplifier (LLIR → LLIR) → LLIR → Matcher (LLIR → ASM) → ASM

17 Peephole Matching
Expander
  Turns IR code into a low-level IR (LLIR) such as RTL
  Operation-by-operation, template-driven rewriting
  LLIR form includes all direct effects (e.g., setting cc)
  Significant, albeit constant, expansion of size
Pipeline: IR → Expander (IR → LLIR) → LLIR → Simplifier (LLIR → LLIR) → LLIR → Matcher (LLIR → ASM) → ASM

18 Peephole Matching
Simplifier
  Looks at LLIR through window and rewrites it
  Uses forward substitution, algebraic simplification, local constant propagation, and dead-effect elimination
  Performs local optimization within window
This is the heart of the peephole system
  Benefit of peephole optimization shows up in this step
Pipeline: IR → Expander (IR → LLIR) → LLIR → Simplifier (LLIR → LLIR) → LLIR → Matcher (LLIR → ASM) → ASM

19 Peephole Matching
Matcher
  Compares simplified LLIR against a library of patterns
  Picks low-cost pattern that captures effects
  Must preserve LLIR effects, may add new ones (e.g., set cc)
  Generates the assembly code output
Pipeline: IR → Expander (IR → LLIR) → LLIR → Simplifier (LLIR → LLIR) → LLIR → Matcher (LLIR → ASM) → ASM

20 Example
Original IR code:
  OP    Arg1  Arg2  Result
  mult  2     y     t1
  sub   x     t1    w
Expand
LLIR code:
  r10 ← 2
  r11 ← @y
  r12 ← rarp + r11
  r13 ← MEM(r12)
  r14 ← r10 x r13
  r15 ← @x
  r16 ← rarp + r15
  r17 ← MEM(r16)
  r18 ← r17 - r14
  r19 ← @w
  r20 ← rarp + r19
  MEM(r20) ← r18
Annotations: t1 is r14; w is addressed through r20
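
A minimal sketch of operation-by-operation, template-driven expansion that reproduces the LLIR of this example. Several things here are assumptions made for illustration: the quadruple layout, the convention that names of the form `t<digits>` are register-resident temporaries while every other name lives in memory at `rarp + @name`, the starting register number, and ASCII `<-` standing in for the LLIR assignment arrow.

```python
import itertools
import re

SYMBOLS = {"mult": "x", "sub": "-"}  # operator spellings used in the LLIR listing


def expand(quads, first_reg=10):
    """Expand (op, arg1, arg2, result) quadruples into LLIR text lines."""
    regs = (f"r{n}" for n in itertools.count(first_reg))
    llir, temps = [], {}

    def address(name):
        # Template for the address of a memory-resident variable.
        off, addr = next(regs), next(regs)
        llir.extend([f"{off} <- @{name}", f"{addr} <- rarp + {off}"])
        return addr

    def value(x):
        if isinstance(x, int):                  # constant: load immediate
            r = next(regs)
            llir.append(f"{r} <- {x}")
            return r
        if x in temps:                          # temporary already in a register
            return temps[x]
        addr = address(x)                       # variable: compute address, then load
        r = next(regs)
        llir.append(f"{r} <- MEM({addr})")
        return r

    for op, a1, a2, result in quads:
        left, right = value(a1), value(a2)
        dst = next(regs)
        llir.append(f"{dst} <- {left} {SYMBOLS[op]} {right}")
        if re.fullmatch(r"t\d+", result):       # temporaries stay in registers
            temps[result] = dst
        else:                                   # variables are stored back to memory
            llir.append(f"MEM({address(result)}) <- {dst}")
    return llir


llir = expand([("mult", 2, "y", "t1"), ("sub", "x", "t1", "w")])
print("\n".join(llir))
```

Two IR operations become twelve LLIR operations, which illustrates the "significant, albeit constant, expansion of size" that the expander introduces.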

21 Example
LLIR code:
  r10 ← 2
  r11 ← @y
  r12 ← rarp + r11
  r13 ← MEM(r12)
  r14 ← r10 x r13
  r15 ← @x
  r16 ← rarp + r15
  r17 ← MEM(r16)
  r18 ← r17 - r14
  r19 ← @w
  r20 ← rarp + r19
  MEM(r20) ← r18
Simplify
LLIR code:
  r13 ← MEM(rarp + @y)
  r14 ← 2 x r13
  r17 ← MEM(rarp + @x)
  r18 ← r17 - r14
  MEM(rarp + @w) ← r18
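
A rough sketch of the forward substitution and dead-effect elimination that turn the twelve operations into five. It works over the whole straight-line block rather than a sliding window, and it rests on assumptions chosen for illustration: every register is defined once, no register is live after the block, and only constants, addresses, and additions are folded into their single use (loads and arithmetic results stay in registers). The nested-tuple encoding and the names `simplify`, `leaves`, and `subst` are hypothetical.

```python
from collections import Counter


def leaves(expr):
    """Yield the leaf names of an expression; expr is a str or (op, operand, ...)."""
    if isinstance(expr, str):
        yield expr
    else:
        for sub in expr[1:]:
            yield from leaves(sub)


def subst(expr, defs):
    """Replace each leaf that has a recorded definition with that definition."""
    if isinstance(expr, str):
        return defs.get(expr, expr)
    return (expr[0],) + tuple(subst(sub, defs) for sub in expr[1:])


def simplify(block):
    """block is a list of (dst, expr); dst is a register name or ("MEM", addr)."""
    uses = Counter()
    for dst, expr in block:
        uses.update(leaves(expr))
        if not isinstance(dst, str):
            uses.update(leaves(dst))
    defs, out = {}, []
    for dst, expr in block:
        expr = subst(expr, defs)
        if not isinstance(dst, str):
            dst = subst(dst, defs)
        foldable = isinstance(expr, str) or expr[0] == "+"
        if isinstance(dst, str) and uses[dst] == 1 and foldable:
            defs[dst] = expr            # forwarded into its only use; the def is dead
        else:
            out.append((dst, expr))
    return out


block = [
    ("r10", "2"), ("r11", "@y"), ("r12", ("+", "rarp", "r11")),
    ("r13", ("MEM", "r12")), ("r14", ("x", "r10", "r13")),
    ("r15", "@x"), ("r16", ("+", "rarp", "r15")), ("r17", ("MEM", "r16")),
    ("r18", ("-", "r17", "r14")),
    ("r19", "@w"), ("r20", ("+", "rarp", "r19")), (("MEM", "r20"), "r18"),
]
simplified = simplify(block)
for op in simplified:
    print(op)
```

A windowed simplifier would apply the same two ideas to a few operations at a time and would consult the target's patterns to decide what may be folded.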

22 Example
LLIR code:
  r13 ← MEM(rarp + @y)
  r14 ← 2 x r13
  r17 ← MEM(rarp + @x)
  r18 ← r17 - r14
  MEM(rarp + @w) ← r18
Match
ILOC (assembly) code:
  loadAI  rarp,@y ⇒ r13
  multI   r13,2   ⇒ r14
  loadAI  rarp,@x ⇒ r17
  sub     r17,r14 ⇒ r18
  storeAI r18     ⇒ rarp,@w
Introduced all memory operations & temporary names
Turned out pretty good code
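
A minimal sketch of the matching step for this example: each simplified LLIR operation is compared against a small set of hand-written patterns and the corresponding ILOC text is emitted. The nested-tuple encoding of LLIR, the four patterns, and the names `match_op` and `base_offset` are assumptions for illustration; a real matcher would be driven by the target's machine description and would choose among patterns by cost.

```python
def base_offset(mem):
    """Return (base, offset) if mem has the shape ("MEM", ("+", base, offset))."""
    if (isinstance(mem, tuple) and len(mem) == 2 and mem[0] == "MEM"
            and isinstance(mem[1], tuple) and len(mem[1]) == 3 and mem[1][0] == "+"):
        return mem[1][1], mem[1][2]
    return None


def match_op(dst, expr):
    """Map one LLIR operation (dst, expr) to a line of ILOC text."""
    store = base_offset(dst)
    if store and isinstance(expr, str):             # MEM(base + off) <- reg
        return f"storeAI {expr} => {store[0]},{store[1]}"
    load = base_offset(expr)
    if load and isinstance(dst, str):               # reg <- MEM(base + off)
        return f"loadAI {load[0]},{load[1]} => {dst}"
    if isinstance(expr, tuple) and len(expr) == 3:
        op, left, right = expr
        if op == "x" and left.isdigit():            # reg <- const x reg
            return f"multI {right},{left} => {dst}"
        if op == "-":                               # reg <- reg - reg
            return f"sub {left},{right} => {dst}"
    raise ValueError(f"no pattern matches {dst!r} <- {expr!r}")


simplified = [
    ("r13", ("MEM", ("+", "rarp", "@y"))),
    ("r14", ("x", "2", "r13")),
    ("r17", ("MEM", ("+", "rarp", "@x"))),
    ("r18", ("-", "r17", "r14")),
    (("MEM", ("+", "rarp", "@w")), "r18"),
]
iloc = [match_op(dst, expr) for dst, expr in simplified]
print("\n".join(iloc))
```

Operations that fit none of the patterns raise an error here; in practice the simplifier only produces shapes the pattern library can cover.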

23 Making It All Work
Details
  LLIR is largely machine independent (RTL)
  Target machine described as LLIR → ASM pattern
  Actual pattern matching
    Use a hand-coded pattern matcher (gcc)
    Turn patterns into grammar & use LR parser (VPO)
Several important compilers use this technology
It seems to produce good portable instruction selectors
Key strength appears to be late low-level optimization

