Functionality of LANCE Software structure C frontend

2 Overview
- Functionality of LANCE
- Software structure
- C frontend
- Intermediate representation (IR)
- IR optimizations
- Control and data flow analysis
- Backend interface

3 The LANCE V2.0 compiler system
Purpose of LANCE:
- Facilitate C compiler development for new target processors
- Give insight into compiler structure
Tasks covered by LANCE:
- Source code analysis
- Generation of IR
- Machine-independent optimizations
- Data flow graph generation
Tasks not covered by LANCE:
- Assembly code generation (backend)
- Machine-specific optimizations
- Code assembly and linking

4 Key features
- Full ANSI C coverage (C89)
- Modular tool and library structure
- Simple three address code IR (C subset)
- Plug & play IR optimizations
- Backend interface compatible with OLIVE
- Proven in numerous compiler projects

5 LANCE software structure
[Figure: the LANCE tools (C frontend, IR optimization 1 ... IR optimization n, machine-specific backend) operate on a common IR and use the LANCE library (lance2.h header file, liblance2.a C++ library).]

6 ANSI C frontend
Functionality:
- Lexical, syntactic, and semantic analysis of C source
- Generation of three address code IR for a C file
- Emission of error messages if required (gcc style)
- Machine-specific constants (type bitwidth, alignment) stored in a configuration file
Implementation:
- Based on a context-free C grammar, according to K&R spec
- C source automatically generated with attribute grammar compiling system (OX, extension of lex & yacc)
- In total approx. 26,000 lines of C source code
- Validated with comprehensive test suite

7 Setup and IR generation
Environment variables:
  setenv LANCE2_CPP "gcc -E"
  setenv LANCE2_CONFIG "config.sparc"
Call the C frontend with the "compile" command:
  > compile test.c
[Figure: compile reads the file test.c and config.sparc and writes the file test.ir.c.]

8 General IR format
- One IR file (*.ir.c) generated for each C source file (*.c)
- External IR format: C subset (compilable!)
- Internal IR format: accessible via LANCE library
- IR contains a symbol table + three address code (3AC) for each C function defined in the source code
- 3AC is a sequence of IR statements
- 3AC = at most two operands, one result per statement
- IR statements (mostly) consist of IR expressions
- Blocks of 3AC are augmented with source information (C code, source line no.) for debugging purposes

9 Classes of IR statements
- Assignment: a = b + c; *p = !a; x = f(y,z); cond = *x;
- Jump: goto lab;
- Conditional jump: if (cond) goto lab;
- Label: lab:
- Return void: return;
- Return value: return x;
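As an illustration of the "at most two operands, one result per statement" rule, the following is a minimal sketch of how a nested expression could be flattened into assignment statements of the kind listed above. The Expr type, the helper names, and the temporary names t1, t2, ... are assumptions made for this example; they are not part of the LANCE library, and the LANCE frontend may name and order its auxiliary variables differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical expression tree for illustration; not a LANCE class.
struct Expr {
    std::string op;                  // operator text, empty for a leaf
    std::string name;                // symbol or constant text of a leaf
    std::shared_ptr<Expr> lhs, rhs;  // operands of a binary operator
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(const std::string& name) {
    return std::make_shared<Expr>(Expr{"", name, nullptr, nullptr});
}

ExprPtr bin(const std::string& op, ExprPtr lhs, ExprPtr rhs) {
    assert(lhs && rhs);
    return std::make_shared<Expr>(Expr{op, "", lhs, rhs});
}

// Lowers e and returns the symbol that holds its value. Each operator
// becomes one statement with at most two operands and one result.
std::string lower(const ExprPtr& e, std::vector<std::string>& out, int& temps) {
    if (e->op.empty()) return e->name;
    std::string l = lower(e->lhs, out, temps);
    std::string r = lower(e->rhs, out, temps);
    std::string t = "t" + std::to_string(++temps);
    out.push_back(t + " = " + l + " " + e->op + " " + r + ";");
    return t;
}

// Builds the statement list for "result = e;".
std::vector<std::string> to_three_address(const std::string& result, const ExprPtr& e) {
    std::vector<std::string> out;
    int temps = 0;
    if (e->op.empty()) {
        out.push_back(result + " = " + e->name + ";");
        return out;
    }
    std::string l = lower(e->lhs, out, temps);
    std::string r = lower(e->rhs, out, temps);
    out.push_back(result + " = " + l + " " + e->op + " " + r + ";");
    return out;
}
```

For x = (a + b) * (c - 2), this sketch yields the three statements t1 = a + b; t2 = c - 2; x = t1 * t2;.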

10 Classes of IR expressions
- Symbol: "a", "b", "main", "count", ...
- Binary expression: a * b, x / 2, 3 ^ v, f & 4, q % r, ...
- Unary expression: !a, *p, ~x, -z, ...
- Function call: f1(), f2(a,b), f3(*x, 1, y), ...
- Type cast: (char)z, (int)a, (float*)b, ...
- String constant: "compiler", "design", "is", "fun", ...
- Integer constant: 1000, 3456, -234, -112, ...
- Float constant: " ", " ", ...

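11 Why is the LANCE IR a C subset?
Validation of frontend (or any IR optimization):
[Figure: the C source is compiled by CC into exe 1; the frontend translates the same C source into an IR-C source, which CC compiles into exe 2; both executables run on the same test input, and output 1 = output 2 is checked.]
C-to-C optimization:
[Figure: the C source passes through the IR optimization tools, which produce an optimized C source for CC.]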
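The validation idea can be tried on a very small scale without LANCE. The sketch below places an ordinary function next to a hand-written version in the style of the IR (assignments with at most two operands, labels, and conditional jumps on a symbol) and compares their results on the same inputs. The IR-style function is written by hand for this example and is not actual frontend output; the names f_source, f_ir, and outputs_match, as well as the suffixes and temporaries, are assumptions.

```cpp
#include <cassert>

// Ordinary source version: sum of squares of 0 .. n-1.
int f_source(int n) {
    int s = 0;
    for (int i = 0; i < n; i++) s += i * i;
    return s;
}

// Hand-written IR-style version of the same function (illustrative only).
int f_ir(int n_1) {
    int s_2, i_3, t1, t2, t3;
    s_2 = 0;
    i_3 = 0;
L1:
    t1 = i_3 < n_1;
    t2 = !t1;
    if (t2) goto L2;
    t3 = i_3 * i_3;
    s_2 = s_2 + t3;
    i_3 = i_3 + 1;
    goto L1;
L2:
    return s_2;
}

// Runs both versions on the test inputs 0 .. max_n and compares the outputs.
bool outputs_match(int max_n) {
    assert(max_n >= 0);
    for (int n = 0; n <= max_n; ++n) {
        if (f_source(n) != f_ir(n)) return false;
    }
    return true;
}
```

Because the IR-style version is itself compilable, the comparison needs nothing beyond a regular compiler, which is the point of keeping the IR a C subset.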

12 IR data structure overview
[Figure: IR data structure.]
- Function list: fun 1 "name1" ... fun n "name n"
- Each function has an IR statement list (stm 1, stm 2, ..., stm m) and a local symbol table (int a,b,c; ...)
- Global symbol table: int x1,x2,x3; double y1,y2,y3;
- stm info, e.g.
    Class: cond. jump, ID: 4124, Target: "L1", Condition: c
    Class: assignment, ID: 4123, Left hand side: *p, Right hand side: a + b
- exp info for an IR expression, e.g.
    Class: binary, ID: 10034, Left arg: a, Right arg: b, Oper: +, Type: int

13 The IR type class
C++ class IRType stores type info for all symbols and expressions
- Primary type: void, char, short, int, array, pointer, struct, function, ...
- Secondary type: subtype of arrays and pointers
- Storage class: extern, static, register, ...
- Qualifiers: const, volatile
Example: const int* A[100];
  Type->Class() = IRTYPE_ARRAY                          // primary type
  Type->IsConst() = true
  Type->Subtype()->Class() = IRTYPE_POINTER
  Type->Subtype()->Subtype()->Class() = IRTYPE_INT
  Type->ArrayDim() = 100
  Type->SizeOf() =                                      // in bytes, for 32-bit pointers
  Type->MemoryWords() = 200                             // for a 16-bit word memory
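The relation between the nested type structure and the size figures can be worked through with a small model. The sketch below is not the IRType class: TypeSketch, the builder functions, and the explicit int_bytes and ptr_bytes parameters are assumptions standing in for the machine-specific constants of the configuration file. Under the assumption of 4-byte pointers, an array of 100 pointers occupies 100 * 4 = 400 bytes, which corresponds to 400 / 2 = 200 words of a 16-bit word memory, in line with the MemoryWords() value of the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a type descriptor; not the LANCE IRType class.
enum class TypeClass { Int, Pointer, Array };

struct TypeSketch {
    TypeClass cls;
    bool is_const;
    std::size_t array_dim;                // used only for arrays
    std::shared_ptr<TypeSketch> subtype;  // used for arrays and pointers
};
using TypePtr = std::shared_ptr<TypeSketch>;

TypePtr int_type(bool is_const) {
    return std::make_shared<TypeSketch>(TypeSketch{TypeClass::Int, is_const, 0, nullptr});
}

TypePtr pointer_to(TypePtr target) {
    return std::make_shared<TypeSketch>(TypeSketch{TypeClass::Pointer, false, 0, target});
}

TypePtr array_of(std::size_t dim, TypePtr element) {
    return std::make_shared<TypeSketch>(TypeSketch{TypeClass::Array, false, dim, element});
}

// Size in bytes; int_bytes and ptr_bytes model machine-specific constants.
std::size_t size_of(const TypeSketch& t, std::size_t int_bytes, std::size_t ptr_bytes) {
    if (t.cls == TypeClass::Int) return int_bytes;
    if (t.cls == TypeClass::Pointer) return ptr_bytes;
    assert(t.cls == TypeClass::Array && t.subtype);
    return t.array_dim * size_of(*t.subtype, int_bytes, ptr_bytes);
}

// Number of memory words needed for the given byte count, rounded up.
std::size_t memory_words(std::size_t bytes, std::size_t word_bytes) {
    assert(word_bytes > 0);
    return (bytes + word_bytes - 1) / word_bytes;
}
```

A type for const int* A[100] would be built in this model as array_of(100, pointer_to(int_type(true))).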

14 The symbol table class
Symbol table stores all relevant information for symbols/identifiers
Two hierarchy levels:
- Global symbol table: IR->GlobalSymbolTable()
- One local symbol table per function: fun->LocalSymbolTable()
All local symbols get a unique numerical suffix, e.g.
  int f(int x) { int a,b; }
becomes
  int f(int x_1) { int a_2, b_3; }
Important access methods:
- ST->LookupSymbol(char* name)
- IRSymbol* ST->CreateSymbol(IRType* tp)
- Iterators: ST->FirstObject(), ST->NextObject()
Information stored in a table entry (class IRSymbol):
- Symbol type: IRType* sym->Type()
- Symbol name: char* sym->Name()
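The unique-suffix scheme can be pictured with a toy table. The class below is a hypothetical sketch and not the LANCE symbol table: the names LocalSymbolTableSketch, create, and lookup are invented for this example, and sharing one counter between tables is an assumption about how uniqueness might be achieved.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch of a local symbol table; not the LANCE class.
class LocalSymbolTableSketch {
public:
    // The counter is shared by all tables so that suffixes stay unique
    // across functions (an assumption made for this sketch).
    explicit LocalSymbolTableSketch(int& counter) : counter_(counter) {}

    // Registers a source-level name and returns its unique IR name.
    std::string create(const std::string& source_name) {
        assert(ir_names_.count(source_name) == 0);
        std::string ir_name = source_name + "_" + std::to_string(++counter_);
        ir_names_[source_name] = ir_name;
        return ir_name;
    }

    // Returns the IR name, or an empty string for an unknown symbol.
    std::string lookup(const std::string& source_name) const {
        auto it = ir_names_.find(source_name);
        return it == ir_names_.end() ? std::string() : it->second;
    }

private:
    int& counter_;
    std::map<std::string, std::string> ir_names_;
};
```

Registering x, a, and b in this order reproduces the names x_1, a_2, and b_3 of the example.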

15 IR generation example
[Figure: a source file next to its generated IR file, annotated with: forward declaration, automatic conversion, suffix 3 for parameter i, auxiliary vars, debug info.]

16 IR optimization tools
Purpose: perform machine-independent optimizations on IR
Identical IR format for all tools, "plug & play" concept
Currently available tools:
- Constant folding: cfold tool
- Constant propagation: constprop tool
- Copy propagation: copyprop tool
- Common subexpression elimination: cse tool
- Dead code elimination: dce tool
- Jump optimization: jmpopt tool
- Loop invariant code motion: licm tool
- Induction variable elimination: ive tool
Automatic iteration of IR optimizations via "iropt" shell script
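Why iterating the optimizations pays off can be seen with two of them on straight-line code: propagating a constant exposes a foldable operator, and the folded result can be propagated further. The following is a simplified sketch of that interplay, not the implementation of cfold, constprop, or iropt. The Operand and Stmt types and all function names are assumptions, and only straight-line code with the operators +, -, and * is handled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Operand {
    bool is_const;
    int value;         // valid when is_const
    std::string name;  // valid when !is_const
};
Operand constant(int v) { return Operand{true, v, ""}; }
Operand symbol(const std::string& n) { return Operand{false, 0, n}; }

// dst = a op b, or the copy dst = a when op == '=' (b is then unused).
struct Stmt {
    std::string dst;
    char op;
    Operand a, b;
};

// Constant folding: evaluate operators whose operands are both constants.
bool fold_constants(std::vector<Stmt>& code) {
    bool changed = false;
    for (Stmt& s : code) {
        if (s.op == '=' || !s.a.is_const || !s.b.is_const) continue;
        int r;
        switch (s.op) {
            case '+': r = s.a.value + s.b.value; break;
            case '-': r = s.a.value - s.b.value; break;
            case '*': r = s.a.value * s.b.value; break;
            default: continue;  // unknown operator: leave the statement alone
        }
        s = Stmt{s.dst, '=', constant(r), constant(0)};
        changed = true;
    }
    return changed;
}

// Constant propagation for straight-line code: replace uses of symbols
// whose most recent definition is a constant.
bool propagate_constants(std::vector<Stmt>& code) {
    bool changed = false;
    std::map<std::string, int> known;
    for (Stmt& s : code) {
        for (Operand* o : {&s.a, &s.b}) {
            if (o->is_const) continue;
            auto it = known.find(o->name);
            if (it != known.end()) { *o = constant(it->second); changed = true; }
        }
        if (s.op == '=' && s.a.is_const) known[s.dst] = s.a.value;
        else known.erase(s.dst);  // dst no longer holds a known constant
    }
    return changed;
}

// Repeats both passes until neither changes the code any more.
void optimize(std::vector<Stmt>& code) {
    bool changed = true;
    while (changed) {
        changed = propagate_constants(code);
        changed = fold_constants(code) || changed;
    }
}

// Reads the constant assigned by a fully folded statement.
int value_of(const Stmt& s) {
    assert(s.op == '=' && s.a.is_const);
    return s.a.value;
}
```

For a = 2; b = a * 3; c = b + 1; the first round turns b into the constant 6, and only the second round can turn c into 7, which is why a single pass of each tool is not enough.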

17 IR optimization example
[Figure: the C source code is translated by compile into the unoptimized IR.]

18 Constant folding cfold

19 Constant propagation constprop

20 Copy propagation copyprop

21 Common subexpression elimination
cse

22 Dead code elimination dce

23 Jump optimization jmpopt

24 Loop invariant code motion
licm

25 Induction variable elimination
ive

26 Control flow analysis
- Purpose: identify basic block structure of a C function
- Basic block (BB): IR statement sequence with unique entry and exit points
- Control flow graph (CFG): one node per BB, edge (BB1, BB2) iff BB2 may be an immediate successor of BB1 during execution
- Assembly code generation is usually done BB after BB
Example:
  while (x) { BB1; if (x) then BB2; else BB3; BB4; }
[Figure: CFG of the example with nodes BB1, BB2, BB3, BB4.]
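A common textbook way to find the basic block structure is the leader method: the first statement, every label, and every statement following a jump or return start a new block. The sketch below applies it to a flat statement list and then derives the CFG edges. It is not the LANCE ControlFlowGraph class; Kind, Stmt, Cfg, and build_cfg are hypothetical names, and statements are reduced to their kind plus a label name.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Plain, Label, Jump, CondJump, Return };

// label: the own name of a Label, or the target of a Jump/CondJump.
struct Stmt {
    Kind kind;
    std::string label;
};

struct Cfg {
    std::vector<std::pair<int, int>> blocks;  // [first, last] statement index
    std::set<std::pair<int, int>> edges;      // (from block, to block)
};

Cfg build_cfg(const std::vector<Stmt>& code) {
    Cfg g;
    const int n = static_cast<int>(code.size());
    if (n == 0) return g;

    // 1. Leaders: first statement, labels, statements after a control transfer.
    std::vector<bool> leader(n, false);
    leader[0] = true;
    for (int i = 0; i < n; ++i) {
        Kind k = code[i].kind;
        if (k == Kind::Label) leader[i] = true;
        bool transfers = k == Kind::Jump || k == Kind::CondJump || k == Kind::Return;
        if (transfers && i + 1 < n) leader[i + 1] = true;
    }

    // 2. Blocks: each leader opens a block that extends to the next leader.
    std::map<std::string, int> label_block;
    for (int i = 0; i < n; ++i) {
        if (leader[i]) g.blocks.push_back(std::make_pair(i, i));
        else g.blocks.back().second = i;
        if (code[i].kind == Kind::Label)
            label_block[code[i].label] = static_cast<int>(g.blocks.size()) - 1;
    }

    // 3. Edges: fall-through to the next block and jumps to label blocks.
    const int nb = static_cast<int>(g.blocks.size());
    for (int b = 0; b < nb; ++b) {
        const Stmt& last = code[g.blocks[b].second];
        bool falls_through = last.kind != Kind::Jump && last.kind != Kind::Return;
        if (falls_through && b + 1 < nb) g.edges.insert(std::make_pair(b, b + 1));
        if (last.kind == Kind::Jump || last.kind == Kind::CondJump) {
            auto it = label_block.find(last.label);
            assert(it != label_block.end());  // every jump target must exist
            g.edges.insert(std::make_pair(b, it->second));
        }
    }
    return g;
}
```

For a loop of the form L1: if (c) goto L2; body; goto L1; L2: return; the sketch produces three blocks and the edges (0,1), (0,2), and (1,0).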

27 CFG generation by LANCE
- Class ControlFlowGraph contained in LANCE library
- Constructor ControlFlowGraph(Function* fun) generates CFG for any function fun
- LANCE tool showcfg exports CFGs in the VCG text format
- VCG can be used to visualize generated CFGs
[Figure: IR file -> showcfg -> VCG file -> xvcg -> CFG]

28 CFG visualization example
showcfg + VCG tool

29 Data flow analysis
- Goal: convert IR into data flow graph (DFG) representation for assembly code generation by tree pattern matching
- Performed by def/use analysis between IR statements/expressions
- LANCE lib class DataFlowAnalysis provides required methods
- Constructor DataFlowAnalysis(Function* fun) constructs data flow information for any function fun
Example:
  x = 5; goto lab; ... x = 6; lab: y = x + 1; z = 1 - y; u = y / 5;
- The use of x in y = x + 1 has two definitions: x = 5 and x = 6
- The definition of y has two uses: in z = 1 - y and in u = y / 5
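The def/use facts of this example can be reproduced with a standard iterative reaching-definitions computation at statement granularity. The sketch below is not the LANCE DataFlowAnalysis class: the Stmt layout (defined symbol, used symbols, successor indices) and the function names are assumptions, and the elided code between goto lab; and x = 6; is left out.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Stmt {
    std::string def;                // symbol defined here, empty if none
    std::vector<std::string> uses;  // symbols read here
    std::vector<int> succ;          // indices of possible successor statements
};

// Returns, per statement, the indices of definitions that may reach it.
std::vector<std::set<int>> reaching_definitions(const std::vector<Stmt>& code) {
    const int n = static_cast<int>(code.size());
    std::vector<std::set<int>> in(n), out(n);
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < n; ++i) {
            // out = own definition plus incoming definitions it does not kill
            std::set<int> new_out;
            for (int d : in[i])
                if (code[i].def.empty() || code[d].def != code[i].def) new_out.insert(d);
            if (!code[i].def.empty()) new_out.insert(i);
            if (new_out != out[i]) { out[i] = new_out; changed = true; }
            for (int s : code[i].succ) {
                assert(0 <= s && s < n);
                for (int d : out[i])
                    if (in[s].insert(d).second) changed = true;
            }
        }
    }
    return in;
}

// Definitions of var that may reach the statement use_stmt.
std::set<int> definitions_reaching(const std::vector<Stmt>& code,
                                   const std::vector<std::set<int>>& in,
                                   int use_stmt, const std::string& var) {
    std::set<int> result;
    for (int d : in[use_stmt])
        if (code[d].def == var) result.insert(d);
    return result;
}

// Statements that may read the value defined by def_stmt.
std::set<int> uses_of(const std::vector<Stmt>& code,
                      const std::vector<std::set<int>>& in, int def_stmt) {
    std::set<int> result;
    for (int i = 0; i < static_cast<int>(code.size()); ++i) {
        if (in[i].count(def_stmt) == 0) continue;
        for (const std::string& u : code[i].uses)
            if (u == code[def_stmt].def) result.insert(i);
    }
    return result;
}
```

With the statements numbered 0 (x = 5), 1 (goto lab), 2 (x = 6), 3 (y = x + 1), 4 (z = 1 - y), and 5 (u = y / 5), the definitions of x reaching statement 3 are {0, 2}, and the uses of the definition in statement 3 are {4, 5}.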

30 DFG visualization example
showdfg + VCG tool

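31 Backend interface
- LANCE lib classes LANCEDataFlowTree and DFTManager provide the link between LANCE IR and tree pattern matching
- OLIVE/IBURG accept only trees instead of general DFGs
- Hence: split DFGs at the common subexpressions (CSEs)
[Figure: a DFG in which the product a * b is a CSE used by two + nodes (together with c and 2) that compute x and y; after splitting, the product is stored in an auxiliary variable t, and x and y are computed by separate trees that read t.]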
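Splitting at CSEs can be sketched as follows: every operator node with more than one use receives an auxiliary variable and becomes the root of its own tree, and all other trees refer to that variable instead of the shared node. The code is an illustration only and not the LANCEDataFlowTree or DFTManager implementation; the Node layout, the function names, and the auxiliary names t1, t2, ... are assumptions, and the test graph merely resembles the situation described for the figure.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// DAG node: text is the operator of an inner node or the symbol/constant
// of a leaf. Leaves have lhs == rhs == -1. Children must have smaller
// indices than their parents.
struct Node {
    std::string text;
    int lhs;
    int rhs;
};

// Prints the tree rooted at n; shared operator nodes other than the root
// itself are replaced by their auxiliary variable.
std::string tree_text(const std::vector<Node>& g, const std::map<int, std::string>& aux,
                      int n, bool is_root) {
    const Node& node = g[n];
    if (node.lhs < 0) return node.text;
    auto it = aux.find(n);
    if (!is_root && it != aux.end()) return it->second;
    return "(" + tree_text(g, aux, node.lhs, false) + " " + node.text + " " +
           tree_text(g, aux, node.rhs, false) + ")";
}

// roots: (result symbol, node index) pairs. Returns one assignment per tree.
std::vector<std::string> split_into_trees(
        const std::vector<Node>& g,
        const std::vector<std::pair<std::string, int>>& roots) {
    // Count how often each node is used by a parent or as a result.
    std::vector<int> uses(g.size(), 0);
    for (int i = 0; i < static_cast<int>(g.size()); ++i) {
        if (g[i].lhs < 0) continue;
        assert(g[i].lhs < i && g[i].rhs < i);
        ++uses[g[i].lhs];
        ++uses[g[i].rhs];
    }
    for (const auto& r : roots) ++uses[r.second];

    // Every operator node with more than one use gets its own tree.
    std::map<int, std::string> aux;
    std::vector<std::string> trees;
    for (int i = 0; i < static_cast<int>(g.size()); ++i) {
        if (g[i].lhs < 0 || uses[i] <= 1) continue;
        std::string t = "t" + std::to_string(aux.size() + 1);
        aux[i] = t;
        trees.push_back(t + " = " + tree_text(g, aux, i, true));
    }
    for (const auto& r : roots)
        trees.push_back(r.first + " = " + tree_text(g, aux, r.second, false));
    return trees;
}
```

For x = c + a * b and y = a * b + 2 with a shared product node, the sketch yields the three trees t1 = (a * b), x = (c + t1), and y = (t1 + 2).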

32 Data structure overview
- Constructor DFTManager(Function* fun) generates data flow tree (DFT) representation for an entire function fun
- DFTManager contains internal list of basic blocks
- Each BB in turn is a list of DFTs
[Figure: list of basic blocks BB 1, BB 2, ..., BB n, each holding a list of trees DFT 1, DFT 2, ..., DFT m.]

33 DFT covering with OLIVE
- DFTs are directly in the format required by code generators produced by OLIVE
- All DFTs consist of a fixed set of terminal symbols (e.g. cs_STORE), specified in file INCL/termlist.c
Example (only a single DFT):
[Figure: C file, IR file, and DFT representation.]

34 Example (cont.)
[Figure: DFT in OLIVE format, simplified OLIVE spec, and assembly code for a hypothetical machine.]

35 Summary
LANCE provides you with ...
- C frontend
- IR optimizations
- C++ library for IR access (+ important basic classes)
- interface to OLIVE data flow trees
A full C compiler additionally requires ...
- OLIVE based backend for the concrete target machine
- target-specific optimizations (e.g. scheduling, address gen.)

