Runtime Organization
Overview
  - Program Organization
  - Memory pools
      - Static
      - Automatic
      - Dynamic
  - Activation Records
  - Parameter Passing Modes
  - Symbol Table
Terminology
  - Executable
      - A file that can be run as a program, either directly by the CPU or through an interpreter or virtual machine
      - Examples: a bash script, a Perl script, a ‘compiled’ Java program, a compiled C program, ...
  - Native executable
      - A file containing machine code that the CPU understands without any intervening “layers” of abstraction
      - Examples: a compiled C program, a Java program compiled natively (with GCJ)
Virtual Address Space
  - Traditional organization (address 0x0 at the bottom, 0xffffffff at the top)
      - Code area at the bottom
      - Static data above the code
          - Constants
          - Static strings
          - Static variables
      - Heap: grows upward
      - Stack: grows downward
      - Lots of free VM in between
Zooming In: a close look at the code area
Execution Stack
  - A memory area at the top of the VM
      - Grows downward
      - Grows on demand (with OS collaboration)
  - Purpose
      - Automatic storage for local variables
Memory Pools
  - Where does memory come from?
  - Three pools
      - Static
      - Automatic
      - Dynamic
Static Pool
  - Content
      - All the static “strings” that appear in the program
      - All the static constants used in the program
      - All the static variables declared in the program: static int, static arrays, static records, static ...
  - Allocation?
      - Well... it is static: all the sizes are determined at compile time
      - Cannot grow or shrink
Dynamic Pool
  - Content
      - Anything allocated by the program at runtime
  - Allocation (depends on the language)
      - C: malloc
      - C++/Java/C#: new
      - ML/Lisp/Scheme: implicit
  - Deallocation
      - C: free
      - C++: delete
      - Java/C#/ML/Lisp/Scheme: garbage collection
Automatic Pool
  - Content
      - Local variables
      - Actuals (arguments to methods/functions/procedures)
  - Allocation
      - Automatic when calling a method/function/procedure
  - Deallocation
      - Automatic when returning from a method/function/procedure
  - Management policy
      - Stack-like
Activation Records
  - Also known as “frames”
  - A record pushed on the execution stack
Creating the Frame
  - Three actors
      - The caller
      - The CPU
      - The callee
  - Example
      int foo(int x, int y) { ... }
      bar() { x = foo(3, y); }
  - Figure label: Actual Function Call
Closeup on management data
Returning From a Call
  - Easy
  - The RET instruction simply
      - Accesses the management area through FP
      - Restores SP
      - Restores FP
      - Transfers control to the return address
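
The frame creation and return steps above can be sketched as a toy simulation. This is only an illustrative model, not any real CPU or calling convention: the `Machine` class, its `call` and `ret` helpers, and the exact frame layout (actuals, then return address, then saved FP, then locals) are assumptions made for the example. The stack grows downward, as in the slides.

```python
class Machine:
    """Toy model of a downward-growing execution stack (hypothetical layout)."""

    def __init__(self, size=32):
        self.mem = [None] * size
        self.sp = size   # stack pointer: index of the current top of stack
        self.fp = size   # frame pointer: base of the current frame

    def push(self, value):
        self.sp -= 1                 # the stack grows toward lower addresses
        self.mem[self.sp] = value

    def call(self, args, return_address, n_locals):
        # Caller: push the actuals (last argument first).
        for arg in reversed(args):
            self.push(arg)
        # CPU (the actual call): push the return address.
        self.push(return_address)
        # Callee: save the old FP, set the new FP, reserve the locals.
        self.push(self.fp)
        self.fp = self.sp
        for _ in range(n_locals):
            self.push(0)

    def ret(self, n_args):
        # Reach the management data through FP.
        self.sp = self.fp            # restore SP (discards the locals)
        self.fp = self.mem[self.sp]  # restore the caller's FP
        self.sp += 1
        return_address = self.mem[self.sp]
        self.sp += 1
        self.sp += n_args            # assumption: the actuals are popped here too
        return return_address


m = Machine()
m.call([3, 7], return_address=100, n_locals=2)
first_actual = m.mem[m.fp + 2]       # actuals sit above the management data
resume_at = m.ret(n_args=2)
```

In this sketch the callee finds its first actual at a fixed offset from FP, and after `ret` both SP and FP are back where they were before the call.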
Parameter Passing Modes
  - A few options
      - By value
      - By reference
      - By value/result
      - By name
Call By Value
  - Simple strategy
      - Push a copy of the argument on the stack
      - Size depends on the argument type
      - Any write operation overwrites the copy
      - The copy is automatically discarded on exit
Call By Reference
  - Simple strategy too
      - Place the address of the actual on the stack
      - A write operation simply follows the pointer
  - By-reference is identical to by-address
      - The only difference is in the syntax
  - Advantages?
Value vs. Reference
  - Pass by value
      - Called routine cannot modify the actual parameter
      - Pro: safe
      - Con: copying may be time consuming
  - Pass by reference
      - Called routine can modify the actual parameter
      - Pro: only an address has to be passed, so it is efficient
      - Con: requires an extra level of indirection
Language Specific Variations
  - Pascal: call by value is the default; the keyword VAR denotes call by reference
  - Fortran: all parameters are passed by reference
  - Smalltalk, Lisp: the actual parameter is already a reference to the object
  - C: always passed by value
  - Java: always passed by value; for objects, the value passed is a reference to the object, so the callee can modify the object but cannot rebind the caller's variable
Call By Name
  - Slightly more involved
      - The actual is not evaluated at the call
      - The program fragment that evaluates the actual is “wrapped up” so that it can be evaluated later on
      - Each time the formal is used, the actual gets evaluated
  - It is a lot more like macro expansion!
  - Advantage?
Call by Name
  - Best known from Algol 60
  - Arguments are not evaluated until their actual use in the called program
  - The actual parameter is re-evaluated on every use
      - For actual parameters that are simple variables, it is the same as call by reference
      - For actual parameters that are expressions, the expression is re-evaluated on each access
  - Few later languages adopted call by name (Scala's by-name parameters are one exception)
Ada Parameter Modes
  - Three parameter passing modes
      - In: passes information from the caller to the callee; the callee can read but not write
          - Call by value
      - Out: passes information from the callee to the caller; the callee can write but not read
          - Call by result (the formal parameter is copied to the actual parameter when the subroutine exits)
      - Inout: passes information in both directions
          - Call by value/result
Class Practice
      a = 2;
      f(a, a);
      print a;

      f(y, z) { y = y + 1; z = z + 1; }
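
One way to check an answer to this exercise is to model each passing mode explicitly. The sketch below is a hedged illustration only: the `Cell` class (standing in for a variable's storage) and the helper names `f_body`, `call_by_value`, `call_by_reference`, `call_by_value_result`, and `run` are hypothetical, and the left-to-right copy-out order for value/result is an assumption.

```python
class Cell:
    """A mutable storage location, standing in for a variable's memory."""

    def __init__(self, value):
        self.value = value


def f_body(y, z):
    # f(y, z) { y = y + 1; z = z + 1; } written against storage cells
    y.value = y.value + 1
    z.value = z.value + 1


def call_by_value(a):
    # Each formal gets its own copy; the actual is never touched.
    f_body(Cell(a.value), Cell(a.value))


def call_by_reference(a):
    # Both formals alias the same actual.
    f_body(a, a)


def call_by_value_result(a):
    # Copy in, run the body on the copies, then copy out (assumed left to right).
    y, z = Cell(a.value), Cell(a.value)
    f_body(y, z)
    a.value = y.value
    a.value = z.value


def run(mode):
    a = Cell(2)      # a = 2;
    mode(a)          # f(a, a);
    return a.value   # print a;
```

Under these assumptions, `run(call_by_value)` gives 2, `run(call_by_reference)` gives 4, and `run(call_by_value_result)` gives 3.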
Class Practice
      int k = 0, i = 0, j = 0;
      procedure sub1(x: int; y: int; z: int);
      begin
        k := 1;
        y := x;
        k := 5;
        z := x;
      end;
      sub1(k+1, j, i);
      print(i, j, k);
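
The same exercise can be traced with a small model in which the body of sub1 reaches its formals through accessor functions. This is a sketch under stated assumptions, not a definitive semantics: the names `sub1`, `run`, `get_x`, `set_y`, and `set_z` are hypothetical, and for call by reference the expression actual `k+1` is assumed to be evaluated once into a temporary at the call.

```python
def sub1(get_x, set_y, set_z, g):
    """Body of sub1; g holds the globals, the formals are reached via accessors."""
    g["k"] = 1
    set_y(get_x())   # y := x
    g["k"] = 5
    set_z(get_x())   # z := x


def run(mode):
    g = {"k": 0, "i": 0, "j": 0}

    if mode == "name":
        # Call by name: the actual k+1 is wrapped in a thunk, re-evaluated per use.
        get_x = lambda: g["k"] + 1
    else:
        # Value and reference: k+1 is evaluated once, at the call.
        x = g["k"] + 1
        get_x = lambda: x

    if mode == "value":
        # Writes go to local copies and are discarded on return.
        local = {"y": g["j"], "z": g["i"]}
        set_y = lambda v: local.__setitem__("y", v)
        set_z = lambda v: local.__setitem__("z", v)
    else:
        # Reference and name: y aliases j, z aliases i.
        set_y = lambda v: g.__setitem__("j", v)
        set_z = lambda v: g.__setitem__("i", v)

    sub1(get_x, set_y, set_z, g)   # sub1(k+1, j, i);
    return g["i"], g["j"], g["k"]  # print(i, j, k);
```

With this model, `run("value")` returns `(0, 0, 5)`, `run("reference")` returns `(1, 1, 5)`, and `run("name")` returns `(6, 2, 5)`, because the thunk sees the updated `k` each time `x` is used.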
Class Practice
      x: int;
      x = 2;
      foo(x);
      print x;

      procedure foo(y: int)
        y = 3;
Scope Rules
  - Scope of a binding: a program region (textually) in which the binding is active
  - Scope: a program region of maximal size in which no bindings change
  - Scoping can be
      - Lexical or static (bindings known at compile time)
      - Dynamic (bindings depend on the flow of execution at run time)
Static Scope
  - Current binding for a name: the one encountered most recently in a top-to-bottom scan of the program text
  - Nested subroutines
Nested Subroutines in Pascal
  - Visible
      - P2, P4 within P1
      - P3 within P2
      - F1 within P4
  - Which X: in F1? In P4? In P2?
Static Chains
  - Nested calls: A, E, B, D, C
  - Static link: points to the frame of the most recent invocation of the lexically surrounding subroutine
Dynamic Scope
  - The current binding for a given name is the one encountered most recently during execution, and not yet destroyed by returning from its scope
Dynamic Scope Example
  - What does the program print?
      - Under static scoping? 1, regardless of the value read
      - Under dynamic scoping? 2 if the value read is > 0; 1 if the value read is <= 0
  - Dynamic scoping is usually a bad idea
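
The program this slide refers to is not part of the text, so the sketch below uses an assumed reconstruction that is consistent with the stated answers: a global `a`, a procedure `first` that assigns 1 to `a`, a procedure `second` that declares its own local `a` and calls `first`, and a main body that sets `a` to 2, calls `second` or `first` depending on the value read, and then prints `a`. All names here (`run`, `resolve`, `first`, `second`) are hypothetical.

```python
def run(value_read, scoping):
    """Return what the assumed example program prints under the given scoping rule."""
    globals_frame = {"a": None}
    stack = [globals_frame]          # run-time stack of binding frames

    def resolve(name):
        if scoping == "static":
            # first is declared at the outermost level, so its free `a`
            # always refers to the global, whoever called it.
            return globals_frame
        # Dynamic scoping: the most recent binding still active wins.
        for frame in reversed(stack):
            if name in frame:
                return frame
        raise NameError(name)

    def first():
        resolve("a")["a"] = 1        # a := 1

    def second():
        stack.append({"a": None})    # second declares its own local a
        first()
        stack.pop()                  # the local a dies on return

    globals_frame["a"] = 2           # main: a := 2
    if value_read > 0:
        second()
    else:
        first()
    return globals_frame["a"]        # main prints the global a
```

Under static scoping the result is 1 for any input. Under dynamic scoping, a positive input makes `first` assign to the local `a` of `second`, so the global keeps the value 2.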
Shallow & Deep Binding
  - When a subroutine is passed as a parameter, when is the referencing environment bound?
      - Shallow binding: when the function is called
      - Deep binding: when the function is first passed as a parameter
  - Important in both dynamic and static scoping
      int max_score;
      float scale_score (int raw_score) {
          return (float) raw_score / (float) max_score;
      }
      float highest_score (int[] scores, function_ptr scaling_function) {
          float max_score = 0;
          foreach score in scores {
              float percent = scaling_function (score);    /* function is called */
              if ( percent > max_score )
                  max_score = percent;
          }
          return max_score;
      }
      main() {
          max_score = 50;
          int[] scores = ...
          print highest_score (scores, scale_score);       /* reference is created */
      }
Example
  - Most appropriate for (dynamic scoping assumed)
      - older_than: deep binding (to get the global threshold)
      - print_person: shallow binding (to get the locally set line_length)
  - threshold: integer
      var x : integer;    /* global variable */
      procedure Update
        x := x + 3;
      procedure DoCall(P: procedure)
        var x : integer;
        x := 4;
        P();
        write_integer(x);
      begin    /* body of main program */
        x := 0;
        DoCall(Update);
      end      /* body of main program */
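
To see how the binding rule changes what this program writes, the following sketch simulates it under dynamic scoping. It is an illustration under assumptions, not a language implementation: the frame stack, the `run`, `frame_of`, `update`, and `do_call` helpers, and the modelling of deep binding as a snapshot of the visible frames taken when Update is passed are all choices made for the example.

```python
def run(binding):
    """Return the value DoCall writes, under dynamic scoping with the given binding rule."""
    stack = [{"x": 0}]                        # main program: global x := 0

    def frame_of(name, frames):
        # Dynamic scoping: the most recent active binding wins.
        for frame in reversed(frames):
            if name in frame:
                return frame
        raise NameError(name)

    def update(frames):
        frame_of("x", frames)["x"] += 3       # x := x + 3

    def do_call(p):
        stack.append({"x": 4})                # var x : integer; x := 4
        p()                                   # P()
        printed = frame_of("x", stack)["x"]   # write_integer(x)
        stack.pop()
        return printed

    if binding == "deep":
        env = list(stack)   # environment fixed when Update is passed as a parameter
    elif binding == "shallow":
        env = stack         # environment looked up when P is actually called
    else:
        raise ValueError(binding)
    return do_call(lambda: update(env))
```

In this model `run("shallow")` gives 7, because Update sees the local x of DoCall, while `run("deep")` gives 4, because Update modifies the global x and the local one is printed unchanged.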
Symbol Table
  - A “dictionary” that maps names to the information the compiler knows about each name
  - What names?
      - Variable and procedure names
      - Literal constants and strings
  - What info?
      - Textual name
      - Data type
      - Declaring procedure
      - Lexical level of declaration
      - If array, number and size of dimensions
      - If procedure, number and type of parameters
Sample Program
      program gcd (input, output);
      var i, j : integer;
      begin
        read (i, j);
        while i <> j do
          if i > j then i := i - j
          else j := j - i;
        writeln (i)
      end.
AST for gcd (figure)
  - Each symbol table entry holds: name, category (var, const, type, procedure, field name, parameter), scope number, type (a pointer to another symbol table entry), and other information specific to that category
  - Later: location in memory
Symbol Table Implementation
  - Usually implemented as hash tables
  - Return the closest lexical declaration to handle nested lexical scoping
  - How to handle multiple scopes?
  - May need to save information for debugging (how do I print out student1.grade[4]?)
One Option
  - Use one symbol table per scope
  - Chain each table to its enclosing scope
  - Insert names in the table for the current scope
  - Start name lookup in the current table, checking enclosing scopes in order if needed
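
The one-table-per-scope option can be sketched in a few lines. This is a minimal illustration, assuming a Python dict as the hash table; the `Scope` class and its `insert` and `lookup` methods are hypothetical names, and the stored info is just a string.

```python
class Scope:
    """One symbol table per scope, chained to the enclosing scope."""

    def __init__(self, parent=None):
        self.names = {}        # hash table for this scope only
        self.parent = parent   # link to the enclosing scope (None at the outermost level)

    def insert(self, name, info):
        # New names always go into the table of the current scope.
        self.names[name] = info

    def lookup(self, name):
        # Start in the current table, then check enclosing scopes in order.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        return None            # undeclared name


global_scope = Scope()
global_scope.insert("x", "global integer")
global_scope.insert("gcd", "procedure")

inner = Scope(parent=global_scope)
inner.insert("x", "local real")
```

Here `inner.lookup("x")` finds the local declaration, `inner.lookup("gcd")` falls through to the enclosing scope, and `global_scope.lookup("x")` still sees the global one.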
LeBlanc-Cook Symbol Tables
  - Give each scope a number
      - 0 = predefined language names
      - 1 = globals
      - Etc. (numbers are distinct and do not indicate order of nesting)
  - All names go in a single hash table, keyed by name
  - A scope stack shows the current referencing environment
      - As the analyzer looks at the program, it pushes and pops this stack as it enters and leaves scopes
  - To look up a name: scan down the appropriate hash chain; for each matching entry, scan down the scope stack to see whether that entry's scope is visible
      - We look no deeper than the top-most scope
  - Scope includes subroutines having their own symbol table
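
The lookup described above can be sketched as follows. This is a simplified illustration of the idea, not the full LeBlanc-Cook scheme: it keeps one dict keyed by name plus a scope stack, picks the matching entry whose scope is nearest the top of the stack, and leaves out refinements such as closed scopes or imports and exports. The class and method names are hypothetical.

```python
class LeBlancCookTable:
    """Single table keyed by name, plus a stack of currently open scope numbers."""

    def __init__(self):
        self.table = {}        # name -> list of (scope_number, info) entries
        self.scope_stack = []  # current referencing environment, innermost last
        self.next_scope = 0    # scope numbers are distinct, not nesting depths

    def enter_scope(self):
        number = self.next_scope
        self.next_scope += 1
        self.scope_stack.append(number)
        return number

    def leave_scope(self):
        # Entries stay in the table (e.g. for a debugger); only visibility changes.
        self.scope_stack.pop()

    def insert(self, name, info):
        self.table.setdefault(name, []).append((self.scope_stack[-1], info))

    def lookup(self, name):
        best, best_depth = None, None
        for scope, info in self.table.get(name, []):   # scan the chain for this name
            # Scan down the scope stack to see whether this entry is visible.
            for depth, open_scope in enumerate(reversed(self.scope_stack)):
                if open_scope == scope:
                    if best_depth is None or depth < best_depth:
                        best, best_depth = info, depth
                    break
        return best


symtab = LeBlancCookTable()
symtab.enter_scope()                      # scope 0: predefined names
symtab.insert("integer", "predefined type")
symtab.enter_scope()                      # scope 1: globals
symtab.insert("x", "global x")
symtab.enter_scope()                      # scope 2: some subroutine
symtab.insert("x", "local x")
inside = symtab.lookup("x")
symtab.leave_scope()
outside = symtab.lookup("x")
```

In this sketch the entry for the local `x` remains in the table after its scope is left, but it is no longer found because scope 2 is no longer on the scope stack.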