CS 363 – Chapter 3 What do we mean by names, scopes, bindings? Names Aspect of PL design (3.1) How does an object get a name? Binding time & object life-cycle (3.2) Where does an object live? Memory allocation strategies Book has chapters on design, and other chapters on implementation (compiler)
“Hi, what’s your name?” “Bob.” “When did you get that name?” “When I was compiled.”
Time
Choices can happen at any of 7 possible times. One choice is “binding”: giving a name to an object.
  Lang design: available types
  Lang implementation: I/O issues
  At program writing: you choose var names
  Compile time: alloc registers/mem
  Link time: virtual addresses
  Load time: initial physical addr.
  Run time: mem addresses change
Names of objects At program-writing time, we give variables names like “maxValue” or “a[5]”. At compile time, the compiler gives low-level names like %o1 or 16($sp), or an address like 0x401c. High-level names are kept in the symbol table. At run time, a name may be re-used for other objects, and the address of your object may change.
Binding time When a value/object gets its name. Early decision = more efficient code (if we know where something is at compile time, e.g. 2 variables can share the same register) but less flexible (that name gets reserved). Ex. Allocating a value to a register. Ex. Compiler can optimize operations for pre-existing types, maybe not for types we define.
Object life-cycle (By object, we mean memory location.) Object created. Name given to object (can be bound when object created). Object used via its name. Name disassociated from object. Object reclaimed (can be at a different time from the unbinding, e.g. param to function). A beginning, a middle and an end.
* Memory Allocation * How we determine memory addresses of objects. 3 ways to allocate memory: static (address will never change), stack (used for nested structure), heap (dynamic allocation). You can have all 3 in the same program!
Static allocation Static = as it looks on paper Used for global variables, real-number constants, string literals. They are unique. Fortran subroutines (p. 119) Each function has its own chunk of memory set at compile time. Called frame or activation record. The address of anything in a function is fixed. Even if A and B both call C, the two instances of C never overlap in time. (C must return before it can be called again.) Does not support recursion! f.p. constants and strings stored statically because they’re big – can’t fit in instruction
Stack Allocation A form of “dynamic” memory allocation One segment of memory dedicated to be the run-time stack. Keep track of all live functions. Each object within function has offset. Compiler must generate code to allocate correct amount of space for frame. In Computer Organization, you learn about pushing/popping return address, some parameters, temporaries. Can support recursion.
Heap allocation Heap = another segment of memory. Anytime you need some space, OS can give it to you. Using “new” or alloc( ) function Enlarging a dynamic type like ArrayList. Implementation problem: fragmentation Internal (allocate more than really need) External (p. 123, no one chunk is big enough)
Implementation Free list: list of addresses & sizes of available holes. Strategies: First fit: find first hole big enough. Best fit: find smallest hole big enough, to minimize external fragmentation. Worst fit: fit into largest hole, to create largest possible remaining hole. Allocation creates up to 2 small free areas. De-allocation: coalesce free areas into larger holes.
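The three strategies can be compared on a small free list. This is only a sketch: the free list is modeled as (start address, size) pairs, the hole values are made up for illustration, and the function names are hypothetical, not taken from the textbook.

```python
# Free list as (start_address, size) pairs; the values are made up for illustration.
free_list = [(0, 50), (100, 200), (400, 80), (600, 120)]

def first_fit(holes, request):
    """Return the first hole that is big enough, or None."""
    for hole in holes:
        if hole[1] >= request:
            return hole
    return None

def best_fit(holes, request):
    """Return the smallest hole that is big enough, or None."""
    candidates = [h for h in holes if h[1] >= request]
    return min(candidates, key=lambda h: h[1]) if candidates else None

def worst_fit(holes, request):
    """Return the largest hole, provided it is big enough, or None."""
    candidates = [h for h in holes if h[1] >= request]
    return max(candidates, key=lambda h: h[1]) if candidates else None

def allocate(holes, hole, request):
    """Carve `request` units off the front of `hole`; any remainder stays as a smaller hole."""
    start, size = hole
    remaining = [h for h in holes if h != hole]
    if size > request:
        remaining.append((start + request, size - request))
    return start, sorted(remaining)

print(first_fit(free_list, 70))  # (100, 200)
print(best_fit(free_list, 70))   # (400, 80)
print(worst_fit(free_list, 70))  # (100, 200)
```

With a request of 70, best fit leaves a 10-unit sliver behind, while worst fit leaves a 130-unit hole that is still useful for later requests.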
Multiple free-lists One free list is usually too big to search through. Use several lists, each holding blocks of about the same size. Buddy system: maintain lists of free blocks of size 1, 2, 4, 8, … bytes up to some maximum, e.g. 1 MB. Initially, we have just 1 free block, the entire 1 MB. Over time, this may get split up into smaller pieces (buddies). Heap request: round it up to the next power of 2. If a region of size S is unavailable, split one from the 2S list in half. When de-allocated, check whether its buddy is also free; if so, merge the two and return the result to the 2S list. Fibonacci heap: like the buddy system, but block sizes are Fibonacci numbers instead of powers of 2; optimizes space, not time.
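Example (sizes in K; the heap starts as one free block of 1024)
  Request A = 70K:  A(128) | 128 | 256 | 512
  Request B = 35K:  A(128) | B(64) | 64 | 256 | 512
  Request C = 80K:  A(128) | B(64) | 64 | C(128) | 128 | 512
  Return A:         128 | B(64) | 64 | C(128) | 128 | 512
  Request D = 60K:  128 | B(64) | D(64) | C(128) | 128 | 512
  Return B:         128 | 64 | D(64) | C(128) | 128 | 512
  Return D:         256 | C(128) | 128 | 512
  Return C:         1024
When a block of size 2^k is freed, the memory manager only has to search the other 2^k blocks to see if a merge is possible.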
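The request/return sequence in this example can be replayed with a small simulation. This is a sketch under stated assumptions: the class and method names are hypothetical, sizes are in K, a split always keeps the lower half, and a block's buddy is found by flipping one bit of its start address (which works because block starts are aligned to their size).

```python
class BuddyAllocator:
    """Toy buddy-system heap; hypothetical names, sizes in K."""

    def __init__(self, total):
        self.total = total
        self.free = {total: [0]}   # block size -> sorted start addresses of free blocks
        self.used = {}             # object name -> (start, size)

    def alloc(self, name, request):
        size = 1
        while size < request:      # round the request up to a power of 2
            size *= 2
        s = size
        while s <= self.total and not self.free.get(s):
            s *= 2                 # nothing free at this size: look in the next list up
        if s > self.total:
            raise MemoryError("no block large enough")
        start = self.free[s].pop(0)
        while s > size:            # split: keep the lower half, free the upper buddy
            s //= 2
            self.free.setdefault(s, []).append(start + s)
            self.free[s].sort()
        self.used[name] = (start, size)
        return start, size

    def release(self, name):
        start, size = self.used.pop(name)
        while size < self.total:
            buddy = start ^ size   # buddy's address differs in exactly one bit
            if buddy not in self.free.get(size, []):
                break              # buddy is in use (or split), so stop merging
            self.free[size].remove(buddy)
            start = min(start, buddy)
            size *= 2
        self.free.setdefault(size, []).append(start)
        self.free[size].sort()

    def layout(self):
        """Blocks in address order as (owner, size)."""
        blocks = [(st, sz, nm) for nm, (st, sz) in self.used.items()]
        blocks += [(st, sz, "free") for sz, sts in self.free.items() for st in sts]
        return [(nm, sz) for st, sz, nm in sorted(blocks)]

heap = BuddyAllocator(1024)
heap.alloc("A", 70)
heap.alloc("B", 35)
heap.alloc("C", 80)
heap.release("A")
heap.alloc("D", 60)
heap.release("B")
heap.release("D")
print(heap.layout())  # [('free', 256), ('C', 128), ('free', 128), ('free', 512)]
heap.release("C")
print(heap.layout())  # [('free', 1024)]
```

Note that returning B alone merges nothing, because its buddy D is still in use; returning D then triggers two merges in a row (64+64, then 128+128).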
De-allocating objects Means we don’t need the space anymore. Should this be explicit or implicit? Explicit: done by programmer. Faster, but mistakes are costly. De-allocate too soon: my memory is being used by a different object (dangling reference). De-allocate too late: memory “leak” = waste. Implicit (garbage collection): slower, but generally worth it.
Summary Objects in your program are allocated statically, to the run-time stack, or to the heap. Binding time By the time we compile a program, we know addresses of static data, and data on run-time stack (unless there’s recursion!) At run time, we determine addresses of dynamically allocated (heap) objects.
Scope (where the “binding is active”) Variable names can be re-used Scope = Where you can use a variable name (where the “binding is active”) Usually it’s the body of some block of code Static scope is most common (done at compile time) Dynamic scope (determined as program runs)
(3.3) Static scope Determined at compile time. Looking at code, you can tell when var is alive. Simplest scope is to avoid the issue: no declarations, all variables are global (typos hard to find). Fortran choices: Declarations optional: names beginning with i-n are assumed integer. “Common”: share data between functions. “Equivalence”: 2 arrays can share same memory (not needed on today’s machines). Local variables can be “saved”, like a “static” variable in C.
Nested blocks We’re used to this idea Can reference anything in enclosing scope, but not inward Compiler searches outward until it finds declaration. Compiler inserts code to maintain “static link” to parent function. Local declaration hides outer one Scope resolution operator: Ada, C++ Similar to “super.” in Java inheritance Block is body of { }.
Declaration order Does the order of var decl affect scope? Two approaches: (1) Scope is the entire block where declared. Declare all variables at beginning; makes sense to say vars are declared before used. (2) Scope is from decl to end of block. Fixes confusing error, p. 132: we would not have an error in C, where the m = n would refer to the outer n. C (before C99) still required all var decl at the beginning of a block, as legacy of old style.
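Python happens to follow the first approach (a local name's scope is the whole function body), so it can reproduce the kind of confusing error described above. This is only an analogy to the p. 132 example, not the book's code; the function name is made up.

```python
n = 10  # outer declaration

def uses_outer_then_redeclares():
    m = n    # looks like the outer n, but n is local to this whole function
    n = 20   # this assignment is what makes n local everywhere in the body
    return m

try:
    uses_outer_then_redeclares()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome)  # UnboundLocalError
```

Under the second approach (scope runs from the declaration to the end of the block), the line m = n would quietly read the outer n instead.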
Recursive declarations If I have to declare something before I use it, how do I say: A lake has 1+ islands An island has 1+ lakes Which should I declare first??? We can use a “forward declaration” which declares just the name of something. Or the PL can dispense with the “declare before use” requirement.
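As a rough illustration of the lake/island problem in a language that does not insist on declare-before-use at run time: in the sketch below, the quoted annotation "list[Island]" acts a bit like a forward declaration, naming a class that is only defined further down. The class layout and the names are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Lake:
    name: str
    # Island is not defined yet; the quoted annotation only mentions its name.
    islands: "list[Island]" = field(default_factory=list)

@dataclass
class Island:
    name: str
    lakes: "list[Lake]" = field(default_factory=list)

outer = Lake("outer lake")
island = Island("island")
inner = Lake("inner lake")

outer.islands.append(island)   # a lake has islands
island.lakes.append(inner)     # an island has lakes

print(outer.islands[0].lakes[0].name)  # inner lake
```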
(3.3.4) Modules (just skim this section) (3.3.6) Dynamic scope (3.5) Deep and shallow binding
Modules Large programs written in multiple files Mechanism to encapsulate variables, data types & functions “Module interface” specifies how to share (import/export) objects Global variables that can travel Called “packages” in Ada, “namespaces” in C++
Dynamic scope Binding between names & objects depends on program execution! Must use “most recent” binding Compiler cannot determine type of variable! Mainly used for interpreted languages Ex. p. 143
Example
Code:
  a: integer
  procedure first
    a := 1
  procedure second
    a: integer
    first()
  main()
    a := 2
    second()
    print(a)
Execution:
  a: integer
  main()
    a := 2
    second()
      a: integer
      first()
        a := 1
    print(a)
Static scoping: second’s a is irrelevant, so first() sets the global a and we print 1. But in dynamic, we remember it when we go to first(): first() sets second’s a, and we print 2.
Another example p.145
  max_score: integer
  function scaled_score(raw: integer): real
    return raw/max_score*100   // which max_score is this?
  main()
    max_score: real := 0
    …
    foreach student in class
      student.percent := scaled_score(student.points)
With dynamic scoping, we confuse global with local variable.
Implementing dynamic scope *** Maintain stack of bindings (declarations) Each time a function is called, its local vars pushed onto stack with their bindings When we reference a variable, look down the stack for its binding When function returns, pop bindings
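The stack of bindings can be sketched directly. The code below is a minimal model, not how any particular interpreter does it: each frame is a dictionary of local names, the helper names (lookup_frame, call) are hypothetical, and the program being run is the earlier first/second example with a global a and a local a in second.

```python
binding_stack = [{"a": None}]   # bottom frame: the global declaration a: integer

def lookup_frame(name):
    """Look down the stack from the most recent frame for a binding of `name`."""
    for frame in reversed(binding_stack):
        if name in frame:
            return frame
    raise NameError(name)

def call(procedure, local_names=()):
    """Push the callee's local bindings, run it, and pop them on return."""
    binding_stack.append({n: None for n in local_names})
    try:
        procedure()
    finally:
        binding_stack.pop()

def first():
    lookup_frame("a")["a"] = 1          # a := 1, using the most recent a

def second():
    call(first)                         # second's own a is pushed by its caller

def main():
    lookup_frame("a")["a"] = 2          # a := 2 (the global a)
    call(second, local_names=["a"])     # second declares a local a
    return lookup_frame("a")["a"]       # the value print(a) would show

printed = main()
print(printed)  # 2
```

Because first() finds second's a before the global one, the global a is never overwritten and the program prints 2, the dynamic-scope answer.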
Referencing environment In dynamic scope, what happens when we pass a function B as a parameter to another function A? Deep binding: Apply B’s scope when B is passed as parameter. Shallow binding: Apply B’s scope when we actually call B. (shallow = seen more recently) In static scope, only deep binding makes sense. Subtle problem
Example
Code:
  threshold: integer
  function older(p: person): boolean
    return p.age > threshold
  procedure show(p: person, c: function)
    threshold: integer := 20
    if c(p)
      print(p)
  main( )
    threshold := 35
    show(p, older)
Execution:
  main( )
    threshold := 35
    show(p, older)
      threshold: integer := 20
      older(p)
        return p.age > threshold
      if <return value is true>
        print(p)
Deep binding = older’s referencing environment is set at the time we call show(p, older), so threshold is 35. Shallow binding = it is set when c(p) is actually called, so threshold is 20.
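The difference between the two choices can be made concrete with a small simulation of this example. It is a sketch only: the environment is an explicit list of frames, a person is reduced to an age, the age of 30 is a made-up value, and the deep flag and captured_env parameter are hypothetical devices for switching between the two rules.

```python
def lookup(env, name):
    """Dynamic-scope lookup: most recent frame first."""
    for frame in reversed(env):
        if name in frame:
            return frame[name]
    raise NameError(name)

def older(age, env):
    return age > lookup(env, "threshold")

def show(age, c, stack, captured_env, deep):
    stack = stack + [{"threshold": 20}]     # show's local threshold (new list, old one untouched)
    # Deep binding: use the environment saved when c was passed.
    # Shallow binding: use the environment in effect at the call c(p).
    env = captured_env if deep else stack
    return c(age, env)                      # True means the person would be printed

def main(age, deep):
    stack = [{"threshold": 35}]             # threshold := 35
    return show(age, older, stack, captured_env=stack, deep=deep)

print(main(30, deep=True))    # False: compares 30 > 35
print(main(30, deep=False))   # True: compares 30 > 20
```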
Restrictions on names
Some PLs are more flexible than others…
  Type        Param?   Return value?   Assign?
  1st class   Yes      Yes             Yes
  2nd class   Yes      No              No
  3rd class   No       No              No
Primitive types are usually 1st class. Arrays usually 2nd class. Functions traditionally 3rd class, but 1st class in C, C++, Ada. More in chapters 7-9.
(3.5) Names that are not distinct Aliases, overloading, coercion
Aliasing Two names for the same object. Usually done with pointer (reference). For example, in C, we can define a pointer to an integer. int a, *p; a = 4; p = &a; So now, a and *p mean the same thing! Why have pointers? Chapter 8 on params.
Be Careful
Having an alias for an object can hinder a compiler optimization.
  C code      Assembly      Optimized
  a = *p;     lw a, (p)     lw a, (p)
  b = *p;     lw b, (p)     move b, a
This is okay, but what if we add one more statement…?
Be Careful (cont’d)
  C code      Assembly      Optimized
  a = *p;     lw a, (p)     lw a, (p)
  *q = 3;     sw 3, (q)     sw 3, (q)
  b = *p;     lw b, (p)     move b, a
Compiler not sure if p & q point to same place! The “move” optimization is not safe.
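To see why the “move” is unsafe, the two instruction sequences can be mimicked on a toy memory. This is a simulation, not real compiler output: memory is a Python list, p and q are indexes into it, and the starting values 7 and 9 are arbitrary.

```python
def unoptimized(mem, p, q):
    a = mem[p]      # a = *p;   lw a, (p)
    mem[q] = 3      # *q = 3;   sw 3, (q)
    b = mem[p]      # b = *p;   lw b, (p)   reloads from memory
    return a, b

def optimized(mem, p, q):
    a = mem[p]      # lw a, (p)
    mem[q] = 3      # sw 3, (q)
    b = a           # move b, a   reuses the earlier load
    return a, b

# p and q point to different places: both versions agree.
print(unoptimized([7, 9], 0, 1))  # (7, 7)
print(optimized([7, 9], 0, 1))    # (7, 7)

# p and q are aliases for the same place: the optimized version is wrong.
print(unoptimized([7, 9], 0, 0))  # (7, 3)
print(optimized([7, 9], 0, 0))    # (7, 7)
```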
Overloading A name or symbol can mean more than one thing, depending on context. What can be overloaded? Operators (in C++, we can define our own meanings for operators too). Functions (that have the same name). Enumerated constants. Some languages don’t allow you to reuse a name. Ex. Using “oct” as a base name or as a month, p. 147
Coercion Compiler automatically changes the type of an object. Usually temporary, doesn’t redeclare variable. Common example: changing integer to real to complete a calculation. Sometimes called “promotion”. Languages differ in how much they coerce. Ada does not coerce variables. Alternative is an explicit cast, or just use a different variable, rather than the compiler doing it automatically.
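A quick way to see how much one language coerces is to try it. The sketch below uses Python as the example language (the variable names are arbitrary): it widens an integer to a real inside a mixed calculation, leaves the variable itself unchanged, and refuses to coerce a string to a number, so that case needs an explicit cast.

```python
count = 3
total = count + 0.5          # count's value is temporarily widened to a real
print(total, type(total).__name__)   # 3.5 float
print(type(count).__name__)          # int (the variable is not redeclared)

try:
    "3" + 4                  # no automatic string-to-integer coercion
    mixed = "coerced"
except TypeError:
    mixed = "TypeError"
print(mixed)                 # TypeError

explicit = int("3") + 4      # the alternative: an explicit cast
print(explicit)              # 7
```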