1
CS 3304 Comparative Languages
Lecture 5: Names, Scopes, and Binding 31 January 2012
2
Introduction
The early development of programming languages was driven by two complementary goals: machine independence and ease of programming. Machine independence: a programming language should not rely on the features of any particular instruction set for its efficient implementation (e.g., Java). Ease of programming: more elusive, and less a matter of science than of aesthetics and trial and error. Core issues: Names, scopes, and bindings: Chapter 3. Control-flow constructs: Chapter 6. Data types: Chapter 7. Subroutines and classes: Chapters 8 and 9.
3
Name, Scope, and Binding
A name is a mnemonic character string used to represent something else: Most names are identifiers. Symbols (like '+') can also be names. A binding is an association between two things, such as a name and the thing it names. The scope of a binding is the part of the program (textually) in which the binding is active.
4
Binding Time
Binding time is the point at which a binding is created or, more generally, the point at which any implementation decision is made: Language design time: program structure, possible types. Language implementation time: I/O, arithmetic overflow, type equality (if unspecified in manual). Program writing time: algorithms, names. Compile time: plan for data layout. Link time: layout of whole program in memory. Load time: choice of physical addresses. Run time: value/variable bindings, sizes of strings. Run time subsumes program start-up time, module entry time, elaboration time (the point at which a declaration is first “seen”), procedure entry time, block entry time, and statement execution time.
5
Binding Types
The terms “static” and “dynamic” are generally used to refer to things bound before run time and at run time, respectively. It is difficult to overstate the importance of binding times in programming languages. Early binding times are associated with greater efficiency. Later binding times are associated with greater flexibility. Compiled languages tend to have early binding times. Interpreted languages tend to have later binding times. Binding of identifiers to the variables they name. Scope rules control bindings: Fundamental to all programming languages is the ability to name data, i.e., to refer to data using symbolic identifiers instead of addresses. Not all data is named: for example, dynamic storage in C or Pascal is referenced by pointers, not names.
6
Object Lifetime and Storage Management
Key events: creation of objects, creation of bindings, references to variables (which use bindings), (temporary) deactivation of bindings, reactivation of bindings, destruction of bindings, and destruction of objects. Binding lifetime: the period of time from creation to destruction of a name-to-object binding. Object lifetime: the time between the creation and destruction of an object: If an object outlives its binding, it is garbage. If a binding outlives its object, it is a dangling reference. Scope: the textual region of the program in which the binding is active; we sometimes use the word scope as a noun all by itself, without an indirect object.
7
Storage Allocation Mechanisms
Static: objects are given an absolute address that is retained throughout the program’s execution. Stack: objects are allocated and deallocated in last-in, first-out order, usually in conjunction with subroutine calls and returns. Heap: objects may be allocated and deallocated at arbitrary times. They require a more general (and expensive) storage management algorithm.
8
Static Allocation Examples
Global variables: accessible throughout the program. Code: the machine instructions. Static local variables: retain their values from one invocation to the next. Explicit constants (including strings, sets, etc.): Small constants may be stored in the instructions. Tables: most compilers produce a variety of tables used by runtime support routines (debugging, dynamic-type checking, garbage collection, exception handling).
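A minimal C++ sketch of a static local variable that retains its value from one invocation to the next, contrasted with an ordinary stack-allocated local. The function names (`next_ticket`, `next_ticket_automatic`, `demo_static_local`) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical helper: hands out consecutive ticket numbers.
// `issued` is a static local: a single statically allocated object,
// initialized once, that keeps its value between invocations.
int next_ticket() {
    static int issued = 0;
    return ++issued;
}

// Contrast: an ordinary local lives in the stack frame and is
// re-created (and re-initialized) on every call.
int next_ticket_automatic() {
    int issued = 0;
    return ++issued;
}

// Usage: the static local remembers, the automatic local does not.
void demo_static_local() {
    assert(next_ticket() == 1);
    assert(next_ticket() == 2);            // value survived the first call
    assert(next_ticket_automatic() == 1);
    assert(next_ticket_automatic() == 1);  // always starts again from 0
}
```

The name `issued` is visible only inside `next_ticket`, yet the object lives for the whole program run: scope and lifetime are separate properties.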
9
Stack Central stack for parameters, local variables and temporaries.
Why a stack? Allocate space for recursive routines: not necessary if no recursion. Reuse space: in all programming languages. Contents of a stack frame (Figure 3.1): arguments and returns, local variables, temporaries, bookkeeping (saved registers, line number, static link, etc.). Local variables and arguments are assigned fixed offsets from the stack pointer or frame pointer at compile time. Maintenance of the stack is the responsibility of the calling sequence and the subroutine prolog and epilog: Space is saved by putting as much as possible in the prolog and epilog. Time may be saved by putting work in the caller instead, or by combining what is known in both places (interprocedural optimization).
10
Stack-Based Allocation for Subroutines
11
Heap-Based Allocation
Heap: a region of storage in which subblocks can be allocated and deallocated at arbitrary times. Heap space management: speed vs. space tradeoffs. Space concerns: Internal fragmentation: allocating a block larger than required. External fragmentation: unused space is scattered in small pieces, so the ability to meet allocation requests degrades over time.
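The two kinds of fragmentation can be made concrete with a little arithmetic. The sketch below assumes a hypothetical allocator that rounds every request up to a power-of-two block size; the function names are illustrative and do not come from any real allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: the allocator only hands out power-of-two block sizes.
std::size_t block_size_for(std::size_t request) {
    std::size_t block = 1;
    while (block < request) block *= 2;
    return block;
}

// Internal fragmentation: bytes inside the block that were not asked for.
std::size_t internal_fragmentation(std::size_t request) {
    return block_size_for(request) - request;
}

// External fragmentation: a request fails unless a single free block
// is large enough, no matter how much free space exists in total.
bool can_allocate(const std::vector<std::size_t>& free_blocks,
                  std::size_t request) {
    for (std::size_t b : free_blocks)
        if (b >= request) return true;
    return false;
}

void demo_fragmentation() {
    assert(block_size_for(100) == 128);
    assert(internal_fragmentation(100) == 28);   // 128 - 100 wasted bytes
    std::vector<std::size_t> free_blocks = {30, 30};  // 60 bytes free in total
    assert(!can_allocate(free_blocks, 50));      // but no single hole fits 50
    assert(can_allocate(free_blocks, 30));
}
```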
12
Garbage Collection
Allocation of heap-based objects: triggered by some specific operation in a program (e.g., object instantiation). Deallocation: explicit in some languages (e.g., C++), implicit in others (e.g., Java). A garbage collection mechanism identifies and reclaims unreachable objects (implicit deallocation). Explicit deallocation benefits: simplicity and execution speed, provided that the programmer can correctly identify the end of an object’s lifetime. Implicit deallocation (automatic garbage collection) benefits: eliminates manual deallocation errors such as dangling references and memory leaks.
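A rough C++ sketch of the contrast between explicit and implicit deallocation. C++ has no tracing garbage collector, so `std::shared_ptr` reference counting stands in here for implicit reclamation; it is only an approximation of what a collector such as Java's does. The `Node` type and its `live` counter are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Hypothetical heap object that counts how many instances are alive.
struct Node {
    static int live;
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Explicit deallocation: the programmer decides when the lifetime ends.
int live_after_explicit_delete() {
    Node* p = new Node;
    delete p;
    return Node::live;
}

// Implicit deallocation (approximated by reference counting): the object
// stays alive while any reference reaches it and is reclaimed afterwards.
int live_while_shared() {
    std::shared_ptr<Node> a = std::make_shared<Node>();
    std::shared_ptr<Node> b = a;
    a.reset();            // still reachable through b
    return Node::live;    // evaluated before b is destroyed
}

void demo_deallocation() {
    assert(live_after_explicit_delete() == 0);
    assert(live_while_shared() == 1);
    assert(Node::live == 0);   // reclaimed once the last reference was gone
}
```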
13
Scope Rules
Scope: a program section of maximal size in which no bindings change, or at least no re-declarations are permitted. In most languages with subroutines, we open a new scope on subroutine entry: Create bindings for new local variables. Deactivate bindings for global variables that are re-declared (these variables are said to have a “hole” in their scope). Make references to variables. On subroutine exit, destroy bindings for local variables and reactivate bindings for global variables that were deactivated. Algol 68: elaboration is the process of creating bindings when entering a scope. Ada: storage may be allocated, tasks started, or exceptions propagated as a result of the elaboration of declarations.
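A small C++ sketch of a hole in a scope: a local re-declaration deactivates the outer binding, and the outer binding is reactivated when the inner scope ends. The variable name `level` is hypothetical. C++ additionally offers the `::` qualifier to reach a hidden global, which not every language provides.

```cpp
#include <cassert>

int level = 1;                  // global binding for `level`

int read_global() { return level; }

int scope_hole_demo() {
    int level = 10;             // re-declaration: the global now has a hole
    level += 1;                 // refers to the local (11)
    {
        int level = 100;        // nested block hides the outer local
        level += 1;             // 101, visible only inside this block
    }
    // Inner binding destroyed; the outer local is active again.
    return level + ::level;     // 11 + 1, `::level` names the hidden global
}

void demo_scope_hole() {
    assert(scope_hole_demo() == 12);
    assert(read_global() == 1);  // the global was never modified
}
```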
14
Referencing Environment
At any given point in a program’s execution, the set of active bindings is called the current referencing environment. The referencing environment is principally determined by static or dynamic scope rules. Sometimes it may depend on deep and shallow binding related to the passing of parameters to subroutines.
15
Static Scoping Static (lexical) scope rules: a scope is defined in terms of the physical (lexical) structure of the program: The determination of scopes can be made by the compiler. All bindings for identifiers can be resolved by examining the program. Typically, we choose the most recent, active binding made at compile time. Most compiled languages, C and Pascal included, employ static scope rules. The classical example of static scope rules is the most closely nested rule used in block structured languages such as Algol 60 and Pascal (nested subroutines): An identifier is known in the scope in which it is declared and in each enclosed scope, unless it is re-declared in an enclosed scope. To resolve a reference to an identifier, we examine the local scope and statically enclosing scopes until a binding is found.
16
Static Chains
Access to non-local variables via static links:
Each frame points to the frame of the (correct instance of the) routine inside which it was declared. In the absence of formal subroutines, correct means closest to the top of the stack. You access a variable in a scope k levels out by following k static links and then using the known offset within the frame thus found.
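The lookup rule (follow k static links, then apply a known offset) can be sketched as a simulation. `Frame` and `load` are hypothetical names; a real implementation keeps this information in machine stack frames and emits the link-following code at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulated activation record.
struct Frame {
    const Frame* static_link;   // frame of the lexically enclosing routine
    std::vector<int> slots;     // locals at compile-time-known offsets
};

// Read the variable declared `hops` scopes out, at position `offset`.
int load(const Frame& current, int hops, std::size_t offset) {
    const Frame* f = &current;
    for (int i = 0; i < hops; ++i) f = f->static_link;  // follow k links
    return f->slots[offset];                            // then index
}

void demo_static_chain() {
    Frame outer{nullptr, {7, 8}};     // outermost routine
    Frame middle{&outer, {20}};       // declared inside outer
    Frame inner{&middle, {5}};        // declared inside middle
    assert(load(inner, 0, 0) == 5);   // local variable
    assert(load(inner, 1, 0) == 20);  // one level out
    assert(load(inner, 2, 1) == 8);   // two levels out, second slot
}
```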
17
Declaration Order
All declarations appear at the beginning of the scope: some early languages such as Algol-60, Lisp. Names must be declared before use: Pascal. Declare-before-use is relaxed: C++ and Java. The scope of a declaration is the entire block in which it appears: Modula-3. No declarations within a block: Python. Recursive types and subroutines: names have to be declared before they can be used. If a declaration is not complete enough to be a definition, a separate definition must appear. Nested blocks: local variables also can be defined at the top of any block.
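A short C++ sketch of the declaration-versus-definition point for mutually recursive subroutines: `is_odd` must be declared before `is_even` can call it, and its separate definition follows later. The function names are hypothetical.

```cpp
#include <cassert>

bool is_odd(unsigned n);            // declaration only, no body yet

bool is_even(unsigned n) {          // may call is_odd: it has been declared
    return n == 0 ? true : is_odd(n - 1);
}

bool is_odd(unsigned n) {           // the separate definition
    return n == 0 ? false : is_even(n - 1);
}

void demo_declaration_order() {
    assert(is_even(10));
    assert(is_odd(7));
    assert(!is_odd(0));
}
```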
18
Modules Information hiding makes objects and algorithms invisible:
Properly modularized code reduces the “cognitive load” on the programmer. Reduces the risk of name conflicts. Safeguards integrity of data abstraction. Helps to compartmentalize run-time errors. A module allows a collection of objects to be encapsulated, affecting their visibility but not their lifetime: Objects inside are visible to each other. Objects on the inside are not visible on the outside unless explicitly exported. Objects on the outside are not visible on the inside unless explicitly imported. Examples: the stack abstraction (Figure 3.6).
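C++ has no module construct of the Modula kind in its classic form, so the sketch below approximates a single-instance stack module with a class whose members are all static: the private members play the role of hidden objects and the public ones are the exports. It is an assumption-laden illustration, not a reproduction of Figure 3.6; `StackModule` and `kCapacity` are hypothetical names.

```cpp
#include <cassert>

class StackModule {
public:                              // exported operations
    static bool push(int value) {
        if (top_ == kCapacity) return false;   // full
        items_[top_++] = value;
        return true;
    }
    static bool pop(int& value) {
        if (top_ == 0) return false;           // empty
        value = items_[--top_];
        return true;
    }
    static bool empty() { return top_ == 0; }
private:                             // hidden: not visible to clients
    static constexpr int kCapacity = 100;
    static int items_[kCapacity];
    static int top_;
};

// The hidden data has static lifetime: it exists for the whole program
// run even though no client code can name it.
int StackModule::items_[StackModule::kCapacity];
int StackModule::top_ = 0;

void demo_stack_module() {
    int v = 0;
    assert(StackModule::empty());
    assert(StackModule::push(1) && StackModule::push(2));
    assert(StackModule::pop(v) && v == 2);
    assert(StackModule::pop(v) && v == 1);
    assert(!StackModule::pop(v));    // nothing left
}
```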
19
Using Modules Module scopes:
Closed scope: names must be explicitly imported. Open scope: no import required. Imports: increase modularity by requiring a module to specify dependence on the rest of the program. Subroutines are open scopes in most Algol-like languages except Euclid, Turing, Modula, Perl, and Clu. Manager idiom: requires additional constructs (subroutines) to make the module a “manager” for instances.
20
Module Types and Classes
We will see classes (a relative of modules) later on, when discussing abstraction and object-oriented languages: These have even more sophisticated (static) scope rules. Euclid is an example of a language with lexically-nested scopes in which all scopes are closed: Rules were designed to avoid aliases, which complicate optimization and correctness arguments. Note that the bindings created in a subroutine are destroyed at subroutine exit: The modules of Modula, Ada, etc., give you closed scopes without the limited lifetime. Bindings to variables declared in a module are inactive outside the module, not destroyed. The same sort of effect can be achieved in many languages with own (Algol term) or static (C term) variables.
21
Dynamic Scoping Rules The key idea in static scope rules is that bindings are defined by the physical (lexical) structure of the program. With dynamic scope rules, bindings depend on the current state of program execution: They cannot always be resolved by examining the program because they are dependent on calling sequences. To resolve a reference, we use the most recent, active binding made at run time. Dynamic scope rules are usually encountered in interpreted languages: Early LISP dialects assumed dynamic scope rules. Such languages do not normally have type checking at compile time because type determination isn’t always possible when dynamic scope rules are in effect.
22
Example: Static vs. Dynamic
program scopes (input, output);
var a : integer;
procedure first;
  begin a := 1; end;
procedure second;
  var a : integer;
  begin first; end;
begin
  a := 2; second; write(a);
end.
If static scope rules are in effect (as would be the case in Pascal), the program prints 1. If dynamic scope rules are in effect, the program prints 2. The issue is whether the assignment to the variable a in procedure first changes the variable a declared in the main program or the variable a declared in procedure second.
23
Using Scoping Rules
Static scope rules require that the reference resolve to the most recent, compile-time binding, i.e., the global variable a. Dynamic scope rules, on the other hand, require that we choose the most recent, active binding at run time: Perhaps the most common use of dynamic scope rules is to provide implicit parameters to subroutines. This is generally considered bad programming practice nowadays: Alternative mechanisms exist: static variables that can be modified by auxiliary routines, or default and optional parameters. In the example, one binding for a is created at run time when the main program starts, and another when procedure second is entered: The latter is the most recent, active binding when procedure first executes, so first modifies the variable local to procedure second, not the global variable. However, the main program then writes the global variable, because the variable a local to procedure second is no longer active.
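C++ is statically scoped, so the sketch below runs the example program twice: once written directly in C++ (static rules) and once with dynamic scoping simulated by an explicit association list of active bindings. `Env`, `lookup`, and the namespace names are hypothetical helpers introduced for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

namespace static_rules {
int a;                                    // var a : integer (main program)
void first() { a = 1; }                   // resolved at compile time: global a
void second() { int a = 0; (void)a; first(); }
int run() { a = 2; second(); return a; }  // value that write(a) would print
}

namespace dynamic_rules {
// Association list of active bindings, most recent last.
using Env = std::vector<std::pair<std::string, int>>;

int& lookup(Env& env, const std::string& name) {
    for (auto it = env.rbegin(); it != env.rend(); ++it)
        if (it->first == name) return it->second;   // most recent active
    throw std::runtime_error("unbound name: " + name);
}

void first(Env& env) { lookup(env, "a") = 1; }

void second(Env& env) {
    env.emplace_back("a", 0);   // binding for second's local a
    first(env);                 // first now finds this binding
    env.pop_back();             // binding destroyed on exit
}

int run() {
    Env env;
    env.emplace_back("a", 0);   // binding for the global a
    lookup(env, "a") = 2;
    second(env);
    return lookup(env, "a");    // value that write(a) would print
}
}

void demo_scope_rules() {
    assert(static_rules::run() == 1);
    assert(dynamic_rules::run() == 2);
}
```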
24
Implementing Scope
Symbol table: a data abstraction, a dictionary, used for compiling statically scoped programs (keeps track of names). Insert a new mapping: a name-to-object mapping. Look up the information that is already present for a given name. Static scope: increased complexity, since a given name may correspond to different objects in different parts of the program. Visibility is handled by augmenting the symbol table with enter_scope and leave_scope operations. Dynamic scope: the run-time counterpart of the symbol table can be implemented as an association list or a central reference table.
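A simplified C++ sketch of a symbol table with the enter_scope and leave_scope operations named on the slide. It uses a stack of hash maps, which is one common textbook simplification and not necessarily the structure used in the course text; the class name and the string-valued "info" field are assumptions.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

class SymbolTable {
public:
    SymbolTable() { enter_scope(); }               // outermost scope
    void enter_scope() { scopes_.emplace_back(); }
    void leave_scope() {
        if (scopes_.size() > 1) scopes_.pop_back();  // keep outermost scope
    }
    // Returns false if `name` is already declared in the current scope.
    bool insert(const std::string& name, const std::string& info) {
        return scopes_.back().emplace(name, info).second;
    }
    // Searches from the innermost scope outward; nullptr if not found.
    const std::string* lookup(const std::string& name) const {
        for (auto s = scopes_.rbegin(); s != scopes_.rend(); ++s) {
            auto hit = s->find(name);
            if (hit != s->end()) return &hit->second;
        }
        return nullptr;
    }
private:
    std::vector<std::unordered_map<std::string, std::string>> scopes_;
};

void demo_symbol_table() {
    SymbolTable table;
    assert(table.insert("a", "global integer"));
    table.enter_scope();
    assert(table.insert("a", "local integer"));    // hides the outer a
    assert(*table.lookup("a") == "local integer");
    table.leave_scope();
    assert(*table.lookup("a") == "global integer"); // outer a visible again
    assert(table.lookup("b") == nullptr);
}
```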
25
Meaning of Names What are aliases good for? (consider uses of FORTRAN equivalence): Space saving: modern data allocation methods are better. Multiple representations: unions are better. Linked data structures: legitimate. Also, aliases arise in parameter passing as an unfortunate side effect: Euclid scope rules are designed to prevent this. Some overloading happens in almost all languages: Integer + versus real +. Read and write in Pascal. Function return in Pascal. Some languages get into overloading in a big way: Ada, C++. Redefining built-in operators: Ada, C++, C#, …
26
Polymorphism and Related Concepts
Overloaded functions are two different functions with the same name (C++ example): overload norm int norm (int a) { return a>0 ? a : -a; } complex norm (complex c ) { // ... Polymorphic functions work in more than one way: In Modula-2: function min (A : array of integer); … In Smalltalk. Generic functions (modules, etc.) - a syntactic template that can be instantiated in more than one way at compile time: Via macro processors in C. Built-in in C++. In Clu. In Ada.
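A compact C++ sketch of overloading next to a generic function. The `Complex` struct is a hypothetical stand-in for the slide's `complex` type, and the body given to its `norm` (squared magnitude, returning `double`) is an assumption, since the slide elides it. `smaller` is a hypothetical template instantiated by the compiler once per argument type.

```cpp
#include <cassert>

struct Complex { double re, im; };   // hypothetical stand-in type

// Overloading: two unrelated functions share one name; the compiler
// picks one from the argument types.
int norm(int a) { return a > 0 ? a : -a; }
double norm(Complex c) { return c.re * c.re + c.im * c.im; }  // assumed body

// Generic function: one syntactic template, instantiated at compile time
// for each type it is used with.
template <typename T>
T smaller(T a, T b) { return b < a ? b : a; }

void demo_overload_and_generic() {
    assert(norm(-3) == 3);                    // int version
    assert(norm(Complex{3.0, 4.0}) == 25.0);  // Complex version
    assert(smaller(2, 5) == 2);               // smaller<int>
    assert(smaller(2.5, 1.5) == 1.5);         // smaller<double>
}
```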
27
Accessing Variables with Dynamic Scope
Keep a stack (association list) of all active variables - slow access but fast calls: When you need to find a variable, hunt down from top of stack. Equivalent to searching the activation records on the dynamic chain. Keep a central table with one slot for every variable name - slow calls but fast access: If names cannot be created at run time, the table layout (and the location of every slot) can be fixed at compile time. Otherwise, you'll need a hash function or something to do lookup. Subroutine changes the table entries for its locals at entry and exit. Variable lookup in a dynamically-scoped language is like symbol table lookup in a statically-scoped language. Static scope rules tend to be more complicated: the data structure and lookup algorithm are also more complicated.
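The central-table alternative can be sketched in C++ as one slot per variable name, each slot holding a stack of values so that a subroutine can install its locals at entry and restore the previous bindings at exit. `CentralTable` and its member names are hypothetical; the hash map corresponds to the "hash function or something" needed when names are not fixed at compile time.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

class CentralTable {
public:
    // Subroutine entry: install a new binding for `name`.
    void enter(const std::string& name, int value) {
        slots_[name].push_back(value);
    }
    // Subroutine exit: restore whatever binding was active before.
    void leave(const std::string& name) {
        auto it = slots_.find(name);
        if (it != slots_.end() && !it->second.empty()) it->second.pop_back();
    }
    // Access: one table probe, no search through the call chain.
    int& current(const std::string& name) {
        auto it = slots_.find(name);
        if (it == slots_.end() || it->second.empty())
            throw std::runtime_error("unbound name: " + name);
        return it->second.back();
    }
private:
    std::unordered_map<std::string, std::vector<int>> slots_;
};

void demo_central_table() {
    CentralTable table;
    table.enter("a", 2);            // main program's a, set to 2
    table.enter("a", 0);            // entry to a routine with its own a
    table.current("a") = 1;         // assignment reaches the newest binding
    table.leave("a");               // routine exit
    assert(table.current("a") == 2);  // main program's a is untouched
}
```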
29
Binding of Referencing Environments
The referencing environment of a statement at run time is the set of active bindings. A referencing environment corresponds to a collection of scopes that are examined (in order) to find a binding. Scope rules determine that collection and its order. Binding rules determine which instance of a scope should be used to resolve references when calling a procedure that was passed as a parameter: They govern the binding of referencing environments to formal procedures. Shallow binding: the referencing environment is created only when the subroutine is actually called. Deep binding: the referencing environment is the one in effect when the subroutine was passed as a parameter.
30
Subroutine Closures
Subroutine closure: the bundle of an explicit representation of a referencing environment and a reference to the subroutine. Shallow binding is usually the default in languages with dynamic scoping. Deep binding is usually the default in languages with static scoping. Binding rules matter with static scoping only when accessing objects that are neither global nor local.
31
Deep Binding Example
program binding_example(input, output);
procedure A(I : integer; procedure P);
  procedure B;
  begin
    writeln(I);
  end;
begin (* A *)
  if I > 1 then P else A(2, B);    <-- here
end;
procedure C;
begin
end;
begin (* main *)
  A(1, C);
end.
Figure: run-time stack at the marked point, from bottom to top: main program; A (I == 1, P == C); A (I == 2, P == B); B.
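The same program can be approximated in C++ with a lambda standing in for the nested procedure B. Capturing `I` by reference ties B to the activation of A that created it, which is what deep binding requires. The `printed` vector is a hypothetical substitute for `writeln` output so that the result can be inspected.

```cpp
#include <cassert>
#include <functional>
#include <vector>

std::vector<int> printed;   // stands in for the output of writeln

void A(int I, const std::function<void()>& P) {
    // B closes over the I of *this* activation of A.
    auto B = [&I] { printed.push_back(I); };
    if (I > 1) P();         // second activation: calls the B it was given
    else A(2, B);           // first activation: passes its own B
}

void C() {}

std::vector<int> run_binding_example() {
    printed.clear();
    A(1, C);
    return printed;
}

void demo_deep_binding() {
    std::vector<int> out = run_binding_example();
    assert(out.size() == 1);
    assert(out[0] == 1);    // B sees I == 1, the I of the A that created it
}
```

With deep binding the program prints 1, because B's environment was fixed when B was passed. A language with dynamic scoping and shallow binding would instead find the most recent I, which is 2.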
32
First-Class Values Types of values:
First-class status: a value can be passed as a parameter, returned from a subroutine, or assigned into a variable. Second-class status: it can only be passed as a parameter. Third-class status: it cannot even be passed as a parameter. First-class subroutines: a reference to a subroutine may outlive the execution of the scope in which that routine was declared. Unlimited extent: most functional languages specify that the lifetime of local objects continues indefinitely.
Figure: stack frames for main program, plus_x (x = 2, rtn = anon), and anon (y = 3).
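A minimal C++ sketch of a first-class subroutine that outlives the scope that created it, using the names `plus_x`, `x`, and `y` from the figure. C++ locals have limited extent, so the lambda captures `x` by value; the copy stored in the closure is what survives after `plus_x` returns.

```cpp
#include <cassert>
#include <functional>

// Returns a new function that adds x to its argument.
std::function<int(int)> plus_x(int x) {
    // Capture by value: the closure carries its own copy of x, because
    // the stack frame holding the original x is gone after the return.
    return [x](int y) { return x + y; };
}

void demo_first_class() {
    std::function<int(int)> add_two = plus_x(2);  // plus_x has returned
    assert(add_two(3) == 5);                      // x == 2 is still available
    std::function<int(int)> copy = add_two;       // assigned like any value
    assert(copy(10) == 12);
}
```

Capturing `x` by reference instead would leave the closure with a dangling reference, which is why languages with unlimited extent typically allocate such environments on the heap.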
33
Object Closures
The referencing environment in a closure is nontrivial only when passing a nested subroutine. The implementation of first-class subroutines is therefore trivial in a language without nested subroutines, where a subroutine is passed without its context. A subroutine can be encapsulated as a method of a simple object. Such an object is called an object closure, a function object, or a functor. C++: an object of a class that overloads operator(). C# 3.0: lambda expressions as an alternative.
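A short C++ sketch of an object closure: the context (`x_`) is stored in a field and the subroutine is the overloaded `operator()`. The class `PlusX` and the helper `apply_twice` are hypothetical names for illustration.

```cpp
#include <cassert>

// Function object: the field holds the context, operator() is the code.
class PlusX {
public:
    explicit PlusX(int x) : x_(x) {}
    int operator()(int y) const { return x_ + y; }
private:
    int x_;
};

// Any callable can be passed; the template is instantiated per type.
template <typename F>
int apply_twice(F f, int value) { return f(f(value)); }

void demo_functor() {
    PlusX add_two(2);
    assert(add_two(3) == 5);                 // called like a function
    assert(apply_twice(add_two, 3) == 7);    // passed as a parameter
}
```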
34
Macro Expansion
Macro expansion facilities: reduce the need to write repetitive code. C macros: eliminate subroutine overhead but are implemented as textual substitution and not understood by the rest of the compiler (side effects). Modern languages: Macros are abandoned: an anachronism. Named constants: type safe. Inline subroutines: macro performance without macro limitations. Integrate macros within a language: Scheme. Hygienic macros: implicitly encapsulate their arguments. Generics/templates as macros: similar to hygienic macros.
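The side-effect problem with textual substitution can be seen in a few lines of C-style C++. `MAX_MACRO` and `max_inline` are hypothetical names; the macro follows the usual parenthesized pattern, which still evaluates one argument twice.

```cpp
#include <cassert>

// Textual substitution: each use of `a` or `b` is pasted in verbatim.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

// Inline subroutine: arguments are evaluated exactly once.
inline int max_inline(int a, int b) { return a > b ? a : b; }

int i_after_macro() {
    int i = 5, j = 3;
    int m = MAX_MACRO(i++, j);   // ((i++) > (j) ? (i++) : (j))
    (void)m;                     // m is 6, not 5
    return i;                    // i was incremented twice
}

int i_after_inline() {
    int i = 5, j = 3;
    int m = max_inline(i++, j);  // i++ evaluated once, m is 5
    (void)m;
    return i;
}

void demo_macro_side_effect() {
    assert(i_after_macro() == 7);
    assert(i_after_inline() == 6);
}
```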
35
Separate Compilation in C
Separately-compiled files: a sort of poor person's modules. Rules for how variables work are messy. Language is jerry-rigged to match the behavior of the linker. Static on a function or variable outside a function means it is usable only in the current source file: A different notion from the static variables inside a function. Extern on a variable or function: declared in another file: Function headers without bodies are extern by default. Extern declarations are interpreted as forward declarations if a later declaration overrides them. Variables or functions (with bodies) that don't say static or extern are either global or common (a Fortran term): Functions and variables that are given initial values are global. Variables that are not given initial values are common. Matching common declarations in different files refer to the same variable: They also refer to the same variable as a matching global declaration.
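A single-file sketch of the two meanings of `static` and of an `extern` declaration acting as a forward declaration. It is written in C++ rather than C, so the "common" (uninitialized, merged across files) case is not shown, since C++ does not allow it. All names are hypothetical, and in a real program the definition of `shared_total` could live in a different source file.

```cpp
#include <cassert>

static int calls_in_this_file = 0;  // file-level static: usable only in
                                    // this source file (internal linkage)

extern int shared_total;            // declaration only: defined elsewhere,
                                    // here later in the same file

int record(int amount) {
    static int largest = 0;         // static inside a function: a different
                                    // notion, it keeps its value between calls
    ++calls_in_this_file;
    shared_total += amount;
    if (amount > largest) largest = amount;
    return largest;
}

int calls() { return calls_in_this_file; }

int shared_total = 0;               // the definition; it has an initial
                                    // value, so it is "global" in the
                                    // slide's terminology

void demo_linkage() {
    assert(record(5) == 5);
    assert(record(3) == 5);         // largest remembered across calls
    assert(shared_total == 8);
    assert(calls() == 2);
}
```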
36
Summary
Object lifetime and name-to-object binding lifetime need not be the same. Scope rules can be static (lexical) or dynamic (run-time). Bindings relate to one another in several ways: aliases, overloading, and polymorphism. Implementation complexity or run-time cost can keep a feature out of a language. Simple features can have surprising implications. Tradeoffs between semantic utility and ease of implementation.