CS 363 Comparative Programming Languages Names, Type Checking, and Scopes
CS 363 Spring 2005 GMU
Names
  User-defined names include variables, functions, classes, types…
  Design issues for names:
    – Maximum length?
    – Are connector characters (_, -, …) allowed?
    – Are names case sensitive?
    – Are special words reserved words or keywords?
Names
  Length
    – If too short, names cannot be connotative
    – Language examples:
        FORTRAN I: maximum 6
        COBOL: maximum 30
        FORTRAN 90 and ANSI C: maximum 31
        Ada and Java: no limit, and all characters are significant
        C++: no limit, but implementers often impose one
Names
  Case sensitivity
    – Disadvantage: readability (names that look alike are different)
        Worse in C++ and Java because predefined names are mixed case (e.g. IndexOutOfBoundsException)
    – C, C++, and Java names are case sensitive (b and B are different variables)
    – Names in some other languages are not case sensitive
Names
  Special words: keywords, reserved words
    – Ex: while, for, …
    – An aid to readability; used to delimit or separate statement clauses
  Def: A keyword is a word that is special only in certain contexts
    – Disadvantage: poor readability; harder to compile
  Def: A reserved word is a special word that cannot be used as a user-defined name
Variables
  A variable is an abstraction of a memory cell or a collection of cells
  Variables can be characterized as a sextuple of attributes: (name, address, value, type, lifetime, scope)
  Not all variables have names (some are anonymous)
Variables
  Address: the memory address with which a variable is associated
    – A variable may have different addresses at different times during execution (e.g. a variable local to a function)
    – A variable name may have different addresses at different places in a program (the same name used in multiple scopes)
    – The address is the l-value of a variable (as in x := …)
Variables
  If two variable names can be used to access the same memory location, they are called aliases
    – Aliases are harmful to readability (program readers must remember all of them)
  How aliases can be created:
    – Pointers, reference variables, C and C++ unions (and parameters, discussed in Chapter 9)
    – Some of the original justifications for aliases are no longer valid, e.g. memory reuse in FORTRAN
    – Replace them with dynamic allocation
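The aliasing described above can be sketched in Python, where assigning one name to another shares a reference instead of copying. This is only an illustration; the variable names are hypothetical and do not come from the slides.

```python
# Two names bound to the same list object behave as aliases.
readings = [1, 2, 3]
alias = readings            # no copy is made: both names refer to one object
alias.append(4)             # a change made through one name...
print(readings)             # ...is visible through the other name

separate = list(readings)   # an explicit copy is a distinct object
separate.append(5)          # so this change does not affect readings
print(readings is alias, readings is separate)
```

A reader of code that uses only `readings` has to know that `alias` exists to predict its contents, which is the readability cost the slide refers to.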
Variables
  Type: determines the size of the memory location, the range of values the variable can hold, the set of operations defined for values of that type, and (for floating point) the precision
  Value: the contents of the location with which the variable is associated
    – The r-value of a variable (as in … := x …)
Binding
  A binding is an association, such as between an attribute and an entity, or between an operation and a symbol
  Binding time is the time at which a binding takes place
Possible Binding Times
  Language design time: e.g., bind operator symbols to operations
  Language implementation time: e.g., bind floating point type to a representation
  Compile time: e.g., bind a variable to a type
  Load time: e.g., bind a FORTRAN 77 variable (or a C static variable) to a memory cell
  Run time: e.g., bind a local variable to a memory cell
  Different languages make different choices about binding times.
The Concept of Binding
  Def: A binding is static if it first occurs before run time and remains unchanged throughout program execution.
  Def: A binding is dynamic if it first occurs during execution or can change during execution of the program.
Overloading
  More than one binding for a name in a given scope
  Nearly all languages offer limited overloading (for example, + for both integer and floating-point addition)
  Subroutine names (Ada, C++, Java): differentiated by their arguments
  Built-in operators (Ada, C++, Fortran 90)
Type Bindings
  How is a type specified?
  When does the binding take place?
  If static, the type may be specified by either an explicit or an implicit declaration
Types
  Def: An explicit declaration is a program statement used for declaring the types of variables
  Def: An implicit declaration is a default mechanism for specifying types of variables (the first appearance of the variable in the program)
  FORTRAN, PL/I, BASIC, and Perl provide implicit declarations
    – Advantage: writability
    – Disadvantage: reliability (less trouble with Perl, where the prefix character $, @, or % marks the kind of variable)
Types
  Dynamic Type Binding (JavaScript and PHP)
  Specified through an assignment statement, e.g., in JavaScript:
      list = [2, 4.33, 6, 8];
      list = 17.3;
    – Advantage: flexibility (generic program units)
    – Disadvantages:
        High cost (dynamic type checking and interpretation)
        Type error detection by the compiler is difficult
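Python also binds types dynamically, so it can illustrate both the flexibility and the late error detection mentioned above. The sketch below assumes nothing beyond the standard language; `total` is a hypothetical helper written for this example.

```python
# The same name is rebound to values of different types by assignment.
value = [2, 4.33, 6, 8]
first_type = type(value).__name__    # type of the list value
value = 17.3
second_type = type(value).__name__   # type after rebinding to a float

def total(items):
    """Generic unit: works for any element type that supports +."""
    result = items[0]
    for item in items[1:]:
        result = result + item
    return result

numbers = total([1, 2, 3])           # integer addition
text = total(["a", "b"])             # string concatenation

# The mismatch below is only detected when the + is actually executed.
try:
    total([1, "b"])
    detected_at_run_time = False
except TypeError:
    detected_at_run_time = True
```

The one definition of `total` serves several element types (the flexibility advantage), while the mixed list is rejected only at run time (the error-detection disadvantage).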
Types
  Type Inferencing (ML, Miranda, and Haskell)
    – Rather than by assignment statement, types are determined from the context of the reference
Type Checking
  Generalize the concept of operands and operators to include subprograms and assignments
  Def: Type checking is the activity of ensuring that the operands of an operator are of compatible types
  Def: A compatible type is one that is either legal for the operator, or is allowed under language rules to be implicitly converted, by compiler-generated code, to a legal type. This automatic conversion is called a coercion.
  Def: A type error is the application of an operator to an operand of an inappropriate type
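As a small illustration of the two definitions above, Python implicitly widens an int operand when it is added to a float, but it treats adding a string and an int as a type error. This is a sketch in one particular language; other languages draw the line between coercion and type error differently.

```python
# Coercion: the int operand is implicitly converted before the addition.
widened = 1 + 2.5

# Type error: no implicit conversion is defined between str and int for +.
try:
    "1" + 2
    type_error_raised = False
except TypeError:
    type_error_raised = True

print(type(widened).__name__, type_error_raised)
```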
Type Checking
  If all type bindings are static, nearly all type checking can be static
  If type bindings are dynamic, type checking must be dynamic
  Def: A programming language is strongly typed if type errors are always detected
Strong Typing
  Advantage of strong typing: allows the detection of the misuses of variables that result in type errors
  What languages are strongly typed?
    – FORTRAN 77 is not: parameters, EQUIVALENCE
    – Pascal is not: variant records
    – C and C++ are not: parameter type checking can be avoided; unions are not type checked
    – Ada is, almost (UNCHECKED_CONVERSION is an explicit loophole); Java is similar
Strong Typing
  Coercion rules strongly affect strong typing; they can weaken it considerably (C++ versus Ada)
  Although Java has just half the assignment coercions of C++, its strong typing is still far less effective than that of Ada
Type Compatibility
  Our concern is primarily for structured types
  Def: Name type compatibility means that two variables have compatible types if they are in either the same declaration or in declarations that use the same type name
  Easy to implement but highly restrictive:
    – Subranges of integer types are not compatible with integer types
    – Formal parameters must be the same type as their corresponding actual parameters (Pascal)
Type Compatibility
  Def: Structure type compatibility means that two variables have compatible types if their types have identical structures
  More flexible, but harder to implement
Type Compatibility
  Consider the problem of two structured types:
    – Are two record types compatible if they are structurally the same but use different field names?
    – Are two array types compatible if they are the same except that the subscripts are different? (e.g. [1..10] and [0..9])
    – Are two enumeration types compatible if their components are spelled differently?
    – With structural type compatibility, you cannot differentiate between types of the same structure (e.g. different units of speed, both float)
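The difference between the two rules, and the questions listed above, can be made concrete with a toy checker. Everything in this sketch is hypothetical (the `RecordType` class, the two functions, and the example types); it models record types only and is not how any particular compiler works.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RecordType:
    name: str
    fields: tuple  # tuple of (field_name, field_type_name) pairs

def name_compatible(t1, t2):
    """Name compatibility: the two types must carry the same type name."""
    return t1.name == t2.name

def structure_compatible(t1, t2, compare_field_names=False):
    """Structure compatibility: same field types in the same order.

    Whether field names must also match is a language design choice,
    so it is left as a flag here.
    """
    if len(t1.fields) != len(t2.fields):
        return False
    for (name1, type1), (name2, type2) in zip(t1.fields, t2.fields):
        if type1 != type2:
            return False
        if compare_field_names and name1 != name2:
            return False
    return True

mph = RecordType("MilesPerHour", (("value", "float"),))
kph = RecordType("KilometersPerHour", (("value", "float"),))
point = RecordType("Point", (("x", "float"), ("y", "float")))
pair = RecordType("Pair", (("first", "float"), ("second", "float")))

print(name_compatible(mph, kph))        # different names
print(structure_compatible(mph, kph))   # identical structure
print(structure_compatible(point, pair, compare_field_names=True))
```

Under the structural rule the two speed types are interchangeable even though mixing them is almost certainly a bug, which is the last point on the slide.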
Type Compatibility
  Language examples:
    – Pascal: usually structure, but in some cases name is used (formal parameters)
    – C: structure, except for records (struct and union types)
    – Ada: restricted form of name
        Derived types allow types with the same structure to be different
        Anonymous types are all unique, even in:
            A, B : array (1..10) of INTEGER;
Variable Lifetime
  Storage Bindings & Lifetime
    – Allocation: getting a cell from some pool of available cells
    – Deallocation: putting a cell back into the pool
  Def: The lifetime of a variable is the time during which it is bound to a particular memory cell
  Lifetime is dictated by the category of variable: static, stack-dynamic, explicit heap-dynamic, implicit heap-dynamic
Lifetime Categories
  Static: bound to memory cells before execution begins and remains bound to the same memory cell throughout execution
    e.g. all FORTRAN 77 variables, C static variables
    – Advantages: efficiency (direct addressing), history-sensitive subprogram support
    – Disadvantage: lack of flexibility (no recursion)
Lifetime Categories
  Stack-dynamic: storage bindings are created for variables when their declaration statements are elaborated
    – If scalar, all attributes except address are statically bound
        e.g. local variables in C subprograms and Java methods
    – Advantages: allows recursion; conserves storage
    – Disadvantages:
        Overhead of allocation and deallocation
        Subprograms cannot be history sensitive
        Inefficient references (indirect addressing)
Lifetime Categories
  Explicit heap-dynamic: allocated and deallocated by explicit directives, specified by the programmer, which take effect during execution
    – Referenced only through pointers or references
        e.g. dynamic objects in C++ (via new and delete); all objects in Java
    – Advantage: provides for dynamic storage management
    – Disadvantage: inefficient and unreliable
Lifetime Categories
  Implicit heap-dynamic: allocation and deallocation caused by assignment statements
    e.g. all variables in APL; all strings and arrays in Perl and JavaScript
    – Advantage: flexibility
    – Disadvantages:
        Inefficient, because all attributes are dynamic
        Loss of error detection
Scope
  Def: The scope of a variable declaration is the range of program statements over which it is visible
  The scope rules of a language determine how references to names are associated with variables
  The terms 'scope' and 'name space' are sometimes used interchangeably.
  Two approaches: static and dynamic
Fortran 77 Name Space
  Global scope holds procedure names and common block names
  Procedures have local variables, parameters, and labels, and can import common blocks
  [Figure: a Global box containing procedures f1(), f2(), f3(), each with its own variables, parameters, and labels, plus common block a and common block b]
Scheme Name Space
  All objects (built-in and user-defined) reside in a single global namespace
  'let' expressions create nested lexical scopes
  [Figure: a Global box containing map, 2, cons, var, f1(), f2(), with a nested let scope]
C Name Space
  Global scope holds variables and functions
  No function nesting
  Block-level scope introduces variables and labels
  File-level scope holds static variables that are not visible outside the file (names are global otherwise)
  [Figure: a Global box holding a, b, c, d, ...; two file scopes with static names x, y, z and w, x, y; functions f1(), f2(), f3() with variables, parameters, and labels; nested block scopes with their own variables and labels]
Java Name Space
  Limited global name space with only public classes
  Fields and methods in a public class can be declared public, making them visible to classes in other packages
  Fields and methods in a class are visible to all classes in the same package unless declared private
  Class variables are visible to all objects of the same class
  [Figure: packages p1, p2, p3 under Public Classes; public class c1 with fields f1, f2 and methods m1 and m2, each with locals; class c2 with field f3 and method m3]
Scope
  Understanding the scope rules of a given language allows us to answer the following:
    – Where is a given variable visible?
    – What variables are visible at a given statement in the program?
Static Scope
  Based on program text
  To connect a name reference to a variable, you (or the compiler) must find the declaration
  Search process: search declarations, first locally, then in increasingly larger enclosing scopes, until one is found for the given name
    – A variable is local to a procedure if the declaration occurs in that procedure
    – A variable is nonlocal to a procedure if it is visible in the procedure but not declared there
Scope
  Variables can be hidden from a unit by having a "closer" variable with the same name
  C++ and Ada allow access to these "hidden" variables
    – In Ada: unit.name
    – In C++: class_name::name
Referencing Environments
  Def: The referencing environment of a statement is the collection of all names that are visible to the statement
  In a static-scoped language, it is the local variables plus all of the visible variables in all of the enclosing scopes
Example: Pascal-like language
  Program main;
    a,b,c: real;
    procedure sub1(a: real);
      d: int;
      procedure sub2(c: int);
        d: real;
        body of sub2
      procedure sub3(a:int)
        body of sub3
      body of sub1
    body of main
  Nesting: Main encloses sub1; sub1 encloses sub2 and sub3
Example (same program, referencing environments)
  Main has local variables a, b, c, and sub1
  sub1 has local variables a, d, sub2, and sub3, as well as non-local variables b and c
  sub2 has local variables c, d and non-local variables a, b, and sub1 (and potentially sub3, depending on the rules of the language)
  sub3 has local variable a and non-local variables b, c, d, sub2, and sub1
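The search process behind these referencing environments can be sketched as a walk outward through the enclosing scopes, keeping the closest declaration of each name. The tables below transcribe the Pascal-like example; the function and table names are hypothetical, and the sketch ignores declaration order, so it always treats sub3 as visible in sub2 (one of the possible language rules mentioned above). It also lists a procedure's own name, which the slides leave out.

```python
# Names declared directly in each unit of the Pascal-like example.
DECLARED = {
    "main": ["a", "b", "c", "sub1"],
    "sub1": ["a", "d", "sub2", "sub3"],
    "sub2": ["c", "d"],
    "sub3": ["a"],
}
# Textually enclosing unit of each unit (None for the outermost).
PARENT = {"main": None, "sub1": "main", "sub2": "sub1", "sub3": "sub1"}

def referencing_environment(unit):
    """Map each visible name to the unit whose declaration it refers to."""
    env = {}
    scope = unit
    while scope is not None:
        for name in DECLARED[scope]:
            # setdefault keeps the closest declaration, hiding outer ones.
            env.setdefault(name, scope)
        scope = PARENT[scope]
    return env

sub2_env = referencing_environment("sub2")
sub3_env = referencing_environment("sub3")
print(sub2_env)
print(sub3_env)
```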
Static Scope
  Advantages
    – Readability
    – Based on program text, so references can be resolved by a compiler
    – Constant-time implementation
  Disadvantages:
    – Encourages global variables
Dynamic Scope
  Based on calling sequences of program units, not their textual layout (temporal versus spatial)
  References to variables are connected to declarations by searching the chain of subprogram calls (runtime stack) that forced execution to this point
Scope Example
  MAIN
    - declaration of x
    SUB1
      - declaration of x
      ...
      call SUB2
      ...
    SUB2
      ...
      - reference to x
      ...
    ...
    call SUB1
    …
  MAIN calls SUB1
  SUB1 calls SUB2
  SUB2 uses x
  Which x?
Scope Example (continued)
  For static scoping, it is MAIN's x
  In a dynamic-scoped language, the referencing environment is the local variables plus all visible variables in all active subprograms.
  A subprogram is active if its execution has begun but has not yet terminated.
  For dynamic scoping, it is SUB1's x
  Call stack at the reference: MAIN (x), SUB1 (x), SUB2
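The two answers can be reproduced with a small simulation. It assumes, consistently with the static-scoping answer above, that SUB1 and SUB2 are both nested directly inside MAIN; the table and function names are hypothetical.

```python
# Static structure: which unit textually encloses which.
ENCLOSING = {"MAIN": None, "SUB1": "MAIN", "SUB2": "MAIN"}
# Which units declare which names.
DECLARES = {"MAIN": {"x"}, "SUB1": {"x"}, "SUB2": set()}

def resolve_static(name, unit):
    """Search outward through the textually enclosing units."""
    while unit is not None:
        if name in DECLARES[unit]:
            return unit
        unit = ENCLOSING[unit]
    return None

def resolve_dynamic(name, call_stack):
    """Search the active subprograms, most recent call first."""
    for unit in reversed(call_stack):
        if name in DECLARES[unit]:
            return unit
    return None

call_stack = ["MAIN", "SUB1", "SUB2"]   # MAIN calls SUB1, SUB1 calls SUB2
static_owner = resolve_static("x", "SUB2")
dynamic_owner = resolve_dynamic("x", call_stack)
print(static_owner, dynamic_owner)
```

The static search never looks at `call_stack`, so its answer is fixed by the program text; the dynamic search would give a different answer if SUB2 were called directly from MAIN.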
Dynamic Scoping
  Evaluation of Dynamic Scoping:
    – Advantage: convenience (easy to implement)
    – Disadvantage: poor readability, unbounded search time
Scope and Lifetime
  Scope and lifetime are closely related, but are different concepts
  Consider a static variable in a C or C++ function
    – Lifetime = entire program execution
    – Scope = limited to statements in the function
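Python has no C-style static local variables, so the sketch below uses a closure as a rough analogue: the variable outlives each call (long lifetime) but can be named only inside the function (narrow scope). The names `make_counter`, `counter`, and `call_count` are hypothetical, and the lifetime here lasts as long as the closure exists, not necessarily the whole program.

```python
def make_counter():
    call_count = 0              # lives as long as the returned function does

    def counter():
        nonlocal call_count     # refer to the enclosing variable, not a new local
        call_count += 1
        return call_count

    return counter

next_id = make_counter()
first = next_id()               # the value persists between calls...
second = next_id()

# ...but the name is not visible outside the function that uses it.
try:
    call_count
    visible_outside = True
except NameError:
    visible_outside = False

print(first, second, visible_outside)
```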
Static Scope & Runtime
  Activation record: keeps information associated with each procedure call instance (parameters, local variables, return address, return values, …)
  Procedure call time: new activation pushed onto the runtime stack
  Procedure return time: activation popped off the runtime stack
Static Scope & Runtime
  At runtime, we need to be able to find the correct instance of a variable being used.
  Additional field in the activation record:
    – A pointer (static link) to the activation record for the closest instance of the enclosing scope
    – These pointers form a static chain back to 'main'
    – 'Search' back along these enclosing-scope links to find non-local variables
    – The chain never gets longer than the maximum scope nesting depth
Static links
  Program main;
    a,b,c: real;
    procedure sub1(a: real);
      d: int;
      procedure sub2(c: int);
        d: real;
        body of sub2
      procedure sub3(a:int)
        call sub2
      if E call sub1 else call sub3
    call sub1
  Runtime stack: Main (a,b,c)
Static links (same program, runtime stack as the calls proceed)
  main calls sub1:
    Main (a,b,c); sub1 (a,d)
  sub1 calls sub1:
    Main (a,b,c); sub1 (a,d); sub1 (a,d)
  sub1 calls sub1 again:
    Main (a,b,c); sub1 (a,d); sub1 (a,d); sub1 (a,d)
  sub1 calls sub3:
    Main (a,b,c); sub1 (a,d); sub1 (a,d); sub1 (a,d); sub3 (a)
  sub3 calls sub2:
    Main (a,b,c); sub1 (a,d); sub1 (a,d); sub1 (a,d); sub3 (a); sub2 (c,d)
  Static links: every sub1 activation links to Main; sub3 and sub2 both link to the most recent sub1 activation
Static Scope & Runtime
  Static Chain
    – The chain never gets longer than the maximum scope depth
    – For a given function, the compiler can compute:
        1. The exact number of links to traverse to find the required instance
        2. The variable offset (location) in the given activation record
Static links (same program, stack after sub3 calls sub2)
  Runtime stack: Main (a,b,c); sub1 (a,d); sub1 (a,d); sub1 (a,d); sub3 (a); sub2 (c,d)
  In sub2, variable a is always 1 link back and variable b is always 2 links back.
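The static chain for this stack can be modelled directly. The sketch below builds the six activation records with their static links and counts how many links a lookup follows; the `Activation` class, the `lookup` function, and all variable values are hypothetical. A real compiler would not search by name at run time: it would emit the link count and offset it computed in advance.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Activation:
    unit: str
    variables: dict
    static_link: Optional["Activation"] = None  # enclosing scope's record

def lookup(frame, name):
    """Follow static links until name is found; return (value, links followed)."""
    hops = 0
    while frame is not None:
        if name in frame.variables:
            return frame.variables[name], hops
        frame = frame.static_link
        hops += 1
    raise NameError(name)

# Stack after: main -> sub1 -> sub1 -> sub1 -> sub3 -> sub2
main = Activation("main", {"a": 1.0, "b": 2.0, "c": 3.0})
sub1_first = Activation("sub1", {"a": 10.0, "d": 0}, static_link=main)
sub1_second = Activation("sub1", {"a": 20.0, "d": 0}, static_link=main)
sub1_third = Activation("sub1", {"a": 30.0, "d": 0}, static_link=main)
sub3 = Activation("sub3", {"a": 7}, static_link=sub1_third)
sub2 = Activation("sub2", {"c": 5, "d": 0.5}, static_link=sub1_third)

a_value, a_hops = lookup(sub2, "a")   # found in the most recent sub1
b_value, b_hops = lookup(sub2, "b")   # found in main
print(a_hops, b_hops)
```

However many recursive sub1 activations sit on the stack, the link counts from sub2 stay the same, because the static chain skips over them.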