Chapter 5 - Basic Semantics


1 Chapter 5 - Basic Semantics
4/20/2017. Programming Languages: Principles and Practice, 2nd Ed., Kenneth C. Louden. © Kenneth C. Louden, 2003

2 Attributes
Properties of language entities, especially identifiers used in a program. Important examples: the value of an expression; the data type of an identifier; the maximum number of digits in an integer; the location of a variable; the code body of a function or method. Declarations ("definitions") bind attributes to identifiers. Different declarations may bind the same identifier to different sets of attributes. Chapter 5 K. Louden, Programming Languages

3 Binding times can vary widely:
Value of an expression: during execution or during translation (constant expression). Data type of an identifier: translation time (Java) or execution time (Smalltalk, Lisp). Maximum number of digits in an integer: language definition time or language implementation time. Location of a variable: load or execution time. Code body of a function or method: translation time, link time, or execution time.

4 Symbol table and environment
A dictionary or table is used to maintain the identifier/attribute bindings. It can be maintained during translation, during execution, or both. (Pre-translation entities are entered into the initial or default table.) During translation this table is called the symbol table. During execution this table is called the environment. If both are maintained, the environment can usually dispense with names, keeping track only of locations (names are maintained implicitly).

5 Semantic functions
Formally, we can think of the symbol table and environment as functions. If the symbol table and environment are maintained separately (compiler): SymbolTable: Names → Static Attributes; Environment: Names → Locations; Memory: Locations → Values. If the symbol table and environment are maintained together (interpreter): Environment: Names → Attributes (including locations and values).
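The three functions for the compiled case can be pictured as plain lookup tables. The following is only a minimal sketch, not code from the book: the class name SemanticFunctions, the attribute string "int", and the location number 100 are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class SemanticFunctions {
    // SymbolTable: Names -> Static Attributes (kept by the translator)
    static final Map<String, String> symbolTable = new HashMap<>();
    // Environment: Names -> Locations
    static final Map<String, Integer> environment = new HashMap<>();
    // Memory: Locations -> Values
    static final Map<Integer, Integer> memory = new HashMap<>();

    // Models a declaration such as "int x = 0;" placed at a hypothetical location.
    static void declareInt(String name, int location, int initialValue) {
        symbolTable.put(name, "int");
        environment.put(name, location);
        memory.put(location, initialValue);
    }

    // The value of a name is Memory(Environment(name)).
    static int valueOf(String name) {
        return memory.get(environment.get(name));
    }

    public static void main(String[] args) {
        declareInt("x", 100, 0);
        System.out.println(symbolTable.get("x") + " x at location "
                + environment.get("x") + " holds " + valueOf("x"));
    }
}
```

In the interpreter case the three maps would collapse into a single map from names to a record holding all attributes.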

6 Declarations
Declarations bind identifiers to attributes. The collection of bound attributes (including the identifier) can be viewed as equivalent to the declaration itself. Attributes may be explicit or implicit in a declaration. A declaration may fail to specify all necessary attributes, in which case a supplemental declaration is needed elsewhere. By abuse of language, we sometimes refer to the declaration of x, or just x, using the identifier to stand for the declaration.

7 Examples of declarations (C)
int x = 0;
Explicitly specifies data type and initial value. Implicitly specifies scope (see next slide) and location in memory.
int f(double);
Explicitly specifies type (double → int). Implicitly specifies nothing else: needs another declaration specifying code.
The former is called a definition in C; the latter is simply a declaration.

8 Scope
The scope of a declaration is the region of the program to which the bindings established by the declaration apply. (If individual bindings apply over different regions: scope of a binding.) Scope is typically indicated implicitly by the position of the declaration in the code, though keywords can modify it. In a block-structured language, the scope is typically the code from the end of the declaration to the end of the "block" (indicated by braces {…} in C and Java) in which the declaration occurs. Scope can extend backwards to the beginning of the block in certain cases (class declarations in Java and C++, top-level declarations in Scheme).
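The backward extension of scope inside a Java class can be seen in a tiny example. This is an illustrative sketch; the class name BackwardScope and its members are hypothetical.

```java
class BackwardScope {
    // x is used here even though its declaration appears later in the class:
    // the scope of a member declaration covers the whole class body.
    static int twice() {
        return 2 * x;
    }

    static int x = 21;

    public static void main(String[] args) {
        System.out.println(twice());
    }
}
```

The same ordering inside a single block of local declarations would be rejected, since a local variable's scope starts only at its declaration.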

9 Lexical vs. dynamic scope
Scope is maintained by the properties of the lookup operation in the symbol table or environment. If scope is managed statically (prior to execution), the language is said to have static or lexical scope ("lexical" because it follows the layout of the code in the file). If scope is managed directly during execution, then the language is said to have dynamic scope. The next slide has an example showing the difference. It is possible to maintain lexical scope during execution (i.e., by the environment), but it requires extra links and a somewhat unusual lookup operation (see Chapter 8). (Scheme does this.)

10 Java scope example
public class Scope {
    public static int x = 2;
    public static void f() { System.out.println(x); }
    public static void main(String[] args) {
        int x = 3;
        f();
    }
}
Of course, this prints 2, but under dynamic scope it would print 3 (the most recent declaration of x in the execution path is found).
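Java itself cannot be switched to dynamic scope, but the dynamic-scope reading of the example can be simulated by keeping the bindings of x on an explicit stack that follows the execution path. This is a sketch under that assumption; DynamicScopeDemo, bindingsOfX, and runMain are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DynamicScopeDemo {
    // Live bindings of the name x; the most recently established one is on top.
    static final Deque<Integer> bindingsOfX = new ArrayDeque<>();

    // Under dynamic scope, f sees whichever x was declared most recently
    // along the execution path, not the one that is textually enclosing.
    static int f() {
        return bindingsOfX.peek();
    }

    // Plays the role of main in the slide's example.
    static int runMain() {
        bindingsOfX.push(3);       // "int x = 3;" on entry to main
        try {
            return f();
        } finally {
            bindingsOfX.pop();     // the binding disappears when main exits
        }
    }

    public static void main(String[] args) {
        bindingsOfX.push(2);       // "public static int x = 2;"
        System.out.println(runMain());   // prints 3
    }
}
```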

11 Dynamic scope evaluated
Almost all languages use lexical scope: with dynamic scope the meaning of a variable cannot be known until execution time, thus there cannot be any static checking. In particular, no static type checking. Originally used in Lisp. Scheme could still use it, but doesn't. Some languages still use it: VBScript, JavaScript, Perl (older versions). Lisp inventor (McCarthy) now calls it a bug. Still useful as a pedagogical tool to understand the workings of scope. In some ways a lot like dynamic binding of methods.

12 Scope holes
Under either lexical or dynamic scope, a nested or more recent declaration can mask a prior declaration. Indeed, in slide 10, the local declaration of x in main masks the static declaration of x in the Scope class. How would you access the static x inside main in Java? Use Scope.x in place of x:
public static void main(String[] args) { int x = 3; Scope.x = 4; ...}
Exercise: how to do this for non-statics?
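One possible answer to the exercise, offered as a sketch rather than the intended solution: for an instance field, the qualifier this plays the role that the class name plays for a static. The class name ScopeHole and its method are hypothetical.

```java
class ScopeHole {
    int x = 2;              // instance (non-static) field

    int shadowed() {
        int x = 3;          // local x masks the field: a scope hole
        this.x = 4;         // "this." reaches the masked field
        return x;           // still refers to the local x
    }

    public static void main(String[] args) {
        ScopeHole s = new ScopeHole();
        System.out.println(s.shadowed() + " " + s.x);   // prints "3 4"
    }
}
```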

13 Symbol table structure
A table of little stacks of declarations under each name. For example, the table for the Scope class of slide 10 would look as follows inside main (using lexical scope): [figure not reproduced in this transcript]
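A table of little stacks can be sketched as a map from each name to a stack of attribute records. This is an assumed, minimal design (TableOfStacks and its string-valued attributes are hypothetical), populated with the declarations of the Scope class as seen from inside main.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class TableOfStacks {
    private final Map<String, Deque<String>> table = new HashMap<>();

    // A new declaration is pushed on top of any earlier one for the same name.
    void insert(String name, String attributes) {
        table.computeIfAbsent(name, k -> new ArrayDeque<>()).push(attributes);
    }

    // Lookup returns the most recent declaration, or null if there is none.
    String lookup(String name) {
        Deque<String> stack = table.get(name);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    // On scope exit, each name declared in that scope is popped.
    void remove(String name) {
        Deque<String> stack = table.get(name);
        if (stack != null && !stack.isEmpty()) {
            stack.pop();
        }
    }

    public static void main(String[] args) {
        TableOfStacks t = new TableOfStacks();
        t.insert("x", "static int, class Scope");
        t.insert("f", "static method, class Scope");
        t.insert("main", "static method, class Scope");
        t.insert("args", "String[], parameter of main");
        t.insert("x", "int, local to main");
        System.out.println(t.lookup("x"));   // int, local to main
        t.remove("x");                       // leaving main
        t.remove("args");
        System.out.println(t.lookup("x"));   // static int, class Scope
    }
}
```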

14 Symbol table structure (2)
Alternatively, a stack of little tables, one for each scope. For example, the previous example would look as follows (lexical scope): [figure not reproduced in this transcript; its labels read "Can be deleted after leaving f" and "Current table inside main"]
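The stack-of-tables organization can be sketched with one small map per scope, searched from the innermost scope outward. Again this is an assumed design for illustration; StackOfTables and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class StackOfTables {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void enterScope() { scopes.push(new HashMap<>()); }

    // Leaving a scope discards its whole table at once.
    void exitScope() { scopes.pop(); }

    // Declarations go into the table of the current (innermost) scope.
    void insert(String name, String attributes) {
        scopes.peek().put(name, attributes);
    }

    // Iteration over the deque starts at the top, i.e. the innermost scope.
    String lookup(String name) {
        for (Map<String, String> scope : scopes) {
            String attributes = scope.get(name);
            if (attributes != null) {
                return attributes;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        StackOfTables t = new StackOfTables();
        t.enterScope();                          // class Scope
        t.insert("x", "static int, class Scope");
        t.enterScope();                          // body of main
        t.insert("x", "int, local to main");
        System.out.println(t.lookup("x"));       // int, local to main
        t.exitScope();
        System.out.println(t.lookup("x"));       // static int, class Scope
    }
}
```

Because outer tables stay intact on the stack, an outer declaration remains recoverable from an inner scope, which is the versatility the later evaluation slide refers to.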

15 Symbol table structure (3)
The symbol table is constructed as declarations are encountered (insert operation). With lexical scope, insertions follow the static structure of the source code. With dynamic scope, insertions follow the execution path. In dynamic scope, lookups occur as names are encountered (in the symbol table as built to that point). In lexical scope, lookups occur either as names are encountered, in the symbol table as built to that point (declaration before use, as in C), or all lookups are delayed until the symbol table is fully constructed and then performed (as in a Java class, where scope applies backwards to the beginning of the class).

16 Symbol table structure (4)
Using dynamic scope, the same example would look as follows: [figure not reproduced in this transcript; its labels read "Current table inside f" and "Current table inside main"]

17 Symbol table structure evaluated
Which organization is better? Table of little stacks is simpler (C, Pascal). Stack of little tables is more versatile, and helpful when you need to recover outer scopes from within inner ones or from elsewhere in the code (Ada, Java, C++). Normally, no specific table structure is part of a language specification: any structure that provides the appropriate properties will do.

18 Ada example (Fig. 5.17, p. 151)
[Figure not reproduced in this transcript.]

19 Overloading
Overloading is a property of symbol tables that allows them to successfully handle declarations that use the same name within the same scope. It is the job of the symbol table to pick the correct choice from among the declarations for the same name in the same scope. This is called overload resolution. It must do so by using extra information, typically the data type of each declaration, which it compares to the probable type at the use site, picking the best match. If it cannot successfully do this, a static semantic error occurs.

20 Overloading (2)
Overloading typically applies only to functions or methods. Overloading must be distinguished from dynamic binding in an OO language. Overloading is made difficult by weak typing, particularly automatic conversions. In the presence of partially specified types, such as in ML, overload resolution becomes even more difficult, which is why ML disallows it. Scheme disallows it for a different reason: there are no types on which to base overload resolution, even during execution.
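The difference between overloading and dynamic binding shows up when the static type of an argument differs from its run-time class. The sketch below uses hypothetical classes Shape and Circle: the overload of describe is chosen from the declared type during translation, while the overridden method name is chosen from the actual object during execution.

```java
class Shape {
    String name() { return "shape"; }
}

class Circle extends Shape {
    @Override
    String name() { return "circle"; }   // dynamic binding: chosen at run time
}

class OverloadVsOverride {
    // Two overloads: the compiler picks one using the static argument type.
    static String describe(Shape s)  { return "describe(Shape) on a " + s.name(); }
    static String describe(Circle c) { return "describe(Circle) on a " + c.name(); }

    public static void main(String[] args) {
        Shape s = new Circle();           // static type Shape, run-time class Circle
        System.out.println(describe(s));  // prints "describe(Shape) on a circle"
    }
}
```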

21 Overloading (3)
An example in Java:
public class Overload {
    public static int max(int x, int y) { return x > y ? x : y; }
    public static double max(double x, double y) { return x > y ? x : y; }
    public static int max(int x, int y, int z) { return max(max(x, y), z); }
    public static void main(String[] args) {
        System.out.println(max(1, 2));
        System.out.println(max(1, 2, 3));
        System.out.println(max(4, 1.3));
    }
}
Adding more max functions that mix double and int parameters is ok. But adding ones that differ only in their return type (double versus int) is not!

22 Overloading (4)
C++ and Ada are even more challenging for overload resolution: C++, because it allows many more automatic conversions; Ada, because the return type is also used to resolve overloading (Ada gets away with this only because it allows no automatic conversions). It is possible for languages to also keep different symbol tables for different kinds of declarations. In Java these are called "name spaces," and they also represent a kind of overloading. Java is particularly ugly in this respect: there are different name spaces for classes, methods, params/vars, labels, and even packages; see Figure 5.23, page 158.

23 The Environment
Can be constructed entirely statically (Fortran): all vars and functions have fixed locations for the duration of execution. Can also be entirely dynamic: functional languages like Scheme and ML. Most languages use a mix: C, C++, Java, Ada. Consists of three components: a fixed area for static allocation; a stack area for LIFO allocation (usually the processor stack); a "heap" area for on-demand dynamic allocation (with or without garbage collection).

24 Typical environment organization (possible C) [Figure 5.25, p. 165]
[Figure not reproduced in this transcript. © 2003 Brooks/Cole - Thomson Learning™]

25 The Runtime Stack
Used for: procedure/function/method calls, temporaries, and local variables. Temporaries: intermediate results that cannot be kept in registers; not considered here further. Procedure calls: Chapter 8. Local variables: part of calls, but can be considered independently, showing LIFO behavior for nested scopes (next slide).

26 Example of stack-based allocation in C within a procedure:
A: { int x;
     char y;
     B: { double x;
          int a;
          /* Point #1 */
        } /* end B */
     C: { char y;
          int b;
          D: { int x;
               double y;
               /* Point #2 */
             } /* end D */
        } /* end C */
   } /* end A */

27 Stack at Point #1
[Figure not reproduced in this transcript. © 2003 Brooks/Cole - Thomson Learning™]

28 Stack at Point #2
[Figure not reproduced in this transcript. © 2003 Brooks/Cole - Thomson Learning™]
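Since the two stack figures are missing here, the LIFO behavior of the nested blocks can be approximated with a small simulation. It rests on an assumption that is not confirmed by this transcript: that Point #1 lies inside block B and Point #2 inside block D. The class BlockStack and the qualified names such as "B.x" are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

class BlockStack {
    private final Deque<String> stack = new ArrayDeque<>();
    private final Deque<Integer> blockSizes = new ArrayDeque<>();

    // Entering a block allocates its locals on top of the stack.
    void enter(String... locals) {
        for (String v : locals) {
            stack.push(v);
        }
        blockSizes.push(locals.length);
    }

    // Leaving a block deallocates exactly the locals of that block.
    void exit() {
        int n = blockSizes.pop();
        for (int i = 0; i < n; i++) {
            stack.pop();
        }
    }

    // Current contents, listed from the bottom of the stack to the top.
    List<String> snapshot() {
        List<String> contents = new ArrayList<>(stack);
        Collections.reverse(contents);
        return contents;
    }

    public static void main(String[] args) {
        BlockStack s = new BlockStack();
        s.enter("A.x", "A.y");
        s.enter("B.x", "B.a");
        System.out.println(s.snapshot());   // assumed Point #1: [A.x, A.y, B.x, B.a]
        s.exit();                           // end B
        s.enter("C.y", "C.b");
        s.enter("D.x", "D.y");
        System.out.println(s.snapshot());   // assumed Point #2: [A.x, A.y, C.y, C.b, D.x, D.y]
    }
}
```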

29 An alternative: a flat local space
All local variables are allocated at once, regardless of nesting. Wastes some space, but not critically. With this approach, only complete function/method calls get allocated on the stack. Even with the previous approach, the primary structure of the stack is still the call structure: a complete record of a call on the stack is called an activation record or frame, and the stack is referred to as the call stack (see Chapter 8). Java promotes a flat space by forbidding nested redeclarations, but this is not an essential property: a symbol table can easily distinguish nested declarations as A.x, A.y, A.B.x, A.B.a, etc.

30 Heap Allocation
In "standard" languages (C, C++, Java) heap allocation requires a special operation: new in C++ and Java, malloc in C. Any kind of data can be allocated on the heap in C/C++; in Java all objects and only objects are allocated on the heap. Even with heap allocation available in Java and C/C++, the stack is still used to represent calls. In C/C++, deallocation is typically by hand (free in C; delete, often called from destructors, in C++), but it is hard to do right. Java uses a garbage collector that periodically sweeps the heap looking for data that can no longer be accessed by the program and adds it back to free space.

31 Heap Allocation (2)
In functional languages (Scheme, ML) heap allocation is performed automatically, and virtually everything, including function calls, is allocated on the heap. Of course, functional languages also use garbage collection, since deallocation is automatic as well. (Indeed, SML/NJ as its default still quaintly announces calls to the garbage collector, including some statistics.) A lot of study and effort has been devoted by both the functional language and OO language communities to making garbage collection efficient in both time and space. Sadly, C and C++ still lack standard garbage collectors.

32 Lifetime/Extent
The lifetime or extent of a program entity is the duration of its allocation in the environment. Allocation is static when the lifetime is the duration of the entire program execution. Lifetime is related to but not identical to scope. With scope holes, lifetime can extend to regions of the program where the program entity is not accessible. It is also possible for scope to exceed lifetime when a language allows locations to be manipulated directly (for example, through manual deallocation). This is of course very dangerous!

33 Variables and Constants
A variable is an object whose stored value can change during execution. A constant is an object whose value does not change throughout its lifetime. Constants are often confused with literals: constants have names, literals do not. Constants may be: compile-time static (may not ever be allocated), load-time static, or dynamic.

34 Constants (2)
Compile-time constant in Java: static final int zero = 0;
Load-time constant in Java: static final Date now = new Date();
Dynamic constant in Java: any non-static final assigned in a constructor.
Java takes a very general view of constants, since it is not very worried about getting rid of them during compilation. C takes a much stricter view of constants, essentially forcing them to be capable of elimination during compilation.
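The three kinds of Java constant can be placed side by side in one class. This is a small sketch: the class name ConstantKinds and the field id are hypothetical, while zero and now follow the declarations quoted above.

```java
import java.util.Date;

class ConstantKinds {
    // Compile-time constant: the compiler can substitute 0 wherever zero is used.
    static final int zero = 0;

    // Load-time constant: computed once, when the class is initialized.
    static final Date now = new Date();

    // Dynamic constant: fixed separately for each object, in the constructor.
    final int id;

    ConstantKinds(int id) {
        this.id = id;
    }

    public static void main(String[] args) {
        ConstantKinds a = new ConstantKinds(1);
        ConstantKinds b = new ConstantKinds(2);
        System.out.println(zero + " " + a.id + " " + b.id);   // prints "0 1 2"
    }
}
```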

35 Aliases, Dangling References, and Garbage
An alias occurs when the same object is bound to two different names at the same time. This is fairly common with Java objects. A dangling reference is a location that has been deallocated from the environment, but is still accessible within the program. Dangling references are impossible in a garbage-collected environment with no direct access to addresses. Garbage is memory that is still allocated in the environment but has become inaccessible to the program. Garbage can be a problem in a non-garbage collected environment, but is much less serious than dangling references.
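Aliasing and garbage are both easy to produce in Java. The sketch below (AliasDemo is a hypothetical name) binds one array to two names, changes it through one of them, and then drops both names so that the array becomes garbage for the collector to reclaim.

```java
class AliasDemo {
    // Returns the first element as seen through a after a write through b.
    static int writeThroughAlias() {
        int[] a = {1, 2, 3};
        int[] b = a;          // b is an alias: both names refer to the same array
        b[0] = 99;            // the change is visible through a as well
        int seenThroughA = a[0];
        a = null;
        b = null;             // no name reaches the array any more: it is garbage
        return seenThroughA;
    }

    public static void main(String[] args) {
        System.out.println(writeThroughAlias());   // prints 99
    }
}
```

A dangling reference cannot be demonstrated the same way, since Java offers no operation that deallocates an object while a name still refers to it.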

