© Kenneth C. Louden, 20031 Chapter 5 - Basic Semantics Programming Languages: Principles and Practice, 2nd Ed. Kenneth C. Louden.

Slides:



Advertisements
Similar presentations
Chapter 3:: Names, Scopes, and Bindings
Advertisements

Introduction to Memory Management. 2 General Structure of Run-Time Memory.
Names and Bindings.
Chapter 7: User-Defined Functions II
Chapter 7: User-Defined Functions II Instructor: Mohammad Mojaddam.
Various languages….  Could affect performance  Could affect reliability  Could affect language choice.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 18.
CPSC 388 – Compiler Design and Construction
Chapter 8 Runtime Support. How program structures are implemented in a computer memory? The evolution of programming language design has led to the creation.
Chapter 5 Basic Semantics
Programming Languages Third Edition
CS 330 Programming Languages 10 / 16 / 2008 Instructor: Michael Eckmann.
Memory Allocation. Three kinds of memory Fixed memory Stack memory Heap memory.
CS 536 Spring Run-time organization Lecture 19.
Chapter 5K. Louden, Programming Languages1 Chapter 5 Basic Semantics.
Honors Compilers Addressing of Local Variables Mar 19 th, 2002.
Run time vs. Compile time
The environment of the computation Declarations introduce names that denote entities. At execution-time, entities are bound to values or to locations:
Chapter 9: Subprogram Control
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
Names and Bindings Introduction Names Variables The concept of binding Chapter 5-a.
Chapter 5 - Basic Semantics
1 Chapter 5: Names, Bindings and Scopes Lionel Williams Jr. and Victoria Yan CSci 210, Advanced Software Paradigms September 26, 2010.
C++ Programming: Program Design Including Data Structures, Fourth Edition Chapter 13: Pointers, Classes, Virtual Functions, and Abstract Classes.
Chapter TwelveModern Programming Languages1 Memory Locations For Variables.
C++ Programming: From Problem Analysis to Program Design, Fourth Edition Chapter 14: Pointers, Classes, Virtual Functions, and Abstract Classes.
CSC3315 (Spring 2009)1 CSC 3315 Programming Languages Hamid Harroud School of Science and Engineering, Akhawayn University
5-1 Chapter 5: Names, Bindings, Type Checking, and Scopes Variables The Concept of Binding Type Checking Strong Typing Type Compatibility Scope and Lifetime.
Compiler Construction
1 Names, Scopes and Bindings Aaron Bloomfield CS 415 Fall
Chapter 8 - Control II: Procedures and Environments
Basic Semantics Attributes, Bindings, and Semantic Functions
1 Scope Rules (Section 3.3) CSCI 431 Programming Languages Fall 2003 A compilation of material developed by Felix Hernandez-Campos and Michael Scott.
Introduction A variable can be characterized by a collection of properties, or attributes, the most important of which is type, a fundamental concept in.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Basic Semantics Associating meaning with language entities.
Runtime Environments. Support of Execution  Activation Tree  Control Stack  Scope  Binding of Names –Data object (values in storage) –Environment.
1 Records Record aggregate of data elements –Possibly heterogeneous –Elements/slots are identified by names –Elements in same fixed order in all records.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 9.
COMP3190: Principle of Programming Languages
RUN-Time Organization Compiler phase— Before writing a code generator, we must decide how to marshal the resources of the target machine (instructions,
Semantics CSE 340 – Principles of Programming Languages Fall 2015 Adam Doupé Arizona State University
ISBN Chapter 6 Data Types Pointer Types Reference Types Memory Management.
Object-Oriented Programming Chapter Chapter
Concepts of programming languages Chapter 5 Names, Bindings, and Scopes Lec. 12 Lecturer: Dr. Emad Nabil 1-1.
1 Becoming More Effective with C++ … Day Two Stanley B. Lippman
Variables reference, coding, visibility. Rules for making names  permitted character set  maximum length, significant length  case sensitivity  special.
1 Structure of Compilers Lexical Analyzer (scanner) Modified Source Program Parser Tokens Semantic Analysis Syntactic Structure Optimizer Code Generator.
CS 330 Programming Languages 10 / 23 / 2007 Instructor: Michael Eckmann.
Constructs for Data Organization and Program Control, Scope, Binding, and Parameter Passing. Expression Evaluation.
Names, Scope, and Bindings Programming Languages and Paradigms.
Procedure Definitions and Semantics Procedures support control abstraction in programming languages. In most programming languages, a procedure is defined.
Internet Computing Module II. Syllabus Creating & Using classes in Java – Methods and Classes – Inheritance – Super Class – Method Overriding – Packages.
Runtime Environments Chapter 7. Support of Execution  Activation Tree  Control Stack  Scope  Binding of Names –Data object (values in storage) –Environment.
Data Types Chapter 6: Data Types Lectures # 13. Topics Chapter 6: Data Types 2 Introduction Primitive Data Types Character String Types Array Types Associative.
Advanced Programming in C
Object Lifetime and Pointers
Data Types In Text: Chapter 6.
Names and Attributes Names are a key programming language feature
CS 326 Programming Languages, Concepts and Implementation
Type Checking, and Scopes
CS 326 Programming Languages, Concepts and Implementation
CSE 3302 Programming Languages
Concepts of programming languages
Implementing Subprograms
CSE 3302 Programming Languages
Names and Binding In Text: Chapter 5.
Symbol Table 薛智文 (textbook ch#2.7 and 6.5) 薛智文 96 Spring.
COMPILERS Semantic Analysis
Implementing Subprograms
Presentation transcript:

© Kenneth C. Louden, Chapter 5 - Basic Semantics Programming Languages: Principles and Practice, 2nd Ed. Kenneth C. Louden

Chapter 5Louden, Programming Languages2 Attributes Properties of language entities, especially identifiers used in a program. Important examples: –Value of an expression –Data type of an identifier –Maximum number of digits in an integer –Location of a variable –Code body of a function or method Declarations ("definitions") bind attributes to identifiers. Different declarations may bind the same identifier to different sets of attributes.

Chapter 5Louden, Programming Languages3 Binding times can vary widely: Value of an expression: during execution or during translation (constant expression). Data type of an identifier: translation time (Java) or execution time (Smalltalk, Lisp). Maximum number of digits in an integer: language definition time or language implementation time. Location of a variable: load or execution time. Code body of a function or method: translation time or link time or execution time.

Chapter 5Louden, Programming Languages4 Symbol table and environment A dictionary or table is used to maintain the identifier/attribute bindings. It can be maintained either during translation or execution or both. (Pre-translation entities are entered into the initial or default table.) During translation this table is called the symbol table. During execution this table is called the environment. If both are maintained, the environment can usually dispense with names, keeping track only of locations (names are maintained implicitly).

Chapter 5Louden, Programming Languages5 Semantic functions Formally, we can think of the symbol table and environment as functions. If the symbol table and environment are maintained separately (compiler): SymbolTable: Names  Static Attributes Environment: Names  Locations Memory: Locations  Values If the symbol table and environment are maintained together (interpreter): Environment: Names  Attributes (including locations and values)

Chapter 5Louden, Programming Languages6 Declarations Declarations bind identifiers to attributes. The collection of bound attributes (including the identifier) can be viewed as equivalent to the declaration itself. Attributes may be explicit or implicit in a declaration. A declaration may fail to fully specify all necessary attributes—a supplemental declaration is then necessary elsewhere. By abuse of language, we sometimes refer to the declaration of x, or just x —using the identifier to stand for the declaration.

Chapter 5Louden, Programming Languages7 Examples of declarations (C) int x = 0; Explicitly specifies data type and initial value. Implicitly specifies scope (see next slide) and location in memory. int f(double); Explicitly specifies type ( double  int ). Implicitly specifies nothing else: needs another declaration specifying code. The former is called a definition in C, the latter is simply a declaration.

Chapter 5Louden, Programming Languages8 Scope The scope of a declaration is the region of the program to which the bindings established by the declaration apply. (If individual bindings apply over different regions: scope of a binding.) Scope is typically indicated implicitly by the position of the declaration in the code, though keywords can modify it. In a block-structured language, the scope is typically the code from the end of the declaration to the end of the "block" (indicated by braces {…} in C and Java) in which the declaration occurs. Scope can extend backwards to the beginning of the block in certain cases (class declarations in Java and C++, top-level declarations in Scheme).

Chapter 5Louden, Programming Languages9 Lexical vs. dynamic scope Scope is maintained by the properties of the lookup operation in the symbol table or environment. If scope is managed statically (prior to execution), the language is said to have static or lexical scope ("lexical" because it follows the layout of the code in the file). If scope is managed directly during execution, then the language is said to have dynamic scope. The next slide has an example showing the difference. It is possible to maintain lexical scope during execution (i.e. by the environment), but it requires extra links and a somewhat unusual lookup operation (see Chapter 8). (Scheme does this.)

Chapter 5Louden, Programming Languages10 Java scope example public class Scope { public static int x = 2; public static void f() { System.out.println(x); } public static void main(String[] args) { int x = 3; f(); } } Of course, this prints 2, but under dynamic scope it would print 3 (the most recent declaration of x in the execution path is found).

Chapter 5Louden, Programming Languages11 Dynamic scope evaluated Almost all languages use lexical scope: with dynamic scope the meaning of a variable cannot be known until execution time, thus there cannot be any static checking. In particular, no static type checking. Originally used in Lisp. Scheme could still use it, but doesn't. Some languages still use it: VBScript, Javascript, Perl (older versions). Lisp inventor (McCarthy) now calls it a bug. Still useful as a pedagogical tool to understand the workings of scope. In some ways a lot like dynamic binding of methods.

Chapter 5Louden, Programming Languages12 Scope holes Under either lexical or dynamic scope, a nested or more recent declaration can mask a prior declaration. Indeed, in slide 10, the local declaration of x in main masks the static declaration of x in the Scope class. How would you access the static x inside main in Java? Use Scope.x in place of x : public static void main(String[] args) { int x = 3; Scope.x = 4;...} Exercise: how to do this for non-statics?

Chapter 5Louden, Programming Languages13 Symbol table structure A table of little stacks of declarations under each name. For example the table for the Scope class of slide 10 would look as follows inside main (using lexical scope):

Chapter 5Louden, Programming Languages14 Symbol table structure (2) Alternatively, a stack of little tables, one for each scope. For example, the previous example would look as follows (lexical scope): Can be deleted after leaving f Current table inside main

Chapter 5Louden, Programming Languages15 Symbol table structure (3) Symbol table is constructed as declarations are encountered (insert operation). Insertions follow static structure of source code with lexical scope. Insertions follow execution path with dynamic scope. Lookups occur as names are encountered in dynamic scope (in symbol table to that point). In lexical scope, lookups occur either as names are encountered in symbol table to that point (declaration before use—C), or all lookups are delayed until after the symbol table is fully constructed and then performed (Java class—scope applies backwards to beginning of class).

Chapter 5Louden, Programming Languages16 Symbol table structure (4) Using dynamic scope, the same example would look as follows: Current table inside f Current table inside main

Chapter 5Louden, Programming Languages17 Symbol table structure evaluated Which organization is better? Table of little stacks is simpler (C, Pascal). Stack of little tables is more versatile, and helpful when you need to recover outer scopes from within inner ones or from elsewhere in the code (Ada, Java, C++). Normally, no specific table structure is part of a language specification: any structure that provides the appropriate properties will do.

Chapter 5Louden, Programming Languages18 Ada example (Fig. 5.17, p.151):

Chapter 5Louden, Programming Languages19 Overloading Overloading is a property of symbol tables that allows them to successfully handle declarations that use the same name within the same scope. It is the job of the symbol table to pick the correct choice from among the declarations for the same name in the same scope. This is called overload resolution. It must do so by using extra information, typically the data type of each declaration, which it compares to the probable type at the use site, picking the best match. If it cannot successfully do this, a static semantic error occurs.

Chapter 5Louden, Programming Languages20 Overloading (2) Overloading typically applies only to functions or methods. Overloading must be distinguished from dynamic binding in an OO language. Overloading is made difficult by weak typing, particularly automatic conversions. In the presence of partially specified types, such as in ML, overload resolution becomes even more difficult, which is why ML disallows it. Scheme disallows it for a different reason: there are no types on which to base overload resolution, even during execution.

Chapter 5Louden, Programming Languages21 Overloading (3) An example in Java: public class Overload { public static int max(int x, int y) { return x > y ? x : y;} public static double max(double x, double y) { return x > y ? x : y;} public static int max(int x, int y, int z) { return max(max(x,y),z);} public static void main(String[] args) { System.out.println(max(1,2)); System.out.println(max(1,2,3)); System.out.println(max(4,1.3)); }} Adding more max functions that mix double and int parameters is ok. But adding ones that mix double and int return values is not!

Chapter 5Louden, Programming Languages22 Overloading (4) C++ and Ada are even more challenging for overload resolution: C++, because it allows many more automatic conversions, Ada because the return type is also used to resolve overloading (Ada gets away with this only because it allows no automatic conversions). It is possible for languages to also keep different symbol tables for different kinds of declarations. In Java these are called "name spaces," and they also represent a kind of overloading. Java is particularly ugly in this respect: there are different name spaces for classes, methods, params/vars, labels, and even packages -- see Figure 5.23, page 158.

Chapter 5Louden, Programming Languages23 The Environment Can be constructed entirely statically (Fortran): all vars and functions have fixed locations for the duration of execution. Can also be entirely dynamic: functional languages like Scheme and ML. Most language use a mix: C, C++, Java, Ada. Consists of three components: –A fixed area for static allocation –A stack area for lifo allocation (usually the processor stack) –A "heap" area for on-demand dynamic allocation (with or without garbage collection)

Chapter 5Louden, Programming Languages24 Typical environment organization (possible C) [Figure 5.25, p. 165)] © 2003 Brooks/Cole - Thomson Learning ™

Chapter 5Louden, Programming Languages25 The Runtime Stack Used for: –Procedure/function/method calls –temporaries –local variables Temporaries: intermediate results that cannot be kept in registers; not considered here further. Procedure calls: Chapter 8. Local variables: part of calls, but can be considered independently, showing LIFO behavior for nested scopes (next slide).

Chapter 5Louden, Programming Languages26 Example of stack-based allocation in C within a procedure: (1)A: { int x; (2) char y; (4) B: { double x; (5) int a; (7) } /* end B */ (8) C: { char y; (9) int b; (11) D: { int x; (12) double y; (14) } /* end D */ (16) } /* end C */ (18) } /* end A */ Point #1 Point #2

Chapter 5Louden, Programming Languages27 Stack at Point #1: © 2003 Brooks/Cole - Thomson Learning ™

Chapter 5Louden, Programming Languages28 Stack at Point #2: © 2003 Brooks/Cole - Thomson Learning ™

Chapter 5Louden, Programming Languages29 An alternative: a flat local space All local variables allocated at once, regardless of nesting. Wastes some space, but not critically. With this approach, only complete function/method calls get allocated on the stack. Even with the previous approach, the primary structure of the stack is still the call structure: a complete record of a call on the stack is called an activation record or frame, and the stack is referred to as the call stack. (Chapter 8) Java promotes a flat space by forbidding nested redeclarations, but this is not an essential property: a symbol table can easily distinguish nested declarations as A.x, A.y, A.B.x, A.B.a, etc.

Chapter 5Louden, Programming Languages30 Heap Allocation In "standard" languages (C, C++, Java) heap allocation requires a special operation: new. Any kind of data can be allocated on the heap in C/C++; in Java all objects and only objects are allocated on the heap. Even with heap allocation available in Java & C/C++, the stack is still used to represent calls. In C/C++, deallocation is typically by hand (destructors), but it is hard to do right. Java uses a garbage collector that periodically sweeps the heap looking for data that cannot be accessed any more by the program and adding it back to free space.

Chapter 5Louden, Programming Languages31 Heap Allocation (2) In functional languages (Scheme, ML) heap allocation is performed automatically, and virtually everything, including function calls, are allocated on the heap. Of course, functional languages also use garbage collection, since deallocation is automatic as well. (Indeed, SML/NJ as its default still quaintly announces calls to the garbage collector, including some statistics.) A lot of study and effort has been made by both the functional language and OO language community to make garbage collection efficient in both time and space. Sadly, C and C++ still lack standard garbage collectors.

Chapter 5Louden, Programming Languages32 Benefit (of system controlled storage management): ability to delay the binding of a storage segment's size and/or location reuse of a storage segment for different jobs (from system supervisor point of view) reuse of storage for different data structures increased generality, not have to specify maximum data structure size dynamic data structures recursive procedures - garbage collection is automatic Benefits of programmer controlled storage management Disadvantage: burden on programmer & may interfere with necessary system control May lead to subtle errors May interfere with system-controlled storage management Advantage: difficult for system to determine when storage may be most effectively allocated and freed

Chapter 5Louden, Programming Languages33 Heap management –Single-size cells vs. variable-size cells –Reference counters (eager approach) vs. garbage collection (lazy approach) 1. Reference counters: maintain a counter in every cell that store the number of pointers currently pointing at the cell –Disadvantages: space required, complications for cells connected circularly Expensive - when making a pointer assignment p=q 1.decrement count for old value of p 2.if 0, return to free storage. Check if contains references to other blocks. Could be recursive 3.do pointer assignment 4.Increment reference count for q

Chapter 5Louden, Programming Languages34 One-bit reference counting Another variation on reference counting, called one-bit reference counting, uses a single bit flag to indicate whether each object has either "one" or "many" references. If a reference to an object with "one" reference is removed, then the object can be recycled. If an object has "many" references, then removing references does not change this, and that object will never be recycled. It is possible to store the flag as part of the pointer to the object, so no additional space is required in each object to store the count. One-bit reference counting is effective in practice because most actual objects have a reference count of one.

Chapter 5Louden, Programming Languages35 2. Garbage collection: allocate until all available cells allocated; then begin gathering all garbage –Every heap cell has an extra bit used by collection algorithm –All cells initially set to garbage –All pointers traced into heap, and reachable cells marked as not garbage –All garbage cells returned to list of available cells

Chapter 5Louden, Programming Languages36 Disadvantages: when you need it most, it works worst (takes most time when program needs most of cells in heap)

Chapter 5Louden, Programming Languages37 Mark-Sweep - Java uses In a mark-sweep collection, the collector first examines the program variables; any blocks of memory pointed to are added to a list of blocks to be examined. For each block on that list, it sets a flag (the mark) on the block to show that it is still required, and also that it has been processed. It also adds to the list any blocks pointed to by that block that have not yet been marked. In this way, all blocks that can be reached by the program are marked. In the second phase, the collector sweeps all allocated memory, searching for blocks that have not been marked. If it finds any, it returns them to the allocator for reuse Can find circular references. Easy if regular use of pointers (Like in LISP) All elements must be reachable by a chain of pointers which begins outside the heap Have to be able to know where all pointers are - both inside the heap and outside. How can a chain be followed from a pointer if there is no predefined location for that pointer in the pointed-to cell?

Chapter 5Louden, Programming Languages38 Conservative garbage collection Although garbage collection was first invented in 1958, many languages have been designed and implemented without the possibility of garbage collection in mind. It is usually difficult to add normal garbage collection to such a system, but there is a technique, known as conservative garbage collection, that can be used. The usual problem with such a language is that it doesn't provide the collector with information about the data types, and the collector cannot therefore determine what is a pointer and what isn't. A conservative collector assumes that anything might be a pointer. It regards any data value that looks like a pointer to or into a block of allocated memory as preventing the recycling of that block. You might think that conservative garbage collection could easily perform quite poorly, leaving a lot of garbage uncollected. In practice, it does quite well, and there are refinements that improve matters further. Because references are ambiguous, some objects may be retained despite being actually unreachable. In practice, this happens rarely, and refinements such as black-listing can further reduce the odds.

Chapter 5Louden, Programming Languages39Lifetime/Extent The lifetime or extent of a program entity is the duration of its allocation in the environment. Allocation is static when the lifetime is the duration of the entire program execution. Lifetime is related to but not identical to scope. With scope holes, lifetime can extend to regions of the program where the program entity is not accessible. It is also possible for scope to exceed lifetime when a language allows locations to be manipulated directly (as for example manual deallocation). This is of course very dangerous!

Chapter 5Louden, Programming Languages40 Variables and Constants A variable is an object whose stored value can change during execution. A constant is an object whose value does not change throughout its lifetime. Constants are often confused with literals: constants have names, literals do not. Constants may be: –compile-time static (may not ever be allocated) –load-time static –dynamic

Chapter 5Louden, Programming Languages41 Constants (2) Compile-time constant in Java: static final int zero = 0; Load-time constant in Java: static final Date now = new Date(); Dynamic constant in Java: any non-static final assigned in a constructor. Java takes a very general view of constants, since it is not very worried about getting rid of them during compilation. C takes a much stricter view of constants, essentially forcing them to be capable of elimination during compilation.

Chapter 5Louden, Programming Languages42 Aliases, Dangling References, and Garbage An alias occurs when the same object is bound to two different names at the same time. This is fairly common with Java objects. A dangling reference is a location that has been deallocated from the environment, but is still accessible within the program. Dangling references are impossible in a garbage-collected environment with no direct access to addresses. Garbage is memory that is still allocated in the environment but has become inaccessible to the program. Garbage can be a problem in a non- garbage collected environment, but is much less serious than dangling references.