CS 3304 Comparative Languages

Lecture 9: Control Flow 14 February 2012

Introduction Control flow (or ordering) in program execution.
Eight principal categories of language mechanisms used to specify ordering: Sequencing. Selection. Iteration. Procedural abstraction. Recursion. Concurrency. Exception handling and speculation. Nondeterminacy. The relative importance of different categories of control flow varies significantly among the different classes of programming languages. 1. Name the major categories of control-flow mechanisms.

Expression Evaluation
A simple object: e.g., a literal constant, a named variable or constant. An operator or function applied to a collection of operands or arguments, each of which in turn is an expression. Function calls: a function name followed by a parenthesized, comma-separated list of arguments. Operator: built-in function that uses special, simple syntax – one or two arguments, no parentheses or commas. Sometimes they are “syntactic sugar” for more “normal” looking functions (in C++ a+b is short for a.operator+(b)). Operand: an argument of an operator. 2. What distinguishes operators from other sorts of functions?
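
The "syntactic sugar" point can be seen in Python as well as C++. The sketch below uses a hypothetical Vec2 class (not from the lecture) to show that the infix form, the explicit method call, and the ordinary function operator.add all perform the same operation.

```python
import operator


class Vec2:
    """Hypothetical 2-D vector type, used only for this illustration."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Python calls this method for the infix expression `self + other`.
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)


a, b = Vec2(1, 2), Vec2(3, 4)

infix = a + b                     # operator syntax
method = a.__add__(b)             # the method call the operator stands for
functional = operator.add(a, b)   # the same operation as an ordinary function
```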

Function Call Notation
Where does the function name appear? The notation types are: Prefix: before its arguments: op a b or op(a,b) or (op a b). Infix: among its arguments: a op b. Postfix: after its arguments: a b op. Most imperative languages use infix notation for binary operators and prefix notation for unary operators. Lisp uses prefix notation (Cambridge Polish): (op a b). ML and the R scripting language allow the user to create new infix operators. Smalltalk uses infix notation for all functions (multiword infix): myBox displayOn: myScreen at: Postscript and Forth use postfix notation for most of their functions; other examples include the C post-increment and decrement operators. 3. Explain the difference between prefix, infix, and postfix notation. What is Cambridge Polish notation? Name two programming languages that use postfix notation.
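
Postfix notation is easy to evaluate with a stack, which is one reason stack-based languages adopt it. The following is a minimal sketch, assuming integer operands and a hypothetical eval_postfix helper that handles only three binary operators.

```python
def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of string tokens."""
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            # The right operand was pushed last, so it is popped first.
            y = stack.pop()
            x = stack.pop()
            stack.append(ops[tok](x, y))
        else:
            stack.append(int(tok))
    return stack.pop()


# The infix expression (1 + 2) * (7 - 4) written in postfix.
result = eval_postfix("1 2 + 7 4 - *".split())
```

Note that the postfix form needs no parentheses: the position of each operator already fixes which operands it applies to.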

Precedence and Associativity
Infix notation requires the use of parentheses to avoid ambiguity. The choice among alternative evaluation orders depends on the precedence and associativity of the operators: C has a very rich precedence structure (15 levels), which makes the levels hard to remember. Pascal has a relatively flat precedence hierarchy (3 levels). APL and Smalltalk: all operators are of equal precedence. Associativity rules specify whether sequences of operators of equal precedence group to the right or to the left: Usually the operators associate left-to-right. Fortran: the exponentiation operator ** associates right-to-left. C: the assignment operator associates right-to-left. 4. Why don’t issues of associativity and precedence arise in Postscript or Forth?
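
Python happens to follow the same conventions as Fortran here: ** groups right-to-left, while subtraction groups left-to-right. A small illustration (the variable names are only for this sketch):

```python
right_assoc = 2 ** 3 ** 2       # groups as 2 ** (3 ** 2)
forced_left = (2 ** 3) ** 2     # parentheses override associativity

left_assoc = 10 - 4 - 3         # groups as (10 - 4) - 3
forced_right = 10 - (4 - 3)

precedence = 2 + 3 * 4          # * binds tighter than +, so 2 + (3 * 4)
```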

Precedence in Fortran, C, Ada, Pascal

Assignments Functional language: expressions are the building blocks:
Computation is expression evaluation that depends only on the referencing environment for that evaluation. Expressions in a purely functional language are referentially transparent: the value depends only on the referencing environment. Imperative language: computation is usually an ordered series of changes to the values of variables in memory. Assignments provide the principal means for these changes. Side effect: a programming construct influences subsequent computation in any way other than by returning a value for use in the surrounding context. Expressions: always produce a value and may have side effects. Statements: executed solely for their side effects. Imperative programming: computing by means of side effects. 5. What does it mean for an expression to be referentially transparent?

References and Values Subtle but important differences in the semantics of assignment in different imperative languages. Based on the context, a variable may refer to the value of the variable (r-value) or its location (l-value) – a named container for a value. Value model of variables: an expression can be either an l-value or an r-value, based on the context in which it appears. In Java, built-in types can’t be passed uniformly to methods expecting class-type parameters, hence wrapper classes and automatic boxing/unboxing. Reference model of variables: a variable is a named reference for a value – every variable is an l-value. E.g., integer values (like 2) are immutable. A variable has to be dereferenced to obtain its value. 6. What is the difference between a value model of variables and a reference model of variables? Why is the distinction important? 7. What is an l-value? An r-value? 8. Why is the distinction between mutable and immutable values important in the implementation of a language with a reference model of variables?
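
Python uses a reference model, so it gives a convenient way to see why mutability matters. In this sketch, assignment copies a reference rather than a value: the sharing is observable for a mutable list but not for an immutable integer.

```python
b = [1, 2, 3]
c = b              # c and b now refer to the same list object
c.append(4)        # a mutation made through c is visible through b

x = 2
y = x              # both names refer to the immutable integer 2
y = y + 1          # rebinds y to a different object; x is unaffected
```

Because integers are immutable, an implementation is free to copy them instead of sharing them, and no program can tell the difference.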

Orthogonality Orthogonality: features can be used in any combination, the combinations all make sense, and the meaning of a given feature is consistent. Algol 68: orthogonality was a principal design goal. Expression-oriented - no separate notion of statement: begin a := if b < c then d else e; a := begin f(b); g(c) end; g(d); end Pascal: everything is a statement. C distinguishes between statements and expressions but has expression statements. Assignment within an expression: a source of problems in C, since it uses = for the assignment operator, which is easily confused with the equality test ==. 9. Define orthogonality in the context of programming language design. 10. What does it mean for a language to be expression-oriented?

Combination Assignment Operators
Imperative languages frequently update variables and can use statements like a = a + 1; that result in redundant address calculations. If the address calculation has a side effect, it has to be rewritten using additional statement(s). Starting with Algol 68, many languages provide assignment operators to update variables, e.g., a += 1;. C provides 10 different assignment operators, one for each of its binary arithmetic and bit-wise operators. Additionally, prefix and postfix increment and decrement operators. Multiway assignment - tuples: a,b = c,d means a = c; b = d; a,b = b,a; swapping variable values. a,b,c = foo(d,e,f); functions return tuples and single values. 11. What are the advantages of updating a variable with an assignment operator, rather than with a regular assignment in which the variable appears on both the left- and right-hand sides?
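
The side-effect problem can be made concrete. The sketch below assumes a hypothetical next_index function whose only side effect is recording that it was called; with += the index expression is evaluated once, while the spelled-out form evaluates it twice. The last two lines show multiway assignment used to swap two values.

```python
index_calls = []


def next_index():
    """Hypothetical index computation with a side effect: it records each call."""
    index_calls.append("called")
    return 0


counts = [10]

counts[next_index()] += 1                         # index expression evaluated once
calls_for_compound = len(index_calls)

index_calls.clear()
counts[next_index()] = counts[next_index()] + 1   # index expression evaluated twice
calls_for_plain = len(index_calls)

a, b = 1, 2
a, b = b, a                                       # multiway assignment swaps the values
```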

Initialization Imperative languages do not always initialize the values of variables in declarations - three reasons why they should: A static variable local to a subroutine. Statically allocated variable: initialization at compile time. Prevents accidental use of uninitialized variables. In addition to built-in types, to provide an orthogonal approach, aggregates (built-up structured values of user-defined composite types) are needed (C, Ada, ML). A language can provide a default value. Use of an uninitialized variable as a dynamic semantic error. Run-time detection could be expensive. Definite assignment: no use of uninitialized variables. Every possible control path assigns a value. Constructors: initialization versus assignment. 12. Given the ability to assign a value into a variable, why is it useful to be able to specify an initial value? 13. What are aggregates? Why are they useful? 14. Explain the notion of definite assignment in Java and C#. 15. Why is it generally expensive to catch all uses of uninitialized variables at run time? 16. Why is it impossible to catch all uses of uninitialized variables at compile time?

Ordering with Expressions
Precedence and associativity not sufficient: Operand evaluation order. Subroutine arguments evaluation order. Why is the evaluation order important? Side effects: an operand that is a function can modify other operands. Code improvement: impact on register allocation and instruction scheduling. Most languages leave the evaluation order undefined. Java represents a shift away from performance as the overriding design goal. Some implementations: the compiler can rearrange the expressions with commutative/associative/distributive operators to generate faster code. Problem: limited precision of computer arithmetic, arithmetic overflow. 17. Why do most languages leave unspecified the order in which the arguments of an operator or function are evaluated?
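
Both concerns on this slide can be observed directly. In the sketch below, the first part shows that floating-point addition is not associative, so a compiler that regroups operands can change the result. The second part uses a hypothetical tracing function f to show operand order; Python, like Java, defines it as left to right.

```python
# Limited precision: regrouping a sum changes the answer.
a, b, c = 1e16, -1e16, 1.0
left_grouping = (a + b) + c     # 0.0 + 1.0
right_grouping = a + (b + c)    # the 1.0 is lost when added to -1e16

# Operand evaluation order: record the order in which operands are evaluated.
trace = []


def f(tag, value):
    """Hypothetical operand with a side effect: it logs its own evaluation."""
    trace.append(tag)
    return value


total = f("left", 1) + f("right", 2)
```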

Short-Circuit Evaluation
Short-circuit evaluation of Boolean expressions: skipping the rest of the computation if the value can be determined: (a > b) or (b > c): if a is greater than b, the value of the Boolean expression is true regardless of the values of b and c. Can save a significant amount of time in some situations. It changes the semantics of Boolean expressions. Possible problems with side effects. Some languages provide both regular and short-circuit Boolean operators (Ada). Can be considered an example of lazy evaluation. 18. What is short-circuit Boolean evaluation? Why is it useful?
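
A short illustration of why the semantic change matters, using Python's short-circuit and/or. The helper names first_is_positive and check are assumptions made for this sketch: the first relies on short-circuiting to avoid an out-of-range index, and the second logs which operands were actually evaluated.

```python
def first_is_positive(items):
    # items[0] is evaluated only if the list is non-empty, so no IndexError occurs.
    return bool(items) and items[0] > 0


evaluated = []


def check(name, value):
    """Hypothetical operand with a side effect: it logs its own evaluation."""
    evaluated.append(name)
    return value


# Mirrors (a > b) or (b > c): the second operand is skipped once the first is true.
result = check("a > b", True) or check("b > c", False)
```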

Structured and Unstructured Flow
Assembly language: conditional and unconditional branches. Early Fortran: relied heavily on goto statements (and labels): IF (A .LT. B) GOTO … 10 Late 1960s: the abandonment of goto statements began. Move to structured programming in the 1970s: Top-down design (progressive refinement). Modularization of code. Descriptive variable names. Within a subroutine, a well-designed imperative algorithm can be expressed with only sequencing, selection, and iteration. Most of the structured control-flow constructs were introduced by Algol 60.

Structured Alternatives to goto
With the structured constructs available, there remained a small number of special cases in which goto was replaced by special constructs: return, break, continue. Multilevel returns: branching outside the current subroutine. Unwinding: the repair operation that restores the run-time stack of subroutine information, including the restoration of register contents. Errors and other exceptions within nested subroutines: Auxiliary Boolean variable. Nonlocal GOTOs. Multilevel returns. Exception handling. Continuations: a generalization of nonlocal gotos that unwind the stack – fundamental to denotational semantics. 19. List the principal uses of goto, and the structured alternatives to each. 20. Explain the distinction between exceptions and multilevel returns. 21. What are continuations? What other language features do they subsume?

Sequencing The principal means of controlling the order in which side effects occur. Compound statement: a delimited list of statements. Block: a compound statement optionally preceded by a set of declarations. The value of a list of statements: The value of its final element (Algol 68). Programmer’s choice (Common Lisp – not purely functional). Can have side effects; very imperative, von Neumann. There are situations where side effects in functions are desirable: random number generators. Euclid and Turing: functions are not permitted to have side effects. 22. Why is sequencing a comparatively unimportant form of control flow in Lisp? 23. Explain why it may sometimes be useful for a function to have side effects.

Selection Selection statement: mostly some variant of if…then…else.
Languages differ in the details of the syntax. Short-circuited conditions: The Boolean expression is not used to compute a value but to cause control to branch to various locations. Provides a way to generate efficient (jump) code. Parse tree: inherited attributes of the root inform it of the address to which control should branch.
Source:
    if ((A > B) and (C > D)) or (E ≠ F) then
        then_clause
    else
        else_clause
Jump code:
        r1 := A
        r2 := B
        if r1 <= r2 goto L4
        r1 := C
        r2 := D
        if r1 > r2 goto L1
    L4: r1 := E
        r2 := F
        if r1 = r2 goto L2
    L1: then_clause
        goto L3
    L2: else_clause
    L3:
24. Describe the jump code implementation of the short-circuit Boolean evaluation.

Case/Switch Statements
Alternative syntax for a special case of nested if..then..else. CASE … (* expression *) : clause_A | 2, 7: clause_B | : clause_C | 10: clause_D ELSE clause_E END Code fragments (clauses): the arms of the CASE statement. The lists of constants are the CASE statement labels: The constants must be disjoint. The constants must be of a type compatible with the tested expression. The principal motivation is to facilitate the generation of efficient target code: meant to compute the address to which to jump in a single instruction. A jump table: a table of addresses. 25. Why do imperative languages commonly provide a case statement in addition to if…then…else?

Alternative Implementations
A non-dense set of labels results in a very large jump table. Other approaches include: Sequential testing: number n of case statement labels is small, O(n). Hashing: the range of label values is large but with many missing values and no large ranges, O(1). Binary search: large value ranges, O(log n). Compilers need to use a variety of strategies. Syntactic details vary from language to language. Pascal: no default clause. Modula: optional else clause. C: ranges not allowed, fall-through provision. Case statements are one of the clearest examples of language design driven by implementation: generation of jump tables. 26. Describe three different search strategies that might be employed in the implementation of a case statement, and the circumstances in which each would be desirable.
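
The strategies above can be sketched in Python, with strings standing in for the addresses of the arms. All labels, arm names, and function names here are hypothetical; a real compiler emits these structures as tables in the target code rather than as library calls.

```python
import bisect

# Jump table: dense labels index directly into a table of arms.
JUMP_TABLE = ["arm_0", "arm_1", "arm_2", "arm_3"]


def dispatch_table(value):
    if 0 <= value < len(JUMP_TABLE):
        return JUMP_TABLE[value]
    return "default_arm"


# Hashing: sparse, widely scattered labels with no ranges; expected O(1).
SPARSE_ARMS = {3: "arm_a", 250: "arm_b", 70000: "arm_c"}


def dispatch_hashed(value):
    return SPARSE_ARMS.get(value, "default_arm")


# Binary search: labels are large ranges, kept sorted by lower bound; O(log n).
RANGE_STARTS = [0, 10, 100]
RANGE_ENDS = [9, 99, 999]
RANGE_ARMS = ["one_digit", "two_digits", "three_digits"]


def dispatch_ranges(value):
    # Find the last range whose lower bound is <= value, then check its upper bound.
    i = bisect.bisect_right(RANGE_STARTS, value) - 1
    if i >= 0 and value <= RANGE_ENDS[i]:
        return RANGE_ARMS[i]
    return "default_arm"
```

A direct table for the sparse labels would need 70001 entries to hold three arms, which is the "very large jump table" problem the slide describes.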

Iteration Iteration: a mechanism that allows a computer to perform similar operations repeatedly. Favored in imperative languages. Mostly some form of loops executed for their side effects: Enumeration-controlled loops: executed once for every value in a given finite set. Logically controlled loops: executed until some Boolean condition changes value. Combination loops: combine the properties of enumeration-controlled and logically controlled loops (Algol 60). Iterators: executed over the elements of a well-defined set (often called containers or collections in object-oriented code). 33. What is a container (a collection)?

Enumeration-Controlled Loops
Originated with the DO loop in Fortran I. Adopted in almost every language but with varying syntax and semantics. Many modern languages allow iteration over much more general finite sets. Semantic complications: Can control enter or leave the loop in any way other than through the enumeration mechanism? What happens if the loop body modifies variables that were used to compute the end-of-loop bound? What happens if the loop body modifies the index variable itself? Can the program read the index variable after the loop has completed, and if so, what will its value be? Solution: the loop header contains a declaration of the index. 27. Describe three subtleties in the implementation of enumeration-controlled loops. 28. Why do most languages not allow the bounds or increment of an enumeration-controlled loop to be floating-point numbers? 29. Why do many languages require the step size of an enumeration-controlled loop to be a compile-time constant? 30. Describe the “iteration count” loop implementation. What problem(s) does it solve?
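
One common answer to these complications is the "iteration count" implementation: the number of iterations is computed once, before the loop starts, so later changes to the bound cannot affect it. A minimal sketch, assuming integer arguments, an inclusive upper bound, and a nonzero step (iteration_count and do_loop are hypothetical names):

```python
def iteration_count(first, last, step):
    """Iterations of a loop from first to last (inclusive) by step; step must be nonzero."""
    # Floor division handles both positive and negative steps.
    return max((last - first + step) // step, 0)


def do_loop(first, last, step, body):
    count = iteration_count(first, last, step)   # bounds are evaluated exactly once
    i = first
    while count > 0:
        body(i)
        i += step
        count -= 1


seen = []
do_loop(1, 10, 3, seen.append)
```

Because the loop tests the count rather than comparing the index with the bound, the same code works for either sign of step and the index never needs to be incremented past the bound for the final test.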

Combination Loops Algol 60: can specify an arbitrary number of “enumerators” – a single value, a range of values, or an expression. Common Lisp: four separate sets of clauses – initialize index variables, test for loop termination, evaluate body expressions, and cleanup at loop termination. C: semantically, for loop is logically controlled but makes enumeration easy - it is the programmer’s responsibility to test the terminating condition. The index and any variables in the terminating condition can be modified within the loop. All the code affecting the flow of control is localized within the header. The index can be made local by declaring it within the loop thus it is not visible outside the loop. 31. What are advantages of making an index variable local to the loop it controls? 32. Does C have enumeration-controlled loops?

Iterators True iterators: a container abstraction provides an iterator that enumerates its items (Clu, Python, Ruby, C#). An iterator is a separate thread of control, with its own program counter, whose execution is interleaved with that of the loop. for i in range(first, last, step): Iterator objects: iteration involves both a special form of a for loop and a mechanism to enumerate the values for the loop: Java: an object that implements the Iterable interface – includes an iterator() method that returns an Iterator object. for (Iterator<Integer> it = myTree.iterator(); it.hasNext();) { Integer i = it.next(); System.out.println(i); } C++: overloading operators so that iterating over the elements is like using pointer arithmetic. 34. Explain the difference between true iterators and iterator objects. 35. Cite two advantages of iterator objects over the use of programming conventions in a language like C.
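
The contrast between the two styles shows up clearly on a tree traversal. In this sketch (the Node class and both traversals are hypothetical), the generator keeps its position implicitly in suspended frames, as a true iterator does, while the iterator object must maintain an explicit stack between calls.

```python
class Node:
    """Hypothetical binary-tree node, used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right


def in_order(node):
    """True iterator: a generator whose state is its own suspended execution."""
    if node is not None:
        yield from in_order(node.left)
        yield node.value
        yield from in_order(node.right)


class InOrderIterator:
    """Iterator object: the traversal state is stored explicitly in a stack."""

    def __init__(self, root):
        self._stack = []
        self._push_left(root)

    def _push_left(self, node):
        while node is not None:
            self._stack.append(node)
            node = node.left

    def __iter__(self):
        return self

    def __next__(self):
        if not self._stack:
            raise StopIteration
        node = self._stack.pop()
        self._push_left(node.right)
        return node.value


tree = Node(2, Node(1), Node(3))
from_generator = list(in_order(tree))
from_object = list(InOrderIterator(tree))
```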

Iterating First-class functions: functional languages support “in line” function specification. The body of a loop is written as a function, the loop index as an argument – the function is then passed to an iterator. Scheme: a lambda expression. Smalltalk: a square-bracketed block. Iterating without iterators: languages without true iterators and iterator objects – use programming conventions, e.g., define a type and associated functions (C). The syntax of the loop is not elegant and, probably, more prone to accidental errors. The code for the iterator is simply a type and some associated functions. 36. Describe the approach to iteration typically employed in languages with first-class functions.
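
The first-class-function approach can be sketched as follows: the loop body becomes a lambda that takes the index as its argument, and a hypothetical for_each_in_range function plays the role of the iterator.

```python
def for_each_in_range(first, last, body):
    """Hypothetical iterator: calls body(i) for each i from first to last, inclusive."""
    i = first
    while i <= last:
        body(i)
        i += 1


squares = []
# The loop body is passed in as an in-line function of the loop index.
for_each_in_range(1, 4, lambda i: squares.append(i * i))
```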

Logically Controlled Loops
The only issue: where within the body of the loop the termination condition is tested. Before each iteration: the familiar while loop syntax – using an explicit concluding keyword or bracketing the body with delimiters. Post-test loops: test the terminating condition at the bottom of a loop – the body is always executed at least once. Midtest loops: often accomplished with a special statement nested inside a conditional – break (C), exit (Ada), or last (Perl). 37. Give an example in which a midtest loop results in more elegant code than does a pretest or post-test loop.
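
A typical case for a midtest loop is "read, test, process": the read must happen before the test and the processing after it. A minimal sketch with a hypothetical read_until_blank function:

```python
def read_until_blank(lines):
    """Collect lines up to, but not including, the first blank line."""
    source = iter(lines)
    collected = []
    while True:
        line = next(source, "")     # work that must precede the test
        if line.strip() == "":      # the termination test, in the middle of the body
            break
        collected.append(line)      # work that must follow the test
    return collected
```

Written as a pretest loop, the same logic would need the read duplicated before the loop and at the end of the body.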

Recursion Recursion requires no special syntax: why?
Recursion and iteration are equally powerful. Most languages provide both iteration (more “imperative”) and recursion (more “functional”). Tail-recursive function: additional computation never follows a recursive call. The compiler can reuse the space, i.e., no need for dynamic allocation of stack space. int gcd(int a, int b) { if (a == b) return a; else if (a > b) return gcd(a - b, b); else return gcd(a, b - a); } Sometimes simple transformations are sufficient to produce tail-recursive code: continuation-passing style. 38. What is a tail-recursive function? Why is tail recursion important?
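
The space reuse can be made visible by writing out the loop that a tail-call-optimizing compiler effectively produces for the gcd function above. This is a sketch for positive integers; note that CPython itself does not perform this optimization, so the recursive version still consumes stack there.

```python
def gcd_recursive(a, b):
    """Subtraction-based gcd for positive integers; every recursive call is a tail call."""
    if a == b:
        return a
    elif a > b:
        return gcd_recursive(a - b, b)
    else:
        return gcd_recursive(a, b - a)


def gcd_iterative(a, b):
    """The same algorithm after tail-call elimination: overwrite the arguments and loop."""
    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
    return a
```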

Evaluation Order It is possible to pass unevaluated arguments to subroutines and evaluate them when needed. Applicative-order evaluation: evaluation before the subroutine call – clearer and more efficient. Normal-order evaluation: evaluation only when the value is needed – occurs in macros, short-circuit Boolean evaluation, call-by-name parameters, and some functional languages. Example: Algol 60 uses normal-order evaluation by default for user-defined functions to mimic the behavior of macros. 39. Explain the difference between applicative and normal order evaluation of expressions. Under what circumstances is each desirable?

Lazy Evaluation In the absence of side effects, lazy evaluation has the same semantics as normal-order evaluation. Scheme provides delay and force for optional normal-order evaluation. The implementation keeps track of evaluated expressions and reuses the values if needed again. Promise: a delayed expression. Memoization: the mechanism used to keep track of which promises have already been evaluated. Often used to create infinite or lazy data structures that are “fleshed out” on demand: (define naturals (letrec ((next (lambda (n) (cons n (delay (next (+ n 1))))))) (next 1))) (define head car) (define tail (lambda (stream) (force (cdr stream)))) 40. What is lazy evaluation? What are promises? What is memoization? 41. Give two reasons why lazy evaluation may be desirable. – infinite data structures, combinatorial search problems. 42. Name a language in which parameters are always evaluated lazily. – Miranda, Haskell.
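
Python has no built-in delay and force, but the idea can be sketched with a small hypothetical Promise class that memoizes its result. The stream functions below mirror the Scheme naturals, head, and tail definitions; take is an extra helper added for this illustration.

```python
class Promise:
    """A delayed expression whose value is computed at most once (memoized)."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._thunk()
            self._forced = True
            self._thunk = None      # the delayed expression is no longer needed
        return self._value


def naturals_from(n):
    """An infinite stream: a pair of the head and a promise for the rest."""
    return (n, Promise(lambda: naturals_from(n + 1)))


def head(stream):
    return stream[0]


def tail(stream):
    return stream[1].force()


def take(stream, k):
    """Hypothetical helper: the first k elements of a stream as a list."""
    out = []
    for _ in range(k):
        out.append(head(stream))
        stream = tail(stream)
    return out


naturals = naturals_from(1)
first_five = take(naturals, 5)
```

Only the cells that are actually demanded are ever built, which is what allows the stream to be conceptually infinite.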

Nondeterminacy Dijkstra suggested the use of nondeterminacy for selection and logically controlled loops (guarded commands):
    if condition -> stmt list
    [] condition -> stmt list
    [] condition -> stmt list
    …
    fi
    do condition -> stmt list
    [] condition -> stmt list
    [] condition -> stmt list
    …
    od
Guard: each of the conditions in these constructs. A nondeterministic choice is made among the guards that evaluate to true, and the statement list following the chosen guard is executed. Nondeterminacy in concurrent programs can affect correctness. Ideally, what we should like in a nondeterministic construct is a guarantee of fairness. 43. Give two reasons why a programmer might sometimes want control flow to be nondeterministic. – concurrent programs (avoiding deadlock) 48. What is a guarded command? 49. Explain why nondeterminacy is particularly important for concurrent programs. 50. Give three alternative definitions of fairness in the context of nondeterminacy. 51. Describe three possible ways of implementing the choice among guards that evaluate to true. What are the tradeoffs among these?
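
One way to implement the choice among true guards is to pick one at random, which is what the sketch below does. The function names, the dictionary used for program state, and the two example programs are all assumptions made for this illustration, not Dijkstra's notation.

```python
import random


def guarded_if(state, guarded_commands, rng=random):
    """Run one command whose guard is true; it is an error if no guard is true."""
    enabled = [cmd for guard, cmd in guarded_commands if guard(state)]
    if not enabled:
        raise RuntimeError("no guard is true")
    rng.choice(enabled)(state)      # nondeterministic choice among the true guards
    return state


def guarded_do(state, guarded_commands, rng=random):
    """Repeatedly run a command whose guard is true; stop when no guard is true."""
    while True:
        enabled = [cmd for guard, cmd in guarded_commands if guard(state)]
        if not enabled:
            return state
        rng.choice(enabled)(state)


# if a >= b -> m := a [] b >= a -> m := b fi   (both guards are true when a = b)
max_commands = [
    (lambda s: s["a"] >= s["b"], lambda s: s.update(m=s["a"])),
    (lambda s: s["b"] >= s["a"], lambda s: s.update(m=s["b"])),
]

# do a > b -> a := a - b [] b > a -> b := b - a od
gcd_commands = [
    (lambda s: s["a"] > s["b"], lambda s: s.update(a=s["a"] - s["b"])),
    (lambda s: s["b"] > s["a"], lambda s: s.update(b=s["b"] - s["a"])),
]

largest = guarded_if({"a": 5, "b": 5}, max_commands)["m"]
final = guarded_do({"a": 48, "b": 18}, gcd_commands)
```

A uniformly random choice makes it increasingly unlikely, but not impossible, that a guard that stays true is passed over forever; stronger fairness guarantees need a different selection scheme.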

Summary Distinction between l-values and r-values, as well as the value model and the reference model of variables. Sequencing and iteration are fundamental to imperative programming. Recursion is fundamental to functional programming. The evolution of constructs is driven by ease of programming, semantic elegance, ease of implementation, and run-time efficiency. Improvements in language semantics are worth a small cost in run-time efficiency (e.g., iterators). Programming conventions can help in older, comparatively primitive languages.
