Subprograms Support process abstraction and modularity

Characteristics of subprograms:
- each subprogram has a single entry point
- the calling entity is suspended when control is transferred to the subprogram
- control returns to the calling entity upon termination of the subprogram
- concurrent subprograms have multiple entry points

Two types: procedures and functions

Subprogram design issues:
- what mode(s) of parameter passing are used?
- should parameters be type checked?
- are optional parameters allowed?
- can a parameter be a subprogram name?
- are local variables static or stack-dynamic (or other)? Static variables do not permit recursion
- is separate compilation of subprograms allowable?
- can subprograms be generic or overloaded?

In chapter 13, we look at concurrency, where subprograms or modules behave differently from what is described above (single entry point, etc.). C/C++ refers to all subprograms as functions for simplicity. In C, a void function is equivalent to other languages' procedures. A procedure is merely a subprogram that does not return a value; procedures are common in Pascal-like languages. Return values can be passed through parameters whether you are using a procedure or a function, so the only real reason to distinguish the two is that functions act like mathematical functions.

Subprogram Header and Parameters
The subprogram header:
- defines the type of syntactic structure (function or procedure); C-family languages only have functions but can simulate procedures using void
- provides a name for the subprogram
- specifies the parameter profile: the number and types of the parameters

There are two ways that a subprogram can access data from the calling entity:
- nonlocal (global) variables; we discourage global variables because they make program execution unpredictable
- parameters

Here is an example Pascal procedure for comparison with C:

    procedure foo(param1, param2 : integer; var param3 : real);
    var
      local_1 : integer;
      local_2 : real;
    begin
      local_1 := param1 + param2;
      local_2 := param3 * local_1;
      param3 := local_2
    end;

Notice that the procedure is introduced by the keyword procedure, that parameters (like local variables) are declared as name : type, and that the word var in the parameter list means that the parameter (param3) is one whose value is returned to the calling unit. Assume a = 1, b = 2, c = 3.2; then calling foo(a, b, c) causes c to become 3.2 * 3 = 9.6 after the procedure terminates. We will examine how parameter values are returned later in this chapter and in chapter 10.

Keyword params are available in Python, Ada and Fortran 95. The idea is that you name the formal parameter that each actual parameter is assigned to in the call, such as by doing

    foo(a, param3 = b, param2 = c)   # Python example

Here, a is a positional parameter and so is passed into the first formal parameter, while for b and c we indicate by keyword which formal parameters they are passed into (param3 and param2 respectively). You are free to combine positional and keyword params in a call as long as all keyword params appear after any positional params (for instance, foo(param3 = a, b, param2 = c) is illegal because b is a positional param and it appears after a keyword param). The textbook gives further examples of supplying default values for params and of optional params.
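
The keyword-parameter rule above can be tried directly in Python. The following is a minimal sketch: the body of foo is hypothetical (it just returns its arguments so the binding can be inspected), and the illegal call is compiled from a string only so that the rest of the file still loads.

```python
def foo(param1, param2, param3):
    """Hypothetical body: return the formals in order to show the binding."""
    return (param1, param2, param3)


a, b, c = 1, 2, 3.2

# a is positional (bound to param1); b and c are bound by keyword.
bound = foo(a, param3=b, param2=c)
print(bound)

# A positional argument after a keyword argument is rejected when the call
# is compiled, before anything runs.
try:
    compile("foo(param3=a, b, param2=c)", "<example>", "eval")
    illegal_call_rejected = False
except SyntaxError:
    illegal_call_rejected = True
print(illegal_call_rejected)
```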

Parameters
We consider parameters in two places:
- in the header, where they are called formal parameters
- in the call, where they are called actual parameters

There are many different forms of parameter passing that we will consider:
- most languages pass parameters by position (positional params)
- some languages also allow for keyword params so that order is not important

There are other options available:
- actual parameter lists may include optional parameters; available in C/C++, C#, Java, Common Lisp, Python, Ruby
- formal parameter lists may have initializations (default values); available in C++, Common Lisp, FORTRAN 95, Ada, PHP, Python, Ruby
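
As a small illustration of the last two options, here is a sketch in Python. The function total and its parameter names are made up for this example: scale is a formal parameter with an initialization, and *more collects any optional actual parameters.

```python
def total(first, scale=1, *more):
    """first is required, scale has a default value, more is optional."""
    return scale * (first + sum(more))


print(total(5))           # scale takes its default value
print(total(5, 2))        # scale supplied positionally
print(total(5, 2, 1, 1))  # two optional extras collected into more
```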

Separate vs. Independent Compilation
Separate compilation allows for large software systems:
- for this to be available, the compiler needs access to information about the parameters and data structures used in the unit being compiled
- separate compilation requires that subprograms be compiled in a proper order
- Java, Ada, Modula-2 and FORTRAN 90 are all like this

Independent compilation allows a subprogram to be compiled without any (or with limited) knowledge of other subprograms:
- found in C, FORTRAN 77, LISP
- in C, the compiler still needs to type check parameters and return types, so prototypes are required
- in FORTRAN 77 and earlier, type checking was not performed
- in LISP, there is no compile-time type checking

Pascal does not allow for either separate or independent compilation.

The advantage of separate compilation of subprograms is that one can test portions of a program and thus build it piecemeal. To make this work correctly, stub subprograms must be available so that the compiler can ensure proper parameter passing. Independent compilation goes further, in that one may compile a function that calls other functions without having first written those functions in any form. In C, prototypes ensure proper parameter passing. Obviously you cannot run the program until all of the subprograms have been implemented, but compilation (to check for syntax errors) is permitted. In Java, for instance, you can compile a class definition without having written the class that uses it, but you cannot compile the user class without having first written and compiled the class(es) that it imports and uses. In C, you can write and compile any function no matter what it calls, as long as you have made the function prototypes available. Lisp provides an even more flexible approach than C: you can write and compile functions irrespective of what else has been written.

Local Variables
Static storage:
- allows for compile-time memory allocation and deallocation
- ensures type checking and has less run-time overhead
- does not allow for recursion

Stack-dynamic storage:
- allows for recursion at the cost of run-time allocation/deallocation and initialization
- because these variables are stored on a stack, referencing is indirect (based on stack position) and possibly time-consuming; we examine this in chapter 10
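
To see why static storage rules out recursion, the sketch below imitates a statically allocated local in Python by keeping the "local" n in a single module-level slot. This is only an emulation under that assumption (Python's own locals are stack-dynamic), and the names are made up for the example.

```python
_static_n = 0  # the one statically allocated slot for the "local" n


def factorial_static(n):
    """Emulates static storage: every activation shares _static_n."""
    global _static_n
    _static_n = n
    if _static_n <= 1:
        return 1
    smaller = factorial_static(_static_n - 1)
    # The recursive call overwrote _static_n, so this multiplies by the
    # wrong value.
    return _static_n * smaller


def factorial_stack(n):
    """Ordinary Python locals: each activation has its own n."""
    if n <= 1:
        return 1
    return n * factorial_stack(n - 1)


print(factorial_static(4), factorial_stack(4))
```

With one shared slot, every activation sees the value left behind by the deepest call, so the static version cannot produce the correct product.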

Language Implementations
- FORTRAN I-IV: static only
- FORTRAN 77: programmer chooses (although still no recursion)
- FORTRAN 90/95: static unless the subprogram is explicitly declared as recursive
- ALGOL 60: first to offer stack-dynamic
- Pascal, Modula-2, Ada, Java: only have stack-dynamic
- C, C++: default to stack-dynamic, but the reserved word static forces a variable to be stored statically; in C++, variables declared in methods can only be stack-dynamic, as in Java
- LISP: implicit heap-dynamic

Parameter Passing Methods
Three modes of parameter passing:
- in mode (params going from program to subprogram)
- out mode (params going from subprogram to program)
- inout mode (both)

Implementations:
- Pass by Value (in)
- Pass by Result (out)
- Pass by Value-Result (inout)
- Pass by Reference (inout)
- Pass by Name (varies)

Implementations
Pass by value:
- the actual param is used to initialize the formal param
- from that point forward, the formal param acts as a local variable and is entirely independent of the actual param
- implemented by physical data transfer (copying the value)
- this can be inefficient if the variable is an array or record (structure)

Pass by result:
- the formal param acts as a local variable (uninitialized at the beginning of the subprogram)
- its value is passed back to the actual param once the subprogram terminates
- again, passing is done by physical data transfer

One alternative approach to implementing pass-by-value is to have the formal parameter refer to the location of the actual parameter while enforcing write-protection to ensure that the actual param cannot change.

Pass-by-result can lead to the following problem: in a call such as sub(p1, p1), p1 could be assigned two different values on return, and the final value depends on the order in which the results are copied back.

The big disadvantage of pass by reference is that any access to the parameter in the subprogram requires at least 2 memory references, one to obtain the address held in the pointer and the other to obtain the actual datum. This makes each access less efficient than with pass by value/result/value-result, although those methods pay for the actual copying of data on subprogram call and return. The alias that is created is not usually a concern unless the actual parameter can also be referenced in the subprogram by another route. However, pass by reference can lead to subtle problems in situations like the following:

    void fun(int *a, int *b) {…}        called with fun(&x, &x);
    void fun2(int *a, int b[ ]) {…}     called with fun2(&x[i], x);

Continued
Pass by value-result:
- combines pass by value and pass by result
- also referred to as pass by copy
- requires two copying actions, yielding the same disadvantages as with pass-by-value and pass-by-result

Pass by reference:
- pass an access path to the memory location storing the variable (that is, pass a pointer)
- physical copying is limited to the address, not the datum, so this is more efficient if the param is an array or structure, and nothing is passed back
- disadvantages: it creates an alias between the formal and actual params, and it requires indirect addressing to access the parameter's value
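
The difference between the two inout implementations shows up when the same variable is passed twice. The sketch below emulates both in Python under one assumption: a "variable" is modelled as a one-element list (a cell), so the subprogram can either work on the cell directly (reference) or copy in and copy out (value-result). All names are hypothetical.

```python
def bump_both_by_reference(a, b):
    """Both formals alias the caller's cells; updates are seen immediately."""
    a[0] += 1
    b[0] += 1


def bump_both_by_value_result(a, b):
    """Copy in at the start, work on locals, copy out at the end."""
    local_a, local_b = a[0], b[0]   # copy in
    local_a += 1
    local_b += 1
    a[0] = local_a                  # copy out
    b[0] = local_b


x = [5]
bump_both_by_reference(x, x)        # the two formals are aliases of x
y = [5]
bump_both_by_value_result(y, y)     # each formal starts from its own copy
print(x[0], y[0])
```

With distinct actual parameters the two versions agree; they differ only when the call creates an alias.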

Pass by Name
In-out mode, implemented in ALGOL 60:
- it is so confusing that it has largely been discarded as obsolete (even dangerous)

Rather than passing a parameter (a value or pointer), the name of the variable is passed:
- in the subprogram, textual substitution takes place
- replace all instances of the formal param with the characters that make up the actual param

This approach is used via macro substitution (in languages that support macro expansion) but not as a form of parameter passing:
- C has compile-time macro expansion by using #define
- C++ and Ada use this for generic subprograms
- Lisp has run-time macro substitution

Pass By Name Example
Assume the following subprogram in an Algol-like language:

    Procedure Sum(Adder, Index, Length)
      Temp = 0
      For Index = 1 to Length do
        Temp = Temp + Adder

Behavior differs based on the kind of actual parameter(s) passed:
- if called by Sum(A, I, L) and A is a scalar, this computes A*L
- if called by Sum(A[I], I, L), this computes the sum of the first L elements in array A
- if called by Sum(A[I]*B[I], I, L), this computes the dot product of arrays A and B

The time at which the parameter is bound to the actual variable is very important in this parameter-passing method.
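
Python has no pass by name, but the three calls above can be imitated by passing parameterless functions (thunks) that re-evaluate the actual parameter each time it is used. This is a sketch under that assumption: env, set_i and the sample data are invented for the example, and the 1-based index I is adjusted when indexing Python lists.

```python
def sum_by_name(adder, set_index, length):
    """adder is a thunk for the Adder expression; set_index assigns Index."""
    temp = 0
    for k in range(1, length + 1):
        set_index(k)            # For Index = 1 to Length
        temp = temp + adder()   # Adder is re-evaluated on every reference
    return temp


# The caller's environment, kept in a dict so the thunks can read and write it.
env = {"I": 0, "S": 5, "A": [1, 2, 3], "B": [4, 5, 6]}


def set_i(value):
    env["I"] = value


scalar_case = sum_by_name(lambda: env["S"], set_i, 3)
array_case = sum_by_name(lambda: env["A"][env["I"] - 1], set_i, 3)
dot_case = sum_by_name(
    lambda: env["A"][env["I"] - 1] * env["B"][env["I"] - 1], set_i, 3
)
print(scalar_case, array_case, dot_case)
```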

Parameter Passing in Various Languages
- pre FORTRAN 77: pass by reference
- more recent FORTRAN: pass by reference for structures and pass by value-result for simple values
- ALGOL 60: pass by name, with pass by value as an option
- APL and Simula 67: pass by name
- ALGOL 68, C, PHP and others: pass by value, but pass by reference can be simulated by passing a pointer
- C++: the same as C, but it also has a pass by reference form that does not require explicit dereferencing
- ALGOL-W: pass by value-result
- Pascal, Modula-2: pass by value, with pass by reference as an option; the word var in the parameter list denotes pass by reference

Continued
- Python, Ruby: pass by assignment, which is essentially pass by reference because all values are objects pointed to by reference variables; however, many objects (in Python, numbers, strings and tuples) are immutable, so "changing" such an object in a subprogram actually causes the formal parameter to point to a new object, while mutable objects such as lists can be changed in place
- Java: pass by value as the default, pass by reference (implicitly) when objects are passed
- C#: pass by value as the default, pass by reference (implicitly) with objects, pass by reference when specified (similar to C++)
- Perl: all parameters are combined into an array and the array is passed by reference
- Ada: 3 explicit modes (in, out, in out); the method used is based on the compiler writer's implementation!
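
A short Python sketch of pass by assignment follows. It contrasts rebinding the formal parameter (which the caller never sees) with mutating the object it refers to (which the caller does see). The function and variable names are made up for the example.

```python
def increment(n):
    n = n + 1              # rebinds the local name n to a new int object
    return n


def append_item(items):
    items.append(99)       # mutates the list object shared with the caller


def replace_list(items):
    items = items + [99]   # builds a new list and rebinds the local name
    return items


count = 1
increment(count)
numbers = [1, 2]
append_item(numbers)
others = [1, 2]
replace_list(others)
print(count, numbers, others)
```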

Implementing Parameter Passing
Parameter passing through the run-time stack:
- by Value: copy the value onto the stack
- by Result: copy the value from the stack
- by Reference: copy the address of the parameter, and automatically dereference the parameter in the subprogram
- by Name: implemented using "thunks"

We will see how the run-time stack works in the next chapter. The following describes how parameters are copied. For pass-by-value, the actual param is copied into the formal param at subprogram start. For pass-by-result, the formal param value is copied from its run-time stack location into the actual param at subprogram end, just before it is popped off the run-time stack. For pass-by-reference, the address of the actual param is copied into the formal param, and the formal param is treated as a pointer for the duration of the subprogram (that is, we dereference the formal param every time it is referenced so that we are actually accessing/manipulating the actual param). For pass-by-name, a "thunk" is executed. This is a parameterless subprogram set up by the compiler; each time the formal parameter is referenced, the thunk is called to evaluate the actual parameter in the caller's referencing environment, which gives the same effect as textually substituting the actual parameter for the formal parameter throughout the subprogram.

Parameter Type Checking
Software reliability demands type checking:
- type checking compares formal and actual parameters for types, number, order, etc., usually at compile time
- different languages have different rules for type checking, which makes some languages safer than others and some more flexible than others

Language rules:
- F77 and earlier: no type checking at all
- Pascal, Modula-2, F90, Ada, Java: type checking required
- Original C, Perl, JavaScript, PHP: no type checking at all; checking the number of parameters is also omitted in original C
- C++: type checking required unless you use an ellipsis (…) for optional parameters
- Python, Ruby: no type checking, since all params are pointers

Arrays as Parameters
A mapping function must be set up by the compiler so that the run-time environment can compute the location of an array element from its indices:
- if separate compilation of subprograms is available, the original definition of the array will already have been compiled, so the mapping function is available when compiling the subprogram
- in languages where independent compilation is available, we need another approach
- in C and C++, this is taken care of by requiring that all of the array dimensions (except for the first) be specified in the function header and function prototype, as in void fun(int array[ ][5][10])
- in FORTRAN, any array that is passed to a subprogram must be redeclared in that subprogram
- in Lisp, the mapping function can be generated at run time, or, if the arrays can change dynamically, they are implemented as linked lists and do not use a mapping function at all

Note for C/C++: the first array dimension is not necessary because the mapping function does not require it. Refer back to chapter 6 and you will see that the 2-D array mapping function does not use the number of rows in the array (for a row-major order language), just the number of columns.
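
The row-major mapping function for an array declared like array[ ][5][10] can be written out as a sketch. The code below counts offsets in elements rather than bytes, and offset_3d and the sample data are hypothetical names for this example. Note that the size of the first dimension never appears in the formula.

```python
def offset_3d(i, j, k, dim2, dim3):
    """Row-major element offset of [i][j][k]; the first dimension is unused."""
    return (i * dim2 + j) * dim3 + k


DIM1, DIM2, DIM3 = 2, 5, 10

# Flat storage whose contents equal their own offset, plus the same data
# viewed as a nested list, to check the mapping against ordinary indexing.
flat = list(range(DIM1 * DIM2 * DIM3))
nested = [
    [[flat[offset_3d(i, j, k, DIM2, DIM3)] for k in range(DIM3)]
     for j in range(DIM2)]
    for i in range(DIM1)
]

print(offset_3d(1, 2, 3, DIM2, DIM3), nested[1][2][3])
```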

Parameters that are Subprogram names
Subprogram names might be passed as parameters:
- for example, we want a subprogram that performs an integration estimate of the area under a curve
- we want the subprogram to execute independently of the function, so we write it so that we pass the name of the function that we want to integrate to an integrate subprogram, rather than writing a version of the subprogram for each function we might want to use

    double integrate(double (*f)(double x), double lowerbd, double upperbd)
    {
        double result, i;
        …
        result = f(i);
    }

This is available in C/C++, Common Lisp, Fortran 90/95, JavaScript, F# and other functional languages.
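
The same idea can be sketched in Python, where a function is passed simply by name. The midpoint rule and the steps parameter are assumptions made for this example; the slide does not say which estimate integrate uses.

```python
import math


def integrate(f, lower_bd, upper_bd, steps=1000):
    """Estimate the area under f on [lower_bd, upper_bd] by the midpoint rule."""
    width = (upper_bd - lower_bd) / steps
    result = 0.0
    for n in range(steps):
        x = lower_bd + (n + 0.5) * width   # midpoint of the n-th strip
        result += f(x)
    return result * width


def square(x):
    return x * x


# The same integrate works for any function of one numeric argument.
print(integrate(square, 0.0, 3.0))
print(integrate(math.sin, 0.0, math.pi))
```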

Passing Subprogram Names
Passing the subprogram's name requires that the subprogram's parameters and their types be passed to the receiving subprogram along with the subprogram's return type:
- in the previous example, function f's return type and its parameter's type must all be specified
- this provides the necessary information for type checking
- in C and C++, pointers to a function are passed rather than the function name itself

When a subprogram is passed, the correct referencing environment in which it will execute must be identified:
- the environment of the call statement that invokes the passed subprogram (shallow binding)
- the environment of the definition of the passed subprogram (deep binding)
- the environment of the call statement that passed the subprogram as an actual parameter (ad hoc binding)
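
The three choices can be told apart with a small nested example. The sketch below is in Python, which uses lexical closures and therefore behaves like deep binding; the names sub1 to sub4 are hypothetical. Under shallow binding the passed subprogram would see the x of sub4, and under ad hoc binding the x of sub3.

```python
def sub1():
    x = 1

    def sub2():
        return x            # which x is meant depends on the binding rule

    def sub3():
        x = 3               # the x that ad hoc binding would use
        return sub4(sub2)   # sub2 is passed as a parameter here

    def sub4(subx):
        x = 4               # the x that shallow binding would use
        return subx()       # the passed subprogram is invoked here

    return sub3()


print(sub1())
```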

Delegates
In C/C++, instead of passing subprogram names, you pass a pointer to a function:
- this allows you to assign a function to the pointer at run time and thus control which function is being passed
- continuing our previous example of integrate, imagine we have the following code

    double function1(double);
    double function2(double);
    double (*f)(double);   /* pointer to the function to integrate */
    if (x == 0) f = function1;
    else f = function2;
    result = integrate(f, lower, upper);

- we can also pass f as (*f); either notation is permitted

In C#, you can have a pointer point at methods, in which case the pointers are called delegates, because you choose the method you want to act based on some condition.

Overloaded Subprograms
Overloaded subprograms share the same name:
- to determine which subprogram a call should invoke, each subprogram must have a different parameter profile
- overloaded subprograms provide a programmer with the flexibility to define multiple subprograms of the same name, each of which operates on different forms of data (making programs more readable)
- for instance, you might do this to implement different sort routines, one for each type of array that is passed
- all of the function calls appear similar, sort(array, n);, but each allows a different type of array (int, double, string)

C#, C++, Java, Ada and Common Lisp have overloaded subprograms.
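
Python itself has no overloading by parameter profile, so the sketch below only approximates the idea with functools.singledispatch from the standard library, which picks an implementation from the run-time type of the first argument. The function describe and its variants are invented for this example; this is not how C++, Java or Ada resolve overloads.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    """Fallback used when no variant is registered for the argument's type."""
    raise TypeError(f"no describe() variant for {type(value).__name__}")


@describe.register(int)
def _describe_int(value):
    return f"integer {value}"


@describe.register(float)
def _describe_float(value):
    return f"real {value:.2f}"


@describe.register(str)
def _describe_str(value):
    return f"string of length {len(value)}"


# Every call looks the same; the argument type selects the variant.
print(describe(7), describe(2.5), describe("abc"))
```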

Generic Subprograms
With overloaded subprograms, the programmer must write a subprogram for each type of parameter desired. Another approach is to replace the type, as specified in both the formal header and any local variables, with a placeholder to be filled in later:
- the generic subprogram simplifies things for the programmer, although the notation can be somewhat confusing
- in a way, this is a form of polymorphism, and it is more commonly used when implementing methods that receive objects rather than primitive data

Dynamically bound languages (Common Lisp, APL) do not have typed variables, so in a way, all subprograms are generic.
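
Python is another dynamically typed language in which one definition serves every type. In the sketch below, the type placeholder T is only an annotation (via the standard typing module) and is not enforced at run time. The function largest is a made-up example and assumes a non-empty sequence whose elements support the > comparison.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")  # placeholder for the element type


def largest(items: Sequence[T]) -> T:
    """Return the largest element of a non-empty sequence."""
    best = items[0]
    for item in items[1:]:
        if item > best:
            best = item
    return best


# One definition, used with three different element types.
print(largest([3, 9, 2]), largest([1.5, 0.2]), largest(["pear", "apple"]))
```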

Examples
In Ada, you declare a subprogram to be generic and declare the type of data as < >:
- the compiler generates specific instances of the subprogram, filling in the < > placeholders with the specific type that can be passed
- this is not particularly practical, and so other languages do this better

In C++, generic subprograms are called templates:
- like in Ada, you specify a placeholder in < >, using <type name> where type is the word class or typename and name is the placeholder you are using throughout your function
- unlike Ada, where each instance must be requested explicitly, the C++ compiler generates a new instance of the function implicitly for each new type that is used with the template; the instances are created at compile time, so only the versions that are actually needed are produced

Continued
In Java, the only type available for a generic is an object:
- Java's version is more efficient than Ada's or C++'s because only one version of the code is produced and used, but for calls to the generic method the compiler inserts any appropriate casts needed (if omitted from the user code)
- in addition, Java permits some variations in the placeholder: <T extends classname> allows you to limit objects to those at or below a certain level of the class hierarchy, and Collection<?> allows a collection of any element type

C# is similar to Java except that there are no wildcards, and you can omit the type in the call and let the compiler infer it.

F# uses type inference, but if the type of a parameter or function return is not inferable, F# automatically uses a generic.

Functions
Are side effects allowed?
- not allowed in Ada
- in pure functional languages like Haskell, there are no variables, so there can be no side effects

What types of values can be returned?
- Python, Ruby, Lua, C/C++/Java/C# allow any type to be returned
- C/C++ can even return pointers to functions
- FORTRAN 77, Pascal and Modula-2 functions allow only primitive types (no structures)

How many values can be returned?
- most languages only permit 1 (although, as noted above, this can be a structure or collection in a language like C/C++/Java)
- Ruby, Lua, Python and F# all allow multiple values to be returned, usually as a list/tuple
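
Returning several values as a tuple looks like this in Python. The function min_max_mean and its sample data are hypothetical. The caller unpacks the tuple into separate variables.

```python
def min_max_mean(values):
    """Return three results at once; Python packs them into one tuple."""
    return min(values), max(values), sum(values) / len(values)


low, high, mean = min_max_mean([2, 4, 9])
print(low, high, mean)
```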

Other Topics
User-overloaded operators:
- available in Ada, Python, Ruby and C++
- these functions are defined just like any function, except that the name is an operator, such as int operator * (…) to define * to be used, say, on vectors
- too much of this is not a good thing, as it can harm readability

Co-routines:
- a special kind of subprogram that is used to implement concurrency
- concurrency is covered in chapter 13

Macros:
- most languages now have facilities for defining macros to be expanded to generate code; introduced in COBOL and extensively used in Lisp
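
As an illustration of a user-overloaded operator, here is a sketch of * defined on vectors in Python, where the operator's function is named __mul__. The Vector class is invented for this example; treating vector * vector as a dot product and vector * number as scaling is one possible design, not the only one.

```python
class Vector:
    def __init__(self, *components):
        self.components = tuple(components)

    def __mul__(self, other):
        """vector * vector gives a dot product; vector * number scales."""
        if isinstance(other, Vector):
            if len(other.components) != len(self.components):
                raise ValueError("vectors must have the same length")
            return sum(a * b for a, b in zip(self.components, other.components))
        return Vector(*(a * other for a in self.components))

    def __eq__(self, other):
        return isinstance(other, Vector) and self.components == other.components


dot = Vector(1, 2, 3) * Vector(4, 5, 6)
scaled = Vector(1, 2) * 3
print(dot, scaled.components)
```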
