CS 3304 Comparative Languages

Lecture 10: Simple Data Types 16 February 2012

Introduction Most programming languages include a notion of type for expressions and/or objects. We all have developed an intuitive notion of what types are; what's behind the intuition?

Why Data Types? Implicit context for many operations:
The programmer does not have to specify the context explicitly. Example: in C, the expression a+b uses integer addition if a and b are integers, and floating-point addition if they are floating-point values. Types also limit the set of operations that may be performed in a semantically valid program. Example: they prevent adding a character and a record. Type checking cannot prevent all meaningless operations, but it catches enough of them to be useful. 1. What purpose(s) do types serve in a programming language?
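
As a small illustration of implicit context, the C++ sketch below (C behaves the same way here) applies the same operator symbols to integer and floating-point operands. The function names are made up for this example.

```cpp
// The operand types alone decide which operation '+' and '/' denote.
int    int_sum(int a, int b)             { return a + b; }  // integer addition
double real_sum(double a, double b)      { return a + b; }  // floating-point addition

int    int_quotient(int a, int b)        { return a / b; }  // integer division, truncates
double real_quotient(double a, double b) { return a / b; }  // floating-point division
```

Calling int_quotient(7, 2) and real_quotient(7.0, 2.0) gives different answers even though both bodies read "a / b".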

Type Systems A type system consists of:
A mechanism to define types and associate them with certain language constructs. A set of rules for type equivalence, type compatibility, and type inference: Type equivalence: when are the types of two values the same? Type compatibility: when can a value of type A be used in a context that expects type B? Type inference: what is the type of an expression, given the types of the operands? Compatibility is the more useful concept, because it tells you what you can do. Polymorphism results when the compiler finds that it doesn't need to know certain things. Subroutines need to have types if they are first- or second-class values.

Type Checking Type checking is the process of ensuring that a program obeys the language’s type compatibility rules. Type clash: a violation of these rules. Strong typing means that the language prevents you from applying an operation to data on which it is not appropriate. Static typing means that the compiler can perform all of this checking at compile time. Examples: Common Lisp is strongly typed, but not statically typed. Ada is statically typed; Pascal is almost statically typed. C is less strongly typed than Pascal, e.g. because of unions and subroutines with variable numbers of parameters. Java is strongly typed, with a non-trivial mix of things that can be checked statically and things that have to be checked dynamically. Scripting languages are generally dynamically typed. 2. What does it mean for a language to be strongly typed? Statically typed? What prevents, say, C from being strongly typed? 3. Name two important programming languages that are strongly but dynamically typed. 4. What is a type clash?

Polymorphism Polymorphism allows a single body of code to work with objects of multiple types: It may or may not imply the need for run-time type checking. Fully dynamic typing: arbitrary operations on arbitrary objects. The check is performed only at run time. Types of objects are implied: implicit parametric polymorphism. Significant run-time cost and delayed error reporting. Type inference: infers a type for every object and expression (e.g., ML). Subtype polymorphism: allows a variable X of type T to refer to an object of any type derived from T. Explicit parametric polymorphism (generics): define classes with type parameters. Dynamic versus static typing.
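
The two statically checked forms of polymorphism can be sketched in C++ as follows. Shape, Circle, describe, and larger are illustrative names, not taken from the lecture.

```cpp
#include <string>

// Subtype polymorphism: a Shape reference may refer to any derived object.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// Works on Shape and on every type derived from it; the call to name()
// is dispatched at run time.
std::string describe(const Shape& s) { return "a " + s.name(); }

// Explicit parametric polymorphism: one body of code, checked and
// instantiated at compile time for each type T that supports '<'.
template <typename T>
T larger(T a, T b) { return a < b ? b : a; }
```

Here describe needs a run-time dispatch, while larger is resolved entirely at compile time, which matches the slide's remark that polymorphism may or may not require run-time checking.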

Meaning of “Type” Collection of values from a “domain” (the denotational approach). Internal structure of a bunch of data, described down to the level of a small set of fundamental types (the structural, or constructive, approach). Equivalence class of objects (the implementor's approach). Collection of well-defined operations that can be applied to objects of that type (the abstraction approach). 5. Discuss the differences among the denotational, constructive, and abstraction-based views of types.

Classification of Types
Discrete (ordinal) types – countable: integer, boolean, char, enumeration, and subrange. Scalar (simple) types - one-dimensional: discrete, rational, real, and complex. Composite types: Records (structures). Variant records (unions). Arrays; strings are arrays of characters. Sets: the mathematical powerset of their base types. Pointers: l-values. Lists: no notion of mapping or indexing. Files. 6. What is the difference between discrete and scalar types? 7. Give two examples of languages that lack a Boolean type. What do they use instead? 8. In what ways may an enumeration type be preferable to a collection of named constants? In what ways may a subrange type be preferable to its base type? In what ways may a string be preferable to an array of characters?
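
As one possible answer sketch for question 8, the C++ fragment below contrasts named constants with an enumeration type. The identifiers are invented for the example.

```cpp
#include <type_traits>

// With plain named constants, nothing stops a weekday from being mixed
// with unrelated integers.
const int MONDAY = 0, TUESDAY = 1;

// A scoped enumeration is a distinct type: its values do not convert
// implicitly to or from int.
enum class Weekday { Mon, Tue, Wed, Thu, Fri, Sat, Sun };
static_assert(!std::is_convertible_v<int, Weekday>);

bool is_weekend(Weekday d) { return d == Weekday::Sat || d == Weekday::Sun; }

// is_weekend(MONDAY);  // would not compile: an int is not a Weekday
```

The expression MONDAY + TUESDAY compiles without complaint even though it is meaningless, while the enumeration version rejects such mixing at compile time.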

Orthogonality Orthogonality is a useful goal in the design of a language, particularly its type system: A collection of features is orthogonal if there are no restrictions on the ways in which the features can be combined (analogy to vectors). For example: Pascal is more orthogonal than Fortran (because it allows arrays of anything, for instance), but it does not permit variant records as arbitrary fields of other records. Orthogonality is nice primarily because it makes a language easy to understand, easy to use, and easy to reason about.

Type Checking In most statically typed languages every definition of an object must specify the object’s type. Many of the contexts in which an object might appear are also typed. The rules involve type equivalence, type compatibility, and type inference. Type compatibility is the most critical. Objects and contexts are often compatible even when their types are different. 11. What is the difference between type equivalence and type compatibility?

Type Equivalence Two major approaches – structural and name equivalence: Name equivalence is based on declarations. Structural equivalence is based on some notion of meaning behind those declarations. The exact definition varies. Name equivalence is more fashionable these days. There are at least two common variants on name equivalence. The differences among all these approaches boil down to where you draw the line between important and unimportant differences between type descriptions. In all three schemes described in the book, we begin by putting every type description in a standard form that takes care of “obviously unimportant” distinctions like those on the next slide. 12. Discuss the comparative advantages of structural and name equivalence for types. Name three languages that use each approach.

Type Equivalence Example
Certainly format does not matter:
    struct { int a, b; }
is the same as:
    struct {
        int a, b;
    }
We certainly want them to be the same as:
    struct {
        int a;
        int b;
    }
How about this?

Name Equivalence How about type aliasing? Is it program dependent?
    TYPE new_type = old_type;
Example where aliased types should not be treated as the same (the assignment is probably a mistake):
    TYPE celsius_temp = REAL;
         fahrenheit_temp = REAL;
    VAR c : celsius_temp;
        f : fahrenheit_temp;
    f := c;
Variants of name equivalence: Strict name equivalence: aliased types are considered distinct. Loose name equivalence: aliased types are considered equivalent. Tricky to implement in the presence of separate compilation. 13. Explain the difference between strict and loose name equivalence.
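
C++ happens to show both behaviours, which makes it handy for a rough sketch (C++17 assumed). Aliases behave loosely, while separately defined structs are distinct types even when their structure is identical. The temperature types and to_fahrenheit are names chosen for this example.

```cpp
#include <type_traits>

// An alias introduces a new name for the same type; the two names are
// treated as equivalent, as under loose name equivalence.
using celsius_alias    = double;
using fahrenheit_alias = double;
static_assert(std::is_same_v<celsius_alias, fahrenheit_alias>);

// Each struct definition introduces a distinct type, even though the
// two have the same structure, as under strict name equivalence.
struct Celsius    { double degrees; };
struct Fahrenheit { double degrees; };
static_assert(!std::is_same_v<Celsius, Fahrenheit>);

// Mixing the two now requires an explicit, meaningful conversion.
Fahrenheit to_fahrenheit(Celsius c) {
    return Fahrenheit{c.degrees * 9.0 / 5.0 + 32.0};
}
```

With the struct versions, an assignment such as f = c is rejected by the compiler, which is the outcome the temperature example argues for.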

Type Conversion and Casts
Expected and provided types: if they differ, an explicit type conversion (type cast) is needed. There are three cases: Types are structurally equivalent but the language uses name equivalence: the conversion is a purely conceptual operation. Types have different sets of values but the intersecting values are represented in the same way: a run-time check may be needed. Types have different low-level representations but some sort of correspondence among their values: machine instructions effect the conversion. In C, a type conversion is specified by using the name of the desired type, in parentheses, as a prefix operator. C performs no run-time checks for arithmetic overflow. 14. Explain the distinction between derived types and subtypes in Ada. 17. Under what circumstances does a type conversion require a run-time check?
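
The last two cases might look like this in C++. Since C and C++ do not insert the range check themselves, the helper to_int8_checked (a hypothetical name) performs it by hand, roughly as an Ada compiler would for a range-constrained type.

```cpp
#include <cstdint>
#include <stdexcept>

// Different low-level representations (integer vs. floating point): the
// compiler emits machine instructions that perform the conversion.
double to_real(int n) { return static_cast<double>(n); }

// The target type holds only some of the source values, so a run-time
// check is needed before converting.
std::int8_t to_int8_checked(int n) {
    if (n < INT8_MIN || n > INT8_MAX)
        throw std::out_of_range("value does not fit in int8_t");
    return static_cast<std::int8_t>(n);
}

// Convenience wrapper that reports whether the check would pass.
bool fits_in_int8(int n) {
    try { to_int8_checked(n); return true; }
    catch (const std::out_of_range&) { return false; }
}
```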

Nonconverting Type Casts
Interpreting the bits of a value of one type as if they were another type: Memory allocation example. Reinterpreting a floating point number as an integer or record. Nonconverting type cast (type pun) - a change of type that does not alter the underlying bits: Ada: a built-in generic subroutine unchecked_conversion. C++: in addition to C casting, a family of alternatives: Type conversion: static_cast. Nonconverting type cast: reinterpret_cast. Manipulating pointers of polymorphic type: dynamic_cast. Removing read-only qualification: const_cast. A nonconverting type cast constitutes a dangerous subversion of the language’s type system. 15. Explain the differences among type conversion, type coercion, and nonconverting type casts.
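
A minimal sketch of the difference between a converting and a nonconverting cast, assuming 32-bit IEEE 754 floats (checked by the static_asserts). It uses std::memcpy in place of reinterpret_cast, because reading a float through a reinterpreted integer pointer is undefined behaviour in C++. The function names are illustrative.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 floats");

// Nonconverting cast: copy the bits of a float into an integer unchanged.
std::uint32_t bits_of(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Converting cast, for contrast: the numeric value is kept (after
// truncation toward zero) and the bit pattern changes.
std::uint32_t value_of(float x) { return static_cast<std::uint32_t>(x); }
```

For the input 1.0f, value_of yields 1 while bits_of yields 0x3F800000, the IEEE 754 encoding of 1.0.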

Type Compatibility Instead of requiring type equivalence, most languages require that a value’s type be compatible with that of the context in which it appears: Assignment statement: the type of the right-hand side must be compatible with that of the left-hand side. Arithmetic operator: the operand types must be compatible with some common type that supports the arithmetic operation. The definition of type compatibility varies from language to language. Ada: type S is compatible with an expected type T if and only if one of the following holds: S and T are equivalent. One is a subtype of the other, or both are subtypes of the same base type. Both are arrays, with the same numbers and types of elements in each dimension.

Coercion I When an expression of one type is used in a context where a different type is expected, one normally gets a type error. But what about: var a : integer; b, c : real; c := a + b; Many languages allow things like this, and coerce an expression to be of the proper type. Coercion can be based just on the types of the operands, or can take into account the expected type from the surrounding context as well. Fortran has lots of coercion, all based on operand type. C has lots of coercion, too, but with simpler rules: All floats in expressions become doubles (the pre-ANSI rule; standard C allows float arithmetic to be done in single precision). short int and char become int in expressions. If necessary, precision is lost when assigning into the left-hand side.
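
These rules can be observed directly in C++ (C++17 assumed), where decltype reports the type the compiler assigns to an expression. The functions mixed_sum and assign_to_int are made-up names that mirror the slide's c := a + b.

```cpp
#include <type_traits>

// Mixed operands: int + double is computed as double.
static_assert(std::is_same_v<decltype(1 + 2.0), double>);

// Integer promotion: char and short operands become int.
static_assert(std::is_same_v<decltype('a' + 'b'), int>);
static_assert(std::is_same_v<decltype(short{1} + short{2}), int>);

// c := a + b from the slide: a is coerced to floating point first.
double mixed_sum(int a, double b) { return a + b; }

// Assigning to a narrower left-hand side silently drops the fraction.
int assign_to_int(double x) { int n = x; return n; }
```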

Coercion II In effect, coercion rules are a relaxation of type checking: Recent thought is that this is probably a bad idea. Languages such as Modula-2 and Ada do not permit coercions. C++, however, goes hog-wild with them. They're one of the hardest parts of the language to understand. Make sure you understand the difference between: Type conversions (explicit). Type coercions (implicit). Sometimes the word ‘cast’ is used for conversions (C is guilty here). 16. Summarize the arguments for and against coercion.

Overloading and Coercion
Overloading and coercion are sometimes used to similar effect. An overloaded name can refer to more than one object; the ambiguity is resolved by the context. Example: addition of numeric quantities. Without coercion: both operands must be of the same type. With coercion: if either operand is real, the addition is floating-point. If the operator is not overloaded, the conversion from integer is always required, which adds overhead. In most languages literal constants or the null pointer can be intermixed in expressions with values of many types. One could say that such constants are overloaded; more commonly, they are treated as a special case in the language’s type-checking rules.
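
A rough C++ sketch of the two mechanisms, with invented function names. sum_kind is overloaded and never converts its arguments, while sum_real exists in a single floating-point version and relies on coercion.

```cpp
#include <string>

// Overloading: the name sum_kind denotes two functions; the argument
// types select one of them at compile time.
std::string sum_kind(int, int)       { return "integer"; }
std::string sum_kind(double, double) { return "floating-point"; }

// sum_kind(1, 2.0);  // rejected as ambiguous: neither overload matches better

// Coercion only: a single function on doubles; integer arguments are
// converted on every call, which is the overhead mentioned above.
double sum_real(double a, double b) { return a + b; }
```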

Universal Reference Types
Used for systems programming or for general-purpose container objects: C, C++: void *. Clu: any. Java: Object. Arbitrary l-values can be assigned into an object of universal reference type with no concern about type safety, since the compiler will not allow any operation to be performed on the referenced object through that type. Problems arise with assignment in the other direction, from a universal reference to a more specific type: The usual solution is to make objects self-descriptive. If they are not, there is no way to identify their type at run time. 18. What purpose is served by universal reference types?
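
Since a void * carries no type information, a C++ program that wants self-descriptive objects has to add the description itself. The sketch below does so with a hand-written tag; Tag, Tagged, wrap, and as_int are all hypothetical names.

```cpp
#include <string>

// Hypothetical tag that makes an object self-descriptive.
enum class Tag { Int, Text };

struct Tagged {
    Tag   tag;   // records what 'ref' really points to
    void* ref;   // universal reference: any object address may be stored
};

// Storing is always safe: any object pointer converts implicitly to void*.
Tagged wrap(int* p)         { return Tagged{Tag::Int,  p}; }
Tagged wrap(std::string* p) { return Tagged{Tag::Text, p}; }

// Going back to a specific type is the unsafe direction; the tag allows
// a check before the cast. Returns nullptr on a mismatch.
int* as_int(const Tagged& t) {
    return t.tag == Tag::Int ? static_cast<int*>(t.ref) : nullptr;
}
```

Java's Object references carry this kind of tag automatically, which is why a Java downcast can be checked at run time.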

Type Inference What determines the type of an overall expression?
Arithmetic operator: the result has the same type as the operands. Comparison: usually Boolean. Function call: type declared in the function’s header. Assignment: the same type as the left-hand side. Sometimes the answer is not obvious: Subranges. Composite objects. 19. What is type inference? Describe three contexts in which it occurs.
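
The four easy cases can be checked mechanically in C++ (C++17 assumed), where decltype yields the type the compiler infers for an expression. The names half, i, and d are placeholders for this sketch.

```cpp
#include <type_traits>

double half(int n) { return n / 2.0; }   // result type comes from the header

int    i = 3;
double d = 0.0;

// Arithmetic operator: same type as the operands.
static_assert(std::is_same_v<decltype(i + i), int>);
// Comparison: Boolean.
static_assert(std::is_same_v<decltype(i < 4), bool>);
// Function call: the type declared in the function's header.
static_assert(std::is_same_v<decltype(half(i)), double>);
// Assignment: the type of the left-hand side (in C++, as an lvalue reference).
static_assert(std::is_same_v<decltype(d = i), double&>);
```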

Subranges One or more operands have subrange types.
Pascal: the result of any arithmetic operation on a subrange has the subrange’s base type. If the result of an arithmetic operation is assigned into a variable of a subrange type, a dynamic semantic check may be required. In languages like Ada, the type of an arithmetic expression in the header of a for loop has special significance.
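
C++ has no subrange types, so the following is only an emulation of the Pascal rule. The class Range1to10 and the helpers sum and fits are invented for this sketch. Arithmetic yields the base type, and the dynamic check happens when a value is stored back into the subrange.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a Pascal subrange such as 1..10.
class Range1to10 {
public:
    explicit Range1to10(int v) : value_(check(v)) {}
    int value() const { return value_; }   // arithmetic is done on the base type
private:
    // The dynamic semantic check performed on every assignment.
    static int check(int v) {
        if (v < 1 || v > 10) throw std::out_of_range("outside 1..10");
        return v;
    }
    int value_;
};

// Adding two subrange values yields the base type (int); the result may
// lie outside 1..10 until it is stored in a subrange variable again.
int sum(Range1to10 a, Range1to10 b) { return a.value() + b.value(); }

// Reports whether storing v into the subrange would pass the check.
bool fits(int v) {
    try { Range1to10 r(v); (void)r; return true; }
    catch (const std::out_of_range&) { return false; }
}
```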

Composite Types Some operators can be applied to values of composite types. Examples: Character strings (Pascal, Ada). Sets (Pascal, Modula). ML type system: Programmers have the option of declaring types (as in a traditional statically typed language). Programmers may also choose not to declare certain types, in which case ML-style type inference determines them.
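
C++ offers a loose analogue of optional type declarations through auto, sketched below (C++17 assumed). This is only an approximation: ML infers one most general type for a function, whereas a C++ generic lambda is instantiated separately for each argument type. The names twice and twice_declared are illustrative.

```cpp
#include <string>
#include <type_traits>

// Declared types, as in a traditional statically typed language.
int twice_declared(int x) { return x + x; }

// Undeclared types: the compiler infers the parameter and result types
// from each use of the function.
auto twice = [](auto x) { return x + x; };

static_assert(std::is_same_v<decltype(twice(1)), int>);
static_assert(std::is_same_v<decltype(twice(1.5)), double>);
```

The same body also works for strings, where + means concatenation, so twice applied to "ab" gives "abab".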

Summary General issues of type systems and type checking.
A type system consists of a set of built-in types, a mechanism to define new types, and rules for type equivalence, type compatibility, and type inference. Denotational, constructive, and abstraction-based points of view which regard types in terms of their values, their substructure, and the operations they support (respectively).
