Static Checking and Type Systems

Slides:



Advertisements
Similar presentations
CH4.1 Type Checking Md. Fahim Computer Engineering Department Jamia Millia Islamia (A Central University) New Delhi –
Advertisements

CS3012: Formal Languages and Compilers Static Analysis the last of the analysis phases of compilation type checking - is an operator applied to an incompatible.
Semantic Analysis and Symbol Tables
Chapter 6 Intermediate Code Generation
Chapter 6 Type Checking. The compiler should report an error if an operator is applied to an incompatible operand. Type checking can be performed without.
Semantic Analysis Chapter 6. Two Flavors  Static (done during compile time) –C –Ada  Dynamic (done during run time) –LISP –Smalltalk  Optimization.
Programming Languages and Paradigms
1 Mooly Sagiv and Greta Yorsh School of Computer Science Tel-Aviv University Modern Compiler Design.
1 Compiler Construction Intermediate Code Generation.
Overview of Previous Lesson(s) Over View  Front end analyzes a source program and creates an intermediate representation from which the back end generates.
Elaboration or: Semantic Analysis Compiler Baojian Hua
Lecture # 20 Type Systems. 2 A type system defines a set of types and rules to assign types to programming language constructs Informal type system rules,
Type Checking Compiler Design Lecture (02/25/98) Computer Science Rensselaer Polytechnic.
Chapter 6 Type Checking Section 0 Overview 1.Static Checking Check that the source program follows both the syntactic and semantic conventions of the source.
CH4.1 CSE244 Type Checking Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road, Unit 1155 Storrs,
Compiler Construction
Lesson 12 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 5 Types Types are the leaven of computer programming;
ML: a quasi-functional language with strong typing Conventional syntax: - val x = 5; (*user input *) val x = 5: int (*system response*) - fun len lis =
Chapter 6 Type Checking Section 0 Overview
Type Checking  Legality checks  Operator determination  Overload resolution.
1 Type Type system for a programming language = –set of types AND – rules that specify how a typed program is allowed to behave Why? –to generate better.
Types Type = Why? a set of values
Static checking and symbol table Chapter 6, Chapter 7.6 and Chapter 8.2 Static checking: check whether the program follows both the syntactic and semantic.
CSc 453 Semantic Analysis Saumya Debray The University of Arizona Tucson.
Type Equivalence Rules Ada –Strict name equivalence except for almost everything Unique array constructors give rise to unique types Subtypes can create.
1 Static Checking and Type Systems Chapter 6 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University,
Lesson 11 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
410/510 1 of 18 Week 5 – Lecture 1 Semantic Analysis Compiler Construction.
CS536 Semantic Analysis Introduction with Emphasis on Name Analysis 1.
Semantic Analysis Semantic Analysis v Lexically and syntactically correct programs may still contain other errors v Lexical and syntax analyses.
1 Static Checking and Type Systems Chapter 6 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2005.
Static Checking and Type Systems Chapter 6. 2 Static versus Dynamic Checking Static checking: the compiler enforces programming language’s static semantics.
1 Syntax-Directed Translation Part I Chapter 5 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Semantic Analysis II Type Checking EECS 483 – Lecture 12 University of Michigan Wednesday, October 18, 2006.
CS412/413 Introduction to Compilers Radu Rugina Lecture 13 : Static Semantics 18 Feb 02.
Data Types (3) 1 Programming Languages – Principles and Practice by Kenneth C Louden.
1 February 17, February 17, 2016February 17, 2016February 17, 2016 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University.
Chapter 8: Semantic Analyzer1 Compiler Designs and Constructions Chapter 8: Semantic Analyzer Objectives: Syntax-Directed Translation Type Checking Dr.
COMP 412, FALL Type Systems C OMP 412 Rice University Houston, Texas Fall 2000 Copyright 2000, Robert Cartwright, all rights reserved. Students.
Semantic Analysis Chapter 6. Two Flavors  Static (done during compile time) –C –Ada  Dynamic (done during run time) –LISP –Smalltalk  Optimization.
Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. L05-1 September 21, 2006http:// Types and Simple Type.
1 Static Checking and Type Systems Chapter 6 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Lecture 9 Symbol Table and Attributed Grammars
Types Type Errors Static and Dynamic Typing Basic Types NonBasic Types
Compiler Design – CSE 504 Type Checking
Semantics Analysis.
Principles of programming languages 12: Functional programming
Semantic Analysis Type Checking
Context-Sensitive Analysis
ML: a quasi-functional language with strong typing
Constructing Precedence Table
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Syntax-Directed Translation Part I
Semantic Analysis Chapter 6.
Syntax-Directed Translation Part I
Syntax-Directed Translation Part I
Chapter 6 Intermediate-Code Generation
Lecture 15 (Notes by P. N. Hilfinger and R. Bodik)
Overview of Compilation The Compiler BACK End
Semantic Analysis Chapter 6.
Syntax-Directed Translation Part I
SYNTAX DIRECTED DEFINITION
Compiler Construction
CS 242 Types John Mitchell.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Syntax-Directed Translation Part I
Compiler Construction
Presentation transcript:

Static Checking and Type Systems Chapter 6 COP5621 Compiler Construction Copyright Robert van Engelen, Florida State University, 2007-2009

The Structure of our Compiler Revisited Syntax-directed static checker Lexical analyzer Character stream Token stream Java bytecode Syntax-directed translator Lex specification Yacc specification JVM specification Type checking Code generation

Static versus Dynamic Checking Static checking: the compiler enforces programming language’s static semantics Program properties that can be checked at compile time Dynamic semantics: checked at run time Compiler generates verification code to enforce programming language’s dynamic semantics

Static Checking Typical examples of static checking are Type checks Flow-of-control checks Uniqueness checks Name-related checks

Type Checks, Overloading, Coercion, and Polymorphism int op(int), op(float); int f(float); int a, c[10], d; d = c+d; // FAIL *d = a; // FAIL a = op(d); // OK: overloading (C++) a = f(d); // OK: coersion of d to float vector<int> v; // OK: template instantiation

Flow-of-Control Checks myfunc() { … break; // ERROR } myfunc() { … switch (a) { case 0: … break; // OK case 1: … } myfunc() { … while (n) { … if (i>10) break; // OK }

Uniqueness Checks myfunc() { int i, j, i; // ERROR … } cnufym(int a, int a) // ERROR { … } struct myrec { int name; }; struct myrec // ERROR { int id;

Name-Related Checks LoopA: for (int I = 0; I < n; I++) { … if (a[I] == 0) break LoopB; // Java labeled loop … }

One-Pass versus Multi-Pass Static Checking One-pass compiler: static checking in C, Pascal, Fortran, and many other languages is performed in one pass while intermediate code is generated Influences design of a language: placement constraints Multi-pass compiler: static checking in Ada, Java, and C# is performed in a separate phase, sometimes by traversing a syntax tree multiple times

Type Expressions Type expressions are used in declarations and type casts to define or refer to a type Primitive types, such as int and bool Type constructors, such as pointer-to, array-of, records and classes, templates, and functions Type names, such as typedefs in C and named types in Pascal, refer to type expressions

Graph Representations for Type Expressions int *f(char*,char*) fun fun args pointer args pointer pointer pointer int pointer int char char char Tree forms DAGs

Cyclic Graph Representations Source program struct Node { int val; struct Node *next; }; struct val next int pointer Internal compiler representation of the Node type: cyclic graph

Name Equivalence Each type name is a distinct type, even when the type expressions the names refer to are the same Types are identical only if names match Used by Pascal (inconsistently) type link = ^node; var next : link; last : link; p : ^node; q, r : ^node; With name equivalence in Pascal: p ≠ next p ≠ last p = q = r next = last

Structural Equivalence of Type Expressions Two types are the same if they are structurally identical Used in C, Java, C# = pointer pointer struct struct val next val next int pointer int

Structural Equivalence of Type Expressions (cont’d) Two structurally equivalent type expressions have the same pointer address when constructing graphs by sharing nodes struct Node { int val; struct Node *next; }; struct Node s, *p; … p = &s; // OK … *p = s; // OK p &s *p s pointer struct val next int

Constructing Type Graphs in Yacc Type *mkint() construct int node if not already constructed Type *mkarr(Type*,int) construct array-of-type node if not already constructed Type *mkptr(Type*) construct pointer-of-type node if not already constructed

Syntax-Directed Definitions for Constructing Type Graphs in Yacc %union { Symbol *sym; int num; Type *typ; } %token INT %token <sym> ID %token <int> NUM %type <typ> type %% decl : type ID { addtype($2, $1); } | type ID ‘[’ NUM ‘]’ { addtype($2, mkarr($1, $4)); } ; type : INT { $$ = mkint(); } | type ‘*’ { $$ = mkptr($1); } | /* empty */ { $$ = mkint(); } ;

Type Systems A type system defines a set of types and rules to assign types to programming language constructs Informal type system rules, for example “if both operands of addition are of type integer, then the result is of type integer” Formal type system rules: Post system

Type Rules in Post System Notation Type judgments e :  where e is an expression and  is a type Environment  maps objects v to types : (v) =  (v) =    v :  (v) =    e :    v := e : void   e1 : integer   e2 : integer   e1 + e2 : integer

Type System Example Environment  is a set of name, type pairs, for example:  = { x,integer, y,integer, z,char, 1,integer, 2,integer } From  and rules we can check types: type checking = theorem proving The proof that x := y + 2 is typed correctly: (y) = integer (2) = integer   y : integer   2 : integer (x) = integer   y + 2 : integer   x := y + 2 : void

A Simple Language Example P  D ; S D  D ; D  id : T T  boolean  char  integer  array [ num ] of T  ^ T S  id := E  if E then S  while E do S  S ; S E  true  false  literal  num  id  E and E  E + E  E [ E ]  E ^ Pointer to T Pascal-like pointer dereference operator

Simple Language Example: Declarations D  id : T { addtype(id.entry, T.type) } T  boolean { T.type := boolean } T  char { T.type := char } T  integer { T.type := integer } T  array [ num ] of T1 { T.type := array(1..num.val, T1.type) } T  ^ T1 { T.type := pointer(T1) Parametric types: type constructor

Simple Language Example: Checking Statements (v) =    e :    v := e : void S  id := E { S.type := if id.type = E.type then void else type_error } Note: the type of id is determined by scope’s environment: id.type = lookup(id.entry)

Simple Language Example: Checking Statements (cont’d)  e : boolean   s :     if e then s :  S  if E then S1 { S.type := if E.type = boolean then S1.type else type_error }

Simple Language Example: Statements (cont’d)  e : boolean   s :     while e do s :  S  while E do S1 { S.type := if E.type = boolean then S1.type else type_error }

Simple Language Example: Checking Statements (cont’d)  s1 : void   s2 : void    s1 ; s2 : void S  S1 ; S2 { S.type := if S1.type = void and S2.type = void then void else type_error }

Simple Language Example: Checking Expressions (v) =    v :  E  true { E.type = boolean } E  false { E.type = boolean } E  literal { E.type = char } E  num { E.type = integer } E  id { E.type = lookup(id.entry) } …

Simple Language Example: Checking Expressions (cont’d)   e1 : integer   e2 : integer   e1 + e2 : integer E  E1 + E2 { E.type := if E1.type = integer and E2.type = integer then integer else type_error }

Simple Language Example: Checking Expressions (cont’d)   e1 : boolean   e2 : boolean   e1 and e2 : boolean E  E1 and E2 { E.type := if E1.type = boolean and E2.type = boolean then boolean else type_error }

Simple Language Example: Checking Expressions (cont’d)   e1 : array(s, )   e2 : integer   e1[e2] :  E  E1 [ E2 ] { E.type := if E1.type = array(s, t) and E2.type = integer then t else type_error }

Simple Language Example: Checking Expressions (cont’d)   e : pointer()   e ^ :  E  E1 ^ { E.type := if E1.type = pointer(t) then t else type_error }

A Simple Language Example: Functions T  T -> T E  E ( E ) Function type declaration Function call Example: v : integer; odd : integer -> boolean; if odd(3) then v := 1;

Simple Language Example: Function Declarations T  T1 -> T2 { T.type := function(T1.type, T2.type) } Parametric type: type constructor

Simple Language Example: Checking Function Invocations   e1 : function(, )   e2 :    e1(e2) :  E  E1 ( E2 ) { E.type := if E1.type = function(s, t) and E2.type = s then t else type_error }

Type Conversion and Coercion Type conversion is explicit, for example using type casts Type coercion is implicitly performed by the compiler to generate code that converts types of values at runtime (typically to narrow or widen a type) Both require a type system to check and infer types from (sub)expressions

Syntax-Directed Definitions for Type Checking in Yacc %{ enum Types {Tint, Tfloat, Tpointer, Tarray, … }; typedef struct Type { enum Types type; struct Type *child; // at most one type parameter } Type; %} %union { Type *typ; } %type <typ> expr %% …

Syntax-Directed Definitions for Type Checking in Yacc (cont’d) … %% expr : expr ‘+’ expr { if ($1->type != Tint || $3->type != Tint) semerror(“non-int operands in +”); $$ = mkint(); emit(iadd); }

Syntax-Directed Definitions for Type Coercion in Yacc … %% expr : expr ‘+’ expr { if ($1->type == Tint && $3->type == Tint) { $$ = mkint(); emit(iadd); } else if ($1->type == Tfloat && $3->type == Tfloat) { $$ = mkfloat(); emit(fadd); } else if ($1->type == Tfloat && $3->type == Tint) { $$ = mkfloat(); emit(i2f); emit(fadd); } else if ($1->type == Tint && $3->type == Tfloat) { $$ = mkfloat(); emit(swap); emit(i2f); emit(fadd); } else semerror(“type error in +”); $$ = mkint(); }

Checking L-Values and R-Values in Yacc %{ typedef struct Node { Type *typ; // type structure int islval; // 1 if L-value } Node; %} %union { Node *rec; } %type <rec> expr %% …

Checking L-Values and R-Values in Yacc expr : expr ‘+’ expr { if ($1->typ->type != Tint || $3->typ->type != Tint) semerror(“non-int operands in +”); $$->typ = mkint(); $$->islval = FALSE; emit(…); } | expr ‘=’ expr { if (!$1->islval || $1->typ != $3->typ) semerror(“invalid assignment”); $$->typ = $1->typ; $$->islval = FALSE; emit(…); } | ID { $$->typ = lookup($1); $$->islval = TRUE; emit(…); }

Type Inference and Polymorphic Functions Many functional languages support polymorphic type systems For example, the list length function in ML: fun length(x) = if null(x) then 0 else length(tl(x)) + 1 length([“sun”, “mon”, “tue”]) + length([10,9,8,7]) returns 7

Type Inference and Polymorphic Functions The type of fun length is: ∀α.list(α) → integer We can infer the type of length from its body: fun length(x) = if null(x) then 0 else length(tl(x)) + 1 where null : ∀α.list(α) → bool tl : ∀α.list(α) → list(α) and the return value is 0 or length(tl(x)) + 1, thus length: ∀α.list(α) → integer

Type Inference and Polymorphic Functions Types of functions f are denoted by α→β and the post-system rule to infer the type of f(x) is: The type of length([“a”, “b”]) is inferred by   e1 : α → β   e2 : α   e1(e2) : β …   length : ∀α.list(α) → integer   [“a”, “b”] : list(string)   length([“a”, “b”]) : integer

Example Type Inference Append concatenates two lists recursively: fun append(x, y) = if null(x) then y else cons(hd(x), append(tl(x), y)) where null : ∀α.list(α) → bool hd : ∀α.list(α) → α tl : ∀α.list(α) → list(α) cons : ∀α.(α × list(α)) → list(α)

Example Type Inference fun append(x, y) = if null(x) then y else cons(hd(x), append(tl(x), y)) The type of append : ∀σ,τ,φ. (σ ×τ) → φ is: type of x : σ = list(α1) from null(x) type of y : τ= φ from append’s return type return type of append : list(α2) from return type of cons and α1 = α2 because   x : list(α1)   x : list(α1)   tl(x) : list(α1)   y : list(α1)   hd(x) : α1   append(tl(x), y) : list(α1)   cons(hd(x), append(tl(x), y)) : list(α2)

Example Type Inference fun append(x, y) = if null(x) then y else cons(hd(x), append(tl(x), y)) The type of append : ∀σ,τ,φ. (σ ×τ) → φ is: σ = list(α) τ= φ = list(α) Hence, append : ∀α.(list(α) × list(α)) → list(α)

Example Type Inference   ([1, 2],[3]) : list(α) × list(α)   append([1, 2], [3]) : τ   append([1, 2], [3]) : list(α) τ = list(α) α = integer   ([1],[“a”]) : list(α) × list(α)   append([1], [“a”]) : τ   append([1], [“a”]) : list(α) Type error

Type Inference: Substitutions, Instances, and Unification The use of a paper-and-pencil post system for type checking/inference involves substitution, instantiation, and unification Similarly, in the type inference algorithm, we substitute type variables by types to create type instances A substitution S is a unifier of two types t1 and t2 if S(t1) = S(t2)

Unification An AST representation of append([], [1, 2]) apply append :∀α.(list(α) × list(α)) → list(α) ( × ) : (σ, τ) [] : list(φ) [ , ] : list(ψ) 1 : integer 2 : integer

Unification An AST representation of append([], [1, 2]) apply append :∀α.(list(α) × list(α)) → list(α) ( × ) : (σ, τ) Unify by the following substitutions: σ = list(φ) = list(ψ) ⇒ φ = ψ [] : list(φ) [ , ] : list(ψ) τ = list(ψ) = list(integer) ⇒ φ = ψ = integer 1 : integer 2 : integer σ = τ = list(α) ⇒ α = integer