Types

Presentation transcript:

Types
A type consists of:
- a set of values
- the operations that are allowed on these values
Why?
- To generate better code, with less runtime overhead
- To avoid runtime errors
- To improve expressiveness (see overloading)

Type
A type system for a programming language consists of:
- a set of types, AND
- rules that specify how a typed program is allowed to behave
Actions
- Type checking: given an operation and an operand of some type, determine whether the operation is allowed on that operand.
- Type inference: given the types of the operands, determine the meaning of the operation and the type of the operation; OR, without variable declarations, infer a variable's type from the way the variable is used.

Issues in typing
- Does the language have a type system? Untyped languages (e.g. assembly) have no type system at all.
- When is typing performed? Static typing: at compile time. Dynamic typing: at runtime.
- How strictly are the rules enforced? Strongly typed: no exceptions. Weakly typed: with well-defined exceptions.
- Type equivalence and subtyping: When are two types equivalent? What does "equivalent" mean anyway? When can one type replace another?

Components of a type system
- Built-in types
- Rules for constructing new types
- Where do we store type information?
- Rules for determining if two types are equivalent
- Rules for inferring the types of expressions

Component: Built-in types
These are the basic types. Usually:
- integer; usual operations: standard arithmetic
- floating point
- character; the character set is generally ordered lexicographically; usual operations: (lexicographic) comparisons
- boolean; usual operations: not, and, or, xor

Component: type constructors
- Arrays: array(I,T) denotes the type of an array with elements of type T and index set I. Multidimensional arrays are just arrays where T is also an array. Operations: element access, array assignment, products.
- Strings: bitstrings, character strings. Operations: concatenation, lexicographic comparison.
- Products: groups of multiple objects of different types; essentially, the Cartesian product of types (useful for functions later).
- Records (structs): groups of multiple objects of different types where the elements are given specific names. How is a recursive type definition handled?

Component: type constructors
- Pointers: addresses. Operations: arithmetic, dereferencing, referencing. Issue: equivalency.
- Function types: a function such as "int add(real, int)" has type real × int → int.
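One way to picture these constructors is as nodes of a small tree of type expressions. The sketch below is only an illustration: `Type`, `make_type`, `array_of`, `product`, `pointer_to`, `function_type` and `show` are hypothetical names, and records are left out to keep it short.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation of type expressions (illustrative names only).
struct Type;
using TypePtr = std::shared_ptr<const Type>;

struct Type {
    enum class Kind { Basic, Array, Product, Pointer, Function };
    Kind kind;
    std::string name;            // Basic only, e.g. "int"
    std::vector<TypePtr> parts;  // component types; meaning depends on kind
};

TypePtr make_type(Type::Kind k, std::string name, std::vector<TypePtr> parts) {
    return std::make_shared<Type>(Type{k, std::move(name), std::move(parts)});
}
TypePtr basic(const std::string& n) { return make_type(Type::Kind::Basic, n, {}); }
// array(I, T): parts = {index type, element type}
TypePtr array_of(TypePtr index, TypePtr elem) {
    return make_type(Type::Kind::Array, "", {index, elem});
}
TypePtr product(std::vector<TypePtr> members) {
    return make_type(Type::Kind::Product, "", std::move(members));
}
TypePtr pointer_to(TypePtr target) { return make_type(Type::Kind::Pointer, "", {target}); }
// S -> T: parts = {argument type, result type}
TypePtr function_type(TypePtr from, TypePtr to) {
    return make_type(Type::Kind::Function, "", {from, to});
}

// Renders a type expression in the notation used on the slides.
std::string show(const TypePtr& t) {
    switch (t->kind) {
    case Type::Kind::Basic:
        return t->name;
    case Type::Kind::Array:
        return "array(" + show(t->parts[0]) + ", " + show(t->parts[1]) + ")";
    case Type::Kind::Pointer:
        return "pointer(" + show(t->parts[0]) + ")";
    case Type::Kind::Function:
        return show(t->parts[0]) + " -> " + show(t->parts[1]);
    case Type::Kind::Product: {
        std::string s;
        for (std::size_t i = 0; i < t->parts.size(); ++i)
            s += (i ? " x " : "") + show(t->parts[i]);
        return s;
    }
    }
    return "";
}
```

With these helpers, the type of "int add(real, int)" could be built as `function_type(product({basic("real"), basic("int")}), basic("int"))`.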

Component: type equivalence
- Name equivalence: two types are equivalent only when they have the same name. Why? Loose vs. strict.
- Structural equivalence: two types are equivalent when they have the same structure.
  Should records require member name identity to be equivalent?
  Does the order of record fields matter?
  How about Array(int, 1..10) and Array(int, 1..20)? Should we consider the bounds or just the element type?
  How about recursively defined types?
- Example: C uses name equivalence for structs and structural equivalence for arrays/pointers.
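The two notions can be contrasted with a short sketch. It makes several assumptions that the questions above leave open: record field names and array bounds are ignored, field order matters, and types are not recursive (a cyclic type would make the recursion loop). All names are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Type;
using TypePtr = std::shared_ptr<Type>;

struct Type {
    enum class Kind { Basic, Array, Pointer, Record } kind;
    std::string name;            // declared name; empty for anonymous types
    std::vector<TypePtr> parts;  // element, target, or field types
};

TypePtr make(Type::Kind k, std::string name, std::vector<TypePtr> parts = {}) {
    return std::make_shared<Type>(Type{k, std::move(name), std::move(parts)});
}

// Name equivalence: same declared name, nothing else is inspected.
bool name_equivalent(const Type& a, const Type& b) {
    return !a.name.empty() && a.name == b.name;
}

// Structural equivalence: same constructor applied to equivalent parts.
// Assumes acyclic types; field names are deliberately not compared.
bool structurally_equivalent(const Type& a, const Type& b) {
    if (a.kind != b.kind) return false;
    if (a.kind == Type::Kind::Basic) return a.name == b.name;
    if (a.parts.size() != b.parts.size()) return false;
    for (std::size_t i = 0; i < a.parts.size(); ++i)
        if (!structurally_equivalent(*a.parts[i], *b.parts[i])) return false;
    return true;
}
```

Two records named Celsius and Fahrenheit that each wrap one real are structurally equivalent here but not name equivalent, which is one reason a language may prefer name equivalence for records.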

Component: type equivalence
Type coercion
- If x is float, is x=3 acceptable? Options: disallow; allow and implicitly convert 3 to float; "allow" but require the programmer to explicitly convert 3 to float.
- How do we convert? Reinterpret the bit sequence, or build a new object.
- What should be allowed? float to int? int to float? What if both types are equally general?
- What if multiple coercions are possible? Consider 3 + "4" in a language that can convert strings to integers.
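The difference between the two conversion strategies can be seen in a small sketch. It assumes a 32-bit IEEE 754 `float`, which holds on most current platforms but is not guaranteed by the language; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// "Build new object": a value-preserving conversion, 3 becomes 3.0f.
float convert(std::int32_t i) {
    return static_cast<float>(i);
}

// "Reinterpret bit sequence": the same 32 bits are read back as a float.
// Assumes float is 32 bits wide (checked) and IEEE 754 encoded (not checked).
float reinterpret(std::int32_t i) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "sketch assumes 32-bit float");
    float f;
    std::memcpy(&f, &i, sizeof f);
    return f;
}
```

Under that assumption, `convert(3)` yields 3.0f, while `reinterpret(3)` yields a tiny denormal number, which is why a coercion from int to float normally builds a new value.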

Overloading
Same operation name but different effect on different types.
For an overloaded function f:
- resolve the arguments
- look up f in the symbol table to get a list of all visible "versions"
- ignore those with the wrong parameter types
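The lookup-and-filter step might look like the sketch below. It is a simplification: types are plain strings, only exact matches count (no coercions), and `Signature`, `SymbolTable` and `resolve` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// One visible "version" of an overloaded function.
struct Signature {
    std::vector<std::string> params;  // parameter types, e.g. {"int", "int"}
    std::string result;               // result type
};

// Hypothetical symbol table: one name may map to several signatures.
using SymbolTable = std::multimap<std::string, Signature>;

// Returns the versions of f whose parameter types match the (already
// resolved) argument types exactly. Zero matches means a type error;
// more than one would mean an ambiguous call.
std::vector<Signature> resolve(const SymbolTable& table, const std::string& f,
                               const std::vector<std::string>& arg_types) {
    std::vector<Signature> matches;
    auto range = table.equal_range(f);
    for (auto it = range.first; it != range.second; ++it)
        if (it->second.params == arg_types) matches.push_back(it->second);
    return matches;
}
```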

A simple type checker
We can use an attribute grammar to implement a simple type checker.
E → num           {E.type = integer}
E → E1 + E2       {E.type = (E1.type == integer && E2.type == integer) ? integer : error}
E → E1[E2]        {E.type = (E1.type == array(int, T) && E2.type == integer) ? T : error}
E → *E1           {E.type = (E1.type == pointer(T)) ? T : error}
E → E1(E2)        {E.type = (E1.type == (S → T) && E2.type == S) ? T : error}
S → if E then S1  {S.type = (E.type == boolean) ? void : error}
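A recursive function over an expression tree can play the role of these attribute rules. The sketch below covers only the num, +, indexing and dereference rules; the tree layout and all names are assumptions, and a `Var` node carries its type directly instead of consulting a symbol table.

```cpp
#include <memory>

struct Type {
    enum Kind { Error, Integer, Array, Pointer } kind;
    std::shared_ptr<Type> elem;  // element type (Array) or target type (Pointer)
};

struct Expr {
    enum Kind { Num, Var, Add, Index, Deref } kind;
    Type var_type;                 // Var only: stands in for a symbol-table lookup
    std::shared_ptr<Expr> e1, e2;  // operands, null when unused
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr num() { return std::make_shared<Expr>(Expr{Expr::Num, Type{}, nullptr, nullptr}); }
ExprPtr var(Type t) { return std::make_shared<Expr>(Expr{Expr::Var, t, nullptr, nullptr}); }
ExprPtr node(Expr::Kind k, ExprPtr a, ExprPtr b = nullptr) {
    return std::make_shared<Expr>(Expr{k, Type{}, a, b});
}

// Computes E.type bottom-up, one case per grammar rule.
Type type_of(const Expr& e) {
    const Type error{Type::Error, nullptr};
    const Type integer{Type::Integer, nullptr};
    switch (e.kind) {
    case Expr::Num:  // E -> num
        return integer;
    case Expr::Var:
        return e.var_type;
    case Expr::Add: {  // E -> E1 + E2
        Type a = type_of(*e.e1), b = type_of(*e.e2);
        return (a.kind == Type::Integer && b.kind == Type::Integer) ? integer : error;
    }
    case Expr::Index: {  // E -> E1[E2]
        Type a = type_of(*e.e1), i = type_of(*e.e2);
        return (a.kind == Type::Array && i.kind == Type::Integer) ? *a.elem : error;
    }
    case Expr::Deref: {  // E -> *E1
        Type p = type_of(*e.e1);
        return (p.kind == Type::Pointer) ? *p.elem : error;
    }
    }
    return error;
}
```

Because an error type never equals integer, array or pointer, an error in a subexpression propagates to every enclosing expression.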

Inference rules
The formal notation for type checking/inference is rules of inference. They have the form:

\[ \frac{\text{Environment} \vdash \text{Hypotheses}}{\text{Environment} \vdash \text{Conclusions}} \]

If we can prove that the hypotheses are all true in the existing environment, then the conclusion is also true. For example:

\[ \frac{A \vdash E_1 : \text{int} \qquad A \vdash E_2 : \text{int}}{A \vdash E_1 + E_2 : \text{int}} \]

In English: if expressions E1 and E2 both have type int, then E1+E2 also has type int.

Inference rules
The inference rules give us templates that describe how to type various expressions. We can use the templates to prove that an expression has a valid type (and find what that type is). The construction of the proof parallels the parse tree.
Short example: type the expression x+a[i] under the assumption that x is a char, a is an array of chars, and i is an int. The environment is A = {x:char, a:array(int, char), i:int}.

\[ \frac{A \vdash E_1 : \text{array}(T, S) \qquad A \vdash E_2 : T}{A \vdash E_1[E_2] : S}\ [\text{ARRAY}] \qquad \frac{A \vdash E_1 : \text{char} \qquad A \vdash E_2 : \text{char}}{A \vdash E_1 + E_2 : \text{char}}\ [\text{ADD}] \]

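Inference rules
The derivation for x+a[i]:

\[ \frac{A \vdash x : \text{char} \qquad \dfrac{A \vdash a : \text{array}(\text{int}, \text{char}) \qquad A \vdash i : \text{int}}{A \vdash a[i] : \text{char}}\ [\text{ARRAY}]}{A \vdash x + a[i] : \text{char}}\ [\text{ADD}] \]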

Types
Where do we store type information? In the symbol table.
In general, the symbol table contains information about:
- variables
- functions
- class names
- type names
- temporary variables
- etc.
Do we need a separate symbol table for types?

Symbol Tables
What kind of information is usually stored in a symbol table?
- type
- storage class
- size
- scope
- stack frame offset
- register

Symbol Tables
How is a symbol table implemented?
- array: simple, but linear LookUp time. However, we may use a sorted array for reserved words, since they are generally few and known in advance.
- tree: O(lg n) LookUp time if kept balanced
- hash table: the most common implementation; expected O(1) LookUp time
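As a rough illustration of the hash-table option, the sketch below wraps `std::unordered_map` from the C++ standard library. The `SymbolInfo` fields follow the kinds of information listed for symbol tables, but the exact layout and the names `SymbolInfo` and `SymbolTable` are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical per-identifier record.
struct SymbolInfo {
    std::string type;           // e.g. "int"
    std::string storage_class;  // e.g. "auto", "static"
    std::size_t size;           // size in bytes
    int scope_level;            // nesting depth of the declaring scope
    int frame_offset;           // offset within the stack frame
};

class SymbolTable {
public:
    // Returns false if the name is already present.
    bool insert(const std::string& name, const SymbolInfo& info) {
        return table_.emplace(name, info).second;
    }
    // Returns a pointer to the entry, or nullptr if the name is unknown.
    const SymbolInfo* lookup(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, SymbolInfo> table_;  // expected O(1) lookup
};
```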

Revisiting hash tables (cs311)
- Use an array of size m to store elements.
- Given key k (the identifier name), use a function h to compute index h(k) for that key.
- Collisions are possible: two keys may hash into the same slot.
Hash functions
A good hash function
- is easy to compute
- avoids collisions (by breaking up patterns in the keys and uniformly distributing the hash values)

Revisiting hash tables (cs311)
Hash functions
A common hash function is

\[ h(k) = \lfloor m \, (k c - \lfloor k c \rfloor) \rfloor \]

for some constant 0 < c < 1. In English:
- Multiply the key k by the constant c.
- Take the fractional part of k*c.
- Multiply that by the table size m.
- Take the floor of the result.
A good value for c:
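The four steps translate almost directly into code. This is a minimal sketch using double-precision arithmetic, so it is only reliable for keys small enough that k*c keeps a meaningful fractional part. The default constant (√5 − 1)/2 ≈ 0.618 is a commonly cited choice and is an assumption of this sketch, as is the function name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Multiplication-method hash: floor(m * frac(k * c)) with 0 < c < 1.
std::size_t hash_multiplicative(std::uint64_t k, std::size_t m,
                                double c = 0.6180339887498949) {
    assert(m > 0 && c > 0.0 && c < 1.0);
    double kc = static_cast<double>(k) * c;  // multiply the key by c
    double frac = kc - std::floor(kc);       // take the fractional part
    return static_cast<std::size_t>(
        std::floor(static_cast<double>(m) * frac));  // scale by m and floor
}
```

For instance, with c = 0.5 and m = 10, the key 3 gives k*c = 1.5, fractional part 0.5, and slot 5.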

Revisiting hash tables (cs311)
Different elements may still hash into the same slot. How do we resolve collisions?
Chaining
- Put all the elements that collide in a chain (list) attached to the slot.
- Insert/Delete/LookUp take expected O(1) time.
- However, this assumes that the chains are kept small. If the chains start becoming too long, the table must be enlarged and all the keys rehashed.
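A compact sketch of chaining follows. It assumes string keys with int values, uses `std::hash` as the hash function, and doubles the table when the number of elements reaches the number of slots; the growth policy and the class name `ChainedTable` are illustrative choices, not something the slides prescribe.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

class ChainedTable {
public:
    explicit ChainedTable(std::size_t m = 8) : slots_(m) { assert(m > 0); }

    // Inserts key or overwrites its value if already present.
    void insert(const std::string& key, int value) {
        for (auto& kv : chain(key))
            if (kv.first == key) { kv.second = value; return; }
        if (count_ == slots_.size()) rehash(2 * slots_.size());  // keep chains short
        chain(key).emplace_back(key, value);
        ++count_;
    }
    // Returns a pointer to the stored value, or nullptr if absent.
    const int* lookup(const std::string& key) const {
        for (const auto& kv : slots_[index(key, slots_.size())])
            if (kv.first == key) return &kv.second;
        return nullptr;
    }
    bool erase(const std::string& key) {
        auto& c = chain(key);
        for (auto it = c.begin(); it != c.end(); ++it)
            if (it->first == key) { c.erase(it); --count_; return true; }
        return false;
    }
    std::size_t slot_count() const { return slots_.size(); }

private:
    using Chain = std::list<std::pair<std::string, int>>;
    static std::size_t index(const std::string& k, std::size_t m) {
        return std::hash<std::string>{}(k) % m;
    }
    Chain& chain(const std::string& k) { return slots_[index(k, slots_.size())]; }
    // Enlarges the table to m slots and redistributes every element.
    void rehash(std::size_t m) {
        std::vector<Chain> old(m);
        old.swap(slots_);  // slots_ now holds m empty chains
        for (auto& c : old)
            for (auto& kv : c) chain(kv.first).push_back(std::move(kv));
    }
    std::vector<Chain> slots_;
    std::size_t count_ = 0;
};
```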

Revisiting hash tables (cs311)
Different elements may still hash into the same slot. How do we resolve collisions?
Open addressing
- Store all elements within the table.
- The space we save from the chain pointers is used up to make the array larger.
- If there is a collision, probe the table in a systematic way to find an empty slot.
- If the table fills up, we need to enlarge it and rehash all the keys.
Open addressing with linear probing
- Probe the slots in a linear manner.
- Simple, but bad: it results in clustering (long sequences of used slots build up very fast).
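Linear probing can be sketched as follows, assuming non-negative integer keys, h(k) = k mod m, and no deletion or automatic growth. The names `LinearProbingTable` and `kEmpty` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kEmpty = -1;  // marks an unused slot (keys must be non-negative)

class LinearProbingTable {
public:
    explicit LinearProbingTable(std::size_t m) : slots_(m, kEmpty) { assert(m > 0); }

    // Inserts key and returns the slot it occupies.
    // Returns the table size if every slot is taken (caller must enlarge and rehash).
    std::size_t insert(int key) {
        assert(key >= 0);
        const std::size_t m = slots_.size();
        for (std::size_t i = 0; i < m; ++i) {
            std::size_t s = (static_cast<std::size_t>(key) + i) % m;  // h(k) + i
            if (slots_[s] == kEmpty || slots_[s] == key) {
                slots_[s] = key;
                return s;
            }
        }
        return m;
    }
    // Follows the same probe sequence; an empty slot ends the search.
    bool contains(int key) const {
        assert(key >= 0);
        const std::size_t m = slots_.size();
        for (std::size_t i = 0; i < m; ++i) {
            std::size_t s = (static_cast<std::size_t>(key) + i) % m;
            if (slots_[s] == kEmpty) return false;
            if (slots_[s] == key) return true;
        }
        return false;
    }

private:
    std::vector<int> slots_;
};
```

With m = 11, the keys 1, 12 and 23 all hash to slot 1 and end up in slots 1, 2 and 3; a later key 2 then lands in slot 4 even though it did not collide with any key on its own hash value, which is the clustering effect.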

Revisiting hash tables (cs311)
Open addressing with double hashing
- Use a second hash function. The probe sequence is (h(k) + i*h2(k)) mod m, with i = 0, 1, 2, ...
- Good performance: since we use a second function, keys that originally collide will subsequently have different probe sequences.
- No clustering.
- A good choice for h2(k) is p - (k mod p), where p is a prime less than m.
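The probe sequence can be written out directly. In this sketch the primary hash h(k) = k mod m is an assumption (the slides leave h unspecified), h2 follows the suggested p - (k mod p), and `probe_sequence` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the first m slots probed for key k under double hashing.
// Assumes h(k) = k mod m and h2(k) = p - (k mod p), with 0 < p < m.
std::vector<std::size_t> probe_sequence(std::size_t k, std::size_t m, std::size_t p) {
    assert(m > 0 && p > 0 && p < m);
    const std::size_t h1 = k % m;
    const std::size_t h2 = p - (k % p);  // always in 1..p, so the step is never 0
    std::vector<std::size_t> seq;
    for (std::size_t i = 0; i < m; ++i) seq.push_back((h1 + i * h2) % m);
    return seq;
}
```

With m = 11 and p = 7, the keys 3 and 14 both start at slot 3, but key 3 steps by 4 (slots 3, 7, 0, ...) while key 14 steps by 7 (slots 3, 10, 6, ...), so they separate after the first collision.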

Scope issues
- Declaration before use promotes one-pass compiling.
- Block-structured languages allow nested name scopes.
- Usual visibility rules: only names created in the current or enclosing scopes are visible. When there is a conflict, the innermost declaration takes precedence.
- What about "same-level" declarations?

Scope issues
- One idea is to have a global symbol table and save the scope information for each entry. When an identifier goes out of scope, scan the table and remove the corresponding entries. We may even link all same-scope entries together for easier removal.
- Careful: deleting from a hash table that uses open addressing is tricky. We must mark a slot as Deleted, rather than Empty; otherwise later LookUp operations may fail.
- Alternatively, we can maintain a separate, local table for each scope.
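The "separate table per scope" alternative is often organized as a stack of tables. The sketch below stores only a type string per name and treats a second declaration in the same scope as an error, which is one possible answer to the "same-level" question; `ScopedSymbolTable` and its methods are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

class ScopedSymbolTable {
public:
    ScopedSymbolTable() { enter_scope(); }  // the global scope

    void enter_scope() { scopes_.emplace_back(); }
    void exit_scope() {
        assert(scopes_.size() > 1);  // the global scope is never popped
        scopes_.pop_back();          // drops every entry of the scope at once
    }
    // Declares name in the current scope; false if it is already declared there.
    bool declare(const std::string& name, const std::string& type) {
        return scopes_.back().emplace(name, type).second;
    }
    // Searches from the innermost scope outward, so the innermost declaration wins.
    const std::string* lookup(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return &found->second;
        }
        return nullptr;  // not visible
    }

private:
    std::vector<std::unordered_map<std::string, std::string>> scopes_;
};
```

Leaving a scope is a single `pop_back`, which sidesteps the deletion problem of a single open-addressing table.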

Structure tables
Where should we store struct field names?
- Separate mini symbol table for each struct: conceptually easy.
- Separate table for all struct field names: we need to somehow uniquely map each name to its structure (e.g. by concatenating the field name with the struct name).
- No special storage: struct field names are stored in the regular symbol table. Again, we need to be able to map each name to its structure.

Static vs. Dynamic scope
What we've seen so far is static scope: scoping follows the structure of the program. An alternative is dynamic scope, where scoping follows the execution path.
Example:

    #include <iostream>
    using namespace std;

    int i = 1;                           // global i
    void func() { cout << i << endl; }   // statically, this i is the global one
    int main() {
        int i = 2;                       // local i, not visible inside func
        func();
        return 0;
    }

If C++ used dynamic scoping, this would print 2, not 1.
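C++ itself is statically scoped, but dynamic scoping can be imitated with an explicit stack of bindings per name. The sketch below is only a simulation under that assumption: `bindings`, `Binding`, `lookup` and `run_main` are hypothetical names, and `run_main` stands in for `main` from the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// For each name, the stack of values bound along the current execution path.
std::map<std::string, std::vector<int>> bindings;

// Pushes a binding on construction and pops it when the scope is left.
struct Binding {
    std::string name;
    Binding(std::string n, int v) : name(std::move(n)) { bindings[name].push_back(v); }
    ~Binding() { bindings[name].pop_back(); }
};

// Dynamic lookup: the most recent binding that is still active.
int lookup(const std::string& name) {
    assert(!bindings[name].empty());
    return bindings[name].back();
}

Binding global_i("i", 1);  // plays the role of the global "int i = 1;"

int func() { return lookup("i"); }  // sees whichever i was bound last

int run_main() {
    Binding i("i", 2);  // plays the role of main's local "int i = 2;"
    return func();
}
```

Here `run_main()` returns 2 because the local binding is still active when `func` runs, while calling `func()` on its own returns 1.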