CMPE 152: Compiler Design March 19 Class Meeting

Slides:



Advertisements
Similar presentations
Lexical Analysis (4.2) Programming Languages Hiram College Ellen Walker.
Advertisements

CS 153: Concepts of Compiler Design August 25 Class Meeting Department of Computer Science San Jose State University Fall 2014 Instructor: Ron Mak
1 Scanning Aaron Bloomfield CS 415 Fall Parsing & Scanning In real compilers the recognizer is split into two phases –Scanner: translate input.
CS 153: Concepts of Compiler Design August 24 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CS 153: Concepts of Compiler Design September 2 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
CS 153: Concepts of Compiler Design August 26 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CS 153: Concepts of Compiler Design September 16 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CS 153: Concepts of Compiler Design October 7 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CS 153: Concepts of Compiler Design October 10 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CSc 453 Lexical Analysis (Scanning)
CS 153: Concepts of Compiler Design September 30 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CS 154 Formal Languages and Computability February 4 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS 153: Concepts of Compiler Design September 23 Class Meeting Department of Computer Science San Jose State University Fall 2015 Instructor: Ron Mak
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
Department of Software & Media Technology
Intro to compilers Based on end of Ch. 1 and start of Ch. 2 of textbook, plus a few additional references.
CMPE Data Structures and Algorithms in C++ September 14 Class Meeting
Chapter 2 :: Programming Language Syntax
CS 153: Concepts of Compiler Design August 29 Class Meeting
Chapter 2 Scanning – Part 1 June 10, 2018 Prof. Abdelaziz Khamis.
CSc 453 Lexical Analysis (Scanning)
CS 153: Concepts of Compiler Design October 17 Class Meeting
CS 153: Concepts of Compiler Design October 5 Class Meeting
CSc 453 Lexical Analysis (Scanning)
Objectives Identify the built-in data types in C++
Finite-State Machines (FSMs)
CS 153: Concepts of Compiler Design September 7 Class Meeting
CS 153: Concepts of Compiler Design December 5 Class Meeting
CS 153: Concepts of Compiler Design November 30 Class Meeting
CMPE 152: Compiler Design December 5 Class Meeting
CMPE Data Structures and Algorithms in C++ February 22 Class Meeting
CMPE 152: Compiler Design February 6 Class Meeting
CMPE 152: Compiler Design September 25 Class Meeting
CMPE 152: Compiler Design August 30 Class Meeting
CMPE 152: Compiler Design September 18 Class Meeting
CSE 3302 Programming Languages
CMPE 152: Compiler Design September 13 Class Meeting
CMPE 152: Compiler Design October 4 Class Meeting
CS 3304 Comparative Languages
CMPE 152: Compiler Design August 23 Class Meeting
CMPE 152: Compiler Design August 21/23 Lab
CMPE 152: Compiler Design September 20 Class Meeting
CS 3304 Comparative Languages
CMPE 152: Compiler Design December 6 Class Meeting
Chapter 2 :: Programming Language Syntax
Lexical Analysis - An Introduction
CMPE 152: Compiler Design February 28 Class Meeting
CMPE 152: Compiler Design March 7 Class Meeting
CMPE 152: Compiler Design January 29 Class Meeting
Chapter 2 :: Programming Language Syntax
CMPE 152: Compiler Design February 7 Class Meeting
CMPE 152: Compiler Design March 21 Class Meeting
CMPE 152: Compiler Design February 28 / March 5 Lab
CS 144 Advanced C++ Programming January 31 Class Meeting
CMPE 152: Compiler Design February 21 Class Meeting
CMPE 152: Compiler Design March 7 Class Meeting
CMPE 152: Compiler Design April 18 – 30 Labs
CMPE 152: Compiler Design March 5 Class Meeting
CMPE 152: Compiler Design December 4 Class Meeting
CMPE 152: Compiler Design March 28 Class Meeting
CMPE 152: Compiler Design March 7/12 Lab
CSc 453 Lexical Analysis (Scanning)
CMPE 152: Compiler Design September 3 Class Meeting
CMPE 152: Compiler Design August 27 Class Meeting
CMPE 152: Compiler Design September 17 Class Meeting
CMPE 152: Compiler Design February 7 Class Meeting
CMPE 152: Compiler Design September 19 Class Meeting
Presentation transcript:

CMPE 152: Compiler Design March 19 Class Meeting Department of Computer Engineering San Jose State University Spring 2019 Instructor: Ron Mak www.cs.sjsu.edu/~mak

Midterm Solution: Question #51 Suppose the hour data type represents the hours of a 24-hour clock, where the hours are numbered 0, 1, 2, ... 23. Describe in words the steps necessary in class Predefined to implement the new hour data type for Pascal.

Midterm Solution: Question #51, cont’d [5 points] Enter the identifier “hour” into the global symbol table (which during compiler initialization is also the local symbol table) and set its definition to a type name. hour_id = symtab_stack->enter_local("hour"); hour_id->set_definition((Definition) DF_TYPE); [5 points] Create the type specification object for the hour type as a subrange type. Set its identifier to the “hour” identifier. hour_type = TypeFactory::create_type((TypeForm) TF_SUBRANGE); hour_type->set_identifier(hour_id); [5 points] Set the subrange attributes of the hour type. It’s a subrange of integer whose minimum value is 0 and the maximum value is 23. hour_type->set_attribute((TypeKey) SUBRANGE_BASE_TYPE, integer_type); hour_type->set_attribute((TypeKey) SUBRANGE_MIN_VALUE, 0); hour_type->set_attribute((TypeKey) SUBRANGE_MAX_VALUE, 23); [5 points] Set the datatype of the “hour” identifier. hour_id->set_typespec(hour_type);

Midterm Solution: Question #52 Suppose a program has many procedure and function calls. Briefly explain how having a runtime display can improve overall execution performance of the program. Briefly explain how having a runtime display can worsen overall execution performance.

Midterm Solution: Question #51, cont’d [5 points] The runtime display speeds access to variables during the execution of a procedure or function, especially to nonlocal variables of deeply nested procedures and functions. Overall execution performance improves if the program spends a large percentage of its run time inside procedures and functions. [5 points] However, there is overhead incurred during calls and returns to update the runtime display. Performance degrades if there are many calls and returns but only a small percentage of the program’s run time is spent inside the procedures and functions.

Minimum Acceptable Compiler Project At least two data types with type checking. Basic arithmetic operations with operator precedence. Assignment statements. At least one conditional control statement (e.g., IF). At least one looping control statement. Procedures or functions with calls and returns. Parameters passed by value or by reference. Basic error recovery (skip to semicolon or end of line). “Nontrivial” sample programs written in the source language. Generate Jasmin code that can be assembled. Execute the resulting .class file standalone (preferred) or with a test harness. No crashes (e.g., null pointer exceptions). 70 points/100

Ideas for Programming Languages A language that works with a database such as MySQL Combines Pascal and SQL for writing database applications. Compiled code hides JDBC calls from the programmer. Not PL/SQL – use the language to write client programs. A language that can access web pages Statements that “scrape” pages to extract information. A language for generating business reports A Pascal-like language with features that make it easy to generate reports. A string-processing language Combines Pascal and Perl for writing applications that involve pattern matching and string transformations. DSL = Domain-Specific Language

Can We Build a Better Scanner? Our scanner in the front end is relatively easy to understand and follow. Separate scanner classes for each token type. However, it’s big and slow. Creates lots of objects and makes lots of method calls. We can write a more compact and faster scanner. However, it may be harder to understand and follow.

Deterministic Finite Automata (DFA) Pascal identifier Regular expression: <letter> ( <letter> | <digit> )* Implement the regular expression with a finite automaton (AKA finite state machine): 1 2 3 letter digit [other]

Deterministic Finite Automata (DFA) This automaton is a deterministic finite automaton (DFA). At each state, the next input character uniquely determines which transition to take to the next state. accepting state 1 2 3 letter digit [other] start state transition

State-Transition Matrix 1 2 3 letter digit [other] Represent the behavior of a DFA by a state-transition matrix:

DFA for a Pascal Number 6 9 10 4 7 11 digit + - E . 5 8 12 [other] 3 Note that this diagram allows only an upper-case E for an exponent. What changes are required to also allow a lower-case e?

DFA for a Pascal Identifier or Number const int SimpleDFAScanner::matrix[13][7] = {     /*        letter digit   +    -    .    E other */     /*  0 */ {   1,    4,    3,   3, ERR,   1, ERR },     /*  1 */ {   1,    1,   -2,  -2,  -2,   1,  -2 },     /*  2 */ { ERR,  ERR,  ERR, ERR, ERR, ERR, ERR },     /*  3 */ { ERR,    4,  ERR, ERR, ERR, ERR, ERR },     /*  4 */ {  -5,    4,   -5,  -5,   6,   9,  -5 },     /*  5 */ { ERR,  ERR,  ERR, ERR, ERR, ERR, ERR },     /*  6 */ { ERR,    7,  ERR, ERR, ERR, ERR, ERR },     /*  7 */ {  -8,    7,   -8,  -8,  -8,   9,  -8 },     /*  8 */ { ERR,  ERR,  ERR, ERR, ERR, ERR, ERR },     /*  9 */ { ERR,   11,   10,  10, ERR, ERR, ERR },     /* 10 */ { ERR,   11,  ERR, ERR, ERR, ERR, ERR },     /* 11 */ { -12,   11,  -12, -12, -12, -12, -12 },     /* 12 */ { ERR,  ERR,  ERR, ERR, ERR, ERR, ERR }, }; Negative numbers in the matrix are the accepting states. digit 1 2 letter [other] 6 9 10 4 7 11 digit + - E . 5 8 12 [other] 3 Notice how the letter E is handled!

A Simple DFA Scanner class SimpleDFAScanner { public:     SimpleDFAScanner(string source_path);     virtual ~SimpleDFAScanner();     /**      * Scan the source file.      */     void scan() throw(string); private:     // Input characters.     static const int LETTER = 0;     static const int DIGIT  = 1;     static const int PLUS   = 2;     static const int MINUS  = 3;     static const int DOT    = 4;     static const int E      = 5;     static const int OTHER  = 6;     // Error state.     static const int ERR = -99999;

A Simple DFA Scanner, cont’d     // State-transition matrix (acceptance states < 0)     static const int matrix[13][7];     char ch;    // current input character     int state;  // current state     ifstream reader;     string line;     int line_number;     int line_pos;

A Simple DFA Scanner, cont’d int SimpleDFAScanner::type_of(char ch) {     return   (ch == 'E')  ? E            : isalpha(ch)  ? LETTER            : isdigit(ch)  ? DIGIT            : (ch == '+')  ? PLUS            : (ch == '-')  ? MINUS            : (ch == '.')  ? DOT            :                OTHER; }

A Simple DFA Scanner, cont’d string SimpleDFAScanner::next_token() throw(string) {     // Skip blanks.     while (isspace(ch)) next_char();     // At EOF?     if (reader.fail()) return "";     state = 0;  // start state     string buffer;     // Loop to do state transitions.     while (state >= 0)  // not acceptance state     {         state = matrix[state][type_of(ch)];  // transition         if ((state >= 0) || (state == ERR))         {             buffer += ch;  // build token string             next_char();         }     }     return buffer; } This is the heart of the scanner. Table-driven scanners can be very fast!

A Simple DFA Scanner, cont’d void SimpleDFAScanner::scan() throw(string) {     next_char();     while (ch != 0)  // EOF?     {         string token = next_token();         if (token != "")         {             cout << "=====> \"" << token << "\" ";             string token_type =                   (state ==  -2) ? "IDENTIFIER"                 : (state ==  -5) ? "INTEGER"                 : (state ==  -8) ? "REAL (fraction only)"                 : (state == -12) ? "REAL"                 :                  "*** ERROR ***";             cout << token_type << endl;         }     } } How do we know which token we just got? Demo