Elements of Computing Systems, Nisan & Schocken, MIT Press, www.nand2tetris.org, Chapter 10: Compiler I: Syntax Analysis slide 1www.nand2tetris.org Building.

Slides:



Advertisements
Similar presentations
1 Pass Compiler 1. 1.Introduction 1.1 Types of compilers 2.Stages of 1 Pass Compiler 2.1 Lexical analysis 2.2. syntactical analyzer 2.3. Code generation.
Advertisements

Winter 2007SEG2101 Chapter 81 Chapter 8 Lexical Analysis.
176 Formal Languages and Applications: We know that Pascal programming language is defined in terms of a CFG. All the other programming languages are context-free.
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
A brief [f]lex tutorial Saumya Debray The University of Arizona Tucson, AZ
Parser construction tools: YACC
Elements of Computing Systems, Nisan & Schocken, MIT Press, 2005, Introduction: Hello, World Below slide 1www.idc.ac.il/tecs Introduction:
Elements of Computing Systems, Nisan & Schocken, MIT Press, 2005, Chapter 10: Compiler I: Syntax Analysis slide 1www.idc.ac.il/tecs.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 1www.idc.ac.il/tecs Compiler.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 6: Assembler slide 1www.idc.ac.il/tecs Assembler Elements of Computing.
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Compilers: lex/3 1 Compiler Structures Objectives – –describe lex – –give many examples of lex's use , Semester 1, Lex.
Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Machine-independent code improvement Target code generation Machine-specific.
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
Compiler Construction1 COMP Compiler Construction Lecturer: Dr. Arthur Cater Teaching Assistant:
Compiler course 1. Introduction. Outline Scope of the course Disciplines involved in it Abstract view for a compiler Front-end and back-end tasks Modules.
Building a Modern Computer From First Principles
Elements of Computing Systems, Nisan & Schocken, MIT Press, 2005, Chapter 11: Compiler II: Code Generation slide 1www.idc.ac.il/tecs.
Lecture 2: Lexical Analysis
COMP 3438 – Part II - Lecture 2: Lexical Analysis (I) Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ. 1.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 1: Compiler II: Code Generation slide 1www.nand2tetris.org Building.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 11: Compiler II: Code Generation slide 1www.nand2tetris.org Building.
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 3, 09/11/2003 Prof. Roy Levow.
Scanning & FLEX CPSC 388 Ellen Walker Hiram College.
Flex: A fast Lexical Analyzer Generator CSE470: Spring 2000 Updated by Prasad.
Compiler Tools Lex/Yacc – Flex & Bison. Compiler Front End (from Engineering a Compiler) Scanner (Lexical Analyzer) Maps stream of characters into words.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 1www.nand2tetris.org Building.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 1www.nand2tetris.org Building.
Review: Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Intermediate code generator Code optimizer Code generator Symbol.
By Neng-Fa Zhou Lexical Analysis 4 Why separate lexical and syntax analyses? –simpler design –efficiency –portability.
CPS 506 Comparative Programming Languages Syntax Specification.
Introduction to Lex Ying-Hung Jiang
Chapter 1 Introduction. Chapter 1 - Introduction 2 The Goal of Chapter 1 Introduce different forms of language translators Give a high level overview.
1. 2 Preface In the time since the 1986 edition of this book, the world of compiler design has changed significantly 3.
CS 460/660 Compiler Construction. Class 01 2 Why Study Compilers? Compilers are important – –Responsible for many aspects of system performance Compilers.
Introduction Lecture 1 Wed, Jan 12, The Stages of Compilation Lexical analysis. Syntactic analysis. Semantic analysis. Intermediate code generation.
1 Using Lex. 2 Introduction When you write a lex specification, you create a set of patterns which lex matches against the input. Each time one of the.
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
IN LINE FUNCTION AND MACRO Macro is processed at precompilation time. An Inline function is processed at compilation time. Example : let us consider this.
1 Using Lex. Flex – Lexical Analyzer Generator A language for specifying lexical analyzers Flex compilerlex.yy.clang.l C compiler -lfl a.outlex.yy.c a.outtokenssource.
COMPILERS AND INTERPRETERS Lesson 3 – TDDD16 TDDB44 Compiler Construction 2010 Kristian Stavåker Department.
Introduction to Lex Fan Wu
By Neng-Fa Zhou Programming language syntax 4 Three aspects of languages –Syntax How are sentences formed? –Semantics What does a sentence mean? –Pragmatics.
Practical 1-LEX Implementation
1 Lex & Yacc. 2 Compilation Process Lexical Analyzer Source Code Syntax Analyzer Symbol Table Intermed. Code Gen. Code Generator Machine Code.
Introduction CPSC 388 Ellen Walker Hiram College.
ICS312 LEX Set 25. LEX Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the C program.
Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi.
Scanner Generation Using SLK and Flex++ Followed by a Demo Copyright © 2015 Curt Hill.
Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 6: Assembler slide 1www.nand2tetris.org Building a Modern Computer.
What is a compiler? –A program that reads a program written in one language (source language) and translates it into an equivalent program in another language.
LECTURE 7 Lex and Intro to Parsing. LEX Last lecture, we learned a little bit about how we can take our regular expressions (which specify our valid tokens)
LECTURE 11 Semantic Analysis and Yacc. REVIEW OF LAST LECTURE In the last lecture, we introduced the basic idea behind semantic analysis. Instead of merely.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
9-December-2002cse Tools © 2002 University of Washington1 Lexical and Parser Tools CSE 413, Autumn 2002 Programming Languages
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
Introduction: From Nand to Tetris
Lexical Analysis.
Tutorial On Lex & Yacc.
PROGRAMMING LANGUAGES
Context-free Languages
Regular Languages.
TDDD55- Compilers and Interpreters Lesson 2
Bison: Parser Generator
Lexical Analysis Why separate lexical and syntax analyses?
Compiler I: Sytnax Analysis
Review: Compiler Phases:
Lecture 4: Lexical Analysis & Chomsky Hierarchy
Compiler Structures 3. Lex Objectives , Semester 2,
Compiler Design 3. Lexical Analyzer, Flex
Presentation transcript:

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 1www.nand2tetris.org Building a Modern Computer From First Principles Compiler I: Syntax Analysis

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 2www.nand2tetris.org Course map Assembler Chapter 6 H.L. Language & Operating Sys. abstract interface Compiler Chapters VM Translator Chapters Computer Architecture Chapters Gate Logic Chapters Electrical Engineering Physics Virtual Machine abstract interface Software hierarchy Assembly Language abstract interface Hardware hierarchy Machine Language abstract interface Hardware Platform abstract interface Chips & Logic Gates abstract interface Human Thought Abstract design Chapters 9, 12

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 3www.nand2tetris.org Motivation: Why study about compilers? The first compiler is FORTRAN compiler developed by an IBM team led by John Backus (Turing Award, 1977) in It took 18 man-month. Because Compilers … Are an essential part of applied computer science Are very relevant to computational linguistics Are implemented using classical programming techniques Employ important software engineering principles Train you in developing software for transforming one structure to another (programs, files, transactions, …) Train you to think in terms of ”description languages”. Parsing files of some complex syntax is very common in many applications.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 4www.nand2tetris.org The big picture Intermediate code VM implementation over CISC platforms VM imp. over RISC platforms VM imp. over the Hack platform VM emulator VM lectures (Projects 7-8) Some Other language Jack language Some compiler Some Other compiler Jack compiler... Some language... Compiler lectures (Projects 10,11) Modern compilers are two-tiered: Front-end: from high-level language to some intermediate language Back-end: from the intermediate language to binary code.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 5www.nand2tetris.org Compiler architecture (front end) Syntax analysis: understanding the structure of the source code Code generation: reconstructing the semantics using the syntax of the target code.  Tokenizing: creating a stream of “atoms”  Parsing: matching the atom stream with the language grammar XML output = one way to demonstrate that the syntax analyzer works (source)(target) scanner

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 6www.nand2tetris.org Tokenizing / Lexical analysis / scanning Remove white space Construct a token list (language atoms) Things to worry about: Language specific rules: e.g. how to treat “++” Language-specific classifications: keyword, symbol, identifier, integerCconstant, stringConstant,... While we are at it, we can have the tokenizer record not only the token, but also its lexical classification (as defined by the source language grammar).

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 7www.nand2tetris.org C function to split a string into tokens char* strtok(char* str, const char* delimiters); str : string to be broken into tokens delimiters : string containing the delimiter characters

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 8www.nand2tetris.org Jack Tokenizer if (x < 153) {let city = ”Paris”;} Source code if ( x < 153 ) { let city = Paris ; } if ( x < 153 ) { let city = Paris ; } Tokenizer’s output Tokenizer

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 9www.nand2tetris.org Parsing The tokenizer discussed thus far is part of a larger program called parser Each language is characterized by a grammar. The parser is implemented to recognize this grammar in given texts The parsing process: A text is given and tokenized The parser determines weather or not the text can be generated from the grammar In the process, the parser performs a complete structural analysis of the text The text can be in an expression in a : Natural language (English, …) Programming language (Jack, …).

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 10www.nand2tetris.org Parsing examples He ate an apple on the desk. English ate he an apple the desk parse on (5+3)*2 – sqrt(9*4) Jack

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 11www.nand2tetris.org Regular expressions a|b* { , “a”, “b”, “bb”, “bbb”, …} (a|b)* { , “a”, “b”, “aa”, “ab”, “ba”, “bb”, “aaa”, …} ab*(c|  ) {a, “ac”, “ab”, “abc”, “abb”, “abbc”, …}

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 12www.nand2tetris.org Context-free grammar S  () S  (S) S  SS S  a|aS|bS strings ending with ‘a’ S  x S  y S  S+S S  S-S S  S*S S  S/S S  (S) (x+y)*x-x*y/(x+x) Simple (terminal) forms / complex (non-terminal) forms Grammar = set of rules on how to construct complex forms from simpler forms Highly recursive.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 13www.nand2tetris.org Recursive descent parser A=bB|cC A() { if (next()==‘b’) { eat(‘b’); B(); } else if (next()==‘c’) { eat(‘c’); C(); } A=(bB)* A() { while (next()==‘b’) { eat(‘b’); B(); }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 14www.nand2tetris.org A typical grammar of a typical C-like language while (expression) { if (expression) statement; while (expression) { statement; if (expression) statement; } while (expression) { statement; } while (expression) { if (expression) statement; while (expression) { statement; if (expression) statement; } while (expression) { statement; } Code samples if (expression) { statement; while (expression) statement; } if (expression) statement; } if (expression) { statement; while (expression) statement; } if (expression) statement; }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 15www.nand2tetris.org A typical grammar of a typical C-like language program: statement; statement: whileStatement | ifStatement | // other statement possibilities... | '{' statementSequence '}' whileStatement: 'while' '(' expression ')' statement ifStatement: simpleIf | ifElse simpleIf: 'if' '(' expression ')' statement ifElse: 'if' '(' expression ')' statement 'else' statement statementSequence: '' // null, i.e. the empty sequence | statement ';' statementSequence expression: // definition of an expression comes here program: statement; statement: whileStatement | ifStatement | // other statement possibilities... | '{' statementSequence '}' whileStatement: 'while' '(' expression ')' statement ifStatement: simpleIf | ifElse simpleIf: 'if' '(' expression ')' statement ifElse: 'if' '(' expression ')' statement 'else' statement statementSequence: '' // null, i.e. the empty sequence | statement ';' statementSequence expression: // definition of an expression comes here

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 16www.nand2tetris.org Parse tree program: statement; statement: whileStatement | ifStatement | // other statement possibilities... | '{' statementSequence '}' whileStatement: 'while' '(' expression ')' statement... program: statement; statement: whileStatement | ifStatement | // other statement possibilities... | '{' statementSequence '}' whileStatement: 'while' '(' expression ')' statement...

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 17www.nand2tetris.org Recursive descent parsing Parser implementation: a set of parsing methods, one for each rule: parseStatement() parseWhileStatement() parseIfStatement() parseStatementSequence() parseExpression(). Highly recursive LL(0) grammars: the first token determines in which rule we are In other grammars you have to look ahead 1 or more tokens Jack is almost LL(0). while (expression) { statement; while (expression) { while (expression) statement; } while (expression) { statement; while (expression) { while (expression) statement; } code sample

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 18www.nand2tetris.org The Jack grammar ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y. ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 19www.nand2tetris.org The Jack grammar ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y. ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 20www.nand2tetris.org The Jack grammar ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y. ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 21www.nand2tetris.org The Jack grammar ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y. ’x’ : x appears verbatim x : x is a language construct x? : x appears 0 or 1 times x* : x appears 0 or more times x|y : either x or y appears (x,y) : x appears, then y.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 22www.nand2tetris.org Jack syntax analyzer in action Class Bar { method Fraction foo(int y) { var int temp; // a variable let temp = (xxx+12)*-63; Class Bar { method Fraction foo(int y) { var int temp; // a variable let temp = (xxx+12)*-63; Syntax analyzer With the grammar, we can write a syntax analyzer program (parser) The syntax analyzer takes a source text file and attempts to match it on the language grammar If successful, it can generate a parse tree in some structured format, e.g. XML. Syntax analyzer var int temp ; let temp = ( xxx var int temp ; let temp = ( xxx

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 23www.nand2tetris.org Jack syntax analyzer in action Class Bar { method Fraction foo(int y) { var int temp; // a variable let temp = (xxx+12)*-63; Class Bar { method Fraction foo(int y) { var int temp; // a variable let temp = (xxx+12)*-63; Syntax analyzer If xxx is non-terminal, output: Recursive code for the body of xxx If xxx is terminal (keyword, symbol, constant, or identifier), output: xxx value var int temp ; let temp = ( xxx var int temp ; let temp = ( xxx

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 24www.nand2tetris.org Summary and next step Syntax analysis: understanding syntax Code generation: constructing semantics The code generation challenge: Extend the syntax analyzer into a full-blown compiler that, instead of passive XML code, generates executable VM code Two challenges: (a) handling data, and (b) handling commands.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 25www.nand2tetris.org Perspective The parse tree can be constructed on the fly The Jack language is intentionally simple: Statement prefixes: let, do,... No operator priority No error checking Basic data types, etc. The Jack compiler: designed to illustrate the key ideas that underlie modern compilers, leaving advanced features to more advanced courses Richer languages require more powerful compilers

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 26www.nand2tetris.org Perspective Syntax analyzers can be built using: Lex tool for tokenizing (flex) Yacc tool for parsing (bison) Do everything from scratch (our approach...) Industrial-strength compilers: (LLVM) Have good error diagnostics Generate tight and efficient code Support parallel (multi-core) processors.

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 27www.nand2tetris.org Lex (from wikipedia) A computer program that generates lexical analyzers (scanners or lexers) Commonly used with the yacc parser generator. Structure of a Lex file Definition section % Rules section % C code section Definition section % Rules section % C code section

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 28www.nand2tetris.org Example of a Lex file /*** Definition section ***/ %{ /* C code to be copied verbatim */ #include %} /* This tells flex to read only one input file */ %option noyywrap /*** Rules section ***/ % [0-9]+ { /* yytext is a string containing the matched text. */ printf("Saw an integer: %s\n", yytext); }.|\n { /* Ignore all other characters. */ } /*** Definition section ***/ %{ /* C code to be copied verbatim */ #include %} /* This tells flex to read only one input file */ %option noyywrap /*** Rules section ***/ % [0-9]+ { /* yytext is a string containing the matched text. */ printf("Saw an integer: %s\n", yytext); }.|\n { /* Ignore all other characters. */ }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 29www.nand2tetris.org Example of a Lex file % /*** C Code section ***/ int main(void) { /* Call the lexer, then quit. */ yylex(); return 0; } % /*** C Code section ***/ int main(void) { /* Call the lexer, then quit. */ yylex(); return 0; }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 30www.nand2tetris.org Example of a Lex file > flex test.lex (a file lex.yy.c with 1,763 lines is generated) > gcc lex.yy.c (an executable file a.out is generated) >./a.out < test.txt Saw an integer: 123 Saw an integer: 2 Saw an integer: 6 > flex test.lex (a file lex.yy.c with 1,763 lines is generated) > gcc lex.yy.c (an executable file a.out is generated) >./a.out < test.txt Saw an integer: 123 Saw an integer: 2 Saw an integer: 6 abc123z.!&*2gj6 test.txt

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 31www.nand2tetris.org Another Lex example %{ int num_lines = 0, num_chars = 0; %} %option noyywrap % \n ++num_lines; ++num_chars;. ++num_chars; % main() { yylex(); printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars ); } %{ int num_lines = 0, num_chars = 0; %} %option noyywrap % \n ++num_lines; ++num_chars;. ++num_chars; % main() { yylex(); printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars ); }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 32www.nand2tetris.org A more complex Lex example %{ /* need this for the call to atof() below */ #include %} %option noyywrap DIGIT [0-9] ID [a-z][a-z0-9]* % {DIGIT}+ { printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) ); } {DIGIT}+"."{DIGIT}* { printf( "A float: %s (%g)\n", yytext, atof( yytext ) ); } %{ /* need this for the call to atof() below */ #include %} %option noyywrap DIGIT [0-9] ID [a-z][a-z0-9]* % {DIGIT}+ { printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) ); } {DIGIT}+"."{DIGIT}* { printf( "A float: %s (%g)\n", yytext, atof( yytext ) ); }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 33www.nand2tetris.org A more complex Lex example if|then|begin|end|procedure|function { printf( "A keyword: %s\n", yytext ); } {ID} printf( "An identifier: %s\n", yytext ); "+"|"-"|"="|"("|")" printf( “Symbol: %s\n", yytext ); [ \t\n]+ /* eat up whitespace */. printf("Unrecognized char: %s\n", yytext ); % void main(int argc, char **argv ) { if ( argc > 1 ) yyin = fopen( argv[1], "r" ); else yyin = stdin; yylex(); } if|then|begin|end|procedure|function { printf( "A keyword: %s\n", yytext ); } {ID} printf( "An identifier: %s\n", yytext ); "+"|"-"|"="|"("|")" printf( “Symbol: %s\n", yytext ); [ \t\n]+ /* eat up whitespace */. printf("Unrecognized char: %s\n", yytext ); % void main(int argc, char **argv ) { if ( argc > 1 ) yyin = fopen( argv[1], "r" ); else yyin = stdin; yylex(); }

Elements of Computing Systems, Nisan & Schocken, MIT Press, Chapter 10: Compiler I: Syntax Analysis slide 34www.nand2tetris.org A more complex Lex example if (a+b) then foo= else foo=12 if (a+b) then foo= else foo=12 pascal.txt A keyword: if Symbol: ( An identifier: a Symbol: + An identifier: b Symbol: ) A keyword: then An identifier: foo Symbol: = A float: (3.1416) An identifier: else An identifier: foo Symbol: = An integer: 12 (12) A keyword: if Symbol: ( An identifier: a Symbol: + An identifier: b Symbol: ) A keyword: then An identifier: foo Symbol: = A float: (3.1416) An identifier: else An identifier: foo Symbol: = An integer: 12 (12) output