Computer Science 210: Computer Organization
Building an Assembler, Part IV: Grammars and Syntax Analysis

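The Second Pass
[Diagram: assembler structure. Input is a text file, read as a line stream by CharacterIO and converted to a token stream by the Scanner. The First Pass produces the symbol table; the Second Pass follows it. Supporting components: Tools, opcode table. Outputs: source program listing and error messages (file and/or terminal), Sym file, Bin file.]
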
Syntax Analysis
Examine each token in a line of code and verify that the line obeys the syntax rule for its type of instruction.
An instruction’s type is determined by its leading token and possibly by other tokens.
Specify all the syntax rules in a grammar.

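As a rough illustration of "type is determined by the leading token", here is a minimal Python sketch. The table `RULE_FOR_LEADING_TOKEN` and the function `instruction_type` are hypothetical names, not part of the course code. The rule names are the ones used in the example rules of this lecture. Looking at the second token when the first is not an opcode is an assumption, made so that a line starting with a label can still be classified.

```python
# Hypothetical table: leading token -> name of the grammar rule to check.
RULE_FOR_LEADING_TOKEN = {
    "ADD": "add-or-and-ins",
    "AND": "add-or-and-ins",
    ".BLKW": "blkw-ins",
}


def instruction_type(tokens):
    """Return the rule name for a tokenized line, or None if unrecognized.

    Assumption: a line may begin with a label, so the second token is
    consulted when the first one is not a known opcode or directive.
    """
    for token in tokens[:2]:
        rule = RULE_FOR_LEADING_TOKEN.get(token)
        if rule is not None:
            return rule
    return None


print(instruction_type(["ADD", "R1", ",", "R2", ",", "R3"]))
print(instruction_type(["LOOP", ".BLKW", "5"]))
```

Once the rule name is known, the analyzer can check the remaining tokens against that one rule instead of trying every rule in the grammar.
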
EBNF Grammar
Extended Backus-Naur Form
Contains rules and three types of symbols:
Terminal symbols: can appear in code
Non-terminal symbols: name phrases
Metasymbols: used to form the rules

Example Types of Symbols
Terminals are "ADD" and "x3000"
Non-terminals are label and string-literal
Metasymbols are = and |

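The three kinds of symbols can be told apart mechanically when reading a rule. The sketch below is one possible way to do it in Python, assuming that terminals are written in double quotes, that non-terminal names are made of letters, digits and hyphens, and that the bracket characters also count as metasymbols. `classify` and `RULE_SYMBOL` are hypothetical names introduced only for this example.

```python
import re

# Assumed metasymbols: "=" and "|" plus the grouping brackets of EBNF.
METASYMBOLS = set("=|[]{}()")

# One symbol is a quoted terminal, a single metasymbol,
# or a non-terminal name such as integer-literal.
RULE_SYMBOL = re.compile(r'"[^"]*"|[=|\[\]{}()]|[A-Za-z][A-Za-z0-9-]*')


def classify(rule_text):
    """Return a list of (symbol, kind) pairs for one EBNF rule."""
    result = []
    for symbol in RULE_SYMBOL.findall(rule_text):
        if symbol.startswith('"'):
            kind = "terminal"
        elif symbol in METASYMBOLS:
            kind = "metasymbol"
        else:
            kind = "non-terminal"
        result.append((symbol, kind))
    return result


for symbol, kind in classify('blkw-ins = ".BLKW" integer-literal'):
    print(symbol, kind)
```
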
Some Example Rules
add-or-and-ins = ( "ADD" | "AND" ) register "," register "," ( register | integer-literal )
blkw-ins = ".BLKW" integer-literal
br-ins = br-opcode label

=     means "is defined as"
" "   enclose literal items
[ ]   enclose optional items
{ }   enclose items that can appear zero or more times
( )   group items together as a unit
|     indicates a choice

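One way to see what the metasymbols mean is to turn each into a small matching function and then write the rules almost exactly as they appear in the grammar. The Python sketch below does this for add-or-and-ins and blkw-ins. It is not the course's implementation: all function names are hypothetical, and the formats assumed for register (R0 to R7) and integer-literal (#decimal or xHEX) are guesses for illustration, since the actual rules live in Grammar.txt. The `optional` and `repeat` helpers are included for completeness; the two rules shown do not use them.

```python
import re

# Each matcher takes (tokens, pos) and returns the position after the
# matched tokens, or None if the tokens at pos do not fit.


def lit(text):
    """Quoted terminal: exactly one token equal to text."""
    def match(tokens, pos):
        return pos + 1 if pos < len(tokens) and tokens[pos] == text else None
    return match


def pattern(regex):
    """One token whose whole text fits a regular expression."""
    compiled = re.compile(regex)
    def match(tokens, pos):
        if pos < len(tokens) and compiled.fullmatch(tokens[pos]):
            return pos + 1
        return None
    return match


def seq(*parts):
    """Items written side by side: each must match in order."""
    def match(tokens, pos):
        for part in parts:
            pos = part(tokens, pos)
            if pos is None:
                return None
        return pos
    return match


def choice(*alternatives):
    """The | metasymbol: the first alternative that matches is taken."""
    def match(tokens, pos):
        for alternative in alternatives:
            end = alternative(tokens, pos)
            if end is not None:
                return end
        return None
    return match


def optional(part):
    """The [ ] metasymbols: match the item if present, else match nothing."""
    def match(tokens, pos):
        end = part(tokens, pos)
        return pos if end is None else end
    return match


def repeat(part):
    """The { } metasymbols: match the item zero or more times."""
    def match(tokens, pos):
        while True:
            end = part(tokens, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return match


# Assumed token formats, for illustration only.
register = pattern(r"R[0-7]")
integer_literal = pattern(r"#-?[0-9]+|x[0-9A-Fa-f]+")

add_or_and_ins = seq(choice(lit("ADD"), lit("AND")),
                     register, lit(","), register, lit(","),
                     choice(register, integer_literal))
blkw_ins = seq(lit(".BLKW"), integer_literal)


def obeys(rule, tokens):
    """True if the rule matches the entire token list."""
    return rule(tokens, 0) == len(tokens)


print(obeys(add_or_and_ins, ["ADD", "R1", ",", "R2", ",", "R3"]))
print(obeys(add_or_and_ins, ["ADD", "R1", "R2", ",", "R3"]))
```

Because `choice` commits to the first alternative that matches, this style works for rules like the ones above, where the alternatives start with different tokens. A grammar with overlapping alternatives would need backtracking or a different design.
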
The Complete Grammar (in Grammar.txt)
Contains rules for all the syntactic forms.
Mentions most of the tokens literally. Integers and labels are the exception: instead of listing them, the grammar specifies rules for constructing and recognizing them.

For Wednesday: Code Generation