Using Real Compiler Source Code for Teaching Graduate Compiler Design
Elizabeth White, Nina Stewart
Computer Science Department, George Mason University
Ranjan Sen
Microsoft Corporation, Washington DC
This work is supported by SSCLI (Rotor) RFP 2
Graduate Compilers
Typical courses in graduate compilers include:
- Theory
  - Lexical analysis
  - Syntax analysis (LL/LR)
  - Semantic analysis (typechecking, intermediate code generation, …)
- Development of a ‘compiler’ for a small language
  - Use of theory
  - Use of tools (e.g. Lex, YACC)
Graduate Compilers
Typical courses in graduate compilers do not include:
- Access to a ‘real’ compiler
- Access to a ‘real’ runtime environment
Why? Real compilers are extremely large and complex.
- C# compiler: 200,000+ lines of C++ in 138 modules
Hide and Show
Can we integrate ‘real’ compilers into a class like this? We think so.
Idea:
- Show relevant details inside the compiler, tied to input code
- Hide everything else
Hide and Show
Use a debugging tool on the compiler. Combine:
- A pre-chosen input program
- Pre-chosen breakpoints inside the source code of the appropriate compiler component
Step through the processing of the input program and watch how it triggers changes in the compiler's state.
A carefully planned exercise of this type should give students insights that would be difficult to provide using other techniques.
Hide and Show: SSCLI (Rotor) C# compiler
Concept demonstration (excerpt from the lexer's number-scanning code):

    _parseNumber:
    fReal = TRUE;
    while (*p >= '0' && *p <= '9')
        p++;
    // Number + dot + non-digit -- these are separate tokens, so don't absorb the
    // dot token into the number.
    p = pszHold;
    pFT->iToken = TID_NUMBER;
    break;
    }
    }
    if (*p == 'E' || *p == 'e')
    {
        fReal = TRUE;
        // skip exponent
        p++;
        if (*p == '+' || *p == '-')
            p++;
        while (*p >= '0' && *p <= '9')
            p++;
    }
    …
    break;

[Figure: SSCLI (Rotor) C# compiler, taking a C# input file to the target language. Components shown: Scanner, Parser, Semantic Analysis, Code Generator, Optimizer, Symbol Table. Concept demonstration and exploration are driven by instructions (debugger commands, breakpoints) in Visual Studio .NET.]
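The excerpt above shows only fragments of the lexer, so the following is a minimal, self-contained sketch of the same idea rather than the Rotor code itself. It assumes that the "Number + dot + non-digit" comment belongs to a branch that backs up to a saved pointer (`pszHold` on the slide, `hold` here) when the character after the dot is not a digit. The names `NumberToken`, `isDigit`, and `scanNumber` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch (not a Rotor type).
struct NumberToken {
    std::size_t length;  // number of characters consumed by the literal
    bool isReal;         // true if a fraction or an exponent was seen
};

static bool isDigit(char c) { return c >= '0' && c <= '9'; }

// Scans a numeric literal at the start of `text`.
// Assumption: the caller has already seen that text[0] is a digit.
NumberToken scanNumber(const std::string& text) {
    const char* p = text.c_str();  // NUL-terminated, so *p is always readable
    const char* start = p;
    bool isReal = false;

    while (isDigit(*p)) p++;       // integer part

    if (*p == '.') {
        const char* hold = p;      // remember where the dot is
        p++;
        if (isDigit(*p)) {
            isReal = true;
            while (isDigit(*p)) p++;   // fractional part
        } else {
            // Number + dot + non-digit: these are separate tokens,
            // so back up and leave the dot for the next token.
            p = hold;
            return {static_cast<std::size_t>(p - start), false};
        }
    }

    if (*p == 'E' || *p == 'e') {
        isReal = true;
        p++;                                 // skip the exponent marker
        if (*p == '+' || *p == '-') p++;     // optional sign
        while (isDigit(*p)) p++;             // exponent digits
    }

    return {static_cast<std::size_t>(p - start), isReal};
}
```

Calling `scanNumber("42.ToString()")` consumes only `42` and reports an integer, which is the case the slide's comment describes. Like the excerpt, the sketch does not check that digits actually follow the exponent marker.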
Demo
Platform:
- Rotor (SSCLI) C# compiler
- Visual Studio .NET
Concept: Lexical Analysis
- How are tokens identified by the lexer?
- What happens to the input as tokens are found?
Should we integrate ‘real’ compilers into a class like this?
Will there be value added?
- Strengthen understanding of how basic concepts apply?
- Strengthen understanding of inter-relationships between parts of a compiler?
How to measure value added?
How should this be done?
Classroom
- Would need strong visualization tools to be effective
Directed exercises for individuals
- More promising (at least immediately)
- Could be assignment based
Current Status
Basic concept demos:
- Lexical Analysis
- Parsing
- Symbol Tables
Need to find the best mechanism to allow students to use the demos independently.
Integrated into graduate compilers, Fall 2004.
Future
- How effective is this approach?
- What additional tools (such as visualization) would make this approach more effective?
- Expand to see how hide and show can be used at the ‘back-end’ of a compiler.
- In what other areas/courses could this approach be effective?
Questions?
Example
Concept: Lexical Analysis
- How are tokens identified by the lexer?
- What happens to the input stream as tokens are found?
Illustration: watch the C# lexer work
- Set C# lexer variables to watch
- Set breakpoints in the C# lexer code
- Trace the lexer through a C# input program
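To give a feel for what such a trace exposes, here is a toy lexer, far simpler than the C# lexer and not derived from it. It records each token's kind, lexeme, and starting offset, then prints the input that remains after the token. Those are the kinds of values a student would otherwise read from the debugger's watch window at each breakpoint. All names (`Kind`, `Token`, `tokenize`, `printTrace`) are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical token kinds for this sketch.
enum class Kind { Identifier, Number, Symbol };

struct Token {
    Kind kind;
    std::string lexeme;
    std::size_t offset;  // where the lexeme starts in the input
};

static bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Splits `input` into identifiers, unsigned integers, and one-character symbols.
std::vector<Token> tokenize(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        unsigned char c = static_cast<unsigned char>(input[pos]);
        if (std::isspace(c)) { pos++; continue; }  // whitespace yields no token

        std::size_t start = pos;
        Kind kind;
        if (std::isalpha(c) || c == '_') {
            kind = Kind::Identifier;
            while (pos < input.size() && isIdentChar(input[pos])) pos++;
        } else if (std::isdigit(c)) {
            kind = Kind::Number;
            while (pos < input.size() &&
                   std::isdigit(static_cast<unsigned char>(input[pos]))) pos++;
        } else {
            kind = Kind::Symbol;
            pos++;
        }
        tokens.push_back({kind, input.substr(start, pos - start), start});
    }
    return tokens;
}

static const char* kindName(Kind k) {
    switch (k) {
        case Kind::Identifier: return "Identifier";
        case Kind::Number:     return "Number";
        default:               return "Symbol";
    }
}

// Prints one line per token, including the input left after that token.
void printTrace(const std::string& input) {
    for (const Token& t : tokenize(input)) {
        std::cout << kindName(t.kind) << " \"" << t.lexeme << "\" at offset "
                  << t.offset << ", remaining: \""
                  << input.substr(t.offset + t.lexeme.size()) << "\"\n";
    }
}
```

Calling `printTrace("x = 42;")` lists four tokens in order. In the real exercise the same information comes from stepping between breakpoints in the compiler, with no extra printing code.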