Using Microsoft Phoenix in Education and Research
Dragan Bojić
University of Belgrade
Contents
Introduction
Features of the Phoenix framework
Applications of Phoenix
Educational uses of Phoenix
Research using Phoenix
Phoenix RFPs
Phoenix-based tool for data flow testing
Conclusions
Introduction: what is Microsoft Phoenix?
“a project at Microsoft Research that provides an extensible framework for the analysis, optimization, and modification of code during compilation.”
“Phoenix is Microsoft’s next-generation, state of the art infrastructure for program analysis and transformation”
“Phoenix is a framework and a toolkit that allows users to peek into the black box at the heart of compilers in order to see and modify their internals.”
Introduction: what is Microsoft Phoenix?
Phoenix is currently available as a Research Development Kit (RDK) for noncommercial use by qualified researchers affiliated with recognized institutions (research.microsoft.com/phoenix).
The final goal for the Phoenix technology is to replace the standard, 10-year-old Microsoft compiler back end (c2.exe), as well as the Microsoft® .NET Framework just-in-time (JIT) compiler, for new 64-bit computer architectures.
Connection to other MS projects
[Figure: diagram relating the Phoenix Infrastructure to the following groups of projects.]
.NET CodeGen: runtime JITs; Pre-JIT; OO and .NET optimizations.
Native CodeGen: advanced C++/OO optimizations; FP optimizations; OpenMP.
Retargetable: “machine models”; ~3 months: -Od; ~3 months: -O2.
Chip Vendor CDK: ~6-month ports; sample port + docs.
Academic RDK: managed APIs; IP as DLLs; docs.
MSR & Partner Tools: built on Phoenix APIs; both HL and LL APIs; managed APIs; program analysis; program rewrite.
MSR Adv Lang: language research; direct xfer to Phoenix; research insulated from code generation.
AST Tools: static analysis tools; next-gen front ends; R/W global program views.
Phoenix implementation
Phoenix is built in C++/CLI.
Phoenix is the largest C++/CLI code base we know of: ~400K LOC written by hand and ~1.8M LOC written by tools.
It was initially written in MC++ syntax and is now being converted to C++/CLI.
(Taken from Microsoft material.)
Features of Phoenix: basic usage scenario #1
Phoenix acts as a compiler back end.
It converts its input to the common Phoenix intermediate representation (IR).
It runs a sequence of phases that is exposed to the user, who can easily modify it, e.g. by omitting a standard phase or inserting a custom one.
Code insertion, code transformations, and optimizations can easily be done on the IR.
Finally, it generates native code for a variety of architectures.
[Figure: diagram with the labels Phoenix (acts as a back end), IR, Managed compiler (C#, VB), .NET assembly, C/C++ compiler front end, CIL, Native code, PE file.]
Features of Phoenix: standard phases
The user can easily modify the phase sequence, e.g. by omitting a standard phase or inserting a custom one.
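The idea of an editable phase sequence can be illustrated without Phoenix itself. The following is a minimal sketch in plain C, not the Phoenix API: the types `Unit`, `Phase`, and `PhaseList`, the functions `append_phase`, `insert_phase_after`, and `run_phases`, and the phase names "lower", "optimize", "emit", and "custom" are all hypothetical names chosen for illustration. Each phase simply appends a digit to a log so that the execution order is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PHASES 16

/* Hypothetical stand-in for a compilation unit: each phase appends a digit. */
typedef struct { long log; } Unit;

typedef struct {
    const char *name;
    void (*execute)(Unit *unit);
} Phase;

typedef struct {
    Phase items[MAX_PHASES];
    size_t count;
} PhaseList;

/* Illustrative phases; the names are made up, not Phoenix phase names. */
static void lower_phase(Unit *u)    { u->log = u->log * 10 + 1; }
static void optimize_phase(Unit *u) { u->log = u->log * 10 + 2; }
static void emit_phase(Unit *u)     { u->log = u->log * 10 + 3; }
static void custom_phase(Unit *u)   { u->log = u->log * 10 + 9; }

/* Appends a phase at the end; returns 0 on success, -1 if the list is full. */
int append_phase(PhaseList *list, Phase p)
{
    assert(list != NULL);
    if (list->count >= MAX_PHASES) return -1;
    list->items[list->count++] = p;
    return 0;
}

/* Inserts p right after the phase named anchor; returns its index or -1. */
int insert_phase_after(PhaseList *list, const char *anchor, Phase p)
{
    assert(list != NULL && anchor != NULL);
    if (list->count >= MAX_PHASES) return -1;
    for (size_t i = 0; i < list->count; i++) {
        if (strcmp(list->items[i].name, anchor) != 0) continue;
        /* Shift the tail one slot to the right to make room. */
        memmove(&list->items[i + 2], &list->items[i + 1],
                (list->count - i - 1) * sizeof(Phase));
        list->items[i + 1] = p;
        list->count++;
        return (int)(i + 1);
    }
    return -1;
}

/* Runs every phase in order on the given unit. */
void run_phases(const PhaseList *list, Unit *unit)
{
    assert(list != NULL && unit != NULL);
    for (size_t i = 0; i < list->count; i++)
        list->items[i].execute(unit);
}

/* Builds lower, optimize, custom, emit and returns the resulting log. */
long demo_pipeline(void)
{
    PhaseList list = { .count = 0 };
    Unit unit = { 0 };
    append_phase(&list, (Phase){ "lower", lower_phase });
    append_phase(&list, (Phase){ "optimize", optimize_phase });
    append_phase(&list, (Phase){ "emit", emit_phase });
    insert_phase_after(&list, "optimize", (Phase){ "custom", custom_phase });
    run_phases(&list, &unit);
    return unit.log;
}
```

In this sketch `demo_pipeline()` returns 1293: the custom phase (digit 9) runs between the hypothetical "optimize" and "emit" phases, which is the kind of insertion the slide describes.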
Features of Phoenix: usage scenario #2, code retargeting
Phoenix can regenerate .NET assembly code, or it can read native code (an .exe or .dll file) and convert it into the common IR.
The Phoenix infrastructure can then perform static analysis on the IR, instrument the IR, and/or write a new binary.
[Figure: diagram with the labels Phoenix (acts as a back end), Managed compiler (C#, VB), .NET assembly, C/C++ compiler front end, CIL, Native code.]
Possible uses of Phoenix in education
Since the focus of Phoenix is on the compiler back end, it is most suitable for advanced undergraduate or graduate courses in compiler construction.
The teacher can give programming assignments on implementing well-known compiler algorithms that are not easy to implement without Phoenix.
Sample assignments
BB: calculates basic blocks and produces a BB graph.
ReachingDefs: computes the reaching definitions for all the basic blocks in a function.
UninitializedLocal: checks for uses of uninitialized local variables.
DumpType: prints types and symbols from an exe file.
FuncNames: reports the name of each function as it is being compiled by the c2 back end.
MTrace: injects tracing code into a managed application.
ProcTrace: instruments a native binary by adding, to the beginning of every function, a call to a user-supplied routine.
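To give a feel for the first assignment, here is a minimal sketch in plain C of the classic leader-based partitioning into basic blocks. It does not use Phoenix: the `Instr` encoding, the `find_leaders` function, and the `example_code` array are hypothetical. The array is a hand-made linearization of the small program analyzed later in these slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_PLAIN, OP_JUMP, OP_BRANCH } OpKind;

typedef struct {
    OpKind kind;
    int target;   /* index of the jump/branch target; ignored for OP_PLAIN */
} Instr;

/* Sets leader[i] to true if instruction i starts a basic block.
   Returns the number of basic blocks. */
size_t find_leaders(const Instr *code, size_t n, bool *leader)
{
    assert(code != NULL && leader != NULL);
    for (size_t i = 0; i < n; i++) leader[i] = false;
    if (n == 0) return 0;

    leader[0] = true;                          /* rule 1: first instruction */
    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == OP_PLAIN) continue;
        assert(code[i].target >= 0 && (size_t)code[i].target < n);
        leader[code[i].target] = true;         /* rule 2: target of a transfer */
        if (i + 1 < n) leader[i + 1] = true;   /* rule 3: instruction after it */
    }

    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (leader[i]) count++;
    return count;
}

/* Hand-made linear form of the example program (an assumption, not
   compiler output). */
static const Instr example_code[13] = {
    { OP_PLAIN,  0 },   /*  0: x = 0                    */
    { OP_PLAIN,  0 },   /*  1: y = 0                    */
    { OP_PLAIN,  0 },   /*  2: z = 5                    */
    { OP_PLAIN,  0 },   /*  3: k = 2                    */
    { OP_BRANCH, 7 },   /*  4: if (k != z) goto 7       */
    { OP_PLAIN,  0 },   /*  5: b = 1                    */
    { OP_JUMP,   8 },   /*  6: goto 8                   */
    { OP_PLAIN,  0 },   /*  7: b = 2                    */
    { OP_BRANCH, 12 },  /*  8: if (!(x < z)) goto 12    */
    { OP_PLAIN,  0 },   /*  9: x = x + z * b            */
    { OP_PLAIN,  0 },   /* 10: z = z + 3                */
    { OP_JUMP,   8 },   /* 11: goto 8                   */
    { OP_PLAIN,  0 },   /* 12: return                   */
};
```

For `example_code`, the leaders are instructions 0, 5, 7, 8, 9, and 12, giving six basic blocks. A real assignment would walk the Phoenix IR instead of a hand-written array.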
Possible uses of Phoenix in education (…)
Phoenix completely exposes internal compiler data structures and algorithms.
Phoenix makes it straightforward to implement compiler extensions without writing a new compiler from scratch.
It is easy to build custom compiler plug-ins simply by extending the Phx.PlugIn and Phx.Phase classes and then overriding the Phase.Execute method with your own code.
For example, the following code fragment implements a compiler phase that displays the name of each method processed.
    using System;

    public class DisplayNamesPhase : Phx.Phase
    {
        protected override void Execute(Phx.Unit unit)
        {
            // Only function units carry a function name; skip any other unit.
            Phx.FuncUnit func = unit as Phx.FuncUnit;
            if (func == null) return;
            Console.WriteLine("Function name: {0}", func.NameString);
        }
    }
Possible uses of Phoenix in education (…)
For example, one graduate student with no previous Phoenix experience needed two weeks to write a Phoenix add-in that computes the in and out sets for liveness analysis and displays them in a graph.
The implementation in C# is several hundred LOC.
Example: Analyzed source
    int main()
    {
        int x, y, z, k, b;
        x = 0; y = 0;
        z = 5;
        k = 2;
        if (k == z)
            b = 1;
        else
            b = 2;
        while (x < z) {
            x = x + z * b;
            z = z + 3;
        }
        return 0;
    }
Example: Output BB graph with in and out sets
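The in and out sets for the analyzed source can be reproduced with a small iterative solver. The following is a minimal sketch in plain C, not the student's C# add-in and not Phoenix code. The `Block` type, the `compute_liveness` function, the block numbering B0..B5, and the hand-derived use/def sets in `example_cfg` are assumptions made for illustration. Variable sets are bit masks, and the solver iterates in(B) = use(B) ∪ (out(B) − def(B)), out(B) = ∪ in(S) over the successors S of B, until nothing changes.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per variable of the example program. */
enum { VX = 1 << 0, VY = 1 << 1, VZ = 1 << 2, VK = 1 << 3, VB = 1 << 4 };

#define MAX_SUCC 2

typedef struct {
    unsigned use;         /* variables read before any write in the block */
    unsigned def;         /* variables written in the block */
    int succ[MAX_SUCC];   /* successor block indices, -1 when unused */
} Block;

/* Backward data flow analysis iterated to a fixed point. */
void compute_liveness(const Block *blocks, size_t n, unsigned *in, unsigned *out)
{
    assert(blocks != NULL && in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) { in[i] = 0; out[i] = 0; }

    int changed = 1;
    while (changed) {
        changed = 0;
        /* Visit blocks in reverse index order, which suits a backward analysis. */
        for (size_t i = n; i-- > 0; ) {
            unsigned new_out = 0;
            for (int s = 0; s < MAX_SUCC; s++)
                if (blocks[i].succ[s] >= 0)
                    new_out |= in[blocks[i].succ[s]];
            unsigned new_in = blocks[i].use | (new_out & ~blocks[i].def);
            if (new_in != in[i] || new_out != out[i]) {
                in[i] = new_in;
                out[i] = new_out;
                changed = 1;
            }
        }
    }
}

/* Hand-built CFG of the analyzed source (block numbering is an assumption). */
static const Block example_cfg[6] = {
    { 0,            VX | VY | VZ | VK, {  1,  2 } },  /* B0: x=0; y=0; z=5; k=2; if (k==z) */
    { 0,            VB,                {  3, -1 } },  /* B1: b = 1                          */
    { 0,            VB,                {  3, -1 } },  /* B2: b = 2                          */
    { VX | VZ,      0,                 {  4,  5 } },  /* B3: while (x < z)                  */
    { VX | VZ | VB, VX | VZ,           {  3, -1 } },  /* B4: x = x + z*b; z = z + 3         */
    { 0,            0,                 { -1, -1 } },  /* B5: exit                           */
};
```

With this encoding, in(B3) = out(B3) = in(B4) = out(B4) = {x, z, b}, in(B1) = in(B2) = {x, z}, out(B0) = {x, z}, and in(B0) is empty. The variables y and k are never live across a block boundary.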
Research using Phoenix
Microsoft stimulates research on Phoenix by providing funding opportunities annually via Requests for Proposals (RFPs) on this topic.
In 2005, a total fund of $ was distributed to 12 award recipients.
In 2006, a total fund of $ is distributed to 15 award recipients (more than 100 proposals were received).
Phoenix and SSCLI: Compilation and Managed Execution 2005 RFP Awards:
Compiler Support for Software Transactional Memory - Brown University, U.S.
Phoenix-Based Optimizing Compilers Course Development - Indian Institute of Information Technology, Hyderabad, India
Integrating Dynamic Slicing into the coredbg Debugger - University of Arizona, U.S.
A Testbed for Studying the Order and Combination of Code Optimization Phases - Harvard University, U.S.
PTV: Translation Validation in the Phoenix Compiler Framework - University of Illinois at Chicago, U.S.
Phase Detection and Optimization - University of California at Santa Barbara, U.S.
Extending Dynamic Features of the SSCLI - University of Oviedo, Spain
A Lua Compiler for the Phoenix Framework - Pontifícia Universidade Católica do Rio de Janeiro, Brazil
Developing a Testing Framework for Security - University of Virginia, U.S.
A Viable Approach to Compiling Sequential Codes for CMPs - Princeton University, U.S.
Improving the Compilation of Lazy Functional Languages Using Phoenix and the SSCLI - Federal University of Pernambuco, Brazil
A Phoenix-Based Tool for Data Flow Testing - University of Belgrade, Serbia and Montenegro
Concurrency Support for Managed Code and Interactive AsmL - University of Zagreb, Croatia
SPBU for Phoenix (SPBU4PHX): A Set of Compiler Development and AOP Tools Based on Phoenix - St. Petersburg University, Russia
Adaptive Heap Size Control Using Phoenix and .NET Virtual Machine - University of Rochester, U.S.
Region Memory System for Scalable Performance - National University of Singapore
Phoenix-Based Tool for Data Flow Testing
The main outcome of the project will be a data flow testing tool for .NET applications, based on the Phoenix RDK.
The tool will support data flow test adequacy criteria such as all-uses, all-defs, all-c-uses, and all-p-uses. Other criteria will be considered as well, such as all-du-paths or interprocedural data flow testing (global def-uses).
The tool should help developers determine how well the software has been tested by displaying the paths that have not been executed.
It could also help researchers who compare the relative strengths of testing techniques using an empirical statistical approach.
At present, very few testing tools support data flow techniques.
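The all-uses criterion asks that, for every definition of a variable and every use it can reach, at least one test exercises a definition-clear path from that definition to that use. The following is a minimal sketch in plain C of how coverage of such def-use pairs could be checked against an executed block trace. It is not the project's tool and does not use Phoenix: the `DuPair` type, the functions `path_covers` and `record_path`, the block numbering B0..B5, and the hand-enumerated `example_pairs` for the analyzed source are assumptions. Definitions and uses are tracked per basic block rather than per instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { VX = 1 << 0, VY = 1 << 1, VZ = 1 << 2, VK = 1 << 3, VB = 1 << 4 };

typedef struct {
    int def_block;   /* block that defines the variable */
    int use_block;   /* block whose entry reads the variable */
    unsigned var;    /* single-bit mask of the variable */
} DuPair;

/* True if the trace visits def_block and later use_block with no
   redefinition of the variable in the blocks in between. */
bool path_covers(const DuPair *p, const unsigned *defs,
                 const int *path, size_t len)
{
    assert(p != NULL && defs != NULL && path != NULL);
    for (size_t i = 0; i < len; i++) {
        if (path[i] != p->def_block) continue;
        for (size_t j = i + 1; j < len; j++) {
            if (path[j] == p->use_block) return true;   /* use reached */
            if (defs[path[j]] & p->var) break;          /* definition killed */
        }
    }
    return false;
}

/* Marks pairs covered by one executed trace; returns the number of pairs
   covered so far. */
size_t record_path(const DuPair *pairs, size_t npairs, const unsigned *defs,
                   const int *path, size_t len, bool *covered)
{
    assert(pairs != NULL && covered != NULL);
    size_t total = 0;
    for (size_t k = 0; k < npairs; k++) {
        if (!covered[k] && path_covers(&pairs[k], defs, path, len))
            covered[k] = true;
        if (covered[k]) total++;
    }
    return total;
}

/* Variables defined in each block of the example (B0..B5, hand-derived). */
static const unsigned example_defs[6] = {
    VX | VY | VZ | VK, VB, VB, 0, VX | VZ, 0
};

/* Hand-enumerated block-level def-use pairs of the example. */
static const DuPair example_pairs[10] = {
    { 0, 3, VX }, { 0, 4, VX }, { 4, 3, VX }, { 4, 4, VX },
    { 0, 3, VZ }, { 0, 4, VZ }, { 4, 3, VZ }, { 4, 4, VZ },
    { 1, 4, VB }, { 2, 4, VB },
};
```

The example program takes no input, so its only execution is the trace B0, B2, B3, B4, B3, B5, which covers 7 of these 10 pairs. The pairs (B4, B4, x), (B4, B4, z), and (B1, B4, b) stay uncovered, which is the kind of gap such a tool would report.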
How Phoenix eases the task of writing the tool
Data flow testing is a structural (white-box) testing technique based on information obtained from static data flow analysis of the application code. Phoenix has built-in data flow analysis, computation of static single assignment form, etc.
To compute which paths are covered by tests, the code has to be instrumented. Traditionally, you would need to write a program that reads in a program file, figures out what is code and what is data, identifies each function, basic block, and instruction, inserts the instrumentation code, adjusts for the added code, and then writes out a valid binary.
Phoenix supports code instrumentation at any desired point.
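To make the effect of instrumentation concrete, the sketch below shows the analyzed example with probes inserted by hand at the entry of each basic block. It is plain C written for illustration only: the `probe` function, the `trace` buffer, the block numbering B0..B5, and the name `instrumented_example` are hypothetical. A Phoenix-based tool would insert equivalent calls automatically at the IR level instead of editing the source.

```c
#include <assert.h>
#include <stddef.h>

#define TRACE_MAX 64

static int trace[TRACE_MAX];   /* ids of executed basic blocks, in order */
static size_t trace_len = 0;

/* The call an instrumentation tool would insert at each block entry. */
static void probe(int block_id)
{
    assert(trace_len < TRACE_MAX);
    trace[trace_len++] = block_id;
}

/* The analyzed example with hand-inserted probes; returns the final x. */
int instrumented_example(void)
{
    int x, y, z, k, b;

    probe(0);
    x = 0; y = 0;
    z = 5;
    k = 2;
    if (k == z) {
        probe(1);
        b = 1;
    } else {
        probe(2);
        b = 2;
    }
    for (;;) {
        probe(3);                 /* loop header: evaluates x < z */
        if (!(x < z)) break;
        probe(4);                 /* loop body */
        x = x + z * b;
        z = z + 3;
    }
    probe(5);                     /* exit block */
    (void)y;                      /* y is assigned but never read */
    return x;
}
```

Running `instrumented_example()` once leaves the trace B0, B2, B3, B4, B3, B5 in the buffer, which is the input that a coverage computation for data flow criteria needs.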
Conclusions
Strengths of Phoenix for use in education and research:
  A very powerful and flexible framework and toolset for areas such as compiler construction, program transformation, etc.
  The existing compiler is easy to modify or extend because its internal parts and algorithms are all exposed.
  A single Phoenix-based tool automatically covers multiple .NET languages because they all use the same back end.
Potential drawbacks of using Phoenix:
  It is Microsoft proprietary technology, with no support for, e.g., Java.
  It is still in development; e.g. the documentation is still unfinished in some parts.