Working with Bytecodes: IRBuilder and InstructionStream
S. Ducasse, Marcus Denker



License: CC-Attribution-ShareAlike

Reasons for working with Bytecode
- Generating bytecode
  - Implementing compilers for other languages
  - Experimentation with new language features
- Parsing and interpretation
  - Analysis
  - Decompilation (for systems without source)
  - Pretty printing
  - Interpretation: debugger, profiler

Overview
- Squeak bytecodes
- Examples: generating bytecode with IRBuilder
- Decoding bytecodes
- Bytecode execution

The Squeak Virtual Machine
- The virtual machine provides a virtual processor
  - Bytecode: the 'machine code' of the virtual machine
- Smalltalk (like Java): stack machine
  - Easy to implement interpreters for different processors
  - Most hardware processors are register machines
- Squeak VM: implemented in Slang
  - Slang: subset of Smalltalk ("C with Smalltalk syntax")
  - Translated to C

Bytecode in the CompiledMethod
- CompiledMethod format:
  - Header: number of temps, literals...
  - Literals: array of all literal objects
  - Bytecode
  - Trailer: pointer to source
- Try it:
    (Number>>#asInteger) inspect
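The parts of a CompiledMethod can also be queried directly instead of opening an inspector. The following is a minimal workspace sketch, assuming a Squeak image in which CompiledMethod understands numTemps, numLiterals, literals, initialPC, endPC and symbolic; these accessors are not named on the slide. Evaluate each line with "print it".

```smalltalk
| method |
method := Number>>#asInteger.
method numTemps.      "header: number of temporaries"
method numLiterals.   "header: number of literal slots"
method literals.      "the literal objects, e.g. the selector #truncated"
method initialPC.     "index of the first bytecode"
method endPC.         "index of the last bytecode"
method symbolic.      "the bytecodes as readable text"
```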

Bytecodes: Single or multibyte
- Different forms of bytecodes:
  - Single bytecodes
    - Example: 112: push self
  - Groups of similar bytecodes
    - 16: push temp 1
    - 17: push temp 2
    - up to 31
    - Byte layout: type (4 bits), offset (4 bits)
  - Multibyte bytecodes
    - Problem: a 4-bit offset may be too small
    - Solution: use the following byte as offset
    - Example: jumps need to encode large jump offsets
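To make the split into a 4-bit type and a 4-bit offset concrete, here is a small workspace sketch that takes the byte 17 (push temp 2 in the list above) apart. It uses only integer bit operations; the variable names are illustrative.

```smalltalk
| byte type offset |
byte := 17.
type := byte bitShift: -4.      "upper 4 bits: 1, shared by the whole group 16 to 31"
offset := byte bitAnd: 16r0F.   "lower 4 bits: 1, i.e. the second temp (offsets start at 0)"
```

Every byte from 16 to 31 has the same upper half, which is why one group can address at most 16 temps before a multibyte form is needed.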

Example: Number>>asInteger
- Smalltalk code:
    Number>>asInteger
        "Answer an Integer nearest the receiver toward zero."
        ^self truncated
- Symbolic bytecode:
    9 self
    10 send: truncated
    11 returnTop

Example: Step by Step
- 9 self
  - The receiver (self) is pushed on the stack
- 10 send: truncated
  - Bytecode 208: send literal selector 1
  - Get the selector from the first literal
  - Start message lookup in the class of the object that is on top of the stack
  - The result is pushed on the stack
- 11 returnTop
  - Return the object on top of the stack to the calling method

Squeak Bytecodes
- 256 bytecodes, four groups:
  - Stack bytecodes: stack manipulation (push / pop / dup)
  - Send bytecodes: invoke methods
  - Return bytecodes: return to caller
  - Jump bytecodes: control flow inside a method

Stack Bytecodes
- Push and store, e.g. temps, instVars, literals
  - e.g. push instance variable
- Push constants (false/true/nil/1/0/2/-1)
- Push self, thisContext
- Duplicate top of stack
- Pop

Sends and Returns
- Sends: receiver is on top of stack
  - Normal send
  - Super sends
  - Hard-coded sends for efficiency, e.g. +, -
- Returns
  - Return top of stack to the sender
  - Return from a block
  - Special bytecodes for return self, nil, true, false (for efficiency)

Jump Bytecodes
- Control flow inside one method
- Used to implement control flow efficiently
- Example:
    ^ 1<2 ifTrue: ['true']
  compiles to:
    9 pushConstant: 1
    10 pushConstant: 2
    11 send: <
    12 jumpFalse: 15
    13 pushConstant: 'true'
    14 jumpTo: 16
    15 pushConstant: nil
    16 returnTop

What you should have learned
- Dealing with bytecodes directly is possible, but very boring.
- We want reusable abstractions that hide the details (e.g. the different send bytecodes)
- We would like to have frameworks for
  - Generating bytecode easily
  - Parsing bytecode
  - Evaluating bytecode

Generating Bytecodes
- IRBuilder
  - Part of the new compiler (on SqueakMap)
  - Like an assembler for Squeak

IRBuilder: Simple Example
- Number>>asInteger
    iRMethod := IRBuilder new
        rargs: #(self);      "receiver and args"
        pushTemp: #self;
        send: #truncated;
        returnTop;
        ir.
    aCompiledMethod := iRMethod compiledMethod.
    aCompiledMethod valueWithReceiver: 3.5 arguments: #()

IRBuilder: Stack Manipulation
- popTop: remove the top of stack
- pushDup: push top of stack on the stack
- pushLiteral:
- pushReceiver: push self
- pushThisContext

IRBuilder: Symbolic Jumps
- Jump targets are resolved
- Example: false ifTrue: ['true'] ifFalse: ['false']
    iRMethod := IRBuilder new
        rargs: #(self);              "receiver and args"
        pushLiteral: false;
        jumpAheadTo: #false if: false;
        pushLiteral: 'true';         "ifTrue: ['true']"
        jumpAheadTo: #end;
        jumpAheadTarget: #false;
        pushLiteral: 'false';        "ifFalse: ['false']"
        jumpAheadTarget: #end;
        returnTop;
        ir.

IRBuilder: Instance Variables
- Access by offset
- Read: getField:
  - Receiver on top of stack
- Write: setField:
  - Receiver and value on stack
- Example: set the first instance variable to 2
    iRMethod := IRBuilder new
        rargs: #(self);      "receiver and args declarations"
        pushLiteral: 2;
        pushTemp: #self;
        setField: 1;
        pushTemp: #self;
        returnTop;
        ir.
    aCompiledMethod := iRMethod compiledMethod.
    aCompiledMethod valueWithReceiver: arguments: #()

IRBuilder: Temporary Variables
- Accessed by name
- Define with addTemp: / addTemps:
- Read with pushTemp:
- Write with storeTemp:
- Example: set variables a and b, return value of a
    iRMethod := IRBuilder new
        rargs: #(self);      "receiver and args"
        addTemps: #(a b);
        pushLiteral: 1;
        storeTemp: #a;
        pushLiteral: 2;
        storeTemp: #b;
        pushTemp: #a;
        returnTop;
        ir.

IRBuilder: Sends
- Normal send:
    builder pushLiteral: 'hello'.
    builder send: #size.
- Super send:
    ....
    builder send: #selector toSuperOf: aClass.
  - The second parameter specifies the class where the lookup starts.
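As a sketch of a normal send that takes an argument, the following combines only builder messages that appear in these slides (rargs:, pushLiteral:, send:, returnTop, ir) to build a method equivalent to ^3 + 4. It assumes that the IRBuilder from the new compiler is loaded and that send: takes the receiver and the argument from the stack in the order they were pushed. The receiver passed at the end is arbitrary, since the generated method never uses self.

```smalltalk
| iRMethod aCompiledMethod |
iRMethod := IRBuilder new
    rargs: #(self);      "receiver and args"
    pushLiteral: 3;      "receiver of +"
    pushLiteral: 4;      "argument of +"
    send: #+;
    returnTop;
    ir.
aCompiledMethod := iRMethod compiledMethod.
aCompiledMethod valueWithReceiver: nil arguments: #()
```

If these assumptions hold, the last expression should answer 7.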

IRBuilder: Lessons learned
- IRBuilder: easy bytecode generation
- Not part of Squeak yet
- Will be added to Squeak 3.9 or 4.0
- Next: decoding bytecode

Parsing and Interpretation
- First step: parse bytecode
  - Enough for easy analysis, pretty printing, decompilation
- Second step: interpretation
  - Needed for simulation, complex analysis (e.g., profiling)
- Squeak provides frameworks for both:
  - InstructionStream/InstructionClient (parsing)
  - ContextPart (interpretation)

The InstructionStream Hierarchy
    InstructionStream
        ContextPart
            BlockContext
            MethodContext
        Decompiler
        InstructionPrinter
        InstVarRefLocator
        BytecodeDecompiler

InstructionStream
- Parses the byte-encoded instructions
- State:
  - pc: program counter
  - sender: the method (bad name!)

    Object subclass: #InstructionStream
        instanceVariableNames: 'sender pc'
        classVariableNames: 'SpecialConstants'
        poolDictionaries: ''
        category: 'Kernel-Methods'

Usage
- Generate an instance:
    instrStream := InstructionStream on: aMethod
- Now we can step through the bytecode with:
    instrStream interpretNextInstructionFor: client
- This calls methods on a client object for the type of bytecode, e.g.
  - pushReceiver
  - pushConstant: value
  - pushReceiverVariable: offset

InstructionClient
- Abstract superclass
- Defines empty methods for all methods that InstructionStream calls on a client
- For convenience only: clients don't need to inherit from this

    Object subclass: #InstructionClient
        instanceVariableNames: ''
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Kernel-Methods'
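Because InstructionClient supplies empty defaults, a client only has to override the callbacks it cares about. The sketch below collects the selectors of all sends in a method. The class name SelectorCollector and the category 'Bytecode-Examples' are hypothetical, and the sketch assumes that InstructionStream reports sends through send:super:numArgs:, a callback that is not named in these slides. Methods are written in the Class>>selector notation used throughout the deck and have to be entered in a browser.

```smalltalk
"Hypothetical client class, not part of Squeak."
InstructionClient subclass: #SelectorCollector
    instanceVariableNames: 'selectors'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'Bytecode-Examples'

SelectorCollector>>selectors
    "Answer the selectors seen so far, creating the collection on first use."
    ^selectors ifNil: [selectors := OrderedCollection new]

SelectorCollector>>send: selector super: supered numArgs: numberArguments
    "Called for every send bytecode (assumed callback name)."
    self selectors add: selector
```

Usage in a workspace:

```smalltalk
| method scanner client |
method := Number>>#asInteger.
scanner := InstructionStream on: method.
client := SelectorCollector new.
[scanner pc <= method endPC]
    whileTrue: [scanner interpretNextInstructionFor: client].
client selectors
```

For Number>>asInteger the collection should contain #truncated, matching the symbolic bytecode shown earlier.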

Example: A test
    InstructionClientTest>>testInstructions
        "Just interpret all methods of Object"
        | methods client scanner |
        methods := Object methodDict values.
        client := InstructionClient new.
        methods do: [:method |
            scanner := InstructionStream on: method.
            [scanner pc <= method endPC] whileTrue: [
                self shouldnt: [scanner interpretNextInstructionFor: client] raise: Error]]

Example: Printing Bytecodes
- InstructionPrinter: print the bytecodes as human-readable text
- Example: print the bytecode of Number>>asInteger:
    String streamContents: [:str |
        (InstructionPrinter on: Number>>#asInteger) printInstructionsOn: str]
- Result:
    '9 self
    10 send: truncated
    11 returnTop
    '

InstructionPrinter
- Class definition:
    InstructionClient subclass: #InstructionPrinter
        instanceVariableNames: 'method scanner stream oldPC indent'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Kernel-Methods'

InstructionPrinter
- Main loop:
    InstructionPrinter>>printInstructionsOn: aStream
        "Append to the stream, aStream, a description of each bytecode in the instruction stream."
        | end |
        stream := aStream.
        scanner := InstructionStream on: method.
        end := method endPC.
        oldPC := scanner pc.
        [scanner pc <= end]
            whileTrue: [scanner interpretNextInstructionFor: self]

InstructionPrinter
- Overrides methods from InstructionClient to print the bytecodes as text
- e.g. the method for pushReceiver:
    InstructionPrinter>>pushReceiver
        "Print the Push Active Context's Receiver on Top Of Stack bytecode."
        self print: 'self'

Example: InstVarRefLocator
    InstructionClient subclass: #InstVarRefLocator
        instanceVariableNames: 'bingo'
        ....

    interpretNextInstructionUsing: aScanner
        bingo := false.
        aScanner interpretNextInstructionFor: self.
        ^bingo

    popIntoReceiverVariable: offset
        bingo := true

    pushReceiverVariable: offset
        bingo := true

    storeIntoReceiverVariable: offset
        bingo := true

InstVarRefLocator
- CompiledMethod>>#hasInstVarRef
    hasInstVarRef
        "Answer whether the receiver references an instance variable."
        | scanner end printer |
        scanner := InstructionStream on: self.
        printer := InstVarRefLocator new.
        end := self endPC.
        [scanner pc <= end] whileTrue: [
            (printer interpretNextInstructionUsing: scanner) ifTrue: [^true]].
        ^false

InstVarRefLocator
- Example of a simple bytecode analyzer
- Usage:
    aMethod hasInstVarRef
    (TestCase>>#debug) hasInstVarRef    "true"
    (Integer>>#+) hasInstVarRef    "false"

Example: Decompiling to IR
- BytecodeDecompiler
    InstructionStream subclass: #BytecodeDecompiler
        instanceVariableNames: 'irBuilder blockFlag'
        classVariableNames: ''
        poolDictionaries: ''
        category: 'Compiler-Bytecodes'
- Usage:
    BytecodeDecompiler new decompile: (Number>>#asInteger)

Decompiling
- Uses IRBuilder for building IR
- e.g. code for the bytecode pushReceiver:
    BytecodeDecompiler>>pushReceiver
        irBuilder pushReceiver

ContextPart: Execution Semantics
- Sometimes we need more than parsing
  - "Stepping" in the debugger
  - System simulation for profiling

    InstructionStream subclass: #ContextPart
        instanceVariableNames: 'stackp'
        classVariableNames: 'PrimitiveFailToken QuickStep'
        poolDictionaries: ''
        category: 'Kernel-Methods'

Simulation
- Provides a complete bytecode interpreter
- Run a block with the simulator:
    (ContextPart runSimulated: [3 factorial])

Profiling: MessageTally
- Usage:
    MessageTally tallySends: [3 factorial]
- Output:
    This simulation took 0.0 seconds.
    **Tree**
    1 SmallInteger(Integer)>>factorial
    **Leaves**
    4 SmallInteger(Integer)>>factorial
    3 SmallInteger(Integer)>>factorial
    1 BlockContext>>DoIt
- Other example:
    MessageTally tallySends: ['3' + 1]
