Dynamic Compilation Vijay Janapa Reddi

Dynamic Compilation: Memory, OS and Exception Emulation
Vijay Janapa Reddi, The University of Texas at Austin

Today's Class
- Quick recap of memory state mapping
- Discuss memory protection
- Handle OS call and trap emulation

Compatibility
- How accurately does the emulation of the guest's functional behavior compare with its behavior on its native platform?
  - two systems are compatible if, in response to the same sequence of input values, they give the same sequence of outputs
  - does not take performance into account
- Intrinsic compatibility
  - precise behavior, difficult to achieve
  - depends on the VM design (only)
- Extrinsic compatibility
  - accuracy within some well-defined constraints
  - acceptable for most systems
  - depends on the VM design + external dependencies (e.g. libraries)

Compatibility Framework
- Boundary
  - compatibility is verified where control is transferred between user code and the OS
  - establish a one-to-one mapping between control transfer points on the native platform and in the VM
- Why at the OS level?
[Figure: on the native platform, the application memory image sends OS calls and traps to the native operating system; in the VM, the application memory image runs on the emulation engine, and OS calls and traps pass through the OS call emulator and the exception emulator to the host operating system.]

State Mapping
- Map user-managed register & memory state
  - guest data and code map into the host's address space
  - host address space includes runtime data and code
  - guest state does not have to be maintained in the same type of resource
- Register mapping
  - straightforward
  - depends on the number of guest and host registers
- What's the optimization here?
[Figure: guest registers map onto host registers (host register space); guest code, guest data, VM code and VM data share the host ABI address space.]

Memory State Mapping
- Memory address space mapping
  - map guest address space to host address space
  - maintain protection requirements
- Methods (these result in different performance and flexibility levels)
  - direct translation
  - software supported translation table
- What are the trade-offs?
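
The two mapping methods named on this slide can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the 4 KiB page size, the `TranslationTable` class, and the example addresses are hypothetical and do not come from the slides.

```python
PAGE_SIZE = 4096  # assumed guest page size, for illustration only


def direct_translate(guest_addr, guest_base):
    """Direct translation: the guest space sits at a fixed offset in host space."""
    return guest_base + guest_addr


class TranslationTable:
    """Software supported translation: one table lookup per guest access."""

    def __init__(self):
        self.pages = {}  # guest page number -> host base address of that page

    def map_page(self, guest_page, host_base):
        self.pages[guest_page] = host_base

    def translate(self, guest_addr):
        page, offset = divmod(guest_addr, PAGE_SIZE)
        if page not in self.pages:
            raise LookupError(f"guest address {guest_addr:#x} is not mapped")
        return self.pages[page] + offset


table = TranslationTable()
table.map_page(0x10, 0x7F000000)           # guest page 0x10 lives at this host address
host_addr = table.translate(0x10004)       # table lookup plus offset
flat_addr = direct_translate(0x10004, 0x40000000)  # single addition
```

The direct form costs one addition but needs the whole guest space to fit contiguously in the host space; the table form pays a lookup on every access but lets guest pages be placed anywhere.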

Memory State Mapping: Summary
- Runtime + guest space <= host space
  - direct memory translation
  - can achieve performance and intrinsic compatibility
- Runtime + guest space > host space
  - software translation
  - will lose intrinsic compatibility, performance or both
- Guest space == host space
  - happens often, same-ISA dynamic translation
  - "no room" for runtime
  - use software translation, extrinsic compatibility

Memory Architecture Emulation
- Aspects of the ABI memory architecture that need to be emulated
  - Address space structure: segmented or flat
  - Access privilege types: combination of N, R, W, E (none, read, write, execute)
  - Protection / allocation granularity: size of the smallest block of memory that the OS can allocate
[Figure: example address space from 0000 0000 to 7FFF FFFF, with boundaries marked at 0001 0000 and 7FFE FFFF and regions labeled Reserved by System, Free, Committed and Reserved.]

Guest Memory Protection
- Access restrictions placed on different regions of memory
- Can be achieved during software supported translation
  - slow and inefficient, but very flexible
- Host supported memory protection
  - runtime sets access restrictions using OS system calls
  - OS delivers signals to runtime on access violations
  - protection faults reported to runtime
  - requires host OS support
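
A minimal sketch of protection checking done during software supported translation, which shows why this route is flexible but slow: every guest access goes through a check. The `ProtectedTable` class, the `GuestFault` exception, the bit encoding of R/W/E, and the page size are all hypothetical choices for this example.

```python
class GuestFault(Exception):
    """Raised when a guest access violates the protection stored in the table."""


R, W, E = 1, 2, 4  # access privilege bits; 0 plays the role of N (no access)


class ProtectedTable:
    def __init__(self, page_size=4096):
        self.page_size = page_size
        self.entries = {}  # guest page number -> (host base address, permission bits)

    def map_page(self, guest_page, host_base, perms):
        self.entries[guest_page] = (host_base, perms)

    def access(self, guest_addr, needed):
        """Translate guest_addr, checking that all `needed` bits are granted."""
        page, offset = divmod(guest_addr, self.page_size)
        entry = self.entries.get(page)
        if entry is None:
            raise GuestFault(f"unmapped guest address {guest_addr:#x}")
        host_base, perms = entry
        if perms & needed != needed:
            raise GuestFault(f"protection fault at guest address {guest_addr:#x}")
        return host_base + offset
```

For example, after `t = ProtectedTable(); t.map_page(2, 0x50000000, R)`, a read via `t.access(0x2010, R)` returns a host address, while `t.access(0x2010, W)` raises `GuestFault`.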

Host OS Support
- Direct mechanism
  - runtime sets protection levels via system calls (mprotect)
  - protection faults trap to a handler in the runtime (SIGSEGV)
- Indirect mechanism
  - mapping a region of memory to a file with access protections
  - mmap() in Linux
[Figure: a file (VM memory) in physical memory mapped into the virtual machine's virtual address space. Labels: references succeed; free pages, references cause page faults; read-only mappings, writes cause protection faults.]

Guest Memory Protection (2)
- Implementation issues
- What are the two obvious issues here? Think OS page tables...

Guest Memory Protection (2)
- Implementation issues
  - host and guest ISAs provide different protections
    - host provides a superset of guest protections
    - host provides a subset of guest protections
  - host and guest support different page sizes
    - difficult to map access privileges
    - simple if guest page size is a multiple of host page size
[Figure: a host page overlapping a guest data page and a guest code page.]
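
The easy case from this slide, where the guest page size is a multiple of the host page size, can be sketched as follows. The function name is hypothetical, and the sketch assumes guest page 0 starts on a host page boundary.

```python
def host_pages_for_guest_page(guest_page, guest_page_size, host_page_size):
    """Return the host page numbers that exactly cover one guest page.

    Assumes guest page 0 is aligned to the start of host page 0.
    """
    if guest_page_size % host_page_size != 0:
        raise ValueError("guest page size must be a multiple of the host page size")
    ratio = guest_page_size // host_page_size
    first = guest_page * ratio
    return list(range(first, first + ratio))


pages = host_pages_for_guest_page(3, 8192, 4096)
```

Each guest page then corresponds to a whole number of host pages, so one guest protection setting becomes the same setting on every host page in the list. When the guest page is smaller than the host page, a single host page can hold guest pages with different privileges, which is the hard case the slide's figure illustrates.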

Protecting Runtime Memory
- Runtime and guest application share the same process address space
  - guest program can read/write portions of the runtime
- Addressing
  - software translation tables
  - hardware address translation, aided with runtime software protection checking
  - hardware for both address translation & protection checking
    - OS sets protections for emulation mode and runtime mode

Protecting Runtime Memory (2)
- Change protections on context switch from runtime to translated code
- Translated code can only access the guest memory image
- Translated code cannot jump outside the code cache (emulation s/w sets up links)
- Multiple system calls at context switch time: high overhead
- Why worry about all this?
[Figure: two copies of the address space (runtime data, runtime code, code cache, guest data, guest code), labeled "Inside VM mode" and "Emulation mode", each region marked with its protection setting (N, R, R/W or Ex).]

Self-Referencing/Modifying Code
- Program may refer to itself
- Program may attempt to modify itself
- Why do programs do this?

Protecting Against SRC/SMC
- Solution
  - maintain guest program code memory image
  - load/store addresses are mapped into source memory region
  - loads from code region are OK
  - writes to code region (SMC) trigger a segmentation fault
    - flush relevant cache entry, enable writes to code region, interpret the code block that caused the fault, re-enable write protection
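
The four-step response to a write into the code region can be modeled as a toy simulation. This is a sketch only: the `SmcGuard` class is hypothetical, a method call stands in for the segmentation fault, and "interpreting the faulting block" is reduced to performing the single store.

```python
class SmcGuard:
    """Toy model of write-protected guest code pages plus a code cache."""

    def __init__(self, code_pages):
        self.write_protected = set(code_pages)  # guest code pages start protected
        self.code_cache = {}                    # guest page -> translated block
        self.memory = {}                        # (page, offset) -> stored value

    def translate(self, page):
        self.code_cache[page] = f"translation of page {page}"

    def store(self, page, offset, value):
        if page in self.write_protected:
            self._on_write_fault(page, offset, value)  # stands in for SIGSEGV
        else:
            self.memory[(page, offset)] = value

    def _on_write_fault(self, page, offset, value):
        self.code_cache.pop(page, None)      # 1. flush the stale translation
        self.write_protected.discard(page)   # 2. enable writes to the code region
        self.memory[(page, offset)] = value  # 3. "interpret" the faulting store
        self.write_protected.add(page)       # 4. re-enable write protection


guard = SmcGuard(code_pages=[7])
guard.translate(7)
guard.store(7, 0, 0x90)  # guest modifies its own code
```

After the store, the translation for page 7 is gone (it will be retranslated on next use), the new value is in the guest memory image, and the page is protected again so that later modifications are also caught.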

Allowing/Detecting SRC/SMC
[Figure: two panels, each showing data, original code, the translator and translated code. Self-referencing code: the self reference goes to the original code. Self-modifying code: the original code is write protected.]

Exception Emulation
- Types of exceptions
  - trap: produced by a specific program instruction during program execution
  - interrupt: an external event, not associated with a particular instruction
- Precise exceptions
  - all prior instructions have committed
  - none of the following instructions have committed
- Further division of exceptions for a process VM
  - ABI visible: exceptions returned to the application via an OS signal
    - example: memory protection fault
  - ABI invisible: ABI is unaware of the exception's occurrence
    - example: timer interrupt

Trap Detection
- Detecting trap conditions
  - interpretive trap detection: checking trap conditions during the interpretation routine
  - trap condition detected by the host OS
    - sometimes referred to as lazy mode
- Implementation
  - runtime registers all exceptions with the host OS
  - all signals registered by the guest OS are recorded
  - on receiving an OS signal
    - if the signal is guest-registered, then send to guest signal-handling code
    - else, runtime handles the trap condition
  - special tables needed during binary translation. Why?
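
The dispatch rule for incoming signals can be written down as a short sketch. The `SignalRouter` class and its method names are hypothetical, and no real signal handlers are installed; the code only models the decision the runtime makes once the host OS has delivered a signal to it.

```python
import signal


class SignalRouter:
    """Models the runtime's choice between guest and runtime handling."""

    def __init__(self):
        self.guest_handlers = {}   # signals the guest registered, with handlers
        self.runtime_handled = []  # signals the runtime dealt with itself

    def guest_registers(self, signum, handler):
        # Record the guest's registration instead of passing it to the host OS.
        self.guest_handlers[signum] = handler

    def on_host_signal(self, signum):
        handler = self.guest_handlers.get(signum)
        if handler is not None:
            return handler(signum)            # send to guest signal-handling code
        self.runtime_handled.append(signum)   # else the runtime handles the trap
        return None


router = SignalRouter()
router.guest_registers(signal.SIGFPE, lambda s: "guest handler ran")
guest_result = router.on_host_signal(signal.SIGFPE)
runtime_result = router.on_host_signal(signal.SIGSEGV)
```

A real runtime would also have to reconstruct the guest's precise state before entering the guest handler, which is where the special tables mentioned on the slide come in.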

Interrupt Handling
- Interrupts are not associated with any instruction
  - a small response latency is acceptable
  - maintaining precise state is easier than for traps
- Receiving an interrupt during interpretation
  - complete current routine
  - service interrupt
- Receiving an interrupt during binary translation
  - execution may not be at an interruptible point
  - precise recovery at arbitrary points is difficult
  - no idea when control will return to the emulation manager from the code cache
- What's an example here? How do we handle this?

Interrupt Handling (2)
- Solving the interrupt response time problem during binary translation
  - on interrupt, control is passed to the runtime
  - runtime unlinks the current translation block from the next block
  - control is returned back to translated code
  - control returns to the runtime after the end of the current block
  - runtime handles the interrupt
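
The unlinking sequence above can be simulated with chained blocks. This is a sketch under simplifying assumptions: `Block`, `execute_from` and `interrupt_during` are hypothetical names, a block is just a name with an optional link, and re-establishing the link afterwards is an added detail that the slide does not spell out.

```python
class Block:
    def __init__(self, name):
        self.name = name
        self.next = None  # direct link to the next translated block, if chained


def execute_from(block, trace):
    """Run chained blocks until one has no link, then fall back to the runtime."""
    while True:
        trace.append(block.name)
        if block.next is None:
            return block
        block = block.next


def interrupt_during(current, trace):
    """Model an interrupt arriving while `current` is executing."""
    saved_link = current.next
    current.next = None                # runtime unlinks current block from the next
    execute_from(current, trace)       # resume; control returns at end of the block
    trace.append("interrupt handled")  # runtime now services the interrupt
    current.next = saved_link          # restore the link (assumption, see lead-in)
    if saved_link is not None:
        execute_from(saved_link, trace)


a, b, c = Block("A"), Block("B"), Block("C")
a.next, b.next = b, c
trace = []
interrupt_during(a, trace)
```

Without the unlink, execution would run A, B and C back to back and the interrupt would wait for the whole chain; with it, the delay is bounded by the remainder of one block.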

Determining Precise State
- Interpreter
  - easy, each source instruction has its own routine
  - source PC and state updated in each instruction routine
  - from … "lec4-threadedinterp"

Determining Precise State (2)
- Binary Translation
  - hard, first determine the source PC
  - source PC not continuously updated
  - maintain reverse translation table mapping target PC to source PC, inefficient
  - target instruction can map to multiple source instructions
  - target code may be optimized and re-ordered

Reverse Translation Table
- 1: trap occurs
- 2: signal returns target PC
- 3: search side table
- 4: find corresponding source PC
- Table performance is key. Why?
[Figure: source code blocks A, B, ..., N and the code cache, connected through a side table. The side table lists target PCs (start PC and end PC of each translated block) alongside the corresponding source PCs (Src PC1, Src PC2, ..., Src PCm, Src PCn).]
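
Steps 3 and 4, searching the side table for the source PC, can be sketched with a sorted list and binary search. The `ReverseTable` class and the example PCs are hypothetical; the sketch assumes each entry records the target PC at which the code for one source instruction begins.

```python
import bisect


class ReverseTable:
    """Side table mapping target PCs in the code cache back to source PCs."""

    def __init__(self, entries):
        # entries: (target_pc, source_pc) pairs, one per translated source instruction
        self.entries = sorted(entries)
        self.target_pcs = [target for target, _ in self.entries]

    def source_pc(self, target_pc):
        """Find the entry with the largest target PC that is <= target_pc."""
        i = bisect.bisect_right(self.target_pcs, target_pc) - 1
        if i < 0:
            raise LookupError(f"target PC {target_pc:#x} precedes the table")
        return self.entries[i][1]


side_table = ReverseTable([(0x1000, 0x400), (0x1008, 0x404), (0x1014, 0x408)])
trapping_source_pc = side_table.source_pc(0x100C)  # trap inside the second entry
```

Binary search keeps each lookup logarithmic in the table size. Note that this simple one-to-one layout does not cover the cases the slides warn about, where target code is re-ordered or one target instruction corresponds to several source instructions.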

Restoring Precise State
- Register state (during binary translation)
  - 2 cases, based on whether the source-to-target register mapping remains constant throughout emulation
  - if not constant, side tables can be maintained, or analyze from the start of the translation block again
- Memory state (during binary translation)
  - changed by store instructions
  - do not reorder stores, or other potentially trapping instructions with stores
  - restricts optimizations

OS Call Emulation
- A VM emulates the function or semantics of the guest's OS calls
  - it does not emulate individual instructions in the guest OS
- Different from instruction emulation
  - given enough time, any function can be performed on the input operands to produce a result
  - most ISAs perform the same functions, so ISA emulation is always possible
  - with an OS, it may be impossible for the host to provide some guest function (operation semantic mismatch)

OS Call Emulation (2)
- Different source and target OS
  - semantic translation or mapping required
  - may be difficult or impossible
  - ad-hoc process on a case-by-case basis
- Same source and target OS
  - emulate the guest calling convention
    - e.g. arguments by registers versus memory
  - guest system call jumps to runtime, which provides wrapper code

OS Call Emulation (3)
- Same source and target OS (cont.)
  - runtime may handle some guest OS calls itself (signals, memory management)
  - handling abnormal conditions like callbacks, runtime maintaining program control, lack of documentation
[Figure: binary translation of a system call.
  Source code segment: s_inst1, s_inst2, s_system_call X, s_inst4, s_inst5
  Code cache segment: t_inst1, t_inst2, jump runtime, t_inst4, t_inst5
  Runtime wrapper code: copy/convert arg1, copy/convert arg2, ..., t_system_call X, copy/convert return val, return to t_inst4]
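
The wrapper code in the figure can be sketched as follows. All names are hypothetical: `host_write` is a stand-in for the host system call rather than a real OS interface, and the guest is assumed to pass arguments in memory (a list modeling its stack) and to expect the result in a register called `r0`.

```python
def host_write(fd, data):
    """Stand-in for the host system call (t_system_call X); returns bytes written."""
    return len(data)


def write_wrapper(guest_stack, guest_regs):
    fd = int(guest_stack[0])       # copy/convert arg1
    data = bytes(guest_stack[1])   # copy/convert arg2
    result = host_write(fd, data)  # t_system_call X
    guest_regs["r0"] = result      # copy/convert return val
    return guest_regs              # control then returns to t_inst4


WRAPPERS = {"write": write_wrapper}  # one wrapper per emulated guest OS call


def jump_runtime(call_name, guest_stack, guest_regs):
    """Entry point reached from the translated code's 'jump runtime'."""
    return WRAPPERS[call_name](guest_stack, guest_regs)


regs = jump_runtime("write", [1, bytearray(b"hello")], {})
```

Keeping one wrapper per call in a table matches the case-by-case nature of OS call emulation: each wrapper encodes the argument and return-value conversions for exactly one guest call.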

Nasty Realities
- A Windows user process interacts intimately with the OS
- Callback entry is through specific dispatch points
- Callback returns are to specific points in the user process code
- Now you know why…

Nasty Realities Continued: Instruction Set Issues
- Register architectures: register mappings, reservation of special registers
- Condition codes: lazy evaluation as needed
- Data formats and arithmetic: floating point, decimal, MMX
- Address resolution: byte vs word addressing
- Data alignment: natural vs arbitrary
- Byte order: big/little endian
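
Lazy evaluation of condition codes, mentioned in the list above, can be illustrated with a small sketch. The `LazyFlags` class is hypothetical and covers only the zero and carry flags of an unsigned add; the idea is to save the operands of the last flag-setting instruction and compute a flag only when some later instruction reads it.

```python
class LazyFlags:
    def __init__(self, width=32):
        self.mask = (1 << width) - 1
        self.pending = None   # operands of the most recent flag-setting add
        self.evaluations = 0  # counts how often flags were actually computed

    def record_add(self, a, b):
        """Emulate an add: produce the result, but only remember the operands."""
        self.pending = (a & self.mask, b & self.mask)
        return (a + b) & self.mask

    def _flags(self):
        self.evaluations += 1
        a, b = self.pending
        total = a + b
        return {"zero": (total & self.mask) == 0, "carry": total > self.mask}

    def zero(self):
        return self._flags()["zero"]

    def carry(self):
        return self._flags()["carry"]


flags = LazyFlags()
flags.record_add(1, 2)                   # flags from this add are never read
result = flags.record_add(0xFFFFFFFF, 1)
untouched = flags.evaluations            # still 0: no flag has been computed yet
is_zero = flags.zero()
has_carry = flags.carry()
```

Most flag results are overwritten before any instruction reads them, so deferring the computation avoids work for every add whose flags go unused.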