Assemblers, linkers, loaders

Slides:



Advertisements
Similar presentations
Memory Protection: Kernel and User Address Spaces  Background  Address binding  How memory protection is achieved.
Advertisements

MIPS ISA-II: Procedure Calls & Program Assembly. (2) Module Outline Review ISA and understand instruction encodings Arithmetic and Logical Instructions.
MIPS ISA-II: Procedure Calls & Program Assembly. (2) Module Outline Review ISA and understand instruction encodings Arithmetic and Logical Instructions.
Compiler construction in4020 – lecture 10 Koen Langendoen Delft University of Technology The Netherlands.
CS3350B Computer Architecture Winter 2015 Lecture 4
Assembler/Linker/Loader Mooly Sagiv html:// Chapter 4.3 J. Levine: Linkers & Loaders
Linkage Editors Difference between a linkage editor and a linking loader: Linking loader performs all linking and relocation operations, including automatic.
Linking & Loading CS-502 Operating Systems
3. Loaders & Linkers1 Chapter III: Loaders and Linkers Chapter goal: r To realize how a source program be loaded into memory m Loading m Relocation m Linking.
Chapter 3 Loaders and Linkers
Loaders and Linkers CS 230 이준원. 2 Overview assembler –generates an object code in a predefined format »COFF (common object file format) »ELF (executable.
Lecture 10: Linking and loading. Lecture 10 / Page 2AE4B33OSS 2011 Contents Linker vs. loader Linking the executable Libraries Loading executable ELF.
Linkers and Loaders 1 Linkers & Loaders – A Programmers Perspective.
CS 31003: Compilers ANIRUDDHA GUPTA 11CS10004 G2 CLASS DATE : 24/07/2013.
Computer Organization CS224 Fall 2012 Lesson 12. Synchronization  Two processors or threads sharing an area of memory l P1 writes, then P2 reads l Data.
1 Starting a Program The 4 stages that take a C++ program (or any high-level programming language) and execute it in internal memory are: Compiler - C++
Lec 9Systems Architecture1 Systems Architecture Lecture 9: Assemblers, Linkers, and Loaders Jeremy R. Johnson Anatole D. Ruslanov William M. Mongan Some.
Compilation (Semester A, 2013/14) Lecture 13: Assembler, Linker & Loader Noam Rinetzky Slides credit: Eli Bendersky, Mooly Sagiv & Sanjeev Setia.
1 Machine-Independent Features Automatic Library Search automatically incorporate routines from a subprogram library Loading Options.
Assembler/Linker/Loader Mooly Sagiv html:// Chapter 4.3.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Enabling the ARM Learning in INDIA ARM DEVELOPMENT TOOL SETUP.
MIPS coding. SPIM Some links can be found such as:
Topic 2d High-Level languages and Systems Software
C OMPUTER A RCHITECTURE & O RGANIZATION Assemblers and Compilers Engr. Umbreen Sabir Computer Engineering Department, University of Engg. & Technology.
5-1 Chapter 5 - Languages and the Machine Principles of Computer Architecture by M. Murdocca and V. Heuring © 1999 M. Murdocca and V. Heuring Principles.
CSE451 Linking and Loading Autumn 2002 Gary Kimura Lecture #21 December 9, 2002.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
April 23, 2001Systems Architecture I1 Systems Architecture I (CS ) Lecture 9: Assemblers, Linkers, and Loaders * Jeremy R. Johnson Mon. April 23,
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /29/2013 Lecture 13: Compile-Link-Load Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.
COMP3190: Principle of Programming Languages
CS412/413 Introduction to Compilers and Translators April 14, 1999 Lecture 29: Linking and loading.
Chapter 13 : Symbol Management in Linking
Different Types of Libraries
Compilation Lecture 13: Course summary: Advanced Topics Noam Rinetzky 1.
CSc 453 Linking and Loading
Linking I Topics Assembly and symbol resolution Static linking Systems I.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Hello world !!! ASCII representation of hello.c.
Object Files & Linking. Object Sections Compiled code store as object files – Linux : ELF : Extensible Linking Format – Windows : PE : Portable Execution.
Binding & Dynamic Linking Presented by: Raunak Sulekh(1013) Pooja Kapoor(1008)
Program Execution in Linux David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis St. Louis, MO
Lecture 3 Translation.
Instruction Set Architectures
Computer Architecture & Operations I
Computer Science 210 Computer Organization
System Programming and administration
Naming and Binding A computer system is merely a bunch of resources that are glued together with names Thus, much of system design is merely making choices.
The University of Adelaide, School of Computer Science
Linking & Loading.
Separate Assembly allows a program to be built from modules rather than a single source file assembler linker source file.
Program Execution in Linux
CS-3013 Operating Systems C-term 2008
Topic 2e High-Level languages and Systems Software
Computer Science 210 Computer Organization
Loaders and Linkers: Features
Machine Independent Features
Memory Management Tasks
Computer Organization and Design Assembly & Compilation
The Assembly Language Level
Linking & Loading CS-502 Operating Systems
Computer Architecture
Program Execution in Linux
10/6: Lecture Topics C Brainteaser More on Procedure Call
Linking & Loading CS-502 Operating Systems
An introduction to systems programming
Lecture 4: Instruction Set Design/Pipelining
Program Assembly.
Computer Architecture and Assembly Language
Computer Architecture and System Programming Laboratory
Presentation transcript:

Assemblers, linkers, loaders Compilation 0368-3133 2016/17a Lecture 12 Assemblers, linkers, loaders Noam Rinetzky

What is a compiler? “A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.” --Wikipedia

Portable/Retargetable code generation Stages of compilation Source code (program) Lexical Analysis Syntax Analysis Parsing Context Analysis Portable/Retargetable code generation Code Generation Target code (executable) Text Token stream AST AST + Sym. Tab. IR Assembly

Compilation  Execution AST + Sym. Tab. Source code (program) Lexical Analysis Syntax Analysis Parsing Context Analysis Portable/Retargetable code generation Target code (executable) IR Text Token stream AST Code Generation Executing program Runtime System Linker Assembler Loader Symbolic Addr Object File Executable File image

Program Runtime State Registers 0x11000 Code 0x22000 Static Data foo, extern_foo printf 0x22000 Static Data G, extern_G 0x33000 Stack x 0x88000 Heap 0x99000

Challenges goto L2  JMP 0x110FF G:=3  MOV 0x2200F, 0..011 foo()  CALL 0x130FF extern_G := 1  MOV 0x2400F, 0..01 extern_foo()  CALL 0x140FF printf()  CALL 0x150FF x:=2  MOV FP+32, 0…010 goto L2  JMP [PC +] 0x000FF 0x11000 Code foo, extern_foo printf 0x22000 Static Data G, extern_G 0x33000 Stack x 0x88000 Heap 0x99000

Assembly  Image Source program Compiler Assembly lang. program (.s) Assembler Machine lang. Module (.o): program (+library) modules Linker “compilation” time “execution” time Executable (“.exe”): Loader Libraries (.o) (dynamic loading) Image (in memory):

Outline Assembly Linker / Link editor Loader Static linking Dynamic linking

Assembly  Image Assembler Compiler Source file (e.g., utils) Assembly (.s) Assembler Compiler Source file (e.g., main) Assembly (.s) Assembler Compiler library Assembly (.s) Object (.o) Object (.o) Object (.o) Linker Executable (“.elf”) Loader Image (in memory):

Assembler Converts (symbolic) assembler to binary (object) code Object files contain a combination of machine instructions, data, and information needed to place instructions properly in memory Yet another(simple) compiler One-to one translation Converts constants to machine repr. (30…011) Resolve internal references Records info for code & data relocation

Object File Format Header: Admin info + “file map” Text Segment Data Segment Relocation Information Symbol Table Debugging Information Header: Admin info + “file map” Text seg.: machine instruction Data seg.: (Initialized) data in machine format Relocation info: instructions and data that depend on absolute addresses Symbol table: “exported” references + unresolved references

Handling Internal Addresses

Resolving Internal Addresses Two scans of the code Construct a table label  address Replace labels with values One scan of the code (Backpatching) Simultaneously construct the table and resolve symbolic addresses Maintains list of unresolved labels Useful beyond assemblers

Backpatching

Handling External Addresses Record symbol table in “external” table Exported (defined) symbols G, foo() Imported (required) symbols Extern_G, extern_bar(), printf() Relocation bits Mark instructions that depend on absolute (fixed) addresses Instructions using globals,

External references resolved by the Linker using the relocation info. Example External references resolved by the Linker using the relocation info.

Example of External Symbol Table

Assembler Summary Converts symbolic machine code to binary addl %edx, %ecx  000 0001 11 010 001 = 01 D1 (Hex) Format conversions 3  0x0..011 or 0x000000110…0 Resolves internal addresses Some assemblers support overloading Different opcodes based on types

Linker Merges object files to an executable Enables separate compilation Combine memory layouts of object modules Links program calls to library routines printf(), malloc() Relocates instructions by adjusting absolute references Resolves references among files

Linker Code Segment 1 Code Segment 1 100 100 Data Code Segment 2 foo Data Segment 1 Code Segment 2 Segment 2 400 100 500 420 580 ext_bar 250 zoo 280 650 Code Segment 1 foo ext_bar() Code Segment 1 100 Data Segment 1 120 200 Code Segment 2 ext_bar 150 zoo 180 300 Data Segment 2 450 380

Relocation information Information needed to change addresses Positions in the code which contains addresses Data Code Two implementations Bitmap Linked-lists

External References The code may include references to external names (identifiers) Library calls External data Stored in external symbol table

Example of External Symbol Table

Example

Linker (Summary) Merge several executables User mode Resolve external references Relocate addresses User mode Provided by the operating system But can be specific for the compiler More secure code Better error diagnosis

Linker Design Issues Merges Retain information about static length Code segments Data segments Relocation bit maps External symbol tables Retain information about static length Real life complications Aggregate initializations Object file formats Large library Efficient search procedures

Loader Brings an executable file from disk into memory and starts it running Read executable file’s header to determine the size of text and data segments Create a new address space for the program Copies instructions and data into memory Copies arguments passed to the program on the stack Initializes the machine registers including the stack ptr Jumps to a startup routine that copies the program’s arguments from the stack to registers and calls the program’s main routine

Program Loading Registers Code Segment 2 Data Segment 2 400 100 500 400 100 500 420 580 ext_bar 250 zoo 280 650 Code Segment 1 Segment 1 foo Registers Code Segment Static Data Stack Heap Loader Image Program Executable

Loader (Summary) Initializes the runtime state Part of the operating system Privileged mode Does not depend on the programming language “Invisible activation record”

Static Linking (Recap) Assembler generates binary code Unresolved addresses Relocatable addresses Linker generates executable code Loader generates runtime states (images)

Dynamic Linking Why dynamic linking? Shared libraries Dynamic loading Save space Consistency Dynamic loading Load on demand

What’s the challenge? Source program Compiler Assembly lang. program (.s) Assembler Machine lang. Module (.o): program (+library) modules Linker “compilation” time “execution” time Executable (“.exe”): Loader Libraries (.o) (dynamic linking) Image (in memory):

Position-Independent Code (PIC) Code which does not need to be changed regardless of the address in which it is loaded Enable loading the same object file at different addresses Thus, shared libraries and dynamic loading “Good” instructions for PIC: use relative addresses relative jumps reference to activation records “Bad” instructions for : use fixed addresses Accessing global and static data Procedure calls Where are the library procedures located?

How? “All problems in computer science can be solved by another level of indirection" Butler Lampson

PIC: The Main Idea Keep the global data in a table Refer to all data relative to the designated register

Per-Routine Pointer Table Record for every routine in a table foo &foo &D.S. 1 &D.S. 2 PT ext_bar &ext_bar &D.S. 2 &zoo &D.S. 2 PT ext_bar

Per-Routine Pointer Table Record for every routine in a table foo &foo &D.S. 1 foo Data Segment 1 Code Segment 2 Segment 2 580 ext_bar zoo Code Segment 1 ext_g &D.S. 2 PT ext_bar &ext_bar &D.S. 2 &zoo &D.S. 2 PT ext_bar

Per-Routine Pointer Table Record for every routine in a table Record used as a address to procedure Caller: Load Pointer table address into RP Load Code address from 0(RP) into RC Call via RC Callee: RP points to pointer table Table has addresses of pointer table for sub-procedures Other data RP .func

PIC: The Main Idea Keep the global data in a table Refer to all data relative to the designated register Efficiency: use a register to point to the beginning of the table Troublesome in CISC machines

ELF-Position Independent Code Executable and Linkable code Format Introduced in Unix System V Observation Executable consists of code followed by data The offset of the data from the beginning of the code is known at compile-time GOT (Global Offset Table) Data Segment Code XX0000 call L2 L2: popl %ebx addl $_GOT[.-..L2], %ebx

ELF: Accessing global data

ELF: Calling Procedures (before 1st call)

ELF: Calling Procedures (after 1st call)

PIC benefits and costs Enable loading w/o relocation Share memory locations among processes Data segment may need to be reloaded GOT can be large More runtime overhead More space overhead

Shared Libraries Heavily used libraries Significant code space 5-10 Mega for print Significant disk space Significant memory space Can be saved by sharing the same code Enforce consistency But introduces some overhead Can be implemented either with static or dynamic loading

Shared Libraries Heavily used libraries Significant code space 5-10 Mega for print Significant disk space Significant memory space Can be saved by sharing the same code Enforce consistency But introduces some overhead

Content of ELF file Program Libraries Call Routine Text Text PLT PLT Data Data GOT GOT

Consistency How to guarantee that the code/library used the “right” library version

Loading Dynamically Linked Programs Start the dynamic linker Find the libraries Initialization Resolve symbols GOT Typically small Library specific initialization Lazy procedure linkage

Microsoft Dynamic Libraries (DLL) Similar to ELF Somewhat simpler Require compiler support to address dynamic libraries Programs and DLL are Portable Executable (PE) Each application has it own address Supports lazy bindings

Dynamic Linking Approaches Unix/ELF uses a single name space space and MS/PE uses several name spaces ELF executable lists the names of symbols and libraries it needs PE file lists the libraries to import from other libraries ELF is more flexible PE is more efficient

Costs of dynamic loading Load time relocation of libraries Load time resolution of libraries and executable Overhead from PIC prolog Overhead from indirect addressing Reserved registers

Summary Code generation yields code which is still far from executable Delegate to existing assembler Assembler translates symbolic instructions into binary and creates relocation bits Linker creates executable from several files produced by the assembly Loader creates an image from executable