CSc 453 Linking and Loading

Slides:



Advertisements
Similar presentations
Hand-Held Devices and Embedded Systems Course Student: Tomás Sánchez López Student ID:
Advertisements

Programs in Memory Bryce Boe 2012/08/29 CS32, Summer 2012 B.
Part IV: Memory Management
MIPS ISA-II: Procedure Calls & Program Assembly. (2) Module Outline Review ISA and understand instruction encodings Arithmetic and Logical Instructions.
Program Development Tools The GNU (GNU’s Not Unix) Toolchain The GNU toolchain has played a vital role in the development of the Linux kernel, BSD, and.
Assembler/Linker/Loader Mooly Sagiv html:// Chapter 4.3 J. Levine: Linkers & Loaders
Linking & Loading CS-502 Operating Systems
Chapter 3 Loaders and Linkers
Chapter 3 Loaders and Linkers. Purpose and Function Places object program in memory Linking – Combines 2 or more obj programs Relocation – Allows loading.
Loaders and Linkers CS 230 이준원. 2 Overview assembler –generates an object code in a predefined format »COFF (common object file format) »ELF (executable.
Lecture 10: Linking and loading. Lecture 10 / Page 2AE4B33OSS 2011 Contents Linker vs. loader Linking the executable Libraries Loading executable ELF.
Linkers and Loaders 1 Linkers & Loaders – A Programmers Perspective.
Computer Organization CS224 Fall 2012 Lesson 12. Synchronization  Two processors or threads sharing an area of memory l P1 writes, then P2 reads l Data.
1 Starting a Program The 4 stages that take a C++ program (or any high-level programming language) and execute it in internal memory are: Compiler - C++
Compilation (Semester A, 2013/14) Lecture 13: Assembler, Linker & Loader Noam Rinetzky Slides credit: Eli Bendersky, Mooly Sagiv & Sanjeev Setia.
Memory Management Chapter 7.
Linking and Loading Fred Prussack CS 518. L&L: Overview Wake-up Questions Terms and Definitions / General Information LoadingLinking –Static vs. Dynamic.
1 Machine-Independent Features Automatic Library Search automatically incorporate routines from a subprogram library Loading Options.
Memory Management 1 CS502 Spring 2006 Memory Management CS-502 Spring 2006.
An introduction to systems programming
CS-3013 & CS-502, Summer 2006 Memory Management1 CS-3013 & CS-502 Summer 2006.
Software Development and Software Loading in Embedded Systems.
OBJECT MODULE FORMATS. The object module format we have employed as an educational device is called OMF (relocatable object format). It’s one of the earliest.
Computer Architecture and Operating Systems CS 3230: Operating System Section Lecture OS-7 Memory Management (1) Department of Computer Science and Software.
MIPS coding. SPIM Some links can be found such as:
CIS250 OPERATING SYSTEMS Memory Management Since we share memory, we need to manage it Memory manager only sees the address A program counter value indicates.
Objective At the conclusion of this chapter you will be able to:
Topic 2d High-Level languages and Systems Software
5-1 Chapter 5 - Languages and the Machine Principles of Computer Architecture by M. Murdocca and V. Heuring © 1999 M. Murdocca and V. Heuring Principles.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /29/2013 Lecture 13: Compile-Link-Load Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.
By Teacher Asma Aleisa Year 1433 H.   Goals of memory management  To provide a convenient abstraction for programming.  To allocate scarce memory.
Linking Ⅱ.
Static Shared Library. Non-shared v.s. Shared Library A library is a collection of pre-written function calls. Using existing libraries can save a programmer.
CS412/413 Introduction to Compilers and Translators April 14, 1999 Lecture 29: Linking and loading.
1 CS503: Operating Systems Spring 2014 Part 0: Program Structure Dongyan Xu Department of Computer Science Purdue University.
Chapter 13 : Symbol Management in Linking
Different Types of Libraries
NETW3005 Memory Management. Reading For this lecture, you should have read Chapter 8 (Sections 1-6). NETW3005 (Operating Systems) Lecture 07 – Memory.
The Assembly Language Level Part C – Linking and Loading.
Loader and Linker.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Hello world !!! ASCII representation of hello.c.
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 10 – Loaders.
Object Files & Linking. Object Sections Compiled code store as object files – Linux : ELF : Extensible Linking Format – Windows : PE : Portable Execution.
Binding & Dynamic Linking Presented by: Raunak Sulekh(1013) Pooja Kapoor(1008)
Program Execution in Linux David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis St. Louis, MO
Lecture 3 Translation.
Assemblers, linkers, loaders
Instruction Set Architectures
Computer Architecture & Operations I
System Programming and administration
The University of Adelaide, School of Computer Science
Linking & Loading.
Separate Assembly allows a program to be built from modules rather than a single source file assembler linker source file.
Program Execution in Linux
CS-3013 Operating Systems C-term 2008
Loaders and Linkers: Features
Machine Independent Features
Memory Management Tasks
Computer Organization and Design Assembly & Compilation
The Assembly Language Level
Linking & Loading CS-502 Operating Systems
Program Execution in Linux
10/6: Lecture Topics C Brainteaser More on Procedure Call
Linking & Loading CS-502 Operating Systems
An introduction to systems programming
Program Assembly.
Chapter 3 Loaders and Linkers
Presentation transcript:

CSc 453 Linking and Loading Saumya Debray The University of Arizona Tucson

Tasks in Executing a Program Compilation and assembly. Translate source program to machine language. The result may still not be suitable for execution, because of unresolved references to external and library routines. Linking. Bring together the binaries of separately compiled modules. Search libraries and resolve external references. Loading. Bring an object program into memory for execution. Allocate memory, initialize environment, maybe fix up addresses. CSc 453: Linking and Loading

Contents of an Object File Header information Overall information about the file and its contents. Object code and data Relocations (may be omitted in executables) Information to help fix up the object code during linking. Symbol table (optional) Information about symbols defined in this module and symbols to be imported from other modules. Debugging information (optional) CSc 453: Linking and Loading

Example: ELF Files (x86/Linux) Linkable sections Executable segments ELF Header (optional, ignored) Program Header Table describes sections sections segments Section Header Table CSc 453: Linking and Loading

CSc 453: Linking and Loading ELF Files: cont’d ELF Header structure 16 bytes ELF file identifying information (magic no., addr size, byte order) 2 bytes object file type (relocatable, executable, shared object, etc.) machine info 4 bytes object file version entry point (address where execution begins) offset of program header table offset of section header table processor-specific flags ELF header size (in bytes) size of each entry in program header table no. of entries in program header table size of section header table (in bytes) no. of entries in section header table index of section name string table in in the section header table CSc 453: Linking and Loading

CSc 453: Linking and Loading Elf Files: cont’d Section Header structure 4 bytes section name (“.text”, “.data”, “.rodata”, etc.), given as an index into the section header string table section section type (specifies section contents and semantics) assorted section flags the address within a process where the section should begin (if the section actually appears in the executing process) byte offset from the beginning of the file to the first byte of the section section size (in bytes) index link (special information, depending on section type) special information, depending on section type address alignment constraints for the section, if any size of each entry in the section, for sections with fixed-size entries (e.g., symbol table) CSc 453: Linking and Loading

Linker Functions 1: Fixing Addresses Addresses in an object file are usually relative to the start of the code or data segment in that file. When different object files are combined: The same kind of segments (text, data, read-only data, etc.) from the different object files get merged. Addresses have to be “fixed up” to account for this merging. The fixing up is done by the linker, using information embedded in the executable for this purpose (“relocations”). CSc 453: Linking and Loading

CSc 453: Linking and Loading Relocation: Example CSc 453: Linking and Loading

Linker Function 2: Symbol Resolution Suppose: module B defines a symbol x; module A refers to x. The linker must: determine the location of x in the object module obtained from merging A and B; and modify references to x (in both A and B) to refer to this location. CSc 453: Linking and Loading

Information for Symbol Resolution Each linkable module contains a symbol table, whose contents include: Global symbols defined (maybe referenced) in the module. Global symbols referenced but not defined in the module (these are generally called externals). Segment names (e.g., text, data, rodata). These are usually considered to be global symbols defined to be at the beginning of the segment. Non-global symbols and line number information (optional), for debuggers. CSc 453: Linking and Loading

Actions Performed by a Linker Usually, linkers make two passes: Pass 1: Collect information about each of the object modules being linked. Pass 2: Construct the output, carrying out address relocation and symbol resolution using the information collected in Pass 1. CSc 453: Linking and Loading

CSc 453: Linking and Loading Linker Actions: Pass 1 Construct a table of all the object modules and their lengths. Based on this table, assign a load address to each module. For each module: Read in its symbol table into a global symbol table in the linker. Determine the address of each symbol defined in the module in the output: Use the symbol value together with the module load address. CSc 453: Linking and Loading

CSc 453: Linking and Loading Linker Actions: Pass 2 Copy the object modules in the order of their load addresses: Address relocation: find each instruction that contains a memory address; to each such address, add a relocation constant equal to the load address for its module. External symbol resolution: For each instruction that references an external object, insert the actual address for that object. CSc 453: Linking and Loading

Relocation Example: ELF (x86/Linux) ELF relocation entries take one of two forms: typedef struct { typedef struct { Addr32 offset; Addr32 offset; Word32 info; Word32 info; } SignedWord32 addend; } offset : specifies the location where to apply the relocation action. info : gives the symbol table entry w.r.t. which the relocation should be made, and the type of relocation to apply. E.g.: for a call instruction, the info field gives the index of the callee. addend : a value to be added explicitly during relocation. Depending on the architecture, one form or the other may be more convenient. CSc 453: Linking and Loading

CSc 453: Linking and Loading Programs are usually loaded at a fixed address in a fresh address space (so can be linked for that address). In such systems, loading involves the following actions: determine how much address space is needed from the object file header; allocate that address space; read the program into the segments in the address space; zero out any uninitialized data (“.bss” segment) if not done automatically by the virtual memory system. create a stack segment; set up any runtime information, e.g., program arguments or environment variables. start the program executing. CSc 453: Linking and Loading

Position-Independent Code (PIC) If the load address for a program is not fixed (e.g., shared libraries), we use position independent code. Basic idea: separate code from data; generate code that doesn’t depend on where it is loaded. PC-relative addressing can give position-independent code references. This may not be enough, e.g.: data references, instruction peculiarities (e.g., call instruction in Intel x86) may not permit the use of PC-relative addressing. CSc 453: Linking and Loading

PIC (cont’d): ELF Files ELF executable file characteristics: data pages follow code pages; the offset from the code to the data does not depend on where the program is loaded. The linker creates a global offset table (GOT) that contains offsets to all global data used. If a program can load its own address into a register, it can then use a fixed offset to access the GOT, and thence the data. CSc 453: Linking and Loading

CSc 453: Linking and Loading PIC code on ELF: cont’d Code to figure out its own address (x86): call L /* push address of next instruction on stack */ L: pop %ebx /* pop address of this instruction into %ebx */ Accessing a global variable x in PIC: GOT has an entry, say at position k, for x. The dynamic linker fills in the address of x into this entry at load time. Compute “my address” into a register, say %ebx (above); %ebx += offset_to_GOT; /* fixed for a given program */ %eax = contents of location k(%ebx) /* %eax = addr. of x */ access memory location pointed at by %eax; CSc 453: Linking and Loading

CSc 453: Linking and Loading PIC on ELF: Example (Based on Linkers and Loaders, by J. R. Levine (Morgan Kaufman, 2000)) CSc 453: Linking and Loading

PIC: Advantages and Disadvantages Code does not have to be relocated when loaded. (However, data still need to be relocated.) Different processes can share the memory pages of code, even if they don’t have the same address space allocated. Disadvantages: GOT needs to be relocated at load time. In big libraries, GOT can be very large, so this may be slow. PIC code is bigger and slower than non-PIC code. The slowdown is architecture dependent (in an architecture with few registers, using one to hold GOT address can affect code quality significantly.) CSc 453: Linking and Loading

CSc 453: Linking and Loading Shared Libraries Have a single copy of the library that is used by all running programs. Saves (disk and memory) space by avoiding replication of library code. Virtual memory management in the OS allows different processes to share “read-only” pages, e.g., text and read-only data. This lets us get by with a single physical-memory copy of shared library code. CSc 453: Linking and Loading

Shared Libraries: cont’d At link time, the linker: Searches a (specified) set of libraries, in some fixed order, to find modules that resolve any undefined external symbols. puts a list of libraries containing such modules into the executable. At load time, the startup code: finds these libraries; maps them into the program’s address space; carries out library-specific initialization. Startup code may be in the OS, in the executable, or in a special dynamic linker. CSc 453: Linking and Loading

Statically Linked Shared Libraries Program and data are bound to executables at link time. Each library is pre-allocated an appropriate amount of address space. The system has a master table of shared-library address space: libraries start somewhere far away from application code, e.g., at 0x60000000 on Linux; read-only portions of the libraries can be shared between processes. CSc 453: Linking and Loading

CSc 453: Linking and Loading Dynamic Linking Defers much of the linking process until the program starts running. Easier to create, update than statically linked shared libraries. Has higher runtime performance cost than statically linked libraries: Much of the linking process has to be redone each time a program runs. Every dynamically linked symbol has to be looked up in the symbol table and resolved at runtime. CSc 453: Linking and Loading

Dynamic Linking: Basic Mechanism A reference to a dynamically linked procedure p is mapped to code that invokes a handler. At runtime, when p is called, the handler gets executed: The handler checks to see whether p has been loaded already (due to some other reference); if so, the current reference is linked in, and execution continues normally. otherwise, the code for p is loaded and linked in. CSc 453: Linking and Loading

Dynamic Linking: ELF Files ELF shared libraries use PIC (position independent code), so text sections do not need relocation. Data references use a GOT: each global symbol has a relocatable pointer to it in the GOT; the dynamic linker relocates these pointers. We still need to invoke the dynamic linker on the first reference to a dynamically linked procedure. Done using a procedure linkage table (PLT); PLT adds a level of indirection for function calls (analogous to the GOT for data references). CSc 453: Linking and Loading

ELF Dynamic Linking: PLT and GOT CSc 453: Linking and Loading

ELF Dynamic Linking: Lazy Linkage Before: Initially, GOT entry points to PLT code that invokes the dynamic linker. offset identifies both the symbol being resolved and the corresponding GOT entry. The dynamic linker looks up the symbol value and updates the GOT entry. Subsequent calls bypass dynamic linker, go directly to callee. This reduces program startup time. Also, routines that are never called are not resolved. PLT GOT call p jmp *(GOT + m) push offset jmp dyn_linker m After: PLT GOT call p jmp *(GOT + m) push offset jmp dyn_linker m p’s code CSc 453: Linking and Loading