Linking Alan L. Cox Some slides adapted from CMU 15.213 slides.

Slides:



Advertisements
Similar presentations
Programs in Memory Bryce Boe 2012/08/29 CS32, Summer 2012 B.
Advertisements

Part IV: Memory Management
Fabián E. Bustamante, Spring 2007 Linking Today Static linking Object files Static & dynamically linked libraries Next time Exceptional control flows.
Program Development Tools The GNU (GNU’s Not Unix) Toolchain The GNU toolchain has played a vital role in the development of the Linux kernel, BSD, and.
Assembler/Linker/Loader Mooly Sagiv html:// Chapter 4.3 J. Levine: Linkers & Loaders
Linking & Loading CS-502 Operating Systems
Chapter 3 Loaders and Linkers
Linker and Loader. Program building into four stages (C Program) Preprocessing (Preprocessor) It processes include files, conditional compilation instructions.
Mr. D. J. Patel, AITS, Rajkot 1 Operating Systems, by Dhananjay Dhamdhere1 Static and Dynamic Memory Allocation Memory allocation is an aspect of a more.
Linkers and Loaders 1 Linkers & Loaders – A Programmers Perspective.
1 Starting a Program The 4 stages that take a C++ program (or any high-level programming language) and execute it in internal memory are: Compiler - C++
Memory Management Chapter 7.
Prof. Necula CS 164 Lecture 141 Run-time Environments Lecture 8.
Winter 2015 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University Linking.
The Environment of a UNIX Process. Introduction How is main() called? How are arguments passed? Memory layout? Memory allocation? Environment variables.
Generating Programs and Linking Professor Rick Han Department of Computer Science University of Colorado at Boulder.
CS 536 Spring Run-time organization Lecture 19.
3/17/2008Prof. Hilfinger CS 164 Lecture 231 Run-time organization Lecture 23.
“The course that gives CMU its Zip!”
CS 330 What software engineers need to know about linking and a few things about execution.
Linking Topics Static linking Object files Static libraries Loading Dynamic linking of shared libraries CS213.
Chapter 91 Memory Management Chapter 9   Review of process from source to executable (linking, loading, addressing)   General discussion of memory.
Security Exploiting Overflows. Introduction r See the following link for more info: operating-systems-and-applications-in-
CIS250 OPERATING SYSTEMS Memory Management Since we share memory, we need to manage it Memory manager only sees the address A program counter value indicates.
Topic 2d High-Level languages and Systems Software
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Memory: Relocation.
Linking February 20, 2001 Topics static linking object files static libraries loading dynamic linking of shared libraries class16.ppt “The course.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /29/2013 Lecture 13: Compile-Link-Load Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE.
Linking Topics Static linking Object files Static libraries Loading.
Linking Topics Static linking Dynamic linking Case study: Library interpositioning.
Linking Ⅱ.
CS412/413 Introduction to Compilers and Translators April 14, 1999 Lecture 29: Linking and loading.
Linking Summer 2014 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University.
1 CS503: Operating Systems Spring 2014 Part 0: Program Structure Dongyan Xu Department of Computer Science Purdue University.
Chapter 13 : Symbol Management in Linking
1 Linking. 2 Outline What is linking and why linking Complier driver Static linking Symbols & Symbol Table Suggested reading: 7.1~7.5.
CSc 453 Linking and Loading
CS252: Systems Programming Ninghui Li Based on Slides by Gustavo Rodriguez-Rivera Topic 2: Program Structure and Using GDB.
Linking I Topics Assembly and symbol resolution Static linking Systems I.
Program Translation and Execution I: Linking Sept. 29, 1998 Topics object files linkers class11.ppt Introduction to Computer Systems.
LECTURE 3 Translation. PROCESS MEMORY There are four general areas of memory in a process. The text area contains the instructions for the application.
Hello world !!! ASCII representation of hello.c.
Linking Winter 2013 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University.
Program Execution and ELF Files Extended System Programming Laboratory (ESPL) CS BGU Fall 2013/2014 Abed Asi.
Binding & Dynamic Linking Presented by: Raunak Sulekh(1013) Pooja Kapoor(1008)
Program Execution in Linux David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis St. Louis, MO
Alan L. Cox Linking Alan L. Cox Some slides adapted from CMU slides.
Lecture 3 Translation.
Slides adapted from Bryant and O’Hallaron
CS 367 Linking Topics Static linking Make Utility Object files
Overview of today’s lecture
Linking Topics Static linking Object files Static libraries Loading
Linking & Loading.
Memory Management © 2004, D. J. Foreman.
Chapter 8 Main Memory.
Separate Assembly allows a program to be built from modules rather than a single source file assembler linker source file.
Program Execution in Linux
CS-3013 Operating Systems C-term 2008
User-Defined Functions
Topic 2e High-Level languages and Systems Software
Generating Programs and Linking
“The course that gives CMU its Zip!”
Memory Allocation CS 217.
“The course that gives CMU its Zip!”
Linking.
Linking & Loading CS-502 Operating Systems
Program Execution in Linux
Linking & Loading CS-502 Operating Systems
Instructors: Majd Sakr and Khaled Harras
“The course that gives CMU its Zip!”
Presentation transcript:

Linking Alan L. Cox Some slides adapted from CMU slides

Cox / RixnerLinking2 x86-64 Assembly Brief overview last time  Lecture notes available on the course web page and x86-64 assembly overview in the textbook You will not write any x86-64 assembly  Need to be able to recognize/understand code fragments  Need to be able to correlate assembly code to source code More assembly examples today

Objectives Be able to do your homework assignment on linking Understand how C type attributes (e.g. static, extern) control memory allocation for variables Be able to recognize some of the pitfalls when developing modular programs Appreciate how programs can optimize for efficiency, modularity, evolvability Cox / RixnerLinking3

Cox / RixnerLinking4 Example Program (2.c files) /* main.c */ void swap(void); int buf[2] = {1, 2}; int main(void) { swap(); return (0); } /* swap.c */ extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; void swap(void) { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; }

An Analogy for Linking Cox / RixnerLinking5

Cox / RixnerLinking6 Linking: collecting and combining various pieces of code and data into a single file that can be loaded into memory and executed Why learn about linking?  Make you a better jigsaw puzzle solver!  It will help you build large programs  It will help you avoid dangerous program errors  It will help you understand how language scoping rules are implemented  It will help you understand other important systems concepts (that are covered later in the class)  It will enable you to exploit shared libraries

Cox / RixnerLinking7 Compilation UNIX% gcc -v -O -g -o p main.c swap.c … cc1 -quiet -v main.c -quiet -dumpbase main.c -mtune=generic -auxbase main -g -O -version -o /tmp/cchnheja.s as -V -Qy -o /tmp/ccmNFRZd.o /tmp/cchnheja.s … cc1 -quiet -v swap.c -quiet -dumpbase swap.c -mtune=generic -auxbase swap -g -O -version -o /tmp/cchnheja.s as -V -Qy -o /tmp/ccx8FECg.o /tmp/ccheheja.s … collect2 --eh-frame-hdr –m elf_x86_64 --hash-style=gnu -dynamic- linker /lib64/ld-linux-x86-64.so.2 -o p crt1.o crti.o crtbegin.o –L /tmp/ccmNFRZd.o /tmp/ccx8FECg.o –lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed crtend.o crtn.o Compiler:.c C source code to.s assembly code Assembler:.s assembly code to.o relocatable object code Linker:.o to executable

Cox / RixnerLinking8 Compilation main.cswap.c ld (collect2) p ELF Format Files Linking step main.o as swap.o as Relocatable object code Executable C source code cc1 main.s cc1 swap.s Assembly code UNIX% gcc -O –g -o p main.c swap.c

Cox / RixnerLinking9 ELF (Executable Linkable Format) Order & existence of segments is arbitrary, except ELF header must be present and first ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking10 ELF Header Basic description of file contents:  File format identifier  Endianness  Alignment requirement for other sections  Location of other sections  Code’s starting address  … ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking11 Program and Section Headers Info about other sections necessary for loading  Required for executables & libraries Info about other sections necessary for linking  Required for relocatables ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking12 Text Section Machine instruction code  read-only ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking13 Data Sections Static data  initialized, read-only  initialized, read/write  uninitialized, read/write (BSS = “Block Started by Symbol” pseudo-op for IBM 704) Initialized  Initial values in ELF file Uninitialized  Only total size in ELF file Writable distinction enforced at run-time  Why? Protection; sharing  How? Virtual memory ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking14 Relocation Information Describes where and how symbols are used  A list of locations in the.text section that will need to be modified when the linker combines this object file with others  Relocation information for any global variables that are referenced or defined by the module  Allows object files to be easily relocated ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking15 Debug Section Relates source code to the object code within the ELF file ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking16 Other Sections Other kinds of sections also supported, including:  Other debugging info  Version control info  Dynamic linking info  C++ initializing & finalizing code ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata

Cox / RixnerLinking17 Symbol Table Describes where global variables and functions are defined  Present in all relocatable ELF files ELF header 0 Program header table.text.data.bss.symtab.rel.text.rel.data.debug Section header table ….rodata /* main.c */ void swap(void); int buf[2] = {1, 2}; int main(void) { swap(); return (0); }

Cox / RixnerLinking18 Linker Symbol Classification Global symbols  Symbols defined by module m that can be referenced by other modules  C: non- static functions & global variables External symbols  Symbols referenced by module m but defined by some other module  C: extern functions & variables Local symbols  Symbols that are defined and referenced exclusively by module m  C: static functions & variables Local linker symbols  local function variables!

Cox / RixnerLinking19 Linker Symbols /* main.c */ void swap(void); int buf[2] = {1, 2}; int main(void) { swap(); return (0); } /* swap.c */ extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; void swap(void) { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; } Definition of global symbols buf and main Reference to external symbol swap Definition of global symbol swap Definition of global symbols bufp0 and bufp1 (even though not used outside file) Reference to external symbol buf Linker knows nothing about local variables

Cox / RixnerLinking20 Linker Symbols What’s missing?  swap – where is it? /* main.c */ void swap(void); int buf[2] = {1, 2}; int main(void) { swap(); return (0); } UNIX% gcc -O -c main.c UNIX% readelf -s main.o Symbol table '.symtab' contains 11 entries: Num: Value Size Type Bind Vis Ndx Name … 8: FUNC GLOBAL DEFAULT 1 main 9: NOTYPE GLOBAL DEFAULT UND swap 10: OBJECT GLOBAL DEFAULT 3 buf main is a 19-byte function located at offset 0 of section 1 (.text) swap is referenced in this file, but is undefined (UND) buf is an 8-byte object located at offset 0 of section 3 (.data) use readelf –S to see sections

Cox / RixnerLinking21 Linker Symbols What’s missing?  buf – where is it? /* swap.c */ extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; void swap(void) { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; } Symbol table '.symtab' contains 12 entries: Num: Value Size Type Bind Vis Ndx Name 8: FUNC GLOBAL DEFAULT 1 swap 9: NOTYPE GLOBAL DEFAULT UND buf 10: OBJECT GLOBAL DEFAULT COM bufp1 11: OBJECT GLOBAL DEFAULT 3 bufp0 swap is a 38-byte function located at offset 0 of section 1 (.text) buf is referenced in this file, but is undefined (UND) bufp1 is an 8-byte uninitialized (COMMON) object with an 8-byte alignment requirement bufp0 is an 8-byte object located at offset 0 of section 3 (.data)

Cox / RixnerLinking22 Linking Steps Symbol Resolution  Determine where symbols are located and what size data/code they refer to Relocation  Combine modules, relocate code/data, and fix symbol references based on new locations ld (collect2) p main.oswap.o Relocatable object code Executable

Cox / RixnerLinking23 Problem: Undefined Symbols Missing symbols are not compiler errors  May be defined in another file  Compiler just inserts an undefined entry in the symbol table During linking, any undefined symbols that cannot be resolved cause an error UNIX% gcc -O -o p main.c /tmp/cccpTy0d.o: In function `main’: main.c:(.text+0x5): undefined reference to `swap’ collect2: ld returned 1 exit status UNIX% forgot to type swap.c

Cox / RixnerLinking24 Problem: Multiply Defined Symbols Different files could define the same symbol  Is this an error?  If not, which one should be used? One or many?

Cox / RixnerLinking25 Linking: Example int x = 3; int y = 4; int z; int foo(int a) {…} int bar(int b) {…} extern int x; static int y = 6; int z; int foo(int a); static int bar(int b) {…} ? ? Note: Linking uses object files Examples use source-level for convenience

Cox / RixnerLinking26 Linking: Example int x = 3; int y = 4; int z; int foo(int a) {…} int bar(int b) {…} extern int x; static int y = 6; int z; int foo(int a); static int bar(int b) {…} int x = 3; int foo(int a) {…} Defined in one fileDeclared in other files Only one copy exists

Cox / RixnerLinking27 Linking: Example int x = 3; int y = 4; int z; int foo(int a) {…} int bar(int b) {…} extern int x; static int y = 6; int z; int foo(int a); static int bar(int b) {…} int x = 3; int y = 4; int y’ = 6; int foo(int a) {…} int bar(int b) {…} int bar’(int b) {…} Private names not in symbol table. Can’t conflict with other files’ names Renaming is a convenient source-level way to understand this

Cox / RixnerLinking28 Linking: Example int x = 3; int y = 4; int z; int foo(int a) {…} int bar(int b) {…} extern int x; static int y = 6; int z; int foo(int a); static int bar(int b) {…} int x = 3; int y = 4; int y’ = 6; int z; int foo(int a) {…} int bar(int b) {…} int bar’(int b) {…} C allows you to omit “ extern ” in some cases – Don’t!

Cox / RixnerLinking29 Strong & Weak Symbols Program symbols are either strong or weak strongprocedures & initialized globals weakuninitialized globals int foo=5; p1() {} int foo; p2() {} p1.cp2.c strong weak strong

Cox / RixnerLinking30 Strong & Weak Symbols A strong symbol can only appear once A weak symbol can be overridden by a strong symbol of the same name  References to the weak symbol resolve to the strong symbol If there are multiple weak symbols, the linker can pick an arbitrary one

Cox / RixnerLinking31 Linker Puzzles: What Happens? int x; p1() {} int x; p2() {} int x; int y; p1() {} double x; p2() {} int x=7; int y=5; p1() {} double x; p2() {} int x=7; p1() {} int x; p2() {} int x; p1() {} Link time error: two strong symbols p1 References to x will refer to the same uninitialized int. Is this what you really want? Writes to x in p2 might overwrite y ! Evil! Writes to x in p2 will overwrite y ! Nasty! Nightmare scenario: replace r.h.s. int with a struct type, each file then compiled with different alignment rules References to x will refer to the same initialized variable

Cox / RixnerLinking32 Advanced Note: Name Mangling Other languages (i.e. Java and C++) allow overloaded methods  Functions then have the same name but take different numbers/types of arguments  How does the linker disambiguate these symbols? Generate unique names through mangling  Mangled names are compiler dependent  Example: class “ Foo ”, method “ bar(int, long) ”: bar__3Fooil _ZN3Foo3BarEil  Similar schemes are used for global variables, etc.

Cox / RixnerLinking33 Linking Steps Symbol Resolution  Determine where symbols are located and what size data/code they refer to Relocation  Combine modules, relocate code/data, and fix symbol references based on new locations ld (collect2) p main.oswap.o Relocatable object code Executable

Cox / RixnerLinking34.symtab & Pseudo-Instructions in main.s.file "main.c".text.globl main.type main:.LFB2: subq $8, %rsp.LCFI0: call swap movl $0, %eax addq $8, %rsp ret UNIX% gcc -O -c main.c UNIX% readelf -s main.o Symbol table '.symtab' contains 11 entries: Num: Value Size Type Bind Vis Ndx Name … 8: FUNC GLOBAL DEFAULT 1 main 9: NOTYPE GLOBAL DEFAULT UND swap 10: OBJECT GLOBAL DEFAULT 3 buf.LFE2:.size main,.-main.globl buf.data.align 4.type buf, 8 buf:.long 1.long 2....

Cox / RixnerLinking35.symtab & Pseudo-Instructions in swap.s.file "swap.c".text.globl swap.type swap:.LFB2: movq $buf+4, bufp1(%rip) movq bufp0(%rip), %rdx movl (%rdx), %ecx movl buf+4(%rip), %eax movl %eax, (%rdx) movq bufp1(%rip), %rax movl %ecx, (%rax) ret.LFE2:.size swap,.-swap.globl bufp0.data.align 8.type bufp0, 8 bufp0:.quad buf.comm bufp1,8,8.... Symbol table '.symtab' contains 12 entries: Num: Value Size Type Bind Vis Ndx Name 8: FUNC GLOBAL DEFAULT 1 swap 9: NOTYPE GLOBAL DEFAULT UND buf 10: OBJECT GLOBAL DEFAULT COM bufp1 11: OBJECT GLOBAL DEFAULT 3 bufp0

Cox / RixnerLinking36 Symbol Resolution Undefined symbols must be resolved  Where are they located  What size are they? Linker looks in the symbol tables of all relocatable object files  Assuming every unknown symbol is defined once and only once, this works well main.o.text ….symtab.data swap.o.text ….symtab.data where’s swap? where’s buf?

Cox / RixnerLinking37 Relocation Once all symbols are resolved, must combine the input files  Total code size is known  Total data size is known  All symbols must be assigned run-time addresses Sections must be merged  Only one text, data, etc. section in final executable  Final run-time addresses of all symbols are defined Symbol references must be corrected  All symbol references must now refer to their actual locations

Cox / RixnerLinking38 Relocation: Merging Files main.o.text ….symtab.data swap.o.text ….symtab.data p.text ….symtab.data

Cox / RixnerLinking39 Linking: Relocation /* main.c */ void swap(void); int buf[2] = {1, 2}; int main(void) { swap(); return (0); } UNIX% objdump -r -d main.o main.o: file format elf64-x86-64 Disassembly of section.text: : 0: ec 08 sub $0x8,%rsp 4: e callq 9 5: R_X86_64_PC32 swap+0xfffffffffffffffc 9: b mov $0x0,%eax e: c4 08 add $0x8,%rsp 12: c3 retq can also use readelf –r to see relocation information Offset into text section (relocation information is stored in a different section of the file) Type of symbol (PC relative 32-bit signed)Symbol name

Cox / RixnerLinking40 Linking: Relocation /* swap.c */ extern int buf[]; int *bufp0 = &buf[0]; int *bufp1; void swap() { int temp; bufp1 = &buf[1]; temp = *bufp0; *bufp0 = *bufp1; *bufp1 = temp; } UNIX% objdump -r -D swap.o swap.o: file format elf64-x86-64 Disassembly of section.text: : 0: 48 c movq $0x0,0(%rip) 7: : R_X86_64_PC32 bufp1+0xfffffffffffffff8 7: R_X86_64_32S buf+0x4 Disassembly of section.data: :... 0: R_X86_64_64 buf Need relocated address of bufp1Need to initialize bufp0 with &buf[0] (== buf)Need relocated address of buf[]

Cox / RixnerLinking41 After Relocation : 0: ec 08 sub $0x8,%rsp 4: e callq 9 5: R_X86_64_PC32 swap+0xfffffffffffffffc 9: b mov $0x0,%eax e: c4 08 add $0x8,%rsp 12: c3 retq : : ec 08 sub $0x8,%rsp 40044c: e8 0b callq 40045c : b mov $0x0,%eax : c4 08 add $0x8,%rsp 40045a: c3 retq 40045b: 90 nop c : 40045c: 48 c movq $0x600848, (%rip)

Cox / RixnerLinking42 After Relocation : 0: 48 c movq $0x0,0(%rip) 7: : R_X86_64_PC32 bufp1+0xfffffffffffffff8 7: R_X86_64_32S buf+0x :... 0: R_X86_64_64 buf c : 40045c: 48 c movq $0x600848, (%rip) : # : :

Cox / RixnerLinking43 Libraries How should functions commonly used by programmers be provided?  Math, I/O, memory management, string manipulation, etc.  Option 1: Put all functions in a single source file Programmers link big object file into their programs Space and time inefficient  Option 2: Put each function in a separate source file Programmers explicitly link appropriate object files into their programs More efficient, but burdensome on the programmer Solution: static libraries (.a archive files)  Multiple relocatable files + index  single archive file  Only links the subset of relocatable files from the library that are used in the program  Example: gcc –o fpmath main.c float.c -lm

Cox / RixnerLinking44 Two Common Libraries libc.a (the C standard library)  4 MB archive of 1395 object files  I/O, memory allocation, signal handling, string handling, data and time, random numbers, integer math  Usually automatically linked libm.a (the C math library)  1.3 MB archive of 401 object files  floating point math (sin, cos, tan, log, exp, sqrt, …)  Use “ -lm ” to link with your program UNIX% ar t /usr/lib64/libc.a … fprintf.o … feof.o … fputc.o … strlen.o … UNIX% ar t /usr/lib64/libm.a … e_sinh.o e_sqrt.o e_gamma_r.o k_cos.o k_rem_pio2.o k_sin.o k_tan.o …

Cox / RixnerLinking45 Creating a Library /* addvec.c */ #include “vector.h” void addvec(int *x, int *y, int *z, int n) { int i; for (i = 0; i < n; i++) z[i] = x[i] + y[i]; } /* multvec.c */ #include “vector.h” void multvec(int *x, int *y, int *z, int n) { int i; for (i = 0; i < n; i++) z[i] = x[i] * y[i]; } UNIX% gcc –c addvec.c multvec.c UNIX% ar rcs libvector.a addvec.o multvec.o /* vector.h */ void addvec(int *x, int *y, int *z, int n); void multvec(int *x, int *y, int *z, int n);

Cox / RixnerLinking46 Using a library /* main.c */ #include #include “vector.h” int x[2] = {1, 2}; int y[2] = {3, 4}; int z[2]; int main(void) { addvec(x, y, z, 2); printf(“z = [%d %d]\n”, z[0], z[1]); return (0); } UNIX% gcc –O –c main.c UNIX% gcc –static –o program main.o./libvector.a main.olibvector.alibc.a ld addvec.oprintf.o program

Cox / RixnerLinking47 How to Link: Basic Algorithm Keep a list of the current unresolved references. For each object file (.o and.a) in command-line order  Try to resolve each unresolved reference in list to objects defined in current file  Try to resolve each unresolved reference in current file to objects defined in previous files  Concatenate like sections (.text with.text, etc.) If list empty, output executable file, else error UNIX% gcc main.o libvector.a UNIX% gcc libvector.a main.o main.o: In function `main': main.o(.text+0x4): undefined reference to `addvec' Problem: Command line order matters! Link libraries last:

Why UNIX% gcc libvector.a main.o Doesn’t Work Linker keeps list of currently unresolved symbols and searches an encountered library for them If symbol(s) found, a.o file for the found symbol(s) is obtained and used by linker like any other.o file By putting libvector.a first, there is not yet any unresolved symbol, so linker doesn’t obtain any.o file from libvector.a! Cox / RixnerLinking48

Cox / RixnerLinking49 Dynamic Libraries Static Linked at compile-time UNIX: foo.a Relocatable ELF File Dynamic Linked at run-time UNIX: foo.so Shared ELF File What are the differences?

Cox / RixnerLinking50 Static & Dynamic Libraries Static  Library code added to executable file  Larger executables  Must recompile to use newer libraries  Executable is self- contained  Some time to load libraries at compile-time  Library code shared only among copies of same program Dynamic  Library code not added to executable file  Smaller executables  Uses newest (or smallest, fastest, …) library without recompiling  Depends on libraries at run-time  Some time to load libraries at run-time  Library code shared among all uses of library

Cox / RixnerLinking51 Static & Dynamic Libraries Static Dynamic gcc –o zap zap.o -lfoo ar rcs libfoo.a bar.o baz.o ranlib libfoo.a gcc –shared –Wl,-soname,libfoo.so -o libfoo.so bar.o baz.o Use Creation Adds library’s code, data, symbol table, relocation info, … Adds library’s symbol table, relocation info

Cox / RixnerLinking52 Loading Linking yields an executable that can actually be run Running a program  unix%./program  Shell does not recognize “ program ” as a shell command, so assumes it is an executable  Invokes the loader to load the executable into memory (any unix program can invoke the loader with the execve function – more later)

Cox / RixnerLinking53 Creating the Memory Image (sort of…) Create code and data segments  Copy code and data from executable into these segments Create initial heap segment  Grows up from read/write data Create stack  Starts near the top and grows downward Call dynamic linker to load shared libraries and relocate references User Stack Shared Libraries Heap Read/Write Data Read-only Code and Data Unused 0x7FFFFFFFFFFF %rsp 0x x

Cox / RixnerLinking54 Starting the Program Jump to program’s entry point (stored in ELF header)  For C programs, this is the _start symbol Execute _start code (from crt1.o – same for all C programs)  call __libc_init_first  call _init  call atexit  call main  call _exit

Cox / RixnerLinking55 Position Independent Code Static libraries compile with unresolved global & local addresses  Library code & data concatenated & addresses resolved when linking

Cox / RixnerLinking56 Position Independent Code By default (in C), dynamic libraries compile with resolved global & local addresses  E.g., libfoo.so starts at 0x in every application using it  Advantage: Simplifies sharing  Disadvantage: Inflexible – must decide ahead of time where each library goes, otherwise libraries can conflict

Cox / RixnerLinking57 Position Independent Code Can compile dynamic libraries with unresolved global & local addresses  gcc –shared –fPIC …  Advantage: More flexible – no conflicts  Disadvantage: Code less efficient – referencing these addresses involves indirection

Cox / RixnerLinking58 Library Interpositioning Linking with non-standard libraries that use standard library symbols  “Intercept” calls to library functions Some applications:  Security Confinement (sandboxing) Behind the scenes encryption –Automatically encrypt otherwise unencrypted network connections  Monitoring & Profiling Count number of calls to functions Characterize call sites and arguments to functions malloc tracing –Detecting memory leaks –Generating malloc traces

Cox / RixnerLinking59 Dynamic Linking at Run-Time Application access to dynamic linker via API: #include void dlink(void) { void *handle = dlopen(“mylib.so”, RTLD_LAZY); /* type */ myfunc = dlsym(handle, “myfunc”); myfunc(…); dlclose(handle); } Error-checking omitted for clarity Symbols resolved at first use, not now

Cox / RixnerLinking60 Next Time Lab: Hash tables and linked Lists Exceptions