Generating Programs and Linking Professor Rick Han Department of Computer Science University of Colorado at Boulder.



CSCI 3753 Announcements
Introduction to Operating Systems
– Moodle: last Thursday’s lecture has been posted
– Programming shell assignment 0 is due Thursday at 11:55 pm, not 11 am
– Read Chapters 3 and 4 in the textbook

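Operating System Architecture
[Figure: layered diagram. Applications (App1, App2, App3) sit on top of the system libraries and tools (compilers, shells, GUIs), which they reach through the Posix, Win32, Java, and C library APIs. Below the system call API is the OS “kernel” (scheduler, VM, file system, I/O, device manager), which runs on the hardware: CPU, memory, disk, display, mouse.]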

What is an Application?
– A software program consists of a sequence of code instructions and data
  – for now, let a simple app = a program
– The computer executes the instructions one at a time
  – code instructions operate on data
[Figure: Program P1, made up of Code and Data]

Loading and Executing a Program
[Figure: the binaries P1 and P2, each made up of Code and Data, are stored on Disk. The OS Loader copies the P1 binary into Main Memory. The CPU (Program Counter, Registers, ALU) fetches code and data from main memory and writes data back.]

Loading and Executing a Program
[Figure: the binaries P1 and P2 on Disk, and the P1 binary copied into Main Memory by the OS Loader. The code of the P1 binary is expanded to show the machine code instructions of the binary executable:]
    shift left by 2 register R1 and put in address A
    invoke low level system call n to OS: syscall n
    jump to address B

Generating a Program’s Binary Executable
– We write source code in a high-level language like C or Java, and use tools like compilers to create a program’s binary executable
– gcc can generate the output of any of these stages
– technically, there is a preprocessing step before the compiler
– “gcc -c” will generate relocatable object files, and not run the linker
[Figure: source code file P1.c → Compiler → P1.s → Assembler → P1.o → Linker → Program P1’s binary executable (Code and Data)]

Linking Multiple Object Files Into an Executable
– The linker combines multiple .o object files into one binary executable file
– Why split a program into multiple objects and then relink them?
  – breaking up a program into multiple files, and compiling them separately, reduces the amount of recompilation if a single file is edited
  – you don’t have to recompile the entire program, just the object file of the changed source file, and then relink the object files
[Figure: source code file P1.c → Compiler (cc1) → P1.s → Assembler (as) → P1.o; the Linker (ld) combines P1.o, foo2.o, and foo3.o into the executable P1 or P1.exe (Code and Data)]

Linking Multiple Object Files Into an Executable
In combining multiple object files, the linker must
– resolve references to variables and functions defined in other object files; this is called symbol resolution
– relocate each object’s internal addresses so that the executable’s combination of objects is consistent in its memory references
  – an object’s code and data are compiled in its own private world to start at address zero
[Figure: source code file P1.c → Compiler (cc1) → P1.s → Assembler (as) → P1.o; the Linker (ld) combines P1.o, foo2.o, and foo3.o into the executable P1 or P1.exe (Code and Data)]

Linker Resolves Unknown Symbols
P1.c:
    extern void f1(...);
    int globalvar1=0;
    main(...) { f1(...) }
foo2.c:
    extern int globalvar1;
    void f1(...) { ---- }
    void f2(...) { ---- globalvar1 = 4; ---- }
– the P1.o object file will contain a list of unknown symbols, e.g. f1, in a symbol table
– foo2.o’s symbol table lists unknown symbols, e.g. globalvar1

Linker Resolves Unknown Symbols
An ELF relocatable object file contains the following sections:
– ELF header (type, size, size/# sections)
– code (.text)
– data (.data, .bss, .rodata)
  – .data = initialized global variables
  – .bss = uninitialized global variables (does not actually occupy space on disk, just a placeholder)
– symbol table (.symtab)
– relocation info (.rel.text, .rel.data)
– debug symbol table (.debug, only if the “-g” compile flag is used)
– line info (.line, maps C source line numbers to .text instructions, only if “-g” is used)
– string table (.strtab, for the symbol tables)
[Figure: layout of an ELF relocatable object file, top to bottom: ELF header, .text, .rodata, .data, .bss, .symtab, .rel.text, .rel.data, .debug, .line, .strtab, section header table]

Linker Resolves Unknown Symbols
The symbol table contains 3 types of symbols:
– global symbols defined in this object
– global symbols referenced but not defined here
– local symbols defined and referenced exclusively by this object, e.g. static global variables and functions
  – local symbols are not equivalent to local variables, which get allocated on the stack at run time

Linker Resolves Unknown Symbols
    extern float f1();        global symbol referenced here but defined elsewhere
    int globalvar1=0;         global symbol defined here
    void f2(...) {            global symbol defined here
        static int x=-1;      “local” symbol
    }
The symbol table informs the linker where symbols referenced or referenceable by each object file can be found:
– if another file references globalvar1, then look here for info
– if this file references f1, then another object file’s symbol table will list f1 as defined

Linker Resolves Unknown Symbols
Each entry in the ELF symbol table looks like:
    typedef struct {
        int name;        /* string table offset */
        int value;       /* section offset or VM address */
        int size;        /* object size in bytes */
        char type:4,     /* data, func, section or src file name (4 bits) */
             binding:4;  /* local or global (4 bits) */
        char reserved;   /* unused */
        char section;    /* section header index, ABS, UNDEF, */
    } ELF_Symbol;
The section field is where we flag the undefined status.

Linker Resolves Unknown Symbols
– During linking, the linker goes through each input object file and determines whether unknown symbols are defined in other object files
[Figure: the relocatable object files P1.o, P2.o, and P3.o, each with Code, Data, and .symtab, feed into the Linker. Function f1() in P1.o is referenced but not defined, hence unknown. Defined in P2? No. Defined in P3? Yes.]

Linker Resolves Unknown Symbols
What if two object files use the same name for a global variable?
– The linker resolves multiply defined global symbols
– functions and initialized global variables are defined as strong symbols, while uninitialized global variables are weak symbols
  – Rule 1: multiple strong symbols are not allowed
  – Rule 2: choose the strong symbol over the weak symbol
  – Rule 3: given multiple weak symbols, choose any one

Linker Resolves Unknown Symbols
Linking with static libraries
– Bundle many related .o files together into a single file called a library or .a file
  – e.g. the C library libc.a contains printf(), strcpy(), random(), atoi(), etc.
  – the library is created using the archive tool ar
– the library is input to the linker as one file
– the linker can accept multiple libraries
– the linker copies only those object modules in the library that are referenced by the application program
– Example: gcc main.c /usr/lib/libm.a /usr/lib/libc.a

Linker Resolves Unknown Symbols
A static library is a collection of relocatable object modules
– group together related object modules
– within each object, can further group related functions
– if an application links to libfoo.a, and only calls a function in foo3.o, then only foo3.o will be linked into the program
[Figure: libfoo.a containing foo1.o, foo2.o, foo3.o, and foo4.o]

Linker Resolves Unknown Symbols
The linker scans object files and libraries sequentially, left to right on the command line, to resolve unknown symbols
– for each input file on the command line, the linker
  – updates a list of defined symbols with the object’s defined symbols
  – tries to resolve the undefined symbols (from the object and from the list of previously undefined symbols) with the list of previously defined symbols
  – carries over the lists of defined and undefined symbols to the next input object file
– so the linker looks in a library for a symbol only once that symbol is already on the undefined list!
  – it doesn’t go back over the entire set of input files to resolve the unknown symbol
  – if a symbol is first referenced after the library that defines it has already been scanned, then the linker won’t be able to resolve the symbol!
– Thus, order on the command line is important: put libraries last!

Linker Resolves Unknown Symbols
Example: gcc libfoo.a main.c
– main.c calls a function f1 defined in libfoo.a
– scanning left to right, when the linker hits libfoo.a, there are no unresolved symbols, so no object modules are copied
– when the linker hits main.c, f1 is unresolved and gets added to the unresolved list
– since there are no more input files, the linker stops and generates a linking error:
    /tmp/something.o: In function ‘main’:
    /tmp/something.o: undefined reference to ‘f1’

Linker Resolves Unknown Symbols
Example: gcc main.c libfoo.a
– main.c calls a function f1 defined in libfoo.a
– scanning left to right, when the linker hits main.c, it will add f1 to the list of unresolved references
– when the linker next hits libfoo.a, it will look for f1 in the library’s object modules, see that it is found, and add the object module to the linked program
– No errors are generated. A binary executable is generated.
Lesson #1: the order of linking can be important, so put libraries at the end of command lines
Lesson #2: an undefined symbol error can also mean that you
– didn’t link in the right libraries, or didn’t add the right library path
– forgot to define the symbol somewhere in your code

Linker Relocates Addresses
After resolving symbols, the linker relocates addresses when combining the different object modules
– merges separate code (.text) sections into a single .text section
– merges separate .data sections into a single .data section
– each section is assigned a memory address
– then each symbol reference in the code and data sections is reassigned to the correct memory address
  – the linker looks at .rel.text and .rel.data to find the relocation entries of references that need address translation
– these are virtual memory addresses that are translated at load time into real run-time memory addresses

Linked ELF Executable Object File
An ELF executable object file contains the following sections:
– ELF header (type, size, size/# sections)
– segment header table
– .init (program’s entry point, i.e. address of first instruction)
– other sections similar
– Note the absence of .rel.text and .rel.data: they’ve been relocated!
Ready to be loaded into memory and run
– only sections through .bss are loaded into memory
– .symtab and below are not loaded into memory
– code section is read-only
– .data and .bss are read/write
[Figure: layout of an ELF executable object file, top to bottom: ELF header, segment header table, .init, .text, .rodata, .data, .bss, .symtab, .debug, .line, .strtab, section header table]

Loading Executable Object Files
Run-time memory image
– Essentially code, data, stack, and heap
– Code and data are loaded from the executable file
– Stack grows downward, heap grows upward
[Figure: run-time memory, top to bottom: user stack, unallocated, heap, read/write (.data, .bss), read-only (.init, .text, .rodata)]