
OBJECT MODULE FORMATS

The object module format we have employed as an educational device is called OMF (Relocatable Object Module Format). It’s one of the earliest formats, but all the subsequent formats contain the basic elements that are present in OMF.

Here is a depiction of the main formats that followed. [Figure: family of object module formats, showing OMF, a.out, COFF, ELF, PE/COFF, PE/COFF+, Mach-O, and Mach-O for Mac OS X 10.6]

All of them contain separate sections for data, code, and relocation information (i.e. fixups). All of them, incidentally, were designed by committees with the objective of making them machine and language independent to varying degrees. So the committees included a wealth of fields that they thought might possibly be helpful, but which are in fact never used in practice.

So why didn’t we pick one of these later formats to employ for our Project 4? It just would not have been possible to do this in a one-semester compiler course. Even in a two-semester course, the amount of extra detail required would be out of proportion to the gain in educational value.

OMF was devised by Intel. At roughly the same time, AT&T released a.out for use with Unix systems.

In order to provide for debugging information and shared libraries, COFF (common object file format) was released by AT&T together with the introduction of Unix System V.

The object module formats in use today by Linux, Unix, and Microsoft are basically variants of COFF.

COFF supported symbolic debugging by, in effect, including a symbol table that specified not only the offsets of variables but also the offset of the code corresponding to each line number of the source, so as to aid, for example, in the setting of breakpoints.

Limitations of COFF include the following: it limits the length of section names (which correspond to our segment names) and the number of sections allowed, and its symbolic debugging information is insufficient for supporting some of the features of languages such as C++.

In response, AT&T released ELF, a minor variant of COFF, with the introduction of System V, version 4.

Microsoft created its own version of COFF. For the sake of concreteness let’s examine its main features - as described in the Microsoft document “Microsoft Portable Executable and Common Object File Format Specification”, September 21, 2010.

The name of the specification is abbreviated as PE / COFF, while the version released to accommodate 64-bit machines is called PE / COFF+.

PE is the format of the output of the linker and loader, in which the various modules that make up the program are linked, all external references are resolved, all relocation (fixups) is completed, and the image obtained is finally written into memory.

The COFF component of PE / COFF is the format of the object module that serves as input to the linker.

It closely follows the original COFF specification. The main difference is that the Microsoft version does not make use of the debugging facilities supplied by the original COFF, such as the line number information. It relies instead on Visual C++ type debug information.

As a compiler writer, your responsibility in writing a compiler for Windows is the production of an object module for input to the linker. The PE formatted output of the linker, and the operating system, are the responsibility of Microsoft.

MICROSOFT’S COFF FORMAT Here is an illustration of the COFF structure.

SECTIONS The sections correspond to our segments. Except for the segment associated with uninitialized data, each segment consists of a header, the raw data, and a relocation component.

The .text section is the code section, and the relocation information corresponds to our fixups.

There are two data sections. One is for initialized data, e.g. to contain the initial value of variables, as in: num dw 23 The other data section, called .bss above, is for uninitialized data, as in: array2 dw 1000 dup(?)

The .bss section consists only of a header that specifies how much space is to be reserved at execution time. The “named sections”, if present, may be used for purposes such as functions that the program employs. The name of the section would then normally be the same as that of the function.

Section Headers. The fields involved in the section headers include: the section name (if the name has 8 characters or fewer, it is contained in the header; otherwise it is stored in the String table, which corresponds to our ID_S, and the name field of the section header then contains a pointer to its offset there); the section’s virtual address (i.e. offset within the object module itself); the section’s physical address (i.e. the offset from the start of the program that it will have at execution time); the size of the section; a pointer to the section’s raw data; a pointer to the corresponding relocation entries; a specification of whether the section contains executable code, initialized data, or uninitialized data; and a specification of whether the section may or may not be read, written, or executed.
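To make the section header fields concrete, here is a minimal Python sketch that decodes one header. It assumes the 40-byte little-endian layout and the flag values of the Microsoft specification as we understand them (in particular the "/" followed by a decimal offset convention for long names); the helper name parse_section_header and the sample values are invented for illustration.

```python
import struct

# Assumed layout (little-endian, 40 bytes): name, virtual size, virtual
# address, raw-data size, raw-data pointer, relocation pointer, line-number
# pointer, relocation count, line-number count, characteristics flags.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

# Characteristics bits: what the section contains and how it may be accessed.
CNT_CODE = 0x00000020
CNT_INITIALIZED_DATA = 0x00000040
CNT_UNINITIALIZED_DATA = 0x00000080
MEM_EXECUTE = 0x20000000
MEM_READ = 0x40000000
MEM_WRITE = 0x80000000


def parse_section_header(raw, string_table=b""):
    """Decode one section header into a dict (hypothetical helper)."""
    (name, _vsize, _vaddr, raw_size, raw_ptr, reloc_ptr,
     _line_ptr, reloc_count, _line_count, flags) = SECTION_HEADER.unpack(raw)
    name = name.rstrip(b"\0")
    if name.startswith(b"/"):
        # Long name: "/" plus a decimal offset into the string table.
        offset = int(name[1:].decode("ascii"))
        name = string_table[offset:string_table.index(b"\0", offset)]
    return {
        "name": name.decode("ascii"),
        "raw_size": raw_size,
        "raw_ptr": raw_ptr,
        "reloc_ptr": reloc_ptr,
        "reloc_count": reloc_count,
        "is_code": bool(flags & CNT_CODE),
        "is_bss": bool(flags & CNT_UNINITIALIZED_DATA),
        "readable": bool(flags & MEM_READ),
        "writable": bool(flags & MEM_WRITE),
        "executable": bool(flags & MEM_EXECUTE),
    }


# Made-up .text header: 16 bytes of code at file offset 0x64, 2 fixups at 0x74.
text_raw = SECTION_HEADER.pack(b".text", 0, 0, 16, 0x64, 0x74, 0, 2, 0,
                               CNT_CODE | MEM_EXECUTE | MEM_READ)
text = parse_section_header(text_raw)

# Made-up long section name, stored at offset 4 of a string table.
long_name = b".debug_info\0"
strings = struct.pack("<I", 4 + len(long_name)) + long_name
debug_raw = SECTION_HEADER.pack(b"/4", 0, 0, 0, 0, 0, 0, 0, 0,
                                CNT_INITIALIZED_DATA | MEM_READ)
debug = parse_section_header(debug_raw, strings)
```

Packing the sample headers with the same struct keeps the sketch self-contained; with a real object module the 40 bytes would be read from the file instead.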

THE FILE HEADER The fields involved in the file header include: a number identifying the target machine, e.g. those employing the 386 or later Pentium, or various machines produced by Hitachi, Mitsubishi, etc.; a time and date stamp, indicating when the file was created; the number of section headers; and a pointer to the symbol table’s starting address.
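The file header can be decoded in the same way. The sketch below assumes the 20-byte little-endian layout of the Microsoft specification as we understand it, with the time stamp counted in seconds from January 1, 1970 (UTC); parse_file_header, the two-entry machine table, and the sample values are illustrative only.

```python
import struct
from datetime import datetime, timezone

# Assumed layout (little-endian, 20 bytes): machine, number of sections,
# time/date stamp, symbol table pointer, number of symbols,
# optional-header size, characteristics.
FILE_HEADER = struct.Struct("<HHIIIHH")

# Only two of the many machine codes, for illustration.
MACHINE_NAMES = {0x014C: "Intel 386 or later", 0x8664: "x64"}


def parse_file_header(raw):
    """Decode the file header at the start of an object module."""
    (machine, n_sections, stamp, symtab_ptr,
     n_symbols, opt_size, _flags) = FILE_HEADER.unpack(raw[:FILE_HEADER.size])
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "sections": n_sections,
        "created": datetime.fromtimestamp(stamp, tz=timezone.utc),
        "symtab_ptr": symtab_ptr,
        "symbols": n_symbols,
        "optional_header_size": opt_size,
    }


# Made-up header: 3 sections, 10 symbol entries starting at file offset 0x200.
sample = FILE_HEADER.pack(0x014C, 3, 0, 0x200, 10, 0, 0)
header = parse_file_header(sample)
```

In an object module the optional-header size is normally zero, since that header is added by the linker.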

THE SYMBOL TABLE The symbol table entries are each 18 bytes long, and include: the name of the symbol (the same scheme is employed as described above for section header names, i.e. if the name is longer than 8 bytes it is stored in the string table, and a pointer to it is employed instead); the section the item is defined in; its offset within that section; and its storage class, e.g. whether it is external, static, or is a function.

Some of the entries, such as those for functions, require more than the 18 bytes an entry provides for its information. In such cases, the main entry for the name is followed by an additional entry (referred to as an auxiliary entry).
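As an illustration of how a linker might walk the symbol table, the following Python sketch reads 18-byte entries, follows long names into the string table, and skips auxiliary entries. The field order, the convention that a name field beginning with four zero bytes holds a string table offset, and the storage class values are taken from the Microsoft specification as we understand it; read_symbols and the sample data are hypothetical.

```python
import struct

# Assumed layout (little-endian, 18 bytes): name, value (offset within the
# section), section number, type, storage class, number of auxiliary entries.
SYMBOL = struct.Struct("<8sIhHBB")

CLASS_EXTERNAL, CLASS_STATIC, CLASS_FUNCTION = 2, 3, 101


def read_symbols(table, count, string_table):
    """Return (name, offset, section, storage_class) for each main entry."""
    symbols = []
    index = 0
    while index < count:
        raw = table[index * SYMBOL.size:(index + 1) * SYMBOL.size]
        name, value, section, _type, storage, n_aux = SYMBOL.unpack(raw)
        if name[:4] == b"\0\0\0\0":
            # Long name: the last 4 bytes are an offset into the string table.
            (offset,) = struct.unpack("<I", name[4:])
            name = string_table[offset:string_table.index(b"\0", offset)]
        else:
            name = name.rstrip(b"\0")
        symbols.append((name.decode("ascii"), value, section, storage))
        index += 1 + n_aux  # step over any auxiliary entries
    return symbols


# Made-up data: one short static name, one long external name with one
# auxiliary entry (left as zeros here).
body = b"a_rather_long_name\0"
strings = struct.pack("<I", 4 + len(body)) + body
table = (
    SYMBOL.pack(b"num", 0, 2, 0, CLASS_STATIC, 0)
    + SYMBOL.pack(struct.pack("<II", 0, 4), 0x10, 1, 0x20, CLASS_EXTERNAL, 1)
    + bytes(SYMBOL.size)
)
symbols = read_symbols(table, 3, strings)
```

Note that the entry count includes the auxiliary entries, which is why the loop advances by 1 + n_aux rather than by 1.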

THE STRING TABLE As mentioned, this corresponds to our ID_S. It starts off with 4 bytes specifying its length. This is followed by null-terminated strings, in general representing names.
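From the compiler writer's side, the string table has to be built rather than read. Here is a minimal sketch, assuming (as in the Microsoft specification, to our understanding) that the 4-byte length counts the length field itself, so that the first name sits at offset 4. The StringTable class and lookup function are invented names.

```python
import struct


class StringTable:
    """Accumulates null-terminated names and hands back their offsets."""

    def __init__(self):
        self._body = bytearray()

    def add(self, name):
        # Offsets are measured from the start of the table, and the first
        # 4 bytes are taken up by the length field.
        offset = 4 + len(self._body)
        self._body += name.encode("ascii") + b"\0"
        return offset

    def to_bytes(self):
        # The length field is assumed to include its own 4 bytes.
        return struct.pack("<I", 4 + len(self._body)) + bytes(self._body)


def lookup(table, offset):
    """Fetch the null-terminated name stored at the given offset."""
    return table[offset:table.index(b"\0", offset)].decode("ascii")


strtab = StringTable()
first = strtab.add("first_long_name")
second = strtab.add("second_long_name")
encoded = strtab.to_bytes()
```

The offsets returned by add are what would be written into the name field of a symbol table entry or section header whose name exceeds 8 bytes.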

Note that the segdef, pubdef, and extdef records we have been using are replaced by entries in the symbol table and the string table.

THE PE MODULE FORMAT As mentioned, the compiler writer, in the case where the target is not an intermediate language, is concerned with producing the object module input to the linker. He or she is not directly involved with the PE module that the linker produces. Let us, however, look at the main features of the PE format.

Here is a diagram of its structure

The components the linker has added to the COFF format are: (a) the DOS stub (b) the optional file header (c) the data directories

THE DOS STUB The purpose of the DOS stub is to detect when an attempt is made to execute the program under DOS, and then issue an error message such as: This program can only be run under Windows

THE OPTIONAL FILE HEADER The loader needs to be able to relocate the program in the case where it is unable to load it into the base location employed by the linker. Some of the items listed on the next slide are included for this purpose.

The information the optional file header contains includes: (a) the amount of memory space that will be occupied by executable code, initialized data, and uninitialized data (b) the offsets from the beginning of the program where the above items will be located in memory (c) the offset from the beginning of the program of its entry point

(d) the amount of space needed for the stack (e) the amount of space needed for the heap (f) the alignment of the sections. The default is at an address divisible by 512, but any power of 2 up to 64K can be used. (g) the offsets within the module of the data directories and their sizes.
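The alignment rule in item (f) amounts to rounding each section's starting offset up to the next multiple of a power of 2. A small sketch of that arithmetic follows; align_up is an illustrative name, not something taken from the specification.

```python
def align_up(offset, alignment=512):
    """Round offset up to the next multiple of alignment (a power of 2)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of 2")
    # Adding alignment - 1 and clearing the low bits rounds upwards.
    return (offset + alignment - 1) & ~(alignment - 1)


# A section that would otherwise start at offset 513 is moved to 1024.
next_start = align_up(513)
```

The bit mask works only because the alignment is a power of 2, which is why the check at the top rejects other values.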

THE DATA DIRECTORIES These include: (a) the Export Table (b) the Import Table (c) the Resource Table (d) the Base Relocation Table

The Export Table is employed mainly by DLLs to supply the entry points of the various functions they provide. The Import Table is used by programs to supply the external references that the linker was unable to resolve, usually those to DLL functions. Note that the location of the DLL functions may change from one load and execution of the program to another.

The unresolved calls in the memory image to such external routines are not directly fixed up. They are instead replaced by the linker with calls through a table of external addresses which the loader fills in. The Pentium has a call indirect instruction for this purpose.
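The idea of calling through a loader-filled table can be modelled in a few lines of Python. This is only a conceptual simulation: function objects stand in for machine addresses, and the DLL name, the function names, and loader_fill are all invented for the example.

```python
def loader_fill(imports, loaded_dlls):
    """Build the table of external addresses at load time.

    imports: list of (dll_name, function_name) pairs from the import table.
    loaded_dlls: maps each DLL name to the functions it exports.
    """
    return [loaded_dlls[dll][func] for dll, func in imports]


def program(import_table):
    # The linker has turned each external call into an indirect call
    # through a fixed slot; the code itself never changes between runs.
    return import_table[0](2, 3) + import_table[1](4)


# Hypothetical DLL with two exported functions.
mathlib = {"add": lambda a, b: a + b, "square": lambda x: x * x}
imports = [("mathlib.dll", "add"), ("mathlib.dll", "square")]

table = loader_fill(imports, {"mathlib.dll": mathlib})
result = program(table)
```

Only the table contents depend on where the DLL was loaded, so the loader has a single place to patch rather than every call site.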

The Resource Table contains information about resources the program employs, such as dialog boxes, menus, icons, etc. The Base Relocation Table replaces the COFF version, as much of the relocation and linking involved has already been carried out by the linker.
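What remains for the loader is simple: if the image cannot be placed at the base address the linker assumed, every absolute address listed for relocation is adjusted by the difference. The sketch below shows that step under simplifying assumptions: fixups are given as a plain list of offsets to 32-bit little-endian words, whereas the real table groups its entries into blocks with type codes. The function name rebase and all addresses are made up.

```python
import struct


def rebase(image, fixup_offsets, linked_base, actual_base):
    """Return a copy of image with each listed 32-bit address adjusted."""
    delta = actual_base - linked_base
    patched = bytearray(image)
    for offset in fixup_offsets:
        (address,) = struct.unpack_from("<I", patched, offset)
        # Wrap to 32 bits, as the stored addresses are 32-bit words.
        struct.pack_into("<I", patched, offset, (address + delta) & 0xFFFFFFFF)
    return bytes(patched)


# Made-up image: two absolute addresses separated by four bytes of code.
image = struct.pack("<I", 0x00401000) + b"\x90" * 4 + struct.pack("<I", 0x00402010)
moved = rebase(image, [0, 8], linked_base=0x00400000, actual_base=0x10000000)
first_address, second_address = struct.unpack_from("<I", moved, 0)[0], struct.unpack_from("<I", moved, 8)[0]
```

When the loader succeeds in using the linker's base address, the difference is zero and the table need not be consulted at all.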

SOURCES 1. Microsoft Portable Executable and Common Object File Format Specification, Revision 8.2, September 21, 2010. 2. Application Report SPRAA08, April 2009, Texas Instruments.