CSE451 Linking and Loading Autumn 2002 Gary Kimura Lecture #21 December 9, 2002
Today’s Topic
How do programs actually get loaded into memory?
The Windows executable image format
From source to execution
– A programmer writes a source file (helloworld.c)
– A compiler then translates it into an object module (helloworld.obj)
– The linker combines various object modules into an executable image (helloworld.exe)
– The loader does the final work of getting the image executing on the system
But what does a “.obj” or a “.exe” file really contain?
First, a little theory, then the real stuff
Three ways a program can get loaded
– Absolute loading – Load program at the same address (virtual and/or physical) every time
– Relocatable loading – Load program at different addresses based on what is available
– Dynamic run-time loading – Load and reload the program at different addresses while the program is running
Address Binding
– Where a symbolic label/name is translated (bound) to an actual address
– The actual binding can be specified in the program, or resolved at compile time, link time, load time, or run time.
COFF and PE Files
Common Object File Format (COFF)
Portable Executable (PE) File Format †
We are going to concentrate on the PE file format for executable images. Roughly the same format is used for object modules and dynamic link libraries.
The PE file closely resembles what is needed in memory to run the program.
The PE file itself is divided into various sections representing code, data, etc.
† There are many more formats, such as ELF.
Overall PE File Mapping Copyright © 2001 Microsoft Corporation, One Microsoft Way, Redmond, Washington U.S.A. All rights reserved.
Relative Virtual Addresses
Some important things to note:
– The address where the program runs is not equal to the file offset where code is stored in the PE file.
– The address where the program runs may not be known at link time.
– So any addresses stored in the code by the linker all need to be relative.
A Relative Virtual Address (RVA) is an offset in memory relative to where the PE file is loaded.
We’ll see an example of this later with base relocation fixups.
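The two conversions implied above (RVA to run-time address, and RVA to file offset) can be sketched as follows. This is only an illustration: the Section tuple, its field names, and the sample numbers are made up for the example, not taken from a real image.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    """Hypothetical, simplified view of one section table entry."""
    name: str
    virtual_address: int  # RVA where the section starts in memory
    virtual_size: int     # bytes the section occupies in memory
    raw_pointer: int      # file offset where the section's bytes start
    raw_size: int         # bytes actually stored in the file


def rva_to_va(load_address: int, rva: int) -> int:
    """An RVA becomes a real address once the load address is known."""
    return load_address + rva


def rva_to_file_offset(sections: Sequence[Section], rva: int) -> Optional[int]:
    """Find the section containing the RVA and translate into the file."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            delta = rva - s.virtual_address
            if delta >= s.raw_size:
                return None  # in memory only (zero filled), no bytes in the file
            return s.raw_pointer + delta
    return None  # the RVA is not inside any section


# Made-up layout for illustration.
sections = [
    Section(".text", 0x1000, 0x800, 0x400, 0x800),
    Section(".data", 0x2000, 0x1000, 0xC00, 0x200),
]
```

With this layout, RVA 0x1010 sits at file offset 0x410 but runs at 0x401010 if the image happens to load at 0x400000, which is why the two numbers must not be confused.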
The PE File
The PE File starts with a DOS Header
– Signature (“MZ”)
– Offset to the PE Header
Followed by a PE Header
– Machine type
– Number of sections
– Timestamp
– Data Directory (table of where in the image the export, import, resource, exception, security, base relocation, debug, etc. data is stored)
Followed by a Section Table
– Named list of the sections in the PE file (.text, .data, .rdata, .idata, .edata, .rsrc, .reloc, etc.)
Followed by the sections
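A minimal sketch of walking this layout with Python's struct module. The field offsets follow the published PE/COFF layout (offset to the PE header stored at 0x3C, a 20-byte file header, 40-byte section table entries); the helper names and the toy image are made up for illustration, and the toy image has no optional header or data directory, so it is not a loadable file.

```python
import struct


def parse_pe_headers(data: bytes) -> dict:
    """Read the DOS header, the PE file header, and the section table."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # offset to the PE header
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine, n_sections, timestamp, _symtab, _nsyms,
     optional_size, _flags) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    # The section table follows the 20-byte file header and the optional header.
    table = pe_offset + 4 + 20 + optional_size
    sections = []
    for i in range(n_sections):
        name, vsize, rva, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i)
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii"),
            "rva": rva, "virtual_size": vsize,
            "file_offset": raw_ptr, "raw_size": raw_size,
        })
    return {"machine": machine, "timestamp": timestamp, "sections": sections}


def build_toy_image() -> bytes:
    """Hypothetical header-only image with a single .text section entry."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE header right after the DOS header
    file_header = struct.pack("<HHIIIHH", 0x014C, 1, 0x3B7DDFD8, 0, 0, 0, 0)
    text = struct.pack("<8sIIIIIIHHI", b".text",
                       0x10, 0x1000, 0x200, 0x200, 0, 0, 0, 0, 0x60000020)
    return bytes(dos) + b"PE\0\0" + file_header + text


headers = parse_pe_headers(build_toy_image())
```

The machine value 0x014C is the code for Intel 386; the other numbers in the toy image are arbitrary.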
The usual suspects
.text
– Executable code
.data
– Read/write initialized data
.rdata
– Read only data
Note that the linker combines text and data from various object modules to form the executable image.
Compilers can append “$…” to the end of the names to dictate the ordering within a section. For example, “.text$X” is before “.text$Y” in the .text section.
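The “$” ordering rule can be sketched roughly like this. It assumes contributions are ordered by a plain string sort on the part after “$”, with un-suffixed contributions first; merge_sections and the byte strings are hypothetical stand-ins for what a linker does with real object modules.

```python
def merge_sections(contributions):
    """Combine (section name, bytes) pieces from several object modules.

    Pieces are grouped by the name before "$" and ordered by the text
    after it, so ".text$X" lands before ".text$Y" in the final .text.
    """
    groups = {}
    for name, payload in contributions:
        base, _, suffix = name.partition("$")
        groups.setdefault(base, []).append((suffix, payload))
    merged = {}
    for base, parts in groups.items():
        parts.sort(key=lambda part: part[0])  # stable: ties keep input order
        merged[base] = b"".join(payload for _, payload in parts)
    return merged


image = merge_sections([
    (".text$Y", b"BBBB"),  # arrives first but sorts second
    (".text$X", b"AAAA"),
    (".data", b"DD"),
])
```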
Exporting names and ordinals
To run an image that requires calling a dll, the loader needs to be able to find the entry points into the dll.
Conceptually, associated with a dll is a list of addresses (RVAs) that other modules can call.
Each exported entry point is assigned a unique ordinal value.
The module that wants to call an entry point then only needs to know the dll’s name and the ordinal value.
However, we as programmers really know the name and not the ordinal value that gets assigned by the linker.
The export table saves us by specifying ordinal values and translating names to their ordinal values.
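A rough sketch of the two lookups the export table supports: ordinal to address, and name to ordinal. The ExportTable class is a hypothetical simplification; the ordinal base of 1 and the RVA used for AddAtomA are placeholders, not values read from the real KERNEL32.dll.

```python
import bisect


class ExportTable:
    def __init__(self, ordinal_base, functions, names, name_indexes):
        self.ordinal_base = ordinal_base
        self.functions = functions        # RVAs, indexed by ordinal - ordinal_base
        self.names = names                # exported names, kept sorted
        self.name_indexes = name_indexes  # names[i] -> index into functions

    def rva_by_ordinal(self, ordinal):
        index = ordinal - self.ordinal_base
        if not 0 <= index < len(self.functions):
            raise KeyError(ordinal)
        return self.functions[index]

    def ordinal_by_name(self, name):
        # The names are sorted, so a binary search finds the entry.
        i = bisect.bisect_left(self.names, name)
        if i == len(self.names) or self.names[i] != name:
            raise KeyError(name)
        return self.name_indexes[i] + self.ordinal_base

    def rva_by_name(self, name):
        return self.rva_by_ordinal(self.ordinal_by_name(name))


# Placeholder data: only the first RVA echoes the dump in these slides.
kernel32 = ExportTable(
    ordinal_base=1,
    functions=[0x00012ADA, 0x00020000],
    names=["ActivateActCtx", "AddAtomA"],
    name_indexes=[0, 1],
)
```

A caller that knows only the name pays for a search through the names; a caller that knows the ordinal indexes straight into the address list.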
The .edata section (what I export)
Kernel32 Exports
exports table:
Name: KERNEL32.dll
Characteristics:
TimeDateStamp: 3B7DDFD8 -> Fri Aug 17 23:24:
Version: 0.00
Ordinal base:
# of functions: A0
# of Names: A0
Entry Pt  Ordn  Name
00012ADA  1     ActivateActCtx
C2        2     AddAtomA
remainder of exports omitted
Function calls
Consider these three ways to call the function AddAtomA
1. call AddAtomA
2. call PTR [0x1234]
3. call 0x
   0x67890: call PTR [0x1234]
(where 0x1234 contains the address of AddAtomA)
But compilers usually output call 0x and expect the linker to put in the correct address for the function, imported or not. So the linker is stuck using method #3.
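Method #3 can be mimicked in a few lines: every call site targets one stub, and the stub reads the real target out of a slot at call time. This is only an analogy in Python, not machine code; ImportSlots, make_stub, and the two add_atom functions are invented for the illustration, and 0x1234 is the slot address used on the slide.

```python
class ImportSlots:
    """Stand-in for the memory slots that hold imported function addresses."""

    def __init__(self):
        self.slots = {}  # slot address -> function, filled in by the "loader"

    def make_stub(self, slot_address):
        def stub(*args):
            # Jump through the slot: the target is read on every call.
            return self.slots[slot_address](*args)
        return stub


def add_atom_v1(name):
    return ("v1", name)


def add_atom_v2(name):
    return ("v2", name)


imports = ImportSlots()
call_add_atom = imports.make_stub(0x1234)  # "linker": call sites point at the stub
imports.slots[0x1234] = add_atom_v1        # "loader": patch the one slot
first = call_add_atom("hello")
imports.slots[0x1234] = add_atom_v2        # dll moved: only the slot changes
second = call_add_atom("hello")
```

The point of the indirection is that the code at the call sites never changes; only the one slot does.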
Importing functions
The PE contains a table of imported modules (identified by the imported dll name)
Each table entry identifies the module and lists the functions that need to be imported
There are three ways of naming the imported function
– Virtual address (nice if the dll never moves and the linker knows this address)
– Ordinal value (nice if the linker knows the ordinal value)
– Function name (refer back to how exports work)
This information is stored in two tables
– Import Address Table (IAT)
– Import Name Table (INT)
The IAT and INT
The IAT and INT are simply arrays of dwords (4 bytes each).
Each dword is either the function address (see earlier discussion on function calls), an ordinal value, or a pointer to the function name.
The loader changes the IAT values at load time to function addresses.
For faster image startup, images can be bound. Binding an image means resolving and overwriting the IAT in the actual PE file.
However, if the imported dll changes, the binding needs to be redone. The INT is used for this purpose: it is not overwritten, so it still holds the original names and ordinals.
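What the loader does with these two tables for one imported dll can be sketched as below. The sketch assumes the 32-bit PE convention that a dword with its high bit set is an ordinal and anything else points at a name; resolve_imports, the dictionaries, and every address are hypothetical, and the pre-bound case (dword already an address) is left out.

```python
ORDINAL_FLAG = 0x80000000  # 32-bit PE: high bit set means "import by ordinal"


def resolve_imports(int_entries, name_at, exports_by_name, exports_by_ordinal):
    """Return the IAT the loader would fill in for one imported dll.

    int_entries:        dwords read from the Import Name Table
    name_at:            maps a name pointer to the function name stored there
    exports_by_name:    the dll's exported name -> function address
    exports_by_ordinal: the dll's exported ordinal -> function address
    """
    iat = []
    for entry in int_entries:
        if entry & ORDINAL_FLAG:
            iat.append(exports_by_ordinal[entry & 0xFFFF])
        else:
            iat.append(exports_by_name[name_at[entry]])
    return iat


# Made-up tables: one import by name, one by ordinal.
int_entries = [0x3000, ORDINAL_FLAG | 2]
iat = resolve_imports(
    int_entries,
    name_at={0x3000: "ActivateActCtx"},
    exports_by_name={"ActivateActCtx": 0x77E12ADA},
    exports_by_ordinal={2: 0x77E20000},
)
```

Because int_entries is left untouched, the same resolution can be rerun if the dll changes, which is the role the INT plays for bound images.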
The .idata section (what I import)
Base Relocation
Each module has a preferred load address. However, the loader may not always be able to honor the request.
If the module is relocated, then the loader must fix up the addresses.
The .reloc section specifies each location that needs to be fixed if the module is moved.
Don’t do this too often because it is a big performance hit.
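The fixup itself is just an addition: every location listed in .reloc holds an address computed for the preferred base, so the loader adds the difference between the actual and preferred base. A minimal sketch, assuming 32-bit little-endian addresses and a .reloc section already decoded into a plain list of RVAs (the function name and sample values are made up):

```python
import struct


def apply_base_relocations(image, fixup_rvas, preferred_base, actual_base):
    """Patch each listed 32-bit address in place; return how many were patched.

    image is the module as laid out in memory, so an RVA is an index into it.
    """
    delta = actual_base - preferred_base
    if delta == 0:
        return 0  # loaded at the preferred address: nothing to fix
    patched = 0
    for rva in fixup_rvas:
        (value,) = struct.unpack_from("<I", image, rva)
        struct.pack_into("<I", image, rva, (value + delta) & 0xFFFFFFFF)
        patched += 1
    return patched


# Made-up module: one stored address, 0x00401000, located at RVA 4.
image = bytearray(16)
struct.pack_into("<I", image, 4, 0x00401000)
count = apply_base_relocations(image, [4], 0x00400000, 0x00500000)
(fixed,) = struct.unpack_from("<I", image, 4)
```

Each patched location is a write into the module's pages, which hints at why relocating is costly compared with loading at the preferred address.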
Other sections
.rsrc
– Resources for the image such as icons, bitmaps, etc.
– Organized like a file system
.debug…
– Debug information
– Was “coff” up to NT 4.0 and has moved on to “pdb” in Windows XP
Debugging Speaking of debugging…
Things to come
Wednesday we’ll wrap everything up
Final is on Tuesday, December 17th at 2:30