Copyright 2013 – Noah Mendelsohn Compiling C Programs Noah Mendelsohn Tufts University Web:

Presentation transcript:

Copyright 2013 – Noah Mendelsohn Compiling C Programs Noah Mendelsohn Tufts University Web: COMP 40: Machine Structure and Assembly Language Programming (Fall 2014)

© 2010 Noah Mendelsohn
Today
• Much of this material is well covered in the course lecture notes
• Here we just present a few diagrams and samples

How do we get from source to executable program?

Executable files
• Executable file:
  - A single file with all code ready to run at a fixed address in memory
  - Typically the same address for all programs
• Requirements
  - Code divided into multiple source files (.c files and .h files)
  - Functions in shared .c files need to show up in lots of executables
  - Often we want to share only the compiled versions (.o files) [you don't have the source for printf() but you use it all the time]
• The challenge
  - In different executables using the same shared code…
  - … the same functions and global variables may wind up at different addresses …
  - … but we still need to make references work across source files

Resolving external references

two_plus_one.c:
    #include <stdio.h>
    int main(int argc, char *argv[])
    {
        printf("The sum is %d\n", sum(1,2));
    }

arith.c:
    int sum(int a, int b)
    {
        return a+b;
    }

two_plus_one (executable): contains both the call to sum(1,2) and the code for sum()

How do we know where sum() wound up?

From source code to executable (simplified)
two_plus_one.c (defines main(), calls sum())
    gcc -c two_plus_one.c  ->  two_plus_one.o: relocatable object code for main()
arith.c (defines sum())
    gcc -c arith.c  ->  arith.o: relocatable object code for sum()

From source code to executable (simplified), continued
    gcc -c two_plus_one.c  ->  two_plus_one.o: relocatable object code for main()
    gcc -c arith.c  ->  arith.o: relocatable object code for sum()
Relocatable .o files
• Contain machine code
• References within the file are resolved
• References to external files are not resolved
• Some address fields may need adjusting later, depending on the final location in the executable program
• Include lists of:
  1) Names and addresses of defined externals
  2) Names and referents of things needing relocation
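To make the two lists above concrete, here is a toy sketch of the patching a linker performs. It is not how ld is implemented and it does not read a real object-file format: the struct names, the 32-bit address width, and apply_relocations() are all assumptions invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-simplified stand-ins for what a .o file records. */
struct defined_symbol {
    const char *name;     /* e.g. "sum" */
    uint32_t address;     /* final address chosen by the linker */
};

struct relocation {
    const char *name;     /* symbol the code refers to */
    size_t patch_offset;  /* where in the code bytes the address goes */
};

/* Return a pointer to the symbol called `name`, or NULL if undefined. */
static const struct defined_symbol *
find_symbol(const struct defined_symbol *syms, size_t nsyms, const char *name)
{
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(syms[i].name, name) == 0)
            return &syms[i];
    return NULL;
}

/* Write each referenced symbol's address into `code`.
 * Returns the number of references that could not be patched
 * (undefined symbol, or a patch offset outside the code). */
int apply_relocations(uint8_t *code, size_t code_len,
                      const struct relocation *relocs, size_t nrelocs,
                      const struct defined_symbol *syms, size_t nsyms)
{
    int unresolved = 0;
    for (size_t i = 0; i < nrelocs; i++) {
        const struct defined_symbol *sym =
            find_symbol(syms, nsyms, relocs[i].name);
        size_t width = sizeof sym->address;
        if (sym == NULL || code_len < width ||
            relocs[i].patch_offset > code_len - width) {
            unresolved++;          /* a real linker would report an error */
            continue;
        }
        memcpy(code + relocs[i].patch_offset, &sym->address, width);
    }
    return unresolved;
}
```

Calling apply_relocations() with a symbol table that defines only "sum" and relocations for both "sum" and "printf" patches the first reference and reports one unresolved reference, which mirrors what happens when a needed library is left off the link line.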

Linking .o files to create executable
    gcc -o two_plus_one two_plus_one.o arith.o
two_plus_one.o (relocatable object code for main()) + arith.o (relocatable object code for sum())  ->  two_plus_one (executable program)

Linking .o files to create executable (continued)
    gcc -o two_plus_one two_plus_one.o arith.o
gcc actually runs a program named "ld" to create the executable.

Linking .o files to create executable (continued)
To create the executable:
• Code from all .o files is collected in one executable
• A fixed load address is assumed
• All references are resolved: code & vars updated

Linking .o files to create executable (continued)
The executable contains all the code, with references resolved, loadable at a fixed address. It is ready to be invoked using the exec family of system calls or from the command line [which uses exec()].
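As a rough illustration of what "invoked using the exec family" means, the sketch below does approximately what a shell does when you type a program name: it forks a child, replaces the child with the executable, and waits for it. It assumes a POSIX system; run_program() is a hypothetical helper name, not something from the slides.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the executable at `path` with argument vector `argv`.
 * Returns the program's exit status, or -1 if it could not be
 * started or did not exit normally. */
int run_program(const char *path, char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        execv(path, argv);   /* on success this call never returns */
        _exit(127);          /* reached only if execv failed */
    }

    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing "/bin/sh" with the arguments "sh", "-c", "exit 3" (NULL-terminated) should return 3 on a typical Unix system. The linked two_plus_one executable could be started the same way by giving its path.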

Linking .o files to create executable (continued)
The default name for an executable is a.out, so programmers sometimes informally refer to any executable as an "a.out".

We left out two important steps!

Preprocessor

    #include <stdio.h>
    #define TWO 2
    int main(int argc, char *argv[])
    {
        printf("The sum is %d\n", sum(1,TWO));
    }

Before the compiler even sees the code…
…the preprocessor rewrites it, handling all #define, #include, #ifdef and macro substitution…
These directives are gone before the compiler sees the code.
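A small sketch of the substitutions described above. TWO comes from the slide; SQUARE, LOG, VERBOSE and sum_with_macros() are hypothetical names added only for illustration.

```c
#include <stdio.h>

#define TWO 2                      /* object-like macro, as on the slide */
#define SQUARE(x) ((x) * (x))      /* function-like macro (hypothetical) */

/* VERBOSE is a hypothetical switch: compiling with -DVERBOSE keeps the
 * logging call, otherwise the preprocessor replaces it with a no-op. */
#ifdef VERBOSE
#define LOG(msg) puts(msg)
#else
#define LOG(msg) ((void)0)
#endif

int sum(int a, int b)
{
    return a + b;
}

int sum_with_macros(void)
{
    LOG("computing sum(1, TWO) + SQUARE(TWO)");
    /* After preprocessing the compiler sees: sum(1, 2) + ((2) * (2)) */
    return sum(1, TWO) + SQUARE(TWO);
}
```

Running gcc -E on a file like this prints the rewritten source, so you can check that no #define or #ifdef lines survive and that every use of TWO has become 2.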

Preprocessor used for sharing declarations

two_plus_one.c:
    #include <stdio.h>
    #include "arith.h"
    int main(int argc, char *argv[])
    {
        printf("The sum is %d\n", sum(1,2));
    }

arith.c:
    #include "arith.h"
    int sum(int a, int b)
    {
        return a+b;
    }

arith.h:
    int sum(int a, int b);

Caller and callee agree on the function prototype for sum()
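The three files above can be pictured as one stream of text once the preprocessor has pasted arith.h into each .c file. The sketch below merges them into a single translation unit purely so that it compiles on its own; in a real build they stay separate. The include guard (ARITH_H) and the call_sum() wrapper are additions for illustration and do not appear on the slide.

```c
/* --- text pasted in by #include "arith.h" ------------------------- */
#ifndef ARITH_H               /* include guard: an addition, not on the slide */
#define ARITH_H
int sum(int a, int b);        /* prototype only: tells the compiler the types */
#endif

/* --- caller side (the role played by two_plus_one.c) -------------- */
int call_sum(void)
{
    /* The compiler checks this call against the prototype above;
     * it does not need to have seen the body of sum() yet. */
    return sum(1, 2);
}

/* --- callee side (the role played by arith.c) --------------------- */
int sum(int a, int b)         /* must match the prototype, or the compiler complains */
{
    return a + b;
}
```

Because both sides see the same prototype, changing the definition to take a different parameter list would be caught at compile time instead of producing a silently wrong call.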

We also left out the assembler step
• The object code in a .o is binary (not human-readable)
• Assembly language is a human-readable form of machine code
  - Symbolic names for machine instructions
  - Symbolic labels for addresses (like variables and branch targets in code)
  - Etc.
• When you run gcc -c it actually does three steps:
  - Run the preprocessor
  - Run the compiler itself to create an assembler file
  - Run the assembler to create a .o
• Normally, we do these steps together, but you can use switches to run them separately

Common invocations of gcc
gcc -c two_plus_two.c
    Runs preprocessor, compiler & assembler to make two_plus_two.o
gcc -c arith.c
    Same: makes arith.o
gcc -o two_plus_two two_plus_two.o arith.o
    Uses ld to link .o files + system libraries to make the two_plus_two executable
gcc -E two_plus_two.c
    Runs just the preprocessor
gcc -S two_plus_two.c
    Runs just the preprocessor & compiler, produces assembler in a .s file
gcc -c two_plus_two.s
    Notices the .s extension, runs the assembler

Putting it All Together

Compiling a program
For each source file (the one containing main() and the one containing sum()):
    source  ->  Preprocessor (cpp)  ->  preprocessed source  ->  Compiler (cc1)  ->  assembler source  ->  Assembler (as)  ->  .o file
Then, for both .o files together:
    .o files  ->  Linker (ld)  ->  two_plus_two (executable)

Shared Libraries (not required for COMP 40)
(These slides on shared libraries were used in COMP 111… you may find them interesting to read.)

Ooops! Where does printf come from?
    gcc -o two_plus_one two_plus_one.o arith.o libc.a
two_plus_one.o (relocatable object code for main()) + arith.o (relocatable object code for sum()) + libc.a  ->  two_plus_one (executable program)
Routines like printf live in libraries.

Ooops! Where does printf come from? (continued)
Libraries are created with the "ar" command, which packages up several .o files together into a ".a" archive or library. You can list the .a along with your separate .o files and ld will pull from it any .o files it needs.

Ooops! Where does printf come from? (continued)
    gcc -o two_plus_one two_plus_one.o arith.o
printf used to live in the system library named libc.a, which the compiler links automatically into the executable (so you don't have to list it).

Why shared libraries?
• Problem: if printf is linked from libc.a, then we get a separate copy in each program that uses printf
• Idea: what if we could have one copy and use memory mapping to put it into every executable that needs it?
• Challenges:
  - We can't link it when ld builds the rest of the executable: we can just note we need it
  - The same copy is likely to be mapped at different addresses in different programs

Why shared libraries? (continued)
• Solution: compiler, linker and OS work together to support shared libraries
  - gcc -c -fPIC printf.c  ->  generates "position-independent code" that can load at any address
  - gcc -shared -o libc.so printf.o xxx.o obj3.o  ->  creates shared library
  - gcc -o two_plus_one two_plus_one.o arith.o libc.so
We'll use printf as an example even though it's built in to the system…
Compile the source with -fPIC (plus -c, to stop before linking) to make a position-independent .o file.

Why shared libraries? (continued)
    gcc -shared -o libc.so printf.o xxx.o obj3.o
Link printf.o and any other files with the -shared option to create a shared library (.so) file.

Why shared libraries? (continued)
    gcc -o two_plus_one two_plus_one.o arith.o libc.so
The linker recognizes .so files… instead of including the code, it leaves a little stub that tells the OS to find and map the shared copy of the .so file when exec loads the program. (Actually, libc.so is so widely used that it's automatically linked, so you don't need to list it as you would your own .so libraries.)

Memory mapping allows sharing of .so libraries
[Diagram: a CPU and main memory holding the operating system plus three programs: Angry Birds, Play Video and Browser. The Angry Birds and Browser address spaces are each laid out as: Stack (call stack), Text (code), Static initialized (data), Static uninitialized (data), Heap (malloc'd), argv/environ, and libc.so.]
libc.so (with printf code) shows up at different locations in the two programs

Memory mapping allows sharing of .so libraries (continued)
[Diagram: the same memory layout for Angry Birds and Browser, with a single libc.so in main memory.]
Only one copy lives in memory… everyone shares it!

Memory mapping allows sharing of .so libraries (continued)
[Diagram: the same memory layout for Angry Birds and Browser, with a single libc.so in main memory.]
Memory mapping hardware can do this… Code must be position-independent!
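The idea that one copy of some bytes can appear at different addresses can be tried out with mmap(). The sketch below is only an analogy: it maps one temporary file twice within a single process, whereas the slides describe one library mapped into two different processes. It assumes a POSIX system, and same_bytes_at_two_addresses() is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Map one temporary file twice. Returns 1 if the two mappings sit at
 * different addresses yet show identical bytes, 0 if not, -1 on error. */
int same_bytes_at_two_addresses(void)
{
    static const char payload[] = "pretend these bytes are libc.so code";
    const size_t len = sizeof payload;
    int result = -1;

    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    if (fwrite(payload, 1, len, tmp) == len && fflush(tmp) == 0) {
        int fd = fileno(tmp);
        void *first = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        void *second = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

        /* Two views of the same file contents, at two virtual addresses. */
        if (first != MAP_FAILED && second != MAP_FAILED)
            result = first != second && memcmp(first, second, len) == 0;

        if (first != MAP_FAILED)
            munmap(first, len);
        if (second != MAP_FAILED)
            munmap(second, len);
    }

    fclose(tmp);
    return result;
}
```

Since the same bytes are visible at two different addresses, any code stored in them could not rely on absolute addresses, which is the reason shared library code must be position-independent.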