C – Multi-file development and make CS/COE 0449 (term 2174) Jarrett Billingsley
Class announcements
Project 4… is a shell! You know, the thing you type commands into.
Project 3… Let's have a look at my implementation!
Maybe this will convince you of the power of functions. Yes, even if you only use it once.
Functions label your thoughts. Functions separate the "what" from the "how." Use them.
We're in the final stretch here. Can you believe it's almost April???
Let's start it off with something a bit light.
3/21/2017 CS/COE 0449 term 2174
Multi-file compilation
The Old Ways
The C compiler was a single-pass compiler. It would read a source file and, for each line of code, it would output some machine code.
C calls this a translation unit: one source file produces one object file.
Multiple translation units are linked together to make one program.
    one.c   -> gcc -> one.o   \
    two.c   -> gcc -> two.o    -> ld -> executable
    three.c -> gcc -> three.o /
Every file's an island
Because the files are not actually linked together until after they're compiled, this leads to a weird situation: source files in C don't actually know anything about each other.
To prove this, let's look at main_island.c and sub_island.c.
Compile each with
    gcc -c <filename>
How on earth did main_island.c compile?
Now link them with
    gcc -o main *.o
It… works!
Now use
    nm *.o
What do you see?
Symbol tables, again!
The nm command shows the names (symbols) in an ELF file. Read the man page for info on what the letters mean!
A U is a hole: an undefined, or imported, symbol.
A T is a bump: an exported symbol (in the Text segment).
    main_island.o        sub_island.o
    U print_message      T print_message
    T main               U printf
That piece doesn't go there…
The compiler really has no idea about other files. Look at bad_sub.c. Uh oh…
Now do
    $ gcc -c bad_sub.c
    $ gcc -o bad main_island.o bad_sub.o
    $ ./bad
If you run it in gdb, what happens? (Try p print_message.)
Try running nm on bad_sub.o. D means Data!
See, the compiler and linker don't actually care. They assume you know what you're doing.
How can we prevent mistakes like this?
Header files
Okay, what are they REALLY for?
Typically, each compilation unit has an accompanying header file.
The header contains a compilation unit's public interface. This includes function prototypes, structs, enums, etc.
The source file includes its own header. It can also include headers of other compilation units.
This is how we indicate dependencies in a safer way!
    public:   one.h   two.h   three.h
    private:  one.c   two.c   three.c
static
At global scope, static is very similar to private in Java. static means "do not export this to other compilation units."
Let's put static before print_message in sub_island.c, compile, and see what happens when we try to link.
Have a look at what nm sub_island.o prints. Little t? This is a local symbol.
Any lowercase letter by a name means it's local; an uppercase letter means it's external (exported).
It's contained within sub_island.o and no one else can see it.
    sub_island.o
    U printf
    t print_message
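Here's a rough sketch of the difference between a local and an exported function. The real sub_island.c isn't shown here, so the message text, the helper message_text, and the int return type are all made up for illustration.

```c
#include <stdio.h>

/* static: local to this file. With "gcc -c" and no optimization,
   nm should list this with a lowercase 't'. (Hypothetical helper.) */
static const char *message_text(void) {
    return "Hello from sub_island!";
}

/* No static: exported. nm lists this with an uppercase 'T', and
   other object files can link against it. The return value is
   whatever printf returns (the number of characters printed). */
int print_message(void) {
    return printf("%s\n", message_text());
}
```

Another file can call print_message, but any attempt to call message_text from outside this file fails at link time, because the symbol never leaves the object file.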
extern
Leaving static off a function makes a "bump." extern makes a "hole."
On functions, extern has no effect at all.
The only time you need to use extern is on global variables that are shared across files.
Global variables are bad enough, but shared globals? NEVER. EVER. DO. THIS. OKAY?
I'm not even gonna teach you how to make them. That's how bad I think they are.
The general form of a header
Suppose you have users.h. Then you'd write:
    #ifndef _USERS_H_
    #define _USERS_H_
    // all the contents!
    #endif
The conditional compilation directives are called an include guard. This prevents the header's contents from being copied and pasted multiple times in a single compilation unit. What a world.
What goes in the header?
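To see the include guard earn its keep, here's a single-file sketch that pastes the same header text twice, the way two nested #includes would. The User struct and user_name_length are hypothetical stand-ins for "all the contents." The guard macro is spelled USERS_H here, on the assumption that you'd rather avoid names starting with an underscore and a capital letter, which C reserves for the implementation.

```c
#include <string.h>

/* ---- first copy of the (hypothetical) users.h ---- */
#ifndef USERS_H
#define USERS_H

typedef struct User {
    const char *name;
    int id;
} User;

size_t user_name_length(const User *u);

#endif /* USERS_H */

/* ---- second copy, as if some other header included users.h again ---- */
#ifndef USERS_H   /* USERS_H is already defined, so everything up to #endif is skipped */
#define USERS_H

/* Without the guard, this would be a "redefinition of struct User" error. */
typedef struct User {
    const char *name;
    int id;
} User;

size_t user_name_length(const User *u);

#endif /* USERS_H */

/* ---- what would live in users.c ---- */
size_t user_name_length(const User *u) {
    return strlen(u->name);
}
```

If you delete the #ifndef/#define/#endif lines from both copies, the compiler should complain about the second struct definition.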
Header dos and don'ts
Put this in the header:
- Exported function prototypes
- Struct and enum definitions
- #defines (constants, macros)
Don't put this in the header:
- Function definitions
- Static function prototypes
- Variables (ever!)
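As one possible illustration of the dos and don'ts, here's a header and its source file squashed into one listing so it compiles on its own. Everything in it (shapes.h, Shape, shape_area, clamp_size) is invented for the example; in a real project the second half would be a separate file that starts with #include "shapes.h".

```c
/* ---- would live in shapes.h (hypothetical) ---- */
#ifndef SHAPES_H
#define SHAPES_H

#define SHAPE_MIN_SIZE 0                 /* #define constant: header */

typedef enum {                           /* enum definition: header */
    SHAPE_SQUARE,
    SHAPE_RECT
} ShapeKind;

typedef struct {                         /* struct definition: header */
    ShapeKind kind;
    int width;
    int height;
} Shape;

int shape_area(const Shape *s);          /* exported prototype: header */

#endif /* SHAPES_H */

/* ---- would live in shapes.c (hypothetical) ---- */

/* Static helper: its prototype and definition stay out of the header. */
static int clamp_size(int v) {
    return v < SHAPE_MIN_SIZE ? SHAPE_MIN_SIZE : v;
}

/* Function definition: belongs in the .c file, not the header. */
int shape_area(const Shape *s) {
    int w = clamp_size(s->width);
    /* A square ignores its height field and uses the width twice. */
    int h = (s->kind == SHAPE_SQUARE) ? w : clamp_size(s->height);
    return w * h;
}
```

Notice there are no variables at file scope anywhere, in either half.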
Simple shell scripts
Who's tired of typing gcc -o blah blah…
EVERYONE IS!!!!!!!!
A shell script is a file containing a list of shell commands. (Okay, really it's a fully-featured programming language, but it's really terrible, so please use Python for shell scripting tasks instead.)
It's a text file whose name conventionally ends in .sh and contains the following:
    #! /bin/sh
    ...commands...
The first line is called the shebang, and it says which program to execute this script with. /bin/sh is almost always what you want.
Yeah, shebang. # is hash, ! is bang. ssssssshhhhhhhhebang.
# is used for comments in shell scripts.
build.sh
If you've got a very small program, maybe something as simple as this will be sufficient for building stuff:
    #! /bin/sh
    gcc -Wall -Werror -g -m32 -o myprogram file1.c file2.c
Once you create a file like this, you have to make it executable. Use the chmod command to change the mode of the file.
    $ chmod +x build.sh
If you run ls -l, the file's permissions on the left have changed.
Now you can run it like any other program!
    $ ./build.sh
make
Incremental compilation
Compilation and linking actually take time. Sometimes a lot.
Repeating compilation of unchanged files is a waste of time.
Incremental compilation only recompiles the sources which have changed, while reusing previously-compiled object files.
Say we only changed one.c…
    one.c -> gcc -> one.o    (rebuilt) \
                    two.o    (reused)   -> ld -> executable
                    three.o  (reused)  /
Build tools
To simplify incremental compilation and solve many other problems, we've come up with build tools.
make is the classic; others like cmake and scons exist, and many other languages have their own (Java has ant, Rust has cargo).
For example, say you want to compile…
- multiple versions (debug, release, 32-bit, 64-bit…)
- a whole directory without listing every file
- only the file you changed (incremental compilation)
Or you might have other steps such as:
- converting data files between formats
- setting up operating system-specific files (icons, resources, etc.)
- installing the program/library
Build tools are great for this!
Dependencies and targets
Each file in your project depends on some other files.
Source files depend on header files. Header files can depend on other header files.
The executable depends on having up-to-date object files.
Whenever a file changes, all the files that depend on it need to be rebuilt.
The way we indicate dependencies is with this syntax:
    target: dependency dependency dependency...
The target is the thing being built, and the dependencies are what it needs.
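How does a build tool know a target is out of date? make compares file modification times. The function below is a simplified sketch of that decision, not make's actual code: needs_rebuild and its parameters are hypothetical, and the timestamps are passed in directly so the logic can be read (and tested) without touching the file system.

```c
#include <stddef.h>
#include <time.h>

/* Returns 1 if the target should be rebuilt, 0 if it is up to date.
   Simplified rule: rebuild when the target does not exist yet, or when
   any dependency was modified more recently than the target. */
int needs_rebuild(int target_exists, time_t target_mtime,
                  const time_t *dep_mtimes, size_t dep_count) {
    if (!target_exists) {
        return 1;  /* nothing built yet, so build it */
    }
    for (size_t i = 0; i < dep_count; i++) {
        if (dep_mtimes[i] > target_mtime) {
            return 1;  /* a dependency is newer than the target */
        }
    }
    return 0;  /* target is at least as new as every dependency */
}
```

In a real tool the timestamps would come from something like stat() on each file, and the check would be applied to each dependency's own dependencies first.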
Generic targets
A common pattern is to make .o files depend on the .c files that create them. You do this with:
    %.o: %.c
Then for the build commands, $< refers to the first dependency (here, the .c file), and $@ to the target:
    gcc -g -c -o $@ $<
The build commands must be indented with hard tabs, not spaces!!!
An example makefile
The makefile is conventionally named Makefile, with a capital letter. (GNU make also looks for makefile and GNUmakefile.)
The good target builds a good executable! The bad target builds a bad executable.
The clean target cleans up any build results. This is a very common practice.
To make a target, just run make targetname, like
    make good