C Module System
C and Data Structures
Baojian Hua

Software Systems are Large
Practical software systems tend to be large and complex:
  the Linux kernel consists of ~1000K LOC
So the general principles in designing large (even small) software systems are:
  dividing them into smaller, manageable parts
  separating specification (interface) from code (implementation)

Module System
Module systems offer a systematic method to organize software
  different parts can be separately developed, compiled, tested, and debugged
  the parts are finally linked together
This process involves compiling, linking, libraries, etc.

Module System
Different styles of module systems in languages:
  ML: signature and structure
  Java: interface and class
  C (C++): header files (.h) and C files (.c)
These slides show how to manage C's module system:
  source programs
  (separate) compiling, linking, loading
  tools

Typical C Program Organization
[Figure: a program consists of files file1 … filen; each file contains functions (function1 … functionm); each function consists of declarations and statements (dec & stm), which are built from expressions (exp)]

General Process
// General process from source files (.c) to executables (.exe):
//   preprocessing: 1.c, 2.c, …, n.c  ->  1.i, 2.i, …, n.i
//   compiling:     1.i, 2.i, …, n.i  ->  1.o, 2.o, …, n.o
//   linking:       1.o, 2.o, …, n.o + libraries  ->  a.exe

Example Revisited
double area (int r);

double area (int r)
{
  double pi = 3.14;
  return (pi*r*r);
}

int main ()
{
  double f;
  f = area (5);
  return 0;
}

First Try
// main.c
int main ()
{
  double f;
  f = area (5);
  return 0;
}

// area.c
double area (int r);
#define PI 3.14
double area (int r)
{
  double f = PI * r * r;
  return f;
}

Try this demo! Wrong result? Why?
Or even worse?
  f = area (5, 6, 7, 8, 9);

Second Try
// main.c
double area (int r);
int main ()
{
  double f;
  f = area (5);
  return 0;
}

// area.c
double area (int r);
#define PI 3.14
double area (int r)
{
  double f = PI * r * r;
  return f;
}

Try this demo! Is this perfect?
What if many files contain the declaration of "area", and area.c is changed?

Third Try
// area.h
double area (int r);

// main.c
#include "area.h"
int main () …

// area.c
#include "area.h"
#define PI 3.14
double area (int r)
{
  double f = PI * r * r;
  return f;
}

Try this demo! Is this perfect!?

Pitfalls
// area.h
double area (int r);
int i = 0;

// main.c
#include "area.h"
int main () …

// area.c
#include "area.h"
#define PI 3.14
double area (int r)
{
  double f = PI * r * r;
  return f;
}

Try this demo!

Final Version
// area.h
#ifndef AREA_H
#define AREA_H
double area (int r);
#endif

// main.c
#include "area.h"
…

// area.c
#include "area.h"
#define PI 3.14
double area (int r)
{
  double f = PI * r * r;
  return f;
}

Preprocessing
Takes source files (.c, .h) and generates intermediate files
  file inclusion
  macro substitution
  comment removal
  …
Afterwards, the header files are no longer needed
So, what's the role of ".h" files?

Example
// area.h
#ifndef AREA_H
#define AREA_H
double area (int r);
#endif

// main.c
#include "area.h"
int main ()
{
  area (5);
}

// area.c
#include "area.h"
#define PI 3.14
double area (int r)
{
  double f = PI*r*r;
  return f;
}

Example
// area.h
#ifndef AREA_H
#define AREA_H
double area (int r);
#endif

// main.c
#include "area.h"
int main ()
{
  area (5);
}

// area.c (#include replaced by the contents of area.h)
double area (int r);
#define PI 3.14
double area (int r)
{
  double f = PI*r*r;
  return f;
}

Example
// area.h
#ifndef AREA_H
#define AREA_H
double area (int r);
#endif

// main.c
#include "area.h"
int main ()
{
  area (5);
}

// area.c (PI replaced by 3.14)
double area (int r);
#define PI 3.14
double area (int r)
{
  double f = 3.14*r*r;
  return f;
}

Example
// area.h
#ifndef AREA_H
#define AREA_H
double area (int r);
#endif

// main.c (#include replaced by the contents of area.h)
double area (int r);
int main ()
{
  area (5);
}

// area.c (#define removed)
double area (int r);
double area (int r)
{
  double f = 3.14*r*r;
  return f;
}

Compiling
Generates binary object files (.obj)
  object files may be in assembly or binary form
May involve several intermediate phases
  analysis
  optimizations
  …
See demo …

Example
// main.c
double area (int r);
int main ()
{
  area (5);
}

// area.c
double area (int r);
double area (int r)
{
  double f = 3.14*r*r;
  return f;
}

// object files produced: area.o, main.o

Linking
Object files are often called relocatable
  they are incomplete: references to function names, extern variables, etc. are still unresolved
Linking
  the process of combining all object files
  resolves references to external entities
See demo …

Linking
// Object files are incomplete:
// main.o
  area (…)
  printf (…)

Linking
// Resolve external references:
// main.o
  call area
  call printf
// area.o
  area: …
// printf.o
  printf: …

Example
// main.c
double area (int r);
int main ()
{
  area (5);
}

// area.c
double area (int r);
double area (int r)
{
  double f = 3.14*r*r;
  return f;
}

// object files: area.o, main.o
// linked executable: a.exe
……………

Static vs Dynamic Linking
Static:
  all object files must be available and are linked together
  the generated .exe files are complete
  what are the pros and cons?
Dynamic:
  some object files are absent
  the generated .exe files are incomplete
  then how are these absent object files referenced?
  what are the pros and cons?

What are Libraries?
Libraries are just pre-written, pre-compiled object files
  normally offered by the compiler vendor
  meant to be linked with user programs
  header files are available
    ex: stdio.h, stdlib.h, ctype.h, …, from the C standard library
  source code may be available (ex: gcc) or unavailable (ex: vc)
Same linking technique, but …

How to Implement Libraries?
To become familiar with library implementation techniques, we next study one example carefully: stdio.h
Our goal is to study the "printf" function:
  int printf (const char *format, ...);

General Strategy
The user program calls the library function printf ()
printf () internally makes an operating system call to do the real work
  details vary between operating systems
The OS calls the hardware driver
  offered by the hardware vendor
Layers: user program -> libraries: printf () -> OS routines -> hardware driver

Case Study: Linux vs Windows
Linux:   user program -> fwrite () -> write () -> int 0x80 -> sys_write () (kernel)
Windows: user program -> fwrite () -> write () -> NtWriteFile () -> int 0x2e -> IoWriteFile () (kernel)

API & SDK
Application Programming Interface (API)
  a set of routines that nicely wrap up operating system calls
  libraries are built on top of them
    standard C libraries, runtime, etc.
  why API?
Software Development Kit (SDK)
  a collection of APIs
    headers, libraries, files, and tools
  varies between Windows versions

Runtime
Runtime is a special set of libraries
  for program startup and exit
  getting system info
  library preparation, etc.
Normally NOT meant to be called by user programs
Again, varies between systems
