Programming Practice (6) - Sorting Algorithms
Introduction
One of the fundamental problems of computer science is ordering a list of items. Many solutions to this problem exist, known as sorting algorithms. Some, such as the bubble sort, are simple and intuitive. Others, such as the quick sort, are considerably more complicated but produce lightning-fast results.
Sorting Algorithms
- Bubble sort
- Heap sort
- Insertion sort
Algorithmic Complexity
The sorting algorithms discussed here fall into two classes: O(n^2), which includes the bubble, insertion, selection, and shell sorts, and O(n log n), which includes the heap, merge, and quick sorts.
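To make the O(n^2) class concrete, the following is a minimal sketch that counts the comparisons a fixed-pass bubble sort performs. The function name countBubbleComparisons is hypothetical and used only for illustration. For n items the count is n(n-1)/2, so doubling the input size roughly quadruples the work.

```c
/* Hypothetical helper: sorts numbers[] in place with a fixed-pass
   bubble sort and returns the number of comparisons performed. */
long countBubbleComparisons(int numbers[], int array_size)
{
    long comparisons = 0;
    int i, j, temp;

    for (i = array_size - 1; i >= 0; i--) {
        for (j = 1; j <= i; j++) {
            comparisons++;
            if (numbers[j-1] > numbers[j]) {
                temp = numbers[j-1];
                numbers[j-1] = numbers[j];
                numbers[j] = temp;
            }
        }
    }
    return comparisons;
}
```

With 4 items the sketch performs 6 comparisons, and with 8 items it performs 28.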
Bubble Sort
The bubble sort is one of the oldest and simplest sorts in use. Unfortunately, it is also among the slowest. It works by comparing each item in the list with the item next to it and swapping them if they are out of order. The algorithm repeats this process until it makes a pass all the way through the list without swapping any items.
Source Code
void bubbleSort(int numbers[], int array_size)
{
    int i, j, temp;

    /* After each outer pass, the largest remaining item is at index i. */
    for (i = (array_size - 1); i >= 0; i--) {
        for (j = 1; j <= i; j++) {
            if (numbers[j-1] > numbers[j]) {
                /* Swap the out-of-order neighbours. */
                temp = numbers[j-1];
                numbers[j-1] = numbers[j];
                numbers[j] = temp;
            }
        }
    }
}
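The prose description of the bubble sort stops as soon as a pass makes no swaps, whereas a fixed-pass listing always runs every pass. The sketch below shows one way to add that early exit. The name bubbleSortEarlyExit and the returned pass count are assumptions made for illustration, not part of the original listing.

```c
/* Hypothetical variant: stops once a full pass makes no swaps.
   Returns the number of passes that were actually run. */
int bubbleSortEarlyExit(int numbers[], int array_size)
{
    int i, j, temp, swapped;
    int passes = 0;

    for (i = array_size - 1; i > 0; i--) {
        swapped = 0;
        passes++;
        for (j = 1; j <= i; j++) {
            if (numbers[j-1] > numbers[j]) {
                temp = numbers[j-1];
                numbers[j-1] = numbers[j];
                numbers[j] = temp;
                swapped = 1;
            }
        }
        if (!swapped)
            break;      /* the list is already in order */
    }
    return passes;
}
```

On input that is already sorted, this version finishes after a single pass.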
Heap Sort
The heap sort is the slowest of the O(n log n) sorting algorithms, but unlike the merge and quick sorts it doesn't require massive recursion or multiple arrays to work. This makes it the most attractive option for very large data sets of millions of items. Elementary implementations require two arrays: one to hold the heap and the other to hold the sorted elements.
Heap Sort Definition
Let T = (V, E) be an almost complete binary tree with a vertex labelling a : V → M that assigns to each vertex u a label a(u) from an ordered set (M, ≤).
A vertex u ∈ V has the heap property if it has no direct descendant with a greater label, i.e.
    ∀ v ∈ V : (u, v) ∈ E ⇒ a(u) ≥ a(v)
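The definition can be checked mechanically. The sketch below assumes the tree is stored in an array so that the direct descendants of the vertex at index i sit at indices 2i+1 and 2i+2. That layout and the name hasHeapProperty are assumptions made for illustration.

```c
/* Hypothetical checker: returns 1 if every vertex of the array-stored
   tree has the heap property (no child with a greater label), else 0.
   Assumes the children of index i are at 2*i + 1 and 2*i + 2. */
int hasHeapProperty(const int labels[], int count)
{
    int parent, left, right;

    for (parent = 0; 2 * parent + 1 < count; parent++) {
        left = 2 * parent + 1;
        right = left + 1;
        if (labels[left] > labels[parent])
            return 0;
        if (right < count && labels[right] > labels[parent])
            return 0;
    }
    return 1;
}
```

For example, the labels 9, 7, 8, 3, 5 satisfy the property at every vertex, while 1, 2, 3 do not.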
Heap Sort Algorithm: Implementation
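void siftDown(int numbers[], int root, int bottom);

/* Sorts numbers[0..array_size-1] into ascending order, in place. */
void heapSort(int numbers[], int array_size)
{
    int i, temp;

    /* Build a max-heap, starting from the last vertex that has a child. */
    for (i = (array_size / 2) - 1; i >= 0; i--)
        siftDown(numbers, i, array_size - 1);

    /* Move the largest item to the end, then restore the heap. */
    for (i = array_size - 1; i >= 1; i--) {
        temp = numbers[0];
        numbers[0] = numbers[i];
        numbers[i] = temp;
        siftDown(numbers, 0, i - 1);
    }
}

/* Sifts numbers[root] down within numbers[0..bottom] (bottom inclusive).
   The children of index k are at 2k + 1 and 2k + 2. */
void siftDown(int numbers[], int root, int bottom)
{
    int done, maxChild, temp;

    done = 0;
    while ((root * 2 + 1 <= bottom) && (!done)) {
        if (root * 2 + 1 == bottom)
            maxChild = root * 2 + 1;            /* only one child */
        else if (numbers[root * 2 + 1] > numbers[root * 2 + 2])
            maxChild = root * 2 + 1;
        else
            maxChild = root * 2 + 2;

        if (numbers[root] < numbers[maxChild]) {
            temp = numbers[root];
            numbers[root] = numbers[maxChild];
            numbers[maxChild] = temp;
            root = maxChild;
        } else
            done = 1;
    }
}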
Insertion Sort
The insertion sort inserts each item into its proper place in the final list. The simplest implementation requires two list structures: the source list and the list into which sorted items are inserted. To save memory, most implementations use an in-place sort that works by moving the current item past the already sorted items, repeatedly swapping it with the preceding item until it is in place. Like the bubble sort, the insertion sort has a complexity of O(n^2). Although it has the same complexity, the insertion sort is a little over twice as efficient as the bubble sort.
Pros
- Relatively simple and easy to implement.
Cons
- Inefficient for large lists.
Insertion Sort Algorithm
void insertionSort(int numbers[], int array_size)
{
    int i, j, index;

    for (i = 1; i < array_size; i++) {
        index = numbers[i];     /* the item being inserted */
        j = i;
        /* Shift larger sorted items one place to the right. */
        while ((j > 0) && (numbers[j-1] > index)) {
            numbers[j] = numbers[j-1];
            j = j - 1;
        }
        numbers[j] = index;
    }
}
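For comparison with the in-place version, here is a minimal sketch of the simpler two-list form described earlier, in which items are read from a source list and inserted into a separate sorted list. The name insertionSortTwoLists and its parameter layout are assumptions made for illustration.

```c
/* Hypothetical two-list variant: reads items from source[] and inserts
   each into its proper place in sorted[], leaving source[] unchanged.
   sorted[] must have room for array_size items. */
void insertionSortTwoLists(const int source[], int sorted[], int array_size)
{
    int i, j;

    for (i = 0; i < array_size; i++) {
        /* sorted[0..i-1] already holds the first i items in order. */
        j = i;
        while ((j > 0) && (sorted[j-1] > source[i])) {
            sorted[j] = sorted[j-1];
            j = j - 1;
        }
        sorted[j] = source[i];
    }
}
```

The cost of this form is the second array, which is why the in-place version is usually preferred.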
The Preprocessor
Introduction
Preprocessing
- Occurs before a program is compiled
- Inclusion of other files
- Definition of symbolic constants and macros
- Conditional compilation of program code
- Conditional execution of preprocessor directives
Format of preprocessor directives
- Lines begin with #
- Only whitespace characters may appear before a directive on a line
The #include Preprocessor Directive
#include
- A copy of the specified file is included in place of the directive
#include <filename>
- Searches the standard library for the file
- Use for standard library files
#include "filename"
- Searches the current directory, then the standard library
- Use for user-defined files
Used for:
- Programs with multiple source files to be compiled together
- Header files, which hold common declarations and definitions (classes, structures, function prototypes), with an #include statement in each file that needs them
The #define Preprocessor Directive
#define
- Preprocessor directive used to create symbolic constants and macros
Symbolic constants
- When the program is compiled, all occurrences of the symbolic constant are replaced with the replacement text
- Format: #define identifier replacement-text
- Example: #define PI 3.14159
- Everything to the right of the identifier is the replacement text: #define PI = 3.14159 replaces "PI" with "= 3.14159"
- A symbolic constant cannot be redefined with a different value unless it is first removed with #undef
The #define Preprocessor Directive
Macro
- Operation defined in #define
- A macro without arguments is treated like a symbolic constant
- A macro with arguments has its arguments substituted into the replacement text when the macro is expanded
- Performs a text substitution, with no data type checking
- The macro
    #define CIRCLE_AREA( x ) ( PI * ( x ) * ( x ) )
  would cause
    area = CIRCLE_AREA( 4 );
  to become
    area = ( 3.14159 * ( 4 ) * ( 4 ) );
The #define Preprocessor Directive
Use parentheses
- Without them, the macro
    #define CIRCLE_AREA( x ) PI * x * x
  would cause
    area = CIRCLE_AREA( c + 2 );
  to become
    area = 3.14159 * c + 2 * c + 2;
Multiple arguments
- The macro
    #define RECTANGLE_AREA( x, y ) ( ( x ) * ( y ) )
  would cause
    rectArea = RECTANGLE_AREA( a + 4, b + 7 );
  to become
    rectArea = ( ( a + 4 ) * ( b + 7 ) );
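The effect of the missing parentheses can be seen by evaluating both forms side by side. In this sketch the macro names CIRCLE_AREA_UNSAFE and CIRCLE_AREA_SAFE and the two wrapper functions are hypothetical, and PI is assumed to be 3.14159.

```c
#define PI 3.14159

/* No parentheses: the argument text is pasted in as-is. */
#define CIRCLE_AREA_UNSAFE( x ) PI * x * x

/* Fully parenthesized: the argument is evaluated as one unit. */
#define CIRCLE_AREA_SAFE( x ) ( PI * ( x ) * ( x ) )

/* Expands to PI * c + 2 * c + 2, which is not the area. */
double unsafe_area(double c)
{
    return CIRCLE_AREA_UNSAFE( c + 2 );
}

/* Expands to ( PI * ( c + 2 ) * ( c + 2 ) ), as intended. */
double safe_area(double c)
{
    return CIRCLE_AREA_SAFE( c + 2 );
}
```

With c equal to 1, the unsafe form computes 3.14159 + 2 + 2, while the safe form computes 3.14159 * 9.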
The #define Preprocessor Directive
#undef
- Undefines a symbolic constant or macro
- Once a symbolic constant or macro has been undefined, it can later be redefined
Conditional Compilation
Conditional compilation
- Controls preprocessor directives and compilation
- Cast expressions, sizeof, and enumeration constants cannot be evaluated in preprocessor directives
Structure similar to if
    #if !defined( NULL )
    #define NULL 0
    #endif
- Determines whether the symbolic constant NULL has been defined
- If NULL is defined, defined( NULL ) evaluates to 1
- If NULL is not defined, the directive defines NULL to be 0
- Every #if must end with #endif
- #ifdef is short for #if defined( name )
- #ifndef is short for #if !defined( name )
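A common use of #ifndef is to protect a header against being included twice. The sketch below simulates two inclusions of the same header text in one file. The guard name POINT_H, the struct point type, and the point_sum function are hypothetical names chosen for illustration.

```c
/* First inclusion: POINT_H is not yet defined, so the body is kept. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif

/* Second inclusion: POINT_H is now defined, so the body is skipped
   and struct point is not declared a second time. */
#ifndef POINT_H
#define POINT_H
struct point {
    int x;
    int y;
};
#endif

/* Uses the single surviving definition of struct point. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Without the guard, the second copy of the struct definition would be a compile-time error.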
Conditional Compilation
Other statements
- #elif: equivalent of else if in an if statement
- #else: equivalent of else in an if statement
"Comment out" code
- Cannot use /* ... */ if the code already contains comments, because comments do not nest
- Use
    #if 0
    code commented out
    #endif
- To enable the code, change 0 to 1
Conditional Compilation
Debugging
    #define DEBUG 1
    #ifdef DEBUG
    cerr << "Variable x = " << x << endl;
    #endif
- Defining DEBUG enables the debugging code (#ifdef tests only whether DEBUG is defined, not its value)
- After the code is corrected, remove the #define statement
- The debugging statements are then ignored
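The slide's example uses the C++ stream cerr. A rough C equivalent, assuming fprintf to stderr is an acceptable substitute, might look like the sketch below. The macro name DEBUG_INT and the function square are hypothetical.

```c
#include <stdio.h>

#define DEBUG 1

#ifdef DEBUG
/* Debugging build: print the name and value to standard error. */
#define DEBUG_INT( name, value ) \
    fprintf(stderr, "Variable %s = %d\n", (name), (value))
#else
/* Normal build: the debugging statement expands to nothing useful. */
#define DEBUG_INT( name, value ) ((void)0)
#endif

int square(int x)
{
    DEBUG_INT("x", x);
    return x * x;
}
```

Removing the #define DEBUG line leaves the calls in place but compiles them away.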
The # and ## Operators
#
- Converts a macro argument into a string literal
##
- Concatenates two tokens
- The statement
    #define TOKENCONCAT( x, y ) x ## y
  would cause
    TOKENCONCAT( O, K )
  to become
    OK
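Both operators can be exercised in a few lines. In this sketch STRINGIZE, ok_value, and ok_name are hypothetical names. TOKENCONCAT is the macro from the slide.

```c
#define STRINGIZE( x ) #x
#define TOKENCONCAT( x, y ) x ## y

/* TOKENCONCAT( O, K ) becomes the identifier OK, so this declares
   and returns a local variable named OK. */
int ok_value(void)
{
    int TOKENCONCAT( O, K ) = 42;
    return OK;
}

/* STRINGIZE( OK ) becomes the string literal "OK". */
const char *ok_name(void)
{
    return STRINGIZE( OK );
}
```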
Line Numbers
#line
- Renumbers subsequent code lines, starting with the given integer value
- A file name can be included
    #line 100 "myFile.c"
- Lines are numbered from 100, beginning with the next source code line
- Compiler messages will report that errors occurred in "myFile.c"
- Makes errors more meaningful
- The line numbers do not appear in the source file
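The renumbering can be observed through the predefined constants __LINE__ and __FILE__. In this minimal sketch the function names reported_line and reported_file are hypothetical.

```c
/* The line immediately after the directive is treated as line 100
   of a file called myFile.c. */
#line 100 "myFile.c"
int reported_line(void) { return __LINE__; }

/* __FILE__ now expands to the name given in the #line directive. */
const char *reported_file(void)
{
    return __FILE__;
}
```

Any compiler diagnostic for code after the directive would use the same renumbered position.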
Predefined Symbolic Constants
- Four predefined symbolic constants (standard C provides, among others, __LINE__, __FILE__, __DATE__, and __TIME__)
- Cannot be used in #define or #undef
Modular Programming
Introduction
As programs grow larger and larger, it becomes more desirable to split them into sections or modules. C allows programs to be split into multiple files, compiled separately, and then combined (linked) to form a single program. We will go through a programming example, discussing the C techniques needed to create good modules. You will be shown how to use make to put these modules together to form a program.
Modules
A module is a collection of functions that perform related tasks. For example, a database module might contain functions such as lookup, enter, and sort. An efficient way of splitting up a large project is to assign each programmer a different module. In this manner, each programmer only worries about the internal details of a particular module.
Modules are divided into two parts: public and private.
The public part
- Tells the user how to call the functions in the module.
- Contains the definitions of data structures and functions that are to be used outside the module.
The private part
- Anything that is internal to the module is private.
- Everything that is not directly usable by the outside world should be kept private.
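In C, one common way to keep part of a module private is to declare it static, which limits its visibility to the file that defines it. The sketch below is a hypothetical counter module. All of its names (COUNTER_LIMIT, at_limit, counter_increment, counter_value) are assumptions made for illustration.

```c
#define COUNTER_LIMIT 3

/* Private part: static names are visible only inside this file. */
static int counter = 0;

static int at_limit(void)
{
    return counter >= COUNTER_LIMIT;
}

/* Public part: these functions may be called from other files. */
void counter_increment(void)
{
    if (!at_limit())
        ++counter;
}

int counter_value(void)
{
    return counter;
}
```

Other files can change the counter only through the two public functions, so the limit cannot be bypassed from outside the module.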
Definition, implementation, and use of the module
The extern Modifier
The extern modifier is used to indicate that a variable or function is defined outside the current file.

/* First file: uses counter and inc_counter, which are defined elsewhere */
#include <stdio.h>

/* number of times through the loop */
extern int counter;

/* routine to increment the counter */
extern void inc_counter(void);

int main()
{
    int index; /* loop index */

    for (index = 0; index < 10; index++)
        inc_counter();

    printf("Counter is %d\n", counter);
    return (0);
}

/* Second file: defines counter and inc_counter */

/* number of times through the loop */
int counter = 0;

/* trivial example */
void inc_counter(void)
{
    ++counter;
}
Modifiers
Headers
Information that is shared between modules should be put in a header file. By convention, all header filenames end with .h. The header should contain all the public information, such as:
- A comment section describing clearly what the module does and what is available to the user.
- Common constants.
- Common structures.
- Prototypes of all the public functions.
- extern declarations for public variables.
/********************************************************
 * Definitions for the infinite array (ia) package.     *
 *                                                      *
 * An infinite array is an array whose size can grow    *
 * as needed. Adding more elements to the array         *
 * will just cause it to grow.                          *
 *                                                      *
 * struct infinite_array                                *
 *      Used to hold the information for an infinite    *
 *      array.                                          *
 *                                                      *
 * Routines                                             *
 *                                                      *
 *      ia_init -- Initializes the array.               *
 *      ia_store -- Stores an element in the array.     *
 *      ia_get -- Gets an element from the array.       *
 ********************************************************/

/* number of elements to store in each cell of the infinite array */
#define BLOCK_SIZE 10

struct infinite_array {
    /* the data for this block */
    float data[BLOCK_SIZE];

    /* pointer to the next array */
    struct infinite_array *next;
};

/********************************************************
 * ia_init -- Initializes the infinite array.           *
 *                                                      *
 * Parameters                                           *
 *      array_ptr -- The array to initialize.           *
 ********************************************************/
#define ia_init(array_ptr) {(array_ptr)->next = NULL;}

/********************************************************
 * ia_get -- Gets an element from an infinite array.    *
 *                                                      *
 * Parameters                                           *
 *      array_ptr -- Pointer to the array to use.       *
 *      index -- Index into the array.                  *
 *                                                      *
 * Returns                                              *
 *      The value of the element.                       *
 *                                                      *
 * Note: You can get an element that                    *
 *      has not previously been stored. The value       *
 *      of any uninitialized element is zero.           *
 ********************************************************/
extern int ia_get(struct infinite_array *array_ptr, int index);

/********************************************************
 * ia_store -- Store an element in an infinite array.   *
 *                                                      *
 * Parameters                                           *
 *      array_ptr -- Pointer to the array to use.       *
 *      index -- index into the array.                  *
 *      store_data -- Data to store.                    *
 ********************************************************/
extern void ia_store(struct infinite_array * array_ptr,
                     int index, int store_data);
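The header only declares ia_get and ia_store. One possible private implementation is sketched below. It is an assumption about how the module could work, not the original source: the helpers ia_clear_block and ia_locate are hypothetical, and the struct is repeated here only so the sketch stands on its own.

```c
#include <stdlib.h>

#define BLOCK_SIZE 10

struct infinite_array {
    float data[BLOCK_SIZE];
    struct infinite_array *next;
};

/* Hypothetical helper: zero one block and mark it as the last one. */
static void ia_clear_block(struct infinite_array *block)
{
    int i;
    for (i = 0; i < BLOCK_SIZE; i++)
        block->data[i] = 0.0f;
    block->next = NULL;
}

/* Hypothetical helper: find the block that holds index, appending
   zeroed blocks as needed. Stores the position inside that block in
   *offset_ptr. Returns NULL for a negative index or if memory runs out. */
static struct infinite_array *ia_locate(struct infinite_array *array_ptr,
                                        int index, int *offset_ptr)
{
    struct infinite_array *current = array_ptr;

    if (index < 0)
        return NULL;
    while (index >= BLOCK_SIZE) {
        if (current->next == NULL) {
            current->next = malloc(sizeof(struct infinite_array));
            if (current->next == NULL)
                return NULL;
            ia_clear_block(current->next);
        }
        current = current->next;
        index -= BLOCK_SIZE;
    }
    *offset_ptr = index;
    return current;
}

void ia_store(struct infinite_array *array_ptr, int index, int store_data)
{
    int offset;
    struct infinite_array *block = ia_locate(array_ptr, index, &offset);

    if (block != NULL)
        block->data[offset] = (float)store_data;
}

int ia_get(struct infinite_array *array_ptr, int index)
{
    int offset;
    struct infinite_array *block = ia_locate(array_ptr, index, &offset);

    return (block != NULL) ? (int)block->data[offset] : 0;
}
```

Because the helpers are static, callers see only the two routines the header promises. In this sketch the first block is assumed to have been zeroed before use.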
The Makefile for Multiple Files
As programs grow, the number of commands needed to create them grows. Typing a series of 10 or 20 commands can be tiresome and error prone, so programmers started writing shell scripts. The program make is designed to aid the programmer in compiling and linking programs. It makes compilation dependent upon whether a file has been updated since the last compilation. The file Makefile (case sensitivity is important in UNIX) contains the rules used by make to decide how to build the program.
Makefile
The Makefile contains the following sections:
- Comments: any line beginning with a hash mark (#) is a comment.
- Macros: a macro has the format
    name = data
  where name is any valid identifier and data is the text that will be substituted whenever make sees $(name).
- Explicit rules: explicit rules tell make what commands are needed to create the program.
- Default rules: make uses a set of built-in rules to determine what command to execute.
Explicit rules
Format:
target: source [source2] [source3]
	command
	[command]
	[command]...
target is the name of a file to create. It is "made" or created out of the source file source. If the target is created out of several files, they are all listed. The command that generates the target is specified on the next line. Commands are listed one per line. Each is indented by a tab.
Example:
hello: hello.c
	cc -g -ohello hello.c
Makefile
# Macros
CFLAGS = -g
OBJ = ia.o hist.o

# Explicit rules
all: hist

hist: $(OBJ)
	$(CC) $(CFLAGS) -o hist $(OBJ)

# Default rules: each object file is built with $(CC) $(CFLAGS) -c file.c
hist.o: ia.h hist.c
ia.o: ia.h ia.c
Example
[File: ia/makefile.gcc]
#
# Makefile for UNIX systems
# using a GNU C compiler.
#
CC=gcc
CFLAGS=-g -Wall -D__USE_FIXED_PROTOTYPES__ -ansi

all: hist

hist: hist.o ia.o
	$(CC) $(CFLAGS) -o hist hist.o ia.o

hist.o: hist.c ia.h

ia.o: ia.c ia.h

clean:
	rm -f hist hist.o ia.o
Modular Programming Exercise
Requirements
- Input the width, length, and height from the keyboard.
- Calculate the volume, area, and girth.
- Print the values.
Files
- main.c: main function file that calls the other functions.
- input.c: input-related functions and data.
- calc.c: calculation-related functions and data.
- print.c: print-related functions and data.
- global.h: the global header file.
- Makefile
Modules
int input_lengths(int *width, int *length, int *height);
int calc_volume(int width, int length, int height);
int calc_area(int width, int length, int height);
int calc_girth(int width, int length, int height);
void print_data(char *string, int value);
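As a starting point for calc.c, the three calculation functions might be sketched as follows. The exercise does not define the formulas, so this sketch assumes a rectangular box, takes area to mean total surface area, and takes girth to mean the distance around the box perpendicular to its length. Those interpretations are assumptions.

```c
/* Volume of a rectangular box. */
int calc_volume(int width, int length, int height)
{
    return width * length * height;
}

/* Assumed meaning: total surface area of the six faces. */
int calc_area(int width, int length, int height)
{
    return 2 * (width * length + width * height + length * height);
}

/* Assumed meaning: distance around the box, perpendicular to its
   length, so the length itself is not used. */
int calc_girth(int width, int length, int height)
{
    (void)length;
    return 2 * (width + height);
}
```

For a box 2 wide, 3 long, and 4 high, these give a volume of 24, an area of 52, and a girth of 12.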