Introduction to Programming 3D Applications Lecture 8 Files and Program Parameters
What is a File? A file is a collection of related data that a computer treats as a single unit. Computers store files on secondary storage so that their contents remain intact when the computer shuts down. When a computer reads a file, it copies the file from the storage device to memory; when it writes to a file, it transfers data from memory to the storage device.
Buffers A buffer is a “special work area” that holds data as the computer transfers them to/from memory. Buffers help to synchronize the physical devices with the program. A device can deliver more input data than a program can use at any one time; the buffer holds the overflow until the program can use it. For output, the buffer holds data until it is efficient to write that data to the storage device.
File Information Table A program requires several pieces of information about a file, including the name the OS uses for it, the position of the current character, etc. C uses a structure called FILE (defined in stdio.h ) to store the attributes of a file.
Streams In C, we input/output data using streams. We can associate a stream with a device (e.g. the terminal) or with a file. C supports two types of files: text stream files and binary stream files.
Text Streams & Binary Streams Text streams consist of sequential characters divided into lines. Each line terminates with the newline character ( \n ). Binary streams consist of data values such as integers, floats or complex data types, “using their memory representation.”
Files & Streams A file is an “independent entity” with a name recorded by the operating system. A stream is created by a program. To work with a file, we must associate our stream name with the file name recorded by the OS.
Steps in Processing a File 1. Create the stream via a pointer variable using the FILE structure: FILE* out_file; 2. Open the file, associating the stream name with the file name. 3. Read or write the data. 4. Close the file.
File Open The file open function ( fopen ) serves two purposes: it makes the connection between the physical file and the stream, and it creates “a program file structure to store the information” C needs to process the file. Syntax: fopen("filename", "mode");
More On fopen The file mode tells C how the program will use the file. The filename indicates the system name and location for the file. We assign the return value of fopen to our pointer variable:
spData = fopen("MYFILE.DAT", "w");
spData = fopen("A:\\MYFILE.DAT", "w");
File Open Modes
More on File Open Modes from Figure 7-4 in Forouzan & Gilberg, p. 401
Closing a File When we finish with a mode, we need to close the file before ending the program or beginning another mode with that same file. To close a file, we use fclose and the pointer variable: fclose(spData);
Additional I/O Functions
File Input and Output Each file must be explicitly opened before reading or writing (using the actual file name as a string) and closed before execution finishes.
Input: fscanf reads from a file, e.g.
in_file = fopen("data1.dat", "r");
fscanf(in_file, "%d%d", &a, &b);
Output: fprintf writes to a file, e.g.
out_file = fopen("data2.dat", "w");
fprintf(out_file, "%d %d\n", a, b);
File output example The program on the next slide prompts the user for a file name, reads the file name, prompts for input data, reads 2 numbers from keyboard and writes their sum to the named output file.
#include <stdio.h>

int main(void)
{
    int a, b, c;
    char filename[21];   // string file name
    FILE *out_file;      // file pointer for output

    printf("\ntype name of output file: ");   // prompt on screen
    // input from keyboard, at most 20 characters
    // (gets() cannot limit the input length and was removed in C11)
    if (scanf("%20s", filename) != 1)
        return 1;                              // abnormal program exit

    out_file = fopen(filename, "w");           // open file for output
    if (out_file == NULL) {
        printf("\ncannot open: %s", filename);
        return 1;                              // abnormal program exit
    }

    printf("\ntype 2 integers");               // prompt
    if (scanf("%d%d", &a, &b) != 2) {          // from keyboard
        fclose(out_file);
        return 1;                              // abnormal program exit
    }
    c = a + b;
    fprintf(out_file, "%d\n", c);              // output to file
    fclose(out_file);
    return 0;                                  // normal program exit
}
System-Created Streams C automatically creates three streams, which it opens and closes for us, in order to communicate with the terminal: stdin, stdout and stderr. We cannot re-declare these streams in our programs.
Why stdout and stderr ? There are two output streams because of redirection, supported by Unix, DOS, OS/2 etc.

#include <stdio.h>

int main(void)
{
    printf("written to stdout\n");
    fprintf(stderr, "written to stderr\n");
    return 0;
}

C:> outprog
written to stderr
written to stdout
C:> outprog > file.txt
written to stderr
C:> type file.txt
written to stdout

The output is written to stderr first because it is unbuffered.
stdin is Line Buffered Characters typed at the keyboard are buffered until Enter/Return is pressed.

#include <stdio.h>

int main(void)
{
    int ch;   // declared as an int, even though we are dealing with characters

    while ((ch = getchar()) != EOF)
        printf("read '%c'\n", ch);
    printf("EOF\n");
    return 0;
}

C:> inprog
abc
read 'a'
read 'b'
read 'c'
read '
'
d
read 'd'
read '
'
^Z
EOF
C:>
Dealing with Errors fopen may fail for one of many reasons; how can we tell which?

void perror(const char* message);

#include <stdio.h>

int main(void)
{
    FILE *in;

    if ((in = fopen("autoexec.bat", "r")) == NULL) {
        fprintf(stderr, "open of autoexec.bat failed ");
        perror("because");   // prints "because: " followed by the system's reason
        return 1;
    }
    /* ... read from in ... */
    fclose(in);
    return 0;
}

open of autoexec.bat failed because: No such file or directory
File Access Problem Can you see why the following will ALWAYS fail, despite the file existing and being fully accessible?

if ((in = fopen("C:\autoexec.bat", "r")) == NULL) {
    fprintf(stderr, "open of autoexec.bat failed ");
    perror("because");
    return 1;
}

C:> dir C:\autoexec.bat
Volume in drive C is MS-DOS_62
Directory of C:\
autoexec bat /07/90 8:15
1 file(s) 805 bytes
1,264,183,808 bytes free
C:> myprog
open of autoexec.bat failed because: No such file or directory
Whitespace in Format Control Strings For input, one or more whitespace characters in a format control string cause C to discard leading whitespace characters. For output, C copies whitespace characters in a format control string to the output stream.
Text in Format Control Strings For input, text in the format control string must match the text in the input stream exactly. For output, C copies text in the format control string to the output stream.
Conversion Specifications
“The number, order, and type of the conversion specifications must match the number, order, and type of the parameters in the list. Otherwise, the result will be unpredictable and may terminate the input/output function.”
Input Data Formatting fscanf / scanf will process input characters until one of the following happens: The function reaches the EOF indicator. The function encounters an inappropriate character. The function reads in a number of characters explicitly programmed as a maximum width field.
Side Effect & Value of fscanf / scanf
Input Stream Issues 1. There is always a return character at the end of an input stream due to the fact that C buffers the stream. 2. fscanf / scanf functions leave the return character in the buffer. To force a discard of the character, begin your format control string with a space character. 3. fscanf / scanf terminate when all specified operations in the control string complete; if the control string ends with a whitespace character, fscanf / scanf continue (they terminate only with a non-whitespace control string).
Width & Precision in Output Width for output specifies a minimum width. If data are wider, C will print all the data. We specify precision with a period followed by an integer: For integers, precision specifies the minimum number of digits to print (incl. leading zeroes). For floating-point numbers, precision specifies the number of digits to print to the right of the decimal point. For the g and G conversions, precision specifies the maximum number of significant digits to print.
Output Side Effect & Value
Displaying a File

#include <stdio.h>

int main(void)
{
    char in_name[80];
    FILE *in_stream;
    int ch;

    printf("Display file: ");
    // at most 79 characters (gets() cannot limit the input length)
    if (scanf("%79s", in_name) != 1)
        return 1;

    if ((in_stream = fopen(in_name, "r")) == NULL) {
        fprintf(stderr, "open of %s for reading failed ", in_name);
        perror("because");
        return 1;
    }

    while ((ch = fgetc(in_stream)) != EOF)
        putchar(ch);

    fclose(in_stream);
    return 0;
}
Example - Copying Files

#include <stdio.h>

int main(void)
{
    char in_name[80], out_name[80];
    FILE *in_stream, *out_stream;
    int ch;

    printf("Source file: ");
    // at most 79 characters (gets() cannot limit the input length)
    if (scanf("%79s", in_name) != 1)
        return 1;
    if ((in_stream = fopen(in_name, "r")) == NULL) {
        fprintf(stderr, "open of %s for reading failed ", in_name);
        perror("because");
        return 1;
    }

    printf("Destination file: ");
    if (scanf("%79s", out_name) != 1) {
        fclose(in_stream);
        return 1;
    }
    if ((out_stream = fopen(out_name, "w")) == NULL) {
        fprintf(stderr, "open of %s for writing failed ", out_name);
        perror("because");
        fclose(in_stream);   // already opened
        return 1;
    }

    while ((ch = fgetc(in_stream)) != EOF)
        fputc(ch, out_stream);

    fclose(in_stream);
    fclose(out_stream);
    return 0;
}
Convenience Problem Although our copy file program works, it is not as convenient as the “real thing”.

C:> copyprog
Source file: \autoexec.bat
Destination file: \autoexec.bak
C:> dir C:\autoexec.*
Volume in drive C is MS-DOS_62
Directory of C:\
autoexec bak /12/99 12:34
autoexec bat /07/90 8:15
2 file(s) 1610 bytes
1,264,183,003 bytes free
C:> copyprog \autoexec.bat \autoexec.000
Source file:

The program still prompts despite being given file names on the command line.
Accessing the Command Line The command line may be accessed via two parameters to main; by convention these are called “argc” and “argv”. The first is a count of the number of words, including the program name itself. The second is an array of pointers to the words.

int main(int argc, char *argv[])

argc = 3
argv[0] -> c o p y p r o g . e x e \0
argv[1] -> \ a u t o e x e c . b a t \0
argv[2] -> \ a u t o e x e c \0
argv[3] = NULL
Example

#include <stdio.h>

int main(int argc, char *argv[])
{
    int j;

    for (j = 0; j < argc; j++)
        printf("argv[%i] = \"%s\"\n", j, argv[j]);
    return 0;
}

C:> argprog one two three
argv[0] = "C:\argprog.exe"
argv[1] = "one"
argv[2] = "two"
argv[3] = "three"
Files as Program Parameters File names and run-time options can be provided in Unix on the command line when a program is executed. The normal command line a.out could be replaced by a.out datain dataout, where “datain” and “dataout” are the input and output files. These 2 strings must be recognised by the C program so that these files can be used for the input/output. Note that the input file “datain” must already contain data.
Program parameters, conventionally called argc and argv, are used to determine which file names and options were supplied.
int main(void) // ANSI convention
is replaced by:
int main(int argc, char* argv[]) // universal
int main(int argc, char** argv) // also used
For the command line: a.out datain dataout
argc is 3, and argv is an array of “a.out”, “datain” and “dataout”. Execution options may be similarly specified, conventionally preceded with a ‘-’ so as to be distinguished from file names. For example: a.out datain dataout -option
Program Parameter Example Copy all 25 integers from the given input file to the given output file. The input file comprises 25 integers with no formatting but just separated by spaces or new lines. The output file shall comprise 5 rows of 5 integers each separated by a space. There is an option to echo the output on to the screen, where the 2 file names and any “-echo” option are program parameters.
#include <stdio.h>
#include <string.h>

// program parameters on command line,
// e.g. a.out datain dataout -echo
// argc is 4 and argv is
// array of these as 4 strings
int main(int argc, char *argv[])
{
    int num;                        // for copying each integer
    int row, col, option = 0;       // no echo on screen, by default
    FILE *myfile_in, *myfile_out;   // for 2 files

    // check for FILE names
    if (argc < 3) {
        printf("\nMissing file name(s).\n");
        printf("Too few parameters %d", argc);
        return 1;                   // abnormal exit from program
    }
    // the 2 file names cannot be the same
    if (strcmp(argv[1], argv[2]) == 0) {
        printf("\nsame file names !\n");
        return 1;                   // abnormal exit
    }

    // open first file for input
    myfile_in = fopen(argv[1], "r");
    if (myfile_in == NULL) {        // check if input file exists
        printf("\ncan't open input file:%s\n", argv[1]);
        return 1;                   // abnormal exit
    }
    // open second file for output
    myfile_out = fopen(argv[2], "w");
    if (myfile_out == NULL) {       // playing safe!
        printf("\ncan't open O/P file:%s\n", argv[2]);
        fclose(myfile_in);          // already opened
        return 1;                   // abnormal exit
    }

    // check option, should be -echo
    if (argc == 4) {                // 4th parameter
        if (strcmp(argv[3], "-echo") == 0) {
            option = 1;             // echo on screen
        } else {
            printf("illegal %s\n", argv[3]);
            printf("must be -echo\n");
            fclose(myfile_in);      // already
            fclose(myfile_out);     // opened
            return 1;               // abnormal exit
        }
    }
    // else no 4th parameter specified,
    // option remains 0

    for (row = 0; row < 5; row++) {         // copy each row
        for (col = 0; col < 5; col++) {     // copy each column
            if (fscanf(myfile_in, "%d", &num) != 1) {
                // fewer than 25 integers, or a non-numeric character
                printf("\nbad input in file:%s\n", argv[1]);
                fclose(myfile_in);
                fclose(myfile_out);
                return 1;                   // abnormal exit
            }
            fprintf(myfile_out, "%d ", num);
            // after each integer is a space
            if (option)                     // option == 1
                printf("%d ", num);         // echo
        }
        fprintf(myfile_out, "\n");          // end row
        if (option)
            printf("\n");                   // echo
    }
    fclose(myfile_in);
    fclose(myfile_out);
    return 0;                               // normal exit
}   // end main
Useful Routines
File reading routines:
int fscanf(FILE* stream, const char* format, ...);
int fgetc(FILE* stream);
char* fgets(char* buffer, int size, FILE* stream);
File writing routines:
int fprintf(FILE* stream, const char* format, ...);
int fputc(int ch, FILE* stream);
int fputs(const char* buffer, FILE* stream);
Example

long l1, l2;
int j, ch;
double d;
float f;
char buf[200];

in = fopen("in.txt", "r")....
out = fopen("out.txt", "w")....

fscanf(in, "%lf|%li:%li/%i", &d, &l1, &l2, &j);   // example input: | :68000/13
fprintf(out, "%li:%i:%.2lf\n", l1, j, d);          // output: :13:28.33
fgetc(in);                      // ignore next character in input file (newline?)
fgets(buf, sizeof(buf), in);    // read next line, or next 199 characters, whichever is less
fputs(buf, out);                // write that line to the output file (null terminator
                                // provided by fgets tells fputs how long the line was)
Binary Files The Standard Library also allows binary files to be manipulated. “b” must be added to the fopen options. Character translation is disabled. Random access becomes easier. Finding the end of file can become more difficult. Data is read and written in blocks.
size_t fread(void* p, size_t size, size_t n, FILE* stream);
size_t fwrite(const void* p, size_t size, size_t n, FILE* stream);
int fseek(FILE* stream, long offset, int whence);
long ftell(FILE* stream);
void rewind(FILE* stream);
int fgetpos(FILE* stream, fpos_t* pos);
int fsetpos(FILE* stream, const fpos_t* pos);
Example (the byte counts in the comments assume an 8-byte double and a 10-byte long double; both sizes vary between platforms)

double d;
long double lda[35];
fpos_t where;

in = fopen("binary.dat", "rb");
out = fopen("binnew.dat", "wb");

fread(&d, sizeof(d), 1, in);                 // read one chunk of 8 bytes
fgetpos(in, &where);                         // remember current position in file
fread(lda, sizeof(lda), 1, in);              // read one chunk of 350 bytes
fsetpos(in, &where);                         // return to previous position
fread(lda, sizeof(long double), 35, in);     // read 35 chunks of 10 bytes
fwrite(lda, sizeof(long double), 20, out);   // write 20 long doubles from lda
fseek(in, 0L, SEEK_END);                     // move to end of binary.dat
Summary
Streams: stdin, stdout, stderr
fopen: opening text files
functions: perror, fprintf, fscanf, fgetc, fputc
variables: argc, argv
“b” option to fopen to open binary files
functions: fread, fwrite, fseek, ftell