Page 1 Data Structures in C for Non-Computer Science Majors Kirs and Pflughoeft Strings Chapter 5
Page 2
What you MUST know before we start:
- What characters are
- How characters are stored in RAM
- How characters differ from, and are the same as, integers
- The use of C/C++ to manipulate characters
(Remember: the topics in this course build on each other.)
Page 3
A string is simply a numeric array of data type char.

    int main(void)
    {
        int i;
        char chararray[5];

        chararray[0] = 'H';
        chararray[1] = 'e';
        chararray[2] = 'l';
        chararray[3] = 'l';
        chararray[4] = 'o';
        return 0;
    }

The declaration char chararray[5]; reserves 5 bytes of RAM at address chararray.
The assignments initialize each element of the array with a character.
Page 4
We could also have written the C code as:

    int main(void)
    {
        int i;
        char chararray[5];

        chararray[0] = 72;
        chararray[1] = 101;
        chararray[2] = 108;
        chararray[3] = 108;
        chararray[4] = 111;
        return 0;
    }

which would have exactly the same effect, because 72, 101, 108, 108 and 111 are the ASCII codes of 'H', 'e', 'l', 'l' and 'o'.

To print the array:

    for (i = 0; i < 5; i++)
        printf("%c", chararray[i]);
Page 5
How would this be stored in RAM?
Assume that the base address of chararray (== &chararray[0]) is 1200:

    Address   Contents
    1200      'H'
    1201      'e'
    1202      'l'
    1203      'l'
    1204      'o'
    1205      (garbage)
Page 6
This is exactly the same as numeric arrays! True, except for one difference. Consider the sentences:

    The quality of mercy is not strained. It droppeth as the gentle rains from the heavens upon the place beneath.

How many characters are there in the sentences (don't forget to count spaces and special characters)?

There are a few points to be made:
- Do we really want to count how many characters there are?
- Do we really care about the positions (offsets from the base address) of the characters?
Page 7
What's the solution? What do we need to know?
- The base address of the array. We can readily determine this by referring to the variable name (in our case, chararray == &chararray[0]).
- Where the string ends.

How do we know where a string ends? Right now, we don't. BUT if we were to add an additional character at the end of the string, we could check for that character to see whether we had reached the end of the string.
Page 8
What character? In C, we add a null character ('\0', the character whose value is 0) at the end of the array. Rewriting our previous C code:

    int main(void)
    {
        int i;
        char chararray[6];

        chararray[0] = 'H';
        chararray[1] = 'e';
        chararray[2] = 'l';
        chararray[3] = 'l';
        chararray[4] = 'o';
        chararray[5] = '\0';
        return 0;
    }

NOTICE: We must allocate 1 byte more than the number of characters we intend to store.
Page 9
How does this make things easier for us? In a number of respects.

Because we know that a null character will be added at the end of a string, we do NOT have to declare how many characters are in a string when we initialize it. We could have declared our string as:

    int main(void)
    {
        int i;
        char chararray[] = "Hello";

which would have the same effect as our previous code.

We could also have printed our string with the command:

    puts(chararray);

which would have the same effect as the code:

    printf("%s\n", chararray);

OR, printing one character at a time (without the trailing newline):

    for (i = 0; i < 5; i++)
        printf("%c", chararray[i]);
Page 10
How would this be stored in RAM?
Again, assume that the base address of chararray (== &chararray[0]) is 1200:

    Address   Contents
    1200      'H'
    1201      'e'
    1202      'l'
    1203      'l'
    1204      'o'
    1205      '\0'
    1206      (garbage)
Page 11
How does puts work?

puts is a standard function found in <stdio.h>. The function receives a base address and continues printing the elements of the array until it encounters a null ('\0') character.

If we were to pass the base address of our string (chararray) as:

    puts(chararray);

the C function necessary might appear as:

    void puts(char *base)
    {                                // we could have used: int i = 0;
        while (*base != '\0')        // or: while (chararray[i] != '\0')
            printf("%c", *base++);   // or: printf("%c", chararray[i++]);
    }

(The library version is declared as int puts(const char *s) and also prints a newline after the string; this simplified version leaves both details out.)

How does this work?
Page 12
Assume, again, that the base address of

    char chararray[] = "Hello";

is 1200. The call puts(chararray); places the address 1200 in the parameter base of

    void puts(char *base)

(assume base itself is stored at address 1750). Looking at RAM, we would see:

    Address   Contents
    1200      'H'
    1201      'e'
    1202      'l'
    1203      'l'
    1204      'o'
    1205      '\0'

The first pass (base == 1200, *base == 'H'):

    while (*base != '\0')        TRUE
        printf("%c", *base++);   prints H, base becomes 1201

    Output so far: H
Page 13
The RAM layout is unchanged (H at 1200 through '\0' at 1205); only base has moved.

The next pass (base == 1201, *base == 'e'):

    while (*base != '\0')        TRUE
        printf("%c", *base++);   prints e, base becomes 1202

    Output so far: He

The next pass (base == 1202, *base == 'l'):

    while (*base != '\0')        TRUE
        printf("%c", *base++);   prints l, base becomes 1203

    Output so far: Hel
Page 14
The next pass (base == 1203, *base == 'l'):

    while (*base != '\0')        TRUE
        printf("%c", *base++);   prints l, base becomes 1204

    Output so far: Hell

The next pass (base == 1204, *base == 'o'):

    while (*base != '\0')        TRUE
        printf("%c", *base++);   prints o, base becomes 1205

    Output so far: Hello
Page 15
The next pass (base == 1205, *base == '\0'):

    while (*base != '\0')        FALSE

We are done with the loop.

Are there other functions associated with strings (in the file <stdio.h>)? Yes, a number of them, including:

    gets    Get a string from the keyboard (until CR entered)
    fputs   Write a string to a file
    fgets   Get a string from a file

There is even a header file for strings: <string.h>
Page 16
What happens if we forget to add a null character? Strange things can happen. Consider the following C code:

    int main(void)
    {
        char chararray[5];

        chararray[0] = 'H';
        chararray[1] = 'e';
        chararray[2] = 'l';
        chararray[3] = 'l';
        chararray[4] = 'o';
        puts(chararray);      // no '\0' was stored: the behavior is undefined
        return 0;
    }

The output of this program might appear as:

    HelloZ

Why?
Page 17
Let's look at how chararray might be stored in RAM (again, assuming a base address of 1200):

    Address   Contents
    1200      'H'
    1201      'e'
    1202      'l'
    1203      'l'
    1204      'o'
    1205      'Z'     (whatever happened to be in memory)
    1206      '\0'

REMEMBER: The function puts will keep printing characters until the null character is reached.
Page 18
Notice also that we have a convenient way to declare strings:

    int main(void)
    {
        int i;
        char chararray[] = "Hello";

- We do NOT need to count the number of characters (the compiler will count for us).
- We do NOT need to add a null character at the end (the compiler will add one for us).

In this case, 6 bytes (including one for the null character) will be reserved at base address chararray, and it will appear in RAM as before.
Page 19
Another function we mentioned, gets, will get a string from the keyboard (until a CR is entered) AND place a null character at the end.

Assume we wished to get a number from the keyboard:

    int main(void)
    {
        char number[5];

        gets(number);
        return 0;
    }

If we were to enter the number 42, and the base address of number were 9832, it would appear as:

    Address   Contents
    9832      '4'
    9833      '2'
    9834      '\0'

BUT, this is NOT a number!
Page 20
When we type keystrokes, we enter characters. If we wish to store keystrokes as integer values, we must convert them. HOW?

FIRST, we need to determine which characters can be converted:
- the characters '0' to '9' only
- the characters '+' and '-', but only if they are the first character in the string

When converting, what must we consider?
- Whether the character is legal
- The position of the character in the array

Why position?
- If the string is "6", the integer value is: 6
- If the string is "32", the integer value is: 3*10 + 2 = 32
- If the string is "675", the integer value is: 6*100 + 7*10 + 5 = 675
Page 21
Consider the following C code:

    int main(void)
    {
        char nstring[] = "724";     // the character string to convert
        int num = 0,                // num will hold the converted value
            offset = 0;             // array index/offset

        // check if the characters are legal
        while ((nstring[offset] >= '0') && (nstring[offset] <= '9'))
            // if yes, then convert AND set positional value
            num = num * 10 + nstring[offset++] - '0';
        return 0;
    }

Following the instructions through the loop:

    offset   nstring[offset]   condition   num = num*10 + nstring[offset++] - '0'   num
    0        '7' = 55          TRUE        0*10 + 55 - 48 = 7                       7
    1        '2' = 50          TRUE        7*10 + 50 - 48 = 72                      72
    2        '4' = 52          TRUE        72*10 + 52 - 48 = 724                    724
    3        '\0' = 0          FALSE       ** Loop terminated **                    724
Page 22
The only drawback to this program is that it takes into account neither the sign nor white space (spaces, CR and tabs). Consider the following C code:

    int atoi(const char *stringnum);        // function prototype

    int main(void)
    {
        int num;
        const char *nstring = "-82";        // let's check a negative number

        num = atoi(nstring);                // call the function
        return 0;
    }

    int atoi(const char *stringnum)
    {
        int n = 0,                          // n will hold the number
            sign = 1;                       // if unsigned then positive

        while (*stringnum == ' ' || *stringnum == '\n' || *stringnum == '\t')
            stringnum++;                    // skip the white spaces
        if ((*stringnum == '+') || (*stringnum == '-'))      // if signed
        {
            if (*stringnum == '-')          // negative number?
                sign = -1;                  // then mark it
            stringnum++;                    // and go to next character
        }
        // the rest is the old procedure, BUT using pointers
        while ((*stringnum >= '0') && (*stringnum <= '9'))   // legal value?
        {
            n = n * 10 + *stringnum - '0';  // determine number to date
            stringnum++;                    // go to next position
        }
        return sign * n;                    // return the SIGNED value
    }
Page 23
Must we write this code each time we input characters from the keyboard (OR from a file)? YES and no.

Each time we get numeric data from the keyboard (or an ASCII file), we MUST convert. Because the conversions are so common, there are readily available library routines in <stdlib.h>:

    Function   Meaning            Action
    atoi       alpha to integer   Convert string to int
    atol       alpha to long      Convert string to long
    atof       alpha to float     Convert string to floating point (the return type is double)
Page 24
Do we need to convert integers (or longs, or floats) to strings? YES. Whenever we store numeric values to an ASCII file, we MUST convert them to a string.

How do we convert? Much like we did when we converted from decimal to binary. Remember that conversion:
- Divide by 2 (for binary) and keep track of the remainder
- Repeat with the quotient until it reaches 0
- Collect the remainders from bottom to top

What does this have to do with converting from integers to strings?
Page 25
Instead of dividing by 2 and keeping track of the remainder, we divide by 10 and keep track of the remainder. Consider the integer 5409:

    Division     Quotient   Remainder   Convert to
    5409 / 10    540        9           '9'
    540 / 10     54         0           '0'
    54 / 10      5          4           '4'
    5 / 10       0          5           '5'

Collect from the bottom: "5409"

The only difference is that we must first convert each digit to a character. How do we convert to a character? Simple: add 48 (the character '0') to the remainder.
Page 26
Consider the following C code:

    int main(void)
    {
        int decimal = 5409,     // the integer we wish to convert
            idx = 0;            // the character array index
        char bin[10];           // the character array for our string

        while (decimal > 0)     // continue until the quotient is 0 (zero)
        {
            bin[idx++] = decimal % 10 + '0';   // store the remainder as a character
            decimal = decimal / 10;            // get the new quotient
        }
        bin[idx] = '\0';        // set in the null character
        return 0;
    }

This would execute as:

    decimal   idx   decimal > 0   decimal % 10 + '0'   decimal / 10   bin
    5409      0     TRUE          9 + 48 = 57 ('9')    540            "9"
    540       1     TRUE          0 + 48 = 48 ('0')    54             "90"
    54        2     TRUE          4 + 48 = 52 ('4')    5              "904"
    5         3     TRUE          5 + 48 = 53 ('5')    0              "9045"
    0         4     FALSE

and then bin[idx] = '\0' leaves bin as "9045\0".
Page 27
NOTE, however, that our string is "9045\0". We must reverse the order. If we were to ADD the following C code just before the end of main:

    int offset = 0;     // we will start with the first character
    char temp;          // for temporary storage

    idx--;              // DON'T move '\0' (idx now contains 3)
    while (idx > offset)
    {
        temp = bin[idx];             // store uppermost non-swapped character
        bin[idx--] = bin[offset];    // move lower char. to upper & decrement
        bin[offset++] = temp;        // move in the old uppermost character
    }

Following the loop:

    idx   offset   idx > offset   temp   bin[idx--]   bin[offset++]   bin afterwards
    3     0        TRUE           '5'    '9'          '5'             "5049\0"
    2     1        TRUE           '4'    '0'          '4'             "5409\0"
    1     2        FALSE          *** We are out of the loop ***

And the string is in the correct order.
Page 28
Must we write this code each time we write numeric values to an ASCII file? YES and no.

Each time we write numeric data to an ASCII file, we MUST convert. Because the conversions are so common, many compilers supply ready-made routines in <stdlib.h>:

    Function   Meaning            Action
    itoa       integer to alpha   Convert int to string
    ltoa       long to alpha      Convert long to string
    ftoa       float to alpha     Convert float to string

Check your manuals: these routines are compiler extensions rather than part of standard C, so their availability and the parameters passed differ.