1 Data Structures A data structure is an arrangement of data in memory. Its purpose is to map real-world data organization to the logical world in a way that makes problems easy to understand and solve. The aims are to make the link between the program and the real world clear, and to make programs process data as efficiently as possible.
2 Algorithms An algorithm is a step-by-step sequence of instructions that operates on data structures for a specific purpose. Some data structures are better than others for data processing.
3 Silly example Suppose we want a program that manipulates people's names. We could use the basic char data type as our data structure. That is, we could define 30 separate variables, char chr1, chr2, chr3 … chr30; and then assign each one a character of a name. This would be clumsy, hard work and inflexible. A better data structure is a string, which enables manipulation of the text as a whole and as individual characters.
4 String Data Structure Topics: basic input and output; basic string processing; character-by-character input; library functions; using strings with functions; common errors.
5 Fundamentals String literal: a sequence of characters in double quotes, e.g. "This is a string", "Hello world", "mydata.dat". A string is an array of characters. What is an array? It is a data structure, a container, a way of organising data. It is a chunk of memory, a series of adjacent bytes.
6 String Arrays A string is a sequence of characters plus one special character, the end-of-string character. These slides call it NULL; it is the null character, written '\0' (not to be confused with the null pointer constant NULL). E.g. the string "Good Morning!" has one extra character, the NULL character, which is not displayed.
7 "Good Morning!"
    | G | o | o | d |   | M | o | r | n | i | n | g | ! | \0 |
Each box is one byte of memory. The string takes up one contiguous chunk of memory. The string takes up 14 bytes, not 13. The end-of-string character acts as a sentinel (a marker used in many string-processing problems).
8 String Variables Think of a name for your variable, e.g. FirstName, StreetName, Town, Country etc. Think about the maximum number of characters you need to store: 5, 10, 80, 100, etc. Then add one to that number for the NULL end-of-string character. Declare the string variable as an array of characters, using the array notation.
9 Example
    char FirstName[20]; // able to store a string of 19 chars
    char LastName[20];  // able to store a string of 19 chars
    char Sentence[80];  // able to store a string of 79 chars
    char word[10];      // able to store a string of 9 chars
These are uninitialised string variables: only memory has been allocated, and they do not contain any useful information. No, the comments are not an error! Take the string word, for instance. It appears that 10 spaces have been allocated, enough for 10 characters. However, strings of this sort require one extra space for the end-of-string character, which means you only have 9 "usable" spaces.
10 Example: declaration plus initialisation
    char S[14] = "Chris";
Initialising this way automatically places a NULL character at the end of the string. Compare with the uninitialised version:
    char S[14];
    S[0]  S[1]  S[2]  S[3]  S[4]  S[5]  S[6]  S[7]  S[8]  S[9]  S[10] S[11] S[12] S[13]
    C     h     r     i     s     \0                                                    (initialised)
    ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?     (uninitialised)
Note that the maximum subscript (13) is one less than the size of the array (14).
11 String Indexes A string has a certain length, that is, the number of characters not including NULL. This is not the same as the size of the array. The name of the string is a reference to the address of the first character. We can access any character we wish by using the square bracket notation, e.g. FirstName[0] accesses the first character in the string FirstName, FirstName[1] the second character, FirstName[2] the third, and so on. Problem: because the array index (or array subscript) starts at 0, it is easy to get confused and make off-by-one errors.
12 Demo 1 Accessing elements of a string: char message[14] = "Good Morning!"; Illustrate access of an uninitialised string. Illustrate going past the end-of-string character while still in the bounds of the array. Illustrate going off the end of the array. Illustrate the difference between strlen and sizeof.
13 Basic input of strings Use the extraction operator (with cin or a file stream): cin >> FirstName; fin >> FirstName; Note that no square brackets [ ] are used. Problem: the extraction operator uses white space as a terminator, so it reads only one word at a time.
14 More string input tools cin.get() or fin.get() extracts a single character, including white space! cin.getline() or fin.getline() extracts a whole line of characters, including white space.
15 cin.get() cin.get() returns a character entered at the keyboard, e.g. to enter a number of characters at the keyboard, including spaces, and assign them to individual string elements:
    const int MAXCHARS = 80;
    char line[MAXCHARS+1];
    int i = 0;
    do {
        line[i] = cin.get();
        i++;
    } while (line[i-1] != '\n' && i < MAXCHARS);
    line[i] = '\0'; // note: the '\n', if read, is stored as part of the string
DO DEMO 2
16 Using getline() Basic use of getline needs three arguments: cin.getline(stringvar, lengthvar, stop_char) stringvar is the name of a string variable. lengthvar is the name of an integer variable that gives the size of the array, including the space for NULL; at most lengthvar-1 characters are read. A literal int can be used. Extraction stops when stop_char is encountered. cin.getline(LastName,14,'\n'); extracts up to 13 characters from the keyboard, adds the NULL, and stops early if '\n' is encountered.
17 Demo 3 Creating string variables. Inputting with the extraction operator: problems with spaces. Using getline().
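18
    #include <iostream>
    using namespace std;

    int main()
    {
        const int MAXCHARS = 80;
        char message[MAXCHARS+1]; // room for MAXCHARS characters plus NULL
        cout << "Please enter a message : ";
        cin.getline(message, MAXCHARS+1, '\n'); // reads at most MAXCHARS characters
        cout << "Your message is : " << message << endl;
        return 0;
    }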
19 Using = and == with strings You cannot use these operators in the same way as you would with basic data types. E.g.
    char Name[10];
    Name = "Thomas"; // assignment: illegal!
Using = with a string literal is only allowable in the declaration (initialisation).
20 Assignment of strings Use the strcpy() library function. Needs
    #include <cstring>
    strcpy(destination, source);
e.g.
    strcpy(FirstName, "Chris");
"Assigns" the string "Chris" to FirstName.
21 Checking for equality The statement
    if (FirstName1 == FirstName2) { /* do something */ }
is not illegal, but it does NOT test whether FirstName1 holds the same characters as FirstName2; it compares the addresses of the two arrays. Use the strcmp() library function, which again needs
    #include <cstring>
22 Comparisons of Strings strcmp(string1, string2) returns an integer:
    > 0 if string1 is "bigger" than string2
    < 0 if string1 is "smaller" than string2
    0 if string1 is the same as string2 (note that 0 counts as false in a condition)
It compares the strings a character at a time. Typical use:
    if (strcmp(FirstName, "Chris") == 0)   // true when the strings match
        cout << "My Name" << endl;
23 Finding the length of a string A very important task in string processing is to find the number of characters in a string, e.g. to set the bounds of loops that process the string. Remember that the declaration only gives the maximum number of characters (plus one for NULL). The NULL character \0 marks the end of a string. We want the count up to the NULL character.
24 Library function strlen() Needs
    #include <cstring>
This function returns the length of a string.
    char FirstName[40];
    int size;
    strcpy(FirstName, "Christopher");
    size = strlen(FirstName);
    for (int i = 0; i < size; i++)
        FirstName[i] = '*';   // a single character, so single quotes
    cout << FirstName << endl;
25 String Processing Loops, loops and loops: for loops, while loops, do while loops. E.g. to fill a string my_string with '*' characters:
while loop version
    int i = 0;
    while (my_string[i] != '\0') {
        my_string[i] = '*';
        i++;
    }
for loop version
    for (i = 0; my_string[i] != '\0'; i++) {
        my_string[i] = '*';
    }
NOTE THIS IS DANGEROUS since no check is made on whether there is a NULL character.
26 Safer Version while loop version
    int i = 0;
    while (i < MAXCHARS && my_string[i] != '\0') {   // check the bound first
        my_string[i] = '*';
        i++;
    }
I leave it as an exercise to adapt the for loop version to a safe version.
27 CAUTION When manipulating strings make sure you do not lose or replace the NULL character. Make sure that the NULL character is in the correct place, that is, one position after the last character in the string.
28 Strings as function arguments Passing a string to a function, e.g. size = strlen(mystring); Notice that no square brackets [ ] are used in the call. What is passed?
29 A bit of technical stuff. Take a deep breath! The string variable name without square brackets gives the address in memory of the first position of the string. All that is passed is an address! A variable that can store addresses is called a pointer. When an array is passed to a function, all that is passed is the address of the first element of the array.
30 How does the function know the size of the string? The standard functions use the NULL character as a sentinel. They have a precondition that the strings passed are properly formed. A dangerous assumption. It would be safer if we provided information about the sizes of the strings being manipulated. E.g. what happens if we attempt to copy a string with 50 characters into a string that can only take 5 characters max? Memory is overwritten!
31 Creating your own functions that operate on strings: e.g. a safe string copy function. Prototype:
    void my_string_copy(char target[], const char source[], int target_size);
Notice the syntax of the prototype. The sizes of target and source are not given in the brackets, since all that is passed is the address of the first element: target and source receive addresses. This is similar to pass by reference. source is marked const because the function only reads it, which also lets a string literal be passed. Note also the choice of generic names for the parameters, which do not conflict with identifiers elsewhere in your program.
32 Function Definition
    // assumes target_size is at least 1
    void my_string_copy(char target[], const char source[], int target_size)
    {
        int new_length; // temporarily hold length of source string
        new_length = strlen(source);
        // can target hold a string this long?
        if (new_length > (target_size-1)) // allow for NULL
            new_length = target_size-1;   // fit in all we can
        // copy character by character
        int i; // declared outside the loop because it is used after it
        for (i = 0; i < new_length; i++)
            target[i] = source[i];
        target[i] = '\0'; // add the NULL to the end of target
    }
33 Putting it all together
    #include <iostream>
    #include <cstring>
    using namespace std;

    // prototype
    void my_string_copy(char target[], const char source[], int target_size);

    int main()
    {
        char shortSTR[5]; // holds up to 4 chars
        char longSTR[] = "This is a long string";
        my_string_copy(shortSTR, "Hello", 5);
        cout << shortSTR << "STRING ENDS HERE. \n";
        my_string_copy(shortSTR, longSTR, 5);
        cout << shortSTR << "STRING ENDS HERE. \n";
        return 0;
    }

    // assumes target_size is at least 1
    void my_string_copy(char target[], const char source[], int target_size)
    {
        int new_length; // temporarily hold length of source string
        new_length = strlen(source);
        // can target hold a string this long?
        if (new_length > (target_size-1)) // allow for NULL
            new_length = target_size-1;   // fit in all we can
        // copy character by character
        int i; // declared outside the loop because it is used after it
        for (i = 0; i < new_length; i++)
            target[i] = source[i];
        target[i] = '\0'; // add the NULL to the end of target
    }
DO DEMO 4
34 Summary We have been discussing strings, in particular the C-string form. This sort of string uses a NULL sentinel. The string is an array of characters. You cannot assign with the = operator; use strcpy() or write your own. You cannot compare with ==; use strcmp() or write your own. Be careful when operating on strings not to write past the end of the string! Protect with a check for length. Be careful not to overwrite the NULL character, or forget to insert it. When strings are used with functions it is like passing by reference. There is no pass by value for arrays! Note there is another way of defining strings using the string class (not covered here).