
CMPT 128: Introduction to Computing Science for Engineering Students. C-strings and strings. © Janice Regan, CMPT 128, February 2007


1 © Janice Regan, CMPT 128, February. 2007 0 CMPT 128: Introduction to Computing Science for Engineering Students C-strings and strings

2 1 Simple and Composite Variables  We have studied simple variables  A simple variable describes a single value  A simple variable has an identifier  A simple variable has a type that describes the properties of the value of the variable, the permissible operations for the variable, and the representation of the variable in computer memory  We can also have composite variables  These variables describe a group of values  Arrays: all values in the group have the same type  Structures: different values in the group can have different types

3 2 Composite Variables  composite variables describe a group of values  1 dimensional arrays or variables of a particular type (all entries must have the same type)  multi dimensional arrays or variables of a particular type (all entries must have the same type)  Structures containing groups of variables of different types  Strings are another special type that builds on arrays  An array of characters  A set of special operations appropriate for text

4 3 One-Dimensional (1-D) Arrays  An array is an indexed data structure  All variables stored in an array are of the same data type  An element of an array is accessed using the array name and an index or subscript  The name of the array is the address of the first element and the subscript is the offset  In C and C++, the subscripts always start with 0 and increment by 1

5 4 Strings (C++ string objects)
• Easy to manipulate using operators like =, ==, <=, >, +
• Only available in C++; there are no objects in C
• C-strings and strings cannot be mixed without an explicit conversion or copy
• System functions often use C-strings, so changing your strings to C-strings may be necessary

6 5 C-strings
• In C, if we wish to use a group of characters as a variable we can create a C-string.
• The C-string is an extension of an array of characters.
• The C-string can be manipulated using functions from the library <string.h> for pure C, or <cstring> for using C-strings in C++

7 C-strings: limitations  C-strings cannot be combined using the operators we are used to like =, ==, +  Instead of using operators we use functions from the C-string library  When using C strings we must take extra care to check that we are not accessing memory past the end of the allocated character array that holds the C-string 6

8 7 1-D Character array
char list[10];
• Allocates memory for 10 characters
• Ten adjacent locations in memory are allocated: list[0], list[1], list[2], ..., list[9]
• Remember C++ does not perform any bounds checking on arrays

9 8 Initializing 1-D Arrays
• Strings are not the same as 1-D character arrays
• You can specify individual values for each character in a 1-D character array
/* put one character in each element of the array */
char list[5] = {'h', 'e', 'l', 'l', 'o'};
After initialization memory looks like
list[0] 'h' | list[1] 'e' | list[2] 'l' | list[3] 'l' | list[4] 'o'
Because there is no null termination character the array cannot be used as a C-string

10 9 Difference: C-string vs 1D character array  C-Strings are character arrays with special properties that are operated on using functions from the string library  C-strings are partially filled character arrays that are used in a particular way in C++ or in C  A C-string always ends with a null termination character (\0) The null termination character tells all the functions in the string library where the string ends, allowing C-strings to be manipulated by the C string library
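The difference between a plain character array and a C-string can be checked in code. The sketch below is illustrative only: the helper name isTerminated and the two sample arrays are assumptions, not part of the course material. It looks for a \0 inside the array bounds, which is what makes the array usable by the string library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns true if a '\0' occurs within the first
// `size` elements of `arr`, i.e. the array holds a valid C-string.
bool isTerminated(const char* arr, std::size_t size) {
    assert(arr != nullptr);
    return std::memchr(arr, '\0', size) != nullptr;
}

// Five letters and no room for '\0': a plain character array.
const char letters[5] = {'h', 'e', 'l', 'l', 'o'};

// Six elements: five letters plus the '\0' supplied by the string literal.
const char word[6] = "hello";
```

Here isTerminated(letters, 5) is false and isTerminated(word, 6) is true, so only word may be passed to functions such as strlen.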

11 10 Strings of different lengths
• Strings of different lengths can be stored in a character array
• The maximum number of characters in the string is the number of characters in the array minus one
• Blanks can be included in the string
• Blanks count as characters
char list[9] = {"hello"};
char list1[9] = {"hi jane"};
index  [0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]
list   'h'  'e'  'l'  'l'  'o'  '\0' '\0' '\0' '\0'
list1  'h'  'i'  ' '  'j'  'a'  'n'  'e'  '\0' '\0'

12 11 Avoid a common problem (1)  C and C++ do not perform any bounds checking on arrays  This means that you can accidentally change the values of other variables. Changing the value of an array element that is not actually part of your array may change the value of one of your other variables  For a C-string variable this is particularly easy. You must remember that character array mystring[20] holds a string of no more than 19 characters “hello my friend” has 15+1 characters “joe” has 3+1 characters REMEMBER THE NULL TERMINATION CHARACTER

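13 12 Avoid a common problem (2)
int count = 3;
char myArray[5] = {"hello"};
After the first declaration memory looks like
myArray[0] ? | myArray[1] ? | myArray[2] ? | myArray[3] ? | myArray[4] ? | count 3
After the six characters of "hello" (five letters plus \0) are written into myArray
myArray[0] 'h' | myArray[1] 'e' | myArray[2] 'l' | myArray[3] 'l' | myArray[4] 'o' | count '\0'
REMEMBER: Leave room for the \0
The literal "hello" needs 6 elements. A C++ compiler rejects the declaration above for that reason; a C compiler accepts it but stores no \0, so myArray is not a valid C-string either way.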

14 13 Avoid a common problem (3)
• C++ does not perform any bounds checking on arrays
• When the contents of a string are initialized or changed using functions from <cstring>, a string that is longer than will fit into the character array associated with the string can be placed in the string. This may change the value of a completely different variable
• It is imperative that you be very careful to avoid using strings longer than the allocated space

15 14 Avoid a common problem (3)
int count[4] = {1, 2, 3, 5};
char mySt[5];
After the declarations memory looks like
mySt[0] ? | mySt[1] ? | mySt[2] ? | mySt[3] ? | mySt[4] ? | count[0] 1 | count[1] 2 | count[2] 3 | count[3] 5
After putting the string "my name" into the variable mySt
mySt[0] 'm' | mySt[1] 'y' | mySt[2] ' ' | mySt[3] 'n' | mySt[4] 'a' | count[0] 'm' | count[1] 'e' | count[2] '\0' | count[3] 5
• mySt has no terminating \0, so the string library breaks
• In addition the array count has been corrupted

16 15 Arrays of strings or C-strings
• Declare your array of strings
#define NUMNAMES 20
#define MAXNAMELEN 32
char names[NUMNAMES][MAXNAMELEN];
• Declare and initialize your array
#define NUMMONTHS 12
#define MONTHNAMESIZE 10
char month[NUMMONTHS][MONTHNAMESIZE] = { "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December" };

17 16 Initializing an array of strings or C-strings
char month[12][10] = { "January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December" };
The first rows of the array then hold (any remaining elements in each row are also '\0'):
month[0] 'J' 'a' 'n' 'u' 'a' 'r' 'y' '\0'
month[1] 'F' 'e' 'b' 'r' 'u' 'a' 'r' 'y' '\0'
month[2] 'M' 'a' 'r' 'c' 'h' '\0'
month[3] 'A' 'p' 'r' 'i' 'l' '\0'
month[4] 'M' 'a' 'y' '\0'
month[5] 'J' 'u' 'n' 'e' '\0'
month[6] 'J' 'u' 'l' 'y' '\0'
month[7] 'A' 'u' 'g' 'u' 's' 't' '\0'

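18 17 Initializing an array of strings or C-strings
char month[12][10];
strcpy(month[0], "January");
strcpy(month[10], "November");
month[11][0] = 'D';
/* ... one assignment per letter ... */
month[11][7] = 'r';
A row of a character array cannot be assigned with = after its declaration, so whole names are copied with strcpy and single characters are assigned one element at a time.
month[0] 'J' 'a' 'n' 'u' 'a' 'r' 'y' '\0'
month[10] 'N' 'o' 'v' 'e' 'm' 'b' 'e' 'r' '\0'
month[11] 'D' 'e' 'c' 'e' 'm' 'b' 'e' 'r' ? ?
month[11] is not a valid C-string until a '\0' is also stored in month[11][8].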

19 18 Initializing or changing a string or C-string
• You can initialize or change the value of a C++ string variable anywhere in your code by using the statement
myString = "this is my string";
• You can initialize the values of a C-string in a declaration by using a statement like
char myString[20] = "this is my string";
• After declaration you can only change the value of a C-string as a whole by using a function from <cstring> (string.h in C) like
strcpy(myString, "this is my string");

20 19 Putting data into a 1-D Array  Another common way of assigning values to C- strings or arrays of C-strings is to read data values from a file directly into the string or array of strings  Each value read from the file is assigned to a single string(for example names[6])  A single row stored in the ith row in the array of strings names is referred to as names[i],  Note that checks to determine the file was opened correctly and that data was read correctly have been omitted from the example, they should not be omitted from your code

21 20 Array of C-strings read from a data file
#include <fstream>
#include <iostream>
using namespace std;
#define NUMPEOPLE 30
#define NAMELEN 32
char names[NUMPEOPLE][NAMELEN];
char title[30];
int ages[NUMPEOPLE];
int k;
ifstream inStream;
inStream.open("registrants");
/* the title is the first item in the file */
inStream >> title;
cout << title << endl;
for (k = 0; k < NUMPEOPLE; k++) {
    /* each line of the file holds one name and one age */
    inStream >> names[k] >> ages[k];
    cout << names[k] << " " << ages[k] << endl;
}
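The checks omitted from the example above could look roughly like the following sketch. It is not the course's code: the function name readRegistrants is hypothetical, and reading from a generic std::istream (so that either an ifstream or an istringstream can be passed in) is an assumption made to keep the example self-contained. std::setw limits each name to NAMELEN - 1 characters so that a long name cannot overrun its row.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file

const int NUMPEOPLE = 30;
const int NAMELEN = 32;

// Hypothetical helper: reads up to NUMPEOPLE "name age" pairs from `in`.
// The loop stops at the first failed read, so the return value is the
// number of complete records actually stored.
int readRegistrants(std::istream& in, char names[][NAMELEN], int ages[]) {
    assert(names != nullptr && ages != nullptr);
    int k = 0;
    // setw(NAMELEN) stores at most NAMELEN - 1 characters plus the '\0'.
    while (k < NUMPEOPLE &&
           in >> std::setw(NAMELEN) >> names[k] >> ages[k]) {
        ++k;
    }
    return k;
}
```

An ifstream that failed to open is already in a failed state, so passing it to this function simply returns 0; a program would normally test the stream right after open() and report the problem.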

22 21 Notes on array input
• When you read or write a string or C-string you read or write all characters in that string
• The final character in the string is determined by the location of the null termination character \0

23 22 Strings as function parameters  Arrays, or parts of arrays, can be passed as arguments to functions.  An element of a C-string can be used as a simple character variable parameter It can be passed by value or by reference  An entire array containing a C-string can be used as a parameter of a function It can only be passed by reference using the name of the string (the name of the string is a reference to the location in memory of the first character in the string)
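A short sketch of a C-string used as a function parameter follows. The function names countChar and blankOut are made up for illustration. Because the parameter receives the address of the first character, no copy of the array is made, and changes made inside the function are visible to the caller.

```cpp
#include <cassert>

// Hypothetical example: counts how often `target` occurs in the C-string.
// `const` promises that the caller's characters are not modified.
int countChar(const char str[], char target) {
    assert(str != nullptr);
    int count = 0;
    for (int i = 0; str[i] != '\0'; ++i) {
        if (str[i] == target) {
            ++count;
        }
    }
    return count;
}

// Hypothetical example: replaces every `target` with a blank.
// The caller's array itself is changed, since only its address was passed.
void blankOut(char str[], char target) {
    assert(str != nullptr);
    for (int i = 0; str[i] != '\0'; ++i) {
        if (str[i] == target) {
            str[i] = ' ';
        }
    }
}
```

For a caller holding char word[] = "hello", countChar(word, 'l') gives 2, and after blankOut(word, 'l') the caller's array holds "he  o".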

24 23 Strings as a data type
• Remember a data type has a group of objects (things) that can be combined in different ways using the operators for that data type.
• The operators for C-strings are not those used for other data types (like +, -, = ...)
• All operations on C-strings are performed using functions from the string library (other than reading and writing)
• To include the C-string library in your program:
#include <cstring> (C++)
#include <string.h> (C)

25 24 Assigning C-strings
• You have seen that = can be used when assigning values to strings in a declaration
• = cannot be used to assign a string literal to a C-string after its declaration. The following is not valid:
mystring = "testinput";
• To copy one string to another, the string library function strcpy is usually used
strcpy(mystring, "testinput");
strcpy(myCopiedString, myOriginalString);

26 25 Assigning strings
• To copy one string to another, the string library function strcpy is usually used
• To copy part of a string (a substring), or to make sure you do not copy more characters into a string than it can hold, you can also use strncpy
strncpy(mystring, "testinput", 5);
• Note that strncpy copies only the first 5 characters of "testinput" (testi) and does not add a \0 to the end of the copied string
• strcpy copies the entire string (even if it is longer than the available space!) including the \0
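Since strncpy does not add the \0 when the source is too long, a common pattern is to copy at most one character less than the array size and then store the \0 by hand. The sketch below shows that pattern; the function name safeCopy and its parameter names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies as much of `src` as fits into `dest`
// (an array of `destSize` characters) and always terminates the result.
void safeCopy(char* dest, std::size_t destSize, const char* src) {
    assert(dest != nullptr && src != nullptr && destSize > 0);
    // Copy at most destSize - 1 characters, leaving room for the '\0'.
    std::strncpy(dest, src, destSize - 1);
    // strncpy adds no '\0' when src is too long, so store one explicitly.
    dest[destSize - 1] = '\0';
}
```

With char mySt[5], the call safeCopy(mySt, sizeof mySt, "my name") leaves "my n" in mySt and does not touch any neighbouring variable.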

27 26 Avoid a common problem (3)
int count[4] = {1, 2, 3, 5};
char mySt[5];
strcpy(mySt, "my name");
After the declarations memory looks like
mySt[0] ? | mySt[1] ? | mySt[2] ? | mySt[3] ? | mySt[4] ? | count[0] 1 | count[1] 2 | count[2] 3 | count[3] 5
After the strcpy statement above
mySt[0] 'm' | mySt[1] 'y' | mySt[2] ' ' | mySt[3] 'n' | mySt[4] 'a' | count[0] 'm' | count[1] 'e' | count[2] '\0' | count[3] 5
mySt has no terminating \0 in its array

28 27 Finding the length of a string
• To find the number of characters actually stored in a string use the string library function strlen
int len;
char mystring[30];
strcpy(mystring, "testing");
len = strlen(mystring); /* len now has a value 7 */
• strlen counts the number of characters in the string, not including the terminating \0

29 28 Concatenating Strings
• Combining two strings into a single string
• Use the string library functions strcat or strncat
• strcat and strncat take one string and append it to the end of another string
• The terminating \0 is removed from the end of the first string before the second string is added
• A terminating \0 is placed at the end of the combined string
• strcat and strncat can create a string too long to fit in the allocated string storage

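30 29 Example: using strcat
char name1[10] = "marie";
char name2[10] = "anne";
strcat(name2, name1);
Before the call:
name1 'm' 'a' 'r' 'i' 'e' '\0'
name2 'a' 'n' 'n' 'e' '\0'
After the call:
name1 'm' 'a' 'r' 'i' 'e' '\0'
name2 'a' 'n' 'n' 'e' 'm' 'a' 'r' 'i' 'e' '\0'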

31 30 Concatenating Strings
char mystring[20] = "start input: ";
char mystring1[20] = "input1";
char mystring2[20] = " and output";
/* after the following strcat mystring1 contains */
/* "input1 and output" */
strcat(mystring1, mystring2);
/* after the following strcat mystring contains */
/* "start input: input1 and output" */
/* this string overflows the array mystring */
strcat(mystring, mystring1);

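32 31 Example: using strcat
char mystring[20] = "start input: ";
char mystring1[20] = "input1 and output";
strcat(mystring, mystring1);
Before the call:
mystring (20 elements) holds "start input: " (13 characters plus \0)
mystring1 (20 elements) holds "input1 and output" (17 characters plus \0)
After the call:
strcat has written 13 + 17 + 1 = 31 characters starting at mystring[0], so the last 11 fall outside the array
The array is overrun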

33 32 Concatenating Strings
#define STRLEN 20
int len, added;
char mystring[STRLEN] = "start input: ";
char mystring1[STRLEN] = "input1 and output";
/* To prevent overflow find the number of */
/* characters that can be added to mystring */
/* added (6) = STRLEN (20) - len (13) - 1 */
len = strlen(mystring);
added = STRLEN - len - 1;
/* after the following strncat mystring contains */
/* "start input: input1" */
strncat(mystring, mystring1, added);
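The calculation on this slide can be wrapped in a small function so it does not have to be repeated at every call. This is a sketch only; the name safeAppend and its parameters are assumptions. It relies on the fact that strncat appends at most the requested number of characters and then always writes a terminating \0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends as much of `src` as fits onto the C-string
// already stored in `dest`, an array of `destSize` characters.
void safeAppend(char* dest, std::size_t destSize, const char* src) {
    assert(dest != nullptr && src != nullptr && destSize > 0);
    std::size_t len = std::strlen(dest);   // characters already in dest
    if (len + 1 >= destSize) {
        return;                            // no free space left
    }
    // Room for destSize - len - 1 more characters plus the '\0'.
    std::strncat(dest, src, destSize - len - 1);
}
```

With the values from the slide, safeAppend(mystring, STRLEN, mystring1) leaves "start input: input1" in mystring, which is 19 characters plus the \0.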

34 33 Comparing Strings  To compare 2 strings usually use the string library function strcmp strcmp(mystring1, mystring2)  strcmp returns an integer, if mystring1 is alphabetically before mystring2 a negative number will be returned If the strings are identical 0 will be returned if mystring2 is alphabetically before mystring1 a positive number will be returned

35 34 ASCII equivalents
• Each alphabetic character, digit, or other character (including whitespace characters) has an integer equivalent value
• These integer values are used by strcmp to determine the alphabetical ordering.
• All uppercase letters precede lowercase letters
• All digits precede uppercase letters
• If a string st1 contains only the first few characters of a longer string st2, st1 precedes st2 when compared
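The ordering rules above can be tried out with a few comparisons. strcmp only promises a negative, zero, or positive result, so the sketch below reduces the result to -1, 0, or 1 to make it easy to check. The function name compareSign is hypothetical, and the examples assume the ASCII character set described on this slide.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns -1 if a precedes b, 0 if they are equal,
// and 1 if a follows b, using the ordering that strcmp applies.
int compareSign(const char* a, const char* b) {
    assert(a != nullptr && b != nullptr);
    int result = std::strcmp(a, b);
    return (result > 0) - (result < 0);
}
```

For example, compareSign("Zebra", "apple") is -1 because 'Z' (90) is less than 'a' (97), compareSign("10", "9") is -1 because the comparison is character by character and '1' precedes '9', and compareSign("app", "apple") is -1 because the shorter string is a prefix of the longer one.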

36 35
Char  Dec Oct  Hex  | Char Dec Oct  Hex  | Char Dec Oct  Hex  | Char  Dec Oct  Hex
----------------------------------------------------------------------------------
(nul)  0 0000 0x00 | (sp) 32 0040 0x20 | @  64 0100 0x40 | `     96 0140 0x60
(soh)  1 0001 0x01 | !    33 0041 0x21 | A  65 0101 0x41 | a     97 0141 0x61
(stx)  2 0002 0x02 | "    34 0042 0x22 | B  66 0102 0x42 | b     98 0142 0x62
(etx)  3 0003 0x03 | #    35 0043 0x23 | C  67 0103 0x43 | c     99 0143 0x63
(eot)  4 0004 0x04 | $    36 0044 0x24 | D  68 0104 0x44 | d    100 0144 0x64
(enq)  5 0005 0x05 | %    37 0045 0x25 | E  69 0105 0x45 | e    101 0145 0x65
(ack)  6 0006 0x06 | &    38 0046 0x26 | F  70 0106 0x46 | f    102 0146 0x66
(bel)  7 0007 0x07 | '    39 0047 0x27 | G  71 0107 0x47 | g    103 0147 0x67
(bs)   8 0010 0x08 | (    40 0050 0x28 | H  72 0110 0x48 | h    104 0150 0x68
(ht)   9 0011 0x09 | )    41 0051 0x29 | I  73 0111 0x49 | i    105 0151 0x69
(nl)  10 0012 0x0a | *    42 0052 0x2a | J  74 0112 0x4a | j    106 0152 0x6a
(vt)  11 0013 0x0b | +    43 0053 0x2b | K  75 0113 0x4b | k    107 0153 0x6b
(np)  12 0014 0x0c | ,    44 0054 0x2c | L  76 0114 0x4c | l    108 0154 0x6c
(cr)  13 0015 0x0d | -    45 0055 0x2d | M  77 0115 0x4d | m    109 0155 0x6d
(so)  14 0016 0x0e | .    46 0056 0x2e | N  78 0116 0x4e | n    110 0156 0x6e
(si)  15 0017 0x0f | /    47 0057 0x2f | O  79 0117 0x4f | o    111 0157 0x6f
(dle) 16 0020 0x10 | 0    48 0060 0x30 | P  80 0120 0x50 | p    112 0160 0x70
(dc1) 17 0021 0x11 | 1    49 0061 0x31 | Q  81 0121 0x51 | q    113 0161 0x71
(dc2) 18 0022 0x12 | 2    50 0062 0x32 | R  82 0122 0x52 | r    114 0162 0x72
(dc3) 19 0023 0x13 | 3    51 0063 0x33 | S  83 0123 0x53 | s    115 0163 0x73
(dc4) 20 0024 0x14 | 4    52 0064 0x34 | T  84 0124 0x54 | t    116 0164 0x74
(nak) 21 0025 0x15 | 5    53 0065 0x35 | U  85 0125 0x55 | u    117 0165 0x75
(syn) 22 0026 0x16 | 6    54 0066 0x36 | V  86 0126 0x56 | v    118 0166 0x76
(etb) 23 0027 0x17 | 7    55 0067 0x37 | W  87 0127 0x57 | w    119 0167 0x77
(can) 24 0030 0x18 | 8    56 0070 0x38 | X  88 0130 0x58 | x    120 0170 0x78
(em)  25 0031 0x19 | 9    57 0071 0x39 | Y  89 0131 0x59 | y    121 0171 0x79
(sub) 26 0032 0x1a | :    58 0072 0x3a | Z  90 0132 0x5a | z    122 0172 0x7a
(esc) 27 0033 0x1b | ;    59 0073 0x3b | [  91 0133 0x5b | {    123 0173 0x7b
(fs)  28 0034 0x1c | <    60 0074 0x3c | \  92 0134 0x5c | |    124 0174 0x7c
(gs)  29 0035 0x1d | =    61 0075 0x3d | ]  93 0135 0x5d | }    125 0175 0x7d
(rs)  30 0036 0x1e | >    62 0076 0x3e | ^  94 0136 0x5e | ~    126 0176 0x7e
(us)  31 0037 0x1f | ?    63 0077 0x3f | _  95 0137 0x5f | (del) 127 0177 0x7f


38 37 Comparing Parts of Strings
• To compare the first n characters of 2 strings use the string library function strncmp
strncmp(mystring1, mystring2, n)
• strncmp returns an integer. Only the first n characters of each string are considered:
if mystring1 is alphabetically before mystring2 a negative number will be returned
if the first n characters are identical 0 will be returned
if mystring2 is alphabetically before mystring1 a positive number will be returned

39 38 Character analysis
• You can also analyze a string (or a character array) one character at a time
• The ctype library (#include <cctype>) includes functions for such analysis. Each of the is... functions returns an integer value. The value is nonzero if the condition checked is true (0 if it is false)
isalpha(char mychar); /* is an alphabetic character */
isdigit(char mychar); /* is a numeral */
ispunct(char mychar); /* is a non whitespace punctuation character */
isspace(char mychar); /* is a whitespace character */
tolower(char mychar); /* converts an upper case letter to lower case */
toupper(char mychar); /* converts a lower case letter to upper case */
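A possible use of these functions is sketched below: one pass over a C-string that counts letters, digits, whitespace, and punctuation, plus a helper that converts a string to upper case in place. The struct CharCounts and the function names are invented for this example. Each character is converted to unsigned char before the call, because the ctype functions are only defined for values in that range (and EOF).

```cpp
#include <cassert>
#include <cctype>

// Hypothetical result type for the example.
struct CharCounts {
    int letters = 0;
    int digits = 0;
    int spaces = 0;
    int punct = 0;
};

// Classifies every character of the C-string `text` exactly once.
CharCounts analyze(const char* text) {
    assert(text != nullptr);
    CharCounts counts;
    for (int i = 0; text[i] != '\0'; ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalpha(c)) {
            ++counts.letters;
        } else if (std::isdigit(c)) {
            ++counts.digits;
        } else if (std::isspace(c)) {
            ++counts.spaces;
        } else if (std::ispunct(c)) {
            ++counts.punct;
        }
    }
    return counts;
}

// Converts every lower case letter in `text` to upper case, in place.
void toUpperInPlace(char* text) {
    assert(text != nullptr);
    for (int i = 0; text[i] != '\0'; ++i) {
        unsigned char c = static_cast<unsigned char>(text[i]);
        text[i] = static_cast<char>(std::toupper(c));
    }
}
```

For the string "Hi, Bob 42!" analyze reports 5 letters, 2 digits, 2 whitespace characters, and 2 punctuation characters.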

40 39 The ctype and cstring Libraries  We have had an introduction to some of the functions in these libraries.  These libraries are much more flexible than this subset of functions indicates  You should be able to read the function descriptions for the other functions in the string library and then use those functions in your programs

41 40 C-String Output  Can output with insertion operator, <<  As we’ve been doing already: cout << news << " Wow.\n";  Where news is a c-string variable  Possible because << operator is available for C-strings!

42 41 C-String Input  Can input with extraction operator, >>  Issues exist, however  Whitespace is "delimiter"  Tab, space, line breaks are "skipped"  Input reading "stops" at delimiter  Watch size of c-string Must be large enough to hold entered string! C++ gives no warnings of such issues!

43 42 C-String Line Input
• Can receive entire line into c-string
• Use getline(), a predefined member function:
char a[80];
cout << "Enter input: ";
cin.getline(a, 80);
cout << a << "END OF OUTPUT\n";
• Dialogue:
Enter input: Hello friend!
Hello friend!END OF OUTPUT

44 43 More getline()  Can explicitly tell length to receive: char shortString[5]; cout << "Enter input: "; cin.getline(shortString, 5); cout << shortString << "END OF OUTPUT\n";  Results: Enter input: Goodbye friend; GoodEND OF OUTPUT  Forces FOUR characters only be read Recall need for null character!
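When getline() fills the array before reaching the end of the line, the stream is left in a failed state and the rest of the line is still waiting to be read. The sketch below shows one way to detect and clean up after that situation. The function name readLine is hypothetical, and it takes a generic std::istream so it can be tried with an istringstream as well as with cin.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>  // std::istringstream, handy for trying this without a keyboard

// Hypothetical helper: reads one line into `buf` (at most size - 1 characters).
// Returns true if the whole line fit. If the line was too long, the part that
// fit stays in `buf`, the rest of the line is discarded, and false is returned.
bool readLine(std::istream& in, char* buf, std::streamsize size) {
    assert(buf != nullptr && size > 1);
    in.getline(buf, size);
    if (in.fail() && in.gcount() == size - 1) {
        // The buffer filled up before the '\n' was found.
        in.clear();  // make the stream usable again
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        return false;
    }
    return !in.fail();
}
```

With a 5-character array and the input line "Goodbye friend", the array receives "Good", the function returns false, and the next call starts cleanly at the following line.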

45 44 Character I/O  Input and output data  ALL treated as character data  e.g., number 10 outputted as '1' and '0'  Conversion done automatically Uses low-level utilities  Can use same low-level utilities ourselves as well

46 45 Member Function get()  Reads one char at a time  Member function of cin object: char nextSymbol; cin.get(nextSymbol);  Reads next char & puts in variable nextSymbol  Argument must be char type Not "string"!

47 46 Member Function put()  Outputs one character at a time  Member function of cout object:  Examples: cout.put('a');  Outputs letter "a" to screen char myString[10] = "Hello"; cout.put(myString[1]);  Outputs letter "e" to screen

48 47 More Member Functions  putback()  Once read, might need to "put back"  cin.putback(lastChar);  peek()  Returns next char, but leaves it there  peekChar = cin.peek();  ignore()  Skip input, up to designated character  cin.ignore(1000, '\n'); Skips at most 1000 characters until '\n'

49 48 Standard Class string
• Defined in library:
#include <string>
using namespace std;
• String variables and expressions
• Treated much like simple types
• Can assign, compare, add:
string s1, s2, s3;
s3 = s1 + s2; //Concatenation
s3 = "Hello Mom!"; //Assignment
• Note c-string "Hello Mom!" automatically converted to string type!

50 49 I/O with Class string  Just like other types!  string s1, s2; cin >> s1; cin >> s2;  Results: User types in: May the hair on your toes grow long and curly!  Extraction still ignores whitespace: s1 receives value "May" s2 receives value "the"

51 50 getline() with Class string
• For complete lines:
string line;
cout << "Enter a line of input: ";
getline(cin, line);
cout << line << "END OF OUTPUT";
• Dialogue produced:
Enter a line of input: Long and curly?
Long and curly?END OF OUTPUT
• Similar to c-string's usage of getline()

52 51 Pitfall: Mixing Input Methods
• Be careful mixing cin >> var and getline
int n;
string line;
cin >> n;
getline(cin, line);
• If input is:
42
Hello hitchhiker.
• Variable n is set to 42; line is set to the empty string!
• cin >> n skipped leading whitespace and read 42, leaving "\n" on the stream for getline()!
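One common way around this pitfall is to discard the rest of the current line after the >> extraction, using the ignore() member function described earlier. The sketch below assumes that approach; the function name readNumberThenLine is hypothetical, and a generic std::istream is used so the code can be tried with an istringstream instead of cin.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>  // std::istringstream, handy for trying this without a keyboard
#include <string>

// Hypothetical helper: reads an integer with >>, then the complete next line.
// Returns false if either read fails.
bool readNumberThenLine(std::istream& in, int& n, std::string& line) {
    if (!(in >> n)) {
        return false;
    }
    // Discard whatever is left on the number's line, including the '\n'.
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return static_cast<bool>(std::getline(in, line));
}
```

With the input from the slide, n is set to 42 and line is set to "Hello hitchhiker." rather than the empty string.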

53 52 Class string Processing  Same operations available as c-strings  And more!  Over 100 members of standard string class  Some member functions: .length() Returns length of string variable .at(i) Returns reference to char at position i

54 53 C-string and string Object Conversions
• Automatic type conversions
• From c-string to string object:
char aCString[] = "My C-string";
string stringVar;
stringVar = aCString;
Perfectly legal and appropriate!
• aCString = stringVar; ILLEGAL! Cannot auto-convert to c-string
• Must use explicit conversion:
strcpy(aCString, stringVar.c_str());
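Because strcpy does no bounds checking, the explicit conversion is only safe when the character array is large enough for the string plus its \0. The sketch below adds that check before copying. The function name toCString and its parameters are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copies `str` into the character array `dest`
// (which has `destSize` elements) only if it fits, including the '\0'.
// Returns false and leaves `dest` untouched when it does not fit.
bool toCString(const std::string& str, char* dest, std::size_t destSize) {
    assert(dest != nullptr);
    if (str.length() + 1 > destSize) {
        return false;
    }
    std::strcpy(dest, str.c_str());
    return true;
}
```

The string "My C-string" has 11 characters, so it fits in an array of 12 or more elements; with a smaller array the function refuses to copy instead of overrunning it.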

