Strings and Characters

1 Strings and Characters
Chapter 16 Strings and Characters OMIT slides © Pearson Education, Inc. All rights reserved. © 2002 Prentice Hall. All rights reserved. Expanded by J. Goetz

2 The chief defect of Henry King Was chewing little bits of string.
Hilaire Belloc Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences. William Strunk, Jr. I have made this letter longer than usual, because I lack the time to make it short. Blaise Pascal The difference between the almost-right word & the right word is really a large matter—it’s the difference between the lightning bug and the lightning. Mark Twain

3 In this chapter you will learn:
OBJECTIVES
To create and manipulate immutable character string objects of class string.
To create and manipulate mutable character string objects of class StringBuilder.
To manipulate character objects of struct Char.
To use regular expressions in conjunction with classes Regex and Match.
To use character classes to match any character from a set of characters.
To use quantifiers to match a pattern multiple times.
To search for complex patterns in text using regular expressions.
To validate data using regular expressions and LINQ (online).

4 Outline
16.1 Introduction
16.2 Fundamentals of Characters and Strings
16.3 string Constructors
16.4 string Indexer, Length Property and CopyTo Method
16.5 Comparing strings
16.6 Locating Characters and Substrings in strings
16.7 Extracting Substrings from strings
16.8 Concatenating strings
16.9 Miscellaneous string Methods
16.10 Class StringBuilder
16.11 Length and Capacity Properties, EnsureCapacity Method and Indexer of Class StringBuilder
16.12 Append and AppendFormat Methods of Class StringBuilder

5 Outline (cont.)
16.13 Insert, Remove and Replace Methods of Class StringBuilder
16.14 Char Methods
16.15 (Online) Introduction to Regular Expressions and Class Regex
Regular Expression Example
Validating User Input with Regular Expressions and LINQ
Regex methods Replace and Split
16.16 Wrap-Up

6 16.1 Introduction
The Framework Class Library (FCL) provides string- and character-processing capabilities.
Class string (an alias for class String) and type Char are from the System namespace.
StringBuilder, from the System.Text namespace, builds strings dynamically.
Regex and Match, from the System.Text.RegularExpressions namespace, manipulate strings by patterns.

7 16.2 Fundamentals of Characters and Strings
Characters are the “building blocks” of C# source programs.
Character constant: a character that is represented as an integer value and written as a character in single quotes. For example, 'z' represents the integer value of z.
ASCII uses 1 byte per character (p. 954).
The Unicode character set uses 2 bytes per code unit in UTF-16 (which C# uses) or 4 bytes in UTF-32; see cshtp4_f_Unicode.pdf.
Unicode is an international character set.

8 16.2 Fundamentals of Characters and Strings
string (an alias for class String)
A group of characters treated as a single unit.
May include letters, digits and some special characters (+, -, *, ...).
Strings are objects of class string, from the System namespace.
String literal (also called a string constant or anonymous String object): a sequence of characters written in double quotes, such as "John Q. Doe" or " ".
String literal objects are implicitly constant.
Assigning strings: a string may be assigned in its declaration.
String color = "blue";
color is a String reference; "blue" is an anonymous String object.

9 16.2 Fundamentals of Characters and Strings
C# treats all anonymous String objects with the same contents as one object with many references. That conserves memory. (Tip 16.1)
string file = "C:\\MyFolder\\MyFile.txt"; uses escape sequences for the backslashes. It can be replaced by the verbatim string string file = @"C:\MyFolder\MyFile.txt"; in which all characters are interpreted literally because of the @ placed before the beginning quotation mark.
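A minimal sketch of the two points on this slide: identical string literals share one object, and a verbatim string spells the same path without escape sequences. The class name StringLiteralSketch and the helper SameObject are hypothetical names introduced only for this illustration.

```csharp
using System;

static class StringLiteralSketch
{
    // Hypothetical helper: true when both references point to the same object.
    public static bool SameObject(string a, string b)
    {
        return object.ReferenceEquals(a, b);
    }

    static void Main()
    {
        // Two identical literals in the same assembly refer to one string object.
        string color1 = "blue";
        string color2 = "blue";
        Console.WriteLine(SameObject(color1, color2)); // True

        // The escaped and the verbatim form describe the same characters.
        string escaped = "C:\\MyFolder\\MyFile.txt";
        string verbatim = @"C:\MyFolder\MyFile.txt";
        Console.WriteLine(escaped == verbatim); // True
    }
}
```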

10 16.2 Fundamentals of Characters and Strings
Examples demonstrate the following string operations: obtaining the length, copying strings, accessing individual characters, searching, obtaining substrings, comparing, concatenating, replacing characters, and converting to uppercase or lowercase letters.

11 16.2 Fundamentals of Characters and Strings
StringBuilder, from the System.Text namespace, builds strings dynamically.
Examples demonstrate the following operations: specifying the size of a StringBuilder, and appending, inserting, removing and replacing characters in a StringBuilder object.

12 16.2 Fundamentals of Characters and Strings
Observation: String objects are immutable; their character contents cannot be changed after they are created. Also, if there are one or more references to a String object (or any object, for that matter), the object cannot be reclaimed by the garbage collector. Thus, unlike in other programming languages such as C or C++, a String reference cannot be used to modify a String object or to delete a String object from memory.

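13 16.3 string Constructors
A string can be initialized as if it were a primitive type, e.g. string example = "I see…";
A string can also be initialized like an object of a normal class, through one of its eight constructors, e.g. string example = new string( "I see…".ToCharArray() );
Class String has no constructor that takes another string, so the constructor call needs a character array (or one of the other supported argument lists).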

14 16.3 string Constructors (Cont.)
Class String provides eight constructors (in the .NET Framework).
Initializing strings:
s1 = String.Empty: s1 is an empty string. Class String has no parameterless constructor, so new String() does not compile.
s2 = anotherString: s2 refers to the same immutable object as anotherString. Class String has no constructor that takes a string.
s3 = new String( charArray ): s3 contains all the characters in charArray.
s4 = new String( charArray, offset, numElements ): s4 contains numElements characters from charArray, starting at location offset.
s5: there is no constructor that takes a byte array. Other overloads take char* or sbyte* pointers (optionally with an offset, a length and an Encoding) and require unsafe code.

15 16.3 string Constructors (Cont.)
s6 = new String( chr, repeatChrInString ): s6 contains the chr argument repeated the specified number of times, e.g. new String( 'C', 5 ) yields s6 = "CCCCC".
StringBuilder: a dynamically resizable, modifiable string (more later).
StringBuilder buffer = new StringBuilder( "Hello there" );
A string can be obtained from a StringBuilder with s7 = buffer.ToString(); class String has no constructor that takes a StringBuilder.

16 Outline StringConstructor.cs
Assigns a string literal to string reference originalString.
Sets string1 to refer to the same string literal as originalString.
One-argument constructor creates a string that contains a copy of the characters in the array argument.
Three-argument constructor creates a string that contains a copy of part of the characters in the array argument.
Two-argument constructor creates a string that contains the character argument repeated a specified number of times.
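The annotations above could correspond to code along the following lines. This is a sketch, not the original StringConstructor.cs: the literal text, the array contents and the names StringConstructorSketch and Build are assumptions made for illustration.

```csharp
using System;

static class StringConstructorSketch
{
    // Builds four strings, one per annotation on the slide.
    public static string[] Build()
    {
        // Assumed sample data, not taken from the original figure.
        char[] characterArray = { 'b', 'i', 'r', 't', 'h', ' ', 'd', 'a', 'y' };
        string originalString = "Welcome to C# programming!";

        string string1 = originalString;                   // same object as originalString
        string string2 = new string(characterArray);       // copy of the whole array
        string string3 = new string(characterArray, 6, 3); // 3 characters starting at index 6
        string string4 = new string('C', 5);               // 'C' repeated 5 times

        return new[] { string1, string2, string3, string4 };
    }

    static void Main()
    {
        foreach (string s in Build())
            Console.WriteLine("\"" + s + "\"");
    }
}
```

With the assumed data, string3 holds "day" and string4 holds "CCCCC".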

17 16.4 string Indexer, Length Property and CopyTo Method
string indexer
Facilitates the retrieval of any character in the string.
Treats a string as an array of chars.
Returns the character at the specified index position in the string.
    for ( int i = string1.Length - 1; i >= 0; i-- )
        output += string1[ i ];
Length property
Returns the length of the string: string1.Length
CopyTo method
Copies a specified number of characters into a character array: string1.CopyTo( 0, characterArray, 0, 5 );
public void CopyTo( int sourceIndex, char[] destination, int destinationIndex, int count )
sourceIndex: index in the string at which to begin copying
destination: character array to copy to
destinationIndex: index of the location in the character array at which to put the characters
count: number of characters to copy from the string

18 StringMethods.cs
    // Fig. 16.2 (Fig. 18.2 in ed. 3 and Fig. 16.2 in ed. 5): StringMethods.cs
    // The ed. 5 version uses Console.WriteLine( output ) for the output.
    // Using the indexer, property Length and method CopyTo
    // of class String.

    using System;
    using System.Windows.Forms;

    // creates string objects and displays results of using
    // the indexer, property Length and method CopyTo
    class StringMethods
    {
        // The main entry point for the application.
        [STAThread]
        static void Main( string[] args )
        {
            string string1, output;
            char[] characterArray;

            string1 = "hello there";
            characterArray = new char[ 5 ];

            // output string
            output = "string1: \"" + string1 + "\"";

            // test Length property
            output += "\nLength of string1: " + string1.Length;

            // loop through characters in string1 and display
            // reversed
            output += "\nThe string reversed is: ";

            for ( int i = string1.Length - 1; i >= 0; i-- )
                output += string1[ i ];

Annotations:
String declarations.
string1 stores the string literal "hello there".
The indexer returns the character at position i of string1.
The characters of string1 are appended to output in reverse order.

19 StringMethods.cs Program Output
            // copy characters from string1 into characterArray
            string1.CopyTo( 0, characterArray, 0, 5 );
            output += "\nThe character array is: ";

            for ( int i = 0; i < characterArray.Length; i++ )
                output += characterArray[ i ]; // indexer

            MessageBox.Show( output, "Demonstrating the string " +
                "Indexer, Length Property and CopyTo method",
                MessageBoxButtons.OK, MessageBoxIcon.Information );

        } // end method Main

    } // end class StringMethods

Annotations:
Method CopyTo is called on string1.
public void CopyTo( int sourceIndex, char[] destination, int destinationIndex, int count )
sourceIndex: index in the string at which to begin copying
destination: character array to copy to
destinationIndex: index of the location in the character array at which to put the characters
count: number of characters to copy from the string

20 Common Programming Error 16.1
Attempting to access a character that is outside a string’s bounds (i.e., an index less than 0 or an index greater than or equal to the string’s length) results in an IndexOutOfRangeException.

21 16.5 Comparing strings
Comparing String objects
Strings are compared character by character, from left to right, using a lexicographical comparison (arranging items in alphabetic order, like a dictionary).
The integer Unicode values that represent the characters in each string are compared.
Equals and == compare ordinally (by character values); CompareTo compares using the current culture's sort rules, so its result for mixed-case strings can differ from a pure Unicode-value comparison.
Method Equals or ==
Determines whether the strings are the same.
Returns a bool value.
Method CompareTo
Returns 0 if the strings are equal.
Returns a negative value if the string that invokes the method is less than the string that is passed in (the strings are in the “correct order”).
Returns a positive value if the string that invokes the method is greater than the string that is passed in.
Method StartsWith
Determines whether the string instance starts with the string text passed to it as an argument.
Method EndsWith
Determines whether the string instance ends with the string text passed to it as an argument.

22 StringCompare.cs
    // Fig. (18 ed. 3 or 16.3 ed. 1): StringCompare.cs
    // Comparing strings.

    using System;
    using System.Windows.Forms;

    // compare a number of strings
    class StringCompare
    {
        // The main entry point for the application.
        [STAThread]
        static void Main( string[] args )
        {
            string string1 = "hello";
            string string2 = "good bye";
            string string3 = "Happy Birthday";
            string string4 = "happy birthday";
            string output;

            // output values of four strings
            output = "string1 = \"" + string1 + "\"" +
                "\nstring2 = \"" + string2 + "\"" +
                "\nstring3 = \"" + string3 + "\"" +
                "\nstring4 = \"" + string4 + "\"\n\n";

            // test for equality using Equals method
            if ( string1.Equals( "hello" ) )
                output += "string1 equals \"hello\"\n";
            else
                output += "string1 does not equal \"hello\"\n";

            // test for equality with ==
            if ( string1 == "hello" )
                output += "string1 equals \"hello\"\n";

Annotations:
Instance method Equals compares the Unicode values in each string.
Operator == also tests two strings for equality using a lexicographical comparison.

23 StringCompare.cs (continued)
            else
                output += "string1 does not equal \"hello\"\n";

            // test for equality comparing case
            if ( String.Equals( string3, string4 ) )
                output += "string3 equals string4\n";
            else
                output += "string3 does not equal string4\n";

            // test CompareTo
            output += "\nstring1.CompareTo( string2 ) is " +
                string1.CompareTo( string2 ) + "\n" +
                "string2.CompareTo( string1 ) is " +
                string2.CompareTo( string1 ) + "\n" +
                "string1.CompareTo( string1 ) is " +
                string1.CompareTo( string1 ) + "\n" +
                "string3.CompareTo( string4 ) is " +
                string3.CompareTo( string4 ) + "\n" +
                "string4.CompareTo( string3 ) is " +
                string4.CompareTo( string3 ) + "\n\n";

            MessageBox.Show( output, "Demonstrating string " +
                "comparisons", MessageBoxButtons.OK,
                MessageBoxIcon.Information );

        } // end method Main

    } // end class StringCompare

Annotations:
Static method Equals tests string3 and string4 for equality using a lexicographical comparison.
Method CompareTo is called to compare strings.

24 Outline StringStartEnd.cs (1 of 2)
Method StartsWith determines whether a string instance starts with the string text passed to it.
The if statement determines whether the string at index i starts with "st".
Program output:
"started" starts with "st"
"starting" starts with "st"
"started" ends with "ed"
"ended" ends with "ed"

25 Outline StringStartEnd.cs (2 of 2)
Method EndsWith determines whether a string instance ends with the string text passed to it.
The if statement determines whether the string at index i ends with "ed".
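The source of StringStartEnd.cs is not part of this transcript. A sketch that would produce the four output lines quoted on the previous slide might look as follows; the array contents and the names StartEndSketch and Describe are assumptions.

```csharp
using System;
using System.Collections.Generic;

static class StartEndSketch
{
    // Returns one line per string that starts with "st", then one per string that ends with "ed".
    public static List<string> Describe(string[] strings)
    {
        var lines = new List<string>();

        for (int i = 0; i < strings.Length; i++)
            if (strings[i].StartsWith("st", StringComparison.Ordinal))
                lines.Add("\"" + strings[i] + "\" starts with \"st\"");

        for (int i = 0; i < strings.Length; i++)
            if (strings[i].EndsWith("ed", StringComparison.Ordinal))
                lines.Add("\"" + strings[i] + "\" ends with \"ed\"");

        return lines;
    }

    static void Main()
    {
        // Assumed test data.
        string[] strings = { "started", "starting", "ended", "ending" };
        foreach (string line in Describe(strings))
            Console.WriteLine(line);
    }
}
```

Passing StringComparison.Ordinal is a design choice made here to keep the result independent of the current culture; the one-argument overloads use culture-sensitive comparison.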

26 String Method GetHashCode
Hash table (you may omit this topic).
Method GetHashCode comes from class Object.
A hash table makes information easily accessible.
GetHashCode performs a calculation to produce a hash code.

27 StringHashCode.cs
    // Fig. ed. 1: StringHashCode.cs
    // Demonstrating method GetHashCode of class String.

    using System;
    using System.Windows.Forms;

    // testing the GetHashCode method
    class StringHashCode
    {
        // The main entry point for the application.
        [STAThread]
        static void Main( string[] args )
        {
            string string1 = "hello";
            string string2 = "Hello";
            string output;

            output = "The hash code for \"" + string1 +
                "\" is " + string1.GetHashCode() + "\n";

            output += "The hash code for \"" + string2 +
                "\" is " + string2.GetHashCode() + "\n";

            MessageBox.Show( output, "Demonstrating String " +
                "method GetHashCode", MessageBoxButtons.OK,
                MessageBoxIcon.Information );

        } // end method Main

    } // end class StringHashCode

Annotations:
Two strings are defined.
Method GetHashCode is called to calculate the hash codes for string1 and string2.
The output shows the hash code values for the strings "hello" and "Hello".

28 16.6 Locating Characters and Substrings in strings
Methods that search for characters or substrings in a string:
Method IndexOf( chr, “from”, “# of chrs” )
Returns the index of the first occurrence of a character chr (or a substring); returns -1 if it is not found.
Method IndexOfAny( chr_array, “from”, “# of chrs” )
Same as IndexOf, except that it takes an array of characters and returns the index of the first occurrence of any of the characters in the array.
Method LastIndexOf( chr, “from backward”, “# of chrs” )
Returns the index of the last occurrence of a character chr (or a substring).
Method LastIndexOfAny( chr_array, “from backward”, “# of chrs” )
Same as LastIndexOf, except that it takes an array of characters and returns the index of the last occurrence of any of the characters in the array.

29 StringIndexMethods.cs
    // Fig. 16.5: StringIndexMethods.cs
    // Using String searching methods.

    using System;
    using System.Windows.Forms;

    // testing indexing capabilities of strings
    class StringIndexMethods
    {
        // The main entry point for the application.
        [STAThread]
        static void Main( string[] args )
        {
            string letters = "abcdefghijklmabcdefghijklm";
            string output = "";
            char[] searchLetters = { 'c', 'a', '$' };

            // test IndexOf to locate a first character in a string
            output += "'c' is located at index " +
                letters.IndexOf( 'c' );

            output += "\n'a' is located at index " +
                letters.IndexOf( 'a', 1 ); // start index

            output += "\n'$' is located at index " +
                letters.IndexOf( '$', 3, 5 ); // start index, number of chars to search

            // test LastIndexOf to find a character in a string
            output += "\n\nLast 'c' is located at " +
                "index " + letters.LastIndexOf( 'c' );

            output += "\nLast 'a' is located at index " +
                letters.LastIndexOf( 'a', 25 ); // index at which the backward search starts

Annotations:
The two-argument IndexOf takes the character to search for and the initial index of the search.
The three-argument IndexOf takes the character to search for, the initial index of the search and the number of characters to search.
The two-argument LastIndexOf takes the character to search for and the highest index, at which the backward search begins.

30 StringIndexMethods.cs (continued)
            // string letters = "abcdefghijklmabcdefghijklm";
            output += "\nLast '$' is located at index " +
                letters.LastIndexOf( '$', 15, 5 ); // '$', backward start index, number of chars

            // test IndexOf to locate a substring in a string
            output += "\n\n\"def\" is located at" +
                " index " + letters.IndexOf( "def" );

            output += "\n\"def\" is located at index " +
                letters.IndexOf( "def", 7 ); // start index

            output += "\n\"hello\" is located at index " +
                letters.IndexOf( "hello", 5, 15 ); // start index, number of chars

            // test LastIndexOf to find a substring in a string
            output += "\n\nLast \"def\" is located at index " +
                letters.LastIndexOf( "def" );

            output += "\nLast \"def\" is located at " +
                letters.LastIndexOf( "def", 25 );

            output += "\nLast \"hello\" is located at index " +
                letters.LastIndexOf( "hello", 20, 15 ); // backward start index, number of chars

            // test IndexOfAny to find first occurrence of character
            // in array
            output += "\n\nFirst occurrence of 'c', 'a', '$' is " +
                "located at " + letters.IndexOfAny( searchLetters );

            output += "\nFirst occurrence of 'c', 'a' or '$' is " +
                "located at " + letters.IndexOfAny( searchLetters, 7 ); // start index

            output += "\nFirst occurrence of 'c', 'a' or '$' is " +
                "located at " + letters.IndexOfAny( searchLetters, 20, 5 ); // start index, number of chars

Annotations:
Instead of character arguments, these IndexOf and LastIndexOf calls take a substring argument.
Method IndexOfAny takes an array of characters as its first argument and returns the index of the first occurrence of any of the specified characters.

31 StringIndexMethods.cs (continued)
            // string letters = "abcdefghijklmabcdefghijklm";
            // test LastIndexOfAny to find last occurrence of character
            // in array
            output += "\n\nLast occurrence of 'c', 'a' or '$' is " +
                "located at " + letters.LastIndexOfAny( searchLetters );

            output += "\nLast occurrence of 'c', 'a' or '$' is " +
                "located at " + letters.LastIndexOfAny( searchLetters, 1 );

            output += "\nLast occurrence of 'c', 'a' or '$' is " +
                "located at " + letters.LastIndexOfAny( searchLetters, 25, 5 );

            MessageBox.Show( output,
                "Demonstrating class index methods",
                MessageBoxButtons.OK, MessageBoxIcon.Information );

        } // end method Main

    } // end class StringIndexMethods

Annotations:
LastIndexOfAny takes an array of characters as its first argument.
Method LastIndexOfAny returns the index of the last occurrence of any of the characters in the array argument.

32 Common Programming Error 16.2
In the overloaded methods LastIndexOf and LastIndexOfAny that take three parameters, the second argument must be greater than or equal to the third. More precisely, the third argument (the number of characters to search) may be at most one more than the second (the start index); otherwise an ArgumentOutOfRangeException is thrown. This might seem counterintuitive, but remember that the search moves from the end of the string toward the start of the string.
    output += "\nLast occurrence of 'c', 'a' or '$' is " +
        "located at " + letters.LastIndexOfAny( searchLetters, 25, 5 );
Method LastIndexOfAny( chr_array, “from backward”, “# of chrs” )
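The backward searches can be checked with a small console sketch. It reuses the letters string and searchLetters array from the figure; the class name LastIndexSketch and the methods Run and CountTooLarge are hypothetical.

```csharp
using System;

static class LastIndexSketch
{
    const string Letters = "abcdefghijklmabcdefghijklm"; // indices 0 through 25
    static readonly char[] SearchLetters = { 'c', 'a', '$' };

    // Runs the backward searches used in the figure and returns their results.
    public static int[] Run()
    {
        return new[]
        {
            Letters.LastIndexOf('c'),                    // whole string
            Letters.LastIndexOf('a', 25),                // backward from index 25
            Letters.LastIndexOf('$', 15, 5),             // indices 15 down to 11
            Letters.LastIndexOfAny(SearchLetters),       // whole string
            Letters.LastIndexOfAny(SearchLetters, 1),    // indices 1 down to 0
            Letters.LastIndexOfAny(SearchLetters, 25, 5) // indices 25 down to 21
        };
    }

    // A count larger than startIndex + 1 would run past the start of the string.
    public static bool CountTooLarge()
    {
        try
        {
            Letters.LastIndexOf('a', 3, 5);
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // 15, 13, -1, 15, 0, -1
        Console.WriteLine(CountTooLarge());          // True
    }
}
```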

33 16.7 Extracting Substrings from strings
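Method Substring
Creates and returns a new string by copying part of an existing string.
The one-argument version takes a start index; the two-argument version takes a start index and a length (not an end index).
If a specified index is not inside the string's bounds, an ArgumentOutOfRangeException is thrown.
SubString.cs
Beginning at index 20, copy all the remaining characters from letters.
Extract the characters from index 0 to 6 from letters.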
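SubString.cs itself is not reproduced in this transcript. A minimal sketch of the two overloads, assuming the same letters string as in Fig. 16.5 and a length of 6 for the two-argument call, might be:

```csharp
using System;

static class SubstringSketch
{
    // Assumed test string, borrowed from the index-method example.
    const string Letters = "abcdefghijklmabcdefghijklm";

    // Everything from index 20 to the end of the string.
    public static string Tail()
    {
        return Letters.Substring(20);
    }

    // Six characters starting at index 0 (indices 0 through 5).
    public static string Head()
    {
        return Letters.Substring(0, 6);
    }

    static void Main()
    {
        Console.WriteLine(Tail()); // hijklm
        Console.WriteLine(Head()); // abcdef
    }
}
```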

34 16.8 Concatenating Strings
Static method Concat or operator +
Takes two strings and returns a new string.
SubConcatenation.cs
Concatenates string2 to string1; however, string1 is not modified by method Concat.
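A short sketch of the point that concatenation returns a new string and leaves its operands unchanged. The sample values and the name ConcatSketch are assumptions.

```csharp
using System;

static class ConcatSketch
{
    // Returns { result of Concat, result of +, string1 after both calls }.
    public static string[] Run()
    {
        string string1 = "Happy ";
        string string2 = "Birthday";

        string viaConcat = string.Concat(string1, string2);
        string viaPlus = string1 + string2;

        // string1 still holds its original contents.
        return new[] { viaConcat, viaPlus, string1 };
    }

    static void Main()
    {
        foreach (string s in Run())
            Console.WriteLine("\"" + s + "\"");
    }
}
```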

35 16.9 Miscellaneous string Methods
Method Replace
Returns a new string in which every occurrence of the specified phrase is replaced with another phrase.
Method ToLower
Returns a new lowercase version of the string.
Method ToUpper
Returns a new uppercase version of the string.
For all the above methods:
The original string remains unchanged.
The original string is returned if nothing in it needs to change.

36 16.9 Miscellaneous string Methods
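Method Trim
With no arguments, removes whitespace characters from the beginning and end of the string.
With a character-array argument, removes the characters in the array from the beginning and end of the string.
Method ToString
Can be called to obtain a string representation of any object.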

37 StringMiscellaneous2.cs
    // Fig. ed. 5 and 15.9 ed. 1: StringMiscellaneous2.cs
    // Demonstrating String methods Replace, ToLower, ToUpper, Trim
    // and ToString.
    // The ed. 5 version uses Console.WriteLine( output ) for the output.
    using System;
    using System.Windows.Forms;

    // creates strings using methods Replace, ToLower, ToUpper, Trim
    class StringMethods2
    {
        // The main entry point for the application.
        [STAThread]
        static void Main( string[] args )
        {
            string string1 = "cheers!";
            string string2 = "GOOD BYE ";
            string string3 = " spaces ";
            string output;

            output = "string1 = \"" + string1 + "\"\n" +
                "string2 = \"" + string2 + "\"\n" +
                "string3 = \"" + string3 + "\"";

            // call method Replace
            output += "\n\nReplacing \"e\" with \"E\" in string1: \"" +
                string1.Replace( 'e', 'E' ) + "\"";

            // call ToLower and ToUpper
            output += "\n\nstring1.ToUpper() = \"" +
                string1.ToUpper() + "\"\nstring2.ToLower() = \"" +
                string2.ToLower() + "\"";

Annotations:
Method Replace returns a new string, revised according to its arguments: the first argument is the string to search for.
Here it replaces all instances of 'e' with 'E' in string1.
Method ToLower returns a new string containing the lowercase equivalent of string2.

38 StringMiscellaneous2.cs (continued)
            // call Trim method
            output += "\n\nstring3 after trim = \"" +
                string3.Trim() + "\"";

            // call ToString method
            output += "\n\nstring1 = \"" + string1.ToString() + "\"";

            MessageBox.Show( output,
                "Demonstrating various string methods",
                MessageBoxButtons.OK, MessageBoxIcon.Information );

        } // end method Main

    } // end class StringMethods2

Annotations:
Method Trim removes all whitespace characters at the beginning and end of string3.
Method ToString shows that string1 has not been modified.

39 16.10 Class StringBuilder
Class StringBuilder, from the System.Text namespace, is used to create and manipulate dynamic string information.
Every StringBuilder can store the number of characters specified by its capacity.
Exceeding the capacity of a StringBuilder makes the capacity expand to accommodate the additional characters.
The default initial capacity is 16 characters.
The class provides 6 overloaded constructors.

40 Performance Tip 16.2
Objects of class string are immutable (i.e., constant strings), whereas objects of class StringBuilder are mutable. C# can perform certain optimizations involving strings (such as the sharing of one string among multiple references), because it knows these objects will not change.

41 Outline StringBuilderConstructor.cs
Namespace for class StringBuilder (System.Text).
The no-argument constructor creates an empty StringBuilder with a capacity of 16 characters.
A one-argument constructor creates an empty StringBuilder with the specified capacity (10 characters).
A one-argument constructor creates a StringBuilder containing the string "hello" with a capacity of 16 characters.
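The three constructor calls described above might be written as follows. This is a sketch rather than the original StringBuilderConstructor.cs, and the names are assumptions. The capacities it prints are expected to match the slide (16, 10, 16), but the default capacity is an implementation detail, so the code only relies on the lengths.

```csharp
using System;
using System.Text;

static class StringBuilderConstructorSketch
{
    // Creates the three builders described on the slide.
    public static StringBuilder[] Build()
    {
        return new[]
        {
            new StringBuilder(),        // empty, default capacity
            new StringBuilder(10),      // empty, requested capacity of 10
            new StringBuilder("hello")  // initial contents "hello"
        };
    }

    static void Main()
    {
        foreach (StringBuilder buffer in Build())
            Console.WriteLine("\"" + buffer + "\" Length = " + buffer.Length +
                " Capacity = " + buffer.Capacity);
    }
}
```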

42 16.11 Length and Capacity Properties, EnsureCapacity Method and Indexer of Class StringBuilder
Length property
Returns the number of characters currently in the StringBuilder.
Capacity property
Returns the number of characters that the StringBuilder can store without allocating more memory.
EnsureCapacity method (sets a minimum capacity)
Ensures that the StringBuilder's capacity is at least the requested number, which reduces the number of times the capacity must be increased. It never decreases the capacity.
Indexer
Works like that of string, except that a StringBuilder's characters can also be assigned through it.

43 Outline StringBuilderFeatures.cs (1 of 2)
Create a new StringBuilder.
Property Length returns the number of characters currently in the buffer.
Property Capacity returns the number of characters that buffer can store without allocating more memory.
Use method EnsureCapacity to set the capacity to 75.
Use property Length to set the length to 10.
Program output:
buffer = Hello, how are you?
Length = 19
Capacity = 32
New capacity = 75
New length = 10

44 Outline StringBuilderFeatures.cs (2 of 2)
Property Length returns the number of characters currently in the buffer.
Print the character at each indexer position in buffer.
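The steps annotated on these two slides can be sketched as below. The initial text is taken from the program output shown on the previous slide; the class and method names are hypothetical, and the exact capacity values may vary between runtime versions, so the sketch only checks that the capacity is at least what was requested.

```csharp
using System;
using System.Text;

static class StringBuilderFeaturesSketch
{
    // Returns the buffer after raising its capacity and shortening its length.
    public static StringBuilder Run()
    {
        var buffer = new StringBuilder("Hello, how are you?");
        Console.WriteLine("Length = " + buffer.Length);     // 19
        Console.WriteLine("Capacity = " + buffer.Capacity);

        buffer.EnsureCapacity(75); // capacity is now at least 75
        Console.WriteLine("New capacity = " + buffer.Capacity);

        buffer.Length = 10; // truncates the contents to the first 10 characters
        Console.WriteLine("New length = " + buffer.Length);
        return buffer;
    }

    static void Main()
    {
        StringBuilder buffer = Run();

        // Use the indexer to print each remaining character.
        for (int i = 0; i < buffer.Length; i++)
            Console.Write(buffer[i]);
        Console.WriteLine();
    }
}
```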

45 Common Programming Error 16.3
Assigning null to a string reference can lead to logic errors if you attempt to compare null to an empty string. The keyword null is a value that represents a null reference (i.e., a reference that does not refer to an object), not an empty string (which is a string object that is of length 0 and contains no characters).

46 16.12 Append and AppendFormat Methods of Class StringBuilder
Method Append
Appends the string representation of its argument to the end of the StringBuilder.
There are 19 overloaded versions: the FCL provides one for each of the simple types and for character arrays, strings and objects.
Method AppendFormat
Converts a string to a specified format, then appends it to the StringBuilder.

47 Outline StringBuilderAppend.cs (1 of 2)

48 Outline StringBuilderAppend.cs (2 of 2)
    object objectValue = "hello";
    string stringValue = "good bye";
    char[] characterArray = { 'a', 'b', 'c', 'd', 'e', 'f' };
    bool booleanValue = true;
    char characterValue = 'Z';
    int integerValue = 7;
    long longValue = ;
    float floatValue = 2.5F; // F suffix indicates that 2.5 is a float
    double doubleValue = ;
    StringBuilder buffer = new StringBuilder();

Annotations:
Append a space character to the StringBuilder.
Append the string "good bye".
Append "a b c d e f".
Append "a b c".
Append the bool, char, int, long, float and double values.
Print out the results.
Exercise: what happens if ToString() is omitted?
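A sketch of the Append calls that the annotations describe, using several of the declared values. The float, long and double appends are left out here because their printed form depends on values missing from the transcript or on the current culture. The names AppendSketch and Build are assumptions.

```csharp
using System;
using System.Text;

static class AppendSketch
{
    public static string Build()
    {
        object objectValue = "hello";
        string stringValue = "good bye";
        char[] characterArray = { 'a', 'b', 'c', 'd', 'e', 'f' };
        bool booleanValue = true;
        char characterValue = 'Z';
        int integerValue = 7;

        var buffer = new StringBuilder();
        buffer.Append(objectValue);          // Append( object )
        buffer.Append(' ');                  // a space separates the values
        buffer.Append(stringValue);          // Append( string )
        buffer.Append(' ');
        buffer.Append(characterArray);       // whole array
        buffer.Append(' ');
        buffer.Append(characterArray, 0, 3); // first three characters
        buffer.Append(' ');
        buffer.Append(booleanValue);         // appended as "True"
        buffer.Append(' ');
        buffer.Append(characterValue);
        buffer.Append(' ');
        buffer.Append(integerValue);

        return buffer.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Build()); // hello good bye abcdef abc True Z 7
    }
}
```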

49 Outline StringBuilderAppendFormat.cs (1 of 2)
A string literal contains the formatting information: two placeholders, {0} and {1:C}, where C means a currency value.
The arguments for string1 are stored as objects of class object, the base class of all classes.
AppendFormat combines the string literal and the arguments.
Program output:
This car costs: $1, Number:005.

50 Outline StringBuilderAppendFormat.cs (2 of 2)
{0:d3} formats the argument as a decimal integer with at least 3 digits; a value with fewer than 3 digits is padded with leading zeros.
Another string literal contains formatting information for a field of 4 characters, right aligned, and a field of 4 characters, left aligned.
AppendFormat combines the string literal and the argument.
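The D3 and alignment formats can be tried in isolation. The sketch below uses the AppendFormat overload that takes an IFormatProvider so that the result does not depend on the machine's culture; the format strings and the names are assumptions, not the ones from the original figure.

```csharp
using System;
using System.Globalization;
using System.Text;

static class AppendFormatSketch
{
    public static string Build()
    {
        var buffer = new StringBuilder();

        // D3: at least three digits, padded with leading zeros.
        buffer.AppendFormat(CultureInfo.InvariantCulture, "Number:{0:D3}.", 5);

        // {0,4}: right aligned in 4 characters; {0,-4}: left aligned in 4 characters.
        buffer.AppendFormat(CultureInfo.InvariantCulture, "[{0,4}][{0,-4}]", 5);

        return buffer.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Build()); // Number:005.[   5][5   ]
    }
}
```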

51 16.13 Insert, Remove and Replace Methods of Class StringBuilder
Method Insert
Allows various types of data to be inserted at any position.
Method Remove
Deletes any portion of the StringBuilder.
Method Replace
Searches for a specified string or character and substitutes another string or character in its place.

52 Outline StringBuilderInsertRemove.cs (1 of 3)

53 Outline StringBuilderInsertRemove.cs (2 of 3)
    object objectValue = "hello";
    string stringValue = "good bye";
    char[] characterArray = { 'a', 'b', 'c', 'd', 'e', 'f' };
    bool booleanValue = true;
    char characterValue = 'K';
    int integerValue = 7;
    long longValue = ;
    float floatValue = 2.5F; // F suffix indicates that 2.5 is a float
    double doubleValue = ;
    StringBuilder buffer = new StringBuilder();

Annotations:
Use method Insert to insert data at the beginning of the StringBuilder.
Use method Remove to remove the character at index 10 in the StringBuilder.
Remove the characters at indices 4 through 7.

54 Outline StringBuilderReplace.cs
Replace "Jane" with "Greg" in builder1.
Replace "g" with "G" in the first 5 characters of builder2.
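A compact sketch of Insert, Remove and Replace on a StringBuilder. The Remove calls mirror the indices named in the annotations (index 10, then indices 4 through 7); the starting strings and all names are assumptions, since the original sources are not in this transcript.

```csharp
using System;
using System.Text;

static class InsertRemoveReplaceSketch
{
    // Returns the buffer contents after each Insert/Remove step.
    public static string[] InsertAndRemove()
    {
        var buffer = new StringBuilder("good bye");

        buffer.Insert(0, "hello "); // insert at the beginning
        string afterInsert = buffer.ToString();

        buffer.Remove(10, 1);       // remove the single character at index 10
        string afterFirstRemove = buffer.ToString();

        buffer.Remove(4, 4);        // remove indices 4 through 7
        string afterSecondRemove = buffer.ToString();

        return new[] { afterInsert, afterFirstRemove, afterSecondRemove };
    }

    public static string[] Replace()
    {
        var builder1 = new StringBuilder("Happy Birthday Jane");
        var builder2 = new StringBuilder("goodbye greg");

        builder1.Replace("Jane", "Greg");
        builder2.Replace('g', 'G', 0, 5); // only within the first 5 characters

        return new[] { builder1.ToString(), builder2.ToString() };
    }

    static void Main()
    {
        foreach (string s in InsertAndRemove()) Console.WriteLine(s);
        foreach (string s in Replace()) Console.WriteLine(s);
    }
}
```

With these assumed inputs the steps give "hello good bye", "hello goodbye" and "hellodbye", and the replacements give "Happy Birthday Greg" and "Goodbye greg".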

55 16.14 Char Methods
char is an alias for struct Char.
Like classes, structs (short for structures) can have methods and properties, and can use the access modifiers public or private.
Members are accessed via the member access operator (.).
A struct type represents a value type.
structs derive from class ValueType, which in turn derives from Object.
All struct types are implicitly sealed, so they don’t support virtual or abstract methods, and their members cannot be declared protected or protected internal.
Most Char methods are static: IsDigit, IsLetter, IsLetterOrDigit, IsLower, IsUpper, ToLower, ToUpper, IsPunctuation, IsSymbol.

56 16.14 Char Methods (Cont.)
Char methods:
IsDigit determines whether a character is defined as a digit.
IsLetter determines whether a character is a letter.
IsLetterOrDigit determines whether a character is a letter or a digit.
IsLower determines whether a character is a lowercase letter.
IsUpper determines whether a character is an uppercase letter.
ToUpper returns a character’s uppercase equivalent, or the original argument if there is no uppercase equivalent.

57 16.14 Char Methods (Cont.)
ToLower returns a character’s lowercase equivalent, or the original argument if there is no lowercase equivalent.
IsPunctuation determines whether a character is a punctuation mark, such as "!", ":" or ")".
IsSymbol determines whether a character is a symbol, such as "+", "=" or "^".
Structure type Char contains more static methods similar to those shown in this example, such as IsWhiteSpace.
The struct also contains several public instance methods, many of which we have seen before in other classes, such as ToString, Equals, and CompareTo.
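The static Char methods listed on these slides can be exercised with a few sample characters. The sketch below is a console illustration with assumed names and sample characters; it is not the StaticCharMethods.cs program.

```csharp
using System;

static class CharMethodsSketch
{
    // Builds a one-line summary of the tests for a single character.
    public static string Describe(char c)
    {
        return "'" + c + "':" +
            " digit=" + char.IsDigit(c) +
            " letter=" + char.IsLetter(c) +
            " letterOrDigit=" + char.IsLetterOrDigit(c) +
            " lower=" + char.IsLower(c) +
            " upper=" + char.IsUpper(c) +
            " punctuation=" + char.IsPunctuation(c) +
            " symbol=" + char.IsSymbol(c);
    }

    static void Main()
    {
        foreach (char c in new[] { 'A', 'a', '8', '!', '+' })
            Console.WriteLine(Describe(c));

        // ToUpper and ToLower return the argument unchanged when there is no equivalent.
        Console.WriteLine(char.ToUpper('a')); // A
        Console.WriteLine(char.ToLower('8')); // 8
    }
}
```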

58 Outline StaticCharMethods.cs (1 of 3)
Convert the user’s input to a char.
    // for a console application:
    Console.Write( "Enter character" );
    char character = Convert.ToChar( Console.ReadLine() );

59 Outline StaticCharMethods.cs (2 of 3)
Determine whether character is a digit.
Determine whether character is a letter.
Determine whether character is a letter or a digit.
Determine whether character is lowercase and uppercase, respectively.
Convert character to its uppercase and lowercase equivalents, respectively.
Determine whether character is a punctuation mark, such as "!", ":" or ")".
Determine whether character is a symbol, such as "+", "=" or "^".

60 Outline StaticCharMethods.cs (3 of 3): program output screenshots (a) through (e)

61 Card Shuffling and Dealing Simulation
You may omit it. This example shows how strings can be used in programs. Fields that represent a card.

62 Outline DeckForm.cs (1 of 5)
An array of Cards to represent a deck of cards.
An array of strings to represent the many faces of a card.
An array of strings to represent the many suits of a card.

63 Outline DeckForm.cs (2 of 5)
Assign a face and a suit to every card of the deck.
Store the dealt card.
Display the dealt card.
Notify the user that no cards remain.

64 Outline DeckForm.cs (3 of 5)
Create a Random object to make the shuffle random.
Swap cards for shuffling.

65 Outline DeckForm.cs (4 of 5)
Shuffle cards.

66 Outline DeckForm.cs (5 of 5): program output screenshots (a) through (d)

67 16.15 Regular Expressions and Class Regex
This section is available online.
Goal: find patterns in text.
Compilers use regular expressions to validate the syntax of programs.
A regular expression (also referred to as regex or regexp) is a specially formatted string used to find patterns in text. It provides a concise and flexible means for matching strings of text, such as particular characters, words, or patterns of characters.
Class Regex
From namespace System.Text.RegularExpressions.
Represents an immutable regular expression.
Method Match
Returns an object of class Match that represents a single regular-expression match.
Method Matches
Finds all matches of a regular expression in a string and returns an object of class MatchCollection containing all the matches.

68 Fig. 16.18 | Demonstrating basic regular expressions. (Part 1 of 3.)
Regular expressions are specially formatted strings used to find patterns in text.
Simple Regular Expressions and Class Regex
Figure 16.16 demonstrates the basic regular-expression classes.
Outline BasicRegex.cs (1 of 3)
Program output:
The test string is "regular expressions are sometimes called regex or regexp"
Match 'e' in the test string: e
Match every 'e' in the test string:
Annotations:
This regular expression matches the literal character "e" anywhere in an arbitrary string.
Match the leftmost occurrence of the character "e" in testString.

69 Fig. 16.18 | Demonstrating basic regular expressions. (Part 2 of 3.)
BasicRegex.cs (2 of 3). Class Regex also provides method Matches, which finds all matches and returns a MatchCollection object. The Regex static method Matches takes a regular expression as an argument in addition to the string to be searched. Fig | Demonstrating basic regular expressions. (Part 2 of 3.) The first (leftmost) match of the expression.

70 16.15 Introduction to Regular-Expression Processing
Class Regex represents a regular expression. Regex method Match returns an object of class Match that represents a single regular-expression match. Class Match’s ToString method returns the substring that matched the regular expression. Class Regex also provides method Matches, which finds all matches and returns a MatchCollection object.

71 16.15 Introduction to Regular-Expression Processing
A MatchCollection is a collection, similar to an array, and can be used with a foreach statement to iterate through the collection’s elements. Regular expressions can also be used to match a sequence of literal characters anywhere in a string. The Regex static method Matches takes a regular expression as an argument in addition to the string to be searched. foreach ( var myMatch in Regex.Matches( testString, "regex" ) )
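The calls described on these slides can be tried in a few lines. This is a minimal sketch, not the BasicRegex.cs listing itself: it reuses the test string quoted on the slides, while the variable names (firstE, eCount, regexCount) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string testString = "regular expressions are sometimes called regex or regexp";

// Match returns only the leftmost match; ToString yields the matched substring.
Match firstE = Regex.Match(testString, "e");
Console.WriteLine($"First match: {firstE} at index {firstE.Index}");

// Matches returns a MatchCollection, which foreach can iterate like an array.
int eCount = 0;
foreach (Match myMatch in Regex.Matches(testString, "e"))
{
    eCount++;
}
Console.WriteLine($"Number of 'e' matches: {eCount}");

// A sequence of literal characters is matched anywhere in the string,
// including inside the longer word "regexp".
int regexCount = Regex.Matches(testString, "regex").Count;
Console.WriteLine($"Number of \"regex\" matches: {regexCount}");
```

The loop variable is typed as Match here; the slide's var also compiles, but on older .NET Framework versions var would infer object because MatchCollection is nongeneric there.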

72 16.15 Introduction to Regular-Expression Processing
A metacharacter is a character with special meaning in a regular expression. A quantifier is a metacharacter that describes how many times a part of the pattern may occur in a match. The ? quantifier matches zero or one occurrence of the pattern to its left. The "|" (alternation) metacharacter matches the expression to its left or to its right. Note that the "|" character attempts to match the entire expression to its left or to its right. Alternation chooses the leftmost match in the string for either of the alternating expressions.
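A short sketch of the ? quantifier and of alternation, using the test string from the earlier slides. The patterns "regexp?" and "regexp|regular" and the variable names are chosen here for illustration only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string testString = "regular expressions are sometimes called regex or regexp";

// "regexp?" makes the final p optional, so both "regex" and "regexp" match.
string[] optionalP = Regex.Matches(testString, "regexp?")
    .Cast<Match>().Select(m => m.Value).ToArray();

// "regexp|regular" matches either entire alternative; matches are reported
// in the order they occur in the string, leftmost first.
string[] eitherWord = Regex.Matches(testString, "regexp|regular")
    .Cast<Match>().Select(m => m.Value).ToArray();

Console.WriteLine(string.Join(", ", optionalP));   // regex, regexp
Console.WriteLine(string.Join(", ", eitherWord));  // regular, regexp
```

The standalone word "regex" is not matched by the second pattern, because neither alternative matches it in full.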

73 16.15 Introduction to Regular-Expression Processing (Cont.)
Regular-Expression Character Classes and Quantifiers A character class represents a group of characters that might appear in a string. Quantifiers are used in regular expressions to denote how often a particular character or set of characters can appear in a match. The table in Fig. 16.19 lists some character classes that can be used with regular expressions. Fig | Character classes.

74 Figure 16.18 uses character classes in regular expressions.
We precede the regular expression string with @ to avoid having to escape all backslashes. \D matches any character that isn't a digit. Notice in the output that this includes punctuation and whitespace. Fig | Demonstrating using character classes and quantifiers. (Part 1 of 4.)

75 Create a character class to match any lowercase letter from a to f.
"abc, DEF, 123" The + quantifier matches one or more occurrences of the pattern to its left. Lazy ?—it matches the shortest possible occurrence of the pattern. Create a character class to match any lowercase letter from a to f. Matches any character that isn’t in the range a-f. Fig | Demonstrating using character classes and quantifiers. (Part 2 of 4.)

76 "abc, DEF, 123" Use a foreach statement to display each Match in the MatchCollection object returned by Regex’s static method Matches. Fig | Demonstrating using character classes and quantifiers. (Part 3 of 4.)
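The character classes called out in this figure can be exercised against the same test string, "abc, DEF, 123". This is a sketch rather than the figure's listing; the helper AllMatches and the variable names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: collects the text of every match into an array.
string[] AllMatches(string input, string pattern) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

string testString = "abc, DEF, 123";

string[] digits = AllMatches(testString, @"\d");        // 1, 2, 3
string[] nonDigits = AllMatches(testString, @"\D");     // letters, commas and spaces
string[] aToF = AllMatches(testString, "[a-f]");        // a, b, c
string[] notAToF = AllMatches(testString, "[^a-f]");    // everything else, including D, E, F
string[] words = AllMatches(testString, @"\w+");        // abc, DEF, 123

Console.WriteLine($"Digits: {string.Join("", digits)}");
Console.WriteLine($"Non-digit count: {nonDigits.Length}");
Console.WriteLine($"a-f: {string.Join("", aToF)}");
Console.WriteLine($"Not a-f count: {notAToF.Length}");
Console.WriteLine($"Words: {string.Join(" | ", words)}");
```

The verbatim strings (prefixed with @) keep \d, \D and \w readable; without @ each backslash would have to be doubled.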

77 16.15 Introduction to Regular-Expression Processing (Cont.)
Negating a character class matches everything that isn't a member of the character class. The + quantifier matches one or more occurrences of the pattern to its left. All quantifiers are greedy: they match the longest possible occurrence of the pattern. You can follow a quantifier with a question mark (?) to make it lazy, so that it matches the shortest possible occurrence of the pattern. Figure 16.21 lists other quantifiers that you can place after a pattern in a regular expression, and the purpose of each.
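The difference between greedy and lazy matching shows up clearly in the number of matches. A small sketch on the "abc, DEF, 123" test string, with illustrative variable names:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string testString = "abc, DEF, 123";

// Greedy: \w+ consumes as many word characters as it can per match.
string[] greedy = Regex.Matches(testString, @"\w+")
    .Cast<Match>().Select(m => m.Value).ToArray();

// Lazy: \w+? stops after the shortest possible match, a single character.
string[] lazy = Regex.Matches(testString, @"\w+?")
    .Cast<Match>().Select(m => m.Value).ToArray();

Console.WriteLine($"Greedy: {string.Join(" | ", greedy)}");  // abc | DEF | 123
Console.WriteLine($"Lazy: {string.Join(" | ", lazy)}");      // a | b | c | D | E | F | 1 | 2 | 3
```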

78 16.15 Introduction to Regular-Expression Processing (Cont.)
You can also use quantifiers with custom character classes. You can also use the "." (dot) character to match any character other than a newline. The regular expression ".*" matches any sequence of characters. The * quantifier matches zero or more occurrences of the pattern to its left. Fig | Quantifiers used in regular expressions. The pattern to the left of {n} must occur exactly n times.
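A brief sketch of the dot, the * quantifier and the {n} quantifier. The input strings are invented for this example and are not taken from the figure.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// ".*" matches any sequence of characters, but the dot stops at a newline.
string firstLine = Regex.Match("line one\nline two", ".*").Value;

// "ab*" matches an a followed by zero or more b characters.
string[] abRuns = Regex.Matches("a ab abbb", "ab*")
    .Cast<Match>().Select(m => m.Value).ToArray();

// "\d{3}" requires exactly three digits per match; the leftover "45" is too short.
string[] triples = Regex.Matches("12345 678", @"\d{3}")
    .Cast<Match>().Select(m => m.Value).ToArray();

Console.WriteLine(firstLine);                    // line one
Console.WriteLine(string.Join(", ", abRuns));    // a, ab, abbb
Console.WriteLine(string.Join(", ", triples));   // 123, 678
```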

79 16.15.1 Regular Expression Example
The dot character "." matches any single character except a newline character. When the dot character is followed by an asterisk (.*), the regular expression matches any number of unspecified characters except newlines. You can create your own character class by listing the members of the character class between square brackets, [ and ]. A range of characters is represented by placing a dash between two characters, as in [A-Z]. Metacharacters in square brackets are treated as literal characters, so [NS]\. matches N. or S., whereas [N|S]\. would also match a literal | followed by a period. You can negate a custom character class by placing a "^" character after the opening square bracket; inside [ ], "^" means "not", so the pattern matches anything other than the characters in the brackets. Ex: [^4] matches any single character other than 4, which includes the digits other than 4 but also letters, punctuation and whitespace.
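The behavior of custom character classes is easy to check directly. This sketch uses made-up input strings and a hypothetical helper named Values; it also shows that a "|" inside square brackets is an ordinary character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the text of every match.
string[] Values(string input, string pattern) =>
    Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

// A range: any uppercase letter from A to Z.
string capitals = string.Join("", Values("Hello World", "[A-Z]"));

// A negated class: every character except 4, digits and non-digits alike.
string notFour = string.Join("", Values("a4b14", "[^4]"));

// Inside brackets "|" is literal, so [N|S] also accepts the bar itself.
int withBar = Values("N. S. |.", @"[N|S]\.").Length;
int withoutBar = Values("N. S. |.", @"[NS]\.").Length;

Console.WriteLine(capitals);     // HW
Console.WriteLine(notFour);      // ab1
Console.WriteLine(withBar);      // 3
Console.WriteLine(withoutBar);   // 2
```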

80 Fig. 16.19 | Character classes.
A character class is an escape sequence that represents a group of characters that might appear in a string. \d{5} matches any 5 digits. The \s character class matches a single whitespace character. In @"J.*\d[0-35-9]-\d\d-\d\d", the preceding @ makes the string verbatim, so every backslash within the double quotes is taken literally rather than as the beginning of an escape sequence. \d[0-35-9] matches a two-digit number whose 2nd digit cannot be 4. @"J.*\d[0-35-9]-\d\d-\d\d" or, equivalently, @"J.*\d[\d-[4]]-\d\d-\d\d" searches for a string that starts with the letter "J", followed by any number of characters, followed by a two-digit number (of which the 2nd digit cannot be 4), followed by a dash, followed by another two-digit number, followed by a dash and another two-digit number.

81 Fig. 16.19 | Character classes.
The pattern @"J.*\d[\d-[4]]-\d\d-\d\d" uses character-class subtraction. When the "-" character in a character class, e.g. [\d-[4]], is followed by a character class, the members of the character class following the "-" are removed from the character class preceding the "-". When using character-class subtraction, the class being subtracted must be the last item in the enclosing square brackets. Tools to create and check the regular expression:

82 Outline Using the System.Text.RegularExpressions namespace for finding patterns in text. RegexMatches.cs Create a Regex object with a regular expression string. string1 is the string searched for patterns. Find and print all matches of the regular expression @"J.*\d[\d-[4]]-\d\d-\d\d", which matches text that starts with the letter "J", followed by any number of characters, followed by a two-digit number (of which the 2nd digit cannot be 4), followed by a dash, followed by another two-digit number, followed by a dash and another two-digit number. Found 2 matches that conform to the pattern specified by the regular expression.
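The pattern from this slide can be run as follows. The sample text is an assumption made for this sketch (the transcript does not reproduce the contents of the searched string); it is constructed so that two of its lines conform to the pattern. Character-class subtraction is a .NET regex feature, so the pattern may not work in other regex engines.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Assumed sample data: one birthday per line, in MM-DD-YY form.
string testString =
    "Jane's Birthday is 05-12-75\n" +
    "Dave's Birthday is 11-04-68\n" +
    "John's Birthday is 04-28-73\n" +
    "Joe's Birthday is 12-17-77";

// [\d-[4]] is the class \d with the digit 4 subtracted from it.
Regex expression = new Regex(@"J.*\d[\d-[4]]-\d\d-\d\d");
string[] found = expression.Matches(testString)
    .Cast<Match>().Select(m => m.Value).ToArray();

// The same set of digits written as two explicit ranges.
Regex withRanges = new Regex(@"J.*\d[0-35-9]-\d\d-\d\d");
int rangeCount = withRanges.Matches(testString).Count;

foreach (string line in found)
{
    Console.WriteLine(line);
}
```

With this data the Dave line is skipped because it contains no "J", and the John line is skipped because the second digit of its month is 4. Each match stays within one line because the dot does not match a newline.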

83 16.15.2 Validating User Input with Regular Expressions
Match property Success indicates whether there was a match: if ( !Regex.Match( lastNameTextBox.Text, "^[A-Z][a-zA-Z]*$" ).Success ) MessageBox.Show( "Invalid last name" ); In "^[A-Z][a-zA-Z]*$", ^ and $ represent the beginning and end of a string, respectively. These characters force a regular expression to return a match only if the entire string being processed matches the regular expression. This expression matches any string consisting of one uppercase letter followed by zero or more additional letters, [a-zA-Z]*. "|" matches the expression to its left or to its right: Hi (John|Joe) matches both Hi John and Hi Joe. Parentheses can be used to group parts of a regular expression, and quantifiers may be applied to patterns enclosed in parentheses to create more complex regular expressions. In @"^[0-9]+\s+([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$", the address must consist of a number of one or more digits, followed by one or more whitespace characters (\s+), followed by either a word of one or more letters, or a word of one or more letters, a single whitespace character and another word of one or more letters. "10 Canon" and "101 Main Street" are valid, but a street name of 3 words is not.
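The two validation patterns above can be tested without a form. In this sketch the helper names IsValidLastName and IsValidAddress are hypothetical, and console output stands in for the MessageBox call on the slide.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helpers wrapping the patterns quoted on the slide.
bool IsValidLastName(string text) =>
    Regex.Match(text, "^[A-Z][a-zA-Z]*$").Success;

bool IsValidAddress(string text) =>
    Regex.Match(text, @"^[0-9]+\s+([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$").Success;

// ^ and $ force the whole string to match, so a trailing digit is rejected.
foreach (string name in new[] { "Smith", "smith", "Smith3" })
{
    Console.WriteLine($"{name}: {IsValidLastName(name)}");
}

// One-word and two-word street names pass; three words or no number fail.
foreach (string address in new[] { "10 Canon", "101 Main Street",
                                   "101 Main Street North", "Main Street" })
{
    Console.WriteLine($"{address}: {IsValidAddress(address)}");
}
```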

84 16.15.3 Validating User Input with Regular Expressions and LINQ
The application in Fig. 16.21 uses regular expressions to validate name, address and telephone-number information input by a user. Outline Validate.cs ( 1 of 7 ) Fig | Validating user information using regular expressions. (Part 1 of 7.)

85 You may include a second where clause after the let clause.
Fig | Validating user information using regular expressions. (Part 2 of 7.) When working with nongeneric collections, such as Controls, you must explicitly type the range variable. If one or more TextBoxes are empty, the program displays a message to the user that all fields must be filled in before the program can validate the information. The let clause creates and initializes a variable in a LINQ query for use later in the query. You may include a second where clause after the let clause. The Success property of class Match indicates whether the Match method found a match.
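The LINQ features named here (an explicitly typed range variable, a let clause, and a second where clause) can be sketched without Windows Forms. This example substitutes a nongeneric ArrayList of strings for the form's Controls collection; that substitution and all names in it are assumptions for illustration.

```csharp
using System;
using System.Collections;
using System.Linq;

// Stand-in for the text of several TextBoxes; ArrayList is nongeneric,
// just as the Controls collection is.
ArrayList fields = new ArrayList { "Jane", "  ", "Doe", "" };

var emptyFields =
    from string field in fields      // range variable typed explicitly
    where field != null
    let trimmed = field.Trim()       // let introduces a variable for later clauses
    where trimmed.Length == 0        // second where clause, after the let
    select field;

int emptyCount = emptyFields.Count();

if (emptyCount > 0)
{
    Console.WriteLine("Please fill in all fields.");
}
```

Typing the range variable makes the compiler insert a cast for each element, which is what allows a query over a collection that only exposes object.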

86 This expr. matches any string consisting
of one uppercase letter, followed by zero or more additional letters [a-zA-Z]* This address can contain a number consisting of one or more digits followed by a space, and a word of one or more chrs or | a word of one or more chrs followed by a space and another word of one or more chrs. Fig | Validating user information using regular expressions. (Part 3 of 7.) This city can contain a word of one or more chrs or | a word of one or more chrs followed by a space and another word of one or more chrs.

87 Outline This state (as city) can contain a word of one or more chrs or | a word of one or more chrs followed by a space and another word of one or more chrs. Fig | Validating user information using regular expressions. (Part 4 of 7.)

88 The Success property of class Match indicates whether the Match method found a match.
Call Regex static method Match, passing both the string to validate and the regular expression as arguments. Fig | Validating user information using regular expressions. (Part 5 of 7.)

89 Outline Validate.cs (1 of 7)

90 Outline (2 of 7) Validate.cs
This expr. matches any string consisting of one uppercase letter, followed by zero or more additional letters [a-zA-Z]* Validate.cs (2 of 7) The Success property indicates whether the first argument matches the pattern from the second argument

91 Validate.cs (3 of 7) This address can contain a number consisting of One or more digits followed by a space, and a word of one or more chrs or | a word of one or more chrs followed by a space and another word of one or more chrs. The Success property indicates whether the first argument matches the pattern from the second argument This city can contain a word of one or more chrs or | a word of one or more chrs followed by a space and another word of one or more chrs.

92 Outline (4 of 7) Validate.cs
\d{5} matches any 5 digits, anywhere in the string; ^\d{5}$ matches only a string consisting of exactly 5 digits. The Success property indicates whether the first argument matches the pattern from the second argument
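The effect of anchoring can be seen by comparing an unanchored and an anchored five-digit pattern. A minimal sketch with invented inputs and variable names:

```csharp
using System;
using System.Text.RegularExpressions;

// Unanchored: succeeds if five consecutive digits occur anywhere.
bool looseOnLong = Regex.Match("123456", @"\d{5}").Success;

// Anchored at both ends: the whole string must be exactly five digits.
bool strictOnFive = Regex.Match("12345", @"^\d{5}$").Success;
bool strictOnLong = Regex.Match("123456", @"^\d{5}$").Success;
bool strictOnShort = Regex.Match("1234", @"^\d{5}$").Success;

Console.WriteLine($"{looseOnLong} {strictOnFive} {strictOnLong} {strictOnShort}");
// True True False False
```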

93 Outline // accepts axx-byy-yyyy // a, b cannot be 0 Validate.cs (5 of 7) The Success property indicates whether the first argument matches the pattern from the second argument Hides the form Terminate program
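The comment on this slide describes a phone-number format (axx-byy-yyyy, where a and b cannot be 0) but the transcript does not show the pattern itself. The pattern below is one way to express that rule and is an assumption, not necessarily the one used in Validate.cs; the helper name is likewise hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumed pattern: [1-9] for the nonzero leading digit of the first two
// groups, \d{n} for the remaining digits, anchored with ^ and $.
bool IsValidPhone(string text) =>
    Regex.Match(text, @"^[1-9]\d{2}-[1-9]\d{2}-\d{4}$").Success;

foreach (string phone in new[] { "555-123-4567", "055-123-4567",
                                 "555-023-4567", "5551234567" })
{
    Console.WriteLine($"{phone}: {IsValidPhone(phone)}");
}
```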

94 16.15.3 Regex Methods Replace and Split
Method Replace replaces text in a string with new text wherever the original string matches a regular expression: Regex.Replace( "replacement by this" ). A static version of the Replace method takes the string to modify, the regular expression string, and the replacement string. Replace is also an instance method that uses the regular expression passed to the constructor of the calling Regex object.

95 16.15.3 Regex Methods Replace and Split
Breaking a string into tokens (several substrings) is called tokenization. Tokens are separated from one another by delimiters, typically white-space characters such as blank, tab, newline and carriage return. Other characters may also be used as delimiters to separate tokens. Method Split divides a string into several substrings (tokens); the original string is broken at delimiters that match a specified regular expression. The first argument of the static Split is the string to split; the second argument is the regular expression that represents the delimiter: result = Regex.Split( ); It returns an array containing the substrings.

96 strings used for finding patterns
Output:
Original string: This sentence ends in 5 stars *****
^ substituted for *: This sentence ends in 5 stars ^^^^^
"carets" substituted for "stars": This sentence ends in 5 carets ^^^^^
Every word replaced by "word": word word word word word word ^^^^^
Original string: 1, 2, 3, 4, 5, 6, 7, 8
Replace first 3 digits by "digit": digit, digit, digit, 4, 5, 6, 7, 8
string split at commas ["1", "2", "3", "4", "5", "6", "7", "8"]
Strings used for finding patterns. Create a Regex object. Replace all "*" with "^" in testString1; the backslash in \* escapes the asterisk, so it is treated as a literal character rather than as a quantifier. Replace "stars" with "carets" in testString1. Replace every word with "word". Replace the first 3 digits with "digit". RegexSubstitution.cs (1 of 2)

97 (2 of 2) Regular expression “,\s”
// 2nd parameter is the delimiter Regular expression ",\s" Split the string at each comma followed by a whitespace character, storing each substring as an element of a string array. RegexSubstitution.cs (2 of 2)
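The Replace and Split calls annotated on these two slides can be reconstructed roughly as follows. This is a sketch that reproduces the output shown above, not the RegexSubstitution.cs listing itself; the intermediate variable names are made up.

```csharp
using System;
using System.Text.RegularExpressions;

string testString1 = "This sentence ends in 5 stars *****";
string testString2 = "1, 2, 3, 4, 5, 6, 7, 8";
Regex testRegex1 = new Regex(@"\d");   // matches a single digit

// Static Replace: input, pattern, replacement. \* is a literal asterisk.
string caretsForStars = Regex.Replace(testString1, @"\*", "^");
string renamed = Regex.Replace(caretsForStars, "stars", "carets");
string everyWord = Regex.Replace(renamed, @"\w+", "word");

// Instance Replace uses the Regex object's own pattern; the third
// argument limits the number of replacements to 3.
string firstThree = testRegex1.Replace(testString2, "digit", 3);

// Split at each comma followed by a whitespace character.
string[] result = Regex.Split(testString2, @",\s");

Console.WriteLine(caretsForStars);
Console.WriteLine(renamed);
Console.WriteLine(everyWord);
Console.WriteLine(firstThree);
Console.WriteLine(string.Join("|", result));
```

Strings are immutable, so each Replace call returns a new string and leaves testString1 and testString2 unchanged.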

