
1 Characters and Strings

2 Representation of single characters Data type char is the data type that represents single characters, such as letters, numerals, and punctuation marks. A literal value of type char is written as a single character enclosed within single quotation marks. Examples: 'a', 'F', '9', '&', ' ', ','

3 Character encoding ASCII stands for American Standard Code for Information Interchange. ASCII is one of the document coding schemes widely used today. This coding scheme allows different computers to share information easily. Most programming languages support ASCII characters

4 ASCII Encoding ASCII works well for English-language documents because all characters and punctuation marks are included in the ASCII codes. ASCII does not represent the full character sets of other languages.

5 ASCII Encoding In the ASCII table, a character's code is the sum of its row value and its column value. For example, the character 'O' is 79 (row value 70 + column value 9 = 79).

6 Limitations of ASCII Standard ASCII is a 7-bit code, although each character is usually stored in an 8-bit byte.
–The eighth (high-order) bit is not used by standard ASCII
–This leaves 2^7 (128) unique combinations of bits to represent characters
–Extended ASCII sets use all 8 bits to represent a character, giving 256 unique combinations

7 Unicode Encoding The Unicode Worldwide Character Standard (Unicode) supports the interchange, processing, and display of the written texts of diverse languages. Java uses the Unicode standard for representing char constants. A Java char occupies 16 bits, allowing for 2^16 (65,536) unique bit combinations. When these slides were written, 34,168 distinct characters were defined, covering most of the major world languages. Unicode has since grown beyond 65,536 characters; Java represents a character outside the 16-bit range as a pair of char values (a surrogate pair).

8 ASCII/Unicode equivalence Unicode uses the same bit combinations for the characters that exist in the ASCII set. Thus, an English alphabetic character has the same numeric value in both ASCII and Unicode.
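The equivalence can be checked directly. The following is a minimal sketch; the class name AsciiUnicodeDemo and the helper codeOf are made up for illustration and are not part of the slides.

```java
class AsciiUnicodeDemo {
    // Hypothetical helper: returns the numeric code stored in a char.
    static int codeOf(char c) {
        return (int) c;
    }

    public static void main(String[] args) {
        char plain = 'A';          // written as an ordinary character
        char escaped = '\u0041';   // the same character written as a Unicode escape

        System.out.println(codeOf(plain));     // prints 65, the ASCII code of A
        System.out.println(codeOf(escaped));   // prints 65
        System.out.println(plain == escaped);  // prints true
    }
}
```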

9 Special characters Several keys on a standard keyboard don’t translate directly into printable (or displayable) characters. For example, the Enter key moves the cursor to a new line; we already know that the character that corresponds to this action can be represented as '\n'.

10 Special characters Some other special characters used in Java include:
–'\t': horizontal tab character
–'\\': a single backslash
Unlike C, Java has no '\a' (alarm) escape. The bell character, which makes some terminals beep, can be written as '\u0007'.
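As a rough illustration of these escape sequences, the sketch below builds one string that contains a tab, a newline, and a backslash. The class name EscapeDemo and the method sample are assumptions made for this example only.

```java
class EscapeDemo {
    // Hypothetical helper: a string containing a tab, a newline, and one backslash.
    static String sample() {
        return "name\tvalue\nC:\\temp";
    }

    public static void main(String[] args) {
        // Prints two lines: "name<TAB>value" and then "C:\temp".
        System.out.println(sample());

        // Each escape sequence counts as a single character.
        System.out.println(sample().length());   // prints 18
    }
}
```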

11 Converting between char and int We can convert between a numeric (int) value and its corresponding ASCII character equivalent by using type casting, as the examples below illustrate:
    char ch1 = 'X';
    System.out.println(ch1);        // prints X
    System.out.println((int) ch1);  // prints 88
    int x = 99;
    System.out.println(x);          // prints 99
    System.out.println((char) x);   // prints c

12 Character comparison Values of type char can be compared just like integers are compared, since they are actually stored as binary whole numbers. In the ASCII (and Unicode) set, uppercase letters have lower numeric value than lowercase letters. So, for example, 'A' is less than 'a', and 'b' is greater than 'Z'.
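A short sketch of these comparisons follows. The class CharCompareDemo and the helper isUpperCaseLetter are hypothetical names, and the helper only covers the 26 unaccented English letters.

```java
class CharCompareDemo {
    // Hypothetical helper: true for 'A' through 'Z' only, using a range comparison.
    static boolean isUpperCaseLetter(char c) {
        return c >= 'A' && c <= 'Z';
    }

    public static void main(String[] args) {
        System.out.println('A' < 'a');               // prints true (65 < 97)
        System.out.println('b' > 'Z');               // prints true (98 > 90)
        System.out.println(isUpperCaseLetter('Q'));  // prints true
        System.out.println(isUpperCaseLetter('q'));  // prints false
    }
}
```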

13 Strings A string is a sequence of characters that is treated as a single value. Instances of the String class are used to represent strings in Java. We access individual characters of a string by calling the charAt method of the String object.

14 Strings Each character in a string has an index we use to access the character. Java uses zero-based indexing; the first character’s index is 0, the second is 1, and so on. To refer to the first character of the word name, we say name.charAt(0)

15 String indexing with charAt method An indexed expression is used to refer to individual characters in a string.
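Zero-based indexing with charAt can be sketched as follows. The class CharAtDemo and its two helpers are illustrative names, not taken from the slides. Note that charAt throws a StringIndexOutOfBoundsException for an index below 0 or at or beyond length().

```java
class CharAtDemo {
    // Hypothetical helpers: first and last characters of a non-empty string.
    static char firstChar(String s) {
        return s.charAt(0);
    }

    static char lastChar(String s) {
        return s.charAt(s.length() - 1);   // the last valid index is length() - 1
    }

    public static void main(String[] args) {
        String name = "Kona";
        // Walk the string one index at a time: 0:K, 1:o, 2:n, 3:a
        for (int i = 0; i < name.length(); i++) {
            System.out.println(i + ":" + name.charAt(i));
        }
        System.out.println(firstChar(name));   // prints K
        System.out.println(lastChar(name));    // prints a
    }
}
```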

16 Constructing strings Since String is a class, we can create an instance of it by using the new operator.
–The statements we have used so far, such as String name1 = "Kona";
–behave much like String name1 = new String("Kona");
–The two are not identical: new always creates a new object, whereas equal string literals refer to one shared object.
–This literal shorthand works for the String class only.

17 Example: Counting Vowels Here’s the code to count the number of vowels in the input string.
    // requires: import javax.swing.JOptionPane;
    char letter;
    String name = JOptionPane.showInputDialog(null, "Your name:");
    int numberOfCharacters = name.length();
    int vowelCount = 0;
    for (int i = 0; i < numberOfCharacters; i++) {
        letter = name.charAt(i);
        if (letter == 'a' || letter == 'A' ||
            letter == 'e' || letter == 'E' ||
            letter == 'i' || letter == 'I' ||
            letter == 'o' || letter == 'O' ||
            letter == 'u' || letter == 'U') {
            vowelCount++;
        }
    }
    System.out.print(name + ", your name has " + vowelCount + " vowels");

18 Example: Counting 'Java' Continue reading words and count how many times the word Java occurs in the input, ignoring the case. Notice how the comparison is done. We are not using the == operator.
    // requires: import javax.swing.JOptionPane;
    int javaCount = 0;
    boolean repeat = true;
    String word;
    while (repeat) {
        word = JOptionPane.showInputDialog(null, "Next word:");
        if (word.equals("STOP")) {
            repeat = false;
        } else if (word.equalsIgnoreCase("Java")) {
            javaCount++;
        }
    }

19 Other Useful String Methods
–compareTo: compares the two strings. Example: str1.compareTo(str2)
–substring: extracts a substring from a string. Example: str1.substring(1, 4)
–trim: removes the leading and trailing spaces. Example: str1.trim()
–valueOf: converts a given primitive data value to a string. Example: String.valueOf(123.4565)
–startsWith: returns true if a string starts with a specified prefix string. Example: str1.startsWith(str2)
–endsWith: returns true if a string ends with a specified suffix string. Example: str1.endsWith(str2)
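The methods listed above can be tried together in one small program. This is only a sketch; the class StringMethodsDemo, the helper middleOf, and the sample text are assumptions made for illustration.

```java
class StringMethodsDemo {
    // Hypothetical helper: trims the text, then extracts the characters at indexes 1, 2, and 3.
    static String middleOf(String s) {
        return s.trim().substring(1, 4);   // the end index 4 is exclusive
    }

    public static void main(String[] args) {
        String str1 = "  Hello, World  ";
        String trimmed = str1.trim();

        System.out.println("[" + trimmed + "]");            // prints [Hello, World]
        System.out.println(middleOf(str1));                  // prints ell
        System.out.println(trimmed.startsWith("Hello"));     // prints true
        System.out.println(trimmed.endsWith("World"));       // prints true
        System.out.println(String.valueOf(42) + "!");        // prints 42!
        System.out.println("apple".compareTo("banana") < 0); // prints true
    }
}
```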

20 Comparing Strings Comparing String objects is similar to comparing other objects. The equality test (==) is true if the contents of the variables are the same. For a reference data type, the equality test is true if both variables refer to the same object, because they both contain the same address. Thus, the “contents of the variable” does not mean “the sequence of characters in the String”

21 Comparing Strings We don’t usually use the == operator to compare Strings. The equals method is true if the String objects to which the two variables refer contain the same string value.
    String s1 = new String("hello");
    String s2 = new String("hello");
    if (s1 == s2)
        System.out.println("They are equal");        // this won't print
    if (s1.equals(s2))
        System.out.println("No, really, they are");  // this will print

22 The difference between the equality test and the equals method

23 … continued

24 Comparing Strings String comparison may be done in several ways.
–The methods equals and equalsIgnoreCase compare string values; one is case-sensitive and one is not.
–The method compareTo returns a value:
  zero (0) if the strings are equal;
  a negative integer if the first string is less than the second;
  a positive integer if the first string is greater than the second.
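The three possible outcomes of compareTo can be sketched as below. The class CompareDemo and the helper describe are hypothetical; only the sign of the compareTo result is relied on, not its exact value.

```java
class CompareDemo {
    // Hypothetical helper: turns the sign of compareTo into a word.
    static String describe(String a, String b) {
        int result = a.compareTo(b);
        if (result == 0) {
            return "equal";
        }
        return result < 0 ? "less" : "greater";
    }

    public static void main(String[] args) {
        System.out.println(describe("apple", "banana"));     // prints less
        System.out.println(describe("pear", "pear"));        // prints equal
        System.out.println(describe("b", "Z"));              // prints greater (lowercase codes are higher)
        System.out.println("java".equals("JAVA"));           // prints false
        System.out.println("java".equalsIgnoreCase("JAVA")); // prints true
    }
}
```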

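25 Comparing Strings Whenever a new String object is created using the new operator, the rule for comparing objects applies to comparing strings.
    String str = new String("Java");
If the new operator is not used, equal string literals refer to the same shared String object, so == on two such variables happens to be true. This makes literals look as if they were primitive values, but they are still objects; use equals to compare contents.
    String str = "Java";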
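The difference between literals and objects created with new can be observed with ==. The sketch below uses made-up names (StringIdentityDemo, sameObject); it relies on the language guarantee that equal string literals refer to the same object.

```java
class StringIdentityDemo {
    // Hypothetical helper: true only if both variables refer to the very same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "Java";               // literal
        String b = "Java";               // same literal, same shared object
        String c = new String("Java");   // explicitly created, distinct object

        System.out.println(sameObject(a, b));          // prints true
        System.out.println(sameObject(a, c));          // prints false
        System.out.println(a.equals(c));               // prints true, contents match
        System.out.println(sameObject(a, c.intern())); // prints true, intern returns the shared object
    }
}
```

Because the result of == depends on how each string was created, equals remains the dependable way to compare contents.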

26 The difference between using and not using the new operator for String

27 Pattern Matching and Regular Expressions Pattern matching is a common function in many applications. In Java 2 SDK 1.4, two new classes, Pattern and Matcher, were added. The String class also gained several methods that support pattern matching.

28 Pattern Example Suppose students are assigned a three-digit code:
–The first digit represents the major (5 indicates computer science);
–The second digit represents either in-state (1), out-of-state (2), or foreign (3);
–The third digit indicates campus housing: on-campus dorms are numbered 1-7, and students living off-campus are represented by the digit 8.
The 3-digit pattern to represent computer science majors living on-campus is 5[123][1-7]:
–the first character is 5
–the second character is 1, 2, or 3
–the third character is any digit between 1 and 7
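The student-code pattern can be tested with the matches method of the String class. In this sketch the class StudentCodeDemo, the constant CS_ON_CAMPUS, and the helper isCsOnCampus are hypothetical names chosen for the example.

```java
class StudentCodeDemo {
    // The pattern from the slide: computer science major, any residency, on-campus dorm.
    static final String CS_ON_CAMPUS = "5[123][1-7]";

    // Hypothetical helper: matches() succeeds only if the whole string fits the pattern.
    static boolean isCsOnCampus(String code) {
        return code.matches(CS_ON_CAMPUS);
    }

    public static void main(String[] args) {
        System.out.println(isCsOnCampus("513"));   // prints true
        System.out.println(isCsOnCampus("528"));   // prints false, 8 means off-campus
        System.out.println(isCsOnCampus("413"));   // prints false, not a computer science major
        System.out.println(isCsOnCampus("5131"));  // prints false, too many digits
    }
}
```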

29 Pattern Matching and Regular Expression The pattern is called a regular expression that allows us to denote a large set of “words” (any sequence of symbols) succinctly. Brackets [ ] represent choices, so [abc] means a, b, or c. For example, the definition for a valid Java identifier may be stated as [a-zA-Z][a-zA-Z0-9_$]*

30 Regular Expressions Rules
–The brackets [ ] represent choices.
–The asterisk symbol * means zero or more occurrences.
–The plus symbol + means one or more occurrences.
–The hat symbol ^, placed first inside brackets, means negation.
–The hyphen - inside brackets means a range.
–The parentheses ( ) and the vertical bar | mean a range of choices for multiple characters.

31 Regular Expression Examples
–[013]: a single digit 0, 1, or 3.
–[0-9][0-9]: any two-digit number from 00 to 99.
–[0-9&&[^4567]]: a single digit that is 0, 1, 2, 3, 8, or 9.
–[a-z0-9]: a single character that is either a lowercase letter or a digit.
–[a-zA-Z][a-zA-Z0-9_$]*: a valid Java identifier consisting of alphanumeric characters, underscores, and dollar signs, with the first character being a letter.
–[wb](ad|eed): matches wad, weed, bad, and beed.
–(AZ|CA|CO)[0-9][0-9]: matches AZxx, CAxx, and COxx, where x is a single digit.

32 More Examples
–X{N}: repeat X exactly N times, where X is a regular expression for a single character.
–X{N,}: repeat X at least N times.
–X{N,M}: repeat X at least N but no more than M times.
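Several of the example expressions, including the repetition forms, can be checked with a few calls to matches. This is a sketch only; the class RegexExamplesDemo and the helper fits are illustrative names.

```java
class RegexExamplesDemo {
    // Hypothetical helper: true if the whole text matches the regular expression.
    static boolean fits(String text, String regex) {
        return text.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(fits("07", "[0-9][0-9]"));           // prints true
        System.out.println(fits("8", "[0-9&&[^4567]]"));        // prints true
        System.out.println(fits("5", "[0-9&&[^4567]]"));        // prints false
        System.out.println(fits("weed", "[wb](ad|eed)"));       // prints true
        System.out.println(fits("CA42", "(AZ|CA|CO)[0-9]{2}")); // prints true, exactly two digits
        System.out.println(fits("aaa", "a{2,}"));               // prints true, at least two
        System.out.println(fits("a", "a{2,}"));                 // prints false
        System.out.println(fits("aaaa", "a{1,3}"));             // prints false, more than three
    }
}
```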

33 Pattern Matching and Regular Expression The matches method from the String class is similar to the equals method. However, unlike equals, the argument to matches can be a pattern.

34 Pattern Matching and Regular Expression The period symbol (.) is used to match any character except a line terminator (such as \n or \r).
    String document;
    document =...; //assign text to 'document'
    if (document.matches(".*zen of objects.*")) {
        System.out.println("Found");
    } else {
        System.out.println("Not found");
    }

35 Pattern Matching and Regular Expression Brackets ([ ]) are used for expressing a range of choices for a given character. To express a range of choices for multiple characters, use parentheses and the vertical bar.

36 Pattern Matching and Regular Expression
–[wb](ad|eed): matches wad, weed, bad, and beed.
–(pro|anti)-OOP: matches pro-OOP and anti-OOP.
–(AZ|CA|CO)[0-9]{4}: matches AZxxxx, CAxxxx, and COxxxx, where x is a single digit.

37 Pattern Matching and Regular Expression The replaceAll method is new to the Version 1.4 String class. This method allows us to replace all occurrences of a substring that matches a given regular expression with a given replacement string.

38 Pattern Matching and Regular Expression For example, to replace all vowels in a string with the @ symbol:
    String originalText, modifiedText;
    originalText =...; //assign string to 'originalText'
    modifiedText = originalText.replaceAll("[aeiou]", "@");
Note that this method does not change the original text; it simply returns a modified text as a separate string.

39 Pattern Matching and Regular Expression To match a whole word, use the \b symbol to designate the word boundary.
    str.replaceAll("\\btemp\\b", "temporary");
Two backslashes are necessary because we must write the expression as a Java string literal. In a string literal, \\ stands for a single backslash, so the regular expression receives \b; a single \b in a string literal would instead be read as the backspace escape sequence.

40 Pattern Matching and Regular Expression The backslash is also used to search for a character that has a special meaning in regular expressions. For example:
–To search for the plus symbol (+) in text, we write the regular expression \+.
–To express it as a string, we write "\\+".
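Both uses of the doubled backslash, the word boundary and the escaped plus sign, can be sketched with replaceAll. The class EscapeRegexDemo and its two helpers are hypothetical names, and the sample sentences are invented.

```java
class EscapeRegexDemo {
    // Hypothetical helper: replaces the whole word "temp" but leaves longer words alone.
    static String expandTemp(String s) {
        return s.replaceAll("\\btemp\\b", "temporary");
    }

    // Hypothetical helper: the plus sign must be escaped to be matched literally.
    static String plusToWord(String s) {
        return s.replaceAll("\\+", " plus ");
    }

    public static void main(String[] args) {
        // "temperature" is untouched because no word boundary follows its first four letters.
        System.out.println(expandTemp("temp file in temperature log"));
        // prints temporary file in temperature log

        System.out.println(plusToWord("1+2"));   // prints 1 plus 2
    }
}
```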

41 The Pattern and Matcher Classes The matches and replaceAll methods of the String class are shorthand for using the Pattern and Matcher classes from the java.util.regex package.

42 The Pattern and Matcher Classes If str and regex are String objects, then both
    str.matches(regex);
and
    Pattern.matches(regex, str);
are equivalent to
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(str);
    matcher.matches();

43 The Pattern and Matcher Classes Creating Pattern and Matcher objects gives us more options and efficiency. The compile method of the Pattern class converts the stated regular expression to an internal format to carry out the pattern-matching operation. This conversion is carried out every time the matches method of the String or Pattern class is executed, so when the same pattern is applied many times, compiling it once and reusing the Pattern object avoids the repeated conversion.
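One way to avoid the repeated conversion is to compile the pattern a single time and keep the Pattern object. The following sketch assumes the simplified identifier pattern from the earlier slides; the class CompiledPatternDemo, the constant IDENTIFIER, and the helper isIdentifier are illustrative names.

```java
import java.util.regex.Pattern;

class CompiledPatternDemo {
    // Compiled once when the class is loaded, then reused for every check.
    private static final Pattern IDENTIFIER =
            Pattern.compile("[a-zA-Z][a-zA-Z0-9_$]*");

    // Hypothetical helper: each call creates a lightweight Matcher from the shared Pattern.
    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"count", "2fast", "my_var$1", ""};
        for (String candidate : candidates) {
            // Prints true, false, true, false for the four candidates.
            System.out.println("[" + candidate + "] " + isIdentifier(candidate));
        }
    }
}
```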

44 The Pattern and Matcher Classes
    /*
        Chapter 9 Sample Program: Checks whether the input string is a valid identifier.
        This version uses the Matcher and Pattern classes.
        File: Ch9MatchJavaIdentifier2.java
    */
    import javax.swing.*;
    import java.util.regex.*;

    class Ch9MatchJavaIdentifier2 {
        private static final String STOP = "STOP";
        private static final String VALID = "Valid Java identifier";
        private static final String INVALID = "Not a valid Java identifier";

45 The Pattern and Matcher Classes
        private static final String VALID_IDENTIFIER_PATTERN = "[a-zA-Z][a-zA-Z0-9_$]*";

        public static void main(String[] args) {
            String str, reply;
            Matcher matcher;
            // Compile the pattern once, before the loop
            Pattern pattern = Pattern.compile(VALID_IDENTIFIER_PATTERN);
            while (true) {
                str = JOptionPane.showInputDialog(null, "Identifier:");
                if (str.equals(STOP)) break;

46 The Pattern and Matcher Classes
                matcher = pattern.matcher(str);
                if (matcher.matches()) {
                    reply = VALID;
                } else {
                    reply = INVALID;
                }
                JOptionPane.showMessageDialog(null, str + ":\n" + reply);
            } // ends loop
        } // ends main
    } // ends class

47 The Pattern and Matcher Classes The find method is another powerful method of the Matcher class. The method searches for the next sequence in a string that matches the pattern, and returns true if the pattern is found.

48 The Pattern and Matcher Classes When a matcher finds a matching sequence of characters, we can query the location of the sequence by using the start and end methods.

49 The Pattern and Matcher Classes The start method returns the position in the string where the first character of the pattern is found. The end method returns the value 1 more than the position in the string where the last character of the pattern is found.
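The find, start, and end methods are typically used together in a loop. The sketch below collects every match position as a "start-end" string; the class FindDemo, the helper locate, and the sample sentence are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Hypothetical helper: returns "start-end" for each place the pattern occurs in the text.
    static List<String> locate(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {                       // advances to the next match, if any
            // start() is the index of the first matched character;
            // end() is one more than the index of the last matched character.
            hits.add(matcher.start() + "-" + matcher.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(locate("java", "I love java and java loves me"));
        // prints [7-11, 16-20]
    }
}
```

Because end is one past the last character, text.substring(start, end) yields exactly the matched sequence.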

50 The String Class is Immutable In Java a String object is immutable.
–This means once a String object is created, it cannot be changed, for example by replacing a character with another character or by removing a character.
–The String methods we have used so far do not change the original string. They create a new string from the original. For example, substring creates a new string from a given string.
The String class is defined in this manner for efficiency reasons.
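Immutability can be observed by calling several String methods and then printing the original. This is a minimal sketch; the class ImmutabilityDemo and the helper shout are hypothetical names.

```java
class ImmutabilityDemo {
    // Hypothetical helper: builds a new string and leaves its argument untouched.
    static String shout(String s) {
        return s.toUpperCase() + "!";
    }

    public static void main(String[] args) {
        String original = "Java";

        String upper = shout(original);               // a new string
        String tail = original.substring(1);          // a new string
        String swapped = original.replace('J', 'L');  // a new string

        System.out.println(upper);      // prints JAVA!
        System.out.println(tail);       // prints ava
        System.out.println(swapped);    // prints Lava
        System.out.println(original);   // prints Java, the original never changed
    }
}
```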

51 Effect of Immutability We can do this because String objects are immutable.

52 The StringBuffer Class In many string processing applications, we would like to change the contents of a string. In other words, we want it to be mutable. Manipulating the content of a string, such as replacing a character, appending a string with another string, deleting a portion of a string, and so on, may be accomplished by using the StringBuffer class.

53 StringBuffer Example Changing the string Java to Diva:
    StringBuffer word = new StringBuffer("Java");
    word.setCharAt(0, 'D');
    word.setCharAt(1, 'i');
Before the two setCharAt calls, word refers to a StringBuffer containing Java. After them, the same StringBuffer contains Diva.

54 Sample Processing Replace all vowels in the sentence with 'X'.
    // requires: import javax.swing.JOptionPane;
    char letter;
    String inSentence = JOptionPane.showInputDialog(null, "Sentence:");
    StringBuffer tempStringBuffer = new StringBuffer(inSentence);
    int numberOfCharacters = tempStringBuffer.length();
    for (int index = 0; index < numberOfCharacters; index++) {
        letter = tempStringBuffer.charAt(index);
        if (letter == 'a' || letter == 'A' ||
            letter == 'e' || letter == 'E' ||
            letter == 'i' || letter == 'I' ||
            letter == 'o' || letter == 'O' ||
            letter == 'u' || letter == 'U') {
            tempStringBuffer.setCharAt(index, 'X');
        }
    }
    JOptionPane.showMessageDialog(null, tempStringBuffer);

55 The append and insert Methods We use the append method to append a String or StringBuffer object to the end of a StringBuffer object.
–The method can also take an argument of the primitive data type.
–Any primitive data type argument is converted to a string before it is appended to a StringBuffer object.
We can insert a string at a specified position by using the insert method.
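A short sketch of append and insert follows. The class AppendInsertDemo, the helper build, and the sample text are invented for illustration; the comments track the buffer contents after each call.

```java
class AppendInsertDemo {
    // Hypothetical helper: builds a sentence step by step in one StringBuffer.
    static String build() {
        StringBuffer buffer = new StringBuffer("Java");
        buffer.append(" is ");            // "Java is "
        buffer.append(5);                 // the int is converted to text: "Java is 5"
        buffer.append('!');               // "Java is 5!"
        buffer.insert(8, "version ");     // inserted before index 8: "Java is version 5!"
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(build());      // prints Java is version 5!
    }
}
```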

56 The StringBuilder Class This class is new to Java 5.0 (SDK 1.5). It was added as a faster alternative to StringBuffer: its methods are not synchronized. StringBuffer and StringBuilder support the same set of methods, so in single-threaded code they are interchangeable.

57 The StringBuilder Class There are advanced cases where we must use StringBuffer, but in all sample applications in the book, StringBuilder can be used. Since performance is not our main concern and the StringBuffer class is usable in all versions of Java, we will use only StringBuffer in this book.
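To illustrate the interchangeability, here is the vowel-replacement idea from the earlier sample rewritten with StringBuilder. It is a sketch under assumptions: the class StringBuilderDemo and the helper maskVowels are hypothetical, and the vowel test uses indexOf on a string of vowels instead of the long chain of comparisons.

```java
class StringBuilderDemo {
    // Hypothetical helper: replaces every vowel with 'X', working in a mutable StringBuilder.
    static String maskVowels(String sentence) {
        StringBuilder builder = new StringBuilder(sentence);
        for (int index = 0; index < builder.length(); index++) {
            // indexOf returns -1 when the character is not one of the listed vowels.
            if ("aeiouAEIOU".indexOf(builder.charAt(index)) >= 0) {
                builder.setCharAt(index, 'X');
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(maskVowels("Java is fun"));   // prints JXvX Xs fXn
    }
}
```

Swapping StringBuilder for StringBuffer in this method requires changing only the type name.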

