Characters and Strings. Representation of single characters Data type char is the data type that represents single characters, such as letters, numerals,

Slides:



Advertisements
Similar presentations
Character Arrays (Single-Dimensional Arrays) A char data type is needed to hold a single character. To store a string we have to use a single-dimensional.
Advertisements

©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter Chapter 9 Characters and Strings.
Liang, Introduction to Java Programming, Ninth Edition, (c) 2013 Pearson Education, Inc. All rights reserved. 1 Chapter 9 Strings.
Chapter 7 Strings F To process strings using the String class, the StringBuffer class, and the StringTokenizer class. F To use the String class to process.
1 Strings and Text I/O. 2 Motivations Often you encounter the problems that involve string processing and file input and output. Suppose you need to write.
Java Programming Strings Chapter 7.
1 Chapter 2 Introduction to Java Applications Introduction Java application programming Display ____________________ Obtain information from the.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. ETC - 1 What comes next? Recursion (Chapter 6, 15) Recursive Data Structures.
©TheMcGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 9: Characters * Character primitives * Character Wrapper class.
1 Chapter 4 Language Fundamentals. 2 Identifiers Program parts such as packages, classes, and class members have names, which are formally known as identifiers.
Chapter 9 Characters and Strings. Topics Character primitives Character Wrapper class More String Methods String Comparison String Buffer String Tokenizer.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 9 Characters and Strings (sections ,
Intro to OOP with Java, C. Thomas Wu Characters and Strings
Fundamental Programming Structures in Java: Strings.
28-Jun-15 String and StringBuilder Part I: String.
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL Adrian Ilie COMP 14 Introduction to Programming Adrian Ilie June 27, 2005.
© 2000 McGraw-Hill Introduction to Object-Oriented Programming with Java--WuChapter Chapter 8 Characters and Strings.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 9 Characters and Strings (sections ,
JavaScript, Fourth Edition
JavaScript, Third Edition
Chapter 7. 2 Objectives You should be able to describe: The string Class Character Manipulation Methods Exception Handling Input Data Validation Namespaces.
Slides prepared by Rose Williams, Binghamton University Chapter 1 Getting Started 1.3 The Class String.
String Escape Sequences
Chapter 3: Introduction to C Programming Language C development environment A simple program example Characters and tokens Structure of a C program –comment.
Lesson 3 – Regular Expressions Sandeepa Harshanganie Kannangara MBCS | B.Sc. (special) in MIT.
Characters, String and Regular expressions. Characters char data type is used to represent a single character. Characters are stored in a computer memory.
©TheMcGraw-Hill Companies, Inc. Permission required for reproduction or display. Chapter 9 Characters and Strings.
Java How to Program, 9/e © Copyright by Pearson Education, Inc. All Rights Reserved.
Java Primitives The Smallest Building Blocks of the Language (corresponds with Chapter 2)
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Characters In Java, single characters are represented.
Number Systems Spring Semester 2013Programming and Data Structure1.
Characters The data type char represents a single character in Java. –Character values are written as a symbol: ‘a’, ‘)’, ‘%’, ‘A’, etc. –A char value.
Liang, Introduction to Java Programming, Eighth Edition, (c) 2011 Pearson Education, Inc. All rights reserved Chapter 9 Strings and Text.
©The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 4 th Ed Chapter Chapter 9 Characters and Strings (adapted from.
Chapter 2: Java Fundamentals
Regular Expressions.
Portions adapted with permission from the textbook author. CS-1020 Dr. Mark L. Hornick 1 Regular Expressions and String processing Animated Version.
Java Programming: From Problem Analysis to Program Design, 4e Chapter 2 Basic Elements of Java.
Using Data Within a Program Chapter 2.  Classes  Methods  Statements  Modifiers  Identifiers.
An Introduction to Java Programming and Object-Oriented Application Development Chapter 7 Characters, Strings, and Formatting.
When you read a sentence, your mind breaks it into tokens—individual words and punctuation marks that convey meaning. Compilers also perform tokenization.
Java Overview. Comments in a Java Program Comments can be single line comments like C++ Example: //This is a Java Comment Comments can be spread over.
Chapter 7: Characters, Strings, and the StringBuilder.
1 Textual Data Many computer applications manipulate textual data word processors web browsers online dictionaries.
Chapter 3: Classes and Objects Java Programming FROM THE BEGINNING Copyright © 2000 W. W. Norton & Company. All rights reserved Java’s String Class.
Vladimir Misic: Characters and Strings1Tuesday, 9:39 AM Characters and Strings.
Strings and Related Classes String and character processing Class java.lang.String Class java.lang.StringBuffer Class java.lang.Character Class java.util.StringTokenizer.
Java String 1. String String is basically an object that represents sequence of char values. An array of characters works same as java string. For example:
17-Feb-16 String and StringBuilder Part I: String.
A FIRST BOOK OF C++ CHAPTER 14 THE STRING CLASS AND EXCEPTION HANDLING.
Winter 2016CISC101 - Prof. McLeod1 CISC101 Reminders Quiz 3 this week – last section on Friday. Assignment 4 is posted. Data mining: –Designing functions.
CSE 110: Programming Language I Matin Saad Abdullah UB 1222.
Strings A string is a sequence of characters that is treated as a single value. Strings are objects. We have been using strings all along. For example,
Lesson 4 String Manipulation. Lesson 4 In many applications you will need to do some kind of manipulation or parsing of strings, whether you are Attempting.
© 2004 Pearson Addison-Wesley. All rights reserved August 27, 2007 Primitive Data Types ComS 207: Programming I (in Java) Iowa State University, FALL 2007.
Strings, Characters and Regular Expressions
Strings, Characters and Regular Expressions
Strings, StringBuilder, and Character
Chapter 3 : Basic Concept Of Java
Primitive Data Types August 28, 2006 ComS 207: Programming I (in Java)
Multiple variables can be created in one declaration
Primitive Types Vs. Reference Types, Strings, Enumerations
Advanced Programming Behnam Hatami Fall 2017.
String and StringBuilder
Chapter 7: Strings and Characters
Object Oriented Programming
MSIS 655 Advanced Business Applications Programming
String and StringBuilder
Strings A string is a sequence of characters that is treated as a single value. Strings are objects. We have been using strings all along. Every time.
String and StringBuilder
Presentation transcript:

Characters and Strings

Representation of single characters Data type char is the data type that represents single characters, such as letters, numerals, and punctuation marks A literal value of type char is written as a single character enclosed within single quotation marks Examples: ‘a’, ‘F’, ‘9’, ‘&’, ‘ ’, ‘,’

Character encoding ASCII stands for American Standard Code for Information Interchange. ASCII is one of the document coding schemes widely used today. This coding scheme allows different computers to share information easily. Most programming languages support ASCII characters

ASCII Encoding ASCII works well for English-language documents because all characters and punctuation marks are included in the ASCII codes. ASCII does not represent the full character sets of other languages.

ASCII Encoding For example, character 'O' is 79 (row value 70 + col value 9 = 79). O 9 70

Limitations of ASCII ASCII uses 8 bits to represent a single character –One bit is reserved for the sign in standard ASCII –This leaves 2 7 (128) unique combinations of bits to represent characters –The extended ASCII set uses all 8 bits to represent a character, given 256 unique combinations

Unicode Encoding The Unicode Worldwide Character Standard (Unicode) supports the interchange, processing, and display of the written texts of diverse languages. Java uses the Unicode standard for representing char constants. Each Unicode character occupies 16 bits, allowing for the possibility of 2 16 (65,536) unique bit combinations Currently 34,168 distinct characters are defined, covering most of the major world languages

ASCII/Unicode equivalence Unicode uses the same bit combinations for the characters that exist in the ASCII set Thus, an English alphabetic character has the same numeric value in both ASCII and Unicode

Special characters Several keys on a standard keyboard don’t translate directly into printable (or displayable) characters For example, the Enter key moves the cursor to a new line; we already know that the character that corresponds to this action can be represented as ‘\n’

Special characters Some other special characters used in Java include: –‘\t’: horizontal tab character –‘\a’: alarm “character” – causes system speaker to beep –‘\\’: a single backslash

Converting between char and int char ch1 = 'X'; System.out.println(ch1); System.out.println( (int) ch1); X 88 We can convert between a numeric (int) value and its corresponding ASCII character equivalent by using type casting, as the examples below illustrate: int x = 99; System.out.println(x);// prints 99 System.out.println( (char) x);// prints c

Character comparison Values of type char can be compared just like integers are compared, since they are actually stored as binary whole numbers In the ASCII (and Unicode) set, uppercase letters have lower numeric value than lowercase letters So, for example, ‘A’ is less than ‘a’, and ‘b’ is greater than ‘Z’

Strings A string is a sequence of characters that is treated as a single value. Instances of the String class are used to represent strings in Java. We access individual characters of a string by calling the charAt method of the String object.

Strings Each character in a string has an index we use to access the character. Java uses zero-based indexing; the first character’s index is 0, the second is 1, and so on. To refer to the first character of the word name, we say name.charAt(0)

String indexing with charAt method An indexed expression is used to refer to individual characters in a string.

Constructing strings Since String is a class, we can create an instance of a class by using the new method. –The statements we have used so far, such as String name1 = “Kona”; –works as a shorthand for String name1 = new String(“Kona”); –But this shorthand works for the String class only.

Example: Counting Vowels char letter; String name = JOptionPane.showInputDialog(null,"Your name:"); int numberOfCharacters = name.length(); int vowelCount= 0; for (int i = 0; i < numberOfCharacters; i++) { letter = name.charAt(i); if (letter == 'a' || letter == 'A' || letter == 'e' || letter == 'E' || letter == 'i' || letter == 'I' || letter == 'o' || letter == 'O' || letter == 'u' || letter == 'U' ) { vowelCount++; } System.out.print(name + ", your name has " + vowelCount + " vowels"); Here’s the code to count the number of vowels in the input string.

Example: Counting ‘Java’ int javaCount = 0; boolean repeat = true; String word; while ( repeat ) { word = JOptionPane.showInputDialog(null,"Next word:"); if ( word.equals("STOP") ) { repeat = false; } else if ( word.equalsIgnoreCase("Java") ) { javaCount++; } Continue reading words and count how many times the word Java occurs in the input, ignoring the case. Notice how the comparison is done. We are not using the == operator.

Other Useful String Operators MethodMeaning compareTo Compares the two strings. str1.compareTo( str2 ) substring Extracts the a substring from a string. str1.substring( 1, 4 ) trim Removes the leading and trailing spaces. str1.trim( ) valueOf Converts a given primitive data value to a string. String.valueOf( ) startsWith Returns true if a string starts with a specified prefix string. str1.startsWith( str2 ) endsWith Returns true if a string ends with a specified suffix string. str1.endsWith( str2 )

Comparing Strings Comparing String objects is similar to comparing other objects. The equality test (==) is true if the contents of the variables are the same. For a reference data type, the equality test is true if both variables refer to the same object, because they both contain the same address. Thus, the “contents of the variable” does not mean “the sequence of characters in the String”

Comparing Strings We don’t usually use the == operator to compare Strings The equals method is true if the String objects to which the two variables refer contain the same string value. String s1 = new String (“hello”); String s2 = new String (“hello”); if (s1 == s2) System.out.println (“They are equal”);// this won’t print if (s1.equals(s2)) System.out.println (“No, really, they are”);// this will print

The difference between the equality test and the equals method

… continued

Comparing Strings String comparison may be done in several ways. –The methods equals and equalsIgnoreCase compare string values; one is case-sensitive and one is not. –The method compareTo returns a value: Zero (0) if the strings are equal. A negative integer if the first string is less than the second. A positive integer if the first string is greater than the second.

Comparing Strings As long as a new String object is created using the new operator, the rule for comparing objects applies to comparing strings. String str = new String (“Java”); If the new operator is not used, string data are treated as if they are of the primitive data type. String str = “Java”;

The difference between using and not using the new operator for String

Pattern Matching and Regular Expressions Pattern matching is a common function in many applications. In Java 2 SDK 1.4, two new classes, Pattern and Matcher, are added. The String class also includes several new methods that support pattern matching.

Pattern Example Suppose students are assigned a three-digit code: –The first digit represents the major (5 indicates computer science); –The second digit represents either in-state (1), out-of-state (2), or foreign (3); –The third digit indicates campus housing: On-campus dorms are numbered 1-7. Students living off-campus are represented by the digit 8. The 3-digit pattern to represent computer science majors living on-campus is 5[123][1-7] first character is 5 second character is 1, 2, or 3 third character is any digit between 1 and 7

Pattern Matching and Regular Expression The pattern is called a regular expression that allows us to denote a large set of “words” (any sequence of symbols) succinctly. Brackets [ ] represent choices, so [abc] means a, b, or c. For example, the definition for a valid Java identifier may be stated as [a-zA-Z][a-zA-Z0-9_$]*

Regular Expressions Rules –The brackets [ ] represent choices –The asterisk symbol * means zero or more occurrences. –The plus symbol + means one or more occurrences. –The hat symbol ^ means negation. –The hyphen – means ranges. –The parentheses ( ) and the vertical bar | mean a range of choices for multiple characters.

Regular Expression Examples ExpressionDescription [013] A single digit 0, 1, or 3. [0-9][0-9] Any two-digit number from 00 to 99. [0-9&&[^4567]] A single digit that is 0, 1, 2, 3, 8, or 9. [a-z0-9] A single character that is either a lowercase letter or a digit. [a-zA-z][a-zA-Z0-9_$]* A valid Java identifier consisting of alphanumeric characters, underscores, and dollar signs, with the first character being an alphabet. [wb](ad|eed) Matches wad, weed, bad, and beed. (AZ|CA|CO)[0-9][0-9] Matches AZxx, CAxx, and COxx, where x is a single digit.

More Examples ExpressionDescription X{N}Repeat X exactly N times, where X is a regular expression for a single character. X{N,}Repeat X at least N times. X{N,M}Repeat X at least N but no more than M times.

Pattern Matching and Regular Expression The matches method from the String class is similar to the equals method. However, unlike equals, the argument to matches can be a pattern.

Pattern Matching and Regular Expression The period symbol (.) is used to match any character except a line terminator (\n or \r). String document; document =...; //assign text to ‘document’ if (document.matches(“.*zen of objects.*”){ System.out.println(“Found”); } else { System.out.println(“Not found”); }

Pattern Matching and Regular Expression Brackets ([ ]) are used for expressing a range of choices for a given character. To express a range of choices for multiple characters, use parentheses and the vertical bar.

Pattern Matching and Regular Expression ExpressionDescription [wb](ad|eed) Matches wad, weed, bad, and beed. (pro|anti)-OOP Matches pro-OOP and anti-OOP (AZ|CA|CO)[0-9]{4} Matches AZxxxx, CAxxxx, and COxxxx, where x is a single digit.

Pattern Matching and Regular Expression The replaceAll method is new to the Version 1.4 String class. This method allows us to replace all occurrences of a substring that matches a given regular expression with a given replacement string.

Pattern Matching and Regular Expression For example, to replace all vowels in a string with symbol: String originalText, modifiedText; originalText =...; //assign string to ‘originalText’ modifiedText = Note that this method does not change the original text; it simply returns a modified text as a separate string.

Pattern Matching and Regular Expression To match a whole word, use the \b symbol to designate the word boundary. str.replaceAll(“\\btemp\\b”, “temporary”); Two backslashes are necessary because we must write the expression in a String representation. Two backslashes prevents the system from interpreting the regular expression backslash as a control character.

Pattern Matching and Regular Expression The backslash is also used to search for a command character. For example: –To search for the plus symbol (+) in text, we use the backslash as \+. –To express it as a string, we write “\\+”.

The Pattern and Matcher Classes The matches and replaceAll methods of the String class are shorthand for using the Pattern and Matcher classes from the java.util.regex package.

The Pattern and Matcher Classes If str and regex are String objects, then both str.matches(regex); and Pattern.matches(regex, str); are equivalent to Pattern pattern = Pattern.compile(regex); Matcher matcher = p.matcher(str); matcher.matches();

The Pattern and Matcher Classes Creating Pattern and Matcher objects gives us more options and efficiency. The compile method of the Pattern class converts the stated regular expression to an internal format to carry out the pattern- matching operation. This conversion is carried out every time the matches method of the String or Pattern class is executed.

The Pattern and Matcher Classes /* Chapter 9 Sample Program: Checks whether the input string is a valid identifier. This version uses the Matcher and Pattern classes. File: Ch9MatchJavaIdentifier2.java */ import javax.swing.*; import java.util.regex.*; class Ch9MatchJavaIdentifier2 { private static final String STOP = STOP"; private static final String VALID ="Valid Java identifier"; private static final String INVALID ="Not a valid Java identifier";

The Pattern and Matcher Classes private static final String VALID_IDENTIFIER_PATTERN = "[a-zA-Z][a-zA-Z0-9_$]*"; public static void main (String[] args) { String str, reply; Matcher matcher; Pattern pattern = Pattern.compile(VALID_IDENTIFIER_PATTERN); while (true) { str = JOptionPane.showInputDialog null,"Identifier:"); if (str.equals(STOP)) break;

The Pattern and Matcher Classes matcher = pattern.matcher(str); if (matcher.matches()) { reply = VALID; } else { reply = INVALID; } JOptionPane.showMessageDialog(null, str + ":\n" + reply); } // ends loop } // ends main } // ends class

The Pattern and Matcher Classes The find method is another powerful method of the Matcher class. The method searches for the next sequence in a string that matches the pattern, and returns true if the pattern is found.

The Pattern and Matcher Classes When a matcher finds a matching sequence of characters, we can query the location of the sequence by using the start and end methods.

The Pattern and Matcher Classes The start method returns the position in the string where the first character of the pattern is found. The end method returns the value 1 more than the position in the string where the last character of the pattern is found.

The String Class is Immutable In Java a String object is immutable –This means once a String object is created, it cannot be changed, such as replacing a character with another character or removing a character –The String methods we have used so far do not change the original string. They created a new string from the original. For example, substring creates a new string from a given string. The String class is defined in this manner for efficiency reasons.

Effect of Immutability We can do this because String objects are immutable.

The StringBuffer Class In many string processing applications, we would like to change the contents of a string. In other words, we want it to be mutable. Manipulating the content of a string, such as replacing a character, appending a string with another string, deleting a portion of a string, and so on, may be accomplished by using the StringBuffer class.

StringBuffer Example StringBuffer word = new StringBuffer("Java"); word.setCharAt(0, 'D'); word.setCharAt(1, 'i'); Changing a string Java to Diva word : StringBuffer Java Before word : StringBuffer Diva After word.setCharAt(0, 'D'); word.setCharAt(1, 'i');

Sample Processing Replace all vowels in the sentence with ‘X’. char letter; String inSentence = JOptionPane.showInputDialog(null, "Sentence:"); StringBuffer tempStringBuffer = new StringBuffer(inSentence); int numberOfCharacters = tempStringBuffer.length(); for (int index = 0; index < numberOfCharacters; index++) { letter = tempStringBuffer.charAt(index); if (letter == 'a' || letter == 'A' || letter == 'e' || letter == 'E' || letter == 'i' || letter == 'I' || letter == 'o' || letter == 'O' || letter == 'u' || letter == 'U' ) { tempStringBuffer.setCharAt(index,'X'); } JOptionPane.showMessageDialog(null, tempStringBuffer );

The append and insert Methods We use the append method to append a String or StringBuffer object to the end of a StringBuffer object. –The method can also take an argument of the primitive data type. –Any primitive data type argument is converted to a string before it is appended to a StringBuffer object. We can insert a string at a specified position by using the insert method.

The StringBuilder Class This class is new to Java 5.0 (SDK 1.5) The class is added to the newest version of Java to improve the performance of the StringBuffer class. StringBuffer and StringBuilder support exactly the same set of methods, so they are interchangeable.

The StringBuilder Class There are advanced cases where we must use StringBuffer, but all sample applications in the book, StringBuilder can be used. Since the performance is not our main concern and that the StringBuffer class is usable for all versions of Java, we will use StringBuffer only in this book.