Text Processing and Regex API

Slides:



Advertisements
Similar presentations
Software University Curriculum, Courses, Exams, Jobs SoftUni Team Technical Trainers Software University
Advertisements

Methods, Arrays, Lists, Dictionaries, Strings, Classes and Objects
Svetlin Nakov Technical Trainer Software University
Test-Driven Development Learn the "Test First" Approach to Coding SoftUni Team Technical Trainers Software University
Java Collections Basics Arrays, Lists, Strings, Sets, Maps Svetlin Nakov Technical Trainer Software University
Java Collections Arrays, Lists, Strings, Sets, Maps SoftUni Team Technical Trainers Software University
Regular Expressions /^Hel{2}o\s*World\n$/ SoftUni Team Technical Trainers Software University
Java Collections Basics Arrays, Lists, Strings, Sets, Maps Bogomil Dimitrov Technical Trainer Software University
Inheritance Class Hierarchies SoftUni Team Technical Trainers Software University
Stacks and Queues Processing Sequences of Elements SoftUni Team Technical Trainers Software University
Generics SoftUni Team Technical Trainers Software University
Strings and Text Processing
XML Processing SoftUni Team Database Applications Technical Trainers
Version Control Systems
Static Members and Namespaces
Functional Programming
Databases basics Course Introduction SoftUni Team Databases basics
C# Basic Syntax, Visual Studio, Console Input / Output
ASP.NET Essentials SoftUni Team ASP.NET MVC Introduction
C# Basic Syntax, Visual Studio, Console Input / Output
Introduction to MVC SoftUni Team Introduction to MVC
Deploying Web Application
PHP MVC Frameworks Course Introduction SoftUni Team Technical Trainers
PHP Fundamentals Course Introduction SoftUni Team Technical Trainers
Reflection SoftUni Team Technical Trainers Java OOP Advanced
/^Hel{2}o\s*World\n$/
Classes, Properties, Constructors, Objects, Namespaces
Mocking tools for easier unit testing
Parsing JSON JSON.NET, LINQ-to-JSON
PHP MVC Frameworks MVC Fundamentals SoftUni Team Technical Trainers
Processing Sequences of Elements
/^Hel{2}o\s*World\n$/
EF Relations Object Composition
Entity Framework: Code First
Repeating Code Multiple Times
Data Definition and Data Types
Databases advanced Course Introduction SoftUni Team Databases advanced
Arrays, Lists, Stacks, Queues
Install and configure theme
Balancing Binary Search Trees, Rotations
Entity Framework: Relations
Fast String Manipulation
Methods, Arrays, Lists, Dictionaries, Strings, Classes and Objects
Array and List Algorithms
Functional Programming
ASP.NET Razor Engine SoftUni Team ASP.NET MVC Introduction
Processing Variable-Length Sequences of Elements
Regular Expressions (RegEx)
Functional Programming and Stream API
C# Advanced Course Introduction SoftUni Team C# Technical Trainers
Databases Advanced Course Introduction SoftUni Team Databases Advanced
Built-in Functions. Usage of Wildcards
Data Definition and Data Types
Multidimensional Arrays, Sets, Dictionaries
Extending functionality using Collections
Exporting and Importing Data
Language Comparison Java, C#, PHP and JS SoftUni Team
Functional Programming
C# Advanced Course Introduction SoftUni Team C# Technical Trainers
Exporting and Importing Data
Introduction to TypeScript & Angular
CSS Transitions and Animations
Train the Trainers Course
Iterators and Comparators
Version Control Systems
JavaScript Frameworks & AngularJS
/^Hel{2}o\s*World\n$/
CSS Transitions and Animations
Iterators and Generators
Multidimensional Arrays
Presentation transcript:

Text Processing and Regex API String operations and regular expressions Java SoftUni Team Technical Trainers Software University http://softuni.bg

Table of Contents Working with strings Regular expressions Regex API in Java

Working with strings string s = "Hello, SoftUni!"; H e l o , S f t U n 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Working with strings

The String Data Type String The String data type: Declared by the String class Represents a sequence of characters Has a default value null (no value) Strings are enclosed in quotes: Strings can be concatenated Using the + operator String String s = "Hello, Java";

The String Data Type (2) Strings in Java Know their number of characters: length() Can be accessed by index: charAt(0 … length()-1) Reference types Stored in the heap (dynamic memory) Can have null value (missing value) Strings cannot be modified (immutable) Most string operations return a new String instance StringBuilder class is used to build stings

Strings – Examples String str = "SoftUni"; System.out.println(str); for (int i = 0; i < str.length(); i++) { System.out.printf("str[%d] = %s\n", i, str.charAt(i)); } System.out.println(str.indexOf("Uni")); // 4 System.out.println(str.indexOf("uni")); // -1 System.out.println(str.substring(4, 7)); // Uni System.out.println(str.replace("Soft", "Hard")); // HardUni System.out.println(str.toLowerCase()); // softuni // example continues

Strings – Examples (2) String str = "SoftUni"; System.out.println(str.toUpperCase()); // SOFTUNI System.out.println(str.charAt(4)); // U System.out.println(str.contains("Soft")); // true System.out.println(str.contains("soft")); // false System.out.println(str.startsWith("soft")); // false System.out.println(str.endsWith("Uni")); // true str = " SoftUni "; System.out.println(str.trim()); // SoftUni System.out.println(str.lastIndexOf(' '); // 8

Strings – Examples (3) String firstName = "Steve"; String lastName = "Jobs"; int age = 56; System.out.println(firstName + " " + lastName + " (age: " + age + ")"); // Steve Jobs (age: 56) String allLangs = "C#, Java; HTML, CSS; PHP, SQL"; String[] langs = allLangs.split("[, ;]+"); for (String lang : langs) { System.out.println(lang); } System.out.println("Langs = " + String.join(", ", langs)); System.out.println(" \n\n Software University ".trim());

Strings – Examples (4) String concatenation (+) is a costly operation and slows down performance a lot when used inside a loop Use a StringBuilder instead StringBuilder input = new StringBuilder(); while (hasInput) { input.append(scanner.nextLine()); }

Comparing Strings in Java The == operator does not work correctly for strings! Use String.equals(String) and String.compareTo(String) String[] words = "yes yes".split(" "); System.out.println("words[0] = " + words[0]); // yes System.out.println("words[1] = " + words[0]); // yes System.out.println(words[0] == words[1]); // false System.out.println(words[0].equals(words[1])); // true System.out.println("Alice".compareTo("Mike")); // < 0 System.out.println("Alice".compareTo("Alice")); // == 0 System.out.println("Mike".compareTo("Alice")); // > 0

Regex API

Regular Expressions Regular expressions match text by pattern, e.g. [0-9]+ matches a non-empty sequence of digits [a-zA-Z]* matches a sequence of letters (including empty) [A-Z][a-z]+ [A-Z][a-z]+ matches a name (first name + space + last name) \s+ matches any whitespace; \S+ matches non-whitespace \d+ matches digits; \D+ matches non-digits \w+ matches letters (Unicode); \W+ matches non-letters \+\d{1,3}([ -]*[0-9]+)+ matches international phone

Validation by Regular Expression – Example import java.util.regex.*; … String regex = "\\+\\d{1,3}([ -]*[0-9]+)+"; System.out.println("+359 2 981-981".matches(regex)); // true System.out.println("invalid number".matches(regex)); // false System.out.println("+359 123-".matches(regex)); // false System.out.println("+359 (2) 981 981".matches(regex)); // false System.out.println("+44 280 11 11".matches(regex)); // true System.out.println("++44 280 11 11".matches(regex)); // false System.out.println("(+49) 325 908 44".matches(regex)); // false System.out.println("+49 325 908-40-40".matches(regex)); // true

Regex API Pattern – a compiled representation of a regular expression. Matcher – an engine that performs match operations on a character sequence by interpreting a Pattern. import java.util.regex.*; … String text = "Hello, my number in Sofia is +359 894 11 22 33, " + "but in Munich my number is +49 89 975-99222."; Pattern phonePattern = Pattern.compile( "\\+\\d{1,3}([ -]*([0-9]+))+"); Matcher matcher = phonePattern.matcher(text); while (matcher.find()) { System.out.println(matcher.group()); }

Matcher methods matches – attempts to match the entire input. lookingAt – attempts to match the input sequence, starting at the beginning. find – scans the input sequence looking for the next subsequence that matches the pattern. start – returns the start index of the previous match. end – returns the offset after the last character matched. group – returns the input subsequence matched by the previous match.

Regex API Live Demo

Summary Working with strings StringBuilder Regular Expressions Syntax, groups, special characters Regex API

Text Processing and Regex API https://softuni.bg/courses/java-basics/ © Software University Foundation – http://softuni.org This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike license.

License This course (slides, examples, demos, videos, homework, etc.) is licensed under the "Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International" license Attribution: this work may contain portions from "Fundamentals of Computer Programming with Java" book by Svetlin Nakov & Co. under CC-BY-SA license "C# Basics" course by Software University under CC-BY-NC-SA license © Software University Foundation – http://softuni.org This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike license.

Free Trainings @ Software University Software University Foundation – softuni.org Software University – High-Quality Education, Profession and Job for Software Developers softuni.bg Software University @ Facebook facebook.com/SoftwareUniversity Software University @ YouTube youtube.com/SoftwareUniversity Software University Forums – forum.softuni.bg