Text Manipulation Chapter 7 Attaway MATLAB 5E.

Slides:



Advertisements
Similar presentations
 2003 Prentice Hall, Inc. All rights reserved Fundamentals of Characters and Strings Character constant –Integer value represented as character.
Advertisements

Strings.
Lecture 9. Lecture 9: Outline Strings [Kochan, chap. 10] –Character Arrays/ Character Strings –Initializing Character Strings. The null string. –Escape.
What is a pointer? First of all, it is a variable, just like other variables you studied So it has type, storage etc. Difference: it can only store the.
Current Assignments Homework 5 will be available tomorrow and is due on Sunday. Arrays and Pointers Project 2 due tonight by midnight. Exam 2 on Monday.
Chapter 7. Even in quantitative sciences, we often encounter letters and/or words that must be processed by code You may want to write code that: Reads.
Additional Data Types: 2-D Arrays, Logical Arrays, Strings Selim Aksoy Bilkent University Department of Computer Engineering
EGR 106 – Week 2 – Arrays Definition, size, and terminology Construction methods Addressing and sub-arrays Some useful functions for arrays Character arrays.
Chapter 8 Arrays and Strings
Additional Data Types: Strings Selim Aksoy Bilkent University Department of Computer Engineering
MATLAB Strings Selim Aksoy Bilkent University Department of Computer Engineering
Introduction to Array The fundamental unit of data in any MATLAB program is the array. 1. An array is a collection of data values organized into rows and.
AN ENGINEER’S GUIDE TO MATLAB
C How to Program, 6/e © by Pearson Education, Inc. All Rights Reserved.
1 Computer Programming (ECGD2102 ) Using MATLAB Instructor: Eng. Eman Al.Swaity Lecture (3): MATLAB Environment (Chapter 1)
CSE123 Lecture 6 String Arrays. Character Strings In Matlab, text is referred to as character strings. String Construction Character strings in Matlab.
Characters The data type char represents a single character in Java. –Character values are written as a symbol: ‘a’, ‘)’, ‘%’, ‘A’, etc. –A char value.
Strings CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
Strings The Basics. Strings can refer to a string variable as one variable or as many different components (characters) string values are delimited by.
COMP 116: Introduction to Scientific Programming Lecture 24: Strings in MATLAB.
An Introduction to Java Programming and Object-Oriented Application Development Chapter 7 Characters, Strings, and Formatting.
Dale Roberts Department of Computer and Information Science, School of Science, IUPUI C-Style Strings Strings and String Functions Dale Roberts, Lecturer.
Chapter 8 Strings. Copyright ©2004 Pearson Addison-Wesley. All rights reserved.9-2 Strings stringC implements the string data structure using arrays of.
More Strings CS303E: Elements of Computers and Programming.
Representing Strings and String I/O. Introduction A string is a sequence of characters and is treated as a single data item. A string constant, also termed.
1. Comparing Strings 2. Converting strings/numbers 3. Additional String Functions Strings Built-In Functions 1.
C++ Programming Lecture 19 Strings The Hashemite University Computer Engineering Department (Adapted from the textbook slides)
CS 170 – INTRO TO SCIENTIFIC AND ENGINEERING PROGRAMMING.
Strings Characters and Sentences 1.Overview 2.Creating Strings 3.Slicing Strings 4.Searching for substrings 1.
1 Arrays and Pointers The name of an array is a pointer constant to the first element. Because the array’s name is a pointer constant, its value cannot.
CSE 251 Dr. Charles B. Owen Programming in C1 Strings and File I/O.
ECE 103 Engineering Programming Chapter 29 C Strings, Part 2 Herbert G. Mayer, PSU CS Status 7/30/2014 Initial content copied verbatim from ECE 103 material.
LECTURE 5 Strings. STRINGS We’ve already introduced the string data type a few lectures ago. Strings are subtypes of the sequence data type. Strings are.
1 Agenda  Unit 7: Introduction to Programming Using JavaScript T. Jumana Abu Shmais – AOU - Riyadh.
Winter 2016CISC101 - Prof. McLeod1 CISC101 Reminders Quiz 3 this week – last section on Friday. Assignment 4 is posted. Data mining: –Designing functions.
Strings in Python String Methods. String methods You do not have to include the string library to use these! Since strings are objects, you use the dot.
Lesson 4 String Manipulation. Lesson 4 In many applications you will need to do some kind of manipulation or parsing of strings, whether you are Attempting.
Introduction Programs which manipulate character data don’t usually just deal with single characters, but instead with collections of them (e.g. words,
Do-more Technical Training
Excel STDEV.S Function.
String Methods Programming Guides.
Topics Designing a Program Input, Processing, and Output
Strings CSCI 112: Programming in C.
C Characters and Strings
COMP 170 – Introduction to Object Oriented Programming
Arrays: Checkboxes and Textareas
String class.
Other Kinds of Arrays Chapter 11
Other Kinds of Arrays Chapter 11
Lecture 5 Strings.
Chapter 8 - Characters and Strings
Vectors and Matrices Chapter 2 Attaway MATLAB 4E.
Variables In programming, we often need to have places to store data. These receptacles are called variables. They are called that because they can change.
Introduction to C++ Programming
Week 9 – Lesson 1 Arrays – Character Strings
Introduction to MATLAB
String Manipulation Chapter 7 Attaway MATLAB 4E.
T. Jumana Abu Shmais – AOU - Riyadh
Chapter 8 Data Structures: Cell Arrays and Structures
Coding Concepts (Data- Types)
Vectors and Matrices Chapter 2 Attaway MATLAB 4E.
Text functions.
Topics Designing a Program Input, Processing, and Output
Topics Designing a Program Input, Processing, and Output
Introduction to Computer Science
C++ Programming Lecture 20 Strings
Python Strings.
MATLAB Programs Chapter 6 Attaway MATLAB 5E.
Chapter 8 Data Structures: Cell Arrays and Structures, Sorting
Electrical and Computer Engineering Department SUNY – New Paltz
Presentation transcript:

Text Manipulation Chapter 7 Attaway MATLAB 5E

Text Terminology Text in MATLAB can be represented using: character vectors (in single quotes) string arrays, introduced in R2016b (in double quotes) Many functions that manipulate text can be called using either a character vector or a string Additionally, new functions have been created for the string type Prior to R2016b, the word “string” was used for what are now called character vectors

Characters Both character vectors and string scalars are composed of groups of characters Characters include letters of the alphabet, digits, punctuation marks, white space, and control characters Control characters are characters that cannot be printed, but accomplish a task (such as a backspace or tab) White space characters include the space, tab, newline, and carriage return Leading blanks are blank spaces at the beginning Trailing blanks are blank spaces at the end Individual characters in single quotation marks are the type char (so, the class is char) Function newline returns a newline character

Character Vectors Character vectors: consist of any number of characters are contained in single quotation marks are displayed using single quotation marks are of the type, or class, char Examples: ‘x’, ‘ ’, ‘abc’, ‘hi there’ Since these are vectors of characters, many built-in functions and operators that we’ve seen already work with character vectors as well as numbers – e.g., length to get the length of a character vector, or the transpose operator You can also index into a character vector to get individual characters or to get subsets (called a substring for a character vector or a string)

String Scalars A string scalar is a single string, and is used to store a group of characters such as a word Strings can be created using double quotes, e.g. “Awesome” using the string function, e.g. string(‘Awesome’) Strings are displayed using double quotes The class of a string scalar is ‘string’

Dimensions of character vectors A single character is a 1 x 1 scalar >> letter = 'x'; >> size(letter) ans = 1 1 A character vector is 1 x n where n is the length >> myword = 'Hello'; >> size(myword) 1 5

Dimensions of strings A string is a 1 x 1 scalar >> mystr = "Awesome"; >> size(mystr) ans = 1 1 A character vector is contained in a string; curly braces can be used to index: >> mystr(1) "Awesome" >> mystr{1} 'Awesome' >> mystr{1}(1) 'A'

Length of strings We have seen that the length function can be used to find the number of characters in a character vector However, since a string is a 1 x 1 scalar, the length of a string is 1 Use the strlength function for strings: >> length("Ciao") ans = 1 >> strlength("Ciao") 4 Even an empty string is a 1 x 1 scalar, so the length is 1 (not 0)

Groups of strings Groups of strings can be stored in string arrays character matrices cell arrays (in Chapter 8) String arrays are very easy to use: >> greets = ["hi", "ciao"] greets = 1×2 string array "hi" "ciao" >> greets(1) ans = "hi"

Character matrices Creating a column vector of character vectors actually creates a character matrix (a matrix in which every element is a single character) There are 2 ways to do this: Using [ ] and separating with semicolons Using char Since all rows in a matrix must have the same number of characters, shorter character vectors must be padded with blank spaces so that all are the same length; the built-in function char will do that automatically So, string arrays are so much easier!!

Creating Character Matrices Both [ ] and char can be used to create a matrix in which every row has a character vector: >> cmat = ['Hello';'Hi '; 'Ciao ']; >> cmat = char('Hello', 'Hi', 'Ciao’); Both of these will create a matrix cmat: Shorter strings are padded with blanks e.g. ‘Hi ’ Again, string arrays are so much easier! H e l o i C a

Operations on Character Vectors Character vectors can be created using single quotes The input function creates a character vector (don’t forget the second argument, ‘s’) >> cvword = input('Enter a word: ','s') Enter a word: science cvword = 'science' The blanks function creates a character vector with n blank spaces >> bb = blanks(3) bb = ' '

Operations on Strings Strings are created using double quotes or by passing a character vector to the string function Without any arguments, the string function creates an empty string (which, don’t forget, is still a 1 x 1 scalar so the length is 1) The plus function or operator can join, or concatenate, two strings together >> "abc" + "xyz" ans = "abcxyz"

Text Functions Many functions can have either character vectors or strings as input arguments Generally, if a character vector is passed to the function, a character vector will be returned, and if a string is passed, a string will be returned For example, changing case using upper or lower: >> upper('abc') ans = 'ABC' >> lower("HELP") "help"

Removing Characters deblank removes trailing blanks strtrim removes leading and trailing blanks (Note: neither function removes blanks in the middle of strings) strip removes leading and/or trailing characters; either whitespace characters by default or else a specified character erase removes all occurrences of a substring within a string or character vector (so could be used to erase blanks in the middle of strings)

String Concatenation There are several ways to concatenate, or join, strings (using +) and character vectors (using [ ]) >> "abc"+"d" ans = "abcd" >> ['abc' 'd'] 'abcd' The strcat function can be used; it will remove trailing blanks for character vectors but not for strings >> strcat('Hi ', 'everyone') 'Hieveryone' >> strcat("Hi ", "everyone") "Hi everyone”

The sprintf function sprintf works just like fprintf, but instead of printing, it creates text– so it can be used to customize the format of text So, sprintf can be used to create customized text to pass to other functions (e.g., title, input) >> maxran = randi([1, 50]); >> prompt = sprintf('Enter an integer from 1 to %d: ', maxran); >> mynum = input(prompt); Enter an integer from 1 to 46: 33 Any time text is required as an input, sprintf can create customized text

String Comparisons strcmp compares two strings or character vectors and returns logical 1 if they are identical or 0 if not (or not the same length) Variations: strncmp compares only the first n characters strcmpi ignores case (upper or lower) strncmpi compares n characters, ignoring case

The equality operator To use the equality operator with character vectors, they must be the same length, and each element will be compared. For strings, however, it will simply return 1 or 0: >> 'cat' == 'car' ans = 1 1 0 >> 'cat' == 'mouse' Matrix dimensions must agree. >> "cat" == "car" >> "cat" == "mouse"

Find and replace functions Note that the word “string” here can mean string or character vector strfind(string, substring): finds all occurrences of the substring within the string; returns a vector of the indices of the beginning of the strings, or an empty vector if the substring is not found strrep(string, oldsubstring, newsubstring): finds all occurrences of the old substring within the string, and replaces with the new substring the old and new substrings can be different lengths count(string, substring): counts the number of occurrences of a substring within a string

The strtok function The strtok function takes a string and breaks it into two pieces and returns both strings It looks for a delimiter (by default a blank space) and returns a token which is the beginning of the string up to the delimiter, and also the rest of the string, including the delimiter A second argument can be passed for the delimiter So – no characters are lost; all characters from the original string are returned in the two output strings Since the function returns two strings, the call to strtok should be in an assignment statement with two variables on the left to store the two strings

Examples of strtok >> mystring = 'Isle of Skye'; >> [first, rest] = strtok(mystring) first = Isle rest = of Skye >> length(rest) ans = 8 >> [f, r] = strtok(rest, 'y') f = of Sk r = ye

Functions of String Arrays These functions work with string arrays, but not character vectors: strings: preallocates a string array to empty strings strjoin: concatenates strings in a string array together strsplit: splits a string into string scalars join: concatenates strings in corresponding elements of columns in a string array Also, the plus function or operator can concatenate the same string to all string scalars in a string array

“is” Functions to determine type ischar true if the input argument is a character vector isstring true if the input argument is a string isStringScalar true if the input argument is a string scalar

“is” and true/false Functions functions for strings and character vectors: isletter true if the input argument is a letter of the alphabet isspace true if the input argument is a white space character isstrprop determines whether the characters in a string are in a category specified by second argument, e.g. ‘alphanumeric’ contains determines whether a substring is somewhere within a string endsWith/startsWith determine whether text ends with (or starts with) a specified string

String/Number Functions Converting from strings to numbers and vice versa: int2str converts from an integer to a character vector storing the integer num2str converts a real number to a character vector containing the number string converts number(s) to strings str2num (and str2double) converts from a string or character vector containing number(s) to a number array Note: this is different from converting to/from ASCII equivalents

Common Pitfalls Trying to use == to compare character vectors for equality, instead of the strcmp function (and its variations) Confusing sprintf and fprintf. The syntax is the same, but sprintf creates a string whereas fprintf prints  Trying to create a vector of character vectors with varying lengths (use string arrays instead) Forgetting that when using strtok, the second argument returned (the “rest” of the string) contains the delimiter.

Programming Style Guidelines Make sure the correct string comparison function is used; for example, strcmpi if ignoring case is desired