

Strings As well as tuples and ranges, there are two additional important immutable sequences: Bytes (immutable sequences of 8 ones and zeros (usually represented.


1 Strings
As well as tuples and ranges, there are two additional important immutable sequences:
Bytes (immutable sequences of bytes, each byte being 8 ones and zeros, usually represented as ints between 0 and 255 inclusive, as 11111111 is 255 as an int; byte arrays are the mutable version).
Strings (text).
Many languages have a primitive type which is an individual character. Python doesn't: a str (the string type) is just a sequence of other str objects, each one character long.
"The array module provides an array() object that is like a list that stores only homogeneous data and stores it more compactly. The following example shows an array of numbers stored as two byte unsigned binary numbers (typecode "H") rather than the usual 16 bytes per entry for regular lists of Python int objects:"
"Python strings cannot be changed - they are immutable. Therefore, assigning to an indexed position in the string results in an error:"
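
The sequence types named on this slide can be compared side by side. The following is a minimal sketch rather than part of the original slides; the variable names are illustrative, and the array values are the ones used in the Python tutorial's example for typecode "H".

```python
from array import array

# bytes: an immutable sequence of ints in the range 0..255.
data = bytes([104, 105, 255])

# bytearray: the mutable version, so item assignment is allowed.
buffer = bytearray(data)
buffer[0] = 72  # 72 is the code for "H"

# str: subscripting a string gives another str of length one,
# not a separate character type.
text = "hi"
first = text[0]

# array: homogeneous, compactly stored values.
# Typecode "H" is an unsigned integer of at least two bytes.
compact = array("H", [4000, 10, 700, 22222])
total = sum(compact)
```

Trying `data[0] = 72` on the bytes object raises a TypeError, just as it does for a str.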

2 Strings
It may seem odd that they are immutable, but this helps with memory management. If you "change" a str, a new one is created and the label is pointed at it; the old one is discarded once nothing else refers to it.
>>> a = "hello world"
>>> a = "hello globe" # New string (and label).
>>> a = str(2) # String "2" as text.
>>> a[0] # Subscription.
'2'
>>> a[0] = "m" # Attempted assignment.
TypeError: 'str' object does not support item assignment
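
A short sketch of what immutability means in practice, assuming nothing beyond built-in str behaviour: a second label keeps seeing the old string, and a "changed" string is really a new one built from pieces of the old.

```python
a = "hello world"
b = a                             # A second label for the same string object.
a = a.replace("world", "globe")   # Builds a new string; b is unaffected.

# Item assignment is refused, so we capture the error message instead.
try:
    a[0] = "m"
    message = ""
except TypeError as err:
    message = str(err)

# The usual workaround: build a new string from slices.
changed = "m" + a[1:]
```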

3 String Literals
String literals are formed 'content' or "content" (inline) or '''content''' or """content""" (multiline).
In multiline quotes, line ends are preserved unless the line ends with "\":
print('''This is \
all one line.
This is a second.''')
For inline quotes, you need to end the quote and start again on the next line; "+" is optional between string literals but required if variables are involved:
print("This is all " +
"one line.")
print("This is a second") # Note the two print statements.
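
The two ways of spreading a literal over several source lines can be checked directly. This is an illustrative sketch; the variable names are not from the slides.

```python
# Triple-quoted: the backslash at the end of the first line suppresses
# that line break, while the second line break is kept.
one_line = '''This is \
all one line.
This is a second.'''

# Inline quotes: adjacent literals inside parentheses are joined
# by the parser, with or without "+".
joined = ("This is all "
          "one line.")
```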

4 String concatenation (joining)
Strings can be concatenated (joined) though:
>>> a = "hello" + "world"
>>> a = "hello" "world" # "+" optional if just string literals.
>>> a
'helloworld' # Note no spaces.
To add spaces, put them inside the strings or between them:
>>> a = "hello " + "world"
>>> a = "hello" + " " + "world"
For string variables, you need "+":
>>> h = "hello"
>>> a = h + "world"

5 Immutable concatenation
But remember that each time you change an immutable type, you make a new one. This can be very inefficient, so continually adding to an immutable can take a long time. There are alternatives:
With tuples, use a list instead, and extend this (a new list isn't created each time).
With bytes, use the mutable bytearray.
With a string, build a list of strings and then, once complete, use the str.join() method that all strings have.
>>> a = ["x","y","z"]
>>> b = " ".join(a)
>>> b
'x y z'
>>> c = " and ".join(a)
>>> c
'x and y and z'
"/".join(args) # Where args is a sequence of strings: joins them with "/" between.
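
The alternatives listed above might look like this in a loop. It is a sketch under simple assumptions (small made-up inputs), intended only to show the build-then-convert pattern.

```python
# Strings: collect the pieces in a list, then join once at the end.
parts = []
for i in range(5):
    parts.append(str(i))
joined = ",".join(parts)

# The separator can be any string.
path = "/".join(["usr", "local", "bin"])

# Bytes: grow a mutable bytearray, then convert once at the end.
buf = bytearray()
for value in (72, 105):
    buf.append(value)
result = bytes(buf)
```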

6 Parsing
Often we'll need to split strings up based on some delimiter. This is known as parsing. For example, it is usual to read data files a line at a time and then parse them into numbers.

7 Split
Strings can be split by:
a = str.split(string, delimiter)
a = some_string.split(delimiter)
(There's no great difference.)
For example:
a = "Daisy, Daisy/Give me your answer, do."
b = str.split(a, " ")
As it happens, whitespace is the default delimiter.
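
Here is what the split above produces, along with the difference between an explicit " " delimiter and the whitespace default. The comma-separated record at the end is a made-up example of parsing a data line into numbers.

```python
line = "Daisy, Daisy/Give me your answer, do."
words = line.split(" ")
same = str.split(line, " ")        # The function form gives the same list.

# An explicit " " splits on every single space, so doubled spaces
# leave empty strings; the default collapses runs of whitespace.
explicit = "a  b".split(" ")
default = "a  b\n".split()

# Parsing a (hypothetical) line of a data file into numbers.
record = "1.5,2.5,3.0"
numbers = [float(field) for field in record.split(",")]
```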

8 Search and replace
str.startswith(strA, 0, len(string)) # Checks whether a string starts with strA.
str.endswith(suffix, 0, len(string)) # Checks whether a string ends with suffix. The second two params are optional start and end search locations.
str.find(strA, 0, len(string)) # Gives the index position, or -1 if not found.
str.index(strA, 0, len(string)) # As find, but raises a ValueError if not found.
rfind and rindex do the same, searching from right to left.
Once an index is found, you can use slices to extract substrings:
strB = strA[index1:index2]
lstrip([chars])/rstrip([chars]) # Removes whitespace from the left/right end.
strip([chars]) # As above, but both ends. In all three, the optional chars gives the characters to strip instead of whitespace.
str.replace(substringA, substringB, int) # Replace all occurrences of A with B. The optional final int arg controls the max number of replacements.
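
A few of these methods applied to one string, as a quick sketch; the sample text is arbitrary.

```python
text = "  hello world  "

trimmed = text.strip()             # Both ends.
left = text.lstrip()               # Left end only.
right = text.rstrip()              # Right end only.
custom = "xxhixx".strip("x")       # Strip given characters instead of whitespace.

starts = trimmed.startswith("hello")
position = trimmed.find("world")   # Index of the first match.
missing = trimmed.find("globe")    # -1 when there is no match.
last_o = trimmed.rfind("o")        # Searches from the right.

# index() behaves like find() but raises ValueError when nothing matches.
try:
    trimmed.index("globe")
    raised = False
except ValueError:
    raised = True

# Slice out the match, then replace at most two occurrences of "l".
word = trimmed[position:position + len("world")]
replaced = trimmed.replace("l", "L", 2)
```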

9 Escape characters
What if we want quotes in our strings? Use double inside single, or vice versa:
a = "It's called 'Daisy'."
a = 'You invented "Space Paranoids"?'
If you need to mix them, though, you have problems, as Python can't tell where the string ends:
a = 'It's called "Daisy".'
Instead, you have to use an escape character, a special character that is interpreted differently from how it looks. All escape characters start with a backslash; for a single quote it is simply:
a = 'It\'s called "Daisy".'

10 Escape characters
\newline      Backslash and newline ignored
\\            Backslash (\)
\'            Single quote (')
\"            Double quote (")
\b            ASCII Backspace (BS)
\f            ASCII Formfeed (FF)
\n            ASCII Linefeed (LF)
\r            ASCII Carriage Return (CR)
\t            ASCII Horizontal Tab (TAB)
\ooo          Character with octal value ooo
\xhh          Character with hex value hh
\N{name}      Character named name in the Unicode database
\uxxxx        Character with 16-bit hex value xxxx
\Uxxxxxxxx    Character with 32-bit hex value xxxxxxxx
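
Each escape sequence stands for a single character, which is easiest to see by checking lengths and comparing different spellings of the same character. A small sketch with illustrative names:

```python
quoted = 'It\'s called "Daisy".'   # Escaped single quote inside single quotes.
backslash = "\\"                    # One backslash character.
tabbed = "a\tb"                     # Three characters: a, TAB, b.

# Three ways of writing a capital A.
from_hex = "\x41"
from_octal = "\101"
from_unicode = "\u0041"

# A named Unicode character and its 16-bit hex equivalent.
alpha = "\N{GREEK SMALL LETTER ALPHA}"
alpha_hex = "\u03b1"
```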

11 String Literals
Going back to our two line example:
print("This is all " +
"one line.")
print("This is a second") # Note the two print statements.
Note that we can now rewrite this as:
print("This is all " +
"one line.\n" +
"This is a second")

12 String Literals
There are some cases where we want to display the escape characters as characters rather than escaped characters when we print or otherwise use the text. To do this, prefix the literal with "r":
>>> a = r"This contains a \\ backslash escape"
From then on, the backslashes are interpreted as two backslashes. Note that if we then echo this at the prompt, we get:
>>> a
'This contains a \\\\ backslash escape'
Note that the escape is escaped in this display; print(a) shows the two backslashes as typed.
String literal markups:
R or r is a "raw" string, escaping escapes to preserve their appearance.
F or f is a formatted string (we'll come to these).
U or u is a Python 2 legacy marking for a Unicode string; it has no effect in Python 3.
B or b is a sequence of bytes; br or rb, or any variation capitalised, is a raw sequence of bytes.
"Why can't raw strings (r-strings) end with a backslash? More precisely, they can't end with an odd number of backslashes: the unpaired backslash at the end escapes the closing quote character, leaving an unterminated string."
An example of when we might want to use this is when constructing a regex pattern. These search text on the basis of patterns which include quite a lot of backslashes, and escaping each one can make for very confusing patterns.
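
The difference between a plain and a raw literal, and the regex use case mentioned above, can be sketched as follows. The pattern and sample text are made up for illustration; `re.findall` is the standard library call.

```python
import re

plain = "\n"     # One character: a linefeed.
raw = r"\n"      # Two characters: a backslash and the letter n.

# repr() is what the interactive prompt echoes, so the single
# backslash in the raw string is displayed doubled.
shown = repr(raw)

# A pattern for decimal numbers, written without doubling each backslash.
pattern = r"\d+\.\d+"
matches = re.findall(pattern, "12.50 and 3.75")

# The same pattern written as an ordinary literal needs extra backslashes.
escaped_pattern = "\\d+\\.\\d+"
```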

13 Formatting strings
There are a wide variety of ways of formatting strings.
print( "{0} has: {1:10.2f} pounds".format(a,b) )
print('%(a)s has: %(b)10.2f pounds'%{'a':'Bob','b': })
See website for examples.
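
As a sketch of how these styles line up, the same text is built three ways here. The name and the amount 12.5 are assumed values chosen for illustration (they are not from the slide), and the f-string form is the "formatted string" prefix mentioned earlier.

```python
name = "Bob"      # Assumed example value.
amount = 12.5     # Assumed example value.

# str.format with positional fields; 10.2f means width 10, 2 decimal places.
by_format = "{0} has: {1:10.2f} pounds".format(name, amount)

# %-formatting with a dictionary of named values.
by_percent = "%(a)s has: %(b)10.2f pounds" % {"a": name, "b": amount}

# f-string: expressions are written directly inside the braces.
by_fstring = f"{name} has: {amount:10.2f} pounds"
```

All three produce the same string, with the number right-aligned in a field ten characters wide.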

14 Sets
Unordered collections of unique objects. The main type is mutable, but there is an immutable frozenset:
a = {"red", "green", "blue"}
a = set(some_other_container)
Can have mixed types and contain other containers, provided these are hashable (for example tuples, but not lists).
Note you can't use a = {} to make an empty set (as this is an empty dictionary); you have to use:
a = set()
Sets implement the classic sets you know of from set mathematics.

15 Add/Remove
a.add("black")
a.remove("blue") # Raises a KeyError if item doesn't exist.
a.discard("pink") # Silent if item doesn't exist.
a.clear() # Discard everything.
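
A brief sketch of these calls on the colour set used above, including the difference between remove and discard for a missing item:

```python
a = {"red", "green", "blue"}
a.add("black")
a.add("red")          # Already present, so nothing changes.
a.remove("blue")
a.discard("pink")     # Not present, silently ignored.

# remove() on a missing item raises KeyError.
try:
    a.remove("pink")
    raised = False
except KeyError:
    raised = True

snapshot = frozenset(a)   # Immutable copy, taken before clearing.
empty = set()             # {} would be an empty dictionary instead.
a.clear()
```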

16 Operators
| or a.union(b) # Union of sets a and b.
& or a.intersection(b) # Intersection.
- or a.difference(b) # Difference (elements of a not in b).
^ or a.symmetric_difference(b) # Elements in a or b but not both (union minus intersection).
x in a # Checks if item x in set a.
x not in a # Checks if item x is not in set a.
a <= b or a.issubset(b) # If a is contained in b.
a < b # a is a proper subset (i.e. contained in b and not equal to it).
a >= b or a.issuperset(b) # If b is contained in a.
a > b # a is a proper superset.
Operators only work on sets; functions work on (some) other containers. Specifically, the functions work on containers that are iterable, that is, you can ask for their contents one at a time. We'll come on to this.

17 Other functions
Most of the functions have partners that adjust the set in place, for example:
a &= b or a.intersection_update(b) # Updates a so it is just its previous intersection with b.
For a complete list, see:
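
The set operators, their method equivalents, and an in-place update can be tried on two small sets. This is an illustrative sketch with made-up values.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b
intersection = a & b
difference = a - b
symmetric = a ^ b                 # In one set or the other, but not both.

# Methods accept any iterable; the operators insist on sets.
from_list = a.union([5, 6])
try:
    a | [5, 6]
    raised = False
except TypeError:
    raised = True

subset = {3, 4} <= a              # Contained in a.
proper = {3, 4} < a               # Contained in a and not equal to it.
not_proper = a < a                # A set is not a proper subset of itself.

a &= b                            # Same as a.intersection_update(b).
```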

18 Mappings
Mappings link (map) one set of data to another, so requests for the first get the second.
The main mapping class is dict (dictionary; in other languages these are sometimes called associative arrays or, roughly, hash tables).
They're composed of a table of keys and values. If you ask for the key you get the value. An example would be people's names and their addresses.
Keys have to be unique.
Keys have to be immutable objects, or more precisely hashable ones (we don't want them changing after they're used).
Dictionaries keep their items in insertion order from Python 3.7 onwards, but they are looked up by key, not by position.

19 Dict
a = {1:"Person One", 2:"Person Two", 3:"Person 3"}
If the keys are strings that are valid names, you can also do:
a = dict(one="Person One", two="Person Two")
a = {} # Empty dictionary.
keys = (1,2,3)
values = ("Person One", "Person Two", "Person 3")
a = dict(zip(keys, values))
a[key] = value # Set a new key and value.
print(a[key]) # Gets a value given a key.

20 Useful functions
del a[key] # Remove a key and value.
a.clear() # Clear all keys and values.
a.get(key, default) # Get the value, or if not there, return default (normal access would raise a KeyError).
a.keys()
a.values() # Return a "view" of keys, values, or pairs.
a.items()
These are essentially live windows onto the dictionary: you can loop over them directly, but to index them, turn them into a list:
list(a.items())
list(a.keys())
Again, there are update methods. See:
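
The dictionary operations from the last two slides can be run together as one short sketch. The "Person 4" entry and the "nobody" default are made-up values for illustration.

```python
a = {1: "Person One", 2: "Person Two", 3: "Person 3"}

# Two other ways of building dictionaries.
named = dict(one="Person One", two="Person Two")
zipped = dict(zip((1, 2, 3), ("Person One", "Person Two", "Person 3")))

a[4] = "Person 4"                 # Set a new key and value.
found = a.get(4, "nobody")        # Key exists, so its value is returned.
fallback = a.get(5, "nobody")     # Key missing, so the default is returned.

# Plain access to a missing key raises KeyError.
try:
    a[5]
    raised = False
except KeyError:
    raised = True

# Views follow later changes to the dictionary.
keys = a.keys()
del a[4]
still_there = 4 in keys

pairs = list(a.items())           # A list can be indexed; a view cannot.
a.update({2: "Person 2"})         # Update method: overwrite or add entries.
```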

21 Dictionaries Dictionaries are hugely important as, not that you’d know it, objects are stored as dictionaries of attributes and methods.
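
This can be seen by looking inside an object. The sketch below uses a hypothetical Person class; it holds for ordinary classes, though classes that define `__slots__` store their attributes differently.

```python
class Person:
    """Hypothetical class, used only to show attribute storage."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


p = Person("Daisy")

# Instance attributes live in the instance's dictionary.
attributes = vars(p)              # The same object as p.__dict__.

# Methods live in the class's dictionary.
has_method = "greet" in vars(Person)

# Writing to the dictionary changes the attribute.
p.__dict__["name"] = "Bob"
greeting = p.greet()
```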

