

Strings
As well as tuples and ranges, there are two additional important immutable sequences:
Bytes: immutable sequences of bytes, each of 8 ones and zeros, usually represented as ints between 0 and 255 inclusive (binary 11111111 is 255 as an int). The mutable version is the bytearray.
Strings (text).
Many languages have a primitive type for an individual character. Python doesn't: a str (the string type) is just a sequence of other str objects, each one character long.
https://docs.python.org/3/library/array.html#module-array
https://docs.python.org/3/library/collections.html#module-collections
“The array module provides an array() object that is like a list that stores only homogeneous data and stores it more compactly. The following example shows an array of numbers stored as two byte unsigned binary numbers (typecode "H") rather than the usual 16 bytes per entry for regular lists of Python int objects:”
“Python strings cannot be changed - they are immutable. Therefore, assigning to an indexed position in the string results in an error:”
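A minimal sketch of the points above, using made-up values: bytes elements are ints from 0 to 255, a bytearray can be changed in place, and indexing a str gives another str of length one. The variable names are illustrative only.

```python
# bytes is immutable; each element is an int between 0 and 255.
data = bytes([72, 105, 255])
first = data[0]

# bytearray is the mutable counterpart, so item assignment is allowed.
buffer = bytearray(data)
buffer[0] = 74

# Assigning into a bytes object raises a TypeError.
try:
    data[0] = 74
    bytes_error = None
except TypeError:
    bytes_error = "TypeError"

# There is no separate character type: indexing a str gives a str.
word = "hello"
letter = word[0]

print(first, buffer[0], bytes_error, type(letter).__name__, len(letter))
```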

Strings
It may seem odd that strings are immutable, but this helps with memory management. You never change a str in place: "changing" one creates a new str and attaches the label to it, and the old one is discarded once nothing refers to it.
    >>> a = "hello world"
    >>> a = "hello globe"   # New string (same label, now attached to it).
    >>> a[0]                # Subscription.
    'h'
    >>> a[0] = "m"          # Attempted assignment.
    TypeError: 'str' object does not support item assignment
    >>> a = str(2)          # String "2" as text.

String Literals
String literals are formed 'content' or "content" (inline), or '''content''' or """content""" (multiline). In multiline quotes, line ends are preserved unless the line ends with "\":
    print('''This is \
    all one line.
    This is a second.''')
For inline quotes, you need to end the quote and start again on the next line. Inside the parentheses, adjacent string literals are joined automatically, so the "+" is optional for literals but required if variables are involved:
    print("This is all " +
          "one line.")
    print("This is a second")   # Note the two print statements.

String concatenation (joining)
Strings can be concatenated (joined), though:
    >>> a = "hello" + "world"
    >>> a = "hello" "world"   # "+" optional if just string literals.
    >>> a
    'helloworld'              # Note no spaces.
To add spaces, put them inside the strings or between them:
    >>> a = "hello " + "world"
    >>> a = "hello" + " " + "world"
For string variables, you need the "+":
    >>> h = "hello"
    >>> a = h + "world"

Immutable concatenation
Remember that each time you "change" an immutable type, you make a new one. This is inefficient, so continually adding to an immutable object can take a long time. There are alternatives:
With tuples, use a list instead and extend it (a new list isn't created each time).
With bytes, use a mutable bytearray.
With strings, build a list of strings and then, once it is complete, use the str.join() method that all strings have.
    >>> a = ["x","y","z"]
    >>> b = " ".join(a)
    >>> b
    'x y z'
    >>> c = " and ".join(a)
    >>> c
    'x and y and z'
"/".join(args), where args is a sequence of strings, joins them with "/" between.
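The alternatives above might look like the following sketch. The data is invented for illustration; the point is only the pattern of collecting pieces in a mutable container and converting once at the end.

```python
words = ["x", "y", "z"]

# Repeated concatenation: every += builds a brand new str object.
slow = ""
for w in words:
    slow += w + " "
slow = slow.rstrip()

# Collect the pieces first, then join them once.
fast = " ".join(words)

# join() only accepts strings, so convert other types first.
numbers = [1, 2, 3]
path = "/".join(str(n) for n in numbers)

# For bytes, accumulate in a mutable bytearray and convert at the end.
buf = bytearray()
for chunk in (b"ab", b"cd"):
    buf.extend(chunk)
result = bytes(buf)

print(slow, fast, path, result)
```

Both approaches give the same text here; the difference only matters as the number of pieces grows.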

Parsing
Often we'll need to split strings up based on some delimiter. This is known as parsing. For example, it is usual to read data files a line at a time and then parse each line into numbers.

Split
Strings can be split with either of:
    a = str.split(string, delimiter)
    a = some_string.split(delimiter)
(There's no great difference.) For example:
    a = "Daisy, Daisy/Give me your answer, do."
    b = str.split(a, " ")
As it happens, whitespace is the default delimiter: with no delimiter given, split() splits on any run of whitespace.
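As a rough illustration of parsing a line of a data file into numbers, here is a sketch assuming a comma-separated line with made-up values. It also shows the difference between the default split and an explicit single-space delimiter.

```python
# One line as it might arrive from a data file (made-up values).
line = "10.5, 20.25, 30\n"

# Remove the line ending, split on commas, then convert each field.
fields = line.strip().split(",")
values = [float(field) for field in fields]  # float() ignores surrounding spaces

# split() with no argument splits on runs of whitespace...
default_split = "a  b".split()
# ...whereas split(" ") splits at every single space.
space_split = "a  b".split(" ")

print(values)
print(default_split, space_split)
```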

Search and replace
In the following, string is the str being searched; the second and third parameters are optional start and end search locations.
    string.startswith(strA, 0, len(string))   Checks whether string starts with strA.
    string.endswith(suffix, 0, len(string))   Checks whether string ends with suffix.
    string.find(strA, 0, len(string))         Gives the index position, or -1 if not found.
    string.index(strA, 0, len(string))        As find, but raises a ValueError if not found.
    rfind and rindex do the same, but search from the right (they give the position of the last match).
Once an index is found, you can use slices to extract substrings:
    strB = strA[index1:index2]
    string.lstrip([chars]) / string.rstrip([chars])   Removes leading / trailing whitespace, or the optional characters given.
    string.strip([chars])                             As above, but both ends.
    string.replace(substringA, substringB, count)     Replaces all occurrences of A with B. The optional final int argument limits the maximum number of replacements.
As strings are immutable, the strip and replace methods return new strings rather than changing the original.
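The methods above can be combined as in this sketch, which pulls a value out of an invented "name=value;" string. The text and variable names are illustrative only.

```python
text = "  name=Daisy Bell;  "

clean = text.strip()                  # remove whitespace from both ends
starts = clean.startswith("name")

# Find the delimiters, then slice between them.
eq = clean.find("=")
end = clean.find(";")
value = clean[eq + 1:end]

# find() gives -1 when there is no match; index() raises ValueError.
missing = clean.find("?")
try:
    clean.index("?")
    index_error = None
except ValueError:
    index_error = "ValueError"

# rfind() gives the position of the last match.
last = "Daisy, Daisy".rfind("Daisy")

# The optional count limits the number of replacements.
replaced = "Daisy, Daisy".replace("Daisy", "Rosie", 1)

# strip() can remove characters other than whitespace.
trimmed = "xxhixx".strip("x")

print(value, missing, index_error, last, replaced, trimmed)
```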

Escape characters
What if we want quotes in our strings? Use single inside double, or vice versa:
    a = "It's called 'Daisy'."
    a = 'You invented "Space Paranoids"?'
If you need to mix them, though, you have problems, as Python can't tell where the string ends:
    a = 'It's called "Daisy".'   # Syntax error.
Instead, you have to use an escape character, a special character that is interpreted differently from how it looks. All escape sequences start with a backslash; for a single quote it is simply \':
    a = 'It\'s called "Daisy".'

Escape characters
    \newline      Backslash and newline ignored
    \\            Backslash (\)
    \'            Single quote (')
    \"            Double quote (")
    \b            ASCII Backspace (BS)
    \f            ASCII Formfeed (FF)
    \n            ASCII Linefeed (LF)
    \r            ASCII Carriage Return (CR)
    \t            ASCII Horizontal Tab (TAB)
    \ooo          Character with octal value ooo
    \xhh          Character with hex value hh
    \N{name}      Character named name in the Unicode database
    \uxxxx        Character with 16-bit hex value xxxx
    \Uxxxxxxxx    Character with 32-bit hex value xxxxxxxx

String Literals
Going back to our two line example:
    print("This is all " +
          "one line.")
    print("This is a second")   # Note the two print statements.
Note that we can now rewrite this as a single print statement, using the newline escape:
    print("This is all " +
          "one line. \n" +
          "This is a second")
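A few escape sequences in use, as a small sketch with arbitrary example text. Each escape sequence counts as a single character in the resulting string, however many characters it takes to type.

```python
# An escaped single quote inside a single-quoted string.
quote = 'It\'s called "Daisy".'

# One string, two lines when printed.
two_lines = "This is all one line.\nThis is a second"

# Each escape sequence becomes one character.
tabbed = "a\tb"
backslash = "C:\\temp"

# Three ways of writing the same characters.
hex_a = "\x41"                                   # hex value 41
octal_a = "\101"                                 # octal value 101
e_acute = "\u00e9"                               # 16-bit hex value
named = "\N{LATIN SMALL LETTER E WITH ACUTE}"    # Unicode name

print(quote)
print(two_lines)
print(len(tabbed), len(backslash), hex_a, octal_a, e_acute == named)
```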

String Literals
There are some cases where we want backslashes kept as ordinary characters rather than treated as the start of an escape sequence when we print or otherwise use the text. To do this, prefix the literal with "r":
    >>> a = r"This contains a \\ backslash escape"
From then on, the backslashes are interpreted as two literal backslashes. Note that if we then display this at the prompt, we get:
    >>> a
    'This contains a \\\\ backslash escape'
Note that each backslash is itself shown escaped; print(a) would show just the two backslashes.
String literal prefixes:
    R or r: a “raw” string, in which backslashes are kept as they appear.
    F or f: a formatted string (we'll come to these).
    U or u: a Python 2 legacy prefix for Unicode strings; in Python 3 it is accepted but has no effect.
    B or b: a sequence of bytes; br or rb, in any capitalisation, is a raw bytes literal.
“Why can’t raw strings (r-strings) end with a backslash? More precisely, they can’t end with an odd number of backslashes: the unpaired backslash at the end escapes the closing quote character, leaving an unterminated string.”
An example of when we might want a raw string is when constructing a regular expression (regex). These search text on the basis of patterns which include quite a lot of backslashes, and escaping each one can make for very confusing patterns.
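To illustrate the regex case, here is a sketch using the standard re module. The pattern and the sample text are made up; the comparison shows that a raw literal is just a more readable way of writing the same string.

```python
import re

# The same pattern written with and without the raw prefix.
normal = "\\d+\\.\\d+"
raw = r"\d+\.\d+"
same_pattern = (normal == raw)

# Search some example text for a decimal number.
match = re.search(raw, "Price: 12.50 pounds")
found = match.group()

# In a raw string, backslash-n is two characters, not a newline.
raw_len = len(r"\n")
esc_len = len("\n")

# A raw string can't end in a single backslash, so add it separately.
path = r"C:\temp" + "\\"

print(same_pattern, found, raw_len, esc_len, path)
```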

Formatting strings
There are a wide variety of ways of formatting strings.
    print("{0} has: {1:10.2f} pounds".format(a, b))
    print('%(a)s has: %(b)10.2f pounds' % {'a': 'Bob', 'b': 2.23333})
See website for examples.
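As a sketch, the two styles above can be set beside the formatted string literal (the f prefix, available from Python 3.6). The name and amount are the example values from the slide; all three produce the same text, with the number right-aligned in a field 10 characters wide and rounded to two decimal places.

```python
name = "Bob"
amount = 2.23333

# str.format() with positional fields.
with_format = "{0} has: {1:10.2f} pounds".format(name, amount)

# The older % operator with a dictionary of named values.
with_percent = "%(a)s has: %(b)10.2f pounds" % {"a": name, "b": amount}

# A formatted string literal: expressions go straight inside the braces.
with_fstring = f"{name} has: {amount:10.2f} pounds"

print(with_format)
print(with_percent)
print(with_fstring)
```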

Sets
Unordered collections of unique objects. The main type (set) is mutable, but there is an immutable frozenset:
https://docs.python.org/3/library/stdtypes.html#frozenset
    a = {"red", "green", "blue"}
    a = set(some_other_container)
A set can hold mixed types, and can contain other containers provided they are hashable (for example tuples or frozensets, but not lists or other sets). Note you can't use a = {} to make an empty set (as this is an empty dictionary); you have to use:
    a = set()
Sets implement the classic sets you know from set mathematics.

Add/Remove
    a.add("black")
    a.remove("blue")    # Raises a KeyError if item doesn't exist.
    a.discard("pink")   # Silent if item doesn't exist.
    a.clear()           # Remove everything.
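A short sketch of creating and changing sets, with invented contents. It shows the difference between remove and discard, and that set elements must be hashable.

```python
colours = {"red", "green", "blue"}
empty_set = set()
empty_dict = {}              # braces alone make a dictionary, not a set

colours.add("black")
colours.add("red")           # already present, so nothing changes
colours.discard("pink")      # absent, silently ignored

# remove() raises a KeyError for a missing item.
try:
    colours.remove("pink")
    remove_error = None
except KeyError:
    remove_error = "KeyError"

# Elements must be hashable: a tuple is fine, a list is not.
points = {(0, 0), (1, 2)}
try:
    bad = {[0, 0]}
    list_error = None
except TypeError:
    list_error = "TypeError"

print(sorted(colours), remove_error, list_error)
```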

Operators
    a | b    or  a.union(b)                  # Union of sets a and b.
    a & b    or  a.intersection(b)           # Intersection.
    a - b    or  a.difference(b)             # Difference (elements of a not in b).
    a ^ b    or  a.symmetric_difference(b)   # Elements in exactly one of a and b (the union minus the intersection).
    x in a                                   # Checks if item x is in set a.
    x not in a                               # Checks if item x is not in set a.
    a <= b   or  a.issubset(b)               # If a is contained in b.
    a < b                                    # a is a proper subset (i.e. contained in, but not equal to, b).
    a >= b   or  a.issuperset(b)             # If b is contained in a.
    a > b                                    # a is a proper superset.
Operators only work on sets; the functions work on (some) other containers. Specifically, the functions work on containers that are iterable, that is, you can ask for their contents one at a time. We'll come on to this.

Other functions
Most of the functions have partners that adjust the set in place, for example:
    a &= b   or   a.intersection_update(b)
Updates a so it is just its previous intersection with b.
For a complete list, see: https://docs.python.org/3/library/stdtypes.html#set
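The set operators, their method equivalents and the in-place versions can be tried out as in this sketch, which uses two small made-up sets.

```python
a = {"red", "green", "blue"}
b = {"blue", "black"}

union = a | b
inter = a & b
diff = a - b
sym = a ^ b                       # in exactly one of the two sets

# Methods accept any iterable; operators insist on sets.
from_list = a.union(["pink"])
try:
    a | ["pink"]
    operator_error = None
except TypeError:
    operator_error = "TypeError"

subset = {"red"} <= a
proper = {"red"} < a
not_proper = a < a                # a set is not a proper subset of itself

# In-place version: c is updated rather than a new set being made.
c = set(a)
c &= b

print(sorted(union), sorted(inter), sorted(diff), sorted(sym), sorted(c))
```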

Mappings
Mappings link (map) one set of data to another, so requests for the first get the second. The main mapping class is dict (dictionary; in other languages these are sometimes called associative arrays or hashtables). They're composed of a table of keys and values. If you ask for the key you get the value. An example would be people's names and their addresses.
Keys have to be unique.
Keys have to be immutable (strictly, hashable) objects (we don't want them changing after they're used).
Since Python 3.7, dictionaries remember the order in which keys were inserted, but values are still looked up by key, not by position.

Dict
    a = {1:"Person One", 2:"Person Two", 3:"Person 3"}
If the keys are strings that are valid names, you can also do:
    a = dict(one="Person One", two="Person Two")
    a = {}   # Empty dictionary.
    keys = (1,2,3)
    values = ("Person One", "Person Two", "Person 3")
    a = dict(zip(keys, values))
    a[key] = value   # Set a new key and value.
    print(a[key])    # Gets a value given a key.
https://www.python.org/dev/peps/pep-0448/
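The ways of building a dictionary listed above, as a runnable sketch. The last line uses the ** unpacking described in the PEP 448 link; the variable names are illustrative only.

```python
# Literal syntax: key, colon, value.
by_number = {1: "Person One", 2: "Person Two", 3: "Person 3"}

# Keyword arguments: keys become strings.
by_name = dict(one="Person One", two="Person Two")

# Pairing up two sequences with zip().
keys = (1, 2, 3)
values = ("Person One", "Person Two", "Person 3")
zipped = dict(zip(keys, values))

# Setting a new key and reading it back.
zipped[4] = "Person 4"
fourth = zipped[4]

# PEP 448 unpacking: copy one dictionary's items into a new literal.
merged = {**by_name, "three": "Person 3"}

print(by_number == {k: v for k, v in zip(keys, values)}, fourth, merged)
```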

Useful functions
    del a[key]            # Remove a key and value.
    a.clear()             # Clear all keys and values.
    a.get(key, default)   # Get the value, or if not there, return default (normally access would raise a KeyError).
    a.keys()
    a.values()            # Return a "view" of keys, values, or pairs.
    a.items()
These views are live windows onto the dictionary: you can loop over them directly, and they change as the dictionary changes. To index or store their contents, turn them into a list:
    list(a.items())
    list(a.keys())
Again, there are update methods. See: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict
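A sketch of these functions on a small made-up dictionary. It shows get() falling back to a default, and that a view keeps up with later changes to the dictionary.

```python
a = {"one": 1, "two": 2}

# get() gives the default instead of raising a KeyError.
fallback = a.get("three", 0)
try:
    a["three"]
    key_error = None
except KeyError:
    key_error = "KeyError"

# A view reflects changes made after it was created.
key_view = a.keys()
a["three"] = 3
keys_after_add = list(key_view)

del a["one"]
pairs = list(a.items())

# Views can be looped over directly, without making a list.
total = 0
for key, value in a.items():
    total += value

print(fallback, key_error, keys_after_add, pairs, total)
```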

Dictionaries
Dictionaries are hugely important because, not that you'd know it, most objects store their attributes in a dictionary, and classes store their methods in one too.
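This can be seen with the built-in vars() function, which gives the attribute dictionary of an object. The Person class below is a hypothetical example, and the sketch assumes an ordinary class (classes that define __slots__ behave differently).

```python
class Person:
    """A made-up class for illustration."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


p = Person("Daisy")

# Instance attributes live in the instance's dictionary.
attributes = dict(vars(p))

# Methods live in the class's dictionary.
has_greet = "greet" in vars(Person)

# Adding a key to the dictionary adds an attribute to the object.
p.__dict__["age"] = 30
age = p.age

print(attributes, has_greet, age)
```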