Strings Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble.

Strings
A string is a sequence of characters (letters, digits, punctuation, and so on). In Python, strings start and end with single or double quotes.
    >>> "foo"
    'foo'
    >>> 'foo'
    'foo'

Defining strings
Each string is stored in the computer's memory as a list of characters.
    >>> myString = "GATTACA"
[Figure: the variable myString and its characters in memory]

Accessing single characters
You can access individual characters by using indices in square brackets.
    >>> myString = "GATTACA"
    >>> myString[0]
    'G'
    >>> myString[1]
    'A'
    >>> myString[-1]
    'A'
    >>> myString[-2]
    'C'
    >>> myString[7]
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    IndexError: string index out of range
Negative indices start at the end of the string and move left.
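The indexing rules on this slide can be sketched as follows. This is a minimal illustration in Python 3 syntax (where print is a function, unlike the Python 2 print statement used later in these slides). The helper name char_at and its default argument are assumptions made for the example, not part of the course material.

```python
def char_at(sequence, index, default=None):
    """Return sequence[index], or default if the index is out of range."""
    try:
        return sequence[index]
    except IndexError:
        return default


my_string = "GATTACA"
first = char_at(my_string, 0)     # same as my_string[0]
last = char_at(my_string, -1)     # negative indices count from the end
missing = char_at(my_string, 7)   # valid indices run from 0 to len(my_string) - 1

print(first, last, missing)
```

Catching IndexError is one way to handle a position that may fall outside the string; checking the index against len(my_string) beforehand would work as well.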

Accessing substrings
    >>> myString = "GATTACA"
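The slide's own substring examples are not preserved in this transcript. As an illustration only, the sketch below uses standard Python slice syntax, s[start:end], which returns the characters from position start up to but not including position end. It is written in Python 3 syntax, and the variable names are chosen for the example.

```python
my_string = "GATTACA"

middle = my_string[1:3]      # positions 1 and 2; the end index is excluded
prefix = my_string[:3]       # omitting start means "from the beginning"
suffix = my_string[3:]       # omitting end means "through the last character"
tail = my_string[-3:]        # negative indices also work in slices
clipped = my_string[3:100]   # a slice past the end is clipped, with no IndexError

print(middle, prefix, suffix, tail, clipped)
```

Because the end index is excluded, my_string[:3] + my_string[3:] always rebuilds the original string.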

Special characters
The backslash is used to introduce a special character.
    >>> "He said, "Wow!""
      File "<stdin>", line 1
        "He said, "Wow!""
                     ^
    SyntaxError: invalid syntax
    >>> "He said, 'Wow!'"
    "He said, 'Wow!'"
    >>> "He said, \"Wow!\""
    'He said, "Wow!"'

    Escape sequence    Meaning
    \\                 Backslash
    \'                 Single quote
    \"                 Double quote
    \n                 Newline
    \t                 Tab
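A short sketch of how the escape sequences in the table behave, in Python 3 syntax. The example strings are made up for illustration. Each escape sequence is written with two characters in the source code but is stored as a single character.

```python
quoted = "He said, \"Wow!\""   # escaped double quotes inside double quotes
two_lines = "GATT\nACA"        # \n starts a new line when printed
tabbed = "name\tsequence"      # \t inserts a tab
backslash = "C:\\data"         # \\ is one literal backslash

newline_length = len("\n")     # an escape sequence is a single character
two_lines_length = len(two_lines)

print(quoted)
print(two_lines)
print(tabbed)
print(backslash)
print(newline_length, two_lines_length)
```

The interactive interpreter echoes a string with its quotes and escapes visible, whereas print shows the characters the string actually contains.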

More string functionality
Length:
    >>> len("GATTACA")
    7
Concatenation:
    >>> "GAT" + "TACA"
    'GATTACA'
Repeat:
    >>> "A" * 10
    'AAAAAAAAAA'
Substring test:
    >>> "GAT" in "GATTACA"
    True
    >>> "AGT" in "GATTACA"
    False

String methods
In Python, a method is a function that is defined with respect to a particular object. The syntax is
    <object>.<method>(<parameters>)
For example:
    >>> dna = "ACGT"
    >>> dna.find("T")
    3

String methods
    >>> "GATTACA".find("ATT")
    1
    >>> "GATTACA".count("T")
    2
    >>> "GATTACA".lower()
    'gattaca'
    >>> "gattaca".upper()
    'GATTACA'
    >>> "GATTACA".replace("G", "U")
    'UATTACA'
    >>> "GATTACA".replace("C", "U")
    'GATTAUA'
    >>> "GATTACA".replace("AT", "**")
    'G**TACA'
    >>> "GATTACA".startswith("G")
    True
    >>> "GATTACA".startswith("g")
    False
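Two details of the methods above are worth a small sketch, written in Python 3 syntax: find returns -1 when the substring is absent, and count can be combined with len for simple sequence statistics. The function gc_fraction is a hypothetical helper introduced for this example, not something defined in the slides.

```python
def gc_fraction(sequence):
    """Return the fraction of G and C characters, ignoring case."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


dna = "GATTACA"
position = dna.find("TAC")    # index where the first match starts
not_found = dna.find("GGG")   # find returns -1 when there is no match

print(position, not_found, gc_fraction(dna))
```

Since -1 is also a valid index (the last character), the result of find should be compared against -1 before it is used to index the string.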

Strings are immutable
Strings cannot be modified; instead, create a new one.
    >>> s = "GATTACA"
    >>> s[3] = "C"
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: object doesn't support item assignment
    >>> s = s[:3] + "C" + s[4:]
    >>> s
    'GATCACA'
    >>> s = s.replace("G","U")
    >>> s
    'UATCACA'

Strings are immutable
String methods do not modify the string; they return a new string.
    >>> sequence = "ACGT"
    >>> sequence.replace("A", "G")
    'GCGT'
    >>> print sequence
    ACGT
    >>> new_sequence = sequence.replace("A", "G")
    >>> print new_sequence
    GCGT
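Because a string cannot be changed in place, replacing one character means building a new string from slices, as in the s[:3] + "C" + s[4:] example. A minimal sketch in Python 3 syntax follows; replace_at is a hypothetical helper name chosen for this illustration.

```python
def replace_at(sequence, index, new_char):
    """Return a copy of sequence with the character at index replaced."""
    if not 0 <= index < len(sequence):
        raise IndexError("string index out of range")
    return sequence[:index] + new_char + sequence[index + 1:]


s = "GATTACA"
t = replace_at(s, 3, "C")

print(s)   # the original string is unchanged
print(t)   # the modified copy
```

The original name s still refers to the unmodified string, so the result must be assigned to a variable to be kept.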

String summary
Basic string operations:
    S = "AATTGG"    # assignment - or use single quotes ' '
    s1 + s2         # concatenate
    s2 * 3          # repeat string
    s2[i]           # index character at position 'i'
    s2[x:y]         # index a substring
    len(S)          # get length of string
    int(S)          # turn a string into an integer
    float(S)        # ... or a floating point decimal
Methods:
    S.upper()
    S.lower()
    S.count(substring)
    S.replace(old,new)
    S.find(substring)
    S.startswith(substring)
    S.endswith(substring)
Printing:
    print var1,var2,var3        # print multiple variables
    print "text",var1,"text"    # print a combination of strings and variables
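The int(S) and float(S) conversions in the summary raise ValueError when the string does not look like a number. The sketch below, in Python 3 syntax, shows one way to handle that case; parse_number is a hypothetical helper introduced for illustration only.

```python
def parse_number(text):
    """Convert text to an int if possible, else a float, else None."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None


as_int = parse_number("56")         # a string of digits becomes an integer
as_float = parse_number("4.5")      # a decimal string becomes a float
not_a_number = parse_number("GATTACA")
joined = "56" + "1"                 # + on strings concatenates; it does not add

print(as_int, as_float, not_a_number, joined)
```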

Sample problem #1
Write a program called dna2rna.py that reads a DNA sequence from the first command line argument, and then prints it as an RNA sequence. Make sure it works for both uppercase and lowercase input. First get it working just for uppercase letters.
    > python dna2rna.py ACTCAGT
    ACUCAGU
    > python dna2rna.py actcagt
    acucagu
    > python dna2rna.py ACTCagt
    ACUCagu

Two solutions
Solution 1:
    import sys
    sequence = sys.argv[1]
    new_sequence = sequence.replace("T", "U")
    newer_sequence = new_sequence.replace("t", "u")
    print newer_sequence
Solution 2:
    import sys
    print sys.argv[1].replace("T", "U").replace("t", "u")
It is legal (but not always desirable) to chain together multiple methods on a single line.
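The same logic can be wrapped in a function so that it can be checked without the command line. This is a sketch in Python 3 syntax rather than the Python 2 print statement used on the slide. The function name dna_to_rna and the argument-count check are assumptions added for the example.

```python
import sys


def dna_to_rna(sequence):
    """Replace T with U and t with u, leaving every other character as is."""
    return sequence.replace("T", "U").replace("t", "u")


# Only read the command line when an argument was actually supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(dna_to_rna(sys.argv[1]))
```

Saved as dna2rna.py, this would be run the same way as the slide's version, for example `python dna2rna.py ACTCagt`.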

Sample problem #2
Write a program get-codons.py that reads the first command line argument as a DNA sequence and prints the first three codons, one per line, in uppercase letters.
    > python get-codons.py TTGCAGTCG
    TTG
    CAG
    TCG
    > python get-codons.py TTGCAGTCGATC
    TTG
    CAG
    TCG
    > python get-codons.py tcgatcgac
    TCG
    ATC
    GAC

Solution #2
    import sys
    sequence = sys.argv[1]
    upper_sequence = sequence.upper()
    print upper_sequence[:3]
    print upper_sequence[3:6]
    print upper_sequence[6:9]
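The three slices above can also be generated in a loop. The sketch below is written in Python 3 syntax and uses a for loop, which goes slightly beyond what this lecture covers. The function name first_codons and its count parameter are assumptions made for illustration. Unlike the slide's solution, it skips an incomplete trailing codon instead of printing a short or empty line.

```python
import sys


def first_codons(sequence, count=3):
    """Return up to count complete codons from the start of sequence, uppercased."""
    upper = sequence.upper()
    codons = []
    for start in range(0, 3 * count, 3):
        codon = upper[start:start + 3]
        if len(codon) == 3:   # keep only complete three-letter codons
            codons.append(codon)
    return codons


# Only read the command line when an argument was actually supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    for codon in first_codons(sys.argv[1]):
        print(codon)
```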

Reading
Chapters 1-2 of Think Python (1st edition) by Allen B. Downey.