BioPython http://biopython.org/wiki/Biopython Download & Installation http://biopython.org/wiki/Download Documentation http://biopython.org/wiki/Category%3AWiki_Documentation.


1 BioPython
    http://biopython.org/wiki/Biopython
    Download & Installation: http://biopython.org/wiki/Download
    Documentation: http://biopython.org/wiki/Category%3AWiki_Documentation

2 BioPython
Key features:
    Sequences
    Sequence annotation
    I/O operations
    Accessing online databases
    Multiple sequence alignments
    BLAST
    and many more …

3 quickstart: Sequence objects
Simple example:
    from Bio.Seq import Seq
    from Bio.Alphabet import IUPAC  # Bio.Alphabet exists only in Biopython releases before 1.78

    dna_sequence = Seq('AGGCTTCTCGTA', IUPAC.unambiguous_dna)
    print(dna_sequence)
    print(dna_sequence.alphabet)

4 quickstart: parsing sequences
Simple example:
    from Bio import SeqIO

    # SeqIO.parse(file, format)
    for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
        print(seq_record.id)
        print(repr(seq_record.seq))
        print(len(seq_record))

5 sequence objects
    from Bio.Seq import Seq
    from Bio.Alphabet import IUPAC

    # Seq(sequence, alphabet)
    dna_sequence = Seq('AGGCTTCTCGTA', IUPAC.unambiguous_dna)

    # sequences work like strings
    for index, letter in enumerate(dna_sequence):
        print("%i %s" % (index, letter))

    # slicing of sequences
    print(dna_sequence[2:7])

    # striding of sequences
    print(dna_sequence[0::3])
    print(dna_sequence[1::3])

    # turning sequences into strings
    my_seq = str(dna_sequence) + "ATTAATTG"
    fasta_format_string = ">Name\n%s\n" % my_seq
    print(fasta_format_string)
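Because a Seq behaves much like a Python string, the slicing and striding on this slide can be tried on a plain str first. The sketch below uses only built-in strings, not Biopython, and the helper name codon_positions is made up for illustration.

```python
def codon_positions(sequence):
    """Return the letters at codon positions 1, 2 and 3 as three strings."""
    return tuple(sequence[offset::3] for offset in range(3))


dna = "AGGCTTCTCGTA"

# Slicing: letters at indices 2 to 6 (the end index is exclusive).
print(dna[2:7])

# Striding: every third letter, starting at offsets 0, 1 and 2.
first, second, third = codon_positions(dna)
print(first, second, third)

# Building a FASTA-formatted string from plain text.
fasta_format_string = ">Name\n%s\n" % (dna + "ATTAATTG")
print(fasta_format_string)
```

The same index expressions work unchanged on a Seq object, which is why the slide describes sequences as working like strings.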

6 sequence objects
    from Bio.Seq import Seq
    from Bio.Alphabet import IUPAC

    my_seq = Seq("GATCGATGGGCCTATATAGGATCGAAAATCGC", IUPAC.unambiguous_dna)
    print(my_seq)

    # making complements
    print(my_seq.complement())
    print(my_seq.reverse_complement())

    # making mRNA: transcribe the DNA sequence directly
    # (a sequence already marked as RNA cannot be transcribed)
    messenger_rna = my_seq.transcribe()
    print(messenger_rna)
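To make clear what the three operations on this slide compute, here is a minimal pure-Python sketch for unambiguous DNA. It is not Biopython's implementation; the function names simply mirror the Seq methods, and ambiguity codes and lowercase letters are deliberately ignored.

```python
# Base-pairing rules for unambiguous DNA: A<->T and C<->G.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna):
    """Swap every base for its pairing partner, keeping the order."""
    return dna.translate(_DNA_COMPLEMENT)


def reverse_complement(dna):
    """Complement the sequence and read it in the opposite direction."""
    return complement(dna)[::-1]


def transcribe(coding_dna):
    """Treat the input as the coding strand and replace T with U."""
    return coding_dna.replace("T", "U")


my_seq = "GATCGATGGGCCTATATAGGATCGAAAATCGC"
print(complement(my_seq))
print(reverse_complement(my_seq))
print(transcribe(my_seq))
```

Transcription here is a plain T-to-U substitution on the coding strand, which is the convention the slide's transcribe() call relies on.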

7 sequence objects
    from Bio.Seq import Seq
    from Bio.Alphabet import IUPAC

    # translation (from mRNA)
    messenger_rna = Seq("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG", IUPAC.unambiguous_rna)
    print(messenger_rna)
    print(messenger_rna.translate())

    # translation (from coding DNA)
    coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG", IUPAC.unambiguous_dna)
    print(coding_dna.translate())

    # translation using different tables
    print(coding_dna.translate(table=2))
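The effect of the table argument can be illustrated without Biopython. The sketch below hard-codes two NCBI genetic codes, table 1 (standard) and table 2 (vertebrate mitochondrial), as 64-letter strings in TCAG codon order. The names TABLES, build_codon_table and translate are illustrative, and the function handles only unambiguous, complete codons.

```python
from itertools import product

BASES = "TCAG"

# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
TABLES = {
    1: "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
    2: "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG",
}


def build_codon_table(amino_acids):
    """Map each DNA codon to its one-letter amino acid ('*' marks a stop)."""
    codons = ("".join(letters) for letters in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(sequence, table=1):
    """Translate DNA or RNA codon by codon; trailing partial codons are dropped."""
    codon_table = build_codon_table(TABLES[table])
    dna = sequence.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3
    return "".join(codon_table[dna[i:i + 3]] for i in range(0, usable, 3))


coding_dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
print(translate(coding_dna))           # standard code
print(translate(coding_dna, table=2))  # vertebrate mitochondrial code
```

For this sequence the only difference between the two tables is the internal TGA codon, which is a stop in table 1 and tryptophan (W) in table 2.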

8 seqRecord object
    .seq                 the sequence itself, typically a Seq object
    .id                  primary ID, string
    .name                common name, string
    .description         human-readable description, string
    .letter_annotations  per-letter annotations, held in a (restricted) dictionary whose values are Python sequences
    .annotations         additional information, dictionary
    .features            a list of SeqFeature objects with more structured information about the features on a sequence (e.g. position of genes on a genome, or domains on a protein sequence)
    .dbxrefs             database cross-references, list of strings
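As a way to picture how these attributes fit together, here is a toy stand-in written as a dataclass. MiniSeqRecord is hypothetical and is not the Biopython class; the placeholder defaults are arbitrary, and the length check only sketches why the letter_annotations dictionary is described as restricted.

```python
from dataclasses import dataclass, field


@dataclass
class MiniSeqRecord:
    """Toy record with the same attribute names as the slide lists."""
    seq: str
    id: str = "<unknown id>"
    name: str = "<unknown name>"
    description: str = "<unknown description>"
    letter_annotations: dict = field(default_factory=dict)
    annotations: dict = field(default_factory=dict)
    features: list = field(default_factory=list)
    dbxrefs: list = field(default_factory=list)

    def __len__(self):
        return len(self.seq)

    def set_letter_annotation(self, key, values):
        """Store one value per letter; reject anything of the wrong length."""
        if len(values) != len(self.seq):
            raise ValueError("need exactly one value per sequence letter")
        self.letter_annotations[key] = values


record = MiniSeqRecord("GATC", id="0001", name="MFG1",
                       description="Made up sequence")
record.annotations["source"] = "example data"
record.set_letter_annotation("quality", [30, 31, 32, 33])
print(record)
```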

9 seqRecord object
from scratch:
    from Bio.Seq import Seq
    from Bio.SeqRecord import SeqRecord

    simple_seq = Seq("GATC")
    simple_seq_r = SeqRecord(simple_seq)
    simple_seq_r.id = "0001"
    simple_seq_r.name = "MFG1"
    simple_seq_r.description = "Made up sequence"
    print(simple_seq_r)
reading the information:
    from Bio import SeqIO

    record = SeqIO.read("NC_ fna", "fasta")
    print(record)

10 Sequence I/O
Parsing from file:
    from Bio import SeqIO

    # SeqIO.parse(handle, format) returns an iterator over SeqRecord objects
    for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
        print(seq_record.id)
        print(repr(seq_record.seq))
        print(len(seq_record))
Or consuming the iterator in a list comprehension:
    identifiers = [seq_record.id for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta")]
    print(identifiers)
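The iterator pattern on this slide can be sketched with a small generator that reads FASTA text one record at a time. This is a simplified illustration, not how SeqIO is implemented; parse_fasta and the two records in the example text are made up, and each record is a plain (id, description, sequence) tuple rather than a SeqRecord.

```python
import io


def _make_record(header, chunks):
    """Build an (id, description, sequence) tuple from one FASTA entry."""
    identifier = header.split()[0] if header else ""
    return identifier, header, "".join(chunks)


def parse_fasta(handle):
    """Yield one record at a time, so the whole file is never held in memory."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


# Made-up FASTA text standing in for a file on disk.
example = io.StringIO(
    ">seq1 first made-up record\nGATC\nGATC\n"
    ">seq2 second made-up record\nAAGC\n"
)
records = list(parse_fasta(example))
identifiers = [identifier for identifier, _, _ in records]
print(identifiers)
```

Any open file object can be passed in place of the StringIO handle, since the generator only needs something that yields lines.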

11 Sequence I/O
Parsing from the web:
    from Bio import Entrez
    from Bio import SeqIO

    # the e-mail address and the record id are left blank here
    Entrez.email = " "
    handle = Entrez.efetch(db="nucleotide", rettype="fasta", retmode="text", id=" ")
    seq_record = SeqIO.read(handle, "fasta")
    handle.close()
    print("%s with %i features" % (seq_record.id, len(seq_record.features)))
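For orientation, the efetch call above corresponds to an HTTP request against NCBI's E-utilities. The sketch below only assembles such a URL with the standard library and does not contact the network. The helper build_efetch_url, the record ID EXAMPLE_ID and the address user@example.org are placeholders invented for the example, and Bio.Entrez may send additional parameters that are not shown here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(db, record_id, rettype, retmode, email):
    """Assemble an efetch URL from the same arguments the slide passes."""
    params = {
        "db": db,
        "id": record_id,
        "rettype": rettype,
        "retmode": retmode,
        "email": email,
    }
    return EFETCH_URL + "?" + urlencode(params)


# Placeholder values; substitute a real record ID and your own address.
url = build_efetch_url("nucleotide", "EXAMPLE_ID", "fasta", "text",
                       "user@example.org")
query = parse_qs(urlsplit(url).query)
print(url)
print(query)
```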

12 Sequence I/O
How to find sequence information:
    from Bio import SeqIO

    orchid_dict = SeqIO.to_dict(SeqIO.parse("ls_orchid.fasta", "fasta"))
This creates a Python dictionary, keyed by record ID by default, with each entry held as a SeqRecord object in memory.
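The idea behind the dictionary on this slide can be sketched in a few lines: walk over the records once and index each one under a key. The version below is a simplified illustration working on made-up (id, sequence) tuples instead of SeqRecord objects. Like SeqIO.to_dict, it refuses duplicate keys, but the rest is an assumption for the example.

```python
def to_dict(records, key_function=lambda record: record[0]):
    """Index records by key, raising ValueError if a key appears twice."""
    result = {}
    for record in records:
        key = key_function(record)
        if key in result:
            raise ValueError("Duplicate key '%s'" % key)
        result[key] = record
    return result


# Made-up records standing in for parsed FASTA entries.
records = [("seq1", "GATCGATC"), ("seq2", "AAGC")]
record_dict = to_dict(records)

print(sorted(record_dict))   # the available keys
print(record_dict["seq2"])   # direct lookup by ID
```

Since every record stays in memory, this approach suits small and medium files; the trade-off is fast random access by ID.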

