Published by Amice Cleopatra Harrington. Modified over 6 years ago.
1
BioPython
http://biopython.org/wiki/Biopython
Download & Installation
Documentation
2
BioPython
Key features:
Sequences
Sequence Annotation
I/O Operations
Accessing online databases
Multiple sequence alignments
BLAST
and many more …
3
quickstart: Sequence objects
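Simple example:
from Bio.Seq import Seq
# Bio.Alphabet was removed in Biopython 1.78; on newer releases drop
# this import, the alphabet argument and the .alphabet attribute.
from Bio.Alphabet import IUPAC
dna_sequence = Seq('AGGCTTCTCGTA', IUPAC.unambiguous_dna)
print(dna_sequence)
print(dna_sequence.alphabet)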
4
quickstart: parsing sequences
Simple example:
from Bio import SeqIO
# SeqIO.parse(file, format)
for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    print(seq_record.id)
    print(repr(seq_record.seq))
    print(len(seq_record))
5
sequence objects
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
# Seq(sequence, alphabet)
dna_sequence = Seq('AGGCTTCTCGTA', IUPAC.unambiguous_dna)
# sequences work like strings
for index, letter in enumerate(dna_sequence):
    print("%i %s" % (index, letter))
# slicing of sequences
print(dna_sequence[2:7])
# striding of sequences
print(dna_sequence[0::3])
print(dna_sequence[1::3])
# turning sequences into strings
my_seq = str(dna_sequence) + "ATTAATTG"
fasta_format_string = ">Name\n%s\n" % my_seq
print(fasta_format_string)
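Because a Seq slices and strides like a string, the same indexing can be tried on a plain Python string. The sketch below does not use Biopython; `codon_positions` is a hypothetical helper name introduced only for illustration. It shows how a stride of 3 picks out the first, second and third codon positions.

```python
def codon_positions(sequence):
    """Return the letters at the 1st, 2nd and 3rd codon positions."""
    return sequence[0::3], sequence[1::3], sequence[2::3]


dna = "AGGCTTCTCGTA"

# Slicing: letters at indices 2 to 6 (the end index is excluded).
print(dna[2:7])  # GCTTC

# Striding: every third letter, starting at offsets 0, 1 and 2.
first, second, third = codon_positions(dna)
print(first, second, third)  # ACCG GTTT GTCA
```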
6
sequence objects
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
my_seq = Seq("GATCGATGGGCCTATATAGGATCGAAAATCGC", IUPAC.unambiguous_dna)
print(my_seq)
# making complements
print(my_seq.complement())
print(my_seq.reverse_complement())
# making mRNA: transcribe() replaces T with U in the coding strand
messenger_rna = my_seq.transcribe()
print(messenger_rna)
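What the complement, reverse complement and transcription operations do to the letters can be sketched in plain Python. This is an illustration under simplifying assumptions (only the unambiguous letters A, C, G, T are handled), not Biopython's implementation; the function names are chosen here to mirror the method names on the slide.

```python
# Plain-Python sketch; Biopython is not used here.
# Only unambiguous DNA letters are mapped (an assumption of this sketch).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(dna):
    """Swap each base for its pairing partner (A<->T, C<->G)."""
    return dna.translate(_COMPLEMENT)


def reverse_complement(dna):
    """Complement the strand, then read it in the opposite direction."""
    return complement(dna)[::-1]


def transcribe(coding_dna):
    """Turn a coding-strand DNA string into mRNA by replacing T with U."""
    return coding_dna.replace("T", "U").replace("t", "u")


my_seq = "GATCGATGGGCCTATATAGGATCGAAAATCGC"
print(complement(my_seq))          # CTAGCTACCCGGATATATCCTAGCTTTTAGCG
print(reverse_complement(my_seq))  # GCGATTTTCGATCCTATATAGGCCCATCGATC
print(transcribe(my_seq))
```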
7
sequence objects
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
# translation from mRNA
messenger_rna = Seq("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG", IUPAC.unambiguous_rna)
print(messenger_rna)
print(messenger_rna.translate())
# translation directly from coding DNA
coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG", IUPAC.unambiguous_dna)
print(coding_dna.translate())
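Translation reads the sequence three letters at a time and looks each codon up in a genetic code table. The following plain-Python sketch builds the standard genetic code and applies it to the slide's sequence. It is an illustration, not Biopython's code: the `translate` function and its `to_stop` flag are defined here for the example, and ambiguous letters are not handled.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence, to_stop=False):
    """Translate DNA or RNA codon by codon; '*' marks a stop codon."""
    dna = sequence.upper().replace("U", "T")
    protein = []
    # Ignore any trailing letters that do not form a complete codon.
    for start in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if to_stop and amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


messenger_rna = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
print(translate(messenger_rna))                # MAIVMGR*KGAR*
print(translate(messenger_rna, to_stop=True))  # MAIVMGR
```

The mRNA and the coding DNA give the same protein because the only difference between them is U in place of T.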
8
SeqRecord object
.seq  the sequence itself, typically a Seq object
.id  primary id, string
.name  common name, string
.description  human-readable description, string
.letter_annotations  per-letter annotations, a (restricted) dictionary whose values are Python sequences with the same length as the sequence
.annotations  additional information, dictionary
.features  a list of SeqFeature objects with more structured information about the features on a sequence (e.g. position of genes on a genome, or domains on a protein sequence)
.dbxrefs  database cross-references, list of strings
9
SeqRecord object
from scratch
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
simple_seq = Seq("GATC")
simple_seq_r = SeqRecord(simple_seq)
simple_seq_r.id = "1234"
simple_seq_r.description = "Made up sequence"
print(simple_seq_r)
reading the information
from Bio import SeqIO
record = SeqIO.read("NC_ fna", "fasta")
print(record)
10
Sequence I/O
Parsing from file
from Bio import SeqIO
# SeqIO.parse(handle, format)
for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    print(seq_record.id)
    print(repr(seq_record.seq))
    print(len(seq_record))
Or using a list comprehension:
identifiers = [seq_record.id for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta")]
print(identifiers)
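To make concrete what iterating over a FASTA file means, here is a minimal plain-Python sketch of a FASTA reader. It is not how SeqIO.parse is implemented and it yields simple (identifier, sequence) tuples rather than SeqRecord objects; `parse_fasta` and the two made-up records are assumptions introduced for this example.

```python
import io


def parse_fasta(handle):
    """Yield (identifier, sequence) pairs from a FASTA-formatted text handle."""
    identifier, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            # The identifier is the first word after '>'.
            header = line[1:].split()
            identifier = header[0] if header else ""
            chunks = []
        elif identifier is not None:
            # Sequence lines may be wrapped; collect them all.
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


# Made-up records standing in for a file on disk.
example = io.StringIO(">seq1 first made-up record\nGATC\nGATC\n>seq2\nAGGCT\n")
records = list(parse_fasta(example))
for identifier, sequence in records:
    print(identifier, len(sequence))
```

Like SeqIO.parse, the generator produces one record at a time, so a large file does not have to be held in memory.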
11
Sequence I/O
Parsing from the web
from Bio import Entrez
from Bio import SeqIO
# always tell NCBI who you are
Entrez.email = " "
handle = Entrez.efetch(db="nucleotide", rettype="fasta", retmode="text", id=" ")
seq_record = SeqIO.read(handle, "fasta")
handle.close()
print("%s with %i features" % (seq_record.id, len(seq_record.features)))
12
Sequence I/O
How to find sequence information
from Bio import SeqIO
orchid_dict = SeqIO.to_dict(SeqIO.parse("ls_orchid.fasta", "fasta"))
Creates a Python dictionary with each entry held in memory as a SeqRecord object.
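The idea behind the dictionary is a lookup from record identifier to record. The plain-Python sketch below uses made-up (identifier, sequence) pairs in place of SeqRecord objects; `build_index` is a hypothetical name for this example. SeqIO.to_dict raises a ValueError when two records share a key, and the sketch follows the same rule.

```python
def build_index(records):
    """Map each identifier to its sequence, rejecting duplicate identifiers."""
    index = {}
    for identifier, sequence in records:
        if identifier in index:
            raise ValueError("Duplicate key '%s'" % identifier)
        index[identifier] = sequence
    return index


# Made-up records standing in for parsed FASTA entries.
records = [("seq1", "GATCGATC"), ("seq2", "AGGCT")]
index = build_index(records)

print(sorted(index))   # ['seq1', 'seq2']
print(index["seq2"])   # AGGCT
```

Everything is held in memory at once, so this approach suits small and medium-sized files.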