Essential BioPython Retrieving Sequences from the Web


1 Essential BioPython Retrieving Sequences from the Web
MARC: Developing Bioinformatics Programs
Alex Ropelewski, PSC-NRBSC
Bienvenido Vélez, UPR Mayaguez

2 The Entrez eFetch Function
Fetching a single GenBank sequence from the network:

    from Bio import SeqIO
    from Bio import Entrez

    # Please use your REAL address below:
    Entrez.email = "you@example.org"  # placeholder; replace with your own address
    handle = Entrez.efetch(db="nucleotide", rettype="gb", id="NM_000518")
    sr = SeqIO.read(handle, "genbank")
    print(sr.id)
    print(sr.seq)
    handle.close()

3 Main eFetch Function Parameters
Name        Req'd   Default       Description with Options
db          Y       N/A           Database to search** (e.g. nucleotide, protein, structure)
id          Y       N/A           Single or comma-separated list of unique IDs**
retmode     N       db specific   Data format for records returned* (text, xml or asn.1)
rettype     N       db specific   Data layout for records returned* (e.g. fasta, gp)
retstart    N                     Sequential index of first record to be retrieved
retmax      N       10,000        Max number of records to retrieve
seq_start   N       1             First sequence base to retrieve
seq_stop    N       length        Last sequence base to retrieve

*See the NCBI E-utilities documentation for a complete list of available retmode/rettype combinations.
**See the NCBI E-utilities documentation for a complete list of available Entrez databases and their corresponding unique identifiers.
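
To make the parameter table concrete, the sketch below shows how these parameters could end up in an eFetch request URL. It uses only the Python standard library; `build_efetch_url` is a hypothetical helper written for this illustration, not part of Biopython, and `Entrez.efetch` normally assembles the request for you.

```python
from urllib.parse import urlencode

# Base address of the NCBI eFetch E-utility.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(db, ids, **options):
    """Return an eFetch URL for a database, one or more IDs and optional parameters.

    Hypothetical helper for illustration only.
    """
    if isinstance(ids, str):
        ids = [ids]
    # db and id are the two required parameters; id is a comma-separated list.
    params = {"db": db, "id": ",".join(ids)}
    # Skip options left as None so the server falls back to its own defaults.
    params.update({name: value for name, value in options.items() if value is not None})
    return EFETCH_URL + "?" + urlencode(params)


url = build_efetch_url(
    "nucleotide",
    ["NM_000518", "AJ131351"],
    rettype="fasta",
    retmode="text",
    seq_start=1,
    seq_stop=60,
)
print(url)
```

Optional parameters such as `retstart` or `retmax` can be passed the same way as keyword arguments; anything omitted is simply left out of the query string.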

4 The Entrez eFetch Function
Fetch a single GenBank sequence from the network and save it:

    from Bio import SeqIO
    from Bio import Entrez

    InHandle = Entrez.efetch(db="nucleotide", rettype="gb", id="NM_000518")
    sr = SeqIO.read(InHandle, "genbank")
    OutHandle = open("NM_ gb", "w")
    SeqIO.write(sr, OutHandle, "genbank")
    InHandle.close()
    OutHandle.close()

5 The Entrez eFetch Function
Fetch a single GenBank sequence from the network and save it as FASTA:

    from Bio import SeqIO
    from Bio import Entrez

    InHandle = Entrez.efetch(db="nucleotide", rettype="gb", id="NM_000518")
    sr = SeqIO.read(InHandle, "genbank")
    OutHandle = open("NM_ fasta", "w")
    SeqIO.write(sr, OutHandle, "fasta")
    InHandle.close()
    OutHandle.close()
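
For readers who have not seen the FASTA layout that the "fasta" output format produces, here is a minimal standard-library sketch of one record: a `>` header line followed by the sequence wrapped to a fixed width. The `to_fasta` function, the 60-column width and the demo sequence are all assumptions made for this illustration; in practice `SeqIO.write` does this work.

```python
def to_fasta(seq_id, description, sequence, width=60):
    """Format one sequence as a FASTA record (hypothetical helper, illustration only)."""
    # Header line: '>' followed by the identifier and an optional description.
    header = f">{seq_id} {description}".rstrip()
    # Split the sequence into consecutive chunks of at most `width` characters.
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + chunks) + "\n"


# Made-up 80-base sequence, not real GenBank data.
record_text = to_fasta("demo1", "made-up sequence", "ACGT" * 20)
print(record_text, end="")
```

The GenBank-to-FASTA conversion on the slide keeps only the identifier, description and sequence; annotations and features in the GenBank record have no place in this layout.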

6 The Entrez eFetch Function
Fetch multiple GenBank sequences from the network and save them as FASTA:

    from Bio import SeqIO
    from Bio import Entrez

    accessions = "NM_000518, AJ131351"
    InHandle = Entrez.efetch(db="nucleotide", rettype="gb", id=accessions)
    seqRecords = SeqIO.parse(InHandle, "genbank")
    OutHandle = open("myseqs.fasta", "w")
    SeqIO.write(seqRecords, OutHandle, "fasta")
    InHandle.close()
    OutHandle.close()
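
When the accession list grows long, one possible approach is to split it into several comma-separated ID strings and issue one eFetch call per string. The sketch below shows only the splitting step with the standard library; `batch_ids` is a hypothetical helper and the batch size of 200 is an arbitrary assumption, not a documented limit.

```python
def batch_ids(accessions, batch_size=200):
    """Yield comma-separated ID strings covering at most batch_size accessions each.

    Hypothetical helper; batch_size=200 is an arbitrary choice for illustration.
    """
    for start in range(0, len(accessions), batch_size):
        yield ",".join(accessions[start:start + batch_size])


accession_list = ["NM_000518", "AJ131351"]

# With the default batch size both accessions fit in a single ID string.
single_batch = list(batch_ids(accession_list))
# A batch size of 1 gives one ID string per accession.
one_per_batch = list(batch_ids(accession_list, batch_size=1))

print(single_batch)
print(one_per_batch)
```

Each yielded string can be passed as the `id` argument of `Entrez.efetch`, exactly like the `accessions` string on the slide.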

7 The ExPASy and SwissProt Packages
Fetch SwissProt sequences from ExPASy and save them as FASTA:

    >>> from Bio import ExPASy
    >>> from Bio import SeqIO
    >>> accessions = ["O23729", "O23730", "O23731"]
    >>> records = []
    >>> for accession in accessions:
    ...     handle = ExPASy.get_sprot_raw(accession)
    ...     record = SeqIO.read(handle, "swiss")
    ...     records.append(record)
    ...
    >>> outHandle = open("mysequences.fasta", "w")
    >>> SeqIO.write(records, outHandle, "fasta")
    >>> outHandle.close()

