Important modules: Biopython, SQL & COM

Information sources
python.org: tutor list (for beginners), the Python Package Index, on-line help, tutorials, links to other documentation, and more.
biopython.org (and its mailing list)
The newsgroup comp.lang.python

Biopython
Collection of many bioinformatics modules.
Some well tested, some experimental.
Check with biopython.org before writing new software. It may already exist.
Contribute your code (even useful scripts) to them.

The Seq object

    >>> from Bio import Seq
    >>> seq = Seq.Seq("ATGCATGCATGATGATCG")
    >>> print seq
    Seq('ATGCATGCATGATGATCG', Alphabet())
    >>>

Alphabet? What’s that? Python doesn’t know that you gave it DNA. (It could be a strange protein.)

Alphabets

    >>> from Bio import Seq
    >>> from Bio.Alphabet import IUPAC
    >>> protein = Seq.Seq("ATGCATGCATGC", IUPAC.protein)
    >>> dna = Seq.Seq("ATGCATGCATGC", IUPAC.unambiguous_dna)
    >>> protein[:10]
    Seq('ATGCATGCAT', IUPACProtein())
    >>> protein[:10] + protein[::-1]
    Seq('ATGCATGCATCGTACGTACGTA', IUPACProtein())
    >>> dna[:6]
    Seq('ATGCAT', IUPACUnambiguousDNA())
    >>> dna[0]
    'A'
    >>> protein[:10] + dna[:6]
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/usr/local/lib/python2.3/site-packages/Bio/Seq.py", line 45, in __add__
        raise TypeError, ("incompatable alphabets", str(self.alphabet),
    TypeError: ('incompatable alphabets', 'IUPACProtein()', 'IUPACUnambiguousDNA()')
    >>>
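
The alphabet check behind that traceback can be illustrated without Biopython. The following is a minimal sketch in current Python, not Biopython's implementation: the class name TypedSeq and the alphabet labels are hypothetical, chosen only to show how carrying an alphabet alongside the letters lets concatenation refuse mixed types.

```python
class TypedSeq:
    """Toy sequence that remembers its alphabet (hypothetical, not Bio.Seq)."""

    def __init__(self, data, alphabet="generic"):
        self.data = data
        self.alphabet = alphabet

    def __getitem__(self, index):
        # A slice keeps the alphabet; a single index returns a plain letter.
        if isinstance(index, slice):
            return TypedSeq(self.data[index], self.alphabet)
        return self.data[index]

    def __add__(self, other):
        # Refuse to join sequences whose alphabets differ.
        if self.alphabet != other.alphabet:
            raise TypeError(
                "incompatible alphabets: %s and %s" % (self.alphabet, other.alphabet)
            )
        return TypedSeq(self.data + other.data, self.alphabet)

    def __repr__(self):
        return "TypedSeq(%r, %r)" % (self.data, self.alphabet)


protein = TypedSeq("ATGCATGCATGC", "protein")
dna = TypedSeq("ATGCATGCATGC", "dna")

joined = protein[:10] + protein[::-1]  # same alphabet: allowed
try:
    protein[:10] + dna[:6]             # mixed alphabets: rejected
except TypeError as err:
    print("rejected:", err)
```

The same letters ("ATGCATGCATGC") are valid in both alphabets, which is exactly why the type has to be stated rather than guessed.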

Translation

    >>> from Bio import Seq
    >>> from Bio.Alphabet import IUPAC
    >>> from Bio import Translate
    >>>
    >>> standard_translator = Translate.unambiguous_dna_by_id[1]
    >>> seq = Seq.Seq("GATCGATGGGCCTATATAGGATCGAAAATCGC",
    ...               IUPAC.unambiguous_dna)
    >>> standard_translator.translate(seq)
    Seq('DRWAYIGSKI', HasStopCodon(IUPACProtein(), '*'))
    >>>
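
What the standard translator does can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Biopython's code: the names CODON_TABLE and translate are made up here, the table is the standard genetic code (NCBI table 1, the id used above), and the input is assumed to be uppercase, unambiguous DNA. A trailing incomplete codon is dropped.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate unambiguous DNA codon by codon; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


print(translate("GATCGATGGGCCTATATAGGATCGAAAATCGC"))  # DRWAYIGSKI
```

The result matches the protein shown in the session above; the final two bases (GC) do not form a codon and are ignored.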

Reading sequence files
We’ve put a lot of work into reading common bioinformatics file formats.
As the formats change, we update our parsers.
There’s (almost) no reason for you to write your own GenBank, SWISS-PROT, ... parser!

Reading a FASTA file

    >>> from Bio import Fasta
    >>> parser = Fasta.RecordParser()
    >>> infile = open("ls_orchid.fasta")
    >>> iterator = Fasta.Iterator(infile, parser)
    >>> record = iterator.next()
    >>> record.title
    'gi| |emb|Z |CIZ78533 C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DNA'
    >>> record.sequence
    'CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTGATGAGACCGTGGAATAAAC
    GATCGAGTGAATCCGGAGGACCGGTGTACTCAGCTCACCGGGGGCATTGCTCCCGTGGT
    GACCCTGATTTGTTGTTGGGCCGCCTCGGGAGCGTCCATGGCGGGTTTGAACCTCTAGCC
    CGGCGCAGTTTGGGCGCCAAGCCATATGAAAGCATCACCGGCGAATGGCATTGTCTTCCC
    CAAAACCCGGAGCGGCGGCGTGCTGTCGCGTGCCCAATGAATTTTGATGACTCTCGCAAA
    CGGGAATCTTGGCTCTTTGCATCGGATGGAAGGACGCAGCGAAA
    TGCGATAAGTGGTGTGAATTG
    CAAGATCCCGTGAACCATCGAGTCTTTTGAACGCAAGTTGCGCCCGAGGCCATCAGGCTAAGGGCACGCCTGCTTGG
    GCGTCGCGCTTCGTCTCTCTCCTGCCAATGCTTGCCCGGCATACAGCCAGGCCGGCGTGGTGCGGATGTGAAAGATT
    GGCCCCTTGTGCCTAGGTGCGGCGGGTCCAAGAGCTGGTGTTTTGATGGCCCGGAACCCGGCAAGAGGTGGACGGA
    TGCTGGCAGCAGCTGCCGTGCGAATCCCCCATGTTGTCGTGCTTGTCGGACAGGCAGGAGAACCCTTCCGAACCCCA
    ATGGAGGGCGGTTGACCGCCATTCGGATGTGACCCCAGGTCAGGCGGGGGCACCCGCTGAGTTTACGC'

Reading all records

    >>> from Bio import Fasta
    >>> parser = Fasta.RecordParser()
    >>> infile = open("ls_orchid.fasta")
    >>> iterator = Fasta.Iterator(infile, parser)
    >>> while 1:
    ...     record = iterator.next()
    ...     if not record:
    ...         break
    ...     print record.title[record.title.find(" ")+1:-1]
    ...
    C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DN
    C.californicum 5.8S rRNA gene and ITS1 and ITS2 DN
    C.fasciculatum 5.8S rRNA gene and ITS1 and ITS2 DN
    C.margaritaceum 5.8S rRNA gene and ITS1 and ITS2 DN
    C.lichiangense 5.8S rRNA gene and ITS1 and ITS2 DN
    C.yatabeanum 5.8S rRNA gene and ITS1 and ITS2 DN
    .... additional lines removed ....
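
The record-by-record pattern can be sketched in plain, current Python. This is an illustration only: it is not Biopython's parser and not a substitute for it (the advice to use the maintained parsers for real work still stands). The function name iter_fasta and the two sample records are made up.

```python
import io


def iter_fasta(handle):
    """Yield (title, sequence) pairs from an open FASTA handle."""
    title, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header line closes the previous record, if any.
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif line and title is not None:
            chunks.append(line)
    # Emit the last record once the input is exhausted.
    if title is not None:
        yield title, "".join(chunks)


# Made-up sample data standing in for a real FASTA file.
sample = io.StringIO(
    ">seq1 first example record\nACGT\nTTGA\n"
    ">seq2 second example record\nGGCC\n"
)

records = list(iter_fasta(sample))
for title, sequence in records:
    # Everything after the first space is the free-text description.
    print(title[title.find(" ") + 1:], len(sequence))
```

Because iter_fasta is a generator, a for loop ends cleanly at the last record, with no need for an explicit "if not record: break" test.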

Reading a GenBank file

    >>> from Bio import GenBank
    >>> parser = GenBank.RecordParser()
    >>> infile = open("input.gb")
    >>> iterator = GenBank.Iterator(infile, parser)
    >>> record = iterator.next()
    >>> record.locus
    '10A19I'
    >>> record.organism
    'Oryza sativa (japonica cultivar-group)'
    >>> len(record.features)
    31
    >>> record.features[0].key
    'source'
    >>> record.features[0].location
    ' '
    >>> record.taxonomy
    ['Eukaryota', 'Viridiplantae', 'Streptophyta', 'Embryophyta',
     'Tracheophyta', 'Spermatophyta', 'Magnoliophyta', 'Liliopsida',
     'Poales', 'Poaceae', 'Ehrhartoideae', 'Oryzeae', 'Oryza']
    >>>

Compared with the FASTA example, only the format name changed.

Get data over the web
Python includes the ‘urllib2’ module (successor to urllib) to fetch data given a URL.
It can handle GET and POST requests for HTTP and HTTPS, do FTP, and read local files.
Several web servers have an interface for programs to get data directly from the server instead of going through the HTML page meant for humans.
But most of the time we have to do some screen-scraping.
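
In Python 3 the urllib2 functionality lives in urllib.request and urllib.parse. The sketch below deliberately stays offline: it reads a temporary local file through a file: URL, and it only constructs (without sending) a GET and a POST request. The host example.org and the query parameters are placeholders, not a real service.

```python
import pathlib
import tempfile
import urllib.parse
import urllib.request

# Reading a local file through a URL.
with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "demo.txt"
    path.write_bytes(b">demo\nACGT\n")
    with urllib.request.urlopen(path.as_uri()) as response:
        local_text = response.read().decode("ascii")

# A GET request carries its parameters in the URL ...
params = urllib.parse.urlencode({"db": "nucleotide", "term": "orchid"})
get_request = urllib.request.Request("https://example.org/search?" + params)

# ... while a POST request carries them in the request body.
post_request = urllib.request.Request(
    "https://example.org/search", data=params.encode("ascii")
)

print(local_text)
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)
```

Passing either Request object to urllib.request.urlopen would send it; that step is left out so the example does not depend on a network connection.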

Live demos!
I’m off the net here at Dan’s house on a Mac, so I can’t test the following examples.
Hopefully I’ve had time to prepare them during the break.

NCBI’s EUtils
Designed for software to access NCBI’s databases directly.
Can query literature and sequence databases.
Biopython includes a library for working with it. Sadly, it’s poorly documented.
The module for this is Bio.EUtils.
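
A request to the E-utilities is an ordinary URL with query parameters. As a rough sketch (it builds the URL but does not contact NCBI), an ESearch query could be assembled as follows. The helper name esearch_url and the example search term are illustrative only; recent Biopython releases wrap these calls in Bio.Entrez rather than Bio.EUtils.

```python
import urllib.parse

# Base address of NCBI's E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def esearch_url(db, term, retmax=20):
    """Build (but do not send) an ESearch query URL; hypothetical helper."""
    query = urllib.parse.urlencode({"db": db, "term": term, "retmax": retmax})
    return EUTILS_BASE + "esearch.fcgi?" + query


url = esearch_url("pubmed", "biopython")
print(url)
```

Fetching the URL (for example with urllib.request.urlopen) returns XML listing the matching record identifiers, which can then be passed to the other E-utilities.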

Remote BLAST (at NCBI)
Biopython includes a library to run a job on NCBI’s BLAST server.
To your program it looks just like a normal function call.
The module for this is Bio.Blast.NCBIWWW.

BLAST (locally)
If you instead want to use a local installation of BLAST, you can use Bio.Blast.NCBIStandalone.

Microsoft/COM/Excel
Python runs on Unix, Macs, Microsoft Windows, and more.
Python support for Windows is very good. “The most Microsoft compliant language outside Redmond.”
COM is how one program can communicate with another under Windows.
It’s very easy to have Python talk to Excel to load or get data from a spreadsheet.
For more information, see the book “Python Programming on Win32” by Mark Hammond and Andy Robinson.

SQL
Python can connect to different databases (like MySQL, PostgreSQL, and Oracle).
Usually there are two ways to talk to the database: directly, using the database-specific interface, or indirectly, through a Python adapter which tries to hide the differences between the databases.
There is a standard open-source database schema for bioinformatics called BioSQL. Python supports it.
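
The adapter route usually means the Python DB-API (PEP 249): connect, get a cursor, execute, fetch. Here is a minimal sketch using the sqlite3 module from the standard library, so that it runs without a database server. The table, its columns, and the rows are invented for illustration and have nothing to do with the BioSQL schema.

```python
import sqlite3

# An in-memory database; a server-backed driver would take host/user details.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute(
    "CREATE TABLE sequences (name TEXT PRIMARY KEY, organism TEXT, length INTEGER)"
)
cur.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [
        ("seq1", "Oryza sativa", 1200),
        ("seq2", "Oryza sativa", 850),
        ("seq3", "Zea mays", 430),
    ],
)
conn.commit()

# Parameters are passed separately from the SQL text, never pasted into it.
cur.execute(
    "SELECT name, length FROM sequences WHERE organism = ? ORDER BY length DESC",
    ("Oryza sativa",),
)
rows = cur.fetchall()
print(rows)

conn.close()
```

Drivers for MySQL, PostgreSQL, and Oracle follow the same connect/cursor/execute pattern, though the placeholder syntax (here "?") can differ between them.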