Web Services
Programmatic access to EMBL-EBI resources
Andrew Cowley, andrew.cowley@ebi.ac.uk
www.ebi.ac.uk/webservices
Web interfaces are great for several uses:
- Single jobs
- Hypothesis testing
- Publications
- Training
But what if I want to examine every sequence in a proteome? Or my data needs lots of repetitive analysis?
Solution 1: local install
You could download programs and data to your own computer or server:
- Lots of control
- Flexible
- Add your own programs/data
But data is large:
- Storage cost
- Maintenance/updating cost
- Just how friendly are you with your local IT?!
Solution 2: script against websites
Literally automating the pasting of data and clicking ‘run’?! NO! (Well, please don’t.)
Instead, resources expose APIs (Application Programming Interfaces) via Web Services:
- Use EMBL-EBI's compute resources, and up-to-date programs and data
Web Services
- Well-established protocols: REST and SOAP
- Built-in function help
- Large range of example clients available to download and run as is
- ‘Easy’ to plug into existing workflow tools
- Many third-party tools also exist that use our Web Services behind the scenes
(In fact they’re so nice, we use them a lot ourselves, either behind our own web interface or to chain together bioinformatics services.)
EBI Search WS usage: results enrichment
Sequence similarity search tool results display referenced data, e.g. for a BLAST/FASTA search.
Web Service examples Client front end and results, WP engine (UniProt, InterProScan)
Web Service examples External front end and results
EBI Search
How analysis tools are called (chart: jobs per month)
Web Services Synchronous & asynchronous
Web Services - synchronous
Web Services - asynchronous JobID “Is JobID done?” “Still cooking”
Web Services - asynchronous “Is JobID done?” “It’s ready!” Fetch JobID
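The "Is JobID done?" exchange above is a polling loop. A minimal sketch, assuming a caller-supplied `check_status` function that returns a status string for a job identifier; the function name, the polling interval and the `max_checks` limit are illustrative choices, not part of any EMBL-EBI client:

```python
import time


def wait_for_job(check_status, job_id, poll_seconds=5.0, max_checks=120,
                 pending=("RUNNING",), sleep=time.sleep):
    """Ask "is JobID done?" until the status is no longer a pending one.

    `pending` lists the statuses that mean "still cooking". Only RUNNING is
    assumed here; a real service may report further in-progress states.
    """
    for _ in range(max_checks):
        status = check_status(job_id)
        if status not in pending:
            return status          # for example FINISHED or ERROR
        sleep(poll_seconds)        # wait before asking again
    raise TimeoutError(f"{job_id} still pending after {max_checks} checks")
```

The caller fetches the results only when the returned status indicates success. Passing `check_status` and `sleep` as arguments keeps the loop independent of any particular service and easy to try out with a simulated one.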
Steps...
Meta-information
- List parameters
- Get parameter details → name, description, values...
Submission
- Run (email, title, values...) → job identifier
- Check status → RUNNING, FINISHED, ERROR...
Results analysis
- List results available → name, description, media type...
- Get result → output, text, binaries (images)...
Inputs: input parameters (for submission); job identifier (for status and results), e.g. iprscan-S20110708-094729-0726-35857540-pg
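Each step above can be mapped onto one HTTP request. The following is a rough sketch rather than an official client: the class name `ToolClient`, the helper `http_fetch`, the base URL and the method names (`parameters`, `parameterdetails`, `run`, `status`, `resulttypes`, `result`) are assumptions that should be checked against the Web Services documentation for the tool in question.

```python
import urllib.parse
import urllib.request

# Assumption: REST services live under this base URL, one path per tool.
BASE_URL = "https://www.ebi.ac.uk/Tools/services/rest"


def http_fetch(url, fields=None):
    """GET the URL, or POST the form fields when given; return text."""
    data = None
    if fields is not None:
        data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data) as response:
        return response.read().decode("utf-8")


class ToolClient:
    """Hypothetical thin wrapper: one method per step (text results only)."""

    def __init__(self, tool, fetch=http_fetch):
        self.root = f"{BASE_URL}/{tool}"
        self.fetch = fetch

    # Meta-information
    def parameters(self):
        return self.fetch(f"{self.root}/parameters")

    def parameter_details(self, name):
        return self.fetch(f"{self.root}/parameterdetails/{name}")

    # Submission
    def run(self, **fields):
        """Submit a job (email, title, values...) and return its identifier."""
        return self.fetch(f"{self.root}/run", fields).strip()

    def status(self, job_id):
        return self.fetch(f"{self.root}/status/{job_id}").strip()

    # Results analysis
    def result_types(self, job_id):
        return self.fetch(f"{self.root}/resulttypes/{job_id}")

    def result(self, job_id, result_type):
        return self.fetch(f"{self.root}/result/{job_id}/{result_type}")
```

A typical session would call `run(...)` once, call `status(job_id)` repeatedly until the job is no longer running, then use `result_types(job_id)` to choose what to download with `result(job_id, result_type)`. Binary results such as images would need the raw bytes rather than decoded text.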
Web Services role play…
Key resources
Documentation/example clients: http://www.ebi.ac.uk/Tools/webservices
- Pre-compiled clients available for a large variety of languages
SOAP
- WSDL: http://www.ebi.ac.uk/Tools/services/soap/{tool}?wsdl
- Most programming languages have SOAP client libraries
- Generate stubs or dynamically call methods
REST
- http://www.ebi.ac.uk/Tools/services/rest/{tool}/{method}/{params}
- Basic HTTP requests
- Web browser, HTTP client libraries, cURL...
Tools at EMBL-EBI
http://www.ebi.ac.uk/services
Analyse your own data via ~100 tools:
- Search across our databases for homologous sequences using a suite of tools, from classics such as BLAST and FASTA to recent advances such as PSI-Search
- Functionally annotate your sequence with protein domains and important sites using InterProScan
- Align your sequences to discover conserved regions with Clustal Omega, MUSCLE etc.
- Perform sequence translation, repeat analysis/masking, transmembrane topology prediction etc. with EMBOSS and other tools
- Results enriched with data from EBI resources
As well as providing data itself, EMBL-EBI enables you to analyse your own data using our resources and data collections. The wide-ranging tools are developed internally, externally and by collaboration with leading authors. They are incorporated into our web framework to give a consistent user experience, access to the latest data and use of cross-resource information, for example automatically annotating aligned regions of a BLAST protein search with domain and motif information. (There are also tools for things other than sequence; I’ve concentrated on sequence tools for brevity.)
Sequence similarity searching:
- Central to genome annotation
- Characterising protein families
- Exploring distant evolutionary relationships
Web Services exercise
www.ebi.ac.uk/~apc/Courses/ExploreBioSeq/
Linux command line help (open a terminal first):
cd                   = change directory/folder
cd ..                = go up a directory
mkdir                = make directory
cp <file1> <file2>   = copy
ls                   = list files in current directory
cat                  = view contents of a text file
chmod a+x <file>     = make file executable
./[program]          = run program in my current directory
[tab]                = auto-complete
[shift PgUp/PgDn]    = scroll up/down a page
[double click]       = select word
[right click]        = paste
Workflows
Workflows Galaxy Taverna
Web Services workflow
$ ncbiblast_lwp.pl --email email@example.org --program blastp --database uniprotkb_human --stype protein P01174.fasta
$ wsdbfetch_soaplite.pl fetchData @<jobid>.ids.txt fasta > P01174search2016_11_03.fasta
$ kalign_soaplite.pl --email email@example.org P01174search2016_11_03.fasta
Web Services workflow
$ ncbiblast_lwp.pl --email email@example.org --program blastp --database uniprotkb_human --stype protein P01174.fasta |
    wsdbfetch_soaplite.pl fetchData @- fasta |
    kalign_soaplite.pl --email email@example.org -
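The same chain can be driven from a script instead of a shell pipe. A minimal sketch, assuming the three example clients are installed and on the PATH and that each stage writes its output to standard output exactly as in the command line above; `build_commands` and `run_pipeline` are hypothetical helper names, and the arguments are copied from the slide rather than checked against the clients' own help.

```python
import subprocess


def build_commands(email, query_fasta):
    """Argument lists for the three stages: search, fetch entries, align."""
    return [
        ["ncbiblast_lwp.pl", "--email", email, "--program", "blastp",
         "--database", "uniprotkb_human", "--stype", "protein", query_fasta],
        ["wsdbfetch_soaplite.pl", "fetchData", "@-", "fasta"],
        ["kalign_soaplite.pl", "--email", email, "-"],
    ]


def run_pipeline(commands, run=subprocess.run):
    """Run each command in turn, feeding its stdout to the next one's stdin."""
    data = None  # the first stage reads no piped input
    for argv in commands:
        finished = run(argv, input=data, capture_output=True,
                       text=True, check=True)
        data = finished.stdout
    return data  # output of the last stage (the alignment)
```

Unlike the shell pipe, the stages here run one after another, and `check=True` stops the chain with an exception as soon as one stage exits with an error, so a failed search is not passed on to the alignment step.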
How to get help/contact us
- Most resources have Web Services or programmatic API links for more info
- Central collection of links/example clients: www.ebi.ac.uk/Tools/webservices
- Step-by-step instructions and example command lines: Lopez R, Cowley A, Li W, McWilliam H. “Using EMBL-EBI Services via Web Interface and Programmatically via Web Services.” Curr Protoc Bioinformatics. 2014 Dec 12;48:3.12.1-50. doi: 10.1002/0471250953.bi0312s48
- Helpdesk: www.ebi.ac.uk/support/