EMBL-EBI, programmatically - take a REST from manual searching: Sequence analysis tools Web Production Team Anna Foix Joon Lee
Contents
EBI Tools RESTful Clients
Where to get the clients
How to run the clients
How to use a simple pipeline manually
Where to get help and support
RESTful Clients
RESTful Web Services enable applications and services to run over the web.
They are accessed through URIs, which provide a uniform interface.
Clients exist in a variety of programming languages.
Resources are decoupled from their representation and can be accessed in a broad variety of formats such as HTML, XML, text and JSON.
Users no longer need to maintain their own applications and databases.
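As a rough illustration of the URI-based access described above, the sketch below builds the URIs a client would typically use for an asynchronous job: submit, poll the status, fetch a result. The base URL and the `run` / `status` / `result` path layout are assumptions taken from the public EBI Tools REST documentation, not from these slides, and the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumed base URL of the EBI Tools REST services (not stated in the slides).
BASE_URL = "https://www.ebi.ac.uk/Tools/services/rest"


def run_url(tool):
    """URI a job is submitted to (HTTP POST)."""
    return f"{BASE_URL}/{quote(tool)}/run"


def status_url(tool, job_id):
    """URI that reports the state of a submitted job."""
    return f"{BASE_URL}/{quote(tool)}/status/{quote(job_id)}"


def result_url(tool, job_id, result_type):
    """URI of one representation (e.g. 'ids', 'out') of a finished job."""
    return f"{BASE_URL}/{quote(tool)}/result/{quote(job_id)}/{quote(result_type)}"


def submit_job(tool, params):
    """POST form-encoded parameters and return the response body as text.

    This performs a network call and assumes the service answers with a
    plain-text job identifier; it is not executed in this sketch.
    """
    data = urlencode(params).encode("utf-8")
    request = Request(run_url(tool), data=data, method="POST")
    with urlopen(request) as response:
        return response.read().decode("utf-8").strip()
```

The same resource (a job result) is reached through one URI pattern, and only the last path segment changes to select a different representation.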
EBI Tools Clients
Web Services in bioinformatics fall into many categories (e.g. sequence similarity search, multiple sequence alignment, data retrieval).
Ready-to-use clients exist in Python, Perl, Java, Ruby, C, C# and other languages.
The clients, along with detailed information about our Web Services, are available at https://www.ebi.ac.uk/Tools/webservices
Where to get the clients https://www.ebi.ac.uk/Tools/webservices
How to run the clients
How to run NCBI BLAST
How to run EMBOSS Water
Stitching things together Using a shell script Using CWL (Common Workflow Language)
How to run the clients as a workflow
Running blastp against UniProtKB/Swiss-Prot
Getting the IDs from blastp
Fetching sequences using dbfetch
Running Clustal Omega
Running Simple Phylogeny
Workflow by a shell script
../ncbiblast/ncbiblast_lwp.pl --email $email --program blastp --database uniprotkb_swissprot --stype protein --sequence $inputSeq --outformat ids --outfile $$
cat $$.ids.txt | head -5 | sed 's/SP\://g;' | tr '\n' ',' | ../dbfetch_lwp.pl fetchBatch uniprot - fasta raw >> $$.fasta
../clustalo/clustalo_lwp.pl --email $email --sequence ……..
../simple_phylogeny/simple_phylogeny_lwp.pl --email $email --sequence $$.aln-clustal.clustal --outformat tree --outfile $$
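The `head -5 | sed | tr` stage of the script takes the first five hit IDs, strips the `SP:` database prefix and joins them with commas for dbfetch. A minimal Python sketch of that one step follows; the function name and the sample IDs are made-up placeholders, not real BLAST output.

```python
def clean_ids(ids_text, limit=5, prefix="SP:"):
    """Return the first `limit` IDs, prefix removed, as a comma-separated string.

    Unlike the shell pipeline, blank lines are skipped and no trailing
    comma is produced.
    """
    lines = [line.strip() for line in ids_text.splitlines() if line.strip()]
    return ",".join(line.replace(prefix, "") for line in lines[:limit])


# Placeholder IDs in the "SP:<identifier>" shape the script expects.
sample = "SP:ID1\nSP:ID2\nSP:ID3\nSP:ID4\nSP:ID5\nSP:ID6\n"
batch = clean_ids(sample)
```

Here `batch` holds the five identifiers that would be passed on to the batch fetch; the sixth line is dropped by the limit.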
Workflow by CWL script
steps:
  ncbiblast_step:
    run: '../ncbiblast/ncbiblast.cwl'
  dbfetch:
    run: '../dbfetch/dbfetch.cwl'
    in:
      accessions: ncbiblast_step/ids
    out: [aligned_sequences]
  clustalo_step:
    run: '../clustalo/clustalo.cwl'
    in:
      sequences: dbfetch/aligned_sequences
    out: [clustalo_out]
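In the CWL workflow, each `in:` entry of the form `step/output` names the upstream step whose output feeds the current one, so the execution order follows from the data flow and not from the order in which the steps are written. The sketch below illustrates that idea only; it is not how any CWL runner is implemented. The `steps` dictionary is a hand-written stand-in for the workflow above, and `execution_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hand-written mirror of the workflow steps and their `in:` sources.
steps = {
    "clustalo_step": {"in": {"sequences": "dbfetch/aligned_sequences"}},
    "dbfetch": {"in": {"accessions": "ncbiblast_step/ids"}},
    "ncbiblast_step": {"in": {}},
}


def execution_order(steps):
    """Order steps so that every step runs after the steps it reads from."""
    graph = {}
    for name, step in steps.items():
        sources = step.get("in", {}).values()
        # "step/output" refers to another step; a bare name would be a
        # workflow-level input and creates no dependency.
        graph[name] = {src.split("/", 1)[0] for src in sources if "/" in src}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
```

Even though the dictionary lists the steps in reverse, `order` places the BLAST step first, then dbfetch, then Clustal Omega.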
NCBI BLAST CWL
inputs:
  command:
    type: File
    inputBinding:
    default:
      class: File
      location: ncbiblast_lwp.pl
  email:
    type: string
    inputBinding:
      prefix: --email
    default: 'joonlee@ebi.ac.uk'
  program:
    type: string
    inputBinding:
      prefix: --program
    default: 'blastp'
  database:
    type: string
    inputBinding:
      prefix: --database
    default: 'uniprotkb_swissprot'
  type:
    type: string
    inputBinding:
      prefix: --stype
    default: 'protein'
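Each input's `inputBinding.prefix` and `default` determine one flag/value pair on the command line of the wrapped client. The sketch below shows that mapping in a deliberately simplified form, assuming string inputs only and ignoring positions and the other binding options a real CWL runner handles. `build_argv` is a hypothetical helper, and the e-mail address is a placeholder.

```python
def build_argv(script, inputs, values=None):
    """Turn prefix/default input descriptions into an argument list."""
    values = values or {}
    argv = [script]
    for name, spec in inputs.items():
        # A supplied value overrides the declared default.
        value = values.get(name, spec.get("default"))
        if value is None:
            continue  # no value and no default: the flag is omitted
        argv.extend([spec["prefix"], str(value)])
    return argv


# Prefixes and defaults copied from the CWL inputs above (e-mail left unset).
blast_inputs = {
    "email": {"prefix": "--email", "default": None},
    "program": {"prefix": "--program", "default": "blastp"},
    "database": {"prefix": "--database", "default": "uniprotkb_swissprot"},
    "type": {"prefix": "--stype", "default": "protein"},
}

argv = build_argv("ncbiblast_lwp.pl", blast_inputs, {"email": "user@example.org"})
```

The resulting `argv` has the same flags as the first command of the shell-script version of the workflow, minus the sequence and output options that this CWL fragment does not show.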
Clustal Omega CWL
inputs:
  command:
    type: File
    inputBinding:
    default:
      class: File
      location: clustalo_lwp.pl
  email:
    type: string
    inputBinding:
      prefix: --email
    default: 'joonlee@ebi.ac.uk'
  outformat:
    type: string
    inputBinding:
      prefix: --outformat
    default: 'aln-clustal'
  sequences:
    type: File
    inputBinding:
      prefix: --sequence
outputs:
  clustalo_out:
    type: stdout
More help and support https://www.ebi.ac.uk/Tools/webservices http://www.ebi.ac.uk/support