European Life Sciences Infrastructure for Biological Information
Rafael C Jimenez, ELIXIR CTO
EMBL-EBI workshop on networks and pathways 2014, Friday 23 June
An introduction to programmatic access
Query interfaces
Ways to reach the data of a remote resource:
– Graphical User Interface (GUI)
– FTP access
– Database access
– Application Programming Interface (API)
– Web Services
Users: biologists, bioinformaticians, developers
Query interfaces
– Interface: a program that allows users to interact with a system
– API: an interface that can be accessed using a specific programming language
– Web service: a web API available for multiple programming languages
This introduction is intended for a non-technical audience, with purposely simplified technical concepts.
Web Services
– A service on the server side that provides functionality
– Accessible over a network (the Internet)
– Meant for machine-to-machine communication
– Independent of programming languages
– Operated following specific rules (protocols: REST or SOAP)
Web Services
– Client: an application written by a user/developer
– Server: a web server hosting the web service
– “How should I invoke you?”: the documentation describes the methods and variables to query the service
– The client makes a request and the web service returns the results
REST Web Services
– Client: an application written by a developer
– Server: a web server hosting the web service
– “How should I invoke you?”: documentation, either an informal description or a formal description (WADL). Sometimes a WADL file is available on the server to describe the service.
– The client makes a request (GET or POST) and receives the results (data + status)
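A REST request and its result (data + status) can be sketched with Python's standard library. This is a minimal sketch, not the method used by any particular service: the URL is a placeholder, and the function names make_request and call are invented for this example. Note that urlopen raises an HTTPError for 4xx and 5xx status codes instead of returning them.

```python
import urllib.request

# Hypothetical endpoint used only for illustration; replace it with a real service URL.
EXAMPLE_URL = "https://example.org/service/search/query/p53"


def make_request(url, payload=None):
    """Build a GET request, or a POST request when a payload (bytes) is given."""
    method = "GET" if payload is None else "POST"
    return urllib.request.Request(url, data=payload, method=method)


def call(request, timeout=30):
    """Send the request and return (status code, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    request = make_request(EXAMPLE_URL)
    print(request.get_method(), request.full_url)
    # Uncomment to contact the server once EXAMPLE_URL points at a real service:
    # status, data = call(request)
    # print(status, data[:200])
```

Building the request is kept separate from sending it, so the method and URL can be inspected without any network access.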
SOAP Web Services
– Client: an application written by a developer
– Server: a web server hosting the web service
– “How should I invoke you?”: documentation, an informal description and a formal description (WSDL)
– The WSDL describes the methods, parameters and data of the SOAP request and SOAP response
– The client makes a SOAP request and receives the results (SOAP response)
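To give an idea of what a SOAP request looks like, here is a minimal sketch that builds a SOAP 1.1 envelope with Python's standard library. The service namespace, the method name "search" and the parameter name "query" are hypothetical; in a real service they are defined by its WSDL, and specialised SOAP client libraries normally generate this XML for you.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
# Hypothetical service namespace; a real WSDL defines the actual one.
SERVICE_NS = "http://example.org/hypothetical-service"


def build_soap_request(method, parameters):
    """Return a SOAP 1.1 envelope (as a string) that calls `method` with `parameters`."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in parameters.items():
        # Each parameter becomes a child element of the method element.
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    # Method and parameter names are invented for illustration.
    print(build_soap_request("search", {"query": "p53"}))
```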
SOAP vs. REST
REST
– Geared to simplicity.
– A browser can be a client.
– A request can be only as complex as a URL allows.
– Examples: REST query, WADL
SOAP
– Based on standards.
– Only accessed by specialized software.
– Allows description of complex data structures in the request and response.
– Example: WSDL
PSICQUIC REST queries
Bruno Aranda
– mint: /psicquic/webservices/current/search/query/p53
– intact: /webservices/current/search/query/p53
– chembl: /webservices/current/search/query/p
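The query URLs above all follow the same pattern: a service address, the search path, then the query. A minimal sketch of building such a URL follows. The host part is not shown on the slide, so BASE_URL is a placeholder, and the function name psicquic_query_url is invented for this example.

```python
from urllib.parse import quote

# Search path as shown on the slide for the intact service.
SEARCH_PATH = "/webservices/current/search/query/"
# Hypothetical host: the real service address is not given on the slide.
BASE_URL = "https://example.org/intact"


def psicquic_query_url(query, base_url=BASE_URL, search_path=SEARCH_PATH):
    """Append a percent-encoded query to the search path of a PSICQUIC-style service."""
    return base_url.rstrip("/") + search_path + quote(query, safe="")


if __name__ == "__main__":
    print(psicquic_query_url("p53"))
```

Percent-encoding the query matters as soon as it contains spaces, colons or quotes, which longer queries usually do.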
MIQL Bruno Aranda
MIQL
– …/query/species:rat
– …/query/brca AND rpa1
A query is built from terms (brca, rpa1), fields (species) and operators (AND).
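A MIQL query string can be assembled from its parts before it is placed in a URL. The sketch below assumes only the syntax visible in these slides (field:value, double quotes around multi-word values, boolean operators between terms); the helper names term and combine are invented for this example.

```python
from urllib.parse import quote


def term(value, field=None):
    """Format one MIQL term; values containing spaces are wrapped in double quotes."""
    text = f'"{value}"' if " " in value else value
    return f"{field}:{text}" if field else text


def combine(operator, *terms):
    """Join terms with a boolean operator such as AND."""
    return f" {operator} ".join(terms)


if __name__ == "__main__":
    query = combine(
        "AND",
        term("trypanosoma", "species"),
        term("two hybrid", "detmethod"),
    )
    print(query)
    # Percent-encoded form, suitable for the end of a REST query URL.
    print(quote(query, safe=""))
```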
PSICQUIC SOAP service
species:trypanosoma AND detmethod:"two hybrid"
Workflows
Workflow
– Sequence of tasks that produces a result of observable value
Workflow management system
– Computer system to compose and execute workflows
Workflow components
– Input
– Service
– Output
– Shims
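The idea of a workflow as a sequence of tasks can be sketched in a few lines of Python. This is only an illustration of the concept, not how a workflow management system is implemented; the two toy services and the input string are invented for this example.

```python
from functools import reduce


def run_workflow(services, workflow_input):
    """Pass the input through each service in order and return the final output."""
    return reduce(lambda data, service: service(data), services, workflow_input)


# Two toy services, invented for this example.
def service_a(text):
    """Split a comma-separated string into a list of identifiers."""
    return [item.strip() for item in text.split(",") if item.strip()]


def service_b(identifiers):
    """Count how many distinct identifiers were received."""
    return len(set(identifiers))


if __name__ == "__main__":
    # Input -> Service A -> Service B -> Output
    print(run_workflow([service_a, service_b], "brca, rpa1, brca"))
```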
Shims: Connecting services
– Match: the output of Service A fits the input of Service B, so the two connect directly
– Mismatch: a shim is placed between Service A and Service B
– Shims convert data formats and act as connectors
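A shim can be as small as one function that converts the output of one service into the input format of the next. In the sketch below both services and their data are made up: Service A returns tab-separated text, Service B only accepts a list, and the shim bridges the mismatch.

```python
def service_a():
    """Toy service returning tab-separated rows (made-up identifiers)."""
    return "db:A1\tdb:B2\ndb:A1\tdb:C3\n"


def service_b(identifiers):
    """Toy service that only accepts a list of identifiers."""
    if not isinstance(identifiers, list):
        raise TypeError("service_b expects a list of identifiers")
    return ", ".join(identifiers)


def shim(tab_separated_text):
    """Convert Service A's text output into the sorted, unique list Service B accepts."""
    identifiers = set()
    for line in tab_separated_text.splitlines():
        if line:
            identifiers.update(line.split("\t"))
    return sorted(identifiers)


if __name__ == "__main__":
    # Service A -> shim -> Service B
    print(service_b(shim(service_a())))
```

Without the shim, passing the text from service_a straight to service_b raises a TypeError, which is the mismatch case.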
myGrid
myGrid solutions
– Create and run workflows
– Share, discover and reuse workflows
– Discover and reuse services
BioCatalogue
– A public, centralised and curated registry of Life Science Web Services
– ‘Web 2.0’-style website and API
– Allows anyone to register, discover and curate Web Services
– Community oriented, with expert guidance
– Open content, open source, open platform
Paul Fisher, myGrid, University of Manchester
BioCatalogue’s Mission
Service Search
Taverna
– Workflow management system
– Java desktop application
– Open source and extensible
– Includes access to BioCatalogue and myExperiment
Screenshot labels: workflow diagram, tree view of workflow structure, available services
Sharing Experiments
A registry of workflows
You can share results/experiments/experiences with your
– Research group
– Collaborators
– Scientific community
Paul Fisher, myGrid, University of Manchester
myExperiment
Recycling, Reuse, Repurposing
– Paul writes workflows for identifying biological pathways implicated in resistance to trypanosomiasis.
– Paul meets Jo. Jo is investigating mouse whipworm infection.
– Jo reuses one of Paul’s workflows.
– Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite. A previous two-year manual study by Jo had failed to do this.
Workflows are protocols
Paul Fisher, myGrid, University of Manchester
Examples from myExperiment
– OLS
– PICR
– BioMart and microarray analysis
– ChEBI
Taverna
Workflow Diagram, Services Panel, Workflow Explorer, Run workflow
Input list, Input description, Input example, Input value
Output tab, Results display, List of results
Installing the Workbench
1. Download the Taverna 2.5 workbench
2. Install Taverna
3. Open Taverna: Start / All programs / Taverna / Taverna Workbench 2.5
4. You do not have to complete the registration now. Click on “Do not ask me again”.
Tutorial: myExperiment & Taverna
Simple PSICQUIC workflow with Taverna to query IntAct
1. Open one PSICQUIC workflow
   1. Open Taverna and click the “myExperiment” button. myExperiment is a repository of workflows.
   2. In the “Query” field, type “psicquic”.
   3. Find the “Molecular Interactions from IntAct PSICQUIC service (REST)” workflow and click on the “Open” button.
2. Run a PSICQUIC workflow
   1. In the menu, click on “File” and “Run workflow”.
3. Define your query
   1. Find and click the “Set value” button.
   2. Specify your MIQL query, e.g. species:trypanosoma AND detmethod:"two hybrid"
   3. Click on the “Run workflow” button.
4. Check your results
   1. In the bottom left corner, in the “MITAB” tab, click on “Value1”.
5. Save results
   1. Click on the “Save value” button in the bottom right corner.
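The last two tutorial steps (check and save the MITAB results) can also be done in a script once the tab-separated text has been retrieved. This is a rough sketch under stated assumptions: the sample rows are made up, lines starting with "#" are treated as header or comment lines, and the function names are invented for this example.

```python
def parse_mitab(text):
    """Split tab-separated MITAB text into rows of columns, skipping blank and '#' lines."""
    return [
        line.split("\t")
        for line in text.splitlines()
        if line and not line.startswith("#")
    ]


def format_rows(rows):
    """Turn rows of columns back into tab-separated text, one interaction per line."""
    return "".join("\t".join(row) + "\n" for row in rows)


def save_rows(rows, path):
    """Write the rows to a tab-separated file and return the number of rows written."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_rows(rows))
    return len(rows)


if __name__ == "__main__":
    # Made-up sample standing in for the text shown in the "MITAB" tab.
    sample = "#interactor A\tinteractor B\ndb:A1\tdb:B2\ndb:A1\tdb:C3\n"
    rows = parse_mitab(sample)
    print(len(rows), "interactions")
    # save_rows(rows, "results.tsv")  # uncomment to write the file
```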
Workflow results
Make your own workflow, reuse workflows
Look at the following workflows:
– Get a list of Protein Identification experiments from PRIDE by a Gene Ontology query
– Get a list of proteins annotated with an Ontology term and use these proteins to query BioModels
Create a similar workflow to retrieve molecular interactions from IntAct using a GO term as input.
Reuse one of the previous workflows and connect it with one of the following workflows:
– Retrieve Molecular Interactions from PSICQUIC Services:
Thank you