
EBI is an Outstation of the European Molecular Biology Laboratory. Master title Molecular Interactions – the IntAct Database Sandra Orchard EMBL-EBI 30.01.2016.

Presentation transcript:

1 EBI is an Outstation of the European Molecular Biology Laboratory. Molecular Interactions – the IntAct Database. Sandra Orchard, EMBL-EBI, 30.01.2016

2 What data are we dealing with? Why are we interested in interactions?
1. As a means of precisely understanding a protein's role inside a specific cell type
2. To verify data: visualise your own interaction network over the known space
3. Guilt by association: it may be the only means of predicting a protein's function
4. As building blocks for systems biology and drug discovery

3 Why are there so many issues with interaction data?
1. Wide variety of methods for demonstrating molecular interactions; all have their strengths and weaknesses
2. No single method accurately defines an interaction as being a true binary interaction observed under physiological conditions

4 Why do we need interaction databases?
- Issues with all interaction data: a true picture can only be built up by combining data derived using multiple techniques in multiple laboratories
- Problematic for any bench researcher to do: issues with data formats, molecular identifiers, sheer volume of data
- Molecular interaction databases are publicly funded to collect this data and annotate it in the format most useful to researchers

5 Interaction Databases
Deep curation:
- IntAct: active curation, broad species coverage, all molecule types
- MINT: active curation, broad species coverage, PPIs
- DIP: active curation, broad species coverage, PPIs
- MPACT: no curation, limited species coverage, PPIs
- MatrixDB: active curation, extracellular matrix molecules only
- InnateDB: active curation, interactions involved in innate immunity
- BIND: ceased curating 2006/7, broad species coverage, all molecule types; information becoming dated
Shallow curation:
- BioGRID: active curation, limited number of model organisms
- HPRD: active curation, human-centric, modelled interactions
- MPIDB: active curation, microbial interactions

6 IntAct goals & achievements
1. Publicly available repository of molecular interactions (mainly PPIs): >250K binary interactions taken from >4,700 publications
2. Data is standards-compliant and available via our website, for download at our FTP site, or via PSICQUIC
3. Provide open-access versions of the software to allow installation of local IntAct nodes
http://www.ebi.ac.uk/intact
ftp://ftp.ebi.ac.uk/pub/databases/intact
www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml

7 "Lifecycle of an Interaction" [Workflow diagram. Labels: Publication (full text, abstract); IntAct curation (CVs, curation manual); annotate (p1, p2, I, exp); Super curator (check, accept, reject); Sanity checks (nightly), curator report; Public web site, FTP site; IMEx (MatrixDB, MINT, DIP).]

8 UniProt Knowledge Base. Interactions in IntAct can use splice variants. http://www.ebi.uniprot.org/

9 UniProt Knowledge Base. IntAct exports interaction data to UniProt. Only interactions detected by specific methods are exported: mostly physical, so higher-quality interactions. http://www.ebi.uniprot.org/

10 Controlled vocabularies
Why do we use them? There are, e.g., more than 20 ways to write the same method: yeast two hybrid, Y2H, 2H, two-hybrid, …
- Full integration of the PSI-MI ontology
- Over 1,500 terms, fully defined and cross-referenced
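A minimal Python sketch of why a controlled vocabulary helps: many free-text spellings of one method collapse onto a single term identifier. The synonym table and the function name `normalise_method` are illustrative assumptions, not part of IntAct or the PSI-MI tooling; MI:0018 is the PSI-MI identifier for "two hybrid".

```python
# Hypothetical synonym table (an assumption for illustration only).
# MI:0018 is the PSI-MI controlled-vocabulary term "two hybrid".
SYNONYMS = {
    "yeast two hybrid": "MI:0018",
    "yeast two-hybrid": "MI:0018",
    "y2h": "MI:0018",
    "2h": "MI:0018",
    "two hybrid": "MI:0018",
    "two-hybrid": "MI:0018",
}


def normalise_method(free_text):
    """Return the term ID for a free-text method name, or None if unknown."""
    # Lower-case and collapse whitespace so trivial variants compare equal.
    key = " ".join(free_text.lower().split())
    return SYNONYMS.get(key)


if __name__ == "__main__":
    for name in ["Y2H", "  Two-Hybrid ", "coimmunoprecipitation"]:
        print(repr(name), "->", normalise_method(name))
```

Searching or filtering on the identifier rather than on the free text is what makes the results consistent across curators and databases.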

11 Controlled vocabularies www.ebi.ac.uk/ols

12 Data model
- Support for detailed features, i.e. definition of the interacting interface
- Interacting domains
- Overlay of ranges on sequence

13 How to deal with complexes
Some experimental protocols generate complex data, e.g. tandem affinity purification (TAP). One may want to convert these complexes into sets of binary interactions; 2 algorithms are available.
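The transcript does not name the two algorithms. The two expansions usually described for this purpose are the spoke model (the bait is paired with each prey) and the matrix model (every pair of members is reported); the Python sketch below assumes those are the ones meant. The function names and the example protein labels are hypothetical.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Spoke model: one binary interaction between the bait and each prey."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(members):
    """Matrix model: one binary interaction for every unordered pair of members."""
    return list(combinations(members, 2))


if __name__ == "__main__":
    # A made-up TAP result: bait A pulled down B, C and D.
    print(spoke_expansion("A", ["B", "C", "D"]))
    print(matrix_expansion(["A", "B", "C", "D"]))
```

For a complex of n members the spoke model yields n - 1 pairs and the matrix model n(n - 1)/2, so the choice of expansion changes how many inferred binary interactions enter a network. This is why the expansion method is recorded alongside the data.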

14 PSI-MI XML format
- Community standard for molecular interactions
- XML schema and detailed controlled vocabularies
- Jointly developed by major data providers: BIND, CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct, MINT, MIPS, Serono, U. Bielefeld, U. Bordeaux, U. Cambridge, and others
- Version 1.0 published in February 2004: The HUPO PSI Molecular Interaction Format - A community standard for the representation of protein interaction data. Henning Hermjakob et al., Nature Biotechnology 2004.
- Version 2.5 published in October 2007: Broadening the horizon - Level 2.5 of the HUPO-PSI format for molecular interactions. Samuel Kerrien et al., BMC Biology 2007.

15 Data distribution: PSICQUIC
- Proteomics Standards Initiative Common QUery InterfaCe
- Community effort to standardise the way to access and retrieve data from molecular interaction databases
- Widely implemented by independent interaction data resources
- Based on the PSI standard formats (PSI-MI XML and MITAB)
- Not limited to protein-protein interactions; also covers e.g. drug-target interactions and simplified pathway data
- A registry lists the resources implementing PSICQUIC
- Documentation: http://psicquic.googlecode.com
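Because every PSICQUIC service answers the same kind of query, a client only needs to swap the base URL to move between resources. The Python sketch below only builds a REST query URL and sends nothing over the network. The base URL, the `/query/` path and the `format`, `firstResult` and `maxResults` parameters follow the pattern used by the IntAct PSICQUIC service as best understood here; treat them as assumptions and check the registry for current endpoints. The function name is hypothetical.

```python
from urllib.parse import quote

# Assumed base URL of the IntAct PSICQUIC REST service (verify in the registry).
INTACT_PSICQUIC = (
    "http://www.ebi.ac.uk/Tools/webservices/psicquic/intact/webservices/current/search"
)


def psicquic_query_url(query, base=INTACT_PSICQUIC, fmt="tab25",
                       first=0, max_results=50):
    """Build (but do not send) a PSICQUIC REST query URL."""
    # Percent-encode the whole query so characters such as ':' and spaces are safe.
    encoded = quote(query, safe="")
    return (f"{base}/query/{encoded}"
            f"?format={fmt}&firstResult={first}&maxResults={max_results}")


if __name__ == "__main__":
    print(psicquic_query_url("brca2"))
    print(psicquic_query_url("species:human", fmt="xml25", max_results=10))
```

Passing a different `base` taken from the registry would point the same query at another PSICQUIC-compliant resource.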

16 PSICQUIC: distributing data over multiple sources

17 IMEx: The International Molecular Exchange Consortium
- Group of major public interaction data providers sharing curation effort: DIP, IntAct, I2D, MINT, MatrixDB, Molecular Connections, InnateDB and MPIDB
- Independent molecular interaction resources
- Common curation standards for detailed curation
- Common data formats (PSI-MI XML, PSICQUIC)
- Common accession number space
- Coordinated & non-redundant curation
- In production mode since February 2010
- Since 3/2009 supported by the European Commission under PSIMEx, contract number FP7-HEALTH-2007-223411, with additional partners Vital-IT, Nature, Wiley, BiaCore (GE), U. Maryland, CSIC, TU Munich, MIPS, SCBIT (Shanghai)
www.imexconsortium.org

18 www.imexconsortium.org

19 Performing and visualising a Simple Search. EBI Walkthrough, May 2009. EBI Data, Standards and Tools

20 IntAct - Home Page http://www.ebi.ac.uk/intact

21 Performing a Simple Search

22 Visualizing - networkView. From search to networkView…

23 Extend and Visualise your Search

24 Visualizing - networkView. Simple, immediate visualisation of your network. For manipulation, go to Cytoscape.

25 Cytoscape View

26 Exploring a single interaction in more depth

27 Interaction detail. First search from the home page… [Screenshot callouts: Choice of UniProtKB or Dasty View; UniProt; Taxonomy; PubMed; Expansion method; Details of interaction]

28 Participant information. Search result for 'RAD1'

29 Interaction detail. First search from the home page… [Screenshot callouts: Choice of UniProtKB or Dasty View; UniProt; Taxonomy; PubMed; Expansion method; Details of interaction]

30 Viewing Interaction Data. First search from the home page… [Screenshot callout: Details of interaction]

31 Viewing Interaction Details. Additional information

32 IntAct - Home Page - Quick Search

33 Advanced search. [Screenshot callouts: Fields; Filtering options; Add more filtering options]

34 Searching with MIQL. First search from the home page… Using the Molecular Interaction Query Language (MIQL), one can also build complex queries. List of terms one can query on:
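The list of searchable terms is not reproduced in this transcript. As a hedged illustration of what a complex MIQL query looks like, the Python sketch below composes `field:value` terms with `AND`. The field names `species` and `detmethod` are taken from the MIQL reference as remembered here and should be treated as assumptions; the helper names are hypothetical.

```python
def miql_term(field, value):
    """Format one field:value term, quoting values that contain whitespace."""
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{field}:{value}"


def miql_and(*terms):
    """Join terms so that every one of them must match."""
    return " AND ".join(terms)


if __name__ == "__main__":
    query = miql_and(
        miql_term("species", "human"),          # assumed MIQL field name
        miql_term("detmethod", "two hybrid"),   # assumed MIQL field name
    )
    print(query)
```

A string built this way can be pasted into the search box or sent to any service that accepts MIQL.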

35 Browsing - Molecule View. Binary view of o60671_human

36 Browsing - extending your search

37 ?

38 An overview of the PSI-MI XML 2.5 DATA MODEL

39 Bird's eye view of PSI-MI XML 2.5. Top-level structure unchanged compared to PSI-MI 1.0. Use of Id/Ref on main objects.

40 Main objects - Experiment. [Diagram callouts: Controlled by Ontologies; Literature references; Confidence measures]

41 Main objects - Interactor. [Diagram callouts: Generic interactor; Reference to a public database]

42 Main objects - Interaction. [Diagram callouts: Controlled by Ontology; Copyright; Experiment; Kinetics parameters; Confidence value]

43 Basics - Controlled Vocabularies. Why? Ensure data consistency; provide a reliable means for searching & filtering data. How? By providing a reference to an ontology term, using Xref.

44 Main objects - Participant. [Diagram callouts: e.g. enzyme, target; Interactor; e.g. bait, prey; Delivery method, expression level…; Interactor used experimentally; Building of Complex]

45 An overview of the PSI-MI TAB DATA MODEL

46 PSIMITAB Standard Columns
Standard columns (15):
- ID(s) interactor A & B
- Alt. ID(s) interactor A & B
- Alias(es) interactor A & B
- Interaction detection method(s)
- Publication 1st author(s)
- Publication identifier(s)
- Taxid interactor A & B
- Interaction type(s)
- Source database(s)
- Interaction identifier(s)
- Confidence value(s)
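The 15 standard columns map naturally onto a tab-separated record. The Python sketch below splits one MITAB 2.5 line into named fields, assuming the usual conventions that `-` marks an empty column and `|` separates multiple values. The short column keys and the sample row (accessions, author, PubMed ID, IntAct ID) are invented for illustration.

```python
# Short keys for the 15 standard columns, in file order (names are assumptions).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication_id",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_id", "confidence",
]


def parse_mitab_line(line):
    """Split one MITAB line into {column: [values]}; '-' becomes an empty list."""
    values = line.rstrip("\n").split("\t")
    if len(values) < len(MITAB25_COLUMNS):
        raise ValueError(f"expected at least 15 columns, got {len(values)}")
    # Naive '|' split: it does not handle a '|' inside quoted text.
    return {
        name: ([] if value == "-" else value.split("|"))
        for name, value in zip(MITAB25_COLUMNS, values)
    }


# Invented example row; every identifier here is a placeholder.
SAMPLE_LINE = "\t".join([
    "uniprotkb:P00001", "uniprotkb:P00002", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2000)", "pubmed:0000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "-",
])

if __name__ == "__main__":
    row = parse_mitab_line(SAMPLE_LINE)
    print(row["id_a"], row["id_b"], row["detection_method"])
```

Because `zip` stops at the fifteenth name, the same function also accepts lines that carry extra columns and simply ignores them.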

47 A quick look into INTACT EXTENDED MITAB

48 PSIMITAB Extended Columns
Standard columns (15):
- ID(s) interactor A & B
- Alt. ID(s) interactor A & B
- Alias(es) interactor A & B
- Interaction detection method(s)
- Publication 1st author(s)
- Publication identifier(s)
- Taxid interactor A & B
- Interaction type(s)
- Source database(s)
- Interaction identifier(s)
- Confidence value(s)
IntAct-specific columns (+11):
- Experimental role(s) of interactors
- Biological role(s) of interactors
- Properties (CrossReference) of interactors
- Type(s) of interactors
- HostOrganism(s)
- Expansion method(s)
- Dataset name(s)

49 A hands-on introduction to PSI-MI XML 2.5 JAVA API

50 PSI-MI XML Java API
- Uses Java 5
- Provides binding between XML and a Java object model
- Tools to read/write XML from/to file
- Reading can be done in 2 fashions:
  1. Load a whole file into an EntrySet: large files can only be loaded if you have enough memory; easy to update content and write back to file
  2. Index the XML data and give access through an IndexedEntry: memory efficient with large files; allows browsing through interactions, experiments…; trickier to write updated content (yet feasible)
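The trade-off between the two reading modes can be illustrated without the Java API. The Python sketch below uses the standard library's `xml.etree.ElementTree`: `load_all` parses the whole document into memory, while `stream` uses `iterparse` and discards each interaction after use. Streaming is a different technique from the indexing done by `IndexedEntry` and is used here only as a memory-efficient analogue. The namespace `net:sf:psidev:mi` and the tiny sample document are assumptions, not a complete PSI-MI XML 2.5 file.

```python
import io
import xml.etree.ElementTree as ET

# Assumed PSI-MI XML 2.5 namespace, in ElementTree's {uri}tag notation.
NS = "{net:sf:psidev:mi}"

# Minimal invented document; a real file carries far more detail per entry.
SAMPLE = b"""<entrySet xmlns="net:sf:psidev:mi" level="2" version="5">
  <entry>
    <interactionList>
      <interaction id="1"><names><shortLabel>a-b</shortLabel></names></interaction>
      <interaction id="2"><names><shortLabel>a-c</shortLabel></names></interaction>
    </interactionList>
  </entry>
</entrySet>"""


def load_all(source):
    """Whole-file mode: build the full tree, then collect interaction ids."""
    root = ET.parse(source).getroot()
    return [el.get("id") for el in root.iter(NS + "interaction")]


def stream(source):
    """Streaming mode: yield each interaction id, then free the element."""
    for _event, el in ET.iterparse(source, events=("end",)):
        if el.tag == NS + "interaction":
            yield el.get("id")
            el.clear()  # drop children so memory use stays low on large files


if __name__ == "__main__":
    print(load_all(io.BytesIO(SAMPLE)))
    print(list(stream(io.BytesIO(SAMPLE))))
```

The whole-file version is the easier one to modify and write back; the streaming version trades that convenience for a small memory footprint, mirroring the choice described on the slide.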

51 A hands-on introduction to PSI-MI TAB 2.5 JAVA API

52 PSI-MI TAB Java API
- Uses Java 5
- Provides binding between TAB and a Java object model
- Tools to read/write TAB from/to file
- You can read in 2 fashions:
  1. Load a whole file into a Collection: large files can only be loaded if you have enough memory
  2. Load interactions one at a time using an Iterator: memory efficient with large files
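The same Collection-versus-Iterator choice can be sketched in Python with a generator. This is not the Java API: `iter_rows` and `load_rows` are hypothetical helpers, the sample data are invented and truncated to three columns for brevity, and treating lines that start with `#` as headers is an assumption.

```python
import io

# Invented, shortened sample: a header line followed by two interaction rows.
SAMPLE_TAB = (
    "#ID(s) interactor A\tID(s) interactor B\tInteraction detection method(s)\n"
    "uniprotkb:P00001\tuniprotkb:P00002\tpsi-mi:\"MI:0018\"(two hybrid)\n"
    "uniprotkb:P00001\tuniprotkb:P00003\tpsi-mi:\"MI:0018\"(two hybrid)\n"
)


def iter_rows(handle):
    """Iterator mode: yield one row at a time, holding only that row in memory."""
    for line in handle:
        line = line.rstrip("\n")
        if line and not line.startswith("#"):  # skip blank and header lines
            yield line.split("\t")


def load_rows(handle):
    """Collection mode: read every row into a list (needs enough memory)."""
    return list(iter_rows(handle))


if __name__ == "__main__":
    # Iterator: count interactions without keeping them.
    print(sum(1 for _ in iter_rows(io.StringIO(SAMPLE_TAB))))
    # Collection: random access to any row afterwards.
    rows = load_rows(io.StringIO(SAMPLE_TAB))
    print(rows[1][1])
```

With a real file, `open(path, encoding="utf-8")` would take the place of `io.StringIO`.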

53 Summary
- PSI-MI XML is the de facto standard for molecular interactions
- The Java API makes it easy to handle
- We have code samples & exercises for both APIs! Let me know if you want access to them…
- PSI-MI home page: http://psidev.info/MI
- API download: http://www.psidev.info/index.php?q=node/60#tools
- Data: ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psi25

54 Quick introduction to R packages for PSI-MI

55 Rintact & RpsiXML
- Initiative from Wolfgang Huber's group at the EBI
- Reads PSI-MI XML data into R data structures
- Enables data analysis using existing packages such as RBGL, ppiStats, apComplex, …
- Currently supports: IntAct, MINT, HPRD, DIP, BioGRID, MIPS/CORUM, MatrixDB, MPACT
- API download: http://www.bioconductor.org/packages/2.1/bioc/html/Rintact.html
- Documentation: http://www.bioconductor.org/packages/2.3/bioc/vignettes/RpsiXML/inst/doc/RpsiXML.pdf

