IntAct David Croft A database of Molecular Interactions.


1 IntAct David Croft (david.croft@ebi.ac.uk) A database of Molecular Interactions

2 What data are we dealing with? What are protein-protein interactions? Some are very stable...

3 What data are we dealing with? What are protein-protein interactions? ...others last just a few milliseconds.

4 What data are we dealing with? Example technique: yeast two hybrid

5 What data are we dealing with? Why are we interested in interactions?
1. As a means of precisely understanding a protein's role inside a specific cell type.
2. Guilt by association: it may be the only means of predicting a protein's function.
3. As building blocks for systems biology.

6 What data are we dealing with? The scope of IntAct data: nucleic acids, proteins, transcriptomics, small compounds

7 IntAct goals & achievements
Goals:
1. Define a standard for the representation and annotation of molecular interaction data.
2. Provide a public repository.
3. Populate the repository with experimental data from project partners and curated literature data.
4. Provide modular analysis tools.
5. Provide portable versions of the software to allow installation of local IntAct nodes.
Achievements:
http://www.ebi.ac.uk/intact
ftp://ftp.ebi.ac.uk/pub/databases/intact
4200+ distinct publications, 228,000+ binary interactions, 68,000+ proteins imported from UniProt
search & advanced search, hierarchView, pay-as-you-go, MiNe…
Known installations: AstraZeneca, GSK, MERCK, MINT, Proteome Center of Shanghai

8 “Lifecycle of an Interaction” [Diagram labels: Publication (abstract, full text); reject; accept; IntAct Curation; annotate (p1, p2, I, exp); CVs; Curation manual; Sanity Checks (nightly); curator report; Super curator; check; Public web site; FTP site; IMEx; MatrixDB; MINT; DIP]

9 Christian Kohler. Interaction space. Realistically, one publication per working day and curator. Only a fraction of all published interactions is captured in interaction databases. An end is not in sight; the interaction space is still vastly under-sampled.

10 A very detailed data model. Support for detailed features, i.e. definition of the interacting interface. Overlay of ranges on sequence: interacting domains.

11 How to deal with complexes. Some experimental protocols generate complex data, e.g. tandem affinity purification (TAP). One may want to convert these complexes into sets of binary interactions; 2 algorithms are available.
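
The slide does not name the two algorithms. The answers to Exercise 1 mention "spoke-expanded" results, and spoke and matrix expansion are the two models commonly used for this conversion, so the sketch below assumes those are the ones meant. The function names and the example complex are hypothetical and are not taken from the IntAct code base.

```python
from itertools import combinations


def spoke_expansion(bait, preys):
    """Pair the bait with each prey: n - 1 binary interactions for n participants."""
    return [(bait, prey) for prey in preys]


def matrix_expansion(participants):
    """Pair every participant with every other: n * (n - 1) / 2 binary interactions."""
    return list(combinations(participants, 2))


# Hypothetical TAP result: one tagged bait that pulled down three preys.
complex_members = ["bait", "preyA", "preyB", "preyC"]

spoke = spoke_expansion(complex_members[0], complex_members[1:])
matrix = matrix_expansion(complex_members)

print(len(spoke), "spoke pairs:", spoke)
print(len(matrix), "matrix pairs:", matrix)
```

Spoke expansion asserts only bait-prey pairs, so it produces fewer inferred interactions; matrix expansion also asserts prey-prey pairs, which the experiment did not test directly.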

12 PSICQUIC: distributing data over multiple sources

13 MIMIx
Experiments: interaction detection method (e.g. yeast two hybrid), participant detection method (e.g. mass spectrometry), host organism
Interactions
Interactors: identifiers from public database, species of origin, biological/experimental roles (e.g. enzyme, target / bait, prey)
Confidence
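
The MIMIx items listed on this slide can be read as the fields of a minimal record. The sketch below is one possible reading, not the IntAct data model: the class names, field names, the grouping of fields and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Interactor:
    identifier: str          # accession from a public database
    species_taxid: int       # species of origin, as a taxonomy id
    biological_role: str     # e.g. "enzyme" or "target"
    experimental_role: str   # e.g. "bait" or "prey"


@dataclass
class InteractionRecord:
    interaction_detection_method: str   # e.g. "two hybrid"
    participant_detection_method: str   # e.g. "mass spectrometry"
    host_organism_taxid: int
    interactors: List[Interactor] = field(default_factory=list)
    confidence: Optional[float] = None  # treated as optional in this sketch


def missing_fields(record: InteractionRecord) -> List[str]:
    """Return the names of required items that are empty in the record."""
    missing = []
    if not record.interaction_detection_method:
        missing.append("interaction_detection_method")
    if not record.participant_detection_method:
        missing.append("participant_detection_method")
    if not record.interactors:
        missing.append("interactors")
    for i, interactor in enumerate(record.interactors):
        if not interactor.identifier:
            missing.append(f"interactors[{i}].identifier")
    return missing


# Hypothetical record with placeholder identifiers.
example = InteractionRecord(
    interaction_detection_method="two hybrid",
    participant_detection_method="mass spectrometry",
    host_organism_taxid=4932,
    interactors=[
        Interactor("EXAMPLE_A", 3702, "enzyme", "bait"),
        Interactor("EXAMPLE_B", 3702, "target", "prey"),
    ],
)
incomplete = InteractionRecord("", "mass spectrometry", 4932)

print(missing_fields(example))
print(missing_fields(incomplete))
```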

14 Tutorial

15 http://www.ebi.ac.uk/intact IntAct: Home page

16 IntAct: Search and results [Screenshot labels: UniProt Taxonomy; PubMed; Method (PSI-MI CV); Interaction details; Complex?; Interactors; IMEx data; Other PSICQUIC services]

17 IntAct: Search and results [Screenshot labels: Export; Custom columns; Filters]

18 Exercise 1 In the search panel, type the query: CDK8. How many binary interactions are returned? Are any rodent proteins present in the results? (hint: look at Browse by taxonomy). If so, which species of rodent? Are there any bacterial proteins in the results? How would you filter these results so that only experimentally determined pairwise interactions are displayed? How many pairwise interactions do you find? Type the query: “transcription factor”. What types of interactor does it find? (hint: click on the Lists tab). In the search panel, type the query: chlorophyll. Click on “Change Columns Displayed” and deselect the two Aliases columns, select the First Author column, then click the Update button. What changes occur in the interactions table? Who is the first author for the “Photosystem I subunit VII-ps1a1” (psaA) interaction?

19 Interaction details

20 Exercise 2 In the search panel, type: ERK AND species:3702. Click on the details symbol for interaction 1. What is the host organism for this experiment? Which journal was it published in, and in what year? How many interactions in total does IntAct have from this publication? (hint – look to the right of the Publications section)

21 The Browse tab

22 Exercise 3 In the search panel, type: Phosphopentokinase, click on the Browse tab, then click the By UniProt taxonomy link. How many interactions are there involving only Arabidopsis proteins? Select the Arabidopsis interactions. Which interaction detection method is used for the manually curated entry? What is the title of the publication for this entry? Click on the Browse tab again, then click By Gene Ontology. What kind of activity does this interaction involve? (hint: expand the molecular function)

23 Advanced search: Fields Filtering options Add more filtering options

24 Exercise 4 In the search panel, type: starch. How many interactions are returned? Click on the “Show Advanced Fields” button to the right of the Quick Search box. Select the field Organism from the pull-down menu, type in 3702 as your organism, then click Add and search. How many interactions do you see now? Further refine the search by adding Detection method as two hybrid. Does this make a difference in the number of interactions found?
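
The refinement steps in this exercise amount to adding field-restricted terms to a free-text query. The sketch below composes such a query string in the `field:value` style that Exercise 2 uses (`ERK AND species:3702`). The helper `build_query` is hypothetical, and the field name `detmethod` is an assumption; the advanced search form shows the field names actually accepted.

```python
def build_query(free_text, **fields):
    """Join a free-text term and field:value restrictions with AND."""
    terms = [free_text]
    for name, value in fields.items():
        value = str(value)
        if " " in value:
            # Quote multi-word values so they are treated as one phrase.
            value = f'"{value}"'
        terms.append(f"{name}:{value}")
    return " AND ".join(terms)


# Exercise 2 style query.
erk_query = build_query("ERK", species=3702)

# Exercise 4 style refinement ("detmethod" is an assumed field name).
starch_query = build_query("starch", species=3702, detmethod="two hybrid")

print(erk_query)
print(starch_query)
```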

25 The List tab - Proteins

26 List tab - Compounds

27 Exercise 5 In the search panel, type: mitosis and click on the Lists tab. How many proteins are found? How many small compounds? Click on the DASTY links for various proteins. Notice how it shows features such as mutation sites, post-translational modifications and binding sites. Return to the Lists tab. Click on the Compounds sub-tab. Click on the ChEBI link for gdp. Is its molecular mass below 500 daltons?

28 Viewing results in other resources

29 Exercise 6 Search for: GPCR and click on the Lists tab, then click the mRNA Expression button. You get an error – why is this? Fix the cause of the error, and click the mRNA Expression button again. For GPR56, how many experiments show this gene as overexpressed, how many underexpressed? Click on the Pathways button – which resource does this take you to? Which pathways are overrepresented?

30 Ontology search I

31 Ontology search II

32 Exercise 7 Click on the Search tab and scroll down to the ontology section. Start to enter the word stamen slowly. What do you notice? How many different stamen processes does IntAct recognize? Which ontologies are supported by IntAct? Which of these ontologies know something about stamen processes?

33 Using PSICQUIC services [Screenshot labels: IMEx data; Other PSICQUIC services]

34 Exercise 8 In the search panel, type the query: arabidopsis. How many binary interactions are returned? What is the total number of interaction evidences from other databases? How many interaction evidences come from IMEx databases? Click on the link to the IMEx hits. Which other database(s) has/have hits for this query? Look at the interactions from the MINT database. What information is available that is not available in IntAct?

35 Graph tab I

36 Graph tab II

37 Exercise 9 In the search panel, type: O81905, click on the Graph tab, then click the Cytoscape link. If Cytoscape does not start, ask your neighbour or try a different browser; not all computers have the permissions to do this. On the left-hand side of the Cytoscape window, select the VizMapper tab. Under the drop-down list ‘Current Visual Style’, choose ‘Sample 1’. Expand the edge color node and set detection method to discrete mapping. To color interactions by detection method, right-click and choose Generate discrete values → Rainbow 1. Now experiment with other features of this visualization tool!

38 Answers! Exercise 1: 57 interactions returned. Yes, mouse. There are no bacterial proteins present. Filtering out spoke-expanded interactions leaves 12 results. Finds proteins, chemical compounds and nucleic acids. Naver et al (2001).

39 Exercise 2: Host organism is yeast. Proc Natl Acad Sci USA, 2008. 8 interactions.

40 Exercise 3: 2 interactions from Arabidopsis. Enzymatic study. PubMed 15352244: New targets of Arabidopsis thioredoxins revealed by proteomic analysis. Electron carrier.

41 Exercise 4: 334 interactions. Adding Arabidopsis leaves 20 results. Adding the two hybrid detection method leaves 15 entries.

42 Exercise 5: 5780 proteins. 23 small compounds. Yes, its mass is 443.2 Da.
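
As a quick check of the 443.2 figure, the mass of GDP can be recomputed from its molecular formula. The sketch below assumes the neutral formula C10H15N5O11P2 and standard average atomic masses rounded to three decimals; neither is stated on the slide, and the ChEBI entry reached from IntAct may describe a different charge state.

```python
# Average atomic masses in daltons, rounded to three decimals (assumed values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974}

# Assumed neutral formula for GDP: C10H15N5O11P2.
GDP_FORMULA = {"C": 10, "H": 15, "N": 5, "O": 11, "P": 2}


def molecular_mass(formula):
    """Sum the atomic masses weighted by the atom counts in the formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


gdp_mass = molecular_mass(GDP_FORMULA)
print(f"GDP: {gdp_mass:.1f} Da, below 500 Da: {gdp_mass < 500}")
```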

43 Exercise 6: Error: you need to select some or all of the list before clicking the button. 5 overexpressed, 0 underexpressed. Reactome. GPCR signalling.

44 Exercise 7: The word stamen auto-completes. 4 stamen processes are recognised. Supported ontologies: Gene Ontology, PSI-MI, ChEBI, UniProt Taxonomy and InterPro. Of these, GO knows about stamen processes.

45 Exercise 8: 267114 interactions found. 16043763 interaction evidences (PSICQUIC). 82047 interaction evidences from 3 other IMEx databases. DIP, MatrixDB and MINT databases have hits for this query. MINT has confidence values, IntAct does not.

