Published by Willis Phelps. Modified over 9 years ago.
1
The SLING project is funded by the European Commission within Research Infrastructures of the FP7 Capacities Specific Programme, grant agreement number 226073 (Integrating Activity). EBI services. Jennifer McDowall, EMBL-EBI.
2
Overview: Introduction; EBI databases; Searching for sequences (NEW: simple EBI search; advanced SRS text search; sequence search tools); Accessing old entries (sequence archives); Chemoinformatics.
3
Website: http://www.ebi.ac.uk/ Thematic index. EBI Search: search all databases and literature in one go.
4
Website: www.ebi.ac.uk. Databases: patent resources; sequences; genomes; chemistry; structures; gene expression; reactions & pathways; literature. Tools: sequence searching; sequence analysis; structural analysis; functional analysis. Training: eLearning; workshops; 2Can education resource. Industry programme: industry support; SME support.
5
EBI databases: patent-related resources
6
Sequence data from patent literature (October 2010). Patent nucleotides: >17.5m sequences; patent proteins: >4.9m sequences. Sources: EPO, USPTO, JPO + KIPO, exchanged through the INSDC partners GenBank, ENA and DDBJ. EPO policy: data publicly released 18 months after the patent application date (whether the patent is granted or not). INSDC agreement: free, unrestricted access; permanently accessible; all data exchanged daily.
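The 18-month release rule can be illustrated with a small date calculation. This is only a sketch: the slide states "18 months after patent application date" and nothing more, so the calendar-month arithmetic, the end-of-month clamping, and the function names `add_months` and `expected_release` are assumptions made for illustration.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month
    (an assumption; the slide does not specify this detail).
    """
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def expected_release(application_date: date) -> date:
    """Earliest public release under the 18-month rule quoted on the slide."""
    return add_months(application_date, 18)


# Hypothetical application date, chosen only for the example.
print(expected_release(date(2009, 4, 15)))  # 2010-10-15
```

A sequence filed in mid-April 2009 would, under these assumptions, become public in mid-October 2010, which matches the date of the statistics quoted on the slide.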
7
Patent resources at EBI www.ebi.ac.uk/patentdata
8
Patent sequence records at EBI. ENA (formerly EMBL-Bank): >124 million sequences; patent + non-patent nucleotides; redundant. UniParc (division of UniProt): >24 million sequences; patent + non-patent proteins; non-redundant. NR patent sequences: patent proteins and nucleotides; non-redundant; additional patent annotation. Uses: non-patent sequence prior art searches; patent sequence prior art searches.
9
Non-redundant patent databases (www.ebi.ac.uk). ENA (redundant). Level-1 NR: remove sequence redundancy. Level-2 NR: group by patent families. Additional annotation, including priority dates for patent families.
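The two non-redundancy levels can be sketched on toy data. The slide gives no implementation details, so everything below is an assumption: the record layout, the accessions, the family identifiers and the function names `level1` and `level2` are invented, and exact string identity stands in for whatever redundancy criterion the real databases apply.

```python
from collections import defaultdict

# Hypothetical records: (accession, patent_family_id, sequence). All invented.
records = [
    ("A0001", "FAM1", "ATGGCC"),
    ("A0002", "FAM1", "ATGGCC"),
    ("A0003", "FAM2", "ATGGCC"),
    ("A0004", "FAM2", "TTGACA"),
]


def level1(records):
    """Level-1 sketch: collapse identical sequences into one cluster each."""
    clusters = defaultdict(list)
    for accession, family, sequence in records:
        clusters[sequence.upper()].append((accession, family))
    return dict(clusters)


def level2(clusters):
    """Level-2 sketch: within each cluster, group members by patent family."""
    grouped = {}
    for sequence, members in clusters.items():
        by_family = defaultdict(list)
        for accession, family in members:
            by_family[family].append(accession)
        grouped[sequence] = dict(by_family)
    return grouped


l1 = level1(records)
l2 = level2(l1)
print(len(records), "records ->", len(l1), "unique sequences")
print(l2)
```

Here four redundant records reduce to two unique sequences, and the first sequence is seen to occur in two separate patent families.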
10
Sequence submissions. Generate sequence. Submit to journal. Submit to ENA. Submission guides at www.ebi.ac.uk. Not accepted: submit to journal (Step 2). Submit claim to EPO (Step 1).
11
Searching for sequences: simple EBI search
12
EBI-Search by patent number www.ebi.ac.uk Follow link to NEW EBI Search
13
Link to NEW EBI Search EBI-Search by patent number
14
Link to NEW EBI Search Getting started How it works Gene & protein summaries NEW EBI Search Training video
15
EBI-Search by patent number Link to NEW EBI Search Search for patent WO0146262
16
Link to NEW EBI Search EBI-Search by patent number Search for WO0146262
17
EBI-Search by patent number Link to NEW EBI Search Search for WO0146262 Literature for WO0146262 Sequence data for WO0146262
18
EBI-Search by patent number Link to NEW EBI Search Search for WO0146262 Link to full patent paper
19
EBI-Search by patent number Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases
20
Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore EBI-Search by patent number
21
Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore
22
EBI-Search by patent number Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore WO0146262 in Esp@cenet
23
EBI-Search by patent number Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore WO0146262 in Esp@cenet
24
EBI-Search by patent number WO0146262 in Esp@cenet Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore WO0146262 in Patent Lens
25
EBI-Search by patent number WO0146262 in Esp@cenet Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore WO0146262 in Patent Lens
26
EBI-Search by patent number WO0146262 in Esp@cenet Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore WO0146262 in Patent Lens Lists nucleotide sequences from WO0146262 Additional annotation
27
EBI-Search by patent number WO0146262 in Esp@cenet Link to NEW EBI Search Search for WO0146262 WO0146262 literature and sequence databases WO0146262 in CiteXplore WO0146262 in Patent Lens WO0146262 nucleotide sequence record in ENA
28
Patent sequence record in ENA (www.ebi.ac.uk): graphical viewer; sequence; patent reference; navigate to related data, e.g. version archive; navigate to external data sources, e.g. UniProt; download data; DNA source; dates (first public and last updated); sequence version.
29
WO0146262 in Esp@cenet Link to NEW EBI Search Search for WO0146262 WO0146262 in CiteXplore WO0146262 in Patent Lens WO0146262 literature and sequence databases ENA sequence record EBI-Search by patent number
30
EBI-Search by gene name Link to NEW EBI Search Search for src gene
31
Link to NEW EBI Search EBI-Search by gene name Search for src
32
EBI-Search by gene name Link to NEW EBI Search Search for src Genome information Gene & protein summaries
33
EBI-Search by gene name Link to NEW EBI Search Search for src Let’s select src in humans
34
EBI-Search by gene name Link to NEW EBI Search src gene & protein summary Search for src
35
EBI-Search by gene name Link to NEW EBI Search src gene & protein summary Search for src Species selector
36
EBI-Search by gene name Link to NEW EBI Search src gene & protein summary Search for src Gene tab Gene structure (forward & reverse strand) Gene sequence Location Sequence variations Orthologs Data source (Ensembl)
37
src gene & protein summary Link to NEW EBI Search Search for src Gene & protein summary gene tab EBI-Search by gene name
38
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab Expression tab Expression studies Data source (Expression Atlas) EBI-Search by gene name
39
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab See expression in cell type EBI-Search by gene name Gene Atlas
40
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary expression tab
41
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary expression tab Protein tab Function Isoforms Sequence Classification Interactions Data sources (UniProt, InterPro, IntAct)
42
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab
43
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Structure tab Citation Data source (PDBe) Structural domains 47 additional structures
44
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Gene & protein summary structure tab
45
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Gene & protein summary structure tab Literature tab Search results taken from: PubMed PubMedUK Agricola EPO Divided into categories Description of categories
46
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Gene & protein summary structure tab Literature tab Patents Curator-selected articles
47
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Gene & protein summary structure tab Gene & protein summary literature tab
48
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Gene & protein summary structure tab Reporting view print full summary page
49
Link to NEW EBI Search Search for src src gene & protein summary Gene & protein summary gene tab EBI-Search by gene name Gene & protein summary protein tab Gene & protein summary expression tab Gene & protein summary structure tab Gene & protein summary literature tab Print report
50
Searching for sequences: advanced SRS text search
51
SRS – for more search options www.ebi.ac.uk/srs 1st: Select resources to search 2nd: Create query
52
SRS – for more search options Select library tab
53
SRS – for more search options Select library tab Patent literature Patent DNA Patent proteins Search >100 databases
54
SRS – for more search options Select library tab Here, selected NR-level 2 DNA database
55
SRS – for more search options Select library tab Select resources to search
56
SRS – for more search options Select library tab Select resources to search 2) Type in text 1) Select field
57
SRS – for more search options Select library tab Select resources to search Here, selected patent number
58
SRS – for more search options Select library tab Select resources to search Create query
59
SRS – for more search options Select library tab Select resources to search Create query Lists non-redundant nucleotide sequences from WO0146262
60
SRS – for more search options Select library tab Select resources to search Create query WO0146262 sequences
61
SRS – for more search options Select library tab Select resources to search Create query WO0146262 sequences WO0146262 nucleotide sequence record in NRNL2
62
Patent sequence record in NRNL2: patent equivalents; sequence record in ENA; sequence; patent literature; priority number and date; translation.
63
SRS – for more search options Select library tab Select resources to search Create query WO0146262 sequences NRNL2 sequence record
64
SRS – for more search options Select library tab Select resources to search Create query WO0146262 sequences NRNL2 sequence record WO0146262 literature www.ebi.ac.uk/srs
65
Searching for sequences: sequence search
66
Sequence searching – specialised tools Navigate to ‘Sequence Similarity & Analysis’ www.ebi.ac.uk
67
Sequence searching – specialised tools Navigate to search tools
68
Sequence searching – specialised tools Navigate to search tools www.ebi.ac.uk/Tools/sss BLAST FASTA PSI search Choose Search tool
69
When to use which search? [Figure: time to search as a function of query length and database size, comparing FASTA, WU-BLAST, NCBI BLAST and PSI-SEARCH]
70
When to use which search? Choose the appropriate search engine for the job (one search engine won't do everything). BLAST: initial fast search. FASTA: better general search engine. PSI-BLAST: find remote family members. GLSEARCH: match peptide/domain to protein. GGSEARCH: full-length matches. FASTM: match several peptides to protein.
71
Sequence searching – specialised tools Navigate to search tools www.ebi.ac.uk/Tools/sss Here, try FASTA protein
72
Sequence searching – specialised tools Navigate to search tools Select search tool
73
Sequence searching – specialised tools Navigate to search tools Select search tool For patent proteins: Search individual patent offices or non-redundant patent datasets Step 1: Select database
74
Sequence searching – specialised tools Navigate to search tools Select search tool Here, selected UniProt Knowledgebase + NR patent proteins L2 Step 1: Select database
75
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database
76
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database Step 2: Copy/paste sequence or upload file Copy/pasted patent protein A00210 from patent EP0242329
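Before pasting or uploading a query, it can help to check that the text parses as a sequence record. The sketch below assumes the query is in FASTA format (a `>` header line followed by sequence lines); the function name `parse_fasta` is illustrative, and the example sequence is an invented placeholder, not the actual patent protein A00210.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before a '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Invented placeholder record, for illustration only.
example = """>query1 hypothetical placeholder protein
MKTAYIAKQR
QISFVKSHFS
"""
fasta_records = parse_fasta(example)
print(fasta_records)
```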
77
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence
78
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence Step 3: Set parameters Can change search engine
79
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence Step 3: Set parameters Can change search parameters
80
How to optimise parameters? User manual provides help
81
How to optimise parameters? Choice of matrix depends on: 1. strictness of search; 2. length of query sequence.
QUERY LENGTH   MATRIX     open   ext
>300           BLOSUM50   -10    -2
85-300         BLOSUM62   -7     -1
50-85          BLOSUM80   -16    -4
>300           PAM250     -10    -2
85-300         PAM120     -16    -4
35-85          MDM40      -12    -2
<=35           MDM20      -22    -4
<=10           MDM10      -23    -4
82
How to optimise parameters? Choice of gap penalties depends on: 1. strictness of search; 2. the scoring matrix, which the penalties should match. A larger penalty gives fewer gaps.
QUERY LENGTH   MATRIX     open   ext
>300           BLOSUM50   -10    -2
85-300         BLOSUM62   -7     -1
50-85          BLOSUM80   -16    -4
>300           PAM250     -10    -2
85-300         PAM120     -16    -4
35-85          MDM40      -12    -2
<=35           MDM20      -22    -4
<=10           MDM10      -23    -4
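The table on this slide can be turned into a small lookup. This is a sketch, not a tool default: the values are copied from the slide, but the handling of boundaries is an assumption (85 is assigned to the 85-300 row, 35 to the `<=35` row and 10 to the `<=10` row), and the function name `suggest_parameters` and the `family` argument are invented for illustration.

```python
# Rows copied from the slide: (min_len, max_len, matrix, gap_open, gap_extend).
# Boundary values shared by two rows are assigned as described in the lead-in.
ROWS = {
    "BLOSUM": [
        (301, None, "BLOSUM50", -10, -2),
        (85, 300, "BLOSUM62", -7, -1),
        (50, 84, "BLOSUM80", -16, -4),
    ],
    "PAM": [
        (301, None, "PAM250", -10, -2),
        (85, 300, "PAM120", -16, -4),
        (36, 84, "MDM40", -12, -2),
        (11, 35, "MDM20", -22, -4),
        (1, 10, "MDM10", -23, -4),
    ],
}


def suggest_parameters(query_length, family="BLOSUM"):
    """Return (matrix, gap_open, gap_extend) for a query length."""
    if family not in ROWS:
        raise ValueError(f"unknown matrix family: {family!r}")
    for min_len, max_len, matrix, gap_open, gap_ext in ROWS[family]:
        if query_length >= min_len and (max_len is None or query_length <= max_len):
            return matrix, gap_open, gap_ext
    raise ValueError(f"no {family} row on the slide covers length {query_length}")


print(suggest_parameters(400))        # ('BLOSUM50', -10, -2)
print(suggest_parameters(120, "PAM")) # ('PAM120', -16, -4)
print(suggest_parameters(8, "PAM"))   # ('MDM10', -23, -4)
```

The BLOSUM series on the slide stops at 50 residues, so the sketch raises an error for shorter queries in that family rather than guessing a value.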
83
How to optimise parameters? Do I mask my sequence? Low-complexity regions should be masked to avoid spurious results: CA repeats; poly-A tails; proline-rich regions. **Be careful you don’t mask what you are looking for.
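A toy masking pass makes the idea concrete. This sketch only masks exact tandem repeats of a one- or two-letter unit (such as poly-A tails or CA repeats); it is not how the search tools filter sequences, since dedicated filters such as SEG and DUST score compositional complexity instead. The function name `mask_simple_repeats`, the repeat threshold and the example sequence are all assumptions.

```python
import re


def mask_simple_repeats(seq, min_repeats=5, mask_char="N"):
    """Replace tandem repeats of a 1-2 letter unit with `mask_char`.

    A region is masked when the unit occurs at least `min_repeats`
    times in a row.
    """
    pattern = re.compile(r"(.{1,2}?)\1{%d,}" % (min_repeats - 1))
    return pattern.sub(lambda m: mask_char * len(m.group(0)), seq)


# Invented nucleotide sequence with a CA repeat and a poly-A tail.
masked = mask_simple_repeats("ATGTCACACACACAGGTTAAAAAAAA")
print(masked)  # ATGTNNNNNNNNNNGGTTNNNNNNNN
```

The slide's warning applies directly: if the query of interest is itself a repeat, this kind of filter would mask the whole thing.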
84
How to optimise parameters? What do I use for short sequences? Use strict matrices; use high gap penalties; avoid masking; allow high e-values.
85
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence Step 3: Set parameters Here, use default parameters
86
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters
87
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters Step 4: Submit Can select to have results emailed
88
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit
89
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Results include patent proteins (from NRPL2) and non-patent proteins (from UniProtKB) View additional annotation (non-patent proteins)
90
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Related EMBL nucleotide entries
91
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Related genomic information
92
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Gene ontology (GO) mapping for protein
93
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit InterPro family/domain classification
94
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Literature
95
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Functional predictions on ALL proteins
96
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Result summary + annotation
97
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Result summary + annotation Visual comparison: find mis- or partial matches Prioritize results Functional predictions: InterPro family/domain classifications Extract information
98
Sequence searching – specialised tools Navigate to search tools Select search tool (1) Select database (2) Copy/paste sequence (3) Set parameters (4) Submit Result summary + annotation Functional predictions
99
Accessing old entries: sequence archives
100
Sequence archives (www.ebi.ac.uk). ENA nucleotide sequence version archive (SVA): www.ebi.ac.uk/embl/sva. UniSave, the UniProt sequence/annotation version archive: www.ebi.ac.uk/uniprot/unisave. Search by date: get a specific record. Search by accession only: get all records.
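The difference between the two search modes can be sketched with an in-memory version list. This does not call SVA or UniSave; the accession, version numbers, dates and the function name `lookup` are invented, and "search by date" is assumed to mean "return the version that was current on that date".

```python
from bisect import bisect_right
from datetime import date

# Hypothetical archive: accession -> versions sorted by first-public date.
ARCHIVE = {
    "X00001": [
        (1, date(2001, 7, 1)),
        (2, date(2004, 3, 15)),
        (3, date(2009, 11, 2)),
    ],
}


def lookup(accession, on_date=None):
    """Return all versions, or only the one current on `on_date`."""
    versions = ARCHIVE[accession]
    if on_date is None:
        # Accession only: the complete version list.
        return list(versions)
    release_dates = [released for _, released in versions]
    index = bisect_right(release_dates, on_date)
    if index == 0:
        return []  # the entry did not exist yet on that date
    return [versions[index - 1]]


print(lookup("X00001"))                    # all three versions
print(lookup("X00001", date(2005, 1, 1)))  # only version 2
```

For prior art work the dated lookup is the relevant one, because it shows what the record contained at a given point in time rather than what it contains now.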
101
Sequence archives: view old entries; compare different versions; provides complete version list.
102
Sequence archives View old entries
103
Sequence archives Compare different versions
104
Chemoinformatics: ChEBI & ChEMBL
105
Chemoinformatics databases at EBI. ChEBI (http://www.ebi.ac.uk/chebi/): Chemical Entities of Biological Interest; ‘small’ chemical entities (no protein/nucleic acids); illustrated dictionary of chemical nomenclature. ChEMBL (http://www.ebi.ac.uk/chembl/): database of bioactive drug-like small molecules; ‘small’ molecules and peptides; illustrated dictionary of chemical nomenclature.
106
ChEBI data overview (example: caffeine). Visualisation. Nomenclature: caffeine; 1,3,7-trimethylxanthine; methyltheobromine. Chemical data: Formula: C8H10N4O2; Charge: 0; Mass: 194.19. Ontology: metabolite; CNS stimulant; trimethylxanthines. Database Xrefs: MSDchem: CFF; KEGG DRUG: D00528. Chemical Informatics: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3; SMILES: CN1C(=O)N(C)c2ncn(C)c2C1=O
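The mass shown for caffeine can be checked from its formula. The sketch below is not ChEBI's own calculation: it sums standard average atomic masses (rounded to three decimals, an assumption about precision) for a simple formula without brackets or charges, and the function name `molecular_mass` is illustrative.

```python
import re

# Standard average atomic masses, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molecular_mass(formula):
    """Sum atomic masses for a simple formula such as 'C8H10N4O2'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += ATOMIC_MASS[element] * (int(count) if count else 1)
    return mass


print(round(molecular_mass("C8H10N4O2"), 2))  # 194.19
```

The result agrees with the mass of 194.19 listed in the ChEBI entry on the slide.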
107
ChEBI search for β-lactamase Chemical Entities of Biological Interest (ChEBI)
108
ChEBI search for β-lactamase Compounds interacting with BLA2_KLEPN
109
ChEBI search for β-lactamase Patent abstracts ChEMBL db bioactivity details
110
Summary. Comprehensive sequence databases: ENA & UniParc (PAT / PRT class data). Non-redundant patent sequences: enriched. Sequence archives: ENA SVA & UniSave track changes. Multiple search engines. Broad patent sequence coverage: protein/nucleotides from EPO, USPTO, JPO, KIPO. EB-eye text search: fetch patent literature and sequences. SRS advanced text searching: >100 databases (including patents). Sequence searching: specialised tools; annotation-enhanced.
111
User support 2Can bioinformatics user support – www.ebi.ac.uk/2Can Online help pages – www.ebi.ac.uk/help E-mail support – www.ebi.ac.uk/support
112
Any questions? Contacts: www.ebi.ac.uk/support