1
Overview of EBI Data Resources and Services
Dominic Clark, Industry Programme Manager,
2
What is EMBL-EBI?
Part of the European Molecular Biology Laboratory
Based on the Wellcome Trust Genome Campus near Cambridge, UK
Non-profit organisation
First, some background: the EBI is based on the Wellcome Trust Genome Campus in Hinxton, near Cambridge in the UK. We share the campus with the Sanger Institute. The EBI is part of the European Molecular Biology Laboratory and, as part of that, we’re a non-profit organisation.
3
Why do we need EMBL-EBI services?
Data growth
Global context
Very large user community: need to preserve data and make it accessible to all
Impact on medicine & agriculture
Impact on society & bioindustries
2008 BAC
4
New types of data
Literature and ontologies
Genomes
Protein sequence
DNA & RNA sequence
Protein structure
Gene expression
Chemical entities
Protein families, motifs and domains
Protein interactions
Pathways
Systems
The EBI is probably unique in the world for its range of data resources and tools, spanning everything from DNA and protein sequence to complex pathways and networks. At the EBI, we separate resource development and provision, which we call services, from research, although the two are closely related. Both the research areas and the services follow the areas of focus shown on the slide. Some of these types of data are now being collected in a high-throughput way, presenting new challenges for how we organise and store them.
5
EMBL-EBI’s mission
To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
To contribute to the advancement of biology through basic investigator-driven research in bioinformatics
To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators
To help disseminate cutting-edge technologies to industry
The EBI shares its four central mission objectives with EMBL, although it focuses on bioinformatics rather than molecular biology. The EBI is at the centre of Europe’s efforts to collect, organise and make available all types of biological data. We do this by providing services so that researchers can access and make sense of the information, by being active in bioinformatics research, by providing training and by working closely with industry.
6
Services
7
Key facts about services
European node for globally coordinated data collection and dissemination projects
Core databases produced in collaboration with other world leaders, including NCBI (US), National Institute of Genetics (Japan), Swiss Institute of Bioinformatics, Cold Spring Harbor Laboratory (US)
One of the world’s most comprehensive collections of molecular databases
The EBI is the European centre for the collection and dissemination of biological data. We do this in collaboration with other global centres, primarily in the US and Japan, although the partners differ for each data type. The EBI is probably unique in the world for its range of data resources, spanning everything from DNA and protein sequence to complex pathways and networks.
8
Principles of service provision
Accessibility – all data and tools freely available without restriction
Compatibility – we develop and promote the use of standards in bioinformatics
Comprehensive data sets – agreements with other data providers ensure that our resources contain comprehensive and up-to-date data; agreements with publishers ensure that published data are placed in a public repository at the earliest opportunity
Portability – data and software can be downloaded and installed locally
Quality – our databases are enhanced through annotation and cross-referencing
The EBI’s services function to meet the needs of researchers for data deposition, access, analysis and integration. Our data resources differ in detail but they all uphold the same five principles.
9
Databases: molecules to systems
Literature and ontologies: CiteXplore, GO
Genomes: Ensembl, Ensembl Genomes, EGA
Nucleotide sequence: EMBL-Bank
Microarray & gene expression data: ArrayExpress
Proteomes: UniProt, PRIDE
Protein families, motifs and domains: InterPro
Protein structure: PDB
Protein interactions: IntAct
Chemical entities: ChEBI, ChEMBL
Pathways: Reactome
Systems: BioModels
The slide shows the core resources at the EBI mapped on to the same arrow to show the range of data you can access through the EBI. The EBI is the European centre for the collection and dissemination of biological data; we do this in collaboration with other global centres such as NCBI, the National Institute of Genetics in Japan, the Swiss Institute of Bioinformatics and Cold Spring Harbor Laboratory.
10
Database collaborations
Many of the EBI’s data resources are members of international consortia. Some, such as the International Nucleotide Sequence Database Collaboration, exchange data on a regular basis; others, such as the UniProt Consortium and the GO Consortium, work together to produce a single resource.
11
Standards development – international collaborations
Genomic Standards Consortium (GSC), gensc.org
Microarray and Gene Expression Data (MGED)
HUPO Proteomics Standards Initiative (PSI), psidev.sf.net
Metabolomics Standards Initiative (MSI)
Systems modelling standards
Data areas shown on the slide: genome annotation, protein sequence, nucleotide sequence, protein structure, cheminformatics, pathways
12
EBI website and search engine EB-eye
Search all main databases in one go
Advanced search: drill down to specific fields in specific databases
Refine your search
We launched a new website and search engine just over a year ago. Our website gets over 2 million hits a day and is the gateway to the information you want. The search engine, the EB-eye, allows integrated searching of all our core data resources from a single search box; it’s like a Google for all the information held at the EBI.
13
Genomes 1: Ensembl
Pick a genome
Within species: chromosomes, genes, SNPs
Across species: genomic alignments, synteny, gene families, orthology
Ensembl provides a framework for working with the genomes of higher animals (metazoans). It presents, via an interactive website, the human genome together with other genomes that are important for addressing questions in medical research and molecular biology. It uses automated methods for gene prediction and annotation to provide a consistent view of completely sequenced genomes. Users can view the data at many levels, from entire chromosomes down to single nucleotide polymorphisms. As well as accessing a wealth of data for each species, users can also perform cross-species comparisons.
14
Genomes 2: Ensembl Genomes
Ensembl-like genome browser for non-vertebrate species
Screenshots: Ensembl Metazoa, Ensembl Bacteria
View options: using view options, you can select to view only the current gene or the entire expanded gene tree.
Across species: select Orthologue view to see putative orthologues.
Ensembl Genomes is the combined repository for non-vertebrate genome data. It consists of five resources: Ensembl Bacteria, Ensembl Fungi, Ensembl Metazoa, Ensembl Plants and Ensembl Protists, bringing the power of the Ensembl system to all branches of life. Ensembl Genomes re-uses and extends software developed for vertebrate genomes in the context of the Ensembl project, and replaces several pre-existing resources (Integr8, Genome Reviews and ASTD), thereby unifying services and simplifying data access for users.
15
Nucleotides: EMBL-Bank
Partner databases: DDBJ, GenBank
Data sources: direct submissions, genome-sequencing projects, patents
Access: keyword and sequence searching, map-based search of environmental samples, downloads
Maintenance: updates, third-party annotation
EMBL-Bank is Europe's primary nucleotide sequence resource. The database is produced in an international collaboration (the International Nucleotide Sequence Database Collaboration, INSDC) with GenBank (USA) and the DNA Data Bank of Japan (DDBJ). The main sources of DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. Users can search the data either by keyword or by using sequence homology tools such as BLAST and FASTA to compare their own sequence with the contents of EMBL-Bank. There’s also a map-based search (EMBLWorld) for exploring sequences derived from environmental genome sequencing projects. The data belong to the submitter and can only be updated by the submitter, but other researchers can submit ‘third party annotations’ to EMBL-Bank if they’re associated with a peer-reviewed publication.
16
Structures: PDBe
Linking to domain data
Sequence mapping
Ligands
Assemblies
Electron density visualization
Active sites
Surface matching
Fold matching
The Macromolecular Structure Database (MSD), now known as PDBe, is the European resource for the collection, organisation and dissemination of data about biological macromolecular structures. The MSD is one of three partners in the worldwide Protein Data Bank (wwPDB), the consortium entrusted with the collation, maintenance and distribution of the global repository of macromolecular structure data. The MSD team has developed a wide range of resources for the analysis of data in the PDB.
17
PDB FTP Traffic
Worldwide Protein Data Bank www.wwpdb.org
18
User support
2Can bioinformatics user support – www.ebi.ac.uk/2Can
Online help pages – support –
Help is available for all of our databases. If our online support pages can’t answer your question, we offer direct support and promise to get back to you within 2 working days.
19
Research www.ebi.ac.uk/groups
As well as providing services, the EBI does research…
20
Key facts about research
Dedicated research groups aim to understand biology through new approaches to interpreting biological data
Services teams also carry out R&D to enhance existing services and develop new ones
Research programme complements services and the two are mutually supportive
21
Research groups
Text mining: Rebholz-Schuhmann
Genome analysis: Birney, Flicek, Enright, Goldman
Structural bioinformatics: Thornton
Transcriptome analysis: Brazma, Huber
Regulatory networks: Luscombe
Protein annotation: Apweiler
Cheminformatics: Steinbeck, Overington
Pathways, networks, systems: Le Novère
Differentiation and development: Bertone
The research groups map roughly onto many of the areas in which we provide services, and collaboration between services and research is widespread. For example, collaborations between Wolfgang Huber’s group and the ArrayExpress developers are leading to new methods for microarray data analysis. Dietrich Rebholz-Schuhmann is working closely with our literature services team (Peter Stoehr) to develop innovative literature mining methods, some of which will be used in the new UK PubMedCentral service. Some of the services groups (Brazma, Birney and Apweiler) also have research components.
22
Training www.ebi.ac.uk/training
Our third mission is to provide training…
23
A tripartite user-training programme
Training comes to you
Training any time, anywhere, at any pace
Hands-on user training on all our core data resources for lab-based researchers
To train researchers in using the EBI’s resources, we have a tripartite training programme: training courses in-house at the EBI; the Bioinformatics Roadshow, where EBI trainers travel to host organisations to provide hands-on training on resources requested by the host; and, most recently, an e-learning programme that we have launched so that anyone can undertake some training in their own time.
24
Hands-on training for all levels of experience
Interactive training in our purpose-built IT training suite at EMBL-EBI, Hinxton, Cambridge
Learn from the EBI’s experts through a combination of talks and practical exercises
Take a tour of all our core data resources, or focus in on specific data types
Full programme at
Wellcome Images
25
Genomics, proteomics, transcriptomics, protein structures…
The training programme aims to cover all the core EBI resources at both introductory and advanced levels. For example, the two-day dip course shown here gives an overview of different resources and acts as a way to orientate yourself and become familiar with the range of resources, whereas a more specific course, such as one on transcriptomics, will link the use of several different resources together.
27
Consolidating Bioinformatics in Europe
EU-funded projects coordinated by the EBI
We’re getting into politics here; I’d exclude this series of slides for a roadshow audience.
28
SLING – Serving life science information in the next generation
Providing unrestricted access to some of the world’s most important biological databases
Bioinformatics roadshows provide hands-on training for users
Funded by the European Commission under its FP7 Programme, within the Research Infrastructure Programme
4 partners in 4 countries
29
ENFIN Network of Excellence
Brings together experimentalists and computational biologists to develop the next generation of informatics resources for systems biology
Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’
20 partners in 13 countries
30
EMBRACE Network of Excellence
Aims to enable bioinformatics research through better interoperability of servers, databases and services
Funded by the European Commission within its FP6 programme under the thematic area ‘Life sciences, genomics and biotechnology for health’
17 partners in 11 countries
31
ELIXIR – European life sciences infrastructure for biological information
To build a sustainable European infrastructure for biological information, supporting life science research and its translation to medicine, the environment, the bioindustries and society
32 participants in 13 countries