Published by Esther Gilmore. Modified over 9 years ago.
1
Short Introduction to EMBL-EBI. Vicky Schneider, EMBL-EBI Training Programme Project Leader (vicky@ebi.ac.uk)
2
08.08.2015 What is EMBL-EBI? Based on the Wellcome Trust Genome Campus near Cambridge, UK. Part of the European Molecular Biology Laboratory. Non-profit organisation.
3
The five branches of EMBL: Monterotondo (mouse biology); Grenoble (structural biology); Hinxton (bioinformatics); Hamburg (structural biology); Heidelberg (basic research in molecular biology, administration, EMBO). 1500 staff, >60 nationalities.
4
EMBL member states: Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom. Associate member state: Australia.
5
How is EMBL-EBI funded? In 2010 it cost €41 million to run EMBL-EBI. Sources: EMBL member states (€22.4 M); EU (€7.4 M); Charity (€4.1 M); US Govt (€2.9 M); UK Research Councils (€2.5 M).
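As a quick check on the funding figures, the sketch below adds up the sources listed on the slide and expresses each as a share of the €41 million running cost. The variable and function names are illustrative only. The slide does not say where the remainder comes from; it may be rounding or sources that are not listed.

```python
# Funding figures for 2010, in millions of euros, as listed on the slide.
funding_meur = {
    "EMBL member states": 22.4,
    "EU": 7.4,
    "Charity": 4.1,
    "US Govt": 2.9,
    "UK Research Councils": 2.5,
}
TOTAL_MEUR = 41.0  # stated running cost for 2010


def funding_shares(sources, total):
    """Return each source's share of the total, as a percentage to 1 decimal place."""
    return {name: round(100 * amount / total, 1) for name, amount in sources.items()}


listed = round(sum(funding_meur.values()), 1)   # sum of the listed sources
unlisted = round(TOTAL_MEUR - listed, 1)        # part of the total not itemised on the slide
shares = funding_shares(funding_meur, TOTAL_MEUR)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"Listed sources: {listed} M of {TOTAL_MEUR} M (not itemised: {unlisted} M)")
```

On these numbers the EMBL member states contribute a little over half of the total.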
6
What Is Bioinformatics?
7
What is bioinformatics? Storing, retrieving, analysing. Interdisciplinary. Heart of modern biology.
8
Biology is changing: data explosion; new types of data; high-throughput biology; emphasis on systems, not reductionism; growth of applied biology (molecular medicine, agriculture, food, environmental sciences…). Figure: Growth of raw storage at EMBL-EBI (in terabytes).
9
The molecules of life
The ‘book of life’: DNA contains the information needed to build an organism.
The interpreter: RNA translates the DNA code into protein.
Molecular machines: proteins carry out the functions of life.
Catalysts: enzymes enable reactions to occur at body temperature.
Structural support: keratin and collagen give structure to our tissues.
Transport: carrier proteins move molecules into and out of cells.
Defense: antibodies protect us from disease-causing organisms.
Movement: myosin in muscles enables them to contract.
Nature’s ingredients: small molecules provide building blocks, messengers and helpers.
Amino acids: the building blocks of proteins.
Nucleotides and sugars: the building blocks of DNA and RNA.
Co-enzymes: pigments such as chlorophyll and haem help important processes such as photosynthesis and respiration.
Hormones: small molecules such as adrenalin and testosterone send important messages from cell to cell.
10
Bioinformatics underpins life-science research
1. Genomes contain genes
2. Genes are transcribed
3. Transcripts translate to protein sequences
4. Proteins form three-dimensional structures
5. Proteins interact with each other and with small molecules to form pathways
6. Pathways combine to build systems
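The transcription and translation steps can be sketched in a few lines of Python. This is a minimal illustration, not how any EMBL-EBI service works: it assumes the input is the coding strand of a gene, uses the standard genetic code, and reads a single frame from the first base. The example sequence and the function names are made up.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G base order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Transcribe a coding-strand DNA sequence into mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # the table is keyed on DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical 12-base gene fragment: start codon, two codons, stop codon.
mrna = transcribe("ATGGCCTGGTAA")
protein = translate(mrna)
print(mrna, "->", protein)
```

Real genes need more care (introns, reverse strands, alternative genetic codes), which is part of what the genome and protein databases described later take care of.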
11
From molecules to medicine. Column headings: Molecular components; Integration; Translation. Items: Genomes; Nucleotides; Transcripts; Proteins; Complexes; Pathways; Small molecules; Structures; Domains; Cells; Biobanks; Tissues and organs; Human populations; Human individuals; Therapies; Disease prevention; Early diagnosis.
12
Examples of the importance of biological information to all of us
13
Genome-wide analysis of crop plants Population growth and climate change are major challenges to food security. Traditional routes to crop improvement are too slow to keep up with this increase in demand. Understanding plant genomes helps us identify which species will be most tolerant to drought, salt and pests while still providing optimum nutrition.
14
Matching the treatment to the cancer One in ten women in the EU-27 will develop breast cancer before the age of 80. If we can identify patterns of genes that are active in different tumours, we can diagnose and treat cancers earlier.
15
Tracking the source of infectious disease. Methicillin-resistant Staphylococcus aureus (MRSA) infection is a global problem. Transmission of individual clones can be tracked using small variations in DNA sequence. This technology can be used to identify the source of new outbreaks across continents and within wards.
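To make the idea of tracking clones by small sequence variations concrete, here is a toy sketch. It counts the positions at which two aligned sequences differ and reports which known isolate is closest to a new case. The sequences, isolate names and function names are invented for illustration; real outbreak studies compare whole genomes and use phylogenetic methods rather than a simple mismatch count.

```python
def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def closest_isolate(query, isolates):
    """Return the name of the isolate with the fewest differences from the query."""
    return min(isolates, key=lambda name: snp_distance(query, isolates[name]))


# Hypothetical aligned fragments from three previously sequenced isolates.
isolates = {
    "ward_A": "ACGTACGTAC",
    "ward_B": "ACGTACGTTC",
    "overseas": "ACCTACGAAC",
}
new_case = "ACGTACGTTG"

for name, seq in isolates.items():
    print(name, snp_distance(new_case, seq))
print("closest:", closest_isolate(new_case, isolates))
```

The fewer differences between two isolates, the more likely they share a recent common source, which is the intuition behind tracing transmission within a ward or across continents.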
16
Barcoding life. DNA barcodes are short sections of DNA that we use to identify an organism. The Barcode of Life Initiative is developing DNA barcoding as a global standard for identifying species. Applications include: protection of endangered species; sustaining natural resources through pest control; food labelling.
17
Repurposing drugs for neglected diseases Schistosomiasis is a parasitic infection that affects 210 million people in 76 countries. Resistance is developing to the one available drug. We look at the Schistosome genome to identify the targets of existing drugs. Candidates can be tested for anti-schistosomal activity or used as leads for further optimisation.
18
Lots of data and new types of data: genomes; nucleotide sequence; gene expression; proteomes; protein sequence; protein families, domains and motifs; protein structure; protein-protein interactions; chemical entities; pathways; systems; literature.
19
EMBL-EBI’s mission statement
To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
To contribute to the advancement of biology through basic investigator-driven research in bioinformatics
To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators
To help disseminate cutting-edge technologies to industry
To coordinate biological data provision across Europe
20
Services www.ebi.ac.uk/services
21
Principles of service provision: Comprehensive; Compatibility; Portability; Quality; Accessibility. Image credit: Patrick Hoesly
22
Databases: molecules to systems
Genomes: Ensembl, Ensembl Genomes, EGA
Nucleotide sequence: ENA
Functional genomics: ArrayExpress, Expression Atlas
Protein sequences: UniProt
Protein families, motifs and domains: InterPro
Macromolecular: PDBe
Protein activity: IntAct, PRIDE
Chemical entities: ChEBI
Pathways: Reactome
Systems: BioModels, BioSamples
Literature and ontologies: CiteXplore, GO
Chemogenomics: ChEMBL
23
Database collaborations
24
Standards development – international collaborations
Genome annotation: www.geneontology.org
Functional Genomics Data Society: www.fged.org
Protein sequence: www.uniprot.org
HUPO Proteomics Standards Initiative (PSI): www.psidev.info/
Protein structure: www.wwpdb.org
Cheminformatics: www.ebi.ac.uk/chebi
Pathways: www.reactome.org, www.biopax.org
Systems modelling standards: www.sbml.org
Metabolomics Standards Initiative (MSI): www.metabolomicssociety.org
Genomics Standards Consortium (GSC): http://gensc.org
Nucleotide sequence: www.insdc.org
25
Resources shown: ENA, UniProt, ArrayExpress, Atlas, InterProScan, Pfam, Ensembl, PDB, PDBsum, IntAct, Reactome, IntEnz, ProFunc, MACiE, ChEBI, BioModels, GenBank, PubMed, CiteXplore, GO, BLAST, FASTA, CATH, SCOP, PubChem, RefSeq, VAST, GEO, STRING, GOA, PRIDE, PRINTS, UCSC Genome Browser, DDBJ, Gene3D, Gramene, FlyBase.
Categories: Genomes; Nucleotide Sequences; Protein Sequences; Macromolecular Structures; Small Molecules; Gene Expression; Molecular Interactions; Reactions & Pathways; Protein Families (Diagnostic); Enzymes; Literature; Ontologies; Proteomics; Sequence Similarity & Analysis; Pattern & Motif Search (Diagnostic); Structure Analysis.
26
Resources shown: ENA, UniProt, ArrayExpress, Atlas, InterProScan, InterPro, Pfam, Ensembl, PDB, PDBsum, IntAct, Reactome, IntEnz, ProFunc, MACiE, ChEBI, BioModels, GenBank, PubMed, CiteXplore, GO, BLAST, FASTA, CATH, SCOP, PubChem, RefSeq, VAST, GEO, STRING, GOA, PRIDE, PRINTS, UCSC Genome Browser, DDBJ, Gene3D, Gramene, FlyBase.
Categories: Genomes; Nucleotide Sequences; Protein Sequences; Macromolecular Structures; Small Molecules; Gene Expression; Molecular Interactions; Reactions & Pathways; Protein Families (Diagnostic); Enzymes; Literature; Ontologies; Proteomics; Sequence Similarity & Analysis; Pattern & Motif Search (Diagnostic); Structure Analysis.
27
New search service. Access from the EBI’s homepage. Data organised according to: gene; expression; protein; structure; literature. Species selector allows for easy comparison. Explore data, return easily to your results.
28
Goals of the new EBI Search: relevant to ‘wet-lab’ biologists; organises information based around a single gene (or a small number of genes); user-expectation centric (not database centric); smooth transition to the detailed information in many of EBI’s core databases. NOT for bioinformaticians: does not provide programmatic access.
29
Quick databases tour
30
Genomes 1: Ensembl. Pick a genome; genes; chromosomes; variations; gene trees; gene families; genomic alignments; synteny; user upload; Variation Effect Predictor.
31
Genomes 2: Ensembl Genomes. Interface uses Ensembl technology. Pan-taxonomic comparative analysis. Genome portals for the five kingdoms of life. Multi-way comparison of whole bacterial chromosomes. Variation data for plant, metazoan and fungal species.
32
Nucleotides: European Nucleotide Archive (ENA). The ENA has a three-tiered data architecture. It consolidates information from EMBL-Bank, the European Trace Archive (containing raw data from electrophoresis-based sequencing machines) and the Sequence Read Archive (containing raw data from next-generation sequencing platforms). Figure adapted from: Cochrane, G. et al. Public Data Resources as the Foundation for a Worldwide Metagenomics Data Infrastructure. In: Metagenomics: Theory, Methods and Applications (Chapter 5), Caister Academic Press, Universidad Nacional de Cordoba, Argentina. Ed. D. Marco (2010).
33
Transcriptomes: ArrayExpress. ArrayExpress Archive: browse experiments; search by keyword; expand results; spreadsheets describing the sample properties.
34
Transcriptomes: Gene Expression Atlas. Atlas: browse changes in gene expression; search by gene or biological condition; gene page; experiment page.
35
Input sources for UniProtKB. UniProt annotation routes: manual curation; literature-based annotation; sequence analysis; automated annotation. Some data sources for annotation: PRIDE (protein identification data); GO (functional info); InterPro (protein families and domains); IntAct (molecular interactions); IntEnz (enzymes); HAMAP (microbial protein families); RESID (post-translational modifications). Other labels: transmembrane prediction; InterPro classification; signal prediction; other predictions; protein classification.
36
Protein families, motifs and domains: InterPro. Powerful tool for protein classification, integrating several methods into one resource. View architectures of proteins containing a signature. Compare methods of protein signature prediction. Visualise the taxonomic range for a protein signature.
37
Proteomics services. IntAct: molecular interactions. IntEnz: enzyme classification. ChEBI: small molecules. PRIDE: protein identifications from proteomics experiments.
38
Structures: PDBe
39
Chemogenomics: ChEMBL. ChEMBL database; ChEMBL Neglected Tropical Disease (NTD) archive; browse targets; target search; compound search; search results; Kinase SARfari; GPCR SARfari.
40
Pathways: Reactome. Export pathway to your favourite modelling software. Compare events in different species. Link to source databases. View expression values overlaid on a pathway. Interaction overlay on a pathway diagram.
41
Data management
Leased two new data centres (with €11.4M from UK Research Councils)
Over 800 million cross-references in the databases we serve
Over 4M web requests per day, or over 4.6M if Ensembl is included
Over 280,000 unique hosts served per month, excluding Ensembl
Total disk space: 10 petabytes in 2010
42
User support
E-mail support – www.ebi.ac.uk/support
Online help pages – www.ebi.ac.uk/help
2Can bioinformatics user support – www.ebi.ac.uk/2Can
eLearning Portal – coming soon (elearning@ebi.ac.uk)
43
Research www.ebi.ac.uk/groups
44
Key facts about research
The EBI provides a unique environment for bioinformatics research.
Eight dedicated research groups aim to understand biology through new approaches to interpreting biological data.
Services teams also carry out R&D to enhance existing services and develop new ones.
Research programme complements services and the two are mutually supportive.
45
Curiosity-driven research. Research areas: Chemistry; Proteins; Genomes; Pathways and systems; Transcriptomes; Text mining. Group leaders: Ewan Birney, Rolf Apweiler, Nicolas Le Novère, Nick Luscombe, Paul Flicek, Christoph Steinbeck, John Overington, Paul Bertone, Anton Enright, Gerard Kleywegt, Janet Thornton, Alvis Brazma, Nick Goldman, Dietrich Rebholz-Schuhmann, John Marioni, Julio Saez-Rodriguez. Disciplines: biology/medicine; chemistry/chem engineering; maths; physics.
46
Training www.ebi.ac.uk/training
48
Hands-on training for all levels of experience
Interactive training in our purpose-built IT training suite at EMBL-EBI, Hinxton, Cambridge
Learn from the EBI’s experts through a combination of talks and practical exercises
Take a tour of all our core data resources, or focus in on specific data types
Full programme at www.ebi.ac.uk/training/handson
49
Predoc and postdoc training
Open Days for bioinformatics early-stage researchers: www.ebi.ac.uk/training/openday
PhD studentships through EMBL International PhD Programme: www.ebi.ac.uk/training/Studentships
EIPOD interdisciplinary post-doc fellowship programme: www.embl.de/training/postdocs/eipod
EBI–Sanger postdoc programme: www.ebi.ac.uk/training/postdoc/ESPOD