Short Introduction To EMBL-EBI
Vicky Schneider, EMBL-EBI Training Programme Project leader

What is EMBL-EBI?
Based on the Wellcome Trust Genome Campus near Cambridge, UK
Part of the European Molecular Biology Laboratory
Non-profit organisation

The five branches of EMBL
Monterotondo: mouse biology
Grenoble: structural biology
Hinxton: bioinformatics
Hamburg: structural biology
Heidelberg: basic research in molecular biology, administration, EMBO
1500 staff, >60 nationalities

EMBL member states
Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom
Associate member state: Australia

How is EMBL-EBI funded?
In 2010 it cost €41 million to run EMBL-EBI.
EMBL member states: €22.4 M
EU: €7.4 M
Charity: €4.1 M
US Govt: €2.9 M
UK Research Councils: €2.5 M
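As a quick check on the funding figures above, the short Python sketch below works out each source's share of the 2010 running cost. The variable names are illustrative only. The listed sources add up to less than the €41 million total; the slide does not say where the remainder comes from, so the sketch simply reports it as unlisted.

```python
# Funding figures for 2010 as listed on the slide, in millions of euros.
TOTAL_COST_M = 41.0
sources = {
    "EMBL member states": 22.4,
    "EU": 7.4,
    "Charity": 4.1,
    "US Govt": 2.9,
    "UK Research Councils": 2.5,
}

# Sum of the sources that the slide names explicitly.
listed = sum(sources.values())
# Part of the total that the slide does not attribute to any source.
unlisted = TOTAL_COST_M - listed

for name, amount in sources.items():
    share = amount / TOTAL_COST_M * 100
    print(f"{name:<22} {amount:5.1f} M EUR  {share:5.1f}%")

print(f"{'Listed sources':<22} {listed:5.1f} M EUR")
print(f"{'Not listed':<22} {unlisted:5.1f} M EUR")
```

On these numbers, EMBL member states cover a little over half of the total.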

What Is Bioinformatics?

What is bioinformatics?
Storing, retrieving and analysing biological information
Interdisciplinary
Heart of modern biology

Biology is changing
Data explosion
New types of data
High-throughput biology
Emphasis on systems, not reductionism
Growth of applied biology: molecular medicine, agriculture, food, environmental sciences…
Figure: Growth of raw storage at EMBL-EBI (in terabytes)

The molecules of life
The ‘book of life’: DNA contains the information needed to build an organism
The interpreter: RNA translates the DNA code into protein
Molecular machines: proteins carry out the functions of life
  Catalysts: enzymes enable reactions to occur at body temperature
  Structural support: keratin and collagen give structure to our tissues
  Transport: carrier proteins move molecules into and out of cells
  Defense: antibodies protect us from disease-causing organisms
  Movement: myosin in muscles enables them to contract
Nature’s ingredients: small molecules provide building blocks, messengers and helpers
  Amino acids: the building blocks of proteins
  Nucleotides and sugars: the building blocks of DNA and RNA
  Co-enzymes: pigments such as chlorophyll and haem help important processes such as photosynthesis and respiration
  Hormones: small molecules such as adrenalin and testosterone send important messages from cell to cell

Bioinformatics underpins life-science research
1 Genomes contain genes
2 Genes are transcribed
3 Transcripts translate to protein sequences
4 Proteins form three-dimensional structures
5 Proteins interact with each other and with small molecules to form pathways
6 Pathways combine to build systems
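Steps 2 and 3 above (genes are transcribed, transcripts translate to protein sequences) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions: the gene is a made-up twelve-base sequence, the input is treated as the coding strand, and the codon table holds only the four codons the example needs rather than the full genetic code. The function names are illustrative and do not come from any EMBL-EBI tool.

```python
# Partial codon table: only the codons used by the toy gene below.
# "*" marks a stop codon. A real table has 64 entries.
CODON_TABLE = {"AUG": "M", "GCU": "A", "UGG": "W", "UAA": "*"}


def transcribe(dna: str) -> str:
    """Turn a coding-strand DNA sequence into mRNA by replacing T with U."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time and stop at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        # Codons missing from the partial table are shown as "X".
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene = "ATGGCTTGGTAA"          # hypothetical toy gene
mrna = transcribe(gene)        # step 2: transcription
protein = translate(mrna)      # step 3: translation
print(mrna, protein)
```

Real sequence analysis would use a complete codon table and handle both strands and reading frames; the sketch only shows the direction of information flow from DNA to RNA to protein.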

From molecules to medicine
Molecular components, Integration, Translation
Genomes, Nucleotides, Transcripts, Proteins, Complexes, Pathways, Small molecules, Structures, Domains, Cells, Biobanks, Tissues and organs, Human populations, Therapies, Disease prevention, Early diagnosis, Human individuals

Examples of the importance of biological information to all of us

Genome-wide analysis of crop plants
Population growth and climate change are major challenges to food security. Traditional routes to crop improvement are too slow to keep up with this increase in demand. Understanding plant genomes helps us identify which species will be most tolerant to drought, salt and pests while still providing optimum nutrition.

Matching the treatment to the cancer
One in ten women in the EU-27 will develop breast cancer before the age of 80. If we can identify patterns of genes that are active in different tumours, we can diagnose and treat cancers earlier.

Tracking the source of infectious disease
Methicillin-resistant Staphylococcus aureus (MRSA) infection is a global problem. Transmission of individual clones can be tracked using small variations in DNA sequence. This technology can be used to identify the source of new outbreaks across continents and within wards.
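The idea of tracking clones through small variations in DNA sequence can be sketched in Python as follows. This is an illustration only: the isolate names and sequences are invented, the sequences are assumed to be already aligned and of equal length, and counting mismatched positions stands in for the far more careful variant calling and phylogenetic analysis used in real outbreak studies.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical aligned fragments from three isolates (made-up data).
isolates = {
    "ward_A_patient_1": "ATGCTAGCTAGGCTA",
    "ward_A_patient_2": "ATGCTAGCTAGGCTT",
    "ward_B_patient_1": "ATGATAGCTCGGCAA",
}

# Pairwise distances between all isolates.
distances = {
    (a, b): snp_distance(isolates[a], isolates[b])
    for a, b in combinations(sorted(isolates), 2)
}

# The pair with the fewest differences is the most likely transmission link.
closest = min(distances, key=distances.get)

for pair, dist in distances.items():
    print(pair, dist)
print("Most closely related:", closest)
```

In this toy data set the two ward A isolates differ at a single position, which is the kind of signal that would suggest transmission within a ward.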

Barcoding life
DNA barcodes are short sections of DNA that we use to identify an organism. The Barcode of Life Initiative is developing DNA barcoding as a global standard for identifying species.
Applications include:
  Protection of endangered species
  Sustaining natural resources through pest control
  Food labelling

Repurposing drugs for neglected diseases
Schistosomiasis is a parasitic infection that affects 210 million people in 76 countries. Resistance is developing to the one available drug. We look at the Schistosome genome to identify the targets of existing drugs. Candidates can be tested for anti-schistosomal activity or used as leads for further optimisation.

Lots of data and new types of data
Genomes
Nucleotide sequence
Gene expression
Proteomes
Protein sequence
Protein families, domains and motifs
Protein structure
Protein-protein interactions
Chemical entities
Pathways
Systems
Literature

EMBL-EBI’s mission statement
To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress
To contribute to the advancement of biology through basic investigator-driven research in bioinformatics
To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators
To help disseminate cutting-edge technologies to industry
To coordinate biological data provision across Europe
08/08/2015

Services

Principles of service provision
Comprehensive
Compatibility
Portability
Quality
Patrick Hoesly

Databases: molecules to systems
Genomes: Ensembl, Ensembl Genomes, EGA
Nucleotide sequence: ENA
Functional genomics: ArrayExpress, Expression Atlas
Protein sequences: UniProt
Protein families, motifs and domains: InterPro
Macromolecular structures: PDBe
Protein activity: IntAct, PRIDE
Chemical entities: ChEBI
Chemogenomics: ChEMBL
Pathways: Reactome
Systems: BioModels, BioSamples
Literature and ontologies: CiteXplore, GO

Database collaborations

Standards development – international collaborations
Genome annotation
Functional Genomics Data Society
Protein sequence
HUPO-Proteomics Standards Initiative (PSI)
Protein structure
Cheminformatics
Pathways
Systems modelling standards
Metabolomics Standards Initiative (MSI)
Genomics Standards Consortium (GSC)
Nucleotide sequence

Figure: map of bioinformatics resources by data type
Data types: Genomes, Nucleotide Sequences, Protein Sequences, Macromolecular Structures, Small Molecules, Gene Expression, Molecular Interactions, Reactions & Pathways, Protein Families (Diagnostic), Enzymes, Literature, Ontologies, Proteomics, Sequence Similarity & Analysis, Pattern & Motif Search (Diagnostic), Structure Analysis
Resources: ENA, UniProt, ArrayExpress, Atlas, InterPro, InterProScan, Pfam, Ensembl, PDB, PDBsum, IntAct, Reactome, IntEnz, ProFunc, MACiE, ChEBI, BioModels, GenBank, PubMed, CiteXplore, GO, GOA, BLAST, FASTA, CATH, SCOP, PubChem, RefSeq, VAST, GEO, STRING, PRIDE, PRINTS, UCSC Genome Browser, DDBJ, Gene3D, Gramene, FlyBase

New search service
Access from the EBI’s homepage
Data organised according to: gene expression, protein structure, literature
Species selector allows for easy comparison
Explore data, return easily to your results

Goals of the new EBI Search
Relevant to ‘wet-lab’ biologists
Organises information based around a single gene (or a small number of genes)
User-expectation centric (not database centric)
Smooth transition to the detailed information in many of EBI’s core databases
NOT for bioinformaticians: does not provide programmatic access

Quick databases tour

Genomes 1: Ensembl
Pick a genome
Chromosomes
Genes
Gene trees
Gene families
Genomic alignments
Synteny
Variations
Variation Effect Predictor
User Upload

Genomes 2: Ensembl Genomes
Interface uses Ensembl technology
Pan-taxonomic comparative analysis
Genome portals for the five kingdoms of life
Multi-way comparison of whole bacterial chromosomes
Variation data for plant, metazoan and fungal species

Nucleotides: European Nucleotide Archive (ENA)
The ENA has a three-tiered data architecture. It consolidates information from EMBL-Bank, the European Trace Archive (containing raw data from electrophoresis-based sequencing machines) and the Sequence Read Archive (containing raw data from next-generation sequencing platforms).
Figure adapted from: Cochrane, G. et al. Public Data Resources as the Foundation for a Worldwide Metagenomics Data Infrastructure. In: Metagenomics: Theory, Methods and Applications (Chapter 5), Caister Academic Press, Universidad Nacional de Cordoba, Argentina. Ed. D. Marco (2010).

Transcriptomes: ArrayExpress
ArrayExpress Archive: browse experiments
Search by keyword
Expand results
Spreadsheets describing the sample properties

Transcriptomes: Gene Expression Atlas
Atlas: browse changes in gene expression
Search by gene or biological condition
Gene page
Experiment page

Input sources for UniProtKB
UniProt manual curation: literature-based annotation, sequence analysis
Automated annotation: transmembrane prediction, InterPro classification, signal prediction, other predictions, protein classification
Some data sources for annotation:
  PRIDE: protein identification data
  GO: functional info
  InterPro: protein families and domains
  IntAct: molecular interactions
  IntEnz: enzymes
  HAMAP: microbial protein families
  RESID: post-translational modifications

Protein families, motifs and domains: InterPro
Powerful tool for protein classification, integrating several methods into one resource
View architectures of proteins containing a signature
Compare methods of protein signature prediction
Visualise the taxonomic range for a protein signature

Proteomics services
IntAct: molecular interactions
IntEnz: enzyme classification
ChEBI: small molecules
PRIDE: protein identifications from proteomics experiments

Structures: PDBe

Chemogenomics: ChEMBL
ChEMBL database
ChEMBL Neglected Tropical Disease (NTD) archive
Browse targets
Target search
Compound search
Search results
Kinase SARfari
GPCR SARfari

Pathways: Reactome
Export pathway to your favourite modelling software
Compare events in different species
Link to source databases
View expression values overlaid on a pathway
Interaction overlay on a pathway diagram

Data management
Leased two new data centres (with €11.4M from UK Research Councils)
Over 800 million cross-references in the databases we serve
Over 4M web requests per day – over 4.6M if Ensembl is included
Over 280,000 unique hosts served per month, excluding Ensembl
Total disk space: 10 petabytes in 2010

User support
Online help pages
2Can bioinformatics user support
eLearning Portal – coming soon

Research

Key facts about research
The EBI provides a unique environment for bioinformatics research
Eight dedicated research groups aim to understand biology through new approaches to interpreting biological data
Services teams also carry out R&D to enhance existing services and develop new ones
Research programme complements services and the two are mutually supportive

Curiosity-driven research
Research areas: Genomes, Transcriptomes, Proteins, Pathways and systems, Chemistry, Text mining
Group leaders: Ewan Birney, Rolf Apweiler, Nicolas Le Novère, Nick Luscombe, Paul Flicek, Christoph Steinbeck, John Overington, Paul Bertone, Anton Enright, Gerard Kleywegt, Janet Thornton, Alvis Brazma, Nick Goldman, Dietrich Rebholz-Schuhmann, John Marioni, Julio Saez-Rodriguez
Disciplines: biology/medicine, chemistry/chem engineering, maths, physics

Training

Hands-on training for all levels of experience
Interactive training in our purpose-built IT training suite at EMBL-EBI, Hinxton, Cambridge
Learn from the EBI’s experts through a combination of talks and practical exercises
Take a tour of all our core data resources, or focus in on specific data types
Full programme at

Predoc and postdoc training
Open Days for bioinformatics early-stage researchers
PhD studentships through EMBL International PhD Programme
EIPOD interdisciplinary post-doc fellowship programme
EBI–Sanger postdoc programme ww.ebi.ac.uk/training/postdoc/ESPOD