EBI resources introductory course Pablo Porras Millán
Schedule
8:30 - 9:30 Intro to EBI
9: :00 Expectations assessment
10: :30 Browsing the genome and exploring sequences: DNA & RNA services (Ensembl, Ensembl Genomes, ENA)
11: :00 Break
12: :30 Studying expression profiles: Gene expression services (ArrayExpress and Expression Atlas)
12: :30 Understanding proteins: Resources for identification and annotation (GO, UniProt & InterPro)
13: :30 Lunch
14: :30 Proteomics and systems: From mass spectrometry data to models (PRIDE, IntAct, Reactome & BioModels)
15: :00 Break
16: :30 Small molecules bioinformatics (ChEMBL, ChEBI, MetaboLights)
16: :00 Expectations re-assessments, Q&A
The hub for bioinformatics in Europe The EMBL-European Bioinformatics Institute
What is EMBL-EBI? Part of the European Molecular Biology Laboratory International, non-profit research institute Europe’s hub for biological data, services and research
The European Molecular Biology Laboratory
Heidelberg: Basic research, Administration, EMBO
Hinxton, Cambridge: Bioinformatics
Grenoble: Structural biology
Hamburg: Structural biology
Monterotondo, Rome: Mouse biology
EMBL staff: 1500 people, >60 nationalities
EMBL-EBI’s mission Provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress Contribute to the advancement of biology through basic investigator-driven research in bioinformatics Provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators Help disseminate cutting-edge technologies to industry Coordinate biological data provision throughout Europe
EMBL member states Austria, Belgium, Croatia, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Israel, Italy, Luxembourg, the Netherlands, Norway, Portugal, Spain, Sweden, Switzerland and the United Kingdom Associate member state: Australia
Data and tools for molecular life science Services
What services do we provide? Labs around the world send us their data and we… Archive it Classify it Share it with other data providers Analyse it …provide tools to help researchers use it A virtuous circle
Bioinformatics underpins research Genomes Nucleotide sequence Gene expression Protein families, domains and motifs Protein structure Protein-protein interactions Chemical entities Pathways Systems Literature Protein sequence and proteomes
Standards – international collaborations
Genome annotation
Functional Genomics Data Society
Protein sequence
HUPO Proteomics Standards Initiative (PSI)
Protein structure
Cheminformatics
Pathways
Systems modelling standards
Metabolomics Standards Initiative (MSI)
Genomics Standards Consortium (GSC)
Nucleotide sequence
EMBL-EBI users: a one-day snapshot
Key facts about our services Freely available A comprehensive collection of molecular databases Globally coordinated data collection and dissemination Produced in collaboration with other world leaders: NCBI (US) National Institute of Genetics (Japan) SIB Swiss Institute of Bioinformatics (Switzerland) Wellcome Trust Sanger Institute (UK)
Data resources
DNA & RNA: genes, genomes & variation
Gene expression: RNA, protein & metabolite expression
Proteins: sequences, families & motifs
Structures: molecular & cellular structures
Systems: reactions, interactions & pathways
Chemical biology: chemogenomics & metabolomics
Ontologies: taxonomies & controlled vocabularies
Literature: scientific publications & patents
Other: software, cross-domain tools & resources
The EBI Search Service
Gene and protein summaries
Data organised by: gene, expression, protein, structure, literature
Species selector allows for easy comparison
Explore the data and return easily to your results
Bioinformatics tools
Over 100 analysis tools
Results enriched with data from EBI resources
Nucleotide sequence search, e.g. BLAST nucleotide
Protein sequence search, e.g. BLAST protein, PSI-Search
Multiple sequence alignment, e.g. Clustal Omega, MUSCLE
Pairwise sequence alignment, e.g. Needle
Protein functional analysis, e.g. InterProScan
Functional genomics tools, e.g. Expression Atlas
Molecular structure analysis, e.g. PDBeFold
Text mining, e.g. EBIMed, Whatizit
Programmatic access: EBI Web Services
Run tasks on EBI servers, using EBI data
Ideal for large scale analyses, repetitive tasks and internal pipelines
Integration of EBI resources and data: EBI Search, tools, data retrieval
Same programs, data and results enrichment as running via the web pages
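As a rough illustration of what "repetitive tasks" via programmatic access can look like, the sketch below builds request URLs for a batch of search queries. The base URL, the `uniprot` domain name and the `query`/`format`/`size` parameters are assumptions about the EBI Search REST interface and should be checked against the current EBI Web Services documentation; `build_search_url` and `build_batch` are hypothetical helper names. No request is actually sent.

```python
from urllib.parse import urlencode

# Assumed base URL of the EBI Search REST interface (verify before use).
EBI_SEARCH_BASE = "https://www.ebi.ac.uk/ebisearch/ws/rest"


def build_search_url(domain: str, query: str, size: int = 10) -> str:
    """Return a search URL for one query in one data domain (assumed parameters)."""
    params = urlencode({"query": query, "format": "json", "size": size})
    return f"{EBI_SEARCH_BASE}/{domain}?{params}"


def build_batch(queries, domain: str = "uniprot"):
    """Build one URL per query; a pipeline would then fetch each of them."""
    return [build_search_url(domain, q) for q in queries]


if __name__ == "__main__":
    for url in build_batch(["BRCA2", "TP53"]):
        # Fetching is left out on purpose; with network access one could use
        # urllib.request.urlopen(url) and parse the JSON response.
        print(url)
```

Keeping URL construction separate from fetching makes the batch easy to inspect and test before any load is placed on the servers.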
Data-driven discovery PhD and postdoctoral programmes Research
Research themes
Genes & gene expression: Paul Bertone, Ewan Birney, Alvis Brazma, Anton Enright, Paul Flicek, Nick Goldman
Proteins, structures & chemical biology: Alex Bateman, Gerard Kleywegt, John Overington, Christoph Steinbeck, Sarah Teichmann, Janet Thornton
Systems biology: Pedro Beltrao, John Marioni, Julio Saez-Rodriguez, Oliver Stegle
Research leaders: John Overington, Janet Thornton, Christoph Steinbeck, Ewan Birney, Paul Flicek, Nick Goldman, John Marioni, Oliver Stegle, Gerard Kleywegt, Paul Bertone, Alex Bateman, Sarah Teichmann, Alvis Brazma, Anton Enright, Pedro Beltrao, Julio Saez-Rodriguez
Examples of EMBL-EBI research What is the molecular basis of ageing? How do the neurons of someone with Parkinson’s disease signal differently from healthy neurons? What makes a stem cell decide to become skin or muscle? Which of these proteins will make good targets for drugs? Which of these changes to a genome’s structure drive cancer?
PhDs and Postdocs EMBL International PhD programme: Postdoctoral positions available from: Postdoctoral fellowships: EIPOD EMBL sponsored: interdisciplinary ESPOD EBI–Sanger: combined experimental/computational
User training For scientists working at all levels
Bioinformatics training
Train at EMBL-EBI: Gain hands-on experience in our state-of-the-art facilities.
Train online: Learn in your own time, at your own pace with our freely available online courses.
Train at your place: Choose the training that’s right for you and your colleagues, and our experts will come to you.
Train online Free online courses Learn in your own time, at your own pace Created for life-science researchers No previous knowledge of bioinformatics needed
Support and collaboration Interactions with industry
The EMBL-EBI Industry Programme Helping industry make the most of innovations in bioinformatics Neutral ground for members to explore developments and concepts Pre-competitive collaboration Standards development Technical development Input into services development “The Programme’s regular meetings foster inter-company interactions as we collaborate on special projects and liaise on other industry initiatives.” - Bertram
Industry Programme members
Astellas Pharma Inc.
AstraZeneca
Bayer Pharma AG
Boehringer Ingelheim
Bristol-Myers Squibb
Eli Lilly and Company
F. Hoffmann-La Roche
GlaxoSmithKline
Johnson & Johnson Pharmaceutical R&D
Merck Serono S.A.
Nestlé Institute of Health Sciences
Novartis Pharma AG
Novo Nordisk
Syngenta
Sanofi-Aventis Recherche & Développement
UCB
Unilever
EMBL member states The European Commission The Wellcome Trust Research Councils UK US National Institutes of Health With thanks to our funders Supported by the European Community's Seventh Framework Programme (FP7/ ) under grant agreement for Affinomics (FP ).
A brief introduction to standards and data integration
Lazebnik, Biochemistry (Mosc). 2004, PMID:
[Figure labels: Serendipitously Recovered Component; Most Important Component; Really Important Component; Undoubtedly Most Important Component]
A model that reflects reality The biologist’s model
Standards Images from:
Standards in bioinformatics Common identifiers Controlled vocabularies / ontologies Common formats Common schemas Minimum information guidelines Common query interfaces Schema Data distribution Reporting guideline Control vocabulary Format Identifiers
The problem of data integration
[Figure: a user reaching databases (DB) through interfaces (I); panels: Ideally, Reality]
Utility of Bioinformatics
[Figure labels: Scientific impact; Too little bioinformatics; Too many databases; Too diverse interfaces. Credit: Tim Hubbard]
Data integration
Combining data residing in different sources … providing users with a unified view of these data.
[Figure: a user reaching databases (DB) through interfaces (I); panels: Ideally, Compromise, Reality]
SHARED CONTROLLED VOCABULARIES!!
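To make the role of shared identifiers and controlled vocabularies concrete, here is a minimal sketch of combining records from two sources into one unified view. The two sources, their field names, the accessions and the annotation values are all made up for illustration; the only real-world fact used is that NCBI taxonomy id 9606 denotes human.

```python
# Hypothetical records from two databases describing the same protein,
# each with its own field names and its own way of naming the species.
source_a = [{"acc": "P00001", "organism": "human", "expression": "high"}]
source_b = [{"uniprot_id": "P00001", "species": "Homo sapiens",
             "pathway": "example pathway"}]

# Shared controlled vocabulary: every local species label maps to one term.
SPECIES_CV = {
    "human": "taxid:9606",
    "Homo sapiens": "taxid:9606",
    "H. sapiens": "taxid:9606",
}


def normalise(record, id_field, species_field):
    """Map a source-specific record onto the shared schema."""
    shared = {k: v for k, v in record.items()
              if k not in (id_field, species_field)}
    shared["id"] = record[id_field]
    shared["species"] = SPECIES_CV[record[species_field]]
    return shared


def integrate(*normalised_sources):
    """Merge normalised records that share the same identifier."""
    merged = {}
    for source in normalised_sources:
        for rec in source:
            merged.setdefault(rec["id"], {}).update(rec)
    return merged


unified = integrate(
    [normalise(r, "acc", "organism") for r in source_a],
    [normalise(r, "uniprot_id", "species") for r in source_b],
)
print(unified)
```

Without the common identifier and the shared species term, the two records could not be recognised as describing the same entity.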
From xkcd: The danger with standards…
Collaboration among data providers
Access, exchange, sharing, portability, interoperability, annotation, comparison, verification, representation, integration, reusability.
Nucleotide sequences (INSDC): EMBL, DDBJ, NCBI
Molecular interactions (IMEx): IntAct, InnateDB, DIP, MINT …
Protein identifications (ProteomeXchange): PRIDE, PeptideAtlas, GPMDB, Tranche …
More data coverage
Less redundancy
Less inconsistency
Better data management
Example of community development of standards: PSI-MI
Work group of the Proteomics Standards Initiative
Community coordination to ensure deposition of Molecular Interaction data in public repositories
Concentrating on …
Annotation and representation of published MI data
Accessibility of MI data to the user community
Data format/schema: PSI-MI XML, PSI-MITAB
Data distribution: PSICQUIC
Control vocabulary: PSI-MI CV
Reporting guideline (MIAPE): MIMIx, IMEx
Scoring: PSISCORE
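As a small illustration of how a common format and a controlled vocabulary work together, the sketch below reads one PSI-MITAB line. It assumes the 15 tab-separated columns of MITAB 2.5 and the `psi-mi:"MI:xxxx"(name)` notation for CV terms; the column names are shorthand chosen here, and the example record (accessions, author, publication and interaction ids) is invented. The MI term ids and names shown should be checked against the PSI-MI CV.

```python
import re

# Shorthand names (chosen here) for the 15 columns assumed for MITAB 2.5.
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_method", "first_author", "publication_ids",
    "taxid_a", "taxid_b", "interaction_type", "source_db",
    "interaction_ids", "confidence",
]

# Controlled-vocabulary fields are assumed to look like: psi-mi:"MI:0018"(two hybrid)
CV_TERM = re.compile(r'psi-mi:"(MI:\d+)"\((.*)\)')


def parse_mitab_line(line):
    """Split one MITAB line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError(f"expected at least 15 columns, got {len(fields)}")
    # zip() ignores any extra columns that later MITAB versions append.
    return dict(zip(MITAB25_COLUMNS, fields))


def parse_cv_term(field):
    """Return (term id, term name), or None if the field is not a CV term."""
    match = CV_TERM.fullmatch(field)
    return (match.group(1), match.group(2)) if match else None


# Invented example record; "-" marks an empty column.
example = "\t".join([
    "uniprotkb:P00001", "uniprotkb:P00002", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2000)", "pubmed:0000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "-",
])

record = parse_mitab_line(example)
print(record["id_a"], record["id_b"], parse_cv_term(record["detection_method"]))
```

Because the detection method is a CV term rather than free text, records from different IMEx databases can be filtered or compared on the same term id.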
Thank you! Facebook: EMBLEBI YouTube: EMBLMedia