The Electron Microscopy Data Bank and OME Rich data, quality assessment, and cloud computing Christoph Best European Bioinformatics Institute, Cambridge,

Slides:



Advertisements
Similar presentations
CCPN project modeling framework University of Cambridge European Bioinformatics Institute MSD group.
Advertisements

Determination of Protein Structure. Methods for Determining Structures X-ray crystallography – uses an X-ray diffraction pattern and electron density.
IBM Watson Research © 2004 IBM Corporation BioHaystack: Gateway to the Biological Semantic Web Dennis Quan
EBI is an Outstation of the European Molecular Biology Laboratory. PRIDE associated tools: Practical exercise 1 PRIDE team, Proteomics Services Group PANDA.
A Geometric Database for Gene Expression Data Baylor College of Medicine Gregor Eichele Christina Thaller Wah Chiu James Carson Rice University Tao Ju.
Deposition BIOCHEMICAL SAMPLE SPECIMEN PREPARATIONIMAGING IMAGE PROCESSING RECONSTRUCTION MAP-FITTING ELECTRONIC NOTEBOOK.
The Imperial College Tissue Bank A searchable catalogue for tissues, research projects and data outcomes Prof Gerry Thomas - Dept. Surgery & Cancer The.
1 Enriching UK PubMed Central SPIDER launch meeting, Wolfson College, Oxford Paul Davey, UK PubMed Central Engagement Manager.
Bioinformatics and Phylogenetic Analysis
Methods: Cryo-Electron Microscopy Biochemistry 4000 Dr. Ute Kothe.
The Protein Data Bank (PDB)
Digital Archives for Molecular Microscopy A community database for biological research Christoph Best European Bioinformatics Institute, Cambridge, UK.
EMDB Richard Newman Monica Chagoyen Mohamed Tagari EMBL-EBI Cryo-Electron Microscopy Structure Deposition Workshop RCSB Protein Data Bank in Rutgers University,
ExPASy - Expert Protein Analysis System The bioinformatics resource portal and other resources An Overview.
Overview of Search Engines
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Macromolecular Electron Microscopy Michael Stowell MCDB B231
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Erice 2008 Introduction to PDB Workshop From Molecules to Medicine: Integrating Crystallography in Drug Discovery Erice, 29 May - 8 June Peter Rose
CCP-EM community meeting 7 February 2013 EMDB and beyond Ardan Patwardhan and Gerard Kleywegt Protein Data Bank in Europe EMBL-EBI.
GTL Facilities Computing Infrastructure for 21 st Century Systems Biology Ed Uberbacher ORNL & Mike Colvin LLNL.
EMBL-EBI Adel Golovin MSDsite The project is funded by the European Commission as the TEMBLOR, contract-no. QLRI-CT under the RTD programme.
EBI is an Outstation of the European Molecular Biology Laboratory. Protein Databank in Europe (PDBe)‏ An Introduction.
SMART Teams: Students Modeling A Research Topic Jmol Training 101!
BALBES (Current working name) A. Vagin, F. Long, J. Foadi, A. Lebedev G. Murshudov Chemistry Department, University of York.
Knowledge Representation and Indexing Using the Unified Medical Language System Kenneth Baclawski* Joseph “Jay” Cigna* Mieczyslaw M. Kokar* Peter Major.
EBI is an Outstation of the European Molecular Biology Laboratory. A web service for the analysis of macromolecular interactions and complexes PDBe Protein.
EBI is an Outstation of the European Molecular Biology Laboratory. Protein Database in Europe Gaurav Sahni, Ph.D. Deposition, Validation, Search and Analysis.
1 A National Virtual Specimen Database for Early Cancer Detection June 26, 2003 Daniel Crichton NASA Jet Propulsion Laboratory Sean Kelly NASA Jet Propulsion.
PROTEIN STRUCTURE CLASSIFICATION SUMI SINGH (sxs5729)
Worldwide Protein Data Bank Worldwide Protein Data Bank History of the PDB  1970s  Community discussions about how to establish.
EBI is an Outstation of the European Molecular Biology Laboratory. Annotation Procedures for Structural Data Deposited in the PDBe at EBI.
Instruct Image Processing Centre
EBI is an Outstation of the European Molecular Biology Laboratory. A web service for the analysis of macromolecular interactions and complexes PDBe Protein.
Baylor College of Medicine Wah Chiu*, PI Grigore Pintilie* Matthew Baker* Matthew Dougherty Steven Ludtke Rutgers University Helen Berman, co-PI Catherine.
Workshop on Structural and Computational Proteomics of Biological Complexes.
Structural Study of the  12 Virus By:Elizabeth Brown.
Biological Signal Detection for Protein Function Prediction Investigators: Yang Dai Prime Grant Support: NSF Problem Statement and Motivation Technical.
Data Integration and Management A PDB Perspective.
GEON2 and OpenEarth Framework (OEF) Bradley Wallet School of Geology and Geophysics, University of Oklahoma
Protein Data Bank: An Introduction Learning to Use the RCSB PDB Portal.
Workshop Structural Proteomics of Biological Complexes.
Reaching the Information Limit in Cryo- EM of Biological Macromolecules: Experimental Aspects -Robert M. Glaeser and Richard J. Hall (2011)
EBI is an Outstation of the European Molecular Biology Laboratory. Quaternary Structure.
Structural Models Lecture 11. Structural Models: Introduction Structural models display relationships among entities and have a variety of uses, such.
Data Harvesting: automatic extraction of information necessary for the deposition of structures from protein crystallography Martyn Winn CCP4, Daresbury.
A Geometric Database of Gene Expression Data for the Mouse Brain Tao Ju, Joe Warren Rice University.
EMBL-EBI MSD Search and Visualization tools Jawahar Swaminathan.
3D-EM DAS Extending DAS to 3D-EM and Fitting /02/26.
EBI is an Outstation of the European Molecular Biology Laboratory. Protein Database in Europe Deposition, Validation, Search and Analysis Services.
Macromolecular Structure Database Project EMSD Infra-structure Services for Europe To develop an autonomous structural database capability in Europe
EBI is an Outstation of the European Molecular Biology Laboratory. Protein Database in Europe Gaurav Sahni, Ph.D. Deposition, Validation, Search and Analysis.
EM Maps and Models in EMDB/PDB. Growth of EM entries
AHM04: Sep 2004 Nottingham CCLRC e-Science Centre eMinerals: Environment from the Molecular Level Managing simulation data Lisa Blanshard e- Science Data.
Steven Perry Dave Vieglais. W a s a b i Web Applications for the Semantic Architecture of Biodiversity Informatics Overview WASABI is a framework for.
INFSO-RI Enabling Grids for E-sciencE EGEE-2 NA4 Biomed Bioinformatics in CNRS Christophe Blanchet Institute of Biology and Chemistry.
An Introduction to NCBI & BLAST National Center for Biotechnology Information Richard Johnston Pasadena City College.
EBI is an Outstation of the European Molecular Biology Laboratory. PDBe Search Services (PDBelite, PDBePro and BIObar) Sanchayita Sen, Ph.D. PDB Depositions.
June 3-6, 2003E-Society Lisbon Automatic Metadata Discovery from Non-cooperative Digital Libraries R. Shi, K. Maly, M. Zubair Department of Computer Science.
Hierarchy of Biological Complexity Interactions of machines (molecular and cellular dynamics) Macromolecular machines Proteins and nucleic acids Sequences.
CYBER-GIS FOR SCIENTIFIC DISCOVERIES. Global Forest Change Hansen, M. C. et al (2013). High-Resolution Global Maps of 21st-Century Forest Cover Change.
PDBe Protein Interfaces, Surfaces and Assemblies
What is cryo EM? EM = (Transmission) Electron Microscopy
D7.4 A REFMAC server for EM and NMR
Towards Cellular Systems in 4D
Hans Elmlund, Dominika Elmlund, Samy Bengio  Structure 
Jan Harapin, Matthias Eibauer, Ohad Medalia  Structure 
Volume 20, Issue 12, Pages (December 2012)
Zachary Frazier, Min Xu, Frank Alber  Structure 
Volume 26, Issue 6, Pages e2 (June 2018)
Presentation transcript:

The Electron Microscopy Data Bank and OME Rich data, quality assessment, and cloud computing Christoph Best European Bioinformatics Institute, Cambridge, UK

Molecules in the Mist2 Transmossion Electron Microscope ADVANTAGES Short wavelength High cross section DISADVATAGES Destroys specimen Low contrast, high noise No genetically engineerable markers Thin specimens only (<500 nm)

Molecules in the Mist3 Specimen preparation Cryo-Electron Microscopy Fixation using amorphous ice Thin ice sheet (500 nm) in holey carbon on copper grid

Molecules in the Mist4

Molecules in the Mist5 Single-particle method Tripeptidyl-peptidase II (TPP II) courtesy of B. Rockel, Martinsried

Molecules in the Mist6 Single-particle analysis: GroEL to 4A Ludtke et al, Structure 2008

Molecules in the Mist7 Single particle stack Typically several 1000 to individual images Problems: Automatic selection of particles Alignment Classification Angular assignment 3D reconstruction

Molecules in the Mist8 Classification and iterative refinement Classify images Assign angles 3D reconstruction

Molecules in the Mist9 High-resolution structures High resolution single particle averaging reached 4A how much longer till atomic resolution? Ludtke et al, Structure 2008

Molecules in the Mist10 Electron tomography 3D reconstruction by taking a series of images from different angles Difficulty: Nanometer accuracy Problems: Limited tilt range ↔ missing wedge ⇒ distortion Imperfections of the tilt ↔ alignment ⇒ limited resolution Computational reconstruction algorithms

Molecules in the Mist11 Tomography of eukaryotic cells PROJECTION SLICE O. Medalia et al, Science, 2002 Dictyostelium discoideum

Molecules in the Mist12 The cytoskeleton of S. melliferum yellow: geodetic line J. Kürner et al., Science, 2005

Molecules in the Mist13 Cytoskeleton Cytoskeleton of Spiroplasma melliferum J. Kürner et al., Science, 2005

Molecules in the Mist14 Viruses projection slice 3D segmentation H. simplex VacciniaHIV Grünewald & Czyrklaff, Curr. Opin. Microbiol., 2006

Molecules in the Mist15 3D averaging: The Nuclear Pore Complex Nuclear pore complex Beck et al., Science, 2004 Matching Aligning Averaging 3D reconstruction

Molecules in the Mist16 Macromolecular crowding

Molecules in the Mist17 “Visual proteomics”

The Electron Microscopy Data Bank contains EM-derived density maps complementary to coordinate sets in PDB established EBI (Kim Henrick) web-based submission and retrieval hand-curated (R. Newman) a bit like Ebay – and you won't make any monery, either

EMBL - EBI Outstation of European Molecular Biology Laboratory European inter-governmental organization with 20 members provides bioinformatics research and services EMDB is part of the Protein Databank in Europe wwPDB worldwide repository for protein structures (“PDB” coordinate files) since 1973 EMDB operated jointly with Rutgers University under an NIH grant Joint BBSRC grant between EBI and Dundee to develop EMDB using OME technology in

EMDep deposition system 600 entries, current rate approx /month Contents of an entry: Metadata (XML header) → experimental metadata/procedure Map (any format, converted to CCP4/MRC) Additional files

EMDB search system

EMDB Atlas pages

Deposition statistics about 300 structures now, growing fast deposition mandatory in major journals mostly single-particle structure tomograms welcome, but little infrastructure

Making EMDB useful (for tomography) Improve visualization and presentation e.g. 3D applet viewer (Houston) or tomogram slice viewer Improve search and retrieval (Googlify?) Community dictionaries & ontologies for specimen purification/preparation, imaging, and image processing → especially for tomography! Data harvesting, LIMS (e.g. EMEN) Deposition and distribution strategies for large maps (Data grids) → tomograms Encouraging best practices (conventions)

Enhancing the user experience Current EMDB interface: simple and efficient but must be extended to accommodate more complex experiments OMERO interface: geared at labs, not public databases All the beauty of AJAX high-performance visualization

Online visualization Improved visualization tools: Java applet embedded in page M. Baker, S. Ludtke, J. Warren*, L. Sridhar*, P. Feng*, W. Chiu (Houston) OpenAstexViewer (now O/S) OMERO visualization server data dictionary change: contour level becomes a mandatory data item

Rich data sets Submissions consist of maps (increasingly more than one) relations between data sets → unexpressed XML-based standards for represen-ting relationships between data: Subject-predicate-object relationships (RDF framework) Harvesting interface to EM processing software Web-based visualization for sub-mission and retrieval, complex sub-missions assembled interactively (AJAX)

Rich data submissions

Possible XML representation

Quality assessment Quality assessment requires Should we store original data? And how much? Disk/network limitations can be overcome Allows quality assessment, future algorithms? Data mining?

QA for single particle Example: Characterizing classes Shaikh et al., 2008

Grid computing for EMDB: OMERO.grid? Data upload/distribution Background data upload tool Remote filesystem/image server access Distributed computing resources Access to cloud computing resources Dynamically instatiated virtual machines Quality assessment processing

Conclusions Electron Microscopy is an example of a mature bioimaging field Community with similar experimental/computationa protocols Many images/data sets combined to yield single result Where OME can improve it Improved user experience In-house database (EMDB is depository only) Future directions with OME Rich data sets Quality assessment tools, data set browsing Grid/cloud computing