MolIDE2: Homology Modeling of Protein Oligomers and Complexes
Qiang Wang, Qifang Xu, Guoli Wang, and Roland L. Dunbrack, Jr.
Fox Chase Cancer Center, Philadelphia, PA 19111

Dunbrack Lab, FCCC - NIGMS Workshop

Agenda
  Background
  MolIDE in retrospect
  MolIDE2
  Demonstration
  Discussion

Background
Homology Modeling
What is it and why do we need it?
Given a protein without known structure, predict its 3D structure based on its sequence:
  Search structure databases for homologous sequences
  Transfer coordinates of the known protein onto the unknown one

MQEQLTDFSKVETNLISW-QGSLETVEQMEPWAGSDANSQTEAY
MHQQVSDYAKVEHQWLYRVAGTIETLDNMSPANHSDAQTQAA
| |..|. |||... |..||.|.| | |||..| |
| = identity, . = positively related
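
The match line under a pairwise alignment can be derived column by column. The following is a minimal Python sketch, not part of MolIDE: the function names are illustrative, and the `SIMILAR` set is a hypothetical stand-in for a real substitution matrix (only pairs listed there are marked as positively related). The example uses the first 18 gap-free residues of each sequence shown above.

```python
# Hypothetical set of "positively related" residue pairs; a real tool
# would consult a substitution matrix instead of this hand-picked list.
SIMILAR = {frozenset("LV"), frozenset("ST"), frozenset("FY")}


def match_line(a, b, similar=SIMILAR):
    """Return '|' for identity, '.' for a similar pair, ' ' otherwise."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            marks.append(" ")          # gap column
        elif x == y:
            marks.append("|")          # identical residues
        elif frozenset((x, y)) in similar:
            marks.append(".")          # positively related residues
        else:
            marks.append(" ")
    return "".join(marks)


def percent_identity(a, b):
    """Percent of gap-free columns that hold identical residues."""
    cols = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    same = sum(x == y for x, y in cols)
    return 100.0 * same / len(cols)


query = "MQEQLTDFSKVETNLISW"
template = "MHQQVSDYAKVEHQWLYR"
print(query)
print(template)
print(match_line(query, template))
print(f"{percent_identity(query, template):.1f}% identity")
```

Six of the 18 columns are identical, so the sketch reports about 33% identity for this fragment.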

Background
An inconvenient truth: huge gap between known structures and known sequences.
Experimentally determined structures (through X-ray crystallography and NMR spectroscopy)
  As of 2/24/2009, PDB has 56,066 entries (< 52K protein structures)
Decoded protein sequences
  As of 2/10/2009, UniProtKB/Swiss-Prot (Release 56.8) contains 410,518 sequence entries
  As of 2/10/2009, UniProtKB/TrEMBL (Release 39.8) contains 7,157,600 sequence entries
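
To put a rough number on the gap, the counts quoted above can be combined as follows. This is only a back-of-the-envelope sketch: it treats every PDB entry as a distinct structure, which overstates structural coverage because many entries describe the same protein.

```python
# Counts as quoted on the slide (February 2009).
pdb_entries = 56_066
swissprot_entries = 410_518
trembl_entries = 7_157_600

# Swiss-Prot and TrEMBL are separate sections of UniProtKB, so they add up.
total_sequences = swissprot_entries + trembl_entries
ratio = total_sequences / pdb_entries

print(f"{total_sequences:,} sequences, about {ratio:.0f} per PDB entry")
# prints "7,568,118 sequences, about 135 per PDB entry"
```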

Background
Homology modeling
Various methods
  Swiss-model – fully automated modeling server
  Modeller – satisfaction of spatial restraints
  Nest – based on artificial evolution
Similar steps
  Template identification
  Sequence alignment
  Backbone generation
  Side-chain prediction & loop modeling
  Structure refinement

In Retrospect
MolIDE: A graphical IDE for homology modeling
A. Canutescu and R.L. Dunbrack, Bioinformatics, 2005

In Retrospect
MolIDE: open-source, cross-platform
  sequence search
  multiple-round psiblast alignments
  secondary structure prediction
  assisted alignment editing (joint with a template viewer)
  side chain conformation prediction
  loop building

In Retrospect
MolIDE1.6 (released on July 1, 2008)
  Easy installer for Windows version
  One-step updating of PDB and non-redundant protein sequence databases
  PSI-BLAST search of non-redundant database can be opened as a sortable table for browsing homologues of query
  List of templates from PDB includes protein names and species
  Works with current remediated XML files from the PDB
  NCBI's non-redundant protein database (nr) can be replaced with UniProt's UniRef100 database
MolIDE automatically downloads these large sequence databases (nr or uniref100) and formats them for use with BLAST.
Wang et al. Nature Protocols, Dec., 2008

In Retrospect
Some examples:
  Structure with ligands
  Multi-chain protein complex
  Modeling of biological unit
  Modeling with multiple templates
  Structural or functional restraints in modeling
  …
The previous MolIDE CANNOT deal with protein oligomers and protein complexes

MolIDE2
Goal:
  Develop a new homology modeling program capable of modeling protein oligomers and protein complexes.
Challenge:
  Identify an appropriate template;
  One-to-many sequence alignment;
  Better understanding of Biological Unit (BU).

MolIDE2
Key features:
  Able to model protein oligomers and complexes;
  Modeling process based on Biological Unit (BU);
  Identifying structural template based on domain and family information;
  Integrated database providing protein structural and sequence information;
  Better organization and representation of template information;
  User-friendly graphical interface for selecting template;
  Integration of profile-profile sequence alignment method;
  Improved graphical editing of sequence alignment.

MolIDE2
Screenshot

MolIDE2
Operation flowchart

Demonstration
A typical modeling process:
  1. Open sequence file
  2. Run hmmpfam (generate domain file)
  3. Run psiblast (generate query profile)
  4. Run psipred (predict secondary structure of query sequence)
  5. Open domain file
  6. Search PDB for potential template; get profile-profile alignment result
  7. Open alignment file
  8. Manually modify sequence alignment (optional)
  9. Copy backbone structure
  10. Run SCWRL for sidechain replacement
  11. Build loops (not implemented yet)
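
Step 9 depends on the alignment produced in steps 6 to 8: only target residues aligned to a template residue can inherit backbone coordinates, and the rest are left for loop building (step 11). The sketch below illustrates that bookkeeping in Python. It is an assumption-laden illustration, not MolIDE2's implementation: the function name `map_backbone` and the 0-based residue indexing are hypothetical.

```python
def map_backbone(target_aln, template_aln):
    """Walk two gapped, aligned sequences column by column.

    Returns (copied, unmodeled):
      copied    maps target residue index -> template residue index
                whose backbone coordinates could be copied;
      unmodeled lists target residue indices aligned to a template gap,
                which would need loop building.
    Indices are 0-based positions in the ungapped sequences
    (a hypothetical convention chosen for this sketch).
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    copied = {}
    unmodeled = []
    t_idx = 0  # position in ungapped target
    p_idx = 0  # position in ungapped template
    for t, p in zip(target_aln, template_aln):
        if t != "-" and p != "-":
            copied[t_idx] = p_idx
        elif t != "-":
            unmodeled.append(t_idx)
        if t != "-":
            t_idx += 1
        if p != "-":
            p_idx += 1
    return copied, unmodeled


# Toy alignment: the target has an insertion (K), the template has one (A).
copied, unmodeled = map_backbone("MK-LV", "M-ALV")
print(copied)     # {0: 0, 2: 2, 3: 3}
print(unmodeled)  # [1]
```

Editing the alignment by hand (step 8) changes this mapping directly, which is why the alignment is fixed before the backbone is copied.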

Demonstration
Sequences and domains

Demonstration
Finding template (1)

Demonstration
Finding template (2)

Demonstration
Editing sequence alignment & generating model

Demonstration
Editing sequence alignment & generating model (cont’d)

Discussion
Some future work
  Loop modeling component
  dealing with space symmetry
  Involvement of protein-protein interaction information (ProtBuD)
  Modeling with multiple templates
  modeling of ligands
  refinement of models with Rosetta and RosettaDock
  …

Acknowledgments
  Dr. Adrian Canutescu, Dr. Mark Andrake
  NIH R01 GM84453
  NIH R01 GM73784

Q & A