Applications of Homology Modeling
Hanka Venselaar
This seminar: Homology modeling. Why? What? When? How? And a few real-world examples.
Hearing loss No structure: MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI EERAFSYVLTHALPGDPGHILTTLDHWSSRCEYLSHMG PVKGQILMRLVEEKAPACVLELGTYCGYSTLLIARALPP GGRLLTVERDPRTAAVAEKLIRLAGFDEHMVELIVGSSE DVIPCLRTQYQLSRADLVLLAHRPRCYLRDLQLLEAHAL LPAGATVLADHVLFPGAPRFLQYAKSCGRYRCRLHHTG LPDFPAIKDGIAQLTYAGPG DFNB 63 Sequence:
KKIALSDARSMKHALREIKIIRRL DHDNIVKVYEVLGPKGTDLQGELF KFSVAYIVQEYMETDLARLLEQGT LAEEHAKLFMYQLLRGLKYIHSAN VLHRDLPANIFISTEDLVLKIGDF GLARIVDQHYSHKGYLSEGLVTKW YRSPRLLLSPNNYTKAIDMWAAGC ILAEMLTGRMLFAGAHELEQMQLL ETIPVIREEDKDELLRVMPSFVSS ?
Why homology modeling? Lab – Translation – Bioinformatics
[Figure: excerpt of a PDB file with ATOM records for residues GLN, SER, CYS and LEU of chain A (atoms N, CA, C, O, CB, ...)]
Protein structures – 4 levels: Primary, Secondary, Tertiary, Quaternary. The shape of the protein determines its function.
Protein structures… where can we find them? In the Protein Data Bank (PDB).
PDB file: contains the coordinates of every atom in a protein.
Visualisation with PDB viewers:
- Jmol
- PyMOL
- Swiss-PdbViewer
- YASARA
So, 3D protein structures provide useful information. But… there are not enough protein structures in the PDB.
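To make "coordinates for every atom" concrete, here is a minimal sketch of reading the fixed-width ATOM records of a PDB file in plain Python. The `Atom` tuple and the helpers `format_atom` and `parse_atom` are hypothetical names introduced for this illustration, and the coordinates are made up. Only the most common columns are handled, so real work is better left to an established PDB parser.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int     # atom serial number
    name: str       # atom name, e.g. "CA"
    res_name: str   # residue name, e.g. "GLN"
    chain: str      # chain identifier
    res_seq: int    # residue sequence number
    x: float
    y: float
    z: float


def format_atom(a: Atom) -> str:
    """Write one fixed-width ATOM record (occupancy/B-factor are dummy values)."""
    # Atom names shorter than 4 characters conventionally start in column 14.
    name_field = a.name if len(a.name) == 4 else f" {a.name:<3}"
    return (f"ATOM  {a.serial:>5} {name_field} {a.res_name:>3} {a.chain}"
            f"{a.res_seq:>4}    {a.x:8.3f}{a.y:8.3f}{a.z:8.3f}{1.0:6.2f}{0.0:6.2f}")


def parse_atom(line: str) -> Atom:
    """Read one ATOM record by slicing its fixed columns."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


# Made-up coordinates for the first atoms of a GLN residue in chain A.
example_atoms = [
    Atom(1, "N", "GLN", "A", 1, 11.104, 13.207, 2.100),
    Atom(2, "CA", "GLN", "A", 1, 12.560, 13.329, 2.038),
]
records = [format_atom(a) for a in example_atoms]
parsed = [parse_atom(r) for r in records]
```

Counting the parsed atoms named "CA" gives the number of residues, which is a quick sanity check on any structure file.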
Predictions/Annotations
Homology modeling in short… Prediction of structure based upon a highly similar structure.
2 basic assumptions:
- Structure defines function
- During evolution structures are more conserved than sequence
Use one structure to predict another.
Homology modeling – When? Example: for an alignment of 80 residues, 30% sequence identity is sufficient.
Homology modeling in short… Prediction of structure based upon a highly similar structure.
Alignment of the model sequence (unknown structure) with the known structure:
NSDSECPLSHDG
||   || | ||
NSYPGCPSSYDG
- Copy backbone and conserved residues
- Add sidechains, Molecular Dynamics simulation on model
- Model!
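The identity bars in the alignment above can be counted in a few lines of Python. This is a minimal sketch, and the function name `percent_identity` is an assumption of this example. Identity is defined here as matches divided by the number of aligned (non-gap) positions, which is only one of several conventions in use.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned positions; '-' marks a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not aligned:
        raise ValueError("no aligned positions")
    matches = sum(a == b for a, b in aligned)
    return 100.0 * matches / len(aligned)


identity = percent_identity("NSDSECPLSHDG", "NSYPGCPSSYDG")
print(f"{identity:.1f}% identity")  # 7 identical residues out of 12
```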
The 8 steps of Homology modeling
1: Template recognition and initial alignment
BLAST your sequence against the PDB. This gives the initial alignment. The best hit is usually your template.
2: Alignment correction
- Functional residues conserved
- Use multiple sequence alignments
- Deletions shift gaps

Both are possible:
Sequence with known structure  CPISRTGASIFRCW
Your sequence                  CPISRTA---FRCW
Your sequence (alternative)    CPISRT---AFRCW

Multiple sequence alignment:
CPISRTGASIFRCW
CPISRTA---FRCW
CPISRTAAS-FRCW
CPISRTG-SMFRCW
CPISRTA--TFRCW
CPISRTAASHFRCW

Correct alignment:
Sequence with known structure  CPISRTGASIFRCW
Your sequence                  CPISRTA---FRCW
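One simple way to let a multiple sequence alignment decide between two gap placements is to count, per candidate, how many of its residues also occur in the same column of the alignment. The sketch below does exactly that with the sequences from this slide. The scoring rule and the name `column_support` are assumptions of this example, not a published method, and real alignment tools use substitution matrices and gap penalties instead.

```python
def column_support(candidate: str, msa: list[str]) -> int:
    """Count candidate residues that are also seen in the same MSA column."""
    score = 0
    for i, residue in enumerate(candidate):
        if residue == "-":
            continue  # gaps neither add nor remove support
        if any(row[i] == residue for row in msa):
            score += 1
    return score


# Template plus homologous sequences, already aligned (from the slide).
msa = [
    "CPISRTGASIFRCW",
    "CPISRTAAS-FRCW",
    "CPISRTG-SMFRCW",
    "CPISRTA--TFRCW",
    "CPISRTAASHFRCW",
]
candidates = ["CPISRTA---FRCW", "CPISRT---AFRCW"]

scores = {c: column_support(c, msa) for c in candidates}
best = max(candidates, key=lambda c: scores[c])
print(scores, "->", best)
```

The alanine is seen in column 7 of other family members but never in column 10, so the first placement gets the higher score.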
2: Alignment correction
- Core residues conserved
- Use multiple sequence alignments
- Deletions in your sequence shift gaps

Known structure  FDICRLPGSAEAV
Model            FNVCRMP---EAI
Model            FNVCR---MPEAI
[Figure: both gap placements drawn on the backbone of the known structure, with the correct alignment marked]
3: Backbone generation
Making the model…
- Copy backbone of template to model
- Make deletions as discussed
- (Keep conserved residues)
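The bookkeeping behind these three bullets can be sketched as a per-column plan derived from the alignment. This is only an illustration: the function `backbone_plan` and its action labels are hypothetical, and no coordinates are touched. It uses the template and model sequences from the alignment-correction example.

```python
def backbone_plan(template_aln: str, model_aln: str) -> list[tuple]:
    """Return (template_position, model_residue, action) per alignment column.

    Template positions are 1-based; '-' marks a gap.
    """
    if len(template_aln) != len(model_aln):
        raise ValueError("aligned sequences must have equal length")
    plan = []
    t_pos = 0
    for t, m in zip(template_aln, model_aln):
        if t != "-":
            t_pos += 1
        if t == "-" and m == "-":
            continue  # empty column, nothing to do
        if t == "-":
            plan.append((None, m, "insertion: build loop"))
        elif m == "-":
            plan.append((t_pos, None, "deletion: skip template residue"))
        elif t == m:
            plan.append((t_pos, m, "copy backbone and side chain"))
        else:
            plan.append((t_pos, m, "copy backbone only"))
    return plan


plan = backbone_plan("CPISRTGASIFRCW", "CPISRTA---FRCW")
for step in plan:
    print(step)
```

For this pair the plan keeps ten conserved residues whole, copies only the backbone at position 7 (G in the template, A in the model), and skips template positions 8 to 10.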
4: Loop modeling
Known structure  GVCMYIEA---LDKYACNC
Your sequence    GECFMVKDLSNPSRYLCKC
Loop library: try different options.
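Before a loop library can be searched, the inserted residues have to be located in the alignment. The following minimal sketch finds every stretch where the known structure has a gap and reports the model residues that need a new loop. The function name `find_insertions` is an assumption of this example, and the actual loop search is not shown.

```python
def find_insertions(template_aln: str, model_aln: str) -> list[tuple[int, int, str]]:
    """Return (first, last, residues) for each run of model residues that
    face a gap in the template. Positions are 1-based in the model sequence."""
    insertions = []
    model_pos = 0
    start = None
    residues = []
    for t, m in zip(template_aln, model_aln):
        if m != "-":
            model_pos += 1
        if t == "-" and m != "-":
            if start is None:
                start = model_pos
            residues.append(m)
        elif start is not None:
            # The run of inserted residues has ended; record it.
            insertions.append((start, start + len(residues) - 1, "".join(residues)))
            start, residues = None, []
    if start is not None:  # insertion at the very end of the alignment
        insertions.append((start, start + len(residues) - 1, "".join(residues)))
    return insertions


loops = find_insertions("GVCMYIEA---LDKYACNC", "GECFMVKDLSNPSRYLCKC")
print(loops)
```

Here the three residues LSN at model positions 9 to 11 have no template backbone, so they are the segment for which loop candidates would be tried.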
5: Side-chain modeling
Several options, for example libraries of preferred rotamers based upon backbone conformation.
6: Model optimization
Molecular dynamics simulation:
- Removes big errors
- Structure moves towards the lowest-energy conformation
7: Model validation
Second opinion by PDBreport / WHAT IF.
- Errors in the active site? New alignment / template.
- No errors? Model!
8: Iteration
Model!
8 steps of homology modeling
1: Template recognition and initial alignment
2: Alignment correction
3: Backbone generation
4: Loop modeling
5: Side-chain modeling
6: Model optimization
7: Model validation
8: Iteration
Three phases: Alignment, Modeling, Correction.
Hearing loss Structure! MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI EERAFSYVLTHALPGDPGHILTTLDHWSSRCEYLSHMG PVKGQILMRLVEEKAPACVLELGTYCGYSTLLIARALPP GGRLLTVERDPRTAAVAEKLIRLAGFDEHMVELIVGSSE DVIPCLRTQYQLSRADLVLLAHRPRCYLRDLQLLEAHAL LPAGATVLADHVLFPGAPRFLQYAKSCGRYRCRLHHTG LPDFPAIKDGIAQLTYAGPG DFNB 63 Sequence:
Mutation: Tryptophan 105 -> Arginine. Hydrophobic contacts from the tryptophan are lost, and a hydrophilic, charged residue is introduced.
The three mutated residues are all important for the correct positioning of Tyrosine 111. Tyrosine 111 is important for substrate binding. Published in Nature Genetics, 2008 Oct 26.
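The mutation can be checked against the DFNB63 sequence shown earlier with a short Python sketch. Only the first 116 residues of that sequence are used here. The helper names `apply_mutation` and `describe`, and the very coarse residue grouping, are assumptions of this example. A real mutant analysis looks at the 3D model, not just at residue classes.

```python
import re

# First 116 residues of the DFNB63 sequence from the "Hearing loss" slide.
DFNB63_FRAGMENT = (
    "MGTPWRKRKGIAGPGLPDLSCALVLQPRAQVGTMSPAI"
    "ALAFLPLVVTLLVRYRHYFRLLVRTVLLRSLRDCLSGLRI"
    "EERAFSYVLTHALPGDPGHILTTLDHWSSRCEYLSHMG"
)

# Coarse residue classes, chosen for this illustration only.
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def apply_mutation(seq: str, mutation: str) -> str:
    """Apply a point mutation written as e.g. 'W105R' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError("position outside the sequence")
    if seq[pos - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + new + seq[pos:]


def describe(residue: str) -> str:
    """Return a coarse physico-chemical class for one residue."""
    if residue in HYDROPHOBIC:
        return "hydrophobic"
    if residue in POSITIVE:
        return "positively charged"
    if residue in NEGATIVE:
        return "negatively charged"
    return "other"


mutant = apply_mutation(DFNB63_FRAGMENT, "W105R")
print("105:", describe(DFNB63_FRAGMENT[104]), "->", describe(mutant[104]))
print("111:", DFNB63_FRAGMENT[110])  # the tyrosine discussed on the next slide
```

The check inside `apply_mutation` confirms that residue 105 of the listed sequence is indeed a tryptophan before it is replaced.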
Example: C-terminal deletion of 10 aa in Dectin. Department: Internal Medicine
>Dectin_1_Isoform_a
MEYHPDLENLDEDGYTQLHFDSQSNTRIAVVS EKGSCAASPPWRLIAVILGILCLVILVIAVVL GTMAIWRSNSGSNTLENGYFLSRNKENHSQPT QSSLEDSVTPTKAVKTTGVLSSPCPPNWIIYE KSCYLFSMSLNSWDGSKRQCWQLGSNLLKIDS SNELGFIVKQVSSQPDNSFWIGLSRPQTEVPW LWEDGSTFSSNLFQIRTTATQENPSPNCVWIH VSVIYDQLCSVPSYSICEKKFSM
MSQSTQTNEFLSPEVFQHIWDFLEQPICSVQPIDLNFVDEPSEDGATNKI EISMDCIRMQDSDLSDMWPQYTNLGLLNSMDQQIQNGSSSTSPYNTDHAQ NSVTAPSPYAQPSSTFDALSPSPAIPSNTDYPGPHSFDVSFQQSSTAKSA TWTYSTELKKLYCQIAKTCPIQIKVMTPPPQGAVIRAMPVYKKAEHVTEV VKRCPNHELSREFNEGQIAPPSHLIRVEGNSHAQYVEDPITGRQSVLVPY EPPQVGTEFTTVLYNFMCNSSCVGGMNRRPILIIVTLETRDGQVLGRRCF EARICACPGRDRKADEDSIRKQQVSDSTKNGDGTKRPFRQNTHGIQMTSI KKRRSPDDELLYLPVRGRETYEMLLKIKESLELMQYLPQHTIETYRQQQQ QQHQHLLQKQTSIQSPSSYGNSSPPLNKMNSMNKLPSVSQLINPQQRNAL TPTTIPDGMGANIPMMGTHMPMAGDMNGLSPTQALPPPLSMPSTSHCTPP PPYPTDCSIVSFLARLGCSSCLDYFTTQGLTTIYQIEHYSMDDLASLKIP EQFRHAIWKGILDHRQLHEFSSPSHLLRTPSSASTVSVGSSETRGERVID AVRFTLRQTISFPPRDEWNDFNFDMDARRNKQQRIKEEGE P63 sequence Structure! EEC syndrome
Mutation: Arginine -> Serine (R -> S). Loss of positive charge. Loss of interaction with the DNA.
To conclude…
Homology Modeling…
What? Prediction of an unknown structure based on a homologous and known structure.
Why? To answer biological and medical questions when the “real” structure is unknown.
When? A template with enough identity must be available.
How? 8 steps.
Use the models for mutant analysis, experimental design and understanding of the protein in general.