Corrections
SEQUENCE 4
>seq4
MSTNNYQTLSQNKADRMGPGGSRRPRNSQHATASTPSASSCKEQQKDVEH
EFDIIAYKTTFWRTFFFYALSFGTCGIFRLFLHWFPKRLIQFRGKRCSVE
NADLVLVVDNHNRYDICNVYYRNKSGTDHTVVANTDGNLAELDELRWFKY
RKLQYTWIDGEWSTPSRAYSHVTPENLASSAPTTGLKADDVALRRTYFGP
NVMPVKLSPFYELVYKEVLSPFYIFQAISVTVWYIDDYVWYAALIIVMSL
YSVIMTLRQTRSQQRRLQSMVVEHDEVQVIRENGRVLTLDSSEIVPGDVL
VIPPQGCMMYCDAVLLNGTCIVNESMLTGESIPITKSAISDDGHEKIFSI
DKHGKNIIFNGTKVLQTKYYKGQNVKALVIRTAYSTTKGQLIRAIMYPKP
ADFKFFRELMKFIGVLAIVAFFGFMYTSFILFYRGSSIGKIIIRALDLVT
IVVPPALPAVMGIGIFYAQRRLRQKSIYCISPTTINTCGAIDVVCFDKTG
TLTEDGLDFYALRVVNDAKIGDNIVQIAANDSCQNVVRAIATCHTLSKIN
NELHGDPLDVIMFEQTGYSLEEDDSESHESIESIQPILIRPPKDSSLPDC
QIVKQFTFSSGLQRQSVIVTEEDSMKAYCKGSPEMIMSLCRPETVPENFH
DIVEEYSQHGYRLIAVAEKELVVGSEVQKTPRQSIECDLTLIGLVALENR
LKPVTTEVIQKLNEANIRSVMVTGDNLLTALSVARECGIIVPNKSAYLIE
HENGVVDRRGRTVLTIREKEDHHTERQPKIVDLTKMTNKDCQFAISGSTF
SVVTHEYPDLLDQLVLVCNVFARMAPEQKQLLVEHLQDVGQTVAMCGDGA
NDCAALKAAHAGISLSEAEASIAAPFTSKVADIRCVITLISEGRAALVTS
YSAFLCMAGYSLTQFISILLLYWIATSYSQMQFLFIDIAIVTNLAFLSSK
TRAHKELASTPPPTSILSTASMVSLFGQLAIGGMAQVAVFCLITMQSWFI
PFMPTHHDNDEDRKSLQGTAIFYVSLFHYIVLYFVFAAGPPYRASIASNK
AFLISMIGVTVTCIAIVVFYVTPIQYFLGCLQMPQEFRFIILAVATVTAV
ISIIYDRCVDWISERLREKIRQRRKGA
Compute pI/Mw tool. Warning: if you choose the wrong format for the sequence… With the correct format:
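Why the input format matters can be illustrated with a minimal sketch. It is not the Compute pI/Mw implementation: the function names are hypothetical, and the mass table (standard average residue masses) is an assumption about what such a tool uses. The sketch strips a FASTA header and whitespace before summing residue masses and adding one water molecule.

```python
# Average residue masses in daltons (standard values; assuming, not knowing,
# that the online tool uses an equivalent table).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def read_fasta_sequence(text):
    """Return the bare sequence from FASTA or raw text.

    Lines starting with '>' are treated as headers and dropped;
    spaces, digits and other non-letters are removed.
    """
    lines = [line.strip() for line in text.strip().splitlines()]
    body = "".join(line for line in lines if line and not line.startswith(">"))
    return "".join(ch for ch in body.upper() if ch.isalpha())


def molecular_weight(seq):
    """Average molecular weight in daltons; raises KeyError on unknown letters."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER
```

Note that if the header and the residues share a single line (`>seq4 MSTNN…`), the whole line is read as a header and the sequence is lost, which is one way a wrong format gives a wrong result.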
ProtParam
SAPS
SAPS (1)
SAPS (2)
doi: /bioinformatics/bti797
The coiled-coil domains are annotated according to 3D structure data (experimental data)
Coiled-coil prediction Coils
Coils prediction
Coiled-coil prediction PairCoil (not always working…)
Paircoil prediction
Coiled-coil prediction PairCoil2
Paircoil2 results
Coiled-coil prediction Sliding window (Protscale)
Sliding window and amino acid scale: example
Bad results…
Sliding windows and amino acid scales
Transmembrane domain: alpha-helix of 20 amino acids (hydrophobic)
-> amino acid scales: hydrophobicity and alpha helix
-> sliding window size: 20 amino acids
Protscale Amino acid scale: Kyte and Doolittle (hydrophobicity) Sliding window size: 21 amino acids
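The sliding-window idea can be sketched in a few lines. This is an illustration under stated assumptions, not the Protscale code: the function name is hypothetical, the scale is the published Kyte and Doolittle hydropathy table, and each window is scored by a plain unweighted mean assigned to its central residue.

```python
# Kyte and Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_profile(seq, scale, window=21):
    """Return (position, mean score) pairs, one per full window.

    position is the 1-based index of the residue at the window centre,
    so the window size must be odd.
    """
    if window % 2 == 0:
        raise ValueError("window size must be odd")
    if window > len(seq):
        raise ValueError("window is longer than the sequence")
    half = window // 2
    profile = []
    for center in range(half, len(seq) - half):
        segment = seq[center - half:center + half + 1]
        mean = sum(scale[aa] for aa in segment) / window
        profile.append((center + 1, mean))
    return profile
```

Stretches of about 20 residues with a high mean hydropathy are the candidate transmembrane helices; an odd window such as 21 keeps the score centred on one residue.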
Protscale Amino acid scale: Chou&Fasman (alpha helix) Sliding window size: 21
Methods based on HMM or NN
HMMTOP
Protein: seq4
Length: 1127
N-terminus: IN
Number of transmembrane helices: 8
Transmembrane helices:
TMHMM (1)
TMHMM (2)
TMpred (1)
PSORT II (1)
- Look for the presence of a signal peptide.
No signal peptide
Signal peptides are often predicted as ‘transmembrane’ domains (or vice versa) because they contain amino acids with similar biochemical properties (hydrophobic, alpha-helical).
Transmembrane: summary
HMMTOP (8 TM)
PSORT II (10 TM)
TMpred (10 TM)
TMHMM (11 TM)
[Figure: predicted topology (in / out), with a big loop]
? missed TM
The protein is known to contain 12 TM: one TM is missing at the N-terminus. The possible ways to find the correct protein topology are to do a multiple alignment with other family members, or to do 3D structure experiments (which are difficult with proteins containing transmembrane domains). Kristian Axelsen: personal communication. SEQ4 = Q9N323
The aquaglyceroporin contains ‘half’ transmembrane regions that cannot be predicted by programs because they are too short (fewer than 20 amino acids). There is no way to predict such regions except through 3D structure experiments, which are the only way to confirm and ‘predict’ transmembrane domains correctly. Similarity analysis can then help to predict such regions in other proteins of the same family. P0AER0
M3 and M7 are ‘half’ transmembrane regions: not predictable
Look for the transmembrane regions of P31243 (try the different transmembrane prediction programs): what are your conclusions?
No transmembrane domains are found by any program because this protein, a porin, is anchored in the membrane by a specific 3D structure called a beta barrel, which does not contain any alpha helix.
‘beta barrel’ Mainly composed of beta-sheets in a 16-stranded beta-barrel formation and forms a pore in the membrane nm in diameter. Note that the orientation of the strands is such that side chains alternately point into the interior and exterior of the pore; the former are strongly polar residues while the latter are very hydrophobic.
Beta barrel Porin from Rhodobacter
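Why helix-oriented hydropathy windows miss a beta barrel can be sketched as follows. This is only an illustration: the function names and the toy strand are hypothetical, and the Kyte and Doolittle values are used as an assumed hydrophobicity scale. In an alternating strand the window mean stays low, while the difference between even and odd positions is large.

```python
# Kyte and Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Plain mean hydropathy, as a helix-oriented sliding window would see it."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def alternation_score(strand):
    """Mean hydropathy of residues 1, 3, 5, ... minus that of residues 2, 4, 6, ...

    A large absolute value means side chains alternate between a
    hydrophobic face and a polar face, as described for barrel strands.
    """
    if len(strand) < 2:
        raise ValueError("need at least two residues")
    return mean_hydropathy(strand[0::2]) - mean_hydropathy(strand[1::2])


# Hypothetical toy strand: hydrophobic V alternating with polar N.
toy_strand = "VNVNVNVN"
print(mean_hydropathy(toy_strand), alternation_score(toy_strand))
```

The toy strand has a low overall mean but a strong alternation, which is consistent with the programs above finding no transmembrane helix in a porin.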
Alignment of the 2 isoforms
The gene has two in-frame initiation codons, and two different proteins are made by alternative initiation of translation.
According to this publication (PubMed: ), there is a 'Dual targeting of spinach protoporphyrinogen oxidase II to mitochondria and chloroplasts by alternative use of two in-frame initiation codons'.
Immunoblot analysis of Protox II in spinach leaf (lanes: chloro, mito, total leaf). Watanabe N et al. J. Biol. Chem. 2001;276: ©2001 by American Society for Biochemistry and Molecular Biology
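The relationship between two such isoforms can be sketched in code. The function names and the toy sequences are hypothetical and are not the Protox II sequences. The idea is that initiation codons in the same reading frame give proteins that differ only at the N-terminus, so the short isoform is the long one read from an internal methionine.

```python
def in_frame_start_codons(cds):
    """0-based nucleotide positions of ATG codons in frame with position 0."""
    cds = cds.upper()
    return [i for i in range(0, len(cds) - 2, 3) if cds[i:i + 3] == "ATG"]


def is_alternative_initiation_pair(long_isoform, short_isoform):
    """True if the short isoform equals the C-terminal part of the long one,
    starting at an internal methionine."""
    return (
        len(short_isoform) < len(long_isoform)
        and short_isoform.startswith("M")
        and long_isoform.endswith(short_isoform)
    )


# Hypothetical toy coding sequence: ATG at 0 and 6 are in frame,
# the ATG at position 11 is out of frame and is ignored.
toy_cds = "ATGGCCATGAAATGA"
print(in_frame_start_codons(toy_cds))
print(is_alternative_initiation_pair("MAAAMKKK", "MKKK"))
```

Because the N-terminal extension present only in the long isoform can carry a targeting signal, the two products may end up in different compartments, which is what the localization predictions are then used to check.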
Q94IG7 – Long isoform
- wolfPSORT: chloroplast
- TargetP: chloroplast
  CH score: MI score: ER score: Other location:
- SignalP-NN: not secreted
  score (D):
- SignalP-HMM: not secreted
  SP probability: 6.2% SA probability: 0.2%
- ChloroP: chloroplast
  prediction score:
- MITOPROT: mitochondria!
  exported to mitochondria with a probability of 0.71!
Q94IG7 – Short isoform
- wolfPSORT: mitochondrial
- TargetP: mitochondrial
  CH score: MI score: ER score: Other location:
- SignalP-NN: not secreted
  score (D):
- SignalP-HMM: not secreted
  SP probability: 3.1% SA probability: 5%
- ChloroP: not in chloroplast
  prediction score:
- MITOPROT: other location
  exported to mitochondria with a probability of 0.33!
Cysteine (61 modifications) and serine (46 modifications) are the amino acids with the highest number of known associated PTMs. Beware: RESID considers selenocysteine a PTM, which is not the case!
Phosphorylation
P03372
UniProt data: experimentally proven P03372
The phosphorylation sites are located on the ‘surface’ of the protein (homodimer), where the amino acids are accessible to the kinases!
O-glycosylation
P02724
Myristoylation
P51876
NMT
Myristoylator
Protein: secreted protein (P02751, fibronectin)
Can be predicted:
- Subcellular location (PSORT, TargetP)
- Domains (InterPro)
- Signal
- Sulfation
- N-glycosylation
- O-glycosylation
- Phosphorylation
(Not predictable…) (predictable…)
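As an example of a pattern-based prediction from the list above, here is a minimal sketch of an N-glycosylation sequon scan. The function name is hypothetical. The pattern N-{P}-[ST]-{P} is the classical consensus (PROSITE PS00001), and a match only marks a potential site: whether it is really glycosylated depends on context the pattern does not see.

```python
import re

# N, then any residue except P, then S or T, then any residue except P.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST][^P])")


def find_n_glyc_sequons(seq):
    """Return 1-based positions of the Asn in each potential N-glycosylation sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(seq.upper())]


# Hypothetical toy peptides, not fibronectin.
print(find_n_glyc_sequons("AANGTAA"))  # one candidate site
print(find_n_glyc_sequons("ANPTA"))    # proline at the X position blocks the sequon
```

A scan like this typically over-predicts, so it is best read as a list of candidates to compare with annotated sites such as those in UniProt.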
THE END