Published by Melvyn Blair. Modified over 8 years ago.
1
Statistics of small peptides
This tour guides you through a computational experiment that you can perform within BioBIKE.
To get to BioBIKE, go to: http://ixion.csbc.vcu.edu:8003/biologin
Enter a login name (letters only, no spaces). No password is necessary.
This demonstration is best viewed as a slide show, which lets you simulate a session and makes changes in cursor position more obvious. To do this, click Slide Show on the top tool bar, then View show. Click anywhere to go on to the next slide.
2
Statistics of small peptides
How many types of peptides are there of each size class?
How many peptides are there with a single amino acid? In other words, how many ways can you fill the box below with a different amino acid?
Amino acid goes here (how many different amino acids are there?)
3
Statistics of small peptides
How many types of peptides are there of each size class?
How about peptides with two amino acids? How many ways can you fill the boxes below with different amino acids?
Amino acids go here (if you don't see the answer, then simplify the problem and count by hand)
4
To verify your answer in BioBIKE… (though you should be so certain of your answer that if BioBIKE were to disagree, you'd think that BioBIKE is wrong, not you!) Strategy: Generate all possible proteins of a given length, then count them.
9
That gives you all the peptide sequences of length 1. Is the list correct? How many are there? With this list you can count by hand, but later this won't be possible. To automate the process, wrap the function in COUNT-OF.
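Outside BioBIKE, the same generate-then-count strategy can be sketched in Python. This is an illustration only, not BioBIKE code: the function name `all_peptides` is made up for this example, and the alphabet assumes the 20 standard amino acids.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def all_peptides(length):
    """Return every peptide sequence of the given length as a string."""
    return ["".join(letters) for letters in product(AMINO_ACIDS, repeat=length)]

peptides = all_peptides(1)
print(peptides)       # short enough to check by eye
print(len(peptides))  # the automated count, analogous to wrapping in COUNT-OF
```

Changing the argument from 1 to 2 lets you check your hand count for two-amino-acid peptides.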
12
That gives you the number of all the peptide sequences of length 1. Now for something more interesting. Change the length from 1 to 5 (remembering to close the entry by pressing Enter).
14
Whoops! A problem. BioBIKE is attempting to save you from doing something potentially stupid by accident. You could easily use this command to ask for more sequences than there are electrons in the universe. But read the advice carefully and note that there is a way out.
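To see why a safeguard like this is sensible, it may help to compute how quickly the number of possible sequences grows without generating any of them. The sketch below is Python, not BioBIKE; the figure of 10^80 electrons is only a rough, commonly quoted order of magnitude and is an assumption here, as are the function names.

```python
ALPHABET_SIZE = 20                # standard amino acids
ELECTRONS_IN_UNIVERSE = 10 ** 80  # rough order-of-magnitude assumption

def peptide_count(length):
    """Number of distinct sequences of the given length (no enumeration)."""
    return ALPHABET_SIZE ** length

def first_length_exceeding(limit):
    """Smallest peptide length whose sequence count is greater than limit."""
    length = 1
    while peptide_count(length) <= limit:
        length += 1
    return length

for n in (1, 5, 10, 20):
    print(n, peptide_count(n))

print(first_length_exceeding(ELECTRONS_IN_UNIVERSE))
```

Counting with a formula costs almost nothing, whereas building the list costs memory and time proportional to the count itself.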
15
You can go on from there on your own.
16
Statistics of small peptides
Identification of a protein from a peptide sequence
If you were given a peptide sequence, say "QWER" (glutamine-tryptophan-glutamate-arginine), is this enough information to identify the protein it came from?
This is sort of like a variation on the birthday problem: How likely is it that someone in the room has the same birthday as you do? It depends on how many people there are in the room and how many birthdays there are to choose from. With 365 people in the room, what would be your chances? (Ignore leap years.)
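If you want to check your reasoning numerically, here is a minimal Python sketch. It assumes the 365 people are in addition to you, that birthdays are independent, and that all 365 dates are equally likely; the function names are invented for this example.

```python
def chance_of_shared_birthday(people, birthdays=365):
    """Probability that at least one of `people` others has your birthday."""
    chance_nobody_matches = (1 - 1 / birthdays) ** people
    return 1 - chance_nobody_matches

def chance_of_unique_birthday(people, birthdays=365):
    """Probability that none of `people` others has your birthday."""
    return 1 - chance_of_shared_birthday(people, birthdays)

for people in (10, 100, 365, 1000):
    print(people, round(chance_of_shared_birthday(people), 3))
```

Raising `birthdays` while holding `people` fixed shows how a larger pool of possibilities improves the odds of being unique.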
17
Statistics of small peptides
Identification of a protein from a peptide sequence
Even without doing the calculation, you can see that only if the number of birthdays is much greater than the number of people do you stand a good chance of having a unique birthday.
So how many possible peptides (analogous to birthdays) are there? You did this already.
And how many 4-aa peptides are in the proteins of, say, ss120 (analogous to the number of people in the room)?
Simplify: How many 4-aa peptides are there in a single protein? Suppose the protein has 100 amino acids.
18
Statistics of small peptides
Identification of a protein from a peptide sequence
Imagine that protein, with 100 amino acids:
aa 1 - aa 2 - aa 3 - aa 4 - aa 5 - aa 6 - … aa 95 - aa 96 - aa 97 - aa 98 - aa 99 - aa 100
How many 4-aa sequences are there in this protein? You might want to simplify. Suppose the protein were only 4 amino acids in length. How many would there be? Suppose it were 5 amino acids in length? 10? What's the rule? If I tell you the length of the protein, can you tell me the number of 4-aa peptides?
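One way to test a proposed rule is to list the overlapping 4-aa peptides of a short sequence and count them. The Python sketch below does this with a sliding window; the function name `peptides_in` and the example sequence are made up for illustration.

```python
def peptides_in(protein, size=4):
    """Return every overlapping peptide of `size` residues, left to right."""
    return [protein[i:i + size] for i in range(len(protein) - size + 1)]

example = "MKTAYIAKQR"  # made-up 10-residue sequence
windows = peptides_in(example)
print(windows)
print(len(example), len(windows))
```

Trying sequences of length 4, 5, and 10 and comparing each length with its count should make the rule apparent. A sequence shorter than the window yields an empty list.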
19
Statistics of small peptides
Identification of a protein from a peptide sequence
Now imagine that there are many hundreds of proteins in an organism (say ss120), with different lengths. What do you need to know to calculate the total number of 4-aa sequences in the proteins of ss120?
You can get all the information you need in BioBIKE using the functions illustrated on the following slides.
26
Assembling these functions should give you the number of 4-amino-acid peptides in the ss120 proteins. How does this number compare with the number of possible 4-amino-acid peptide sequences you calculated earlier?
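The same comparison can be sketched in Python once you have the protein lengths. The lengths below are invented placeholders, not ss120 data, and the function names are hypothetical; substitute the real lengths you obtain from BioBIKE.

```python
def window_count(protein_length, size=4):
    """Number of overlapping peptides of `size` residues in one protein."""
    return max(protein_length - size + 1, 0)

def total_windows(protein_lengths, size=4):
    """Total number of overlapping peptides across all proteins."""
    return sum(window_count(n, size) for n in protein_lengths)

# Made-up lengths for illustration only (NOT the ss120 proteome).
hypothetical_lengths = [100, 250, 3, 412]

total = total_windows(hypothetical_lengths)
possible = 20 ** 4  # possible 4-aa sequences, assuming 20 amino acids
print(total, possible, total / possible)
```

The final ratio is the average number of times each possible 4-aa sequence would occur if all sequences were equally likely, which is the analogue of people per birthday.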
27
Statistics of small peptides
How much overlap is there in the molecular weights of different peptides?
There are several problems in attempting to identify a protein from a single small peptide. Let's examine one of them.
Mass spectrometry directly gives you not the sequence of a peptide but rather its molecular weight. If every peptide has a different molecular weight, then one can go directly from molecular weight to sequence. Is this the case?
Consider the set of 3-amino-acid peptides as an example.
28
Statistics of small peptides
How much overlap is there in the molecular weights of different peptides?
Strategy:
- Calculate the molecular weights of all 3-amino-acid peptides
- Bin (count) each size class
- Write the results to a file
- Download the file
- Upload the file into Excel
- Make a histogram of the results
You'll want to consider the BioBIKE functions on the following slides.
29
MW-OF (from the GENES-PROTEIN menu; Translation submenu)
Use it to get the molecular weights of all protein sequences of length 3. Use the SEQUENCE option so that the function knows to interpret a sequence like "PHE" as "proline-histidine-glutamate", using the one-letter code, rather than as the abbreviation of phenylalanine, using the three-letter code.
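For comparison outside BioBIKE, the calculation can be sketched in Python. This is not how MW-OF is implemented: the table below uses approximate integer residue masses (an assumption made to keep the example short), so its results may differ slightly from what BioBIKE reports. A peptide's weight is taken as the sum of its residue masses plus one water.

```python
from itertools import product

# Approximate integer residue masses in daltons (assumed values).
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "Q": 128, "K": 128, "E": 129,
    "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}
WATER = 18  # one water per peptide chain (free N- and C-termini)

def molecular_weight(sequence):
    """Approximate weight of a peptide written in one-letter code."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

# "PHE" read as a sequence (Pro-His-Glu) versus phenylalanine alone ("F").
print(molecular_weight("PHE"), molecular_weight("F"))

# Weights of all 3-amino-acid peptides, keyed by sequence.
weights = {"".join(p): molecular_weight(p) for p in product(RESIDUE_MASS, repeat=3)}
print(len(weights), min(weights.values()), max(weights.values()))
```

The two readings of "PHE" give very different weights, which is why the SEQUENCE option matters.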
30
BIN-DATA-OF (you used this in the previous tour)
Use it to count the instances of each molecular weight. The interval should be set to 1 so each size class is counted individually. The max should be set to at least the biggest molecular weight a 3-amino-acid peptide can have. A safe upper bound is 3 times the molecular weight of the biggest amino acid (the true value is slightly less, since water is lost when each peptide bond forms). What's that?
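The idea of binning with an interval and a maximum can be sketched in a few lines of Python. This is a hypothetical stand-in written for illustration, not the actual behavior of BIN-DATA-OF; the function name, its parameters, and the sample weights are all assumptions.

```python
from collections import Counter

def bin_counts(values, interval=1, maximum=None):
    """Count values per bin; each bin is labeled by its lower edge."""
    counts = Counter()
    for value in values:
        if maximum is not None and value > maximum:
            continue  # ignore anything above the chosen maximum
        counts[int(value // interval) * interval] += 1
    return sorted(counts.items())

sample_weights = [189, 203, 203, 219, 203]  # made-up sample
print(bin_counts(sample_weights, interval=1))
print(bin_counts(sample_weights, interval=10))
```

With an interval of 1, every distinct integer weight gets its own count. A wider interval merges neighboring weights and hides the detail you want to see.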
31
WRITE (you used this in the previous tour)
Use it to write the counts of the binned molecular weights, i.e. the previous result. (PREVIOUS-RESULT from the OTHER-FUNCTIONS menu may be of use here.) Make up any file name you want, so long as you put it in quotes. Select TAB-DELIMITED from the Options menu, since the file will be uploaded into Excel.
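For reference, a tab-delimited file is simply one row per line with a tab between columns. A minimal Python sketch of writing such a file follows; the function name, file name, and rows are invented for the example, and this is not a description of what WRITE does internally.

```python
import csv

def write_tab_delimited(rows, filename):
    """Write rows (sequences of values) as tab-separated lines."""
    with open(filename, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)

# Made-up (molecular weight, count) rows for illustration.
write_tab_delimited([(189, 1), (203, 3), (219, 1)], "tripeptide-mw-counts.txt")
```

Excel splits each line at the tabs, so the weights land in one column and the counts in the next.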
32
Statistics of small peptides
How much overlap is there in the molecular weights of different peptides?
You should now be in a position to create a histogram within Excel. If you do, you'll see something remarkable, like…
33
Statistics of small peptides
How much overlap is there in the molecular weights of different peptides?
This is peculiar…
[Figure: part of the histogram, blown up to show detail]
34
Statistics of small peptides
How much overlap is there in the molecular weights of different peptides?
This is peculiar… The number of instances in each molecular weight class appears to jump by a discrete unit. Why is that? Let's examine the peptides and their molecular weights more closely.
[Figure: part of the histogram, blown up to show detail]
35
Repeat the molecular weight calculation, but this time label the result (you'll see what labeling does in a moment). Execute the resulting function.
36
Note that each molecular weight now comes with the peptide that is associated with it. To compare this result with the histogram, we need to sort the result by molecular weight.
37
Surround the MW-OF function and wrap SORT around it.
38
We want to sort by the molecular weight (the second position), not the peptide (the first position).
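The difference between sorting by the first and the second position can be seen in a short Python sketch. The labeled pairs below are a small made-up sample with approximate integer weights, not BioBIKE output.

```python
# (peptide, approximate molecular weight) pairs; values are illustrative.
labeled = [("PHE", 381), ("GGG", 189), ("WWW", 576), ("GAG", 203)]

by_peptide = sorted(labeled)                           # first position
by_weight = sorted(labeled, key=lambda pair: pair[1])  # second position

print(by_peptide)
print(by_weight)
```

Only the second ordering puts peptides of equal weight next to each other, which is what you need in order to compare against the histogram.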
39
Execute the function and compare the results closely with your histogram in Excel. What accounts for the numbers? Why are molecular weights with only one peptide so rare? How many are there?
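To explore the same questions on a case small enough to read in full, the Python sketch below groups peptides by weight for a toy alphabet of three amino acids and a length of 2. The alphabet, the approximate integer masses, and the function name are assumptions chosen for illustration; the real exercise uses all 20 amino acids and length 3.

```python
from collections import defaultdict
from itertools import product

# Toy alphabet with approximate integer residue masses (assumed values).
TOY_RESIDUE_MASS = {"G": 57, "A": 71, "S": 87}
WATER = 18

def group_by_weight(length):
    """Map each molecular weight to the list of peptides that have it."""
    groups = defaultdict(list)
    for letters in product(TOY_RESIDUE_MASS, repeat=length):
        peptide = "".join(letters)
        weight = sum(TOY_RESIDUE_MASS[aa] for aa in peptide) + WATER
        groups[weight].append(peptide)
    return dict(groups)

groups = group_by_weight(2)
for weight in sorted(groups):
    print(weight, groups[weight])

singletons = [members[0] for members in groups.values() if len(members) == 1]
print(singletons)
```

Look at which peptides share a weight and which stand alone, then ask whether the same pattern explains the group sizes in your full result.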
40
Statistics of small peptides
In this tour, you've seen:
- How to determine the number of peptides in each size class.
- Problems related to the identification of proteins from their peptides.
- The degeneracy of molecular weights in peptides.
- Some causes of this degeneracy.