Automation System for Checking Protein Prediction
Sigalit Kanevsky
Academic Advisor: Dr. Chen Keasar
The Department of Computer Science, Ben Gurion University
CASP: Critical Assessment of Techniques for Protein Structure Prediction
A competition for protein structure prediction, held every two years. The participating groups receive the primary structure (sequence) of a protein, the TARGET, and must return a model of the target’s tertiary structure.
Goal
Create a system that compares the score we assign to a given protein structure with the official solution, which contains the structure's actual score. The system will be used in the CASP competition.
System flow
Steps: download links, add to working directory, examine the prediction.
Download links
During the competition, Chen and his group submit proposed protein structure predictions as solutions, and at the end of the competition the real answer is uploaded to the CASP website. Once a day the system checks the CASP website for new links to answer files.
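The daily check can be sketched as follows. This is a minimal illustration, not the system's actual code: the function name `find_new_answer_links`, the `.pdb` suffix used to recognise answer files, and the idea of keeping a set of already seen links are all assumptions. The page HTML is passed in as a string, so fetching the CASP page is left to the caller.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_new_answer_links(page_html, seen_links, suffix=".pdb"):
    """Return answer-file links on the page that are not in seen_links.

    The suffix is an assumed way to recognise answer files.
    """
    collector = LinkCollector()
    collector.feed(page_html)
    answers = [link for link in collector.links if link.endswith(suffix)]
    return [link for link in answers if link not in seen_links]
```

Running this once a day and adding the returned links to the stored set means each answer file is picked up only once.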
Add to working directory
When the system finds a new answer file, it downloads the file from the CASP website to a working directory and adds the proposed prediction to the same directory, so that the prediction can later be compared with the actual solution.
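A possible shape for this step is shown below. It is only a sketch under assumptions: the function name `stage_target`, the one-directory-per-target layout, and the file names `answer.pdb` and `prediction.pdb` are hypothetical. It also assumes the answer file has already been fetched to a local path (for example with `urllib.request.urlretrieve`), so the sketch covers only the placement of both files side by side.

```python
import shutil
from pathlib import Path


def stage_target(target_id, answer_file, prediction_file, work_root):
    """Copy the answer and our prediction into one directory per target.

    Returns the paths of the two copies. The directory layout and the
    file names are assumptions made for this illustration.
    """
    target_dir = Path(work_root) / target_id
    target_dir.mkdir(parents=True, exist_ok=True)

    answer_copy = target_dir / "answer.pdb"
    prediction_copy = target_dir / "prediction.pdb"
    shutil.copyfile(answer_file, answer_copy)
    shutil.copyfile(prediction_file, prediction_copy)
    return answer_copy, prediction_copy
```

Keeping both files in the same directory lets the comparison step work on one target at a time without any further lookup.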
Examine the prediction
For every protein, the system generates a comparison between our predicted score and the actual score, and calculates the correlation and the average distance.
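The two statistics could be computed as in the sketch below. The slides do not say which correlation or distance is used, so this assumes the Pearson correlation coefficient and the mean absolute difference between predicted and actual scores. The function names are hypothetical.

```python
from math import sqrt


def pearson_correlation(predicted, actual):
    """Pearson correlation between two equal-length score lists."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_p * var_a)


def mean_absolute_distance(predicted, actual):
    """Average of |predicted - actual| over all score pairs (assumed metric)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

A correlation close to 1 means the predicted quality ranks structures almost the same way as the actual quality, while a value near 0 means there is no linear relationship between the two.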
Results
Good prediction
When we made a good prediction of the protein structure, we got a high correlation score, as in target T0889, which got a correlation score of 0.979020819. The graph shows that the actual quality of the protein and the predicted quality are correlated.
Target T0889
Figure labels: Good quality, Bad quality
Bad prediction
When we made a bad prediction of the protein structure, we got a low correlation score, as in target T0865, which got a correlation score of -0.096035055. The graph shows no match between our predicted quality and the actual quality.
Target T0865
Correlations of all targets
Thank you all for your time and attention.