Cluster Based Protein Folding Douglas Fuller and Brandon McKethan.

Overview: What is Folding@Home?  Screensaver-based distributed computing program run by Stanford  Utilizes unused processing power to fold proteins through a finite number of frames  The same work units are run on numerous computers to confirm accuracy

What Does F@H Accomplish?  Helps show how a linear amino acid chain folds into a 3D protein structure  Is applied to proteins involved in many diseases to elucidate how misfolding occurs  Will eventually lead to mutation-to-phenotype simulations

Computational Aspects  Monte Carlo simulation using lowest energy state calculations  Non-parallel, unimolecular program  Heuristic approaches
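As a rough illustration of a Monte Carlo search for a low-energy state, the sketch below runs a Metropolis loop on a toy one-dimensional energy function. The energy function, step size, temperature, and function names are hypothetical stand-ins chosen for illustration; this is not the Folding@Home model or code.

```python
import math
import random

def toy_energy(x):
    # Hypothetical energy landscape with its minimum at x = 2.0.
    return (x - 2.0) ** 2

def metropolis(energy, x0, steps=5000, step_size=0.5, temperature=0.1, seed=0):
    """Return the lowest-energy state seen during a Metropolis random walk."""
    rng = random.Random(seed)
    x = x0
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        e_new = energy(candidate)
        # Downhill moves are always accepted; uphill moves are accepted
        # with Boltzmann probability exp(-dE / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / temperature):
            x, e = candidate, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

best_x, best_e = metropolis(toy_energy, x0=-5.0)
```

The walk is strictly sequential: each step depends on the state left by the previous one, which is one reason a single run of this kind does not parallelize easily.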

Implications of Heuristics/Unimolecular  The environment of the cell and molecular interactions  Solvents and extramolecular interactions cannot be ignored in the process of folding  Many diseases arise from misfolding that is not influenced by the internal energy state

Diseases of Protein Misfolding May Require Multimolecular Interactions  Cancers – B-RAF, Hsp-90 and 17-AAG  Prions – Infectious Proteins  BSE (Mad Cow Disease)  Kuru  Sheep Scrapie

Computational Weight of Multimolecular Interactions  The number of energy states and inter/intramolecular interactions is much higher than in the unimolecular case  Pushes the computational return time beyond acceptable limits for the F@H project  Desktop computing and the Lowest Common Denominator
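To make the growth in interactions concrete, the sketch below counts atom pairs under the simplifying assumption that every atom interacts with every other atom (no distance cutoff). The function name and the atom counts in the example are hypothetical.

```python
def pair_counts(atoms_per_molecule, molecules):
    """Return (intramolecular pairs, intermolecular pairs) for identical molecules."""
    total_atoms = atoms_per_molecule * molecules
    total_pairs = total_atoms * (total_atoms - 1) // 2
    intra = molecules * atoms_per_molecule * (atoms_per_molecule - 1) // 2
    return intra, total_pairs - intra

single = pair_counts(1000, 1)   # one molecule: intramolecular pairs only
triple = pair_counts(1000, 3)   # three molecules: intermolecular pairs dominate
```

Under this assumption, going from one molecule to three triples the intramolecular pairs but adds twice that many intermolecular pairs on top, so the total grows roughly with the square of the atom count.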

Cluster Based Folding and Future Aspects of Folding  Use cluster computing as the testing ground for truly parallel simulations  Individual proteins are discrete units  Allows the program to be refined while highly parallel desktop computing comes to fruition  5-10 year timeframe

Post-Multithreading Possibilities  Next truly discrete unit is the atom itself  Atom-per-processor modeling vs. Monte Carlo  Requires an extremely high number of processors: hundreds of thousands  Once again clusters provide a testing ground

Parallelizing a computation  Considered “re-bugging” your code  Distribute work to multiple processors  Requires communication to deal with dependencies  Requires computation to distribute work and recombine results  Now what?
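The distribute, compute, and recombine steps can be sketched with Python's standard `concurrent.futures` module. The workload (a sum of squares) and the function names are hypothetical. Threads are used only to keep the example short; in CPython they do not speed up CPU-bound pure-Python work, so a real cluster code would more likely use processes or message passing.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum_of_squares(chunk):
    # Work done independently by one worker.
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(items, workers=4):
    # Distribute: deal the items out to the workers round-robin.
    chunks = [items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    # Recombine: an extra reduction step the serial version never needed.
    return sum(partials)

result = parallel_sum_of_squares(list(range(10)))
```

Both the chunking and the final reduction are code that exists only because the computation was parallelized, which is where new bugs and overhead tend to appear.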

Domain Decomposition  Decide how to divide work  Spatially  Temporally  Other?  Introduces overhead  Can pessimize instead of optimize
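A minimal sketch of a spatial decomposition, assuming a one-dimensional, non-periodic grid split into contiguous blocks. The function names are hypothetical and not taken from the presentation.

```python
def block_bounds(n_cells, n_ranks, rank):
    """Half-open range [start, stop) of grid cells owned by `rank`."""
    base, extra = divmod(n_cells, n_ranks)
    # The first `extra` ranks each take one additional cell.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def neighbors(rank, n_ranks):
    """Ranks that share a block boundary with `rank` (None at a domain edge)."""
    left = rank - 1 if rank > 0 else None
    right = rank + 1 if rank < n_ranks - 1 else None
    return left, right

blocks = [block_bounds(10, 4, r) for r in range(4)]
```

Each shared boundary is a place where neighboring ranks must exchange data, so the choice of decomposition determines the communication overhead.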

Cheat!  “Embarrassingly parallel” code  Splits naturally into small pieces  Small pieces can ignore each other  Small pieces can be computed by a single node  Folding@Home  Problem: fold all proteins they care about  Decomposition: individual proteins  Dependencies: none!
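A minimal sketch of the embarrassingly parallel pattern, assuming each job is fully independent. The job names, seeds, and the `fold_one` stand-in are hypothetical; a real folding run would replace the toy calculation.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def fold_one(job):
    # Stand-in for a complete folding run on one protein.
    name, seed = job
    rng = random.Random(seed)  # private random stream: no shared state
    best_score = min(rng.random() for _ in range(1000))
    return name, best_score

def fold_all(jobs, workers=4):
    # No job reads another job's data, so no communication is needed
    # beyond handing out jobs and collecting results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fold_one, jobs))

jobs = [("protein_a", 1), ("protein_b", 2), ("protein_c", 3)]
results = fold_all(jobs)
```

Because the jobs share nothing, the results are the same regardless of how many workers run them or in what order they finish.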

Domain Decomposition: Challenges  Analyze dependencies  Communication patterns  Communication volume  Data distribution  Overlap computation/communication  Consider system characteristics  Communication latency/bandwidth  Computational efficiency  Computation/communication ratio  Do this all ahead of time?
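One common way to reason about latency, bandwidth, and the computation/communication ratio is a simple linear cost model: time per message = latency + bytes / bandwidth. The sketch below uses that model as an assumption; the function names and the numbers in the example are hypothetical, not measurements.

```python
def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time to send one message under the linear cost model."""
    return latency_s + n_bytes / bandwidth_bytes_per_s

def compute_to_comm_ratio(compute_s, n_messages, bytes_per_message,
                          latency_s, bandwidth_bytes_per_s):
    """Ratio of computation time to total communication time."""
    comm_s = n_messages * message_time(bytes_per_message, latency_s,
                                       bandwidth_bytes_per_s)
    return compute_s / comm_s

# Same total volume (1 MB), sent as one message or as 1000 small ones.
one_big = message_time(1_000_000, 1e-4, 1e9)
many_small = 1000 * message_time(1_000, 1e-4, 1e9)
```

In this model many small messages cost far more than one large message of the same total size, because latency is paid per message.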

Domain Decomposition: Pitfalls  Parallel overhead  Computation waiting on communication  Feed-forward dependencies  Dynamic decomposition schemes  Pick two: performance, portability, scalability
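The effect of parallel overhead can be sketched with a toy timing model in which the work divides perfectly but each added worker contributes a fixed overhead. The linear overhead term is an assumption made for illustration, and the function names and numbers are hypothetical.

```python
def parallel_time(serial_s, workers, overhead_per_worker_s):
    # Perfectly divided work plus overhead that grows with the worker count.
    return serial_s / workers + overhead_per_worker_s * workers

def speedup(serial_s, workers, overhead_per_worker_s):
    return serial_s / parallel_time(serial_s, workers, overhead_per_worker_s)

modest = speedup(100.0, 10, 1.0)      # 100 / (10 + 10)
too_many = speedup(100.0, 100, 1.0)   # 100 / (1 + 100), slower than serial
```

With these numbers, 10 workers give a speedup of 5, while 100 workers run slower than the serial code.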
