Modelling proteins and proteomes using Linux clusters Ram Samudrala University of Washington.


1 Modelling proteins and proteomes using Linux clusters Ram Samudrala University of Washington

2 Examples of biological problems
- Protein structure prediction/docking simulations: need to run different trajectories that sometimes talk with each other
- Molecular dynamics simulations: need more cohesive parallelisation
- Polarisable force fields: need true parallelisation
- Bioinformatics searches/exploration: trivially parallelisable

3 Computational issues
- Need efficient methods to start/stop jobs
- Need a load-balancing queuing system
- Need fast communications at times
- Need stability (uptimes of months/years)
- Need low maintenance/management overhead
- Need low installation overhead
- Needs to be cheap!

4 Hardware and operating system
- 256 AMD and Intel CPUs (1-2.5 GHz)
- 0.5-1 GB RAM, 100-200 GB HD, dual-processor motherboards
- 100 Mbps Ethernet connectivity for 64-processor sets
- White boxes are good but use up space; 1U racks are ideal
- Minimal Linux installation: create a clone “CD” and copy it onto all machines

5 Our solution
- No single solution: users implement their own
- Completely decentralised
- Analyse the problem and determine the parallelisable parts
- Implementation specific to the problem
- Use local scratch space for computation
- Redundant storage of data for faster access
- Limit problem space to specific problems
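As an illustration of the "use local scratch space for computation" point, here is a minimal Python sketch, not the cluster's actual tooling. The function name `run_in_scratch`, the `compute` callback and the `.part` suffix are all hypothetical; the only idea taken from the slide is that work happens on node-local disk and only the finished result is written to shared storage.

```python
import shutil
import tempfile
from pathlib import Path


def run_in_scratch(compute, result_dir, scratch_root=None, result_name="result.txt"):
    """Run compute(path) in a node-local scratch directory, then publish the result.

    compute      -- hypothetical callable that writes its output to the given path
    result_dir   -- shared (e.g. network-mounted) directory for finished results
    scratch_root -- local scratch location; None means the system temp directory
    """
    result_dir = Path(result_dir)
    result_dir.mkdir(parents=True, exist_ok=True)

    # All intermediate I/O stays on the local disk of the compute node.
    with tempfile.TemporaryDirectory(dir=scratch_root) as work:
        local_out = Path(work) / result_name
        compute(local_out)

        # Copy under a temporary name, then rename, so that readers of the
        # shared directory never see a half-written file.
        partial = result_dir / (result_name + ".part")
        final = result_dir / result_name
        shutil.copy2(local_out, partial)
        partial.replace(final)

    return final
```

The copy-then-rename step is a design choice for this sketch: when results are exchanged through files, a rename within one directory keeps incomplete output from being picked up by other jobs.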

6 Problem-specific implementation
- MCSA/GA: socket-based communication of trajectories; multiple trajectories on different CPUs
- Docking: sample different ligands/regions of the protein on different CPUs
- MD: pairwise force fields are additive
- PFF: ?
- Bioinformatics: trivial parallelisation; communication by disk
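The MD point rests on additivity: a pairwise energy is a sum over atom pairs, so the pairs can be divided among CPUs and the partial sums added at the end. The Python sketch below illustrates only that arithmetic. The Lennard-Jones-like `pair_energy` term and all function names are assumptions made for the example, and the chunks are evaluated one after another here rather than on separate processors.

```python
import itertools
import math


def pair_energy(a, b):
    """Toy Lennard-Jones-like term in reduced units (illustrative only)."""
    r = math.dist(a, b)
    return 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)


def total_energy(coords):
    """Reference: sum the pair term over every unordered pair of atoms."""
    return sum(pair_energy(a, b) for a, b in itertools.combinations(coords, 2))


def split_pairs(n_atoms, n_workers):
    """Deal the index pairs out round-robin, one chunk per worker."""
    pairs = list(itertools.combinations(range(n_atoms), 2))
    return [pairs[w::n_workers] for w in range(n_workers)]


def partial_energy(coords, pairs):
    """Work done by one worker: the sum over its own share of pairs."""
    return sum(pair_energy(coords[i], coords[j]) for i, j in pairs)


def chunked_energy(coords, n_workers=4):
    """Add up the partial sums; each chunk could run on a different CPU."""
    chunks = split_pairs(len(coords), n_workers)
    return sum(partial_energy(coords, chunk) for chunk in chunks)
```

Because every pair lands in exactly one chunk, `chunked_energy` agrees with `total_energy` up to floating-point rounding, whatever the number of workers.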

7 Semi-exhaustive segment-based folding
EFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEVK
- generate: fragments from database; 14-state φ,ψ model
- minimise: Monte Carlo with simulated annealing; conformational space annealing; GA
- filter: all-atom pairwise interactions, bad contacts; compactness, secondary structure
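The "minimise" step names Monte Carlo with simulated annealing. The following Python sketch shows the generic Metropolis scheme with a geometric cooling schedule on a toy one-dimensional energy. It is not the folding code behind the slide: the energy function, move set, temperatures and step count are all assumptions chosen to keep the example small.

```python
import math
import random


def simulated_annealing(energy, propose, start,
                        t_start=2.0, t_end=0.01, steps=5000, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule.

    energy(state) returns a float; propose(state, rng) returns a trial state.
    Returns the lowest-energy state visited and its energy.
    """
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current

    for step in range(steps):
        # Temperature falls geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))

        trial = propose(current, rng)
        e_trial = energy(trial)
        delta = e_trial - e_current

        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current

    return best, e_best


def toy_energy(x):
    """Hypothetical double-well energy standing in for a real scoring function."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def toy_move(x, rng):
    """Hypothetical move set: a small random displacement."""
    return x + rng.uniform(-0.5, 0.5)
```

A call such as `simulated_annealing(toy_energy, toy_move, start=3.0)` returns the best state found and its energy. Independent trajectories of this kind, started with different seeds, are what the deck describes running on different CPUs.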

8 Ab initio prediction at CASP
T170/sfrp3 – 4.8 Å for all 69 aa

9 Comparative modelling at CASP T182 – 1.0 Å (249 aa; 41% id)

10 Prediction of SARS CoV proteinase inhibitors Ekachai Jenwitheesuk

11 Bioverse – S. typhimurium protein-protein interaction network Jason McDermott

12 Bioverse – H. sapiens protein-protein interaction network Jason McDermott

13 Future directions
- Network connection with multiple Ethernet cards based on traffic analysis
- Gigabit Ethernet (switches are still expensive)
- Better network filesystems

