1
David Hoover
Scientific Computing Branch, Division of Computer System Services
CIT, NIH
Swarms and Bundles: Bioinformatics and Biostatistics on Biowulf
2
Embarrassingly Parallel Problems
  GWAS, with huge numbers of SNPs
  Sequence analysis, assembly, and mapping
  Testing and validating statistical models
  Protein folding and threading
  Molecular docking and compound screening
  Tomographic reconstruction
3
Characterization of Surface Protein 3 from the Malaria Parasite P. falciparum
  Protein folding calculations with Rosetta++
  100,000 CPU hours
  Tsai et al., Mol. Biochem. Parasitology, online preprint 2008
4
How to run multiple independent processes in parallel
  16 independent processes
  [Diagram: each process takes its own input, runs the command, and writes its own output.]
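The pattern on this slide can be sketched on a single machine as follows. This is only an illustration, not how Biowulf schedules work: the squaring command is a hypothetical stand-in for a real analysis program, and `run_one`, `run_all`, and `max_parallel` are names invented for this sketch.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(index):
    """Run one independent command and return its captured output."""
    # Hypothetical stand-in command: a child Python process that squares its input.
    cmd = [sys.executable, "-c", f"print({index} * {index})"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_all(n_jobs=16, max_parallel=4):
    """Launch n_jobs independent processes, at most max_parallel at a time."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in input order, even if jobs finish out of order.
        return list(pool.map(run_one, range(1, n_jobs + 1)))


if __name__ == "__main__":
    print(run_all())
```

Because no process depends on another's result, the only coordination needed is collecting the outputs, which is what makes this class of problem embarrassingly parallel.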
5
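Biowulf Cluster Batch System
  [Diagram: each of the 16 jobs (job1 through job16) is submitted to the batch system with its own script and writes its own output file (job1.out through job16.out).]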
6
Swarm
  biowulf% swarm -f file
  [Diagram: job1 through job4 run on Node 1 through Node 4, one job per node, writing job1.out through job4.out.]
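A minimal sketch of preparing the `file` argument, assuming it is a plain-text list with one independent command per line. The program name `mycommand`, the `input1`, `input2`, ... file names, and both helper functions are hypothetical and chosen only for illustration.

```python
import tempfile
from pathlib import Path


def build_command_lines(n_jobs, program="mycommand"):
    """Return one shell command per independent job (names are hypothetical)."""
    return [
        f"{program} < input{i} > job{i}.out"
        for i in range(1, n_jobs + 1)
    ]


def write_command_file(path, lines):
    """Write the commands to path, one per line, and return the path."""
    path = Path(path)
    path.write_text("\n".join(lines) + "\n")
    return path


if __name__ == "__main__":
    # Write a 16-line command file into a scratch directory and show it.
    with tempfile.TemporaryDirectory() as scratch:
        command_file = write_command_file(
            Path(scratch) / "file", build_command_lines(16)
        )
        print(command_file.read_text())
```

The resulting file would then be passed to `swarm -f file` on the cluster. The sketch stops short of invoking `swarm` itself.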
7
Bundled Swarm
  biowulf% swarm -f file -b 4
  [Diagram: a single job, job1, runs on Node 1 and writes job1.out.]
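The effect of `-b 4` can be illustrated with a small grouping sketch. It assumes, as the slide suggests, that bundling packs consecutive commands from the file into groups of four that run one after another within a single job. The `bundle` function is hypothetical and is not swarm's actual implementation.

```python
def bundle(commands, bundle_size):
    """Split commands into consecutive groups of at most bundle_size."""
    if bundle_size < 1:
        raise ValueError("bundle_size must be at least 1")
    return [
        commands[start:start + bundle_size]
        for start in range(0, len(commands), bundle_size)
    ]


if __name__ == "__main__":
    # Hypothetical command names, 16 in total as on the earlier slides.
    commands = [f"command{i}" for i in range(1, 17)]
    for job_number, group in enumerate(bundle(commands, 4), start=1):
        # Each group would run serially inside one job.
        print(f"job{job_number}: {' ; '.join(group)}")
```

Under this assumption, 16 commands with a bundle size of 4 become 4 jobs instead of 16, which reduces the number of jobs the batch system has to schedule when individual commands are short.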
8
Swarm Facts
  Written and maintained by Helix Systems Staff
  swarm introduced in late 2000
  82% of all batch jobs run on the cluster since 2002 are swarm jobs
  ~60% of all wall time spent on swarm jobs
  swarm has been shared with clusters around the world
9
Swarm World Records
  Largest swarm: 683,445 commands
  Largest bundle: 24,000 commands per CPU
10
Future Challenges
  How to deal with larger multicore nodes?
  [Diagram: Node 1, Node 2, Node 3]