Enabling Multiple Sequence Comparison by Log-Expectation (MUSCLE) on EUAsiaGrid
EUAsiaGrid Master Class, 6 May 2010
By: Lee Hong Kai and Thomas Tay
NUHS Molecular Diagnostic Center
Norovirus
‣ Main pathogen causing non-bacterial outbreaks of gastroenteritis
‣ Easily transmitted in semi-closed communities such as hospitals and long-term care facilities
‣ Extremely infectious and a potentially dangerous pathogen when present in immunocompromised patients
Norovirus
‣ Small, round +ssRNA virus of about 50 nm
‣ Genome size of about 7.5 kb, encoding:
  - major structural protein, VP1
  - minor capsid protein, VP2
‣ Genetically and antigenically diverse: virus strains are genotyped to determine the epidemiological link between infected patients in an epidemic outbreak or transmission event
Purpose
A need for multiple sequence alignment of norovirus sequences from genogroups I, II and IV
‣ Effective primer and probe design
‣ Phylogeny analysis (sequence editing)
‣ SNP analysis
‣ Check sequence variability
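As a rough illustration of the "check sequence variability" and SNP steps, the sketch below scores each column of an already-aligned set of sequences by Shannon entropy and lists the columns that vary. The function names, the gap handling, and the toy sequences are assumptions made for this example; they are not taken from the slides or from any particular tool.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column, ignoring gap characters."""
    residues = [c for c in column.upper() if c not in "-."]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def variability_profile(aligned_seqs):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # zip(*seqs) walks the alignment column by column.
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]


def variable_sites(aligned_seqs, threshold=0.0):
    """0-based indices of columns whose entropy exceeds the threshold."""
    profile = variability_profile(aligned_seqs)
    return [i for i, h in enumerate(profile) if h > threshold]


# Toy alignment (made-up sequences, not norovirus data).
toy = ["ATGCA-GT", "ATGCATGT", "ATACATGT", "ATGCATGA"]
print(variable_sites(toy))
```

Conserved columns score zero and are candidate regions for primers and probes, while high-entropy columns mark candidate SNP positions. A real pipeline would likely also weigh gap content and ambiguous bases, which this sketch ignores.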
Problem
‣ ClustalW takes 9 hrs for about 1000 sequences
‣ MUSCLE is much faster but still limited by memory

  MUSCLE v3.6 by Robert C. Edgar
  This software is donated to the public domain.
  Please cite: Edgar, R.C. Nucleic Acids Res 32(5),
  noro 6423 seqs, max length 7746, avg length
  :05:08 10 MB(2%) Iter % K-mer dist pass 1
  00:05:10 10 MB(2%) Iter % K-mer dist pass 2
  muscle(1066) malloc: *** mmap(size= ) failed (error code=12)
  *** error: can't allocate region
  *** set a breakpoint in malloc_error_break to debug
  *** OUT OF MEMORY ***
  Memory allocated so far 10 MB
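The "K-mer dist" passes in the log refer to alignment-free distances between pairs of sequences. The sketch below gives a feel for that step with a deliberately simplified stand-in: one minus the Jaccard similarity of the two k-mer sets. This is not MUSCLE's own distance formula, and the function names and the choice of k are assumptions for illustration. It also counts how many pairs a run of 6423 sequences involves, which shows how the work grows quadratically; it does not identify which allocation failed in the log above.

```python
from itertools import combinations


def kmer_set(seq, k=6):
    """Set of all overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=6):
    """Simplified k-mer distance: 1 - Jaccard similarity of the k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def pair_count(n):
    """Number of unordered sequence pairs among n sequences."""
    return n * (n - 1) // 2


def distance_table(seqs, k=6):
    """Distance for every unordered pair, keyed by (i, j) with i < j."""
    return {
        (i, j): kmer_distance(seqs[i], seqs[j], k)
        for i, j in combinations(range(len(seqs)), 2)
    }


# 6423 is the sequence count reported in the log above.
print(pair_count(6423))  # 20624253
```

Because the pair count scales with the square of the number of sequences, doubling the data set roughly quadruples the pairwise work, which is one reason large alignments are attractive candidates for distributed resources such as a grid.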
Thank You!