Published by Jillian Daniel. Modified over 9 years ago.
1
National Partnership for Advanced Computational Infrastructure
Protein Folding Landscapes in a Distributed Environment
All Hands Meeting, 2001
University of Virginia: Andrew Grimshaw, Anand Natrajan
Scripps (TSRI): Charles L. Brooks III, Michael Crowley
SDSC: Nancy Wilkins-Diehr
2
Outline
CHARMM
  – Issues
Legion
The Run
  – Results
  – Lessons
AmberGrid
Summary
3
CHARMM
Routine exploration of folding landscapes helps in the search for a solution to the protein folding problem
Understanding folding is critical to structural genomics, biophysics, drug design, etc.
Key to understanding cell malfunctions in Alzheimer’s, cystic fibrosis, etc.
CHARMM and Amber benefit the majority (>80%) of bio-molecular scientists
Structural genomics & protein structure prediction
4
Folding Free Energy Landscape
Molecular dynamics simulations
100-200 structures to sample (r, R_gyr) space
[Figure: folding free energy landscape; axis label R_gyr]
5
Application Characteristics
Parameter-space study
  – Parameters correspond to structures along & near the folding path
Path unknown - could be many or broad
  – Many places along the path are sampled to determine local low free energy states
  – Path is the valley of lowest free energy states, from the high free energy state of the unfolded protein to the lowest free energy state (folded native protein)
6
Folding of Protein L
Immunoglobulin-binding protein
  – 62 residues (small), 585 atoms
  – 6500 water molecules, total 20085 atoms
  – Each parameter point requires O(10^6) dynamics steps
  – Typical folding surfaces require 100-200 sampling runs
CHARMM using the most accurate physics available for classical molecular dynamics simulation
  – PME, 9 Å cutoff, heuristic list update, SHAKE
Multiple 16-way parallel runs - maximum efficiency
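The system-size figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes a three-site water model (one oxygen, two hydrogens per molecule), which the slide does not state but which reproduces the quoted total, and reads "O(10^6)" as roughly one million steps per point.

```python
# Cross-check of the Protein L system size quoted on the slide.
PROTEIN_ATOMS = 585       # from the slide
WATER_MOLECULES = 6500    # from the slide
ATOMS_PER_WATER = 3       # assumption: 3-site water model (O + 2 H)

total_atoms = PROTEIN_ATOMS + WATER_MOLECULES * ATOMS_PER_WATER
print(total_atoms)  # 20085, matching the slide

# Order-of-magnitude step count for a whole folding surface.
STEPS_PER_POINT = 10**6       # assumption: "O(10^6)" taken as ~1e6
SAMPLING_RUNS = (100, 200)    # from the slide

total_steps = tuple(n * STEPS_PER_POINT for n in SAMPLING_RUNS)
print(total_steps)  # (100000000, 200000000)
```

Under these assumptions a full surface needs on the order of 10^8 dynamics steps on a 20085-atom system, which is why the runs are farmed out as many independent jobs.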
7
Application Characteristics
Many independent runs
  – 200 sets of data to be simulated in two sequential runs
      Equilibration (4-8 hours)
      Production/sampling (8-16 hours)
Each point has a task name, e.g., pl_1_2_1_e
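A rough compute budget follows from these numbers. The sketch below assumes the quoted hours are wall-clock times per run and that every run is 16-way parallel, as the Protein L slide suggests; neither assumption is confirmed by the slides, and the function name `budget` is illustrative only.

```python
# Rough compute budget for the full study, from the slide's figures.
N_POINTS = 200                 # sets of data (parameter points)
EQUILIBRATION_HOURS = (4, 8)   # low/high estimate per point
PRODUCTION_HOURS = (8, 16)     # low/high estimate per point
PROCS_PER_RUN = 16             # assumption: every run is 16-way parallel

def budget(n_points, equil, prod, procs):
    """Return (low, high) run-hours and (low, high) processor-hours."""
    run_hours = tuple(n_points * (e + p) for e, p in zip(equil, prod))
    proc_hours = tuple(h * procs for h in run_hours)
    return run_hours, proc_hours

run_hours, proc_hours = budget(
    N_POINTS, EQUILIBRATION_HOURS, PRODUCTION_HOURS, PROCS_PER_RUN
)
print(run_hours)   # (2400, 4800)
print(proc_hours)  # (38400, 76800)
```

Because the 200 points are independent, these hours can be spread across as many machines as are available; only equilibration and production for the same point must run in sequence.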
8
Scientists Using Legion
Binaries for each type
Script for dispatching jobs
Script for keeping track of results
Script for running binary at site
  – optional feature in Legion
Abstract interface to resources
  – queues, accounting, firewalls, etc.
Binary transfer (with caching)
Input file transfer
Job submission
Status reporting
Output file transfer
9
Legion
Complete, Integrated Infrastructure for Secure Distributed Resource Sharing
10
Grid OS Requirements
Wide-area
High Performance
Complexity Management
Extensibility
Security
Site Autonomy
Input / Output
Heterogeneity
Fault-tolerance
Scalability
Simplicity
Single Namespace
Resource Management
Platform Independence
Multi-language
Legacy Support
11
Transparent System
12
npacinet
13
The Run
14
Computational Issues
Provide improved response time
Access a large set of resources transparently
  – geographically distributed
  – heterogeneous
  – different organisations
5 organisations, 7 systems, 9 queues, 5 architectures, ~1000 processors
15
Resources Available
System            Site     Processor        Processors
IBM Blue Horizon  SDSC     375 MHz Power3   512/1184
HP SuperDome      CalTech  440 MHz PA-8700  128/128
IBM SP3           UMich    375 MHz Power3   24/24
IBM Azure         UTexas   160 MHz Power2   32/64
Sun HPC 10000     SDSC     400 MHz SMP      32/64
DEC Alpha         UVa      533 MHz EV56     32/128
16
Scientists Using Legion
Binaries for each type
Script for dispatching jobs
Script for keeping track of results
Script for running binary at site
  – optional feature in Legion
Abstract interface to resources
  – queues, accounting, firewalls, etc.
Binary transfer (with caching)
Input file transfer
Job submission
Status reporting
Output file transfer
17
Mechanics of Runs
Legion
Register binaries
Create task directories & specification
Dispatch equilibration
Dispatch equilibration & production
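The last two steps, where later rounds mix equilibration and production jobs, can be illustrated with a minimal dispatcher sketch. Everything here is hypothetical: the state names, the `dispatch_round` function, the point labels, and the `submit` callback (a stand-in for whatever actually hands a job to the grid). No real Legion command or API is shown.

```python
# Hypothetical two-stage dispatcher: production for a point may only be
# submitted after that point's equilibration has finished.

def dispatch_round(state, submit, batch_size):
    """Submit up to batch_size jobs.

    Points that have finished equilibration get production jobs first;
    remaining slots go to equilibration of points not yet started.
    `state` maps point name -> "new" | "equilibrating" | "equilibrated"
    | "producing" | "done" and is updated in place.
    """
    sent = []
    passes = (
        ("equilibrated", "production", "producing"),
        ("new", "equilibration", "equilibrating"),
    )
    for wanted, stage, running in passes:
        for name, current in state.items():
            if len(sent) == batch_size:
                return sent
            if current == wanted:
                submit(name, stage)
                state[name] = running
                sent.append((name, stage))
    return sent

log = []
def fake_submit(name, stage):
    # Stand-in for a real job submission; only records the request.
    log.append(f"{name}:{stage}")

state = {f"point_{i}": "new" for i in range(4)}

round1 = dispatch_round(state, fake_submit, batch_size=2)
print(round1)  # [('point_0', 'equilibration'), ('point_1', 'equilibration')]

state["point_0"] = "equilibrated"  # pretend point_0's first job finished
round2 = dispatch_round(state, fake_submit, batch_size=2)
print(round2)  # [('point_0', 'production'), ('point_2', 'equilibration')]
```

The first round contains only equilibration jobs; once any point finishes, subsequent rounds carry both kinds, which matches the "Dispatch equilibration & production" step.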
18
Distribution of CHARMM Work
19
Problems Encountered
Network slowdowns
  – Slowdown in the middle of the run
  – 100% loss for packets of size ~8500 bytes
Site failures
  – LoadLeveler restarts
  – NFS/AFS failures
Legion
  – No run-time failures
  – Archival support lacking
  – Must address binary differences
[Figure: diagram labelled LEGION, UVa, SDSC, UMich]
20
Successes
Science accomplished faster
  – 1 month on 128 SGI Origins @Scripps
  – 1.5 days on national grid with Legion
Transparent access to resources
  – User didn’t need to log on to different machines
  – Minimal direct interaction with resources
Problems identified
Legion remained stable
  – Other Legion users unaware of large runs
Large grid application run at powerful resources by one person from local resource
Collaboration between natural and computer scientists
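The "science accomplished faster" claim can be put as a turnaround ratio. The sketch assumes "1 month" means about 30 days; the slide gives no more precise figure.

```python
# Turnaround ratio implied by the slide's two figures.
DAYS_PER_MONTH = 30                  # assumption: "1 month" taken as 30 days
local_days = 1 * DAYS_PER_MONTH      # 1 month on the Scripps machine
grid_days = 1.5                      # on the national grid with Legion

speedup = local_days / grid_days
print(speedup)  # 20.0
```

Under that assumption the grid run returned results roughly 20 times sooner than the local estimate.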
21
AmberGrid
Easy Interface to Grid
22
Legion GUIs
Simple point-and-click interface to Grids
  – Familiar access to distributed file system
  – Enables & encourages sharing
Application portal model for HPC
  – AmberGrid
  – RenderGrid
  – Accounting
Transparent Access to Remote Resources
Intended Audience is Scientists
23
Logging in to npacinet
24
View of contexts (Distributed File System)
25
Control Panel
26
Running Amber
27
Run Status (Legion)
Graphical View (Chime)
28
Summary
CHARMM Run
  – Succeeded in starting big runs
  – Encountered problems
  – Learnt lessons for future
  – Let’s do it again!
      more processors, systems, organisations
AmberGrid
  – Showed proof-of-concept - grid portal
  – Need to resolve licence issues