NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE
Protein Folding Landscapes in a Distributed Environment
All Hands Meeting, 2001
University of Virginia: Andrew Grimshaw, Anand Natrajan
Scripps (TSRI): Charles L. Brooks III, Michael Crowley
SDSC: Nancy Wilkins-Diehr

Outline
CHARMM
– Issues
Legion
The Run
– Results
– Lessons
AmberGrid
Summary

CHARMM
Routine exploration of folding landscapes helps in the search for a solution to protein folding
Understanding folding is critical to structural genomics, biophysics, drug design, etc.
Key to understanding cell malfunctions in Alzheimer's, cystic fibrosis, etc.
CHARMM and Amber benefit the majority (>80%) of bio-molecular scientists
Structural genomics & protein structure prediction

Folding Free Energy Landscape
Molecular dynamics simulations generate structures to sample the (r, R_gyr) space
[Figure: folding free energy landscape plotted over r and R_gyr]

Application Characteristics
Parameter-space study
– Parameters correspond to structures along & near the folding path
Path unknown: could be many paths, or a broad one
– Many places along the path are sampled to determine local low free-energy states
– The path is the valley of lowest free-energy states leading from the high free-energy state of the unfolded protein to the lowest free-energy state (the folded native protein)

Folding of Protein L
Immunoglobulin-binding protein
– 62 residues (small), 585 atoms
– 6500 water molecules, total atoms
– Each parameter point requires O(10^6) dynamics steps
– Typical folding surfaces require many sampling runs
CHARMM uses the most accurate physics available for classical molecular dynamics simulation
– PME, 9 Å cutoff, heuristic list update, SHAKE
Multiple 16-way parallel runs for maximum efficiency

Application Characteristics
Many independent runs
– 200 sets of data to be simulated, each in two sequential runs:
  – Equilibration (4-8 hours)
  – Production/sampling (8-16 hours)
Each point has a task name, e.g., pl_1_2_1_e (see the sketch below)
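The figures on this slide are enough for a back-of-the-envelope compute estimate, and task names in the pl_<i>_<j>_<k>_<phase> style can be enumerated mechanically. A minimal Python sketch, assuming an invented 10 x 10 x 2 index grid (the real indexing scheme is not given in the deck):

```python
# A back-of-the-envelope sketch based only on the figures above.
# The 10 x 10 x 2 index grid is an assumption chosen so that exactly
# 200 parameter points come out; the real indexing scheme is not given.

from itertools import product

POINTS = 200                  # parameter points stated on the slide
EQUIL_HOURS = (4, 8)          # wall-clock range per equilibration run
PROD_HOURS = (8, 16)          # wall-clock range per production run
CPUS_PER_RUN = 16             # 16-way parallel runs (earlier slide)

def task_names(n_i=10, n_j=10, n_k=2):
    """Yield (equilibration, production) task-name pairs in the
    pl_<i>_<j>_<k>_<phase> style; the grid shape is invented."""
    for i, j, k in product(range(1, n_i + 1),
                           range(1, n_j + 1),
                           range(1, n_k + 1)):
        yield f"pl_{i}_{j}_{k}_e", f"pl_{i}_{j}_{k}_p"

names = list(task_names())
print(names[0], "...", len(names), "parameter points")

low = POINTS * (EQUIL_HOURS[0] + PROD_HOURS[0]) * CPUS_PER_RUN
high = POINTS * (EQUIL_HOURS[1] + PROD_HOURS[1]) * CPUS_PER_RUN
print(f"roughly {low:,} to {high:,} CPU-hours in aggregate")
```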

Scientists Using Legion
The scientist provides:
– Binaries for each architecture type
– Script for dispatching jobs
– Script for keeping track of results (a sketch follows this slide)
– Script for running the binary at a site
  – optional feature in Legion
Legion provides:
– Abstract interface to resources
  – queues, accounting, firewalls, etc.
– Binary transfer (with caching)
– Input file transfer
– Job submission
– Status reporting
– Output file transfer
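The deck names the scientist-side scripts but does not show them. A minimal sketch of the "keeping track of results" script, assuming placeholder commands legion_probe_run and legion_fetch_output in place of whatever Legion tools actually report status and retrieve output (these names are not documented Legion CLI):

```python
# Hypothetical results-tracking script; the deck only says such a script
# exists. Command names below are placeholders, not documented Legion CLI.

import pathlib
import subprocess
import time

TASKS_FILE = "tasks.txt"          # one task name per line, e.g. pl_1_2_1_e
DONE_DIR = pathlib.Path("done")   # local directory for retrieved outputs

def job_finished(task):
    """Ask the middleware whether a task has completed
    ('legion_probe_run' is an assumed placeholder command)."""
    out = subprocess.run(["legion_probe_run", task],
                         capture_output=True, text=True)
    return "COMPLETE" in out.stdout

def fetch_output(task):
    """Pull the task's output files back to local storage
    ('legion_fetch_output' is likewise a placeholder)."""
    DONE_DIR.mkdir(exist_ok=True)
    subprocess.run(["legion_fetch_output", task, str(DONE_DIR / task)],
                   check=True)

def main():
    pending = set(pathlib.Path(TASKS_FILE).read_text().split())
    while pending:
        for task in sorted(pending):      # sorted() copies, so safe to discard
            if job_finished(task):
                fetch_output(task)
                pending.discard(task)
        time.sleep(300)                   # poll every five minutes

if __name__ == "__main__":
    main()
```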

Legion
Complete, Integrated Infrastructure for Secure Distributed Resource Sharing

Grid OS Requirements
Wide-area, High Performance, Complexity Management, Extensibility, Security, Site Autonomy, Input / Output, Heterogeneity, Fault-tolerance, Scalability, Simplicity, Single Namespace, Resource Management, Platform Independence, Multi-language, Legacy Support

Transparent System

npacinet

The Run

Computational Issues
Provide improved response time
Access a large set of resources transparently
– geographically distributed
– heterogeneous
– different organisations
5 organisations, 7 systems, 9 queues, 5 architectures, ~1000 processors

Resources Available
System             Site      Processor          CPUs (run/total)
IBM Blue Horizon   SDSC      375 MHz Power3     512/1184
HP SuperDome       CalTech   440 MHz PA-RISC    /128
IBM SP3            UMich     375 MHz Power3     24/24
IBM Azure          UTexas    160 MHz Power2     32/64
Sun HPC            SDSC      400 MHz SMP        32/64
DEC Alpha          UVa       533 MHz EV56       32/128
(these figures drive the illustrative placement sketch below)
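Written down as data, the table above can drive a toy placement heuristic. The sketch below is purely illustrative: the run/total reading of the CPU column is an assumption, and Legion's real scheduler is not described in this deck.

```python
# The slide's resource list as data. The run/total reading of the CPU
# column is an assumption, and the placement heuristic is a toy; Legion's
# actual scheduling is not described in the deck.

RESOURCES = [
    # (system, site, processor, cpus_for_run, cpus_total)
    ("IBM Blue Horizon", "SDSC",    "375 MHz Power3",  512, 1184),
    ("HP SuperDome",     "CalTech", "440 MHz PA-RISC", None, 128),  # first figure missing on the slide
    ("IBM SP3",          "UMich",   "375 MHz Power3",   24,   24),
    ("IBM Azure",        "UTexas",  "160 MHz Power2",   32,   64),
    ("Sun HPC",          "SDSC",    "400 MHz SMP",      32,   64),
    ("DEC Alpha",        "UVa",     "533 MHz EV56",     32,  128),
]

def pick_site(width=16):
    """Pick the resource with the most CPUs for a width-way run
    (the CHARMM runs were 16-way)."""
    usable = [r for r in RESOURCES if (r[3] or 0) >= width]
    return max(usable, key=lambda r: r[3])

print(pick_site())   # with these figures: the SDSC Blue Horizon
```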

Scientists Using Legion
The scientist provides:
– Binaries for each architecture type
– Script for dispatching jobs
– Script for keeping track of results
– Script for running the binary at a site
  – optional feature in Legion
Legion provides:
– Abstract interface to resources
  – queues, accounting, firewalls, etc.
– Binary transfer (with caching)
– Input file transfer
– Job submission
– Status reporting
– Output file transfer

Mechanics of Runs
Legion:
– Register binaries
– Create task directories & specification
– Dispatch equilibration
– Dispatch equilibration & production
(a pipeline sketch follows this slide)
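Read as a pipeline, the four steps above could be wrapped in a short driver script. The sketch below is a rough illustration only: every legion_* command name, architecture tag, and spec-file line is an invented placeholder, not the scheme actually used.

```python
# Hypothetical driver for the four steps above. Every legion_* command,
# architecture tag, and spec line is an invented placeholder.

import pathlib
import subprocess

ARCHES = ["power3_aix", "pa_hpux", "power2_aix", "sparc_solaris", "alpha_tru64"]

def register_binaries():
    """Step 1: register one CHARMM binary per architecture."""
    for arch in ARCHES:
        subprocess.run(["legion_register_program",       # placeholder command
                        f"charmm.{arch}", arch], check=True)

def create_task(task):
    """Step 2: create the task directory and a specification file."""
    d = pathlib.Path("tasks") / task
    d.mkdir(parents=True, exist_ok=True)
    (d / "spec").write_text(f"task {task}\n")             # illustrative contents

def dispatch(task, phase):
    """Steps 3-4: dispatch the equilibration or production phase."""
    subprocess.run(["legion_dispatch_run",                # placeholder command
                    task, phase], check=True)

if __name__ == "__main__":
    register_binaries()
    for task in ("pl_1_1_1", "pl_1_1_2"):                 # a sample of the 200 points
        create_task(task)
        dispatch(task, "equilibration")
        dispatch(task, "production")   # in the real runs, production followed a completed equilibration
```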

Distribution of CHARMM Work

Problems Encountered
Network slowdowns
– Slowdown in the middle of the run
– 100% loss for packets of size ~8500 bytes (see the probe sketch below)
Site failures
– LoadLeveler restarts
– NFS/AFS failures
Legion
– No run-time failures
– Archival support lacking
– Must address binary differences

Successes
Science accomplished faster
– 1 month on a 128-processor SGI
– 1.5 days on the national grid with Legion
Transparent access to resources
– User didn't need to log on to different machines
– Minimal direct interaction with resources
Problems identified
Legion remained stable
– Other Legion users were unaware of the large runs
Large grid application run on powerful resources by one person from a local resource
Collaboration between natural and computer scientists

AmberGrid
Easy Interface to the Grid

Legion GUIs
Simple point-and-click interface to Grids
– Familiar access to the distributed file system
– Enables & encourages sharing
Application portal model for HPC
– AmberGrid
– RenderGrid
– Accounting
Transparent access to remote resources
Intended audience is scientists

Logging in to npacinet

View of contexts (Distributed File System)

Control Panel

Running Amber

Run Status (Legion) and Graphical View (Chime)

Summary
CHARMM run
– Succeeded in starting big runs
– Encountered problems
– Learnt lessons for the future
– Let's do it again, with more processors, systems, and organisations!
AmberGrid
– Showed proof of concept for a grid portal
– Need to resolve licence issues