1 University of Washington – e-Science
Introduction to the TeraGrid
Jeffrey P. Gardner, Sr. Research Scientist, High Performance Computing, University of Washington Dept. of Physics
Thanks to the TeraGrid community (the source of many of these slides), esp. Daniel S. Katz, LSU

2 Overview
TeraGrid is a US national resource
Funded by the NSF Office of Cyberinfrastructure
Gives any researcher in the U.S. access to leading-edge computational resources
Detailed info about the TeraGrid
How to start using the TeraGrid

3 What is Cyberinfrastructure?
“Cyberinfrastructure is a technological solution to the problem of efficiently connecting data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge.” [1]
The term was used by the NSF Blue Ribbon committee in 2003 in response to the question: “How can NSF… remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists, engineers, scholars, and citizens?”
The TeraGrid [2] is the NSF’s response to this question.
Cyberinfrastructure is also called e-Science [3]
[1] Source: Wikipedia
[2] More properly, the TeraGrid in its current form: the “Extensible Terascale Facility”
[3] Source: NSF

4 What is the TeraGrid?
World’s largest infrastructure for open scientific discovery
Leadership class resources at eleven partner sites combined to create an integrated, persistent computational resource
–High-performance networks
–High-performance computers (>750 TFlops)
–Visualization systems
–Data resources and tools (>30 PB, >100 discipline-specific databases)
–Science Gateways
–User portal
–User services - Help desk, training, advanced app support
Allocated through national peer-review process (It’s free!)
~2 PFlops in another year!

5 TeraGrid Resources

6 TeraGrid Systems 2007-8
Computational Resources (size approximate - not to scale)
[Figure labels: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Tennessee, LONI/LSU; 2007 (504TF); 2008 (~1PF)]
Slide courtesy of Tommy Minyard, TACC

7 Who Uses TeraGrid
Molecular Biosciences 31%
Chemistry 17%
Physics 17%
Astronomical Sciences 12%
Materials Research 6%
Chemical, Thermal Systems 5%
Earth Sciences 3%
Atmospheric Sciences 3%
Advanced Scientific Computing 2%
All 19 Others 4%

8 How TeraGrid Is Used
Use Modality: Community Size (rough est. - number of users)
Batch Computing on Individual Resources: 850
Exploratory and Application Porting: 650
Workflow, Ensemble, and Parameter Sweep: 250
Science Gateway Access: 500
Remote Interactive Steering and Visualization: 35
Tightly-Coupled Distributed Computation: 10
2006 data from

9 TeraGrid Usage
33% Annual Growth
[Chart: Normalized Units (millions); series: Specific Allocations, Roaming Allocations]
TeraGrid currently delivers an average of 420,000 cpu-hours per day
Dave Hart (dhart@sdsc.edu)
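
To put the 420,000 cpu-hours per day figure in perspective, the sketch below converts it into an equivalent number of continuously busy CPUs and an annual total. The function names are illustrative, and a 365-day year is assumed; neither comes from the slides.

```python
CPU_HOURS_PER_DAY = 420_000  # average delivered by TeraGrid, as quoted on the slide


def continuously_busy_cpus(cpu_hours_per_day: float) -> float:
    """CPUs that would have to run 24 hours a day to deliver this many cpu-hours."""
    return cpu_hours_per_day / 24


def cpu_hours_per_year(cpu_hours_per_day: float, days: int = 365) -> float:
    """Annual total, assuming the daily average holds for `days` days."""
    return cpu_hours_per_day * days


print(continuously_busy_cpus(CPU_HOURS_PER_DAY))  # 17500.0
print(cpu_hours_per_year(CPU_HOURS_PER_DAY))      # 153300000
```

Note that the chart's "Normalized Units" are a different unit from raw cpu-hours, so these numbers should not be read directly against the chart axis.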

10 TG usage: Predicting storms
Hurricanes and tornadoes cause massive loss of life and damage to property
TeraGrid supported the spring 2007 NOAA and University of Oklahoma Hazardous Weather Testbed
–Major Goal: assess how well ensemble forecasting predicts thunderstorms, including the supercells that spawn tornadoes
–Nightly reservation at PSC
–Delivers “better than real time” prediction
–Used 675,000 CPU hours for the season
–Used 312 TB on HPSS storage at PSC
Slide courtesy of Dennis Gannon, IU, and LEAD Collaboration

11 TG Usage: Gravitational Waves
Gravitational waves predicted from colliding black holes, neutron stars, supernovae
[Diagram labels: Observations, Models, Analysis & Insight, Data, Cactus]
Visualization credits: Werner Benger, Ralf Kaehler, LSU/AEI
Simulation credits: LSU/AEI relativity groups

12 TG Usage: Cosmology
The Cosmic Web
Cosmological evolution and galaxy formation using a 3D cosmological N-body gravity+hydrodynamics code, Gasoline.
[Image scale: 100 million light years]
Credits: Tom Quinn, Jeff Gardner, Univ. of Washington

13 TG usage: Biology
High resolution 3-D reconstruction of infectious viruses
Wen Jiang, Weimin Wu, Purdue University
Matthew L. Baker, Joanita Jakana and Wah Chiu, Baylor College of Medicine
Peter R. Weigele and Jonathan King, MIT
High resolution 3-D structure of virus particles provides important insights for the development of effective prevention and treatment strategies. This work used electron cryo-microscopy to demonstrate the 3-D reconstruction of the infectious bacterial virus ε15 at 4.5 Å resolution, which allowed tracing of the polypeptide backbone of its major capsid protein gp7. The structure reveals similar protein architecture to that of other tailed double-stranded DNA viruses, even in the absence of detectable sequence similarity. However, the connectivity of the secondary structure elements (topology) in gp7 is unique.
Large numbers (10^4-10^5) of 2-D images (800^2 pixels/image), representing the projections of identical 3-D structure viewed at different angles, were collected. These images require intensive computation to accurately determine their relative orientations before the 2-D images can be coherently merged into a single high resolution 3-D structure.
These results were just published in Nature (Feb 28, 2008).
Slide courtesy of Purdue and TeraGrid

14 Solve any Rubik’s Cube in 26 moves?
Rubik's Cube is perhaps the most famous combinatorial puzzle of its time
> 43 quintillion states (4.3x10^19)
Gene Cooperman and Dan Kunkle of Northeastern Univ. proved any state can be solved in 26 moves
7TB of distributed storage on TeraGrid allowed them to develop the proof
Source: http://www.physorg.com/news99843195.html
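
The state count quoted on this slide can be checked with a short calculation. The sketch below uses the standard counting argument for the 3x3x3 cube, which is not given in the slides: corners and edges can each be permuted and oriented, subject to three constraints. The function name is illustrative.

```python
from math import factorial


def rubiks_cube_states() -> int:
    """Count the reachable states of a 3x3x3 Rubik's Cube."""
    corner_positions = factorial(8)   # arrangements of the 8 corner pieces
    corner_twists = 3 ** 7            # the last corner's twist is fixed by the other 7
    edge_positions = factorial(12)    # arrangements of the 12 edge pieces
    edge_flips = 2 ** 11              # the last edge's flip is fixed by the other 11
    # Corner and edge permutations must share the same parity, which halves the total.
    return corner_positions * corner_twists * edge_positions * edge_flips // 2


print(f"{rubiks_cube_states():,}")
```

The result is about 4.3x10^19, in line with the "> 43 quintillion" figure, and gives a sense of why the proof needed terabytes of distributed storage rather than a brute-force enumeration in memory.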

15 Community Engagement through Science Gateways
Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
–Resources
–Users – from expert to K-12
–Software stacks, policies
Three common forms:
–Web portal with users in front and services in back
–Client server model where application programs run on users' machines (i.e. desktops) and access services
–Bridges across multiple grids, allowing communities to utilize both community developed grids and shared grids
Science Gateways
–Provide “TeraGrid Inside” capabilities
–Leverage community investment
Slide courtesy of Nancy Wilkins-Diehr

16 Current Science Gateways
Biology and Biomedicine Science Gateway
Open Life Sciences Gateway
The Telescience Project
Grid Analysis Environment (GAE)
Neutron Science Instrument Gateway
TeraGrid Visualization Gateway, ANL
BIRN
Open Science Grid (OSG)
Special PRiority and Urgent Computing Environment (SPRUCE)
National Virtual Observatory (NVO)
Linked Environments for Atmospheric Discovery (LEAD)
Computational Chemistry Grid (GridChem)
Computational Science and Engineering Online (CSE-Online)
GEON (GEOsciences Network)
Network for Earthquake Engineering Simulation (NEES)
SCEC Earthworks Project
Network for Computational Nanotechnology and nanoHUB
GIScience Gateway (GISolve)
Gridblast Bioinformatics Gateway
Earth Systems Grid
Astrophysical Data Repository (Cornell)
Slide courtesy of Nancy Wilkins-Diehr

17 SGW Highlight: National Virtual Observatory - Facilitating Scientific Discovery
Access to telescope images from around the world
NVO provides access to combined sky surveys
–Different views of the same cosmological phenomenon can reveal new insights
New science enabled by enhancing access to data and computing resources
–Data correlation
–Understanding of physical processes
–Identification of new phenomena
NVO is a set of tools used to exploit the data avalanche

18 SGW Highlight: Linked Environments for Atmospheric Discovery (LEAD)
Providing tools that are needed to make accurate predictions of tornadoes and hurricanes
Meteorological data
Forecast models
Analysis and visualization tools
Data exploration and Grid workflow
Slide courtesy of Nancy Wilkins-Diehr

19 SGW Highlight: GridChem’s Client-Server Approach Provides Power and a Rich Feature Set
Slide courtesy of Sudhakar Pamidighantam, NCSA

20 Gateways
[Chart: # of Gateway Jobs per month, Jan-07 through Dec-07; vertical axis 0 to 140,000]
Nearly 500k gateway jobs in CY2007
GridChem: 192k jobs, >210k TG SUs
CIGportal: 94k jobs, >154k TG SUs
LEAD: 40k jobs, >54k TG SUs
Slide courtesy of Nancy Wilkins-Diehr

21 TG New Large Resources
Ranger@TACC
–First NSF ‘Track2’ HPC system
–504 TFlops
–15,744 Quad-Core AMD Opteron processors
–123 TB memory, 1.7 PB disk
Kraken@NICS (UT/ORNL)
–Second NSF ‘Track2’ HPC system
–170 TFlops Cray XT4 system
–Will be upgraded to Cray XT5 at nearly 1 PFlops
  10,000+ compute sockets
  100 TB memory, 2.3 PB disk

22 So how do I get on the TeraGrid?
The best thing to do: Talk to your local TeraGrid “Campus Champion” (for UW, that’s me)
Campus Champion can:
–Direct you to the most appropriate TeraGrid platforms
–Give you an experimental TeraGrid account
–Help you write proposals to acquire TeraGrid time

23 TeraGrid Resource Allocations
Every TeraGrid award of time is either:
–System-specific (“Type S”): Time is awarded for a specific system, e.g. PSC Cray XT3.
–TeraGrid Roaming (“Type R”): Time is awarded that can be used on any* TeraGrid system.
TeraGrid time is awarded in “Service Units” or “SUs”. SUs correspond roughly to CPU-hours:
–For system-specific awards, 1 SU = 1 CPU-hour on that machine.
–For TeraGrid Roaming awards, 1 SU = 1 CPU-hour on a 1.5GHz Itanium2 system and will be converted for whichever machine you actually run on.
*With a few exceptions
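
As a rough illustration of the roaming conversion described on this slide, the sketch below charges a job in roaming SUs by scaling its CPU-hours by a per-machine factor. The slides do not give the actual factors, so the table entries (other than the 1.5GHz Itanium2 reference, which is 1.0 by definition) and all names are hypothetical.

```python
# Roaming SUs charged per CPU-hour on each machine.
# Only the reference entry follows from the slide; the other value is made up
# purely for illustration and is NOT a real TeraGrid conversion factor.
ROAMING_SUS_PER_CPU_HOUR = {
    "itanium2_1.5ghz": 1.0,       # reference system: 1 SU = 1 CPU-hour
    "example_fast_system": 1.6,   # hypothetical faster machine
}


def roaming_sus_charged(system: str, cpus: int, wallclock_hours: float) -> float:
    """Roaming SUs consumed by a job using `cpus` CPUs for `wallclock_hours`."""
    cpu_hours = cpus * wallclock_hours
    return cpu_hours * ROAMING_SUS_PER_CPU_HOUR[system]


# A 64-CPU, 10-hour job on the reference system costs exactly its CPU-hours.
print(roaming_sus_charged("itanium2_1.5ghz", 64, 10))
```

For a system-specific ("Type S") award no table is needed, since 1 SU is simply 1 CPU-hour on the awarded machine.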

24 TeraGrid Resource Allocations
The easiest type of allocation to get is a “Development Allocation” or “DAC”*:
–Currently DACs are 30,000 SUs
–Submit a single page (i.e. no more than 3 paragraphs) description of your research and your goals for trying TeraGrid.
–Development applications are reviewed and awarded continuously
–You will be up and running within a few weeks.
*DAC = “Development Allocation Committee”

25 POPS - Allocations
POPS is the on-line system used for the allocations process (pops-submit.teragrid.org)
–Allocation Requests
–Peer reviews
–Usage information
pops.teragrid.org for now – also accessible from the TeraGrid user portal (portal.teragrid.org)

26 Allocation Process
Types of Requests
–DAC (up to 30k SUs, continually reviewed)
  DACs on Ranger can be larger, same for other new large resources (Kraken, Track2c, etc.)
–MRAC (up to 500k SUs, reviewed quarterly)
–LRAC (over 500k SUs, reviewed semi-annually)
–Can apply for compute, data, support resources
–Also, there are community accounts…
Awards
–Most awards are granted in full!
–For one or more 12 month periods – can be renewed
–Can rebut reviewers who reject or cut award
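
The size thresholds on this slide amount to a simple lookup. The sketch below maps a requested number of SUs to the request type and its review schedule; the function name is illustrative, and it deliberately ignores the exception noted above that DACs on Ranger and other new large resources can exceed 30k SUs.

```python
def request_type(requested_sus: int) -> tuple[str, str]:
    """Return (request type, review schedule) for a compute request of this size.

    Thresholds are taken from the slide; the larger DACs available on
    Ranger and similar new systems are not modeled here.
    """
    if requested_sus <= 0:
        raise ValueError("requested_sus must be positive")
    if requested_sus <= 30_000:
        return ("DAC", "continually reviewed")
    if requested_sus <= 500_000:
        return ("MRAC", "reviewed quarterly")
    return ("LRAC", "reviewed semi-annually")


for sus in (10_000, 200_000, 2_000_000):
    print(sus, request_type(sus))
```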

27 How You Can Use TeraGrid
[Diagram labels: POPS (for now), Science Gateways, User Portal, Command Line; TeraGrid Infrastructure (Accounting, Network, Authorization,…); Site 1, Site 2, Site 3; Compute Service, Viz Service, Data Service, Network, Accounting, …]
Slide courtesy of Dane Skow and Craig Stewart

28 User Portal: portal.teragrid.org

29 Access to resources
Terminal: ssh, gsissh
Portal: TeraGrid user portal, Gateways
–Once logged in to portal, click on “Login”

30 User Portal – Compute/Viz Resources

31 User Portal – Other Resources

32 User Portal – Other Information
Knowledge Base for quick answers to technical questions
Documentation
Science Highlights
News and press releases
Education, outreach and training events and resources

33 Data Storage Resources
GPFS-WAN
–700 TB disk storage at SDSC, accessible from machines at NCAR, NCSA, SDSC, ANL
Data Capacitor
–535 TB storage at IU, including databases
Data Collections
–Storage at SDSC (files, databases) for collections used by communities
Tape Storage
–Available at IU, NCAR, NCSA, SDSC, PSC
Access is generally through GridFTP
Typical data transfer speeds are 100 MB/s!
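
To get a feel for what the quoted 100 MB/s means in practice, the sketch below estimates how long a transfer of a given size would take. It assumes a sustained rate with no protocol overhead and decimal units (1 TB = 10^6 MB); the function name and these assumptions are not from the slides.

```python
def transfer_time_hours(size_tb: float, rate_mb_per_s: float = 100.0) -> float:
    """Hours needed to move `size_tb` terabytes at a sustained `rate_mb_per_s`."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    size_mb = size_tb * 1_000_000      # decimal units: 1 TB = 10^6 MB
    seconds = size_mb / rate_mb_per_s
    return seconds / 3600


# Time to move one terabyte at the typical rate quoted on the slide.
print(f"{transfer_time_hours(1.0):.2f} hours")
```

At that rate a single terabyte takes a little under three hours, so moving data sets on the scale of hundreds of terabytes is a matter of weeks and is worth planning around.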

34 Conclusions
TeraGrid is not a secret government agency
Just a collection of universities working together, funded by NSF
Currently, an abundance of cycles available
Talk to current users/participants or your Campus Champion for help with proposals and other information

