TeraGrid Resources Enabling Scientific Discovery Through Cyberinfrastructure (CI) Diane Baxter, Ph.D. San Diego Supercomputer Center University of California, San Diego

Presentation transcript:

TeraGrid Resources Enabling Scientific Discovery Through Cyberinfrastructure (CI) Diane Baxter, Ph.D. San Diego Supercomputer Center University of California, San Diego

To clarify – “Cyberinfrastructure” is a coordinated set of hardware, software, and services, all integrated and working together. “CI” encompasses networks, computers, data, sensors, handheld devices, other technologies, and the services or human “glue” that holds them all together. [Diagram: networks, computers, data storage, visualization, sensors, field instruments, and wireless devices combined – the “computer” as an integrated set of resources]

TeraGrid National Research Cyberinfrastructure includes: computing systems, data storage systems and data repositories, visualization environments, and people, all linked together by high-performance networks.

TeraGrid...
Is an open scientific discovery infrastructure
Provides leadership-class resources at 11 partner sites
Is an integrated, persistent computational resource
Is the world's largest, most comprehensive distributed cyberinfrastructure for open scientific research.

The National TeraGrid
[Map of partner sites – SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Caltech, USC/ISI, UNC/RENCI, UW, LSU, U Tenn. – with legend: Resource Provider (RP), Software Integration Partner, Grid Infrastructure Group (UChicago)]

A complex collaboration of over a dozen organizations working together to provide cyberinfrastructure that goes beyond what can be provided by individual institutions, to improve research productivity and enable breakthroughs not otherwise possible.

TeraGrid...
Uses high-performance network connections (10-30 Gb/sec)
Integrates high-performance computers; data resources for analysis, visualization, and storage; data collection tools; high-end experimental facilities; and supporting expertise around the country
Provides more than a petaflop of computing capability
Consists of more than 30 petabytes of online and archival data storage, as well as systems to manage data acquisition and access
Provides researchers access to over 100 discipline-specific databases.

What’s in it (TeraGrid) for me?
An instrument that delivers high-end IT resources - computation, storage, visualization, and data services
– A computational facility: over a PetaFLOP of parallel computing capability
– A data storage and management facility: over 30 PetaBytes of storage (disk and tape) and over 100 scientific data collections
– A high-bandwidth national data network
Services: help desk and consulting, Advanced Support for TeraGrid Applications (ASTA), education and training events and resources
Access - without financial cost
– Research accounts allocated via peer review
– Startup and Education accounts granted automatically

TeraGrid Compute Power
Computational Resources (size approximate - not to scale)
[Chart of compute capacity by site – SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Tennessee, LONI/LSU – showing growth from 504 TF in 2007 to ~1 PF in 2009]
Slide courtesy Tommy Minyard, TACC

TG Data Storage and Management, Part 1 (Tape)
TeraGrid provides persistent storage on disk and tape
Backups of critical data are stored remote from your home site
Allocatable tape-based storage systems:
– IU (Indiana University) - geographically distributed
– NCAR (National Center for Atmospheric Research) - also supports dual copy
– NCSA (National Center for Supercomputing Applications)
– SDSC (San Diego Supercomputer Center)
Note: In addition, most sites have massive data storage systems that provide storage in support of computation
Command line usage is reasonably straightforward with GridFTP, and very easy with the File Manager tool in the TeraGrid User Portal (see the sketch below)
©Trustees of Indiana University. May be reused so long as IU and TeraGrid logos remain, and any modifications to original are noted. Courtesy Craig A. Stewart, IU
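To make the command-line path concrete, here is a minimal sketch of staging a results file to a remote archive with the GridFTP command-line client, driven from Python. The endpoint hostname, paths, and parallelism setting are illustrative assumptions rather than an official TeraGrid recipe, and a valid grid credential is assumed to already be in place.

```python
# Minimal sketch: copy a local results file to a remote GridFTP endpoint.
# Hostname and paths below are hypothetical placeholders; adjust for your site.
import subprocess

source = "file:///home/username/results/run42.tar"
dest = "gsiftp://gridftp-archive.example.teragrid.org/archive/username/run42.tar"

# globus-url-copy is the standard GridFTP client; "-p 4" requests 4 parallel
# data streams, which usually improves throughput on wide-area transfers.
subprocess.run(["globus-url-copy", "-p", "4", source, dest], check=True)
print("Transfer complete:", dest)
```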

Data Storage and Management, Part 2 (Disk)
GPFS-WAN (General Parallel File System - Wide Area Network), ~1 petabyte
– Home is the San Diego Supercomputer Center; may be accessed as if it were a local file system from NCAR, NCSA, IU, UC/ANL
IU Data Capacitor
– 1 petabyte of spinning disk
– Primarily for short-term storage of data
Long-term disk storage allocations
– Indiana University, National Center for Supercomputing Applications, San Diego Supercomputer Center
©Trustees of Indiana University. May be reused so long as IU and TeraGrid logos remain, and any modifications to original are noted. Courtesy Craig A. Stewart, IU

TeraGrid Architecture
[Diagram: users reach the compute, visualization, and data services offered by each Resource Provider (RP 1, RP 2, RP 3) through Science Gateways, the TeraGrid User Portal, the command line, and POPS, all built on shared TeraGrid infrastructure (network, authorization, accounting, ...)]

????? Translation please!

Enter: Science Gateways
A Science Gateway
– Enables scientific communities of users with a common scientific goal
– Has a common interface
– Leverages community investment
Three common forms:
– Web-based portals
– Application programs running on users' machines but accessing services in TeraGrid (a minimal sketch follows below)
– Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.
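As an illustration of the second form, the sketch below shows how a desktop application might hand a job to a gateway's web service instead of logging into TeraGrid directly. The URL, JSON fields, and token are hypothetical placeholders invented for illustration, not a real gateway API.

```python
# Hypothetical sketch of a desktop tool submitting work through a science gateway.
# The endpoint, field names, and credential are invented placeholders.
import requests

job = {
    "application": "charmm",       # community code the gateway hosts
    "input_file": "membrane.inp",  # user-supplied input, uploaded separately
    "cpu_hours": 500,              # request drawn from the gateway's community allocation
}

resp = requests.post(
    "https://gateway.example.org/api/jobs",  # hypothetical gateway endpoint
    json=job,
    headers={"Authorization": "Bearer COMMUNITY-ACCOUNT-TOKEN"},  # placeholder credential
    timeout=30,
)
resp.raise_for_status()
print("Job accepted, id:", resp.json().get("id"))
```

The point of the pattern is that the gateway, not the end user, holds the TeraGrid allocation and credentials, so community members can run jobs without individually requesting accounts.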

Today, there are approximately 29 gateways using the TeraGrid

How do Gateways help?
Gateways make science more productive
– Researchers use the same tools
– Complex workflows
– Common data formats
– Data sharing
Gateways bring TeraGrid capabilities to the broad science community
– Lots of disk space
– Lots of compute resources
– Powerful analysis capabilities
– A community-friendly interface to information and research tools

But it’s not just ease of use. What can scientists do that they couldn’t do previously?
– LEAD: access to radar data
– NVO: access to sky surveys
– OOI: access to sensor data
– PolarGrid: access to polar ice sheet data
– SIDGrid: analysis tools
– GridChem: developing multiscale coupling
How would this have been done before gateways?

Gateways can further investments in other projects
Increase access
– To instruments
Increase capabilities
– To data analysis tools
Improve workforce development
– For underserved populations, through broad access to learning resources
Increase outreach
Increase public awareness
– Public sees value in investments in large facilities
Slice bread

Gateways Greatly Expand Access
Almost anyone can investigate scientific questions using high-end resources
– Not just those in the research groups of those who request allocations
– Gateways allow anyone with a web browser to explore
Fosters new ideas and cross-disciplinary approaches
Encourages students to experiment
But gateways are used in production too
– Significant number of papers resulting from gateways, including GridChem and nanoHUB
– Scientists can focus on challenging science problems rather than challenging infrastructure problems

Advanced Support for Gateway Development
Same peer review process used to request resources
– 30,000 CPUs
– Plus 6 months of help from a TG Gateway Team member
– Reviews based on appropriate use of resources; science is not reviewed if already funded
[Advanced support areas: Petascale, Multisite workflows, Gateways, Domain expertise]

Support is Very Targeted
Start with well-defined objectives
– Focus on efficient or novel use of national CI resources
Minimum 0.25 FTE for months to a year
– Enough investment to really understand and help solve complex problems
Must have commitment from PIs
– Want to make sure work is incorporated into production codes and gateways
Good candidates for targeted support include:
– Large, high-impact projects
– Ability to influence new communities
– Suggestions from NSF directorates on important projects
Lessons learned move into training and documentation

When might a gateway be most appropriate?
Researchers using defined sets of tools in different ways
– Same executables, different input (GridChem, CHARMM)
– Creating multi-scale or complex workflows
– Shared datasets
Common data formats
– National Virtual Observatory
– Earth System Grid
– Some groups have invested significant efforts here (caBIG: extensive discussions to develop common terminology and formats; BIRN: extensive data sharing agreements)
Difficult-to-access data / advanced workflows
– Sensor/radar input (LEAD, GEON)

TeraGrid Pathways Activities
Two gateway components:
– Adapt gateways for educational use by underrepresented communities (GEON - SDSC, Navajo Tech)
– Teach participants from underrepresented communities how to build gateways (PolarGrid - IU, ECSU)

Navajo Technical College and gateways
Incorporating the use of gateways in their curricula
GEON and GISolve are areas of initial interest

What you can do with the TeraGrid: Simulation of cell membrane processes
Work by Emad Tajkhorshid and James Gumbart of the University of Illinois at Urbana-Champaign
– Mechanics of Force Propagation in TonB-Dependent Outer Membrane Transport. Biophysical Journal 93: (2007).
– Results of the simulation may be seen at 2.5Ans.mpg
Modeled mechanisms for transport of molecules through the cell membrane
Used 400,000 CPU hours [45 processor-years] on systems at the National Center for Supercomputing Applications, IU, and the Pittsburgh Supercomputing Center
Image courtesy of Emad Tajkhorshid, UIUC

Predicting storms
Hurricanes and tornadoes cause massive loss of life and damage to property
TeraGrid supported the spring 2007 NOAA and University of Oklahoma Hazardous Weather Testbed
– Major goal: assess how well ensemble forecasting predicts thunderstorms, including supercells and tornadoes
– Delivers “better than real time” prediction
– Used 675,000 CPU hours for the season
– Used 312 TB of HPSS storage at PSC
Slide courtesy of Dennis Gannon, IU, and the LEAD Collaboration

PolarGrid
Cyberinfrastructure Center for Polar Science (CICPS)
– Experts in polar science, remote sensing, and cyberinfrastructure
– Indiana, ECSU, CReSIS
Satellite observations show disintegration of ice shelves in West Antarctica and speed-up of several glaciers in southern Greenland
– Most existing ice sheet models, including those used by the IPCC, cannot explain the rapid changes
Video: olargrid/images/4/42/C0050-polargrid-big.m4v
Source: Geoffrey Fox

Components of PolarGrid
– Expedition grid consisting of ruggedized laptops in a field grid linked to a low-power multi-core base camp cluster
– Prototype and two production expedition grids feed into a 17 Teraflops “lower 48” system at Indiana University and Elizabeth City State University (ECSU), split between research, education, and training
– Gives ECSU a top-ranked 5 Teraflop MSI high-performance computing system
Access to expensive data
High-end resources for analysis
MSI student involvement
Source: Geoffrey Fox

Recent Gateways Using TeraGrid Significantly: SCEC, SIDGrid, CIG