TeraGrid Science Gateways Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways EGU, Vienna, May 3, 2010.


TeraGrid is one of the largest investments in shared cyberinfrastructure (CI) from NSF's Office of Cyberinfrastructure.

TeraGrid resources today include:
Tightly coupled distributed memory systems (2 systems in the top 10 at top500.org)
–Kraken (NICS): Cray XT5, 99,072 cores, 1.03 Pflops
–Ranger (TACC): Sun Constellation, 62,976 cores, 579 Tflops, 123 TB RAM
Shared memory systems
–Cobalt (NCSA): Altix, 8 Tflops, 3 TB shared memory
–Pople (PSC): Altix, 5 Tflops, 1.5 TB shared memory
Clusters with InfiniBand
–Abe (NCSA): 90 Tflops
–Lonestar (TACC): 61 Tflops
–QueenBee (LONI): 51 Tflops
Condor pool (loosely coupled)
–Purdue: up to 22,000 CPUs
Gateway hosting
–Quarry (IU): virtual machine support
Visualization resources
–TeraDRE (Purdue): 48-node NVIDIA GPU cluster
–Spur (TACC): 32 NVIDIA GPUs
Storage resources
–GPFS-WAN (SDSC)
–Lustre-WAN (IU)
–Various archival resources
New systems:
–Data analysis and visualization: Longhorn (TACC), Dell/NVIDIA, CPU and GPU; Nautilus (NICS), SGI UltraViolet, 1,024 cores, 4 TB global shared memory
–Data-intensive computing: Dash (SDSC), Intel Nehalem, 544 processors, 4 TB flash memory
–FutureGrid: experimental computing grid and cloud test-bed to tackle research challenges in computer science
–Keeneland: experimental high-performance computing system with NVIDIA Tesla accelerators
Source: Dan Katz, U Chicago

So how did the Gateway program develop?
A natural result of the impact of the internet on worldwide communication and information retrieval; the implications for the conduct of science are still evolving:
–1980s: early gateways; the National Center for Biotechnology Information BLAST server, with search results sent by email, is still a working portal today
–1989: World Wide Web developed at CERN
–1992: Mosaic web browser developed
–1995: "International Protein Data Bank Enhanced by Computer Browser"
–2004: TeraGrid project director Rick Stevens recognized the growth in scientific portal development and proposed the Science Gateway Program
–Today: Web 3.0 and programmatic exchange of data between web pages
Simultaneous explosion of digital information:
–Growing analysis needs in many, many scientific areas
–Sensors, telescopes, satellites, digital images, video, genomic sequencers
–The #1 machine on today's Top500 list is over 1,000x more powerful than all entries on the first (1993) list combined
Only 18 years since the release of Mosaic!

A VT100 terminal in the 1980s and a login window on Ranger today

Why are gateways worth the effort?
An increasing range of expertise is needed to tackle the most challenging scientific problems:
–How many details do you want each individual scientist to need to know? PBS, RSL, Condor
–Coupling multi-scale codes
–Assembling data from multiple sources
–Collaboration frameworks

The same MCell job expressed three ways: as a PBS batch script, as Globus RSL, and as a Condor-G submit file.

PBS batch script:
#!/bin/sh
#PBS -q dque
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:02:00
#PBS -o pbs.out
#PBS -e pbs.err
#PBS -V
cd /users/wilkinsn/tutorial/exercise_3
../bin/mcell nmj_recon.main.mdl

Globus RSL:
+( &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
   (executable="/users/birnbaum/tutorial/bin/mcell")
   (arguments=nmj_recon.main.mdl)
   (count=128)
   (hostCount=10)
   (maxtime=2)
   (directory="/users/birnbaum/tutorial/exercise_3")
   (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
   (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
)

Condor-G submit file:
# Full path to executable
executable=/users/wilkinsn/tutorial/bin/mcell
# Working directory, where Condor-G will write
# its output and error files on the local machine.
initialdir=/users/wilkinsn/tutorial/exercise_3
# To set the working directory of the remote job, we
# specify it in this Globus RSL, which will be appended
# to the RSL that Condor-G generates.
globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
# Arguments to pass to the executable.
arguments=nmj_recon.main.mdl
# Condor-G can stage the executable.
transfer_executable=false
# Specify the Globus resource to execute the job.
globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs
# Condor has multiple universes, but Condor-G always uses globus.
universe=globus
# Files to receive stdout and stderr.
output=condor.out
error=condor.err
# Specify the number of copies of the job to submit to the condor queue.
queue 1
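
The point of a gateway is that none of the syntax above ever reaches the scientist. A minimal sketch of the idea in Python, assuming a hypothetical gateway service with a generate-and-submit helper (the function name, defaults, and paths are illustrative, not an actual TeraGrid API):

import subprocess
from textwrap import dedent

def submit_mcell_job(input_file, work_dir, walltime="00:02:00", nodes=1, ppn=2):
    """Render a PBS script from a few science-level choices and hand it to qsub.

    A gateway web form would collect only input_file and walltime;
    everything else is a gateway-managed default.
    """
    script = dedent(f"""\
        #!/bin/sh
        #PBS -q dque
        #PBS -l nodes={nodes}:ppn={ppn}
        #PBS -l walltime={walltime}
        #PBS -o pbs.out
        #PBS -e pbs.err
        #PBS -V
        cd {work_dir}
        ../bin/mcell {input_file}
        """)
    script_path = f"{work_dir}/job.pbs"
    with open(script_path, "w") as f:
        f.write(script)
    # qsub prints the new job identifier on success
    result = subprocess.run(["qsub", script_path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# A gateway form handler would then call something like:
# job_id = submit_mcell_job("nmj_recon.main.mdl", "/users/wilkinsn/tutorial/exercise_3")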

Today, there are approximately 35 gateways using the TeraGrid.

Linked Environments for Atmospheric Discovery (LEAD)
Provides the tools needed to make accurate predictions of tornados and hurricanes:
–Meteorological data
–Forecast models
–Analysis and visualization tools
–Data exploration and grid workflow
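
LEAD's value comes from chaining these pieces into a workflow. A toy sketch of that pattern in plain Python (this is not the actual LEAD workflow tooling; the step names are invented for illustration):

# Toy workflow runner: each step names the steps it depends on.
def ingest_observations():
    print("staging radar and surface observations")

def run_wrf_forecast():
    print("running the WRF forecast model on an HPC resource")

def visualize_results():
    print("rendering forecast fields for the browser (e.g., with IDV)")

STEPS = {
    "ingest": ([], ingest_observations),
    "forecast": (["ingest"], run_wrf_forecast),
    "visualize": (["forecast"], visualize_results),
}

def run_workflow(steps):
    done = set()
    while len(done) < len(steps):
        for name, (deps, action) in steps.items():
            # Run a step once all of its dependencies have completed.
            if name not in done and all(d in done for d in deps):
                action()
                done.add(name)

run_workflow(STEPS)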

Highlights: LEAD Inspires Students
Advanced capabilities regardless of location. A student gets excited about what he was able to do with LEAD:
"Dr. Sikora: Attached is a display of 2-m T and wind depicting the WRF's interpretation of the coastal front on 14 February. It's interesting that I found an example using IDV that parallels our discussion of mesoscale boundaries in class. It illustrates very nicely the transition to a coastal low and the strong baroclinic zone with a location very similar to Markowski's depiction. I created this image in IDV after running a 5-km WRF run (initialized with NAM output) via the LEAD Portal. This simple 1-level plot is just a precursor of the many capabilities IDV will eventually offer to visualize high-res WRF output. Enjoy! Eric" (email, March 2007)

Center for Multiscale Modeling of Atmospheric Processes (CMMAP)
Improving the representation of cloud processes in climate models:
–The multi-scale modeling framework (MMF) represents cloud processes on their native scales and includes cloud-scale interactions in the simulated physical and chemical processes
–Results can be evaluated by comparing simulated and observed cloud-scale processes
–Gateway is in development, but will include simulation on both NSF and DOE resources as well as a digital library

Community Climate System Model (CCSM)
Makes a world-leading, fully coupled climate model easier to use and available to a wide audience:
–Compose, configure, and submit CCSM simulations to the TeraGrid

The Earth System Grid (ESG)
Integrates supercomputers with large-scale data and analysis servers to create a powerful environment for next-generation climate research:
–Results of long-running, high-resolution (read: expensive) simulations made available to national researchers
–A key piece of infrastructure: 12,000 registered users and over 250 TB of data collections

This year in TeraGrid: two great tastes that taste great together
Environmental Science Gateway built on ESG and CCSM
–Long-term vision: a semantically enabled environment that includes modeling, simulated and observed data holdings, and visualization and analysis for climate as well as related domains
–Short-term objectives (by July 2010):
»Extend the Earth System Grid-Curator (ESGC) Science Gateway so that CCSM runs can be initiated on the TeraGrid
»Integrate data publishing and wide-area transport capabilities such that model run datasets and metadata may be published back into ESG, from both the ESGC and the CCSM Portal
»Investigate interface strategies that allow ESG to federate with Purdue's climate model archives, such that Purdue holdings become visible and accessible as part of the ESG collections
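
A rough sketch of the "publish datasets and metadata back into ESG" step, reduced to its simplest form (Python with the requests library; the endpoint URL and record fields are hypothetical placeholders, not the actual ESG publishing API):

import requests  # pip install requests

# Hypothetical catalog endpoint; the real ESG publisher has its own tooling.
CATALOG_URL = "https://esg.example.org/api/publish"

def publish_run(run_id, data_url, variables, model="CCSM"):
    """POST a minimal metadata record describing a finished model run."""
    record = {
        "id": run_id,
        "model": model,
        "data_url": data_url,   # where the wide-area transfer placed the output
        "variables": variables, # e.g., ["tas", "pr"]
        "format": "netCDF",
    }
    response = requests.post(CATALOG_URL, json=record, timeout=30)
    response.raise_for_status()
    return response.json()

# publish_run("ccsm-run-001",
#             "gsiftp://dtn.example.org/runs/ccsm-run-001/",
#             ["tas", "pr", "psl"])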

PolarGrid brings CI to polar ice sheet measurement
Cyberinfrastructure Center for Polar Science (CICPS):
–Experts in polar science, remote sensing, and cyberinfrastructure
–Indiana, ECSU, CReSIS
Satellite observations show disintegration of ice shelves in West Antarctica and speed-up of several glaciers in southern Greenland:
–Most existing ice sheet models, including those used by the IPCC, cannot explain the rapid changes

Components of PolarGrid
–Expedition grid consisting of ruggedized laptops in a field grid linked to a low-power multi-core base camp cluster
–A prototype and two production expedition grids feed into a 17 Teraflops system at Indiana University and Elizabeth City State University (ECSU), split between research, education, and training
–Gives ECSU (a minority-serving institution) a top-ranked 5 Teraflops high-performance computing system
Access to expensive data
TeraGrid resources for analysis
–Large level 0 and level 1 data sets require once-and-done processing and storage
–Filters applied to level 1 data by users in real time via the web
Student involvement
Source: Geoffrey Fox
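
As an illustration of "filters applied to level 1 data in real time via the web", a minimal sketch of the server-side piece (plain Python with NumPy; the filter choice and array layout are assumptions for illustration, not the actual PolarGrid processing chain):

import numpy as np

def smooth_trace(level1_trace, window=11):
    """Apply a simple moving-average filter along one radar trace.

    level1_trace: 1-D array of echo power samples.
    window: number of samples to average; larger values smooth more.
    A gateway would expose `window` as a form field and return the
    result to the browser as an image or array.
    """
    kernel = np.ones(window) / window
    # mode="same" keeps the output the same length as the input trace
    return np.convolve(level1_trace, kernel, mode="same")

# Example: smooth a synthetic noisy trace
trace = np.cos(np.linspace(0, 6, 500)) + 0.3 * np.random.randn(500)
smoothed = smooth_trace(trace, window=21)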

Southern California Earthquake Center (SCEC)
Gateway used to produce a realistic hazard map: a Probabilistic Seismic Hazard Analysis (PSHA) map for California
–Created from Earthquake Rupture Forecasts (ERFs); ~7,000 ruptures can have 415,000 variations
–Warm colors indicate regions with a high probability of experiencing strong ground motion in the next 50 years
–Ground motion calculated using full 3-D waveform modeling for improved accuracy, which results in significant CPU use
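
A worked example of the probability arithmetic behind such a map: under the usual Poisson assumption, the chance of at least one exceedance of a ground-motion level in T years, given an annual exceedance rate lambda, is P = 1 - exp(-lambda * T). The numbers below are illustrative, not SCEC results:

import math

def prob_of_exceedance(annual_rate, years=50):
    """Poisson probability of at least one exceedance in `years`."""
    return 1.0 - math.exp(-annual_rate * years)

# Example: an annual exceedance rate of 1/475 (roughly a 475-year
# return period) gives about a 10% chance of exceedance in 50 years.
print(prob_of_exceedance(1.0 / 475.0, 50))   # ~0.10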

SCEC: Why a gateway?
Calculations must be done for each of the hundreds of thousands of rupture variations:
–SCEC has developed the "CyberShake computational platform": hardware, software, and people that combine to produce a useful scientific result
–For each site of interest: two large-scale MPI calculations and hundreds of thousands of independent post-processing jobs with significant data generation
»Jobs are aggregated to appear as a single job to the TeraGrid
»Workflow throughput optimizations and use of SCEC's gateway "platform" reduced time to solution by a factor of three
–Computationally intensive tasks, plus the need for reduced time to solution, make TeraGrid a good fit
Source: S. Callaghan et al., "Reducing Time-to-Solution Using Distributed High-Throughput Mega-Workflows: Experiences from SCEC CyberShake"
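
The "aggregated to appear as a single job" idea is the key trick for high-throughput work. A toy sketch of task bundling in plain Python (the real CyberShake platform uses workflow tools rather than this hand-rolled loop; sizes and the placeholder task are invented for illustration):

from concurrent.futures import ProcessPoolExecutor

def post_process(rupture_variation_id):
    """Stand-in for one small post-processing task (seismogram -> intensity)."""
    return rupture_variation_id % 97  # placeholder computation

def run_bundle(task_ids):
    """Run a bundle of small tasks inside one scheduler job.

    To the batch system this whole function is a single job; internally
    it fans the work out across the cores of the allocated node.
    """
    with ProcessPoolExecutor() as pool:
        return list(pool.map(post_process, task_ids))

if __name__ == "__main__":
    # Bundle 100,000 tiny tasks into chunks of 10,000, one chunk per batch job.
    all_tasks = range(100_000)
    bundles = [list(all_tasks[i:i + 10_000]) for i in range(0, 100_000, 10_000)]
    # In production each bundle would be submitted as its own batch job;
    # here we just run the first bundle locally.
    results = run_bundle(bundles[0])
    print(len(results), "tasks completed in one aggregated job")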

Future Technical Areas
Web technologies change fast; gateways must be able to adapt quickly
Gateways and gadgets:
–Gateway components can be incorporated into any social networking page
–75% of 18- to 24-year-olds use social networking websites
–iPhone apps?
Web 3.0:
–Beyond social networking and sharing content
–Standards and querying interfaces to programmatically share data across sites: Resource Description Framework (RDF), SPARQL
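
For a flavor of the RDF/SPARQL style of programmatic data sharing, a small query against a public SPARQL endpoint (Python with the SPARQLWrapper package; DBpedia and the TeraGrid resource URI are used here only as a convenient public example, not a TeraGrid service):

from SPARQLWrapper import SPARQLWrapper, JSON  # pip install SPARQLWrapper

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/TeraGrid> rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
""")
endpoint.setReturnFormat(JSON)

# Any SPARQL endpoint is queried the same way; only the URL and query change.
results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])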

Gateways can further investments in other projects
–Increase access to instruments and expensive data collections
–Increase capabilities to analyze data
–Improve workforce development: prepare students to function in today's cross-disciplinary world
–Increase outreach
–Increase public awareness
»The public sees value in investments in large facilities
»A 2006 Pew study indicates that half of all internet users have been to a site specializing in science
»Those who seek out science information on the internet are more likely to believe that scientific pursuits have a positive impact on society

But sustained funding is a problem
Gateways can be used for the most challenging problems, but scientists won't rely on something they are not confident will be around for the duration:
–We see this with software, but even more so with gateway infrastructure
A sustained gateway program can:
–Reduce duplication of effort (sporadic development across many small programs)
–Increase the diversity of end users
–Increase the skill-set diversity of developers
–Bring together teams to address the toughest problems

Thank you for your attention! Questions? Nancy Wilkins-Diehr