TeraGrid Science Gateways Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways Computing in Atmospheric Sciences, September 13-16, 2009

Presentation transcript:

TeraGrid Science Gateways Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways Computing in Atmospheric Sciences, September 13-16, 2009

Has anyone heard about TeraGrid or the gateway program? Today I hope to answer:
– What is the TeraGrid?
– What are gateways? Why are gateways worth the effort? What do they allow scientists to do that they couldn't do without gateways? What are some specific examples of this? Why are these examples important?
– What about atmospheric sciences? Should I use an existing gateway? Should I develop my own?
Meeting the needs of next generation scientists.

What is the TeraGrid? A unique combination of fundamental CI components.

What is the TeraGrid? One of the world's largest grids for open scientific discovery. Leadership-class resources at eleven partner sites combined to create an integrated, persistent computational resource:
– High-performance networks
– High-performance computers (>1 Pflops (~100,000 cores) -> 1.75 Pflops), and a Condor pool (w/ ~25,000 CPUs)
– Visualization systems
– Data collections (>30 PB, >100 discipline-specific databases)
– Science Gateways
– User portal
– User services: help desk, training, advanced application support
Allocated to US researchers and their collaborators through a national peer-review process; CPUs and advanced support available at no cost through that process. Extremely user-driven: MPI jobs, ssh or grid (GRAM) access, etc.
Adapted from Dan Katz, TeraGrid

Very Powerful Tightly Coupled Distributed Memory
– Kraken (NICS): Cray XT5, 66,048 cores, 608 Tflop, >1 Pflop in 2009, #6 top500.org
– Ranger (TACC): Sun Constellation, 62,976 cores, 579 Tflop, 123 TB RAM, #8 top500.org
Shared Memory
– Cobalt (NCSA): Altix, 8 Tflop, 3 TB shared memory
– Pople (PSC): Altix, 5 Tflop, 1.5 TB shared memory
Clusters with Infiniband
– Abe (NCSA): 90 Tflops
– Lonestar (TACC): 61 Tflops
– QueenBee (LONI): 51 Tflops
Condor Pool (Loosely Coupled)
– Purdue: 22,000 CPUs
Gateway hosting
– Quarry (IU): virtual machine support
Visualization Resources
– TeraDRE (Purdue): 48 node nVIDIA GPUs
– Spur (TACC): 32 nVIDIA GPUs
Storage Resources
– GPFS-WAN (SDSC)
– Lustre-WAN (IU)
– Various archival resources

Gateways: a natural result of the impact of the internet on worldwide communication and information retrieval. Implications for the conduct of science are still evolving:
– 1980s: early gateways; the National Center for Biotechnology Information BLAST server, search results sent by email, still a working portal today
– 1992: Mosaic web browser developed
– 1995: "International Protein Data Bank Enhanced by Computer Browser"
– 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
– Today: Web 3.0 and programmatic exchange of data between web pages
Simultaneous explosion of digital information:
– Growing analysis needs in many, many scientific areas
– Sensors, telescopes, satellites, digital images and video
– The #1 machine on the Top500 today is 1000x more powerful than all combined entries on the first list in 1993
Only 17 years since the release of Mosaic!

Access to supercomputers hasn't changed much in 20 years. But the world around them sure has!

Login window on Ranger today

Why are gateways worth the effort? An increasing range of expertise is needed to tackle the most challenging scientific problems. How many details do you want each individual scientist to need to know?
– PBS, RSL, Condor
– Coupling multi-scale codes
– Assembling data from multiple sources
– Collaboration frameworks

PBS script:

#!/bin/sh
#PBS -q dque
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:02:00
#PBS -o pbs.out
#PBS -e pbs.err
#PBS -V
cd /users/wilkinsn/tutorial/exercise_3
../bin/mcell nmj_recon.main.mdl

Globus RSL:

+( &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
   (executable="/users/birnbaum/tutorial/bin/mcell")
   (arguments=nmj_recon.main.mdl)
   (count=128)
   (hostCount=10)
   (maxtime=2)
   (directory="/users/birnbaum/tutorial/exercise_3")
   (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
   (stderr="/users/birnbaum/tutorial/exercise_3/globus.err") )

Condor-G submit file:

# Full path to executable
executable = /users/wilkinsn/tutorial/bin/mcell
# Working directory, where Condor-G will write
# its output and error files on the local machine
initialdir = /users/wilkinsn/tutorial/exercise_3
# To set the working directory of the remote job, we
# specify it in this Globus RSL, which will be appended
# to the RSL that Condor-G generates
globusrsl = (directory='/users/wilkinsn/tutorial/exercise_3')
# Arguments to pass to executable
arguments = nmj_recon.main.mdl
# Condor-G can stage the executable
transfer_executable = false
# Specify the Globus resource to execute the job
globusscheduler = tg-login1.sdsc.teragrid.org/jobmanager-pbs
# Condor has multiple universes, but Condor-G always uses globus
universe = globus
# Files to receive stdout and stderr
output = condor.out
error = condor.err
# Specify the number of copies of the job to submit to the condor queue
queue 1
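Hiding this scheduler boilerplate is exactly what a gateway back end does. As a minimal sketch (not actual TeraGrid gateway code; the function and defaults are hypothetical), a portal might render a PBS script from a few web-form fields so the scientist never touches batch syntax:

```python
# Sketch of a gateway back end generating a PBS submit script from
# web-form parameters. All names and defaults here are illustrative.

def render_pbs_script(executable, input_file, queue="dque",
                      nodes=1, ppn=2, walltime="00:02:00"):
    """Return a PBS batch script as a string."""
    return "\n".join([
        "#!/bin/sh",
        f"#PBS -q {queue}",
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        f"#PBS -l walltime={walltime}",
        "#PBS -o pbs.out",
        "#PBS -e pbs.err",
        "#PBS -V",
        f"{executable} {input_file}",
    ])

# A portal would write this to a file and hand it to qsub on the
# user's behalf; here we just print it.
script = render_pbs_script("../bin/mcell", "nmj_recon.main.mdl")
print(script)
```

The same form data could equally be rendered into RSL or a Condor-G submit file, which is how a gateway targets multiple back-end resources from one interface.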

Gateways democratize access to high-end resources. Almost anyone can investigate scientific questions using high-end resources:
– Not just those in the research groups of those who request allocations
– Gateways allow anyone with a web browser to explore
Opportunities can be uncovered via Google:
– My 11-year-old son discovered nanoHUB.org himself while his class was studying Bucky Balls
Foster new ideas and cross-disciplinary approaches:
– Encourage students to experiment
But used in production too:
– Significant number of papers resulting from gateways, including GridChem and nanoHUB
– Scientists can focus on challenging science problems rather than challenging infrastructure problems

Today, there are approximately 35 gateways using the TeraGrid.

Not just ease of use: what can scientists do that they couldn't do previously?
– Linked Environments for Atmospheric Discovery (LEAD): access to radar data
– National Virtual Observatory (NVO): access to sky surveys
– Ocean Observing Initiative (OOI): access to sensor data
– PolarGrid: access to polar ice sheet data
– SIDGrid: expensive datasets, analysis tools
– GridChem: coupling multiscale codes
How would this have been done before gateways?

Gateways for Atmospheric Sciences (list of all TeraGrid gateways)
– Center for Multiscale Modeling of Atmospheric Processes (CMMAP)
– Community Climate System Model (CCSM) TeraGrid Gateway
– High Resolution Daily Temperature and Precipitation Data for the Northeast United States
– Linked Environments for Atmospheric Discovery (LEAD)
– The Earth System Grid (ESG)

Center for Multiscale Modeling of Atmospheric Processes (CMMAP): improving the representation of cloud processes in climate models. The multi-scale modeling framework (MMF) represents cloud processes on their native scales and includes cloud-scale interactions in the simulated physical and chemical processes. Results can be evaluated by comparison of simulated and observed cloud-scale processes. The gateway is in development, but will include simulation on both NSF and DOE resources as well as a digital library.

Community Climate System Model (CCSM): makes a world-leading, fully coupled climate model easier to use and available to a wide audience. Compose, configure, and submit CCSM simulations to the TeraGrid.

High Resolution Daily Temperature and Precipitation Data for the Northeast United States: data developed by the Northeast Regional Climate Center (NRCC) and updated daily, accessible as an XML stream via Web Services.
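To make "accessible as an XML stream" concrete, a client could parse such a feed with a few lines of standard-library Python. The element and attribute names below are invented for illustration; they are not the actual NRCC schema:

```python
# Sketch: parsing a daily temperature/precipitation XML feed.
# The tag names and sample values are hypothetical, not NRCC's schema.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """
<observations>
  <station id="ALB">
    <day date="2009-09-13"><tmax>72</tmax><tmin>55</tmin><prcp>0.12</prcp></day>
    <day date="2009-09-14"><tmax>68</tmax><tmin>51</tmin><prcp>0.00</prcp></day>
  </station>
</observations>
"""

def daily_records(xml_text):
    """Yield (station, date, tmax, tmin, prcp) tuples from the feed."""
    root = ET.fromstring(xml_text)
    for station in root.findall("station"):
        for day in station.findall("day"):
            yield (station.get("id"), day.get("date"),
                   float(day.findtext("tmax")), float(day.findtext("tmin")),
                   float(day.findtext("prcp")))

records = list(daily_records(SAMPLE_FEED))
print(records[0])  # ('ALB', '2009-09-13', 72.0, 55.0, 0.12)
```

In practice the feed would be fetched over HTTP from the web service rather than read from an inline string.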

Linked Environments for Atmospheric Discovery (LEAD): providing the tools needed to make accurate predictions of tornados and hurricanes
– Meteorological data
– Forecast models
– Analysis and visualization tools
– Data exploration and Grid workflow

LEAD Inspires Students. A student gets excited about what he was able to do with LEAD: "Dr. Sikora: Attached is a display of 2-m T and wind depicting the WRF's interpretation of the coastal front on 14 February. It's interesting that I found an example using IDV that parallels our discussion of mesoscale boundaries in class. It illustrates very nicely the transition to a coastal low and the strong baroclinic zone with a location very similar to Markowski's depiction. I created this image in IDV after running a 5-km WRF run (initialized with NAM output) via the LEAD Portal. This simple 1-level plot is just a precursor of the many capabilities IDV will eventually offer to visualize high-res WRF output. Enjoy! Eric" (e-mail, March 2007)

The Earth System Grid (ESG) integrates supercomputers with large-scale data and analysis servers to create a powerful environment for next generation climate research.
– Results of long-running, high-resolution (read: expensive) simulations made available to national researchers
– Key piece of infrastructure: 12,000 registered users and over 250 TB of data collections

This year in TeraGrid: two great tastes that taste great together. An Environmental Science Gateway built on ESG and CCSM.
– Long-term vision: a semantically enabled environment that includes modeling, simulated and observed data holdings, and visualization and analysis for climate as well as related domains
– Short-term objectives (by July 2010):
Extend the Earth System Grid-Curator (ESGC) Science Gateway so that CCSM runs can be initiated on the TeraGrid
Integrate data publishing and wide-area transport capabilities such that model run datasets and metadata may be published back into ESG, from both the ESGC as well as the CCSM Portal
Investigate interface strategies that allow ESG to federate with Purdue's climate model archives, such that Purdue holdings become visible and accessible as part of the ESG collections

What if you want to build your own gateway?
– Request an allocation: only a 1-paragraph abstract required for up to 200k CPU hours
– Register your gateway: visibility on the public TeraGrid page
– Request a community account: run jobs for others via your portal
Staff support is available!

Tremendous opportunities using the largest shared resources, and challenges too! What's different when the resource doesn't belong just to me?
– Resource discovery
– Accounting
– Security
– Proposal-based requests for resources (peer-reviewed access): code scaling and performance numbers, justification of resources, gateway citations
Tremendous benefits at the high end, but even more work for the developers. The potential impact on science is huge:
– A small number of developers can impact thousands of scientists
– But we need a way to train and fund those developers and provide them with appropriate tools

Gateways can further investments in other projects:
– Increase access to instruments and expensive data collections (we'll see an example today)
– Increase capabilities to analyze data (we'll see an example today)
– Improve workforce development: can prepare students to function in today's cross-disciplinary world (we'll see an example today)
– Increase outreach
– Increase public awareness: the public sees value in investments in large facilities. A 2006 Pew study indicates that half of all internet users have been to a site specializing in science, and those who seek out science information on the internet are more likely to believe that scientific pursuits have a positive impact on society

Now for a look at a few non-atmospheric sciences gateways.

Gateways in the marketplace: kids control telescopes and share images. "In seconds my computer screen was transformed into a live telescopic view." "Slooh's users include newbies and professional astronomers in 70 countries." Observatories in the Canary Islands and Chile, with Australia coming soon. 5000 images/month since 2003. Increases public support for investment in these facilities.

Social Informatics Data Grid: collaborative access to large, complex datasets. SIDGrid is unique among social science data archive projects:
– Streaming data which change over time: voice, video, images (e.g. fMRI), text, numerical (e.g. heart rate, eye movement)
– Investigate multiple datasets, collected at different time scales, simultaneously
– Large data requirements
– Sophisticated analysis tools

Viewing multimodal data like a symphony conductor: "music-score" display and synchronized playback of video and audio files
– Pitch tracks
– Text
– Head nods, pauses, gesture references
Central archive of multi-modal data, annotations, and analyses:
– Distributed annotation efforts by multiple researchers working on a common data set
– History of updates
Computational tools:
– Distributed acoustic analysis using Praat
– Statistical analysis using R
– Matrix computations using Matlab and Octave
Source: Studying Discourse and Dialog with SIDGrid, Levow, 2008

Future Work: web technologies change fast, so gateways must be able to adapt quickly.
– Gateways and gadgets: gateway components incorporated into any social networking page; 75% of 18 to 24 year-olds have social networking websites; iPhone apps?
– Web 3.0: beyond social networking and sharing content; standards and querying interfaces to programmatically share data across sites, e.g. Resource Description Framework (RDF) and SPARQL
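The RDF/SPARQL idea is simple at its core: RDF models data as (subject, predicate, object) triples, and SPARQL matches patterns over them, with variables standing in for unknown positions. A toy, standard-library-only sketch of that matching (not a real triple store or SPARQL engine; the data is invented):

```python
# Toy illustration of RDF-style triples and SPARQL-style pattern
# matching using plain Python tuples. Purely illustrative data.

TRIPLES = {
    ("nanoHUB", "type", "ScienceGateway"),
    ("nanoHUB", "domain", "nanotechnology"),
    ("LEAD", "type", "ScienceGateway"),
    ("LEAD", "domain", "meteorology"),
}

def match(pattern, triples=TRIPLES):
    """Return triples matching a pattern; None acts like a SPARQL variable."""
    return sorted(t for t in triples
                  if all(p is None or p == v for p, v in zip(pattern, t)))

# Analogous to: SELECT ?s WHERE { ?s <type> <ScienceGateway> }
gateways = [s for s, _, _ in match((None, "type", "ScienceGateway"))]
print(gateways)  # ['LEAD', 'nanoHUB']
```

Real deployments expose such queries over HTTP endpoints, which is what lets web pages exchange structured data programmatically rather than by screen-scraping.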

Tremendous Potential for Gateways. In only 17 years, the Web has fundamentally changed human communication. Science gateways can leverage this amazingly powerful tool to:
– Transform the way scientists collaborate
– Streamline the conduct of science
– Influence the public's perception of science
Reliability, trust, and continuity are fundamental to truly changing the conduct of science through the use of gateways; high-end resources can have a profound impact. The future is very exciting!

We're actively soliciting projects in atmospheric sciences; contact me if you have a gateway development need. Thank you for your attention! Questions? Nancy Wilkins-Diehr