TeraGrid Science Gateways Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways NSF Program Officers, September 10, 2008.


1 TeraGrid Science Gateways Nancy Wilkins-Diehr TeraGrid Area Director for Science Gateways wilkinsn@sdsc.edu NSF Program Officers, September 10, 2008

2 Today I hope to answer
What are gateways?
Why are gateways worth the effort?
What do they allow scientists to do that they couldn't without gateways?
What are some specific examples of this?
Why are these examples important?
Impact on education and workforce development
Why sustainable gateways are important
We’ll demonstrate these with individual examples

3 May 2007 Gateway presentation at the NSF
How many of you were here?
4-hour recap in two slides
Web developments and the explosion of digital data are leading to the increased importance of gateways
–16 years after the availability of Mosaic, full impact on science yet to be felt
–Many studies point to the impact of the internet on science
    Public perception of the value of science increases with their use of science-based websites
–Web usage model resonates with scientists
But, need persistence if the Web is to have a profound impact on science

4 NSF has a long history in combining science and technology
–PACI, ITR, STCs
–Leadership continues today
5 great presentations
–Gerhard Klimeck, Purdue, nanoHUB
–Dennis Gannon, Indiana University, LEAD
–Sudhakar Pamidighantam, UIUC, GridChem
–John McGee, RENCI, TeraGrid Bioportal
–Shaowen Wang, UIUC, GISolve

5 Today, there are approximately 29 gateways using the TeraGrid

6 Does a gateway have to use TeraGrid to be a gateway?
No, I just talk about those that do because of my funding
–But my position exposes me to a variety of gateways, many
–Using high-end resources is more work and is not recommended unless it serves a demonstrated need
Gateways are an excellent way to extend the impact of high-end resources
Are they all funded by TeraGrid?
–Can TeraGrid claim success for all gateways?
No, we don’t make the gateways you use, we make the gateways you use better

7 Tremendous Opportunities Using the Largest Shared Resources - Challenges too!
What’s different when the resource doesn’t belong just to me?
–Resource discovery
–Accounting
–Security
–Proposal-based requests for resources (peer-reviewed access)
    Code scaling and performance numbers
    Justification of resources
    Gateway citations
Tremendous benefits at the high end, but even more work for the developers
Potential impact on science is huge
–Small number of developers can impact thousands of scientists
–But need a way to train and fund those developers and provide them with appropriate tools

8 Why are gateways worth the effort?
Increasing range of expertise needed to tackle the most challenging scientific problems
–How many details do you want each individual scientist to need to know?
    PBS, RSL, Condor
    Coupling multi-scale codes
    Assembling data from multiple sources
    Collaboration frameworks

#! /bin/sh
#PBS -q dque
#PBS -l nodes=1:ppn=2
#PBS -l walltime=00:02:00
#PBS -o pbs.out
#PBS -e pbs.err
#PBS -V
cd /users/wilkinsn/tutorial/exercise_3
../bin/mcell nmj_recon.main.mdl

+( &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
   (executable="/users/birnbaum/tutorial/bin/mcell")
   (arguments=nmj_recon.main.mdl)
   (count=128)
   (hostCount=10)
   (maxtime=2)
   (directory="/users/birnbaum/tutorial/exercise_3")
   (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
   (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
)

=======
# Full path to executable
executable=/users/wilkinsn/tutorial/bin/mcell
# Working directory, where Condor-G will write
# its output and error files on the local machine.
initialdir=/users/wilkinsn/tutorial/exercise_3
# To set the working directory of the remote job, we
# specify it in this globus RSL, which will be appended
# to the RSL that Condor-G generates
globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
# Arguments to pass to executable.
arguments=nmj_recon.main.mdl
# Condor-G can stage the executable
transfer_executable=false
# Specify the globus resource to execute the job
globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs
# Condor has multiple universes, but Condor-G always uses globus
universe=globus
# Files to receive stdout and stderr.
output=condor.out
error=condor.err
# Specify the number of copies of the job to submit to the condor queue.
queue 1
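One way to picture what a gateway hides from the scientist: the user supplies a handful of fields and the gateway writes the batch script. The sketch below is illustrative only and is not taken from any TeraGrid gateway; the `JobRequest` class and `render_pbs_script` function are hypothetical names, and the defaults simply mirror the PBS script shown on this slide.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Fields a gateway user might fill in on a web form (hypothetical)."""
    executable: str
    input_file: str
    work_dir: str
    nodes: int = 1
    procs_per_node: int = 2
    walltime: str = "00:02:00"
    queue: str = "dque"


def render_pbs_script(job: JobRequest) -> str:
    """Render a PBS batch script with the same directives as the slide's example."""
    lines = [
        "#!/bin/sh",
        f"#PBS -q {job.queue}",
        f"#PBS -l nodes={job.nodes}:ppn={job.procs_per_node}",
        f"#PBS -l walltime={job.walltime}",
        "#PBS -o pbs.out",
        "#PBS -e pbs.err",
        "#PBS -V",
        f"cd {job.work_dir}",
        f"{job.executable} {job.input_file}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values copied from the slide's PBS example.
    request = JobRequest(
        executable="../bin/mcell",
        input_file="nmj_recon.main.mdl",
        work_dir="/users/wilkinsn/tutorial/exercise_3",
    )
    print(render_pbs_script(request))
```

A real gateway would also have to validate the inputs and hand the script to the scheduler; the point here is only that the scheduler syntax can live in one place instead of in every scientist's head.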

9 Not just ease of use
What can scientists do that they couldn’t do previously?
LEAD – access to radar data
NVO – access to sky surveys
OOI – access to sensor data
PolarGrid – access to polar ice sheet data
SIDGrid – analysis tools
GridChem – developing multiscale coupling
How would this have been done before gateways?

10 Gateways can further investments in other projects
Increase access
–To instruments, we’ll see an example today
Increase capabilities
–To analyze data, we’ll see an example today
Improve workforce development
–For underserved populations, we’ll see an example today
Increase outreach
Increase public awareness
–Public sees value in investments in large facilities
    Slice bread
    Pack the kids’ lunch, etc.

11 Gateways in the marketplace
Kids control telescopes and share images
“In seconds my computer screen was transformed into a live telescopic view”
–“Slooh's users include newbies and professional astronomers in 70 countries”
Observatories in the Canary Islands and Chile, Australia coming soon
5000 images/month since 2003
Increases public support for investment in these facilities

12 Gateways Greatly Expand Access
Almost anyone can investigate scientific questions using high-end resources
–Not just those in the research groups of those who request allocations
–Gateways allow anyone with a web browser to explore
Opportunities can be uncovered via Google
–My 11-year-old son discovered nanoHUB.org himself while his class was studying buckyballs
Fosters new ideas, cross-disciplinary approaches
Encourages students to experiment
But used in production too
–Significant number of papers resulting from gateways including GridChem, nanoHUB
–Scientists can focus on challenging science problems rather than challenging infrastructure problems

13 TeraGrid Pathways Activities
2 Gateway components
–Adapt gateways for educational use by underrepresented communities
    GEON – SDSC, Navajo Tech
–Teach participants from underrepresented communities how to build gateways
    PolarGrid – IU, ECSU

14 Navajo Technical College and gateways
Incorporating the use of gateways in their curricula
GEON, GISolve areas of initial interest

15 PolarGrid
Cyberinfrastructure Center for Polar Science (CICPS)
–Experts in polar science, remote sensing and cyberinfrastructure
–Indiana, ECSU, CReSIS
Satellite observations show disintegration of ice shelves in West Antarctica and speed-up of several glaciers in southern Greenland
–Most existing ice sheet models, including those used by IPCC, cannot explain the rapid changes
http://www.polargrid.org/polargrid/images/4/42/C0050-polargrid-big.m4v
Source: Geoffrey Fox

16 Components of PolarGrid
–Expedition grid consisting of ruggedized laptops in a field grid linked to a low power multi-core base camp cluster
–Prototype and two production expedition grids feed into a 17 Teraflops "lower 48" system at Indiana University and Elizabeth City State (ECSU) split between research, education and training.
–Gives ECSU a top-ranked 5 Teraflop MSI high performance computing system
Access to expensive data
High-end resources for analysis
MSI student involvement
Source: Geoffrey Fox

17 Recent Gateways using TeraGrid Significantly
SCEC
SIDGrid
CIG

18 SCEC using gateway to produce hazard map
PSHA hazard map for California using newly released Earthquake Rupture Forecast (UCERF2.0) calculated using SCEC Science Gateway
Warm colors indicate regions with a high probability of experiencing strong ground motion in the next 50 years.
High resolution map, significant CPU use

19 Social Informatics Data Grid
Heavy use of “multimodal” data.
–Subject might be viewing a video, while a researcher collects heart rate and eye movement data.
Events must be synchronized for analysis, large datasets result
Extensive analysis capabilities are not something that each researcher should have to create for themselves.
http://www.ci.uchicago.edu/research/files/sidgrid.mov

20 Social scientists have traditionally worked in isolated labs without the capability to share data or insights with others.
SIDGrid enables a number of capabilities.
–Data that is expensive to collect can now be shared with others, increasing the potential for scientific impact.
–Geographically distant researchers can collaborate on the analysis of the same data set.
–Complex analysis tools and workflows are now available for all to use, rather than having each lab duplicate efforts.
–All researchers now have access to the highest quality computational resources
SIDGrid uses TeraGrid resources for computationally intensive tasks such as media transcoding, algorithms for pitch analysis of audio tracks and fMRI image analysis
SIDGrid is unique among social science data archive projects
–Focused on streaming data which change over time
–Provides the ability to investigate multiple datasets, collected at different time scales, simultaneously
Active users of the SIDGrid system include a human neuroscience group and linguistic research groups from the University of Chicago and the University of Nottingham, UK
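The synchronization requirement on the SIDGrid slides (for example, heart rate recorded while a subject watches a video) can be illustrated with a small sketch. This is not SIDGrid code; the function names and sample values are made up, and it assumes the simplest possible rule: for each event, take the most recent sensor sample at or before the event time.

```python
from bisect import bisect_right


def latest_sample_at(timestamps, values, t):
    """Return the most recent value recorded at or before time t.

    timestamps must be sorted in ascending order; returns None when
    no sample exists at or before t.
    """
    i = bisect_right(timestamps, t)
    return values[i - 1] if i > 0 else None


def align_to_events(event_times, timestamps, values):
    """Pair each event time with the sensor value current at that time."""
    return [(t, latest_sample_at(timestamps, values, t)) for t in event_times]


if __name__ == "__main__":
    # Made-up data: heart rate sampled once per second, video events at
    # irregular times, both measured in seconds from the session start.
    heart_rate_t = [0.0, 1.0, 2.0, 3.0]
    heart_rate_bpm = [62, 64, 71, 69]
    video_events_t = [0.5, 2.0, 2.9]

    for t, bpm in align_to_events(video_events_t, heart_rate_t, heart_rate_bpm):
        print(f"event at {t:.1f}s: heart rate {bpm} bpm")
```

Streams sampled at very different rates may call for interpolation or windowed averages instead; the shared point is that every lab needs some such alignment step, which is the kind of tool a gateway can provide once for everyone.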

21 40 institutional members
–9 foreign affiliates
Researchers request synthetic seismograms for any given earthquake
–Allows scientists to understand the ground motion associated with any given earthquake
Requested and received advanced support from TeraGrid

22 Advanced support for OCI resources
Including gateway integration
Same peer review process used to request resources
–30,000 CPUs
–+ 6 months of Nancy
Reviews based on appropriate use of resources, science is not reviewed if already funded
Petascale
Multisite workflows
Gateways
Domain expertise
Or someone really talented

23 Support is Very Targeted
Start with well-defined objectives
–Focus on efficient or novel use of OCI resources
Minimum 0.25 FTE for months to a year
–Enough investment to really understand and help solve complex problems
Must have commitment from PIs
–Want to make sure work is incorporated into production codes and gateways
Good candidates for targeted support include:
–Large, high impact projects
–Ability to influence new communities
–Happy for feedback from directorates on important projects
Lessons learned move into training and documentation

24 Gateway white paper recommends sustained funding
Gateways can be used for the most challenging problems, but
–Scientists won’t rely on something that they are not confident will be around for the duration
    We see this with software, but even more so with gateway infrastructure
A sustained gateway program can
–Reduce duplication of effort
    Sporadic development with many small programs
–Increase diversity of end users
–Increase skill set diversity of developers
–Bring together teams to address the toughest problems

25 Recommend 10-year program with interim reviews
Characteristics of cycles of 5 years or less
–Build exciting prototypes with input from scientists
–Work with early adopters to extend capabilities
–Tools are publicized, more scientists interested
–Funding ends
–Scientists who invested their time to use new tools are disillusioned
    Less likely to try something new again
–Start again on new short-term project
Need to break this cycle

26 Begin with user-driven workshops
What are the most fundamental capabilities in each directorate?
–What is the next PDB? nanoHUB? Earth System Grid?
What is the community calling for?
–Curated data collections
    Which collections?
–Simulation, visualization and analysis
–Collaboration tools or workspaces
–Generation of complex workflows
–Access to instruments, sensor or radar data that have limited exposure today
Merit review and assessment will be critical to a long-term program

27 When might a gateway be appropriate?
Researchers using defined sets of tools in different ways
–Same executables, different input
    GridChem, CHARMM
–Creating multi-scale or complex workflows
–Datasets
Common data formats
–National Virtual Observatory
–Earth System Grid
–Some groups have invested significant efforts here
    caBIG, extensive discussions to develop common terminology and formats
    BIRN, extensive data sharing agreements
Difficult to access data/advanced workflows
–Sensor/radar input
    LEAD, GEON

28 Tremendous Potential for Gateways
In only 16 years, the Web has fundamentally changed human communication
Science Gateways can leverage this amazingly powerful tool to:
–Transform the way scientists collaborate
–Streamline conduct of science
–Influence the public’s perception of science
Reliability, trust, continuity are fundamental to truly change the conduct of science through the use of gateways
–High-end resources can have a profound impact
The future is very exciting!

29 Thank you for your attention
For more information
–www.teragrid.org
–wilkinsn@sdsc.edu
Live demonstration of the Neutron Science Gateway
–Vickie Lynch, Oak Ridge National Laboratory

30 Afternoon Agenda
2:00 pm Break
–(aka recover from this talk, ask questions)
2:15 pm Track 2 Resources
2:15-2:35 pm Ranger
–Jay Boisseau, Texas Advanced Computing Center
2:35-2:55 pm Kraken
–Bruce Loftis, National Institute for Computational Sciences
2:55-3:15 pm Track 2c
–Nick Nystrom, Pittsburgh Supercomputing Center
3:15 pm Blue Waters
–John Towns, National Center for Supercomputing Applications
3:30 pm Open discussion with all presenters

