TeraGrid Science Gateways
Nancy Wilkins-Diehr, TeraGrid Area Director for Science Gateways
Cyber-GIS Workshop, February 2-3, 2010

Presentation transcript:

Cyber-GIS: Why now?
Increasing deluge of spatial data
  – Environmental science and engineering, geography and related social sciences, geosciences, public health
Increasing reach and power of cyberinfrastructure
  – Mobile devices, networking, supercomputers, data storage and analysis, clouds, visualization
How does CI need to adapt to address important societal and research needs in GIS?
How does GIS need to adapt to take advantage of developments in CI?
Answering these questions is one of the goals of this workshop
Thank you for your attendance and assistance

Agenda

Now, a word from our sponsor
But hopefully some useful information for you as well

Follow-on program to TeraGrid now being competed: eXtreme Digital (XD)
The XD Solicitation (NSF ) consists of 5 services:

Service                                                             Acronym  Funding (all 5 yrs)  Expected Start Date
Coordination and Management                                         CMS      $12M/yr              April 2011
Advanced User Support                                               AUSS     $8M/yr               April 2011
Training, Education and Outreach                                    TEOS     $3M/yr               April 2011
Technology Audit and Insertion                                      TAIS     $3M/yr               April 2010
High-Performance Remote Visualization and Data Analysis (2 awards)  RVDAS    2 * $3M/yr           Late 2009, Mid 2010

Current planning grants address the CMS/AUSS/TEOS integrative services.
Source: Richard Moore, SDSC

XD as Integrator
XD will be in a position similar to the TeraGrid Grid Infrastructure Group (GIG) and EGI
  – Deliver common services: e.g. allocations, central documentation/portal/helpdesk, standard user software environments, networking, advanced user support, training/education/outreach, etc.
  – Provide integration function(s) across Resource Providers (RPs), including governance
  – Define/implement a high-level integrating system architecture
  – Provide continuity of services for TeraGrid users
  – Work with TAIS awardee in an integrated fashion
XD will not
  – Deploy its own HPC systems
  – Prescribe how Resource Providers operate resources (but will work with RPs to define a common user environment)
  – Define the science/engineering objectives for users
Source: Richard Moore, SDSC

TeraGrid/XD Resource Evolution
Ending service on or before 3/31/10*
  – IU Big Red
  – NCSA Cobalt and Mercury (IA-64)
  – ORNL NSTG
  – PSC Big Ben
  – Purdue Brutus and Condor Pool
  – SDSC IA-64
  – UC/ANL IA-64
To be continued through 3/31/11*
  – IU Quarry
  – LONI Queen Bee
  – NCAR Frost
  – NCSA Abe
  – NCSA Lincoln
  – Purdue Steele
  – PSC Pople
  – SDSC Dash
  – TACC Lonestar
Continuing
  – TACC Ranger (Track 2A)
  – NICS Kraken (Track 2B)
Start dates TBD
  – Track 2C
  – Track 2D: Data-Intensive Computing
  – Track 2D: High-Throughput Computing
  – Track 2D: Experimental Architecture
  – Track 2D: Grid Testbed
  – Other resources as designated by NSF
XD RV-DA Systems
  – TACC Longhorn (~12/09) & NICS Nautilus (~6/10)
* Check the TeraGrid Resource Catalog for start/end dates.
Source: Richard Moore, SDSC

TeraGrid resources today include:
Tightly Coupled Distributed Memory Systems, 2 systems in the top 10 at top500.org
  – Kraken (NICS): Cray XT5, 99,072 cores, 1.03 Pflop
  – Ranger (TACC): Sun Constellation, 62,976 cores, 579 Tflop, 123 TB RAM
Shared Memory Systems
  – Cobalt (NCSA): Altix, 8 Tflop, 3 TB shared memory
  – Pople (PSC): Altix, 5 Tflop, 1.5 TB shared memory
Clusters with Infiniband
  – Abe (NCSA): 90 Tflops
  – Lonestar (TACC): 61 Tflops
  – QueenBee (LONI): 51 Tflops
Condor Pool (Loosely Coupled)
  – Purdue: up to 22,000 CPUs
Gateway hosting
  – Quarry (IU): virtual machine support
Visualization Resources
  – TeraDRE (Purdue): 48 node NVIDIA GPUs
  – Spur (TACC): 32 NVIDIA GPUs
Storage Resources
  – GPFS-WAN (SDSC)
  – Lustre-WAN (IU)
  – Various archival resources
Source: Dan Katz, U Chicago

How did the Gateway program develop?
A natural result of the impact of the internet on worldwide communication and information retrieval
Implications on the conduct of science are still evolving
  – 1980s: Early gateways; National Center for Biotechnology Information BLAST server, search results sent by email, still a working portal today
  – 1989: World Wide Web developed at CERN
  – 1992: Mosaic web browser developed
  – 1995: “International Protein Data Bank Enhanced by Computer Browser”
  – 2004: TeraGrid project director Rick Stevens recognized growth in scientific portal development and proposed the Science Gateway Program
  – Today: Web 3.0 and programmatic exchange of data between web pages
Simultaneous explosion of digital information
  – Growing analysis needs in many, many scientific areas
  – Sensors, telescopes, satellites, digital images and video
  – #1 machine on Top500 today is 1000x more powerful than all combined entries on the first list in 1993
Only 17 years since the release of Mosaic!

Access to supercomputers hasn’t changed much in 20 years
But the world around them sure has!

vt100 in the 1980s and a login window on Ranger today

Why are gateways worth the effort?
Increasing range of expertise needed to tackle the most challenging scientific problems
  – How many details do you want each individual scientist to need to know?
      PBS, RSL, Condor
      Coupling multi-scale codes
      Assembling data from multiple sources
      Collaboration frameworks

The same job expressed as a PBS script, as Globus RSL, and as a Condor-G submit file:

    #! /bin/sh
    #PBS -q dque
    #PBS -l nodes=1:ppn=2
    #PBS -l walltime=00:02:00
    #PBS -o pbs.out
    #PBS -e pbs.err
    #PBS -V
    cd /users/wilkinsn/tutorial/exercise_3
    ../bin/mcell nmj_recon.main.mdl

    +(
     &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
      (executable="/users/birnbaum/tutorial/bin/mcell")
      (arguments=nmj_recon.main.mdl)
      (count=128)
      (hostCount=10)
      (maxtime=2)
      (directory="/users/birnbaum/tutorial/exercise_3")
      (stdout="/users/birnbaum/tutorial/exercise_3/globus.out")
      (stderr="/users/birnbaum/tutorial/exercise_3/globus.err")
    )

    =======

    # Full path to executable
    executable=/users/wilkinsn/tutorial/bin/mcell
    # Working directory, where Condor-G will write
    # its output and error files on the local machine.
    initialdir=/users/wilkinsn/tutorial/exercise_3
    # To set the working directory of the remote job, we
    # specify it in this globus RSL, which will be appended
    # to the RSL that Condor-G generates
    globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
    # Arguments to pass to executable.
    arguments=nmj_recon.main.mdl
    # Condor-G can stage the executable
    transfer_executable=false
    # Specify the globus resource to execute the job
    globusscheduler=tg-login1.sdsc.teragrid.org/jobmanager-pbs
    # Condor has multiple universes, but Condor-G always uses globus
    universe=globus
    # Files to receive stdout and stderr.
    output=condor.out
    error=condor.err
    # Specify the number of copies of the job to submit to the condor queue.
    queue 1
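To make the point about hidden details concrete, here is a minimal Python sketch of how a gateway might turn a few web-form fields into a PBS script like the one on this slide. It is an illustration only, not how any particular TeraGrid gateway is implemented. The names JobRequest and render_pbs_script are hypothetical, and the queue name, paths and input file are simply the values shown on the slide.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    """What a gateway user might choose on a web form (hypothetical fields)."""
    input_file: str
    nodes: int = 1
    procs_per_node: int = 2
    walltime_minutes: int = 2


def render_pbs_script(job: JobRequest, workdir: str, executable: str,
                      queue: str = "dque") -> str:
    """Build a PBS batch script from a few form fields.

    The scientist never sees the directives; the gateway fills them in.
    """
    if job.nodes < 1 or job.procs_per_node < 1 or job.walltime_minutes < 1:
        raise ValueError("nodes, procs_per_node and walltime_minutes must be positive")
    hours, minutes = divmod(job.walltime_minutes, 60)
    lines = [
        "#!/bin/sh",
        f"#PBS -q {queue}",
        f"#PBS -l nodes={job.nodes}:ppn={job.procs_per_node}",
        f"#PBS -l walltime={hours:02d}:{minutes:02d}:00",
        "#PBS -o pbs.out",
        "#PBS -e pbs.err",
        "#PBS -V",
        f"cd {workdir}",
        f"{executable} {job.input_file}",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the slide's PBS example.
script = render_pbs_script(
    JobRequest(input_file="nmj_recon.main.mdl"),
    workdir="/users/wilkinsn/tutorial/exercise_3",
    executable="../bin/mcell",
)
print(script)
```

A real gateway would also have to submit the script, track the job and return results; the sketch covers only the translation step.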

Gateways democratize access to high end resources
Almost anyone can investigate scientific questions using high end resources
  – Not just those in the research groups of those who request allocations
  – Gateways allow anyone with a web browser to explore
Opportunities can be uncovered via Google
  – My 11-year-old son discovered nanoHUB.org himself while his class was studying Bucky Balls
Foster new ideas, cross-disciplinary approaches
  – Encourage students to experiment
But used in production too
  – Significant number of papers resulting from gateways including GridChem, nanoHUB
  – Scientists can focus on challenging science problems rather than challenging infrastructure problems

Today, there are approximately 35 gateways using the TeraGrid

Not just ease of use
What can scientists do that they couldn’t do previously?
  – Linked Environments for Atmospheric Discovery (LEAD): access to radar data
  – National Virtual Observatory (NVO): access to sky surveys
  – Ocean Observatories Initiative (OOI): access to sensor data
  – PolarGrid: access to polar ice sheet data
  – SIDGrid: expensive datasets, analysis tools
  – GridChem: coupling multiscale codes
How would this have been done before gateways?

What makes a gateway a TeraGrid gateway?
TeraGrid gateways use TeraGrid resources
Are they all developed by TeraGrid?
  – No, we don’t make the gateways you use, we make the gateways you use better
  – The strength of the program lies in the development of end user interfaces by those in the community
TeraGrid does provide staff to assist with gateway use of the resources
  – Anyone can request support via the same peer review process used to request CPU hours or a data allocation

3 steps to connect a gateway to TeraGrid
Request an allocation
  – Only a 1 paragraph abstract required for up to 200k CPU hours
Register your gateway
  – Visibility on public TeraGrid page
Request a community account
  – Run jobs for others via your portal
Staff support is available!
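Because a community account runs jobs for many portal users against one shared allocation, a gateway typically needs to keep its own record of who used what. The following Python sketch shows one way that bookkeeping could look. It is a sketch under assumptions: the class name CommunityAccountLedger is hypothetical, and charging cores multiplied by wall-clock hours is a simplification, since real charging rules vary by resource.

```python
from collections import defaultdict


class CommunityAccountLedger:
    """Tracks CPU hours drawn from one shared allocation, per portal user."""

    def __init__(self, allocation_cpu_hours: float):
        self.allocation_cpu_hours = allocation_cpu_hours
        self.usage_by_user = defaultdict(float)

    def used(self) -> float:
        """Total CPU hours charged so far across all portal users."""
        return sum(self.usage_by_user.values())

    def remaining(self) -> float:
        """CPU hours still available in the allocation."""
        return self.allocation_cpu_hours - self.used()

    def charge(self, portal_user: str, cores: int, wall_hours: float) -> float:
        """Record a finished job; assumes cost = cores * wall-clock hours."""
        cost = cores * wall_hours
        if cost > self.remaining():
            raise RuntimeError("job would exceed the remaining allocation")
        self.usage_by_user[portal_user] += cost
        return cost


# 200k CPU hours is the allocation size mentioned on the slide;
# the user names are made up.
ledger = CommunityAccountLedger(200_000)
ledger.charge("alice", cores=128, wall_hours=2.0)
ledger.charge("bob", cores=64, wall_hours=0.5)
print(dict(ledger.usage_by_user), ledger.remaining())
```

Keeping usage per portal user is what lets a gateway report back on its allocation and spot misuse, even though every job runs under the same account on the resource.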

Tremendous Opportunities Using the Largest Shared Resources - Challenges too!
What’s different when the resource doesn’t belong just to me?
  – Resource discovery
  – Accounting
  – Security
  – Proposal-based requests for resources (peer-reviewed access)
      Code scaling and performance numbers
      Justification of resources
      Gateway citations
Tremendous benefits at the high end, but even more work for the developers
Potential impact on science is huge
  – Small number of developers can impact thousands of scientists
  – But need a way to train and fund those developers

Gateways in the marketplace
Kids control telescopes and share images
  – “In seconds my computer screen was transformed into a live telescopic view”
  – “Slooh's users include newbies and professional astronomers in 70 countries”
Observatories in the Canary Islands and Chile, Australia coming soon
5000 images/month since 2003
Increases public support for investment in these facilities

Linked Environments for Atmospheric Discovery (LEAD)
Providing tools that are needed to make accurate predictions of tornadoes and hurricanes
Data exploration and Grid workflow

PolarGrid brings CI to polar ice sheet measurement
Cyberinfrastructure Center for Polar Science (CICPS)
  – Experts in polar science, remote sensing and cyberinfrastructure
  – Indiana, ECSU, CReSIS
Satellite observations show disintegration of ice shelves in West Antarctica and speed-up of several glaciers in southern Greenland
  – Most existing ice sheet models, including those used by IPCC, cannot explain the rapid changes

Components of PolarGrid
  – Expedition grid consisting of ruggedized laptops in a field grid linked to a low power multi-core base camp cluster
  – Prototype and two production expedition grids feed into a 17 Teraflops "lower 48" system at Indiana University and Elizabeth City State (ECSU), split between research, education and training
  – Gives ECSU (a minority serving institution) a top-ranked 5 Teraflop high performance computing system
Access to expensive data
TeraGrid resources for analysis
  – Large level 0 and level 1 data sets require once and done processing and storage
  – Filters applied to level 1 data by users in real time via the web
Student involvement
Source: Geoffrey Fox

Social Informatics Data Grid
Collaborative access to large, complex datasets
SIDGrid is unique among social science data archive projects
  – Streaming data which change over time
      Voice, video, images (e.g. fMRI), text, numerical (e.g. heart rate, eye movement)
  – Investigate multiple datasets, collected at different time scales, simultaneously
Large data requirements
Sophisticated analysis tools

Viewing multimodal data like a symphony conductor
“Music-score” display and synchronized playback of video and audio files
  – Pitch tracks
  – Text
  – Head nods, pause, gesture references
Central archive of multi-modal data, annotations, and analyses
  – Distributed annotation efforts by multiple researchers working on a common data set
History of updates
Computational tools
  – Distributed acoustic analysis using Praat
  – Statistical analysis using R
  – Matrix computations using Matlab and Octave
Source: Studying Discourse and Dialog with SIDGrid, Levow, 2008

Future Technical Areas
Web technologies change fast
  – Must be able to adapt quickly
Gateways and gadgets
  – Gateway components incorporated into any social networking page
  – 75% of 18 to 24 year-olds have social networking websites
iPhone apps?
Web 3.0
  – Beyond social networking and sharing content
  – Standards and querying interfaces to programmatically share data across sites
      Resource Description Framework (RDF), SPARQL
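As a rough illustration of what "programmatically share data across sites" means in RDF terms, the Python sketch below stores facts as subject-predicate-object triples and answers a query by pattern matching, which is the core idea behind a SPARQL basic graph pattern. It uses no RDF library. The ex: names and the triples themselves are made up for the example, and the match function is a hypothetical helper, not part of any standard API.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative triples only; "ex:" stands in for a made-up namespace.
TRIPLES = [
    ("ex:LEAD", "ex:providesAccessTo", "ex:RadarData"),
    ("ex:NVO", "ex:providesAccessTo", "ex:SkySurveys"),
    ("ex:PolarGrid", "ex:providesAccessTo", "ex:PolarIceSheetData"),
    ("ex:LEAD", "ex:usesResource", "ex:TeraGrid"),
    ("ex:PolarGrid", "ex:usesResource", "ex:TeraGrid"),
]


def match(triples: Iterable[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


# Roughly what this SPARQL query would ask:
#   SELECT ?gateway WHERE { ?gateway ex:usesResource ex:TeraGrid }
gateways_on_teragrid = [
    s for s, _, _ in match(TRIPLES, predicate="ex:usesResource", obj="ex:TeraGrid")
]
print(gateways_on_teragrid)
```

The value of the standards is that a second site can run the same kind of query against data it did not produce, provided both sides agree on the vocabulary.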

Gateways can further investments in other projects
Increase access
  – To instruments, expensive data collections
Increase capabilities
  – To analyze data
Improve workforce development
  – Can prepare students to function in today’s cross-disciplinary world
Increase outreach
Increase public awareness
  – Public sees value in investments in large facilities
  – Pew 2006 study indicates that half of all internet users have been to a site specializing in science
  – Those who seek out science information on the internet are more likely to believe that scientific pursuits have a positive impact on society

Where are we going with the program next?
Prebuilt VMs with gateway software
Better visibility of apps available through gateways in addition to those available at the command line on TeraGrid
More example-based documentation
  – Less talk, more action
Improved use of remote vis resources by gateways
Continued targeted support of an ever-changing portfolio of projects
  – Peer-reviewed requests for assistance
Helpdesk support expanded

Tremendous Potential for Gateways
In only 17 years, the Web has fundamentally changed human communication
Science Gateways can leverage this amazingly powerful tool to:
  – Transform the way scientists collaborate
  – Streamline conduct of science
  – Influence the public’s perception of science
Reliability, trust, continuity are fundamental to truly change the conduct of science through the use of gateways
  – High end resources can have a profound impact
The future is very exciting!

Thank you for your attention!
Questions?
Nancy Wilkins-Diehr