Information Technology Infrastructure Committee (ITIC) Report to the NAC, March 8, 2012. Larry Smarr, Chair, ITIC

Committee Members
- Dr. Larry Smarr (Chair), Director, California Institute for Telecommunications and Information Technology (Calit2)
- Mr. Alan Paller, Research Director, SANS Institute
- Dr. Robert Grossman, Professor, University of Chicago
- Dr. Charles Holmes, NASA (retired)
- Dr. Alexander Szalay, Professor, Johns Hopkins University
- Ms. Karen Harper (Executive Secretary), IT Workforce Manager, Office of the Chief Information Officer, NASA
The committee met March 6-7, 2012: fact finding at Johns Hopkins University and a FACA meeting at NASA HQ.

Finding #1
To enable new scientific discoveries in a fiscally constrained environment, NASA must develop more productive IT infrastructure through "frugal innovation" and "agile development":
- As easy to use as Flickr
- Elastic to demand
- Continuous improvement
- More capacity for a fixed investment
- Adaptable to the changing requirements of multiple missions
- Built-in security that doesn't hinder deployment

NASA is Falling Behind Federal and Non-Federal Institutions
- Non-Federal: Big Data CI: Google/Microsoft/Amazon; 10G to 100G networking: GLIF/Internet2/CENIC; GPU clusters: Japan's TSUBAME (2.4 PF); Hybrid HPC: China's #2-fastest system (5 PF, MC/GPU)
- NSF: Big Data CI: Gordon; 10G to 100G networking: GENI next-generation Internet; GPU clusters: TACC (512 GPUs); Hybrid HPC: Blue Waters* (MC/GPU, 12 PF)
- DOE: Big Data CI: Magellan; 10G to 100G networking: ANI (ARRA, 100 Gb); GPU clusters: ANL (256 GPUs); Hybrid HPC: next-generation Jaguar* (MC/GPU, 20 PF)
- NASA: Big Data CI: Nebula testbed; 10G to 100G networking: Goddard-to-Ames 10G; GPU clusters: Ames 136 GPUs (2 x 64 at Ames and GSFC); Hybrid HPC: Pleiades (MC, 1 PF)
* Later in 2012

SMD is a Growing NASA HPC User Community
[Chart: SMD HPC usage to date and projected]
Source: Tsengdar Lee, Mike Little, NASA

Leading Edge is Moving to Hybrid Processors: Requiring Major Software Innovations
"With Titan's arrival, fundamental changes to computer architectures will challenge researchers from every scientific discipline."
Press announcement at SC11, Seattle, WA, November 2011 (Marketwire)

Partnering Opportunities with NSF: SDSC's Gordon (Dedicated Dec. 5, 2011)
- Data-intensive supercomputer based on SSD flash memory and virtual shared-memory software
- Emphasizes memory and IOPS over FLOPS
- Each supernode presents virtual shared memory: 2 TB RAM aggregate, 8 TB SSD aggregate
- Total machine = 32 supernodes
- 4 PB disk parallel file system with >100 GB/s I/O
- System designed to accelerate access to the massive datasets being generated in many fields of science, engineering, medicine, and social science
Source: Mike Norman, Allan Snavely, SDSC
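As a back-of-the-envelope check on the machine-level totals implied by the per-supernode figures above, a minimal sketch using only the numbers on this slide:

```python
# Back-of-the-envelope totals for Gordon, using only the figures on this slide.
supernodes = 32              # total machine = 32 supernodes
ram_per_supernode_tb = 2     # TB of RAM presented as virtual shared memory
ssd_per_supernode_tb = 8     # TB of flash SSD per supernode

total_ram_tb = supernodes * ram_per_supernode_tb    # 64 TB of RAM machine-wide
total_ssd_tb = supernodes * ssd_per_supernode_tb    # 256 TB of flash machine-wide

print(f"Machine-wide RAM:   {total_ram_tb} TB")
print(f"Machine-wide flash: {total_ssd_tb} TB")
```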

Gordon Bests the Previous Record for Mega I/O Operations per Second (IOPS) by 25x

Rapid Evolution of 10GbE Port Prices Makes Campus-Scale 10 Gbps CI Affordable
- ~$80K/port: Chiaro (60 ports max)
- ~$5K/port: Force10 (40 ports max)
- ~$500/port: Arista, 48 ports (~$1,000/port for 300+ port chassis)
- ~$400/port: Arista, 48 ports
Port pricing is falling and density is rising dramatically; the cost of 10GbE is approaching that of cluster HPC interconnects.
Source: Philip Papadopoulos, SDSC/Calit2
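To make the affordability point concrete, a small illustrative calculation: the per-port prices are the approximate figures from this slide, while the 500-port campus fabric is a hypothetical example, not a quoted design, and chassis, optics, and cabling costs are ignored.

```python
# Illustrative cost of a hypothetical 500-port campus 10GbE fabric at the
# approximate per-port prices shown on this slide (switch generations differ
# in density, so this ignores chassis and optics costs).
ports_needed = 500  # hypothetical campus-scale deployment

price_per_port = {
    "Chiaro (early)": 80_000,
    "Force10":         5_000,
    "Arista 48-port":    400,
}

for switch, price in price_per_port.items():
    print(f"{switch:16s}: ${ports_needed * price:>12,} for {ports_needed} ports")
# The same fabric drops from tens of millions of dollars to roughly $200K.
```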

10G Optical Switch is the Center of a Switched Big Data Analysis Resource
[Diagram: a 10 Gbps Arista switch at the center of the UCSD research cyberinfrastructure (RCI) interconnects OptIPuter, co-location space, CENIC/NLR, Triton, Trestles (100 TF), Dash, Gordon, existing commodity storage (1/3 PB), and the Oasis data store (2,000 TB, >50 GB/s)]
Oasis procurement (RFP): Phase 0: >8 GB/s sustained today; Phase I: >50 GB/s for Lustre (May 2011); Phase II: >100 GB/s (Feb 2012)
Radical change enabled by Arista 10G switches
Source: Philip Papadopoulos, SDSC/Calit2

The Next Step for Data-Intensive Science: Pioneering the HPC Cloud

Maturing Cloud Computing Capabilities: Beyond Nebula
Nebula testing conclusions:
- A good first step for evaluating the needs of science-driven cloud applications
- AWS performance is better than Nebula's
- Nebula cannot achieve AWS's economy of scale
- NASA cannot invest sufficient funding to compete with AWS's ability to move up the learning curve
- JPL is carrying out tests of AWS for mission data analysis
Next step: a NASA cloud test-bed
- Evaluate improvements to cloud software stacks for S&E applications
- Provide assistance to NASA cloud software developers
- Assist appropriate NASA S&E users in migrating to clouds
- Operate cloud instances at ARC and GSFC
- Limited funding under the HEC Program as a technology development
- Liaison with other NASA centers and external S&E cloud users
- Partnership between OCIO and SMD
Source: Tsengdar Lee, Mike Little, NASA
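Performance conclusions like the Nebula-versus-AWS comparison above rest on simple measurements repeated on each platform. Below is a minimal sketch of one such test, a sequential-write throughput microbenchmark that could be run unchanged on either cloud; the file path, block size, and total size are arbitrary choices for illustration, not part of the committee's or NASA's methodology.

```python
# Minimal sequential-write throughput microbenchmark (illustrative only).
# Run the same script on two cloud instances and compare the reported MiB/s.
import os
import time

PATH = "throughput_test.bin"        # hypothetical scratch file
BLOCK = b"\0" * (4 * 1024 * 1024)   # 4 MiB blocks
BLOCKS = 256                        # 1 GiB total

start = time.monotonic()
with open(PATH, "wb") as f:
    for _ in range(BLOCKS):
        f.write(BLOCK)
    f.flush()
    os.fsync(f.fileno())            # include time to reach stable storage
elapsed = time.monotonic() - start

mib_written = BLOCKS * len(BLOCK) / (1024 * 1024)
print(f"Wrote {mib_written:.0f} MiB in {elapsed:.2f} s "
      f"({mib_written / elapsed:.1f} MiB/s)")
os.remove(PATH)
```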

Partnering Opportunities with Universities: Johns Hopkins University DataScope
- A private science cloud for sustained analysis of petabyte data sets
- Built for under $1M
- 6.5 PB of storage, 500 GB/sec sequential bandwidth
- Disk I/O plus SSDs streaming data into an array of GPUs
- Connected to StarLight at 100G (May 2012)
- Some form of a scalable cloud solution is inevitable: who will operate it, under what business model, and at what scale? How does the on/off ramp work?
- Science has different tradeoffs than e-commerce: astronomy, space science, turbulence, earth science, genomics, and analysis of large HPC simulations
Source: Alex Szalay, JHU
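The design point becomes concrete with a quick calculation from the figures above: at the quoted sequential bandwidth, the entire store can be scanned in a few hours. This is a rough estimate using decimal units and ignoring contention.

```python
# Rough full-scan time for DataScope from the figures on this slide.
storage_pb = 6.5            # total storage, PB
bandwidth_gb_s = 500        # sequential bandwidth, GB/s

storage_gb = storage_pb * 1_000_000         # decimal units: 1 PB = 1e6 GB
scan_seconds = storage_gb / bandwidth_gb_s  # 13,000 s
print(f"Full scan: {scan_seconds:,.0f} s (~{scan_seconds / 3600:.1f} hours)")
```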

NASA Corporate Wide Area Network Backbone (Dedicated 10 Gbps between the ARC and GSFC Supercomputing Centers)
[Map: NASA corporate WAN backbone with the dedicated 10 Gbps ARC-GSFC link highlighted]
Source: Tsengdar Lee, Mike Little, NASA

Partnering Opportunities with DOE: ARRA Stimulus Investment in DOE ESnet's National-Scale 100 Gbps Network Backbone
Source: Presentation to the ESnet Policy Board

Aims of the DOE ESnet 100G National Backbone
- To stimulate the market for 100G
- To build a pre-production 100G network linking DOE supercomputing facilities with international peering points
- To build a national-scale test bed for disruptive research
- ESnet is interested in federal partnerships that accelerate scientific discovery and leverage its particular capabilities
Source: Presentation to the ESnet Policy Board
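For a sense of scale, a back-of-the-envelope comparison (illustrative numbers, assuming the link can be driven at its full rate): moving a 1 PB data set over a 100 Gbps backbone versus a 10 Gbps link:

```latex
t_{100\,\mathrm{Gbps}} = \frac{10^{15}\,\mathrm{bytes} \times 8\,\mathrm{bits/byte}}{100 \times 10^{9}\,\mathrm{bits/s}}
                       = 8 \times 10^{4}\,\mathrm{s} \approx 22\ \mathrm{hours},
\qquad
t_{10\,\mathrm{Gbps}} = 10\, t_{100\,\mathrm{Gbps}} \approx 9.3\ \mathrm{days}.
```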

Global Partnering Opportunities: The Global Lambda Integrated Facility (GLIF)
Research innovation labs linked by dedicated 10 Gbps networks

NAC Committee on IT Infrastructure Recommendation #1
Recommendation: To enable NASA to gain experience with emerging leading-edge IT technologies such as:
- data-intensive cyberinfrastructure,
- 100 Gbps networking,
- GPU clusters, and
- hybrid HPC architectures,
we recommend that NASA aggressively pursue partnerships with other Federal agencies, specifically NSF and DOE, as well as public/private opportunities. We believe joint agency program calls for end users to develop innovative applications will help keep NASA at the leading edge of capabilities and enable training of NASA staff to support NASA researchers as these technologies become mainstream.

NAC Committee on IT Infrastructure Recommendation #1 (Continued)
- Major reasons for the recommendation: NASA has fallen behind the leading edge, compared with other Federal agencies and international centers, in key emerging information and networking technologies. In a budget-constrained fiscal environment, it is unlikely that NASA will be able to catch up through internal efforts alone. Partnering, as was historically done in HPCC, seems an attractive option.
- Consequences of no action on the recommendation: Within a few more years, the gap between NASA's internally driven efforts and the U.S. and global best-of-breed will become too large to bridge. This will severely undercut NASA's ability to make progress in a number of critical application arenas.

Finding #2
- SMD data resides on highly distributed servers; many data storage and analysis sites are outside NASA centers
- Access for the entire research community is essential; over half of science publications come from using data archives
- Secondary storage is needed in the cloud, with high bandwidth and a user portal
- Education and public outreach uses of the data are rapidly expanding: images for public relations, apps for smart phones, and crowdsourcing
[Diagram: data flows to the PI, the research community, education, and public outreach]

Majority of Hubble Space Telescope Scientific Publications Come From Data Archives
In 2011 there were over 1,060 papers written using data archived at MAST.

Multi-Mission Data Archives at STScI Will Continue to Grow, Doubling by 2018
[Chart: JWST adds a cumulative petabyte over 20 years]
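As a rough reading of the growth claim (assuming 2012 as the baseline year, which the slide does not state explicitly), doubling by 2018 corresponds to a sustained annual growth rate of about 12%:

```latex
2^{1/6} \approx 1.12
\quad\Rightarrow\quad
\text{about } 12\%\ \text{growth per year over the six years from 2012 to 2018.}
```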

Solar Dynamics Observatory: 4096x4096 AIA Cameras, 57,600 Images/Day
[Image: the X5.4 flare from sunspot AR1429 on March 6, 2012, captured by the Solar Dynamics Observatory (SDO) at the 171 Angstrom wavelength. Credit: NASA/SDO/AIA]
JSOC is archiving ~5 TB/day from 6 cameras, which adds up to over 1 petabyte per year.
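The annual figure follows directly from the daily rate quoted above:

```latex
5\,\mathrm{TB/day} \times 365\,\mathrm{days/year} \approx 1.8\,\mathrm{PB/year}.
```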

Public Outreach Through Multiple Modes, Simultaneous with Data Delivery to Research Scientists
3D Sun app: tens of thousands of installs on Android in the last 30 days

NASA Space Images Are Widely Viewed by the Public

32 of the 200+ Apps in the Apple App Store Returned by a Search on "NASA"

Crowdsourcing Science: Galaxy Zoo and Moon Zoo Bring the Public into Scientific Discovery
More than 250,000 people have taken part in Galaxy Zoo so far. In the 14 months the site was up, Galaxy Zoo 2 users helped make over 60,000,000 classifications. Over the past year, volunteers from the original Galaxy Zoo project created the world's largest database of galaxy shapes.

NASA Earth Sciences Images Are Brought Together by the Earth Observatory Web Site

NASA Earth Observatory Integrates Global Variables: Time Evolution Over Last Decade

EOSDIS Data Product Distribution Is Approaching ½ Billion Products/Year

The Virtual Observatory (Robert Hanisch, NAC presentation at JHU, 5 March 2012)
- The VO is foremost a data discovery, access, and integration facility
- International collaboration on metadata standards, data models, and protocols:
  - image, spectrum, and time-series data
  - catalogs and databases
  - transient event notices
  - software and services
  - distributed computing (authentication, authorization, process management)
  - application inter-communication
- The International Virtual Observatory Alliance was established in 2001, patterned on the World Wide Web Consortium (W3C)
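As a concrete illustration of the data discovery and access protocols mentioned above, here is a minimal sketch of an IVOA Simple Cone Search query. The service URL is a placeholder assumption and the RA/Dec/radius values are arbitrary; the protocol itself is a plain HTTP GET with RA, DEC, and SR parameters that returns a VOTable.

```python
# Minimal IVOA Simple Cone Search sketch: an HTTP GET with RA, DEC, SR
# parameters that returns a VOTable (XML) of catalog sources in the cone.
import requests

SERVICE_URL = "https://example.org/vo/conesearch"  # placeholder; any SCS-compliant service
params = {
    "RA": 202.48,   # right ascension of the cone center, degrees (arbitrary example)
    "DEC": 47.23,   # declination, degrees
    "SR": 0.1,      # search radius, degrees
}

response = requests.get(SERVICE_URL, params=params, timeout=30)
response.raise_for_status()
votable_xml = response.text        # VOTable document listing matching sources
print(votable_xml[:500])           # show the start of the returned VOTable
```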

SAMP: Simple Application Messaging Protocol (Robert Hanisch, NAC presentation at JHU, 5 March 2012)
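For context on what SAMP does in practice, a minimal sketch of broadcasting a "load this table" message to other desktop VO tools. It assumes astropy's SAMP client is available and a SAMP hub is already running (for example, one started by TOPCAT); the table URL is a placeholder.

```python
# Minimal SAMP sketch: broadcast a "table.load.votable" message so that any
# connected VO tools (TOPCAT, Aladin, ...) load the same table.
# Assumes astropy is installed and a SAMP hub is already running.
from astropy.samp import SAMPIntegratedClient

client = SAMPIntegratedClient()
client.connect()                      # register with the running hub

message = {
    "samp.mtype": "table.load.votable",            # standard SAMP message type
    "samp.params": {
        "url": "https://example.org/catalog.xml",  # placeholder VOTable URL
        "name": "example catalog",
    },
}
client.notify_all(message)            # fire-and-forget broadcast to all clients
client.disconnect()
```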

NSF's Ocean Observatories Initiative Cyberinfrastructure Supports Science, Education, and Public Outreach
Source: Matthew Arrott, Calit2 Program Manager for OOI CI

OOI CI is Built on Dedicated Optical Networks and Federal Agency & Commercial Clouds Source: John Orcutt, Matthew Arrott, SIO/Calit2

NAC Committee on IT Infrastructure DRAFT* Recommendation #2
- Recommendation: NASA should formally review the existing national data cyberinfrastructure supporting access to data repositories for NASA SMD missions. A comparison with best-of-breed practices within NASA and at other Federal agencies should be made.
- We request a briefing on this review at a joint meeting of the NAC IT Infrastructure, Science, and Education committees within one year of this recommendation. The briefing should contain recommendations for a NASA data-intensive cyberinfrastructure that supports science discovery by both mission teams and remote researchers, as well as education and public outreach, appropriate to the growth driven by current and future SMD missions.
* To be completed after a joint meeting of the ITIC, Science, and Education Committees in July 2012, with the final recommendation submitted to the July 2012 NAC meeting.

NAC Committee on IT Infrastructure Recommendation #2 (Continued)
- Major reasons for the recommendation: NASA data repository and analysis facilities for SMD missions are distributed across NASA centers and throughout U.S. universities and research facilities.
- There is considerable variation in the sophistication of the integrated cyberinfrastructure supporting scientific discovery, educational reuse, and public outreach across SMD subdivisions.
- The rapid rise in the last decade of "mining data archives" by groups other than those funded by specific missions implies a need for a national-scale cyberinfrastructure architecture that allows data to flow freely to where it is needed.
- Other agencies' programs, specifically NSF's Ocean Observatories Initiative Cyberinfrastructure program, should be used as benchmarks for NASA's data-intensive architecture.
- Consequences of no action on the recommendation: The science, education, and public outreach potential of NASA's investment in SMD space missions will not be realized.