“High Performance Cyberinfrastructure Enables Data-Driven Science in the Globally Networked World”. Invited Speaker, Grand Challenges in Data-Intensive Discovery Conference.

Presentation transcript:

“High Performance Cyberinfrastructure Enables Data-Driven Science in the Globally Networked World”
Invited Speaker, Grand Challenges in Data-Intensive Discovery Conference
San Diego Supercomputer Center, UC San Diego, La Jolla, CA, October 28, 2010
Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology; Harry E. Gruber Professor, Dept. of Computer Science and Engineering, Jacobs School of Engineering, UCSD
Follow me on Twitter: lsmarr

Abstract: Today we are living in a data-dominated world where distributed scientific instruments, as well as supercomputers, generate terabytes to petabytes of data. It was in response to this challenge that the NSF funded the OptIPuter project to research how user-controlled 10Gbps dedicated lightpaths (or “lambdas”) could provide direct access to global data repositories, scientific instruments, and computational resources from “OptIPortals,” PC clusters that provide scalable visualization, computing, and storage in the user's campus laboratory. The use of dedicated lightpaths over fiber optic cables enables individual researchers to experience “clear channel” 10,000 megabits/sec, orders of magnitude faster than today’s shared Internet—a critical capability for data-intensive science. The seven-year OptIPuter computer science research project is now over, but it stimulated a national and global build-out of dedicated fiber optic networks. U.S. universities now have access to high-bandwidth lambdas through the National LambdaRail, Internet2's WaveCo, and the Global Lambda Integrated Facility. A few pioneering campuses are now building on-campus lightpaths to connect the data-intensive researchers, data generators, and vast storage systems to each other on campus, as well as to the national network campus gateways. I will give examples of the application use of this emerging high performance cyberinfrastructure in genomics, ocean observatories, radio astronomy, and cosmology.
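To make the bandwidth argument concrete, here is a small back-of-the-envelope calculation (not from the talk; a minimal Python sketch) comparing how long a terabyte-scale dataset takes to move over a dedicated 10 Gbps lightpath versus a typical shared-Internet rate of roughly 100 Mbps, which is an assumed figure for illustration.

```python
# Back-of-the-envelope transfer times for a data-intensive workflow.
# The 100 Mbps "shared Internet" figure is an illustrative assumption,
# not a number from the presentation.

def transfer_time_hours(data_bytes: float, link_bits_per_sec: float) -> float:
    """Idealized transfer time (no protocol overhead or congestion)."""
    return data_bytes * 8 / link_bits_per_sec / 3600

one_terabyte = 1e12  # bytes

for label, rate in [("dedicated 10 Gbps lightpath", 10e9),
                    ("shared Internet at ~100 Mbps", 100e6)]:
    hours = transfer_time_hours(one_terabyte, rate)
    print(f"1 TB over {label}: {hours:.2f} hours")
```

Under these idealized assumptions, a terabyte moves in about a quarter of an hour on a dedicated 10 Gbps lightpath, versus most of a day at shared-Internet rates.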

Academic Research “OptIPlatform” Cyberinfrastructure: A 10Gbps “End-to-End” Lightpath Cloud. [Diagram: 10G lightpaths over the National LambdaRail and a campus optical switch connect the end user's OptIPortal to data repositories & clusters, HPC, instruments, HD/4K video cameras and images, and HD/4K telepresence.]

The OptIPuter Project: Creating High Resolution Portals Over Dedicated Optical Channels to Global Science Data. Calit2 (UCSD, UCI), SDSC, and UIC Leads—Larry Smarr PI. Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST. Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent. Scalable Adaptive Graphics Environment (SAGE). Picture source: Mark Ellisman, David Lee, Jason Leigh.

On-Line Resources Help You Build Your Own OptIPortal: OptIPortals Are Built From Commodity PC Clusters and LCDs To Create a 10Gbps Scalable Termination Device.
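As a toy illustration of how tiling commodity LCDs scales up display resolution, a few lines of Python compute the aggregate megapixels of a wall (the panel counts and resolutions below are hypothetical, not the specs of any particular OptIPortal):

```python
# Aggregate resolution of a tiled display wall built from commodity LCDs.
# The 5 x 4 layout and the panel resolutions are illustrative assumptions,
# not the configuration of a specific OptIPortal.

def wall_megapixels(cols: int, rows: int, panel_w: int, panel_h: int) -> float:
    return cols * rows * panel_w * panel_h / 1e6

print(wall_megapixels(5, 4, 2560, 1600))   # ~81.9 megapixels
print(wall_megapixels(3, 4, 1920, 1080))   # ~24.9 megapixels
```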

Nearly Seamless AESOP OptIPortal: 46” NEC Ultra-Narrow Bezel 720p LCD Monitors. Source: Tom DeFanti.

3D Stereo Head-Tracked OptIPortal: NexCAVE. Array of JVC HDTV 3D LCD screens; KAUST NexCAVE = 22.5 MPixels. Source: Tom DeFanti.

Project StarGate Goals: Combining Supercomputers and Supernetworks
–Create an “End-to-End” 10Gbps Workflow
–Explore Use of OptIPortals as Petascale Supercomputer “Scalable Workstations”
–Exploit Dynamic 10Gbps Circuits on ESnet
–Connect Hardware Resources at ORNL, ANL, SDSC
–Show that Data Need Not Be Trapped by the Network “Event Horizon”
Rick Wagner, Mike Norman. ANL * Calit2 * LBNL * NICS * ORNL * SDSC. Source: Michael Norman, SDSC, UCSD

Using Supernetworks to Couple End User's OptIPortal to Remote Supercomputers and Visualization Servers
–Simulation (NICS/ORNL): NSF TeraGrid Kraken, Cray XT5: 8,256 compute nodes, 99,072 compute cores, 129 TB RAM
–Rendering (Argonne NL): DOE Eureka: 100 dual quad-core Xeon servers, 200 NVIDIA Quadro FX GPUs in 50 Quadro Plex S4 1U enclosures, 3.2 TB RAM
–Visualization (SDSC): Calit2/SDSC OptIPortal: (2560 x 1600 pixel) LCD panels, 10 NVIDIA Quadro FX 4600 graphics cards, > 80 megapixels, 10 Gb/s network throughout
–Wide area: ESnet 10 Gb/s fiber optic network
ANL * Calit2 * LBNL * NICS * ORNL * SDSC. Source: Mike Norman, Rick Wagner, SDSC

National-Scale Interactive Remote Rendering of Large Datasets: Real-Time Volume Rendering Streamed from ANL to SDSC
–Rendering (ALCF): Eureka, 100 dual quad-core Xeon servers, 200 NVIDIA FX GPUs, 3.2 TB RAM
–Network: ESnet Science Data Network (SDN), > 10 Gb/s fiber optic network, dynamic VLANs configured using OSCARS
–Visualization (SDSC): OptIPortal (40M pixel LCDs), 10 NVIDIA FX 4600 cards, 10 Gb/s network throughout
–Last year: high-resolution (4K+, 15+ FPS)—but command-line driven, fixed color maps and transfer functions, slow exploration of data
–Last week: now driven by a simple web GUI (rotate, pan, zoom) that works from most browsers; manipulate colors and opacity; fast renderer response time
Source: Rick Wagner, SDSC
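The slide does not show the actual StarGate software, but the interaction pattern it describes (a lightweight client sending camera and color-map changes to a remote GPU renderer and receiving compressed frames back) can be sketched roughly as follows. All host names, message fields, and the framing protocol here are hypothetical, for illustration only.

```python
# Illustrative sketch of an interactive remote-rendering loop, in the spirit
# of the ANL-to-SDSC streaming described on the slide. The protocol, host
# name, and message fields are hypothetical; the real StarGate stack differed.
import json
import socket
import struct

RENDER_HOST = "renderer.example.org"   # hypothetical remote GPU render server
RENDER_PORT = 9100                     # hypothetical port

def request_frame(sock: socket.socket, azimuth: float, elevation: float,
                  zoom: float, colormap: str) -> bytes:
    """Send one camera/colormap update and read back one compressed frame."""
    msg = json.dumps({"azimuth": azimuth, "elevation": elevation,
                      "zoom": zoom, "colormap": colormap}).encode()
    sock.sendall(struct.pack("!I", len(msg)) + msg)
    # Frame comes back as a 4-byte length prefix followed by JPEG bytes.
    (nbytes,) = struct.unpack("!I", sock.recv(4))
    frame = b""
    while len(frame) < nbytes:
        frame += sock.recv(nbytes - len(frame))
    return frame

if __name__ == "__main__":
    with socket.create_connection((RENDER_HOST, RENDER_PORT)) as s:
        # Slowly orbit the camera; each loop iteration fetches a new frame.
        for step in range(360):
            jpeg = request_frame(s, azimuth=float(step), elevation=20.0,
                                 zoom=1.0, colormap="viridis")
            print(f"frame {step}: {len(jpeg)} bytes")
```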

NSF OOI is a $400M Program; OOI CI is a $34M Part of This. Source: Matthew Arrott, Calit2 Program Manager for OOI CI. Software Engineers Housed at

OOI CI Physical Network Implementation: OOI CI is Built on NLR/I2 Optical Infrastructure. Source: John Orcutt, Matthew Arrott, SIO/Calit2.

California and Washington Universities Are Testing a 10Gbps Connected Commercial Data Cloud
Amazon Experiment for Big Data:
–Only Available Through CENIC & Pacific NW GigaPOP
–Private 10Gbps Peering Paths
–Includes Amazon EC2 Computing & S3 Storage Services
Early Experiments Underway:
–Robert Grossman, Open Cloud Consortium
–Phil Papadopoulos, Calit2/SDSC Rocks
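For readers unfamiliar with the EC2 and S3 services named above, staging a dataset into S3 and starting a compute instance looks roughly like this minimal sketch. It is written against today's boto3 SDK rather than the 2010-era tooling, and the bucket name, AMI ID, and instance type are placeholder assumptions.

```python
# Minimal sketch: stage a dataset into S3 and launch an EC2 instance to
# process it. Uses the modern boto3 SDK for illustration; the 2010
# experiments used earlier tooling. Bucket, AMI, and instance type below
# are placeholders.
import boto3

s3 = boto3.client("s3")
ec2 = boto3.client("ec2", region_name="us-west-2")

BUCKET = "example-big-data-bucket"        # placeholder bucket name
s3.upload_file("ocean_model_input.nc", BUCKET, "inputs/ocean_model_input.nc")

# Launch a single compute instance; the experiments on the slide used
# HPC-oriented instance types reachable over private 10Gbps peering.
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # placeholder AMI
    InstanceType="c5.4xlarge",            # placeholder instance type
    MinCount=1,
    MaxCount=1,
)
print("Launched:", resp["Instances"][0]["InstanceId"])
```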

Open Cloud OptIPuter Testbed: Manage and Compute Large Datasets Over 10Gbps Lambdas
–Networks: NLR C-Wave, MREN, CENIC, Dragon
–Open Source SW: Hadoop, Sector/Sphere, Nebula, Thrift, GPB, Eucalyptus, Benchmarks
–Scale: 9 Racks, 500 Nodes, 10+ Gb/s; Now Upgrading Portions to 100 Gb/s in 2010/2011
Source: Robert Grossman, UChicago
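As one concrete example of the open-source stack listed above, a word-count style job can be expressed with Hadoop Streaming and a small Python script. This is a generic illustration of the Hadoop programming model, not a workload taken from the testbed.

```python
# wordcount_streaming.py -- generic Hadoop Streaming mapper/reducer in Python,
# illustrating the programming model of the Hadoop software listed on the
# slide (not a specific testbed workload). Invoke via Hadoop Streaming:
#
#   hadoop jar hadoop-streaming.jar \
#     -input /data/in -output /data/out \
#     -mapper "python wordcount_streaming.py map" \
#     -reducer "python wordcount_streaming.py reduce" \
#     -file wordcount_streaming.py
import sys

def mapper() -> None:
    # Emit (word, 1) for every word on stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer() -> None:
    # Input arrives sorted by key; sum the counts per word.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```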

Ocean Modeling HPC in the Cloud: Tropical Pacific SST (2-Month Average, 2002). MIT GCM, 1/3-degree horizontal resolution, 51 levels, forced by NCEP2. Grid is 564x168x51; model state is T, S, U, V, W and sea surface height. Run on EC2 HPC instance, in collaboration with OOI CI/Calit2. Source: B. Cornuelle, N. Martinez, C. Papadopoulos, COMPAS, SIO

Run Timings of Tropical Pacific: Local SIO ATLAS Cluster and Amazon EC2 Cloud
–Configurations compared (wall, user, and system time*): ATLAS Ethernet/NFS; ATLAS Myrinet/NFS; ATLAS Myrinet/local disk; EC2 HPC Ethernet, 1 node; EC2 HPC Ethernet, local disk
–ATLAS: 128-node SIO COMPAS cluster, Myrinet 10G, 8 GB/node, ~3 yrs old
–EC2: HPC Computing Instance, 2.93 GHz Nehalem, 24 GB/node, 10GbE
–Compilers: Ethernet – GNU FORTRAN with OpenMPI; Myrinet – PGI FORTRAN with MPICH1
–Single-node EC2 was oversubscribed with 48 processes; all other parallel instances used 6 physical nodes, 8 cores/node
–Model code has been ported to run on ATLAS, Triton, and in EC2
*All times in seconds. Source: B. Cornuelle, N. Martinez, C. Papadopoulos, COMPAS, SIO
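A benchmarking run of this kind, where the same MPI executable is launched on each configuration while wall, user, and system time are recorded, can be scripted along the following lines. This is a hedged sketch: the executable name, hostfiles, and process counts are placeholders, not the actual COMPAS setup.

```python
# Sketch of a timing harness for comparing cluster vs. cloud MPI runs, in the
# spirit of the ATLAS/EC2 comparison on the slide. Executable, hostfiles, and
# the 6-node x 8-core layout are placeholder assumptions.
import resource
import subprocess
import time

CONFIGS = {
    "atlas_ethernet_nfs": ["mpirun", "-np", "48", "--hostfile", "atlas_eth.hosts", "./mitgcm"],
    "atlas_myrinet_nfs":  ["mpirun", "-np", "48", "--hostfile", "atlas_myri.hosts", "./mitgcm"],
    "ec2_hpc_ethernet":   ["mpirun", "-np", "48", "--hostfile", "ec2.hosts", "./mitgcm"],
}

def timed_run(cmd: list[str]) -> tuple[float, float, float]:
    """Return (wall, user, system) seconds for one launch and its local children."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.time()
    subprocess.run(cmd, check=True)
    wall = time.time() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return wall, after.ru_utime - before.ru_utime, after.ru_stime - before.ru_stime

if __name__ == "__main__":
    for name, cmd in CONFIGS.items():
        wall, user, system = timed_run(cmd)
        print(f"{name}: wall={wall:.1f}s user={user:.1f}s system={system:.1f}s")
```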

Using Condor and Amazon EC2 on the Adaptive Poisson-Boltzmann Solver (APBS)
–APBS Rocks Roll (NBCR) + EC2 Roll + Condor Roll = Amazon VM
–Cluster extension into Amazon using Condor
–APBS + EC2 + Condor running in the Amazon cloud, with the NBCR VM spanning the local cluster and the EC2 cloud
Source: Phil Papadopoulos, SDSC/Calit2

Moving into the Clouds: Rocks and EC2
We Can Build Physical Hosting Clusters & Multiple, Isolated Virtual Clusters:
–Can I Use Rocks to Author “Images” Compatible with EC2? (We Use Xen, They Use Xen)
–Can I Automatically Integrate EC2 Virtual Machines into My Local Cluster (Cluster Extension)?
–Submit Locally
–My Own Private + Public Cloud
What This Will Mean:
–All Your Existing Software Runs Seamlessly Among Local and Remote Nodes
–User Home Directories Can Be Mounted
–Queue Systems Work
–Unmodified MPI Works
Source: Phil Papadopoulos, SDSC/Calit2
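The "submit locally, run anywhere" idea above, where jobs queued on the local cluster may land on either local nodes or EC2-hosted extension nodes, can be illustrated with a small HTCondor submit description generated and submitted from Python. This is a sketch only: the executable and file names are placeholders, not the actual Rocks EC2/Condor roll configuration.

```python
# Sketch of "submit locally, run on local or EC2 extension nodes" with
# HTCondor. Executable and file names are placeholders; the actual Rocks
# EC2/Condor roll configuration may differ.
import subprocess
import textwrap

submit_description = textwrap.dedent("""\
    universe                = vanilla
    executable              = run_apbs.sh
    arguments               = input.pqr
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    output                  = apbs.$(Cluster).out
    error                   = apbs.$(Cluster).err
    log                     = apbs.$(Cluster).log
    queue 1
""")
# No placement constraints: Condor is free to match the job to an on-campus
# node or to an EC2-hosted node that has joined the same pool.

with open("apbs.submit", "w") as f:
    f.write(submit_description)

# condor_submit is the standard HTCondor submission command.
subprocess.run(["condor_submit", "apbs.submit"], check=True)
```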

“Blueprint for the Digital University”: Report of the UCSD Research Cyberinfrastructure Design Team (April 2009). Focus on data-intensive cyberinfrastructure: no data bottlenecks, design for gigabit/s data flows.

Current UCSD Optical Core: Bridging End-Users to CENIC L1, L2, L3 Services
–Quartzite Network MRI #CNS; OptIPuter #ANI
–Switch hardware: Lucent, Glimmerglass, Force10
–Endpoints: >= 60 endpoints at 10 GigE, >= 32 packet switched, >= 32 switched wavelengths, >= 300 connected endpoints
–Approximately 0.5 Tbit/s arrive at the “optical” center of campus; switching is a hybrid of packet, lambda, and circuit (OOO and packet switches)
Source: Phil Papadopoulos, SDSC/Calit2 (Quartzite PI, OptIPuter co-PI)

UCSD Campus Investment in Fiber Enables Consolidation of Energy Efficient Computing & Storage. [Diagram: N x 10Gb campus fiber links DataOasis (central) storage, OptIPortal tile display walls, campus lab clusters, digital data collections, Triton (petascale data analysis), Gordon (HPD system), the cluster condo, and scientific instruments, with 10Gb WAN connections to CENIC, NLR, and I2.] Source: Philip Papadopoulos, SDSC/Calit2

UCSD Planned Optical Networked Biomedical Researchers and Instruments
–Sites: Cellular & Molecular Medicine West, National Center for Microscopy & Imaging, Biomedical Research, Center for Molecular Genetics, Pharmaceutical Sciences Building, Cellular & Molecular Medicine East, CryoElectron Microscopy Facility, Radiology Imaging Lab, Bioengineering, San Diego Supercomputer Center
–Connects at 10 Gbps: microarrays, genome sequencers, mass spectrometry, light and electron microscopes, whole body imagers, computing, storage

Moving to a Shared Campus Data Storage and Analysis Resource: Triton (SDSC)
–Large Memory PSDAF: 256/512 GB/sys, 9 TB total, 128 GB/sec, ~9 TF, x28
–Shared Resource Cluster: 24 GB/node, 6 TB total, 256 GB/sec, ~20 TF, x256
–Large Scale Storage: 2 PB, 40 – 80 GB/sec, 3000 – 6000 disks; Phase 0: 1/3 TB, 8 GB/s
–Connected to UCSD research labs over the Campus Research Network
Source: Philip Papadopoulos, SDSC/Calit2

Calit2 Microbial Metagenomics Cluster: Next Generation Optically Linked Science Data Server. 512 processors, ~5 teraflops, ~200 terabytes storage (~200 TB Sun X4500 storage on 10GbE), 1GbE and 10GbE switched/routed core. Source: Phil Papadopoulos, SDSC, Calit2

Calit2 CAMERA Automatic Overflows into SDSC Triton: the CAMERA-managed job submit portal (a VM) transparently sends jobs over 10Gbps to a submit portal on the Triton resource at SDSC; a direct mount means no data staging.
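The overflow behavior can be pictured with a small scheduling sketch: if the local CAMERA cluster is saturated, forward the job to the Triton submit portal, relying on the shared direct mount so no input files need to be copied. Everything below (class and function names, the queue-depth threshold, the toy portal behavior) is an illustrative assumption, not the actual CAMERA software.

```python
# Hypothetical sketch of CAMERA-style overflow scheduling: run a job locally
# when there is headroom, otherwise forward it to the Triton submit portal.
# All names, the threshold, and the portal behavior are illustrative
# assumptions, not the actual CAMERA software.

OVERFLOW_QUEUE_DEPTH = 100  # assumed threshold for "local cluster is full"

class SubmitPortal:
    """Toy stand-in for a job submit portal (the real one is a VM service)."""
    def __init__(self, name: str, queued_jobs: int):
        self.name = name
        self.queued_jobs = queued_jobs

    def submit(self, job_script: str) -> str:
        # Because storage is directly mounted on both clusters, the job
        # script can reference the same paths everywhere -- no data staging.
        self.queued_jobs += 1
        return f"{self.name} accepted {job_script}"

def submit_with_overflow(job_script: str, local: SubmitPortal,
                         triton: SubmitPortal) -> str:
    """Send the job to the local CAMERA cluster unless it is saturated."""
    target = local if local.queued_jobs < OVERFLOW_QUEUE_DEPTH else triton
    return target.submit(job_script)

if __name__ == "__main__":
    camera = SubmitPortal("CAMERA cluster", queued_jobs=250)   # saturated
    triton = SubmitPortal("Triton portal", queued_jobs=10)
    print(submit_with_overflow("blast_metagenome.sh", camera, triton))
```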

Prototyping Next Generation User Access and Large Data Analysis Between Calit2 and U Washington: Ginger Armbrust's diatoms (micrographs, chromosomes, genetic assembly). iHDTV: 1500 Mbits/sec, Calit2 to UW Research Channel over NLR. Photo credit: Alan Decker, Feb. 29, 2008.

Rapid Evolution of 10GbE Port Prices Makes Campus-Scale 10Gbps CI Affordable
–$80K/port: Chiaro (60 max)
–$5K: Force 10 (40 max)
–$500: Arista, 48 ports
–~$1000 (300+ max)
–$400: Arista, 48 ports
Port pricing is falling and density is rising dramatically; the cost of 10GbE is approaching that of cluster HPC interconnects.
Source: Philip Papadopoulos, SDSC/Calit2

10G Switched Data Analysis Resource: Data Oasis (RFP Responses Due 10/29/2010)
[Diagram: the Data Oasis storage fabric links the OptIPuter, colocation facility, RCN, CalREN, Triton, Trestles, Dash, and Gordon to existing storage of 1500 – 2000 TB at > 40 GB/s.]
Oasis Procurement (RFP):
–Phase 0: > 8 GB/s sustained, today
–RFP for Phase 1: > 40 GB/sec for Lustre
–Nodes must be able to function as Lustre OSS (Linux) or NFS (Solaris)
–Connectivity to the network is 2 x 10GbE/node
–Likely reserve dollars for inexpensive replica servers
Source: Philip Papadopoulos, SDSC/Calit2
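The connectivity requirement above implies a minimum node count, which a quick calculation makes explicit. This is a rough sketch that ignores protocol overhead and assumes the 40 GB/s target must be met by the 2 x 10GbE links alone.

```python
# Rough sizing for the Data Oasis Phase 1 target: how many Lustre OSS nodes
# with 2 x 10GbE each are needed to sustain > 40 GB/s aggregate?
# Ignores Ethernet/TCP overhead; a real design would derate the links.

target_gbytes_per_sec = 40.0
links_per_node = 2
link_gbits_per_sec = 10.0

node_gbytes_per_sec = links_per_node * link_gbits_per_sec / 8  # 2.5 GB/s per node
min_nodes = -(-target_gbytes_per_sec // node_gbytes_per_sec)   # ceiling division

print(f"Per-node line rate: {node_gbytes_per_sec} GB/s")
print(f"Minimum OSS nodes for {target_gbytes_per_sec} GB/s: {int(min_nodes)}")
```

On paper that is at least 16 such nodes, and more in practice once link efficiency and Lustre overheads are taken into account.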

You Can Download This Presentation at lsmarr.calit2.net