“A California-Wide Cyberinfrastructure for Data-Intensive Research” Invited Presentation CENIC Annual Retreat Santa Rosa, CA July 22, 2014 Dr. Larry Smarr.


“A California-Wide Cyberinfrastructure for Data-Intensive Research”
Invited Presentation
CENIC Annual Retreat
Santa Rosa, CA
July 22, 2014
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor, Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD

Vision: Creating a California-Wide Science DMZ
– Connected to CENIC, I2, & GLIF
– Use Lightpaths to Connect All Data Generators and Consumers, Creating a “Big Data” Plane Integrated With High Performance Global Networks
– “The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 10-Campus Scale”
– This Vision Has Been Building for Over a Decade

Calit2/SDSC Proposal to Create a UC Cyberinfrastructure of OptIPuter “On-Ramps” to TeraGrid Resources
– Campuses: UC Berkeley, UC Davis, UC Irvine, UC Los Angeles, UC Merced, UC Riverside, UC San Diego, UC San Francisco, UC Santa Barbara, UC Santa Cruz
– OptIPuter + CalREN-XD + TeraGrid = “OptiGrid”
– Creating a Critical Mass of End Users on a Secure LambdaGrid
Source: Fran Berman, SDSC (LS 2005 Slide)

CENIC Provides an Optical Backplane For the UC Campuses Upgrading to 100G

Global Innovation Centers are Connected with 10 Gigabit/sec Clear Channel Lightpaths
– Members of the Global Lambda Integrated Facility Meet Annually at Calit2’s Qualcomm Institute
Source: Maxine Brown, UIC, and Robert Patterson, NCSA

Why Now? Federating the Dozen+ California CC-NIE Grants
– 2011 ACCI Strategic Recommendation to the NSF #3: “NSF should create a new program funding high-speed (currently 10 Gbps) connections from campuses to the nearest landing point for a national network backbone. The design of these connections must include support for dynamic network provisioning services and must be engineered to support rapid movement of large scientific data sets.” (pg. 6, NSF Advisory Committee for Cyberinfrastructure Task Force on Campus Bridging, Final Report, March 2011)
– Led to Office of Cyberinfrastructure RFP, March 1, 2012: NSF’s Campus Cyberinfrastructure – Network Infrastructure & Engineering (CC-NIE) Program
– 85 Grants Awarded So Far (NSF Summit in June 2014), Roughly $500K per Campus
– California Must Move Rapidly or Lose a Ten-Year Advantage!

Creating a “Big Data” Plane (NSF CC-NIE Funded)
– NSF CC-NIE Has Awarded Optical Switch: CHERuB
– Phil Papadopoulos, SDSC, Calit2, PI

UC-Wide “Big Data Plane” Puts High Performance Data Resources Into Your Lab

How to Terminate 10Gbps in Your Lab: FIONA – Flash I/O Node Appliance (Inspired by Gordon)
– Combination of Desktop and Server Building Blocks
– US$5K–US$7K
– Desktop Flash up to 16TB
– RAID Drives up to 48TB
– Drives HD 2D & 3D Displays
– 10GbE/40GbE Adapter
– Tested Speed: 30 Gb/s
– Developed by UCSD’s Phil Papadopoulos, Tom DeFanti, and Joe Keefe
– Example Configuration: 3+ GB/s Data Appliance, 32 GB, 9 x 256 GB Flash at 510 MB/sec, 8 x 3 TB Disks at 125 MB/sec, 2 x 40GbE, 2 TB Cache, 24 TB Disk
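The component figures on this slide allow a rough sanity check of whether FIONA’s storage tiers can feed its network ports. A minimal back-of-envelope sketch in Python, using only numbers from the slide and assuming per-device rates aggregate linearly (RAID and filesystem overhead ignored):

```python
# Back-of-envelope check: can FIONA's storage tiers sustain its network ports?
# All figures come from the slide; this is an idealized aggregate best case.

GBIT = 1e9  # bits per gigabit

# Storage tiers: device count x per-device throughput (MB/s)
flash_mbs = 9 * 510   # 9 x 256 GB flash drives at 510 MB/s each
hdd_mbs = 8 * 125     # 8 x 3 TB disks at 125 MB/s each

# Convert MB/s to Gb/s (1 MB = 1e6 bytes, 8 bits per byte)
flash_gbps = flash_mbs * 8e6 / GBIT   # ~36.7 Gb/s
hdd_gbps = hdd_mbs * 8e6 / GBIT       # ~8.0 Gb/s

network_gbps = 2 * 40                 # two 40GbE adapters

print(f"flash tier: {flash_gbps:.1f} Gb/s")
print(f"disk tier:  {hdd_gbps:.1f} Gb/s")
print(f"network:    {network_gbps} Gb/s")
# The flash tier alone roughly matches the ~30 Gb/s tested speed quoted on
# the slide; the disk tier needs the flash cache in front of it to feed 40GbE.
```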

100G CENIC to UCSD—NSF CC-NIE Configurable, High-speed, Extensible Research Bandwidth (CHERuB) Source: Mike Norman, SDSC

NSF CC-NIE Funded UCI LightPath: A Dedicated Campus Science DMZ Network for Big Data Transfer Source: Dana Roode, UCI

NSF CC-NIE Funded UC Berkeley ExCEEDS - Extensible Data Science Networking Source: Jon Kuroda, UCB

NSF CC-NIE Funded UC Davis Science DMZ Architecture Source: Matt Bishop, UCD

NSF CC-NIE Funded: Adding a Science DMZ to Existing Shared Internet at UC Santa Cruz (Before / After) Source: Brad Smith, UCSC

Coupling to California CC-NIE Winning Proposals From Non-UC Campuses
Caltech:
– Caltech High-Performance OPtical Integrated Network (CHOPIN)
– CHOPIN Deploys Software-Defined Networking (SDN) Capable Switches
– Creates 100Gbps Link Between Caltech and CENIC, With Connections to the California OpenFlow Testbed Network (COTN) and the Internet2 Advanced Layer 2 Services (AL2S) Network
– Driven by Big Data High Energy Physics, Astronomy (LIGO, LSST), Seismology, and Geodetic Earth Satellite Observations
Stanford University:
– Develop SDN-Based Private Cloud
– Connect to Internet2 100G Innovation Platform
– Campus-Wide Sliceable/Virtualized SDN Backbone (10-15 Switches)
– SDN Control and Management
San Diego State University:
– Implementing an ESnet-Architecture Science DMZ
– Balancing Performance and Security Needs
– Promote Remote Usage of Computing Resources at SDSU
Also USC
Source: Louis Fox, CENIC CEO

High Performance Computing and Storage Become Plug Ins to the “Big Data” Plane

NERSC and ESnet Offer High Performance Computing and Networking: Cray XC Petaflops System Dedicated Feb. 5, 2014

SDSC’s Comet is a ~2 PetaFLOPs System Architected for the “Long Tail of Science”
– NSF Track 2 Award to SDSC
– $12M NSF Award to Acquire; $3M/yr x 4 yrs to Operate
– Production Early 2015

UCSD/SDSC Provides CoLo Facilities Over Multi-Gigabit/s Optical Networks
Capacity / Utilized / Headroom:
– Racks: 480 (=80%)
– Power (MW, fall 2014): 6.3 (13 to bldg)
– Cooling Capacity (MW) / UPS (total, MW) / UPS/Generator (MW)
Network Connectivity (Fall ’14):
– 100Gbps (CHERuB, layer 2 only): via CENIC to PacWave, Internet2 AL2S & ESnet
– 20Gbps (each): CENIC HPR (Internet2), CENIC DC (K-20+ISPs)
– 10Gbps (each): CENIC HPR-L2, ESnet L3, PacWave L2, XSEDENet, FutureGrid (IU)
Current Usage Profile (racks):
– UCSD: 248
– Other UC Campuses: 52
– Non-UC Nonprofit/Industry: 26
Protected-Data Equipment or Services (PHI, HIPAA): UCD, UCI, UCOP, UCR, UCSC, UCSD, UCSF, Rady Children’s Hospital

Triton Shared Computing Cluster: “Hotel” & “Condo” Models
Participation Model:
– Hotel: Pre-Purchase Computing Time as Needed and Run on a Subset of the Cluster; For Small/Medium & Short-Term Needs
– Condo: Purchase Nodes With Equipment Funds and Have “Run of the Cluster”; For Longer-Term Needs and Larger Runs; Annual Operations Fee is Subsidized (~75%) for UCSD
System Capabilities:
– Heterogeneous System for a Range of User Needs
– Intel Xeon, NVIDIA GPU, Mixed InfiniBand/Ethernet Interconnect
– 180 Total Nodes, ~80-90 TF Performance, 40+ Hotel Nodes
– 700TB High-Performance Data Oasis Parallel File System
– Persistent Storage via Recharge
User Profile:
– 16 Condo Groups (All UCSD)
– ~600 User Accounts
– Hotel Partition Users From 8 UC Campuses; UC Santa Barbara & Merced Most Active After UCSD
– ~70 Users From Outside Research Institutes and Industry

HPWREN Topology Covers San Diego, Imperial, and Part of Riverside Counties (map spans approximately 50 miles; note: locations are approximate to CI and PEMEX)

SoCal Weather Stations: Note the High Density in San Diego County Source: Jessica Block, Calit2

Interactive Virtual Reality of San Diego County Includes Live Feeds From 150 Met Stations TourCAVE at Calit2’s Qualcomm Institute

Real-Time Network Cameras on Mountains for Environmental Observations Source: Hans Werner Braun, HPWREN PI

Many Disciplines Require Dedicated High Bandwidth on Campus
– Remote Analysis of Large Data Sets: Particle Physics, Regional Climate Change
– Connection to Remote Campus Compute & Storage Clusters: Microscopy and Next-Gen Sequencers
– Providing Remote Access to Campus Data Repositories: Protein Data Bank, Mass Spectrometry, Genomics
– Enabling Remote Collaborations: National and International
– Extending Data-Intensive Research to Surrounding Counties: HPWREN
– Big Data Flows Add to Commodity Internet to Fully Utilize CENIC’s 100G Campus Connection
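The bandwidth argument on this slide is easy to quantify. A small illustrative Python sketch of transfer time versus link speed; the 50 TB dataset size is a hypothetical example (not from the slide), and the links are assumed ideal with no protocol overhead:

```python
def transfer_hours(terabytes, gbps):
    """Hours to move `terabytes` of data over an ideal `gbps` link."""
    bits = terabytes * 1e12 * 8        # 1 TB = 1e12 bytes, 8 bits/byte
    return bits / (gbps * 1e9) / 3600  # seconds at line rate -> hours

dataset_tb = 50  # hypothetical instrument dataset (e.g., a sequencing run archive)
for gbps in (1, 10, 100):
    print(f"{gbps:>3} Gb/s: {transfer_hours(dataset_tb, gbps):6.1f} hours")
```

At shared-Internet speeds the move takes days; a dedicated 10G or 100G lightpath brings it down to hours, which is the point of the “Big Data freeway” framing.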

California Integrated Digital Infrastructure: Next Steps
White Paper for UCSD Delivered to Chancellor:
– Creating a Campus Research Data Library
– Deploying Advanced Cloud, Networking, Storage, Compute, and Visualization Services
– Organizing a User-Driven IDI Specialists Team
– Riding the Learning Curve From Leading-Edge Capabilities to Community Data Services
White Paper for UC-Wide IDI Under Development:
– Begin Work on Integrating CC-NIEs Across Campuses
– Extending HPWREN From UC Campuses
– Calit2 (UCSD, UCI) and CITRIS (UCB, UCSC, UCD) Organizing UCOP MRPI Planning Grant
– NSF Coordinated CC-NIE Supplements
– Add in Other UCs, Privates, CSU, …

PRISM is Connecting CERN’s CMS Experiment to the UCSD Physics Department at 80 Gbps; All UC LHC Researchers Could Share Data/Compute Across CENIC/ESnet at Gbps

SIO Campus Climate Researchers Need to Download Results From Remote Supercomputer Simulations to Make Regional Climate Change Forecasts
– Dan Cayan, USGS Water Resources Discipline and Scripps Institution of Oceanography, UC San Diego, With Much Support From Mary Tyree, Mike Dettinger, Guido Franco, and Other Colleagues
– Sponsors: California Energy Commission, NOAA RISA Program, California DWR, DOE, NSF
– Planning for Climate Change in California: Substantial Shifts on Top of Already-High Climate Variability

Figure: Average Summer Afternoon Temperature, GFDL A2 Scenario Downscaled to 1 km. Source: Hugo Hidalgo, Tapash Das, Mike Dettinger

NIH National Center for Microscopy & Imaging Research: Integrated Infrastructure of Shared Resources
– Diagram: Scientific Instruments, Local SOM Infrastructure, Shared Infrastructure, End User FIONA Workstation
Source: Steve Peltier, Mark Ellisman, NCMIR

PRISM Links Calit2’s VROOM to NCMIR to Explore Confocal Light Microscope Images of Rat Brains

Protein Data Bank (PDB) Needs Bandwidth to Connect Resources and Users
– Archive of Experimentally Determined 3D Structures of Proteins, Nucleic Acids, and Complex Assemblies (e.g., Hemoglobin, Viruses)
– One of the Largest Scientific Resources in the Life Sciences
Source: Phil Bourne and Andreas Prlić, PDB

Why is High Bandwidth Between Rutgers & UCSD Facilities Important?
– Enables PDB to Better Serve Its Users by Providing Increased Reliability and Quicker Results
– More Than 300,000 Unique Visitors per Month
– Up to 300 Concurrent Users
– ~10 Structures Downloaded per Second, 24/7/365
– PDB Plans to Establish Global Load Balancing
Source: Phil Bourne and Andreas Prlić, PDB
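The slide’s usage figures imply a substantial sustained load. A quick Python sketch turns them into monthly totals, assuming the ~10 downloads/second rate holds around the clock as the 7/24/365 note suggests:

```python
# Monthly PDB load implied by the slide's figures (idealized constant rate).
downloads_per_sec = 10
seconds_per_month = 30 * 24 * 3600       # 30-day month
monthly_downloads = downloads_per_sec * seconds_per_month
unique_visitors = 300_000                # per month, from the slide

print(f"~{monthly_downloads / 1e6:.0f}M structure downloads per month")
print(f"~{monthly_downloads // unique_visitors} downloads per unique visitor")
```

Tens of millions of downloads a month is why the slide argues for dedicated bandwidth and global load balancing rather than shared commodity paths.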

Cancer Genomics Hub (UCSC) is Housed in SDSC CoLo: Storage CoLo Attracts Compute CoLo
– CGHub is a Large-Scale Data Repository/Portal for the National Cancer Institute’s Cancer Genome Research Programs (David Haussler, PI)
– Current Capacity is 5 Petabytes, Scalable to 20 Petabytes; The Cancer Genome Atlas Alone Could Produce 10 PB in the Next Four Years
– “SDSC [colocation service] has exceeded our expectations of what a data center can offer. We are glad to have the CGHub database located at SDSC.”
– Researchers Can Already Install Their Own Computers at SDSC, Where the CGHub Data is Physically Housed, So That They Can Run Their Own Analyses (…to-speed-research.html)
– Berkeley is Connecting at 100Gbps to CGHub
Source: Richard Moore, et al., SDSC
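The projected 10 PB of new Cancer Genome Atlas data over four years implies a sustained network rate for ingest alone. A short sketch of that average (idealized: constant rate, no protocol overhead):

```python
# Average sustained rate implied by 10 PB of new data over four years.
petabytes = 10
years = 4

bits = petabytes * 1e15 * 8              # 1 PB = 1e15 bytes
seconds = years * 365 * 24 * 3600
avg_gbps = bits / seconds / 1e9

print(f"average ingest rate: {avg_gbps:.2f} Gb/s sustained")
```

The average is well under 1 Gb/s, but ingest and researcher downloads are bursty, which is why a 100 Gb/s connection such as Berkeley’s still matters.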

PRISM Will Link Computational Mass Spectrometry and Genome Sequencing Cores to the Big Data Freeway
– ProteoSAFe: Compute-Intensive Discovery MS at the Click of a Button
– MassIVE: Repository and Identification Platform for All MS Data in the World
Source: proteomics.ucsd.edu

Telepresence Meeting Using Digital Cinema 4K Streams Lays Technical Basis for Global Digital Cinema
– Keio University President Anzai and UCSD Chancellor Fox
– Sony, NTT, SGI
– Streaming 4K With JPEG 2000 Compression at ½ Gbit/sec
– 4K = 4000x2000 Pixels = 4x HD: 100 Times the Resolution of YouTube!
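The slide’s numbers imply a particular JPEG 2000 compression ratio. A sketch computing the raw 4K stream rate and the ratio needed to fit it into ½ Gb/s; the 24 fps frame rate and 24-bit color depth are assumptions not stated on the slide:

```python
# Assumed parameters (not on the slide): 24 fps, 24 bits per pixel.
width, height = 4000, 2000   # "4K = 4000x2000 pixels" per the slide
fps = 24
bits_per_pixel = 24

raw_gbps = width * height * bits_per_pixel * fps / 1e9
stream_gbps = 0.5            # ½ Gbit/sec, per the slide
ratio = raw_gbps / stream_gbps

print(f"raw 4K: {raw_gbps:.1f} Gb/s, compressed: {stream_gbps} Gb/s "
      f"(~{ratio:.0f}:1 JPEG 2000 compression)")
```

Under these assumptions the uncompressed stream is several Gb/s, so roughly an order-of-magnitude compression brings it to the ½ Gb/s the demonstration used.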

Tele-Collaboration for Audio Post-Production: Realtime Picture & Sound Editing Synchronized Over IP, Skywalker to San Diego

Collaboration Between EVL’s CAVE2 and Calit2’s VROOM Over 10Gb Wavelength EVL Calit2 Source: NTT Sponsored ON*VECTOR Workshop at Calit2 March 6, 2013

High Performance Wireless Research and Education Network, Supported by National Science Foundation Awards

A Scalable Data-Driven Monitoring, Dynamic Prediction and Resilience Cyberinfrastructure for Wildfires (WiFire)
– NSF Has Just Awarded the WiFire Grant; Ilkay Altintas, SDSC, PI
– Development of End-to-End “Cyberinfrastructure” for “Analysis of Large Dimensional Heterogeneous Real-Time Sensor Data”
– System Integration of Real-Time Sensor Networks, Satellite Imagery, Near-Real-Time Data Management Tools, and Wildfire Simulation Tools, With Connectivity to Emergency Command Centers Before, During, and After a Firestorm
Photo by Bill Clayton

Using Calit2’s Qualcomm Institute NexCAVE for CAL FIRE Research and Planning Source: Jessica Block, Calit2