Published by Noah McBride. Modified over 8 years ago.
1
“The Pacific Research Platform” Opening Keynote Lecture 15th Annual ON*VECTOR International Photonics Workshop Calit2’s Qualcomm Institute University of California, San Diego February 29, 2016 Dr. Larry Smarr Director, California Institute for Telecommunications and Information Technology Harry E. Gruber Professor, Dept. of Computer Science and Engineering Jacobs School of Engineering, UCSD http://lsmarr.calit2.net
2
Abstract Research in data-intensive fields is increasingly multi-investigator and multi-campus, depending on ever more rapid access to ultra-large heterogeneous and widely distributed datasets. The Pacific Research Platform (PRP) is a multi-institutional extensible deployment that establishes a science-driven high-capacity data-centric “freeway system.” The PRP spans all 10 campuses of the University of California, as well as the major California private research universities, four supercomputer centers, and several universities outside California. Fifteen multi-campus data-intensive application teams act as drivers of the PRP, providing feedback over the five years to the technical design staff. These application areas include particle physics, astronomy/astrophysics, earth sciences, biomedicine, and scalable multimedia, providing models for many other applications. The PRP partnership extends the NSF-funded campus cyberinfrastructure awards to a regional model that allows high-speed data-intensive networking, facilitating researchers moving data between their labs and their collaborators’ sites, supercomputer centers or data repositories, and enabling that data to traverse multiple heterogeneous networks without performance degradation over campus, regional, national, and international distances.
3
Vision: Creating a Pacific Research Platform Use Lightpaths to Connect All Data Generators and Consumers, Creating a “Big Data” Freeway System Using CENIC/Pacific Wave as a 100Gbps Backplane Integrated With High-Performance Global Networks “The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 20-Campus Scale.” This Vision Has Been Building for 15 Years
4
NSF’s OptIPuter Project: Demonstrating How SuperNetworks Can Meet the Needs of Data-Intensive Researchers OptIPortal – Termination Device for the OptIPuter Global Backplane Calit2 (UCSD, UCI), SDSC, and UIC Leads; Larry Smarr PI Univ. Partners: NCSA, USC, SDSU, NW, TA&M, UvA, SARA, KISTI, AIST Industry: IBM, Sun, Telcordia, Chiaro, Calient, Glimmerglass, Lucent 2003-2009 $13,500,000 In August 2003, Jason Leigh and his students used RBUDP to blast data from NCSA to SDSC over the TeraGrid DTFnet, achieving 18Gbps file transfer out of the available 20Gbps LS Slide 2005
5
DOE ESnet’s Science DMZ: A Scalable Network Design Model for Optimizing Science Data Transfers A Science DMZ integrates 4 key concepts into a unified whole: –A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network –The use of dedicated systems for data transfer –Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network –Security policies and enforcement mechanisms that are tailored for high performance science environments http://fasterdata.es.net/science-dmz/ Science DMZ Coined 2010 The DOE ESnet Science DMZ and the NSF “Campus Bridging” Taskforce Report Formed the Basis for the NSF Campus Cyberinfrastructure Network Infrastructure and Engineering (CC-NIE) Program
6
Based on Community Input and on ESnet’s Science DMZ Concept, NSF Has Funded Over 100 Campuses to Build Local Big Data Freeways Red 2012 CC-NIE Awardees Yellow 2013 CC-NIE Awardees Green 2014 CC*IIE Awardees Blue 2015 CC*DNI Awardees Purple Multiple Time Awardees Source: NSF
7
Creating a “Big Data” Freeway on Campus: NSF-Funded CC-NIE Grants Prism@UCSD and CHERuB Prism@UCSD, Phil Papadopoulos, SDSC, Calit2, PI (2013-15) CHERuB, Mike Norman, SDSC PI
8
Prism@UCSD Has Connected the Science Drivers Reported to ON*VECTOR in 2013 “Terminating the GLIF” - 2013 ON*VECTOR www.youtube.com/watch?v=Ar7vmMIM7q8
9
FIONA – Flash I/O Network Appliance: Linux PCs Optimized for Big Data
FIONAs Are Science DMZ Data Transfer Nodes & Optical Network Termination Devices
UCSD CC-NIE Prism Award & UCOP; Phil Papadopoulos & Tom DeFanti; Joe Keefe & John Graham
UCOP Rack-Mount Build
Cost: $8,000 | $20,000
Intel Xeon Haswell Multicore: E5-1650 v3 6-Core | 2x E5-2697 v3 14-Core
RAM: 128 GB | 256 GB
SSD: SATA 3.8 TB
Network Interface: 10/40GbE Mellanox | 2x40GbE Chelsio+Mellanox
GPU: NVIDIA Tesla K80
RAID Drives: 0 to 112TB (add ~$100/TB)
John Graham, Calit2’s QI
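The pricing on this slide can be turned into a rough budget estimate. The sketch below assumes the "~$100/TB" RAID figure scales linearly and applies to either build; the function name and that linear model are illustrative assumptions, not part of the original slide.

```python
# Rough FIONA budget estimate from the slide's figures.
# Assumptions (not from the slide): linear RAID pricing, same rate for both builds.
RAID_COST_PER_TB_USD = 100   # slide: "add ~$100/TB"
MAX_RAID_TB = 112            # slide: "0 to 112TB"


def fiona_cost(base_cost_usd: int, raid_tb: int = 0) -> int:
    """Return the estimated cost of a FIONA with raid_tb terabytes of RAID disk."""
    if not 0 <= raid_tb <= MAX_RAID_TB:
        raise ValueError(f"raid_tb must be between 0 and {MAX_RAID_TB}")
    return base_cost_usd + raid_tb * RAID_COST_PER_TB_USD


if __name__ == "__main__":
    # The two base prices quoted on the slide.
    for base in (8_000, 20_000):
        for tb in (0, 48, 112):
            print(f"base ${base:,} + {tb} TB RAID: ${fiona_cost(base, tb):,}")
```

Under these assumptions, a fully populated 112TB RAID adds about $11,200 to either base price.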
10
How Prism@UCSD Transforms Big Data Microbiome Science FIONA 12 Cores/GPU 128 GB RAM 3.5 TB SSD 48TB Disk 10Gbps NIC Knight Lab 10Gbps Gordon Prism@UCSD Data Oasis 7.5PB, 200GB/s Knight 1024 Cluster In SDSC Co-Lo CHERuB 100Gbps Emperor & Other Vis Tools 64Mpixel Data Analysis Wall 120Gbps 40Gbps 1.3Tbps See Talk by Raju Kankipati, Arista in Session 1 Next
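To put the link speeds on this slide in perspective, here is a minimal sketch of ideal transfer times. It assumes decimal units (1 TB = 10^12 bytes), a fully utilized link, and no protocol overhead, so real transfers would take longer. The helper name is hypothetical.

```python
# Ideal (best-case) transfer times for the capacities quoted on the slide.
# Assumptions: decimal units, 100% link utilization, no protocol overhead.
TB = 1e12  # bytes
PB = 1e15  # bytes


def transfer_seconds(size_bytes: float, rate_bits_per_s: float) -> float:
    """Time in seconds to move size_bytes at rate_bits_per_s."""
    return size_bytes * 8 / rate_bits_per_s


if __name__ == "__main__":
    # Emptying the FIONA's 48TB disk over links of different speeds.
    for label, rate in (("10Gbps", 10e9), ("40Gbps", 40e9), ("100Gbps", 100e9)):
        hours = transfer_seconds(48 * TB, rate) / 3600
        print(f"48TB over {label}: {hours:.1f} hours")

    # Reading all 7.5PB of Data Oasis at its quoted 200GB/s (bytes, so x8 for bits).
    hours = transfer_seconds(7.5 * PB, 200e9 * 8) / 3600
    print(f"7.5PB at 200GB/s: {hours:.1f} hours")
```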
11
Next Step: The Pacific Research Platform Creates a Regional End-to-End Science-Driven “Big Data Freeway System” NSF CC*DNI Grant $5M 10/2015-10/2020 PI: Larry Smarr, UC San Diego Calit2 Co-PIs: Camille Crittenden, UC Berkeley CITRIS, Tom DeFanti, UC San Diego Calit2, Philip Papadopoulos, UC San Diego SDSC, Frank Wuerthwein, UC San Diego Physics and SDSC FIONAs as Uniform DTN End Points
12
Ten Week Sprint to Demonstrate the West Coast Big Data Freeway System: PRPv0 Presented at CENIC 2015 March 9, 2015 FIONA DTNs Now Deployed to All UC Campuses And Most PRP Sites
13
What About the Cloud? PRP Connects with the 2 NSF Experimental Cloud Grants –Chameleon Through Chicago –CloudLab Through Clemson CENIC/PW Has Multiple 10Gbps into Amazon Web Services –First 10Gbps Connection 5-10 Years Ago –Today, Seven 10Gbps Paths Plus a 100Gbps Path –Peak Usage is <10% –Lots of Room for Experimenting with Big Data –Interest from Microsoft and Google as well Clouds Useful for Lots of Small Data No Business Model for Small Amounts of Really Big Data Also Very High Financial Barriers to Exit See Kate Keahey, University of Chicago, Talk on Chameleon in Session 1 Next
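The "lots of room" claim follows from simple arithmetic on the figures above. This sketch assumes the "<10%" peak usage refers to the aggregate of all eight paths, which the slide does not state explicitly.

```python
# Aggregate CENIC/Pacific Wave capacity into AWS, from the slide's figures.
# Assumption: "Peak Usage is <10%" is measured against the aggregate capacity.
paths_gbps = [10] * 7 + [100]        # seven 10Gbps paths plus one 100Gbps path
PEAK_USAGE_PERCENT = 10              # upper bound quoted on the slide


def aggregate_gbps(paths):
    """Total capacity across all paths, in Gbps."""
    return sum(paths)


def peak_ceiling_gbps(paths, percent=PEAK_USAGE_PERCENT):
    """Upper bound on peak traffic, in Gbps, given a usage percentage."""
    return aggregate_gbps(paths) * percent / 100


if __name__ == "__main__":
    total = aggregate_gbps(paths_gbps)
    peak = peak_ceiling_gbps(paths_gbps)
    print(f"aggregate capacity: {total} Gbps")
    print(f"peak usage below:   {peak} Gbps")
    print(f"unused at peak:     more than {total - peak} Gbps")
```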
14
PRP Timeline PRPv1 (Years 1 and 2) –A Layer 3 System –Completed In 2 Years –Tested, Measured, Optimized, With Multi-domain Science Data –Bring Many Of Our Science Teams Up –Each Community Thus Will Have Its Own Certificate-Based Access To its Specific Federated Data Infrastructure. PRPv2 (Years 3 to 5) –Advanced IPv6-Only Version with Robust Security Features –e.g. Trusted Platform Module Hardware and SDN/SDX Software –Develop Means to Operate a Shared Federation of Caches –Support Rates up to 100Gb/s in Bursts And Streams –Experiment With Science Drivers Requiring >100Gbps More on SDN/SDX in Session 3 Today Beyond 100Gbps in Session 2 Today
15
Why is PRPv1 Layer 3 Instead of Layer 2 like PRPv0? In the OptIPuter Timeframe, with Rare Exceptions, Routers Could Not Route at 10Gbps, But Could Switch at 10Gbps. Hence for Performance, L2 was Preferred. Today Routers Can Route at 100Gbps Without Performance Degradation. Our Prism Arista Switch Routes at 40Gbps Without Dropping Packets or Impacting Performance. The Biggest Advantage of L3 is Scalability via Information Hiding. Details of the End-to-End Pathways are Not Needed, Simplifying the Workload of the Engineering Staff. Another Advantage of L3 is Engineered Path Redundancy within the Transport Network. Thus, a 100Gbps Routed Layer 3 Backbone Architecture Has Many Advantages: –A Routed Layer 3 Architecture Allows the Backbone to Stay Simple - Big, Fast, and Clean. –Campuses can Use the Connection Without Significant Effort and Complexity on the End Hosts. –Network Operators do not Need to Focus on Getting Layer 2 to Work and Later Diagnosing End-to-End Problems with Less Than Good Visibility. –This Leaves Us Free to Focus on the Applications on The Edges, on The Science Outcomes, and Less on The Backbone Network Itself. These points from Eli Dart, John Hess, Phil Papadopoulos, Ron Johnson, and others
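One common form of the "information hiding" that Layer 3 provides is route aggregation: a campus advertises a single summary prefix and the backbone never sees the internal subnets. The sketch below illustrates that general idea with Python's standard `ipaddress` module. The prefixes are hypothetical private addresses, not actual PRP or CENIC routes, and this is not a description of how PRP is configured.

```python
# Route aggregation as an example of Layer 3 information hiding.
# All prefixes below are hypothetical, chosen only for illustration.
import ipaddress


def summarize(prefixes):
    """Collapse a list of prefix strings into the fewest covering networks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(networks))


if __name__ == "__main__":
    # Four internal subnets of an imaginary campus Science DMZ.
    campus_subnets = ["10.20.0.0/24", "10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"]
    advertised = summarize(campus_subnets)
    print("internal subnets:", campus_subnets)
    print("advertised to backbone:", [str(n) for n in advertised])

    # The backbone can still reach any internal host via the summary route.
    dtn = ipaddress.ip_address("10.20.2.17")
    print("DTN reachable via summary:", any(dtn in n for n in advertised))
```

The backbone carries one route instead of four, and changes inside the campus do not propagate outward.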
16
Introducing the CENIC / ESnet / PRP Joint Cybersecurity Initiative The Challenge –Data Exchange is Essential for Open Science, But Cybersecurity Threats are Increasing –R&E Community Must be Unconstrained by Physical Location of Data, Scientific Tools, Computational Resources, or Research Collaborators –Global Networks are Essential for Scientific Collaboration Near Term: –Hold Joint CENIC/ESnet Operational Security Retreat –Develop CENIC/ESnet Shared Strategies, Research Goals, Operational Practices, & Technology Longer Term: –Achieve Closer Coordination of CENIC and ESnet Cybersecurity Teams –Assess Cybersecurity Requirements of CENIC and ESnet Constituents –Communicate Shared CENIC/ESnet Vision for Computer & Network Security to R&E Community More from CENIC’s Sean Peisert on Electrical Power Distribution Grids Tomorrow in Session 3 Source: Sean Peisert, CENIC, ESnet
17
Pacific Research Platform Regional Collaboration: Multi-Campus Science Driver Teams Jupyter Hub Particle Physics Astronomy and Astrophysics –Telescope Surveys –Galaxy Evolution –Gravitational Wave Astronomy Earth Sciences –Data Analysis and Simulation for Earthquakes and Natural Disasters –Climate Modeling: NCAR/UCAR –California/Nevada Regional Climate Data Analysis –Wireless Environmental Sensornets Biomedical –Cancer Genomics Hub/Browser –Microbiome and Integrative ‘Omics –Integrative Structural Biology Scalable Visualization, Virtual Reality, and Ultra-Resolution Video
18
PRP First Application: Distributed IPython/Jupyter Notebooks: Cross-Platform, Browser-Based Application Interleaves Code, Text, & Images IJulia IHaskell IFSharp IRuby IGo IScala IMathics Ialdor LuaJIT/Torch Lua Kernel IRKernel (for the R language) IErlang IOCaml IForth IPerl IPerl6 Ioctave Calico Project kernels implemented in Mono, including Java, IronPython, Boo, Logo, BASIC, and many others IScilab IMatlab ICSharp Bash Clojure Kernel Hy Kernel Redis Kernel jove, a kernel for io.js IJavascript Calysto Scheme Calysto Processing idl_kernel Mochi Kernel Lua (used in Splash) Spark Kernel Skulpt Python Kernel MetaKernel Bash MetaKernel Python Brython Kernel IVisual VPython Kernel Source: John Graham, QI
19
GPU JupyterHub: 2 x 14-core CPUs 256GB RAM 1.2TB FLASH 3.8TB SSD Nvidia K80 GPU Dual 40GbE NICs And a Trusted Platform Module 40Gbps GPU JupyterHub: 1 x 18-core CPUs 128GB RAM 3.8TB SSD Nvidia K80 GPU Dual 40GbE NICs And a Trusted Platform Module PRP UC-JupyterHub Backbone UCB Next Step: Deploy Across PRP UCSD Source: John Graham, Calit2
20
Open Science Grid Has Had a Huge Growth Over the Last Decade - Currently Federating Over 130 Clusters Crossed 100 Million Core-Hours/Month In Dec 2015 Over 1 Billion Data Transfers Moved 200 Petabytes In 2015 Supported Over 200 Million Jobs In 2015 Source: Miron Livny, Frank Wuerthwein, OSG ATLAS CMS
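The totals on this slide can be converted into average rates to give a sense of scale. This sketch treats the quoted "over N" figures as lower bounds and assumes decimal petabytes, a 31-day December, and a 365-day year. The function names are illustrative.

```python
# Converting the OSG 2015 totals into average sustained rates.
# Assumptions: decimal petabytes; figures are lower bounds ("over ...").
HOURS_IN_DECEMBER = 31 * 24
SECONDS_IN_2015 = 365 * 24 * 3600


def average_busy_cores(core_hours: float, wall_hours: float) -> float:
    """Average number of cores kept busy to deliver core_hours in wall_hours."""
    return core_hours / wall_hours


def average_gbps(petabytes: float, seconds: float) -> float:
    """Average throughput in Gbps needed to move petabytes in the given time."""
    return petabytes * 1e15 * 8 / seconds / 1e9


def per_second(count: float, seconds: float) -> float:
    """Average event rate per second."""
    return count / seconds


if __name__ == "__main__":
    cores = average_busy_cores(100e6, HOURS_IN_DECEMBER)
    gbps = average_gbps(200, SECONDS_IN_2015)
    jobs = per_second(200e6, SECONDS_IN_2015)
    print(f"100M core-hours in December: about {cores:,.0f} cores busy on average")
    print(f"200PB moved in 2015: about {gbps:.1f} Gbps sustained")
    print(f"200M jobs in 2015: about {jobs:.1f} jobs per second")
```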
21
PRP Prototype of Aggregation of OSG Software & Services Across California Universities in a Regional DMZ Aggregate Petabytes of Disk Space & PetaFLOPs of Compute, Connected at 10-100 Gbps Transparently Compute on Data at Their Home Institutions & Systems at SLAC, NERSC, Caltech, UCSD, & SDSC SLAC UCSD & SDSC UCSB UCSC UCD UCR CSU Fresno UCI Source: Frank Wuerthwein, UCSD Physics; SDSC; co-PI PRP PRP Builds on SDSC’s LHC-UC Project Caltech ATLAS CMS other physics life sciences other sciences OSG Hours 2015 by Science Domain
22
PRP Will Support the Computation and Data Analysis in the Search for Sources of Gravitational Radiation Augment the aLIGO Data and Computing Systems at Caltech, by connecting at 10Gb/s to SDSC Comet supercomputer, enabling LIGO computations to enter via the same PRP “job cache” as for LHC.
23
Two Automated Telescope Surveys Creating Huge Datasets Will Drive PRP. First survey: 300 images per night, 100MB per raw image, 30GB per night, 120GB per night. Second survey: 250 images per night, 530MB per raw image, 150GB per night, 800GB per night. When processed at NERSC: Increased by 4x. Precursors to LSST and NCSA. PRP Allows Researchers to Bring Datasets from NERSC to Their Local Clusters for In-Depth Science Analysis. Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL; Professor of Astronomy, UC Berkeley
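The nightly volumes follow from images per night times image size. The sketch below reproduces that arithmetic, assuming decimal units (1GB = 1000MB) and that the "4x" factor multiplies the raw nightly volume. For the second survey the product comes to 132.5GB, so the slide's 150GB is presumably rounded or includes data beyond the raw images; likewise 800GB is more than 4 times 150GB. Those interpretations are assumptions.

```python
# Nightly data volume for the two surveys, from the slide's per-image figures.
# Assumptions: 1GB = 1000MB; processing multiplies raw volume by the 4x factor.
PROCESSING_FACTOR = 4  # slide: "Increased by 4x" when processed at NERSC


def nightly_raw_gb(images_per_night: int, mb_per_image: float) -> float:
    """Raw data volume per night in GB."""
    return images_per_night * mb_per_image / 1000


def nightly_processed_gb(images_per_night: int, mb_per_image: float) -> float:
    """Estimated processed volume per night in GB, using the 4x factor."""
    return nightly_raw_gb(images_per_night, mb_per_image) * PROCESSING_FACTOR


if __name__ == "__main__":
    for images, mb in ((300, 100), (250, 530)):
        raw = nightly_raw_gb(images, mb)
        processed = nightly_processed_gb(images, mb)
        print(f"{images} images x {mb}MB: {raw}GB raw, {processed}GB processed")
```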
24
Global Scientific Instruments Will Produce Ultralarge Datasets Continuously Requiring Dedicated Optic Fiber and Supercomputers Square Kilometer Array https://tnc15.terena.org/getfile/1939 Large Synoptic Survey Telescope www.lsst.org/sites/default/files/documents/DM%20Introduction%20-%20Kantor.pdf Tracks ~40B Objects, Creates 10M Alerts/Night Within 1 Minute of Observing 2x40Gb/s
25
Dan Cayan USGS Water Resources Discipline Scripps Institution of Oceanography, UC San Diego much support from Mary Tyree, Mike Dettinger, Guido Franco and other colleagues NCAR Upgrading to 10Gbps Link from Wyoming and Boulder to CENIC/PRP Sponsors: California Energy Commission NOAA RISA program California DWR, DOE, NSF Planning for climate change in California substantial shifts on top of already high climate variability SIO Campus Climate Researchers Need to Download Results from NCAR Remote Supercomputer Simulations to Make Regional Climate Change Forecasts
26
Average summer afternoon temperature. GFDL A2 1km downscaled to 1km. Source: Hugo Hidalgo, Tapash Das, Mike Dettinger
27
High-Performance Wireless Research and Education Network (HPWREN) Real-Time Cameras on Mountains for Environmental Observations Source: Hans Werner Braun, HPWREN PI
28
HPWREN Users and Public Safety Clients Gain Redundancy and Resilience from PRP Upgrade San Diego Countywide Sensors and Camera Resources UCSD & SDSU Data & Compute Resources UCSD UCR SDSU UCI UCI & UCR Data Replication and PRP FIONA Anchors as HPWREN Expands Northward 10X Increase During Wildfires PRP 10G Link UCSD to SDSU –DTN FIONAs Endpoints –Data Redundancy –Disaster Recovery –High Availability –Network Redundancy Data From Hans-Werner Braun Source: Frank Vernon, Greg Hidley, UCSD
29
CENIC PRP Backbone Enables 2016 Expansion of HPWREN into Orange and Riverside Counties Anchor to CENIC at UCI & UCR –PRP FIONA Connects to CalREN-HPR Network –Potential Data Replication Sites UCR UCI UCSD SDSU Source: Frank Vernon, Greg Hidley, UCSD Sets Stage for Experiments with 5G Mobile Edge Since 5G has Optical Backplane
30
PRP is NOT Just for Big Data Science and Engineering: Linking Cultural Heritage and Archaeology Datasets. Building on CENIC’s Expansion To Libraries, Museums, and Cultural Sites. “In an ideal world – Extremely high bandwidth to move large cultural heritage datasets around the PRP cloud for processing & viewing in CAVEs around PRP with Unlimited Storage for permanent archiving.” -Tom Levy, UCSD. Diagram sites: UCD, UCSF, Stanford, NASA AMES/NREN, UCSC, UCSB, Caltech, USC, UCLA, UCI, UCSD, SDSU, UCR, ESnet DoE Labs, UW/PNWGP Seattle, Berkeley, UCM, Los Nettos, Internet2 Seattle. Note: This diagram represents a subset of sites and connections. * Institutions with Active Archaeology Programs
31
Next Step: Global Research Platform Building on CENIC/Pacific Wave and GLIF Current International GRP Partners
32
New Applications of PRP-Like Networks Brain-Inspired Pattern Recognition –See Session 4 Later Today Internet of Things and the Industrial Internet –See Keynote and Sessions 1-3 Tomorrow Smart Cities –See Session 4 Tomorrow