“An Integrated West Coast Science DMZ for Data-Intensive Research” Panel CENIC Annual Conference University of California, Irvine Irvine, CA March 9, 2015.


“An Integrated West Coast Science DMZ for Data-Intensive Research”
Panel, CENIC Annual Conference
University of California, Irvine
Irvine, CA, March 9, 2015
Dr. Larry Smarr
Director, California Institute for Telecommunications and Information Technology
Harry E. Gruber Professor, Dept. of Computer Science and Engineering
Jacobs School of Engineering, UCSD

CENIC 2015 Panel: Building the Pacific Research Platform
Presenters:
–Larry Smarr, Calit2
–Eli Dart, ESnet
–John Haskins, UCSC
–John Hess, CENIC
–Erik McCroskey, UC Berkeley
–Paul Murray, Stanford
–Michael van Norman, UCLA
Abstract: The Pacific Research Platform is a project to advance the work of data-intensive researchers by improving their access to technical infrastructure, with a vision of connecting the region’s research universities that hold National Science Foundation cyberinfrastructure awards (NSF CC-NIE & CC-IIE), as well as the Department of Energy (DOE) labs and the San Diego Supercomputer Center (SDSC). Larry Smarr, founding Director of Calit2, will present an overview of the project, followed by a panel discussion of regional inter-site connectivity challenges and opportunities.
LS had assistance today from:
–Tom DeFanti, Research Scientist, Calit2’s Qualcomm Institute, UC San Diego
–John Graham, Senior Development Engineer, Calit2’s Qualcomm Institute, UC San Diego
–Richard Moore, Deputy Director, San Diego Supercomputer Center, UC San Diego
–Phil Papadopoulos, CTO, San Diego Supercomputer Center, UC San Diego

Vision: Creating a West Coast “Big Data Freeway” Connected by CENIC/Pacific Wave to I2 & GLIF
–Use Lightpaths to Connect All Data Generators and Consumers, Creating a “Big Data” Plane Integrated With High Performance Global Networks
–“The Bisection Bandwidth of a Cluster Interconnect, but Deployed on a 10-Campus Scale.”
–This Vision Has Been Building for Over Two Decades

I-WAY: Information Wide Area Year, Supercomputing ‘95
The First National 155 Mbps Research Network
–65 Science Projects
–Into the San Diego Convention Center
I-WAY Featured:
–Networked Visualization Application Demonstrations
–Large-Scale Immersive Displays
–I-Soft Programming Environment
[Image captions: UIC CitySpace; Cellular Semiotics]

iGrid 2005: The Global Lambda Integrated Facility
Global Connections Between University Research Centers at 10Gbps
September 26-30, 2005, University of California, San Diego
California Institute for Telecommunications and Information Technology
Maxine Brown, Tom DeFanti, Co-Chairs
–21 Countries Driving 50 Demonstrations
–1 or 10Gbps to Building, Sept 2005

Academic Research “OptIPlatform” Cyberinfrastructure: A 10Gbps Lightpath Cloud (LS 2009 Slide)
[Diagram: National LambdaRail and a campus optical switch connecting 10G lightpaths to data repositories & clusters, HPC, HD/4k video images and cameras, HD/4k telepresence, instruments, and end-user OptIPortals]

CENIC is Rapidly Moving to Connect at 100 Gbps Across the State and Nation
[Map labels: DOE, Internet2]

Creating a “Big Data” Plane on Campus: NSF CC-NIE Funded Prism and CHERuB
–Prism: Phil Papadopoulos, SDSC/Calit2, PI
–CHERuB: Mike Norman, SDSC, PI

SDSC Big Data Compute/Storage Facility - Interconnected at Over 1 Tbps
[Diagram: SDSC supercomputers – Comet VM SC (2 PF), Gordon Big Data SC, and the Oasis Data Store (6,000 TB, >800 Gbps) – each linked by 128 parallel 10Gbps optical light paths to an Arista router that can switch the light paths (128 x 10Gbps ≈ 1.3 Tbps)]
Source: Philip Papadopoulos, SDSC/Calit2

High Performance Computing and Storage Become Plug Ins to the “Big Data” Plane

SDSC’s Comet is a ~2 PetaFLOPs System Architected for the “Long Tail of Science”
–NSF Track 2 award to SDSC
–$12M NSF award to acquire
–$3M/yr x 4 yrs to operate
–Production early 2015

NERSC and ESnet Offer High Performance Computing and Networking
[Image: Cray XC supercomputer, Petaflops-scale, Dedicated Feb. 5, 2014]

Many Disciplines Beginning to Need Dedicated High Bandwidth on Campus (How to Utilize a CENIC 100G Campus Connection)
–Remote Analysis of Large Data Sets: Particle Physics
–Connection to Remote Campus Compute & Storage Clusters: Microscopy and Next Gen Sequencers
–Providing Remote Access to Campus Data Repositories: Protein Data Bank and Mass Spectrometry
–Enabling Remote Collaborations: National and International

Particle Physics: Creating a Gbps LambdaGrid to Support LHC Researchers
–LHC Data Generated by CMS & ATLAS Detectors Analyzed on OSG
[Maps: U.S. Institutions Participating in LHC (ATLAS and CMS)]

Cancer Genomics Hub (UCSC) is Housed in SDSC CoLo: Large Data Flows to End Users
[Chart: Cumulative TBs of CGH Files Downloaded, reaching 30 PB; annotated flows of 1G, 8G, and 15G]
Data Source: David Haussler, Brad Smith, UCSC

Earth Sciences: Pacific Earthquake Engineering Research Center Enabling Real-Time Coupling Between Shake Tables and Supercomputer Simulations

Automated Telescope Surveys Are Creating Huge Datasets
–One survey: 300 images per night, 100MB per raw image – 30GB per night raw, 120GB per night when processed at NERSC (increased by 4x)
–Another survey: 250 images per night, 530MB per raw image – 150 GB per night raw, 800GB per night when processed at NERSC
Source: Peter Nugent, Division Deputy for Scientific Engagement, LBL, and Professor of Astronomy, UC Berkeley
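As a quick sanity check of the first survey’s figures, here is a minimal Python sketch using only the numbers quoted on the slide (300 images per night at 100 MB each, and the stated 4x increase when processed at NERSC); the survey names themselves are not given in this transcript.

```python
# Sanity check of the first survey's nightly data volume,
# using only the figures quoted on the slide above.
images_per_night = 300        # "300 images per night"
mb_per_raw_image = 100        # "100MB per raw image"
nersc_processing_factor = 4   # "When processed at NERSC Increased by 4x"

raw_gb_per_night = images_per_night * mb_per_raw_image / 1000
processed_gb_per_night = raw_gb_per_night * nersc_processing_factor

print(f"Raw:       ~{raw_gb_per_night:.0f} GB per night")        # ~30 GB
print(f"Processed: ~{processed_gb_per_night:.0f} GB per night")  # ~120 GB
```

Both results match the 30 GB and 120 GB per night quoted on the slide.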

Using Supernetworks to Couple End User to Remote Supercomputers and Visualization Servers (ANL * Calit2 * LBNL * NICS * ORNL * SDSC)
–Simulation – NICS/ORNL, NSF TeraGrid Kraken (Cray XT5): 8,256 compute nodes, 99,072 compute cores, 129 TB RAM
–Rendering – Argonne NL (DOE) Eureka: 100 dual quad-core Xeon servers, 200 NVIDIA Quadro FX GPUs in 50 Quadro Plex S4 1U enclosures, 3.2 TB RAM
–Visualization – Calit2/SDSC OptIPortal: LCD panels (2560 x 1600 pixels each), 10 NVIDIA Quadro FX 4600 graphics cards, >80 megapixels, 10 Gb/s network throughout
–ESnet 10 Gb/s fiber optic network
–Real-Time Interactive Volume Rendering Streamed from ANL to SDSC, Demoed at SC09
Source: Mike Norman, Rick Wagner, SDSC

Collaboration Between EVL’s CAVE2 and Calit2’s VROOM Over 10Gb Wavelength
[Photos: EVL; Calit2]
Source: NTT-Sponsored ON*VECTOR Workshop at Calit2, March 6, 2013

DOE ESnet’s Science DMZ: A Scalable Network Design Model for Optimizing Science Data Transfers
A Science DMZ integrates 4 key concepts into a unified whole:
–A network architecture designed for high-performance applications, with the science network distinct from the general-purpose network
–The use of dedicated systems for data transfer
–Performance measurement and network testing systems that are regularly used to characterize and troubleshoot the network
–Security policies and enforcement mechanisms that are tailored for high performance science environments

NSF Funding Has Enabled Science DMZs at Over 100 U.S. Campuses
2011 ACCI Strategic Recommendation to the NSF #3:
–“NSF should create a new program funding high-speed (currently 10 Gbps) connections from campuses to the nearest landing point for a national network backbone. The design of these connections must include support for dynamic network provisioning services and must be engineered to support rapid movement of large scientific data sets.”
–pg. 6, NSF Advisory Committee for Cyberinfrastructure Task Force on Campus Bridging, Final Report, March 2011
–Led to the Office of Cyberinfrastructure CC-NIE RFP, March 1, 2012
NSF’s Campus Cyberinfrastructure – Network Infrastructure & Engineering (CC-NIE) Program
–>130 Grants Awarded So Far (New Solicitation Open)
–Roughly $500k per Campus
Next Logical Step: Interconnect Campus Science DMZs

Science DMZ Data Transfer Nodes Can Be Inexpensive PCs Optimized for Big Data
FIONA – Flash I/O Node Appliance
–Combination of Desktop and Server Building Blocks
–US$5K - US$7K
–Desktop Flash up to 16TB
–RAID Drives up to 48TB
–10GbE/40GbE Adapter
–Tested speed 40Gb/s
–Developed Under UCSD CC-NIE Prism Award by UCSD’s Phil Papadopoulos, Tom DeFanti, and Joe Keefe
[Example configuration: FIONA 3+GB/s Data Appliance – 32GB RAM, 9 x 256GB flash at 510MB/sec each, 8 x 3TB drives at 125MB/sec each, 2 x 40GbE, 2 TB cache, 24TB disk]
For More on Science DMZ DTNs See:
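As a back-of-the-envelope check that such a node can keep a 40GbE port busy, here is a minimal Python sketch using only the per-device rates listed in the configuration above; it ignores RAID, filesystem, and protocol overhead, so treat the results as rough upper bounds.

```python
# Aggregate I/O rates implied by the FIONA parts list above.
GBIT_PER_GBYTE = 8

flash_count, flash_mb_s = 9, 510   # "9 x 256GB flash at 510MB/sec each"
disk_count, disk_mb_s = 8, 125     # "8 x 3TB drives at 125MB/sec each"
nic_gbit_s = 40                    # one of the "2 x 40GbE" ports

flash_gb_s = flash_count * flash_mb_s / 1000   # aggregate flash, GB/s
disk_gb_s = disk_count * disk_mb_s / 1000      # aggregate disk, GB/s

print(f"Flash aggregate: ~{flash_gb_s:.1f} GB/s (~{flash_gb_s * GBIT_PER_GBYTE:.0f} Gb/s)")
print(f"Disk aggregate:  ~{disk_gb_s:.1f} GB/s (~{disk_gb_s * GBIT_PER_GBYTE:.0f} Gb/s)")
print(f"40GbE line rate: ~{nic_gbit_s / GBIT_PER_GBYTE:.0f} GB/s")
```

Roughly 4.6 GB/s of aggregate flash bandwidth is consistent with the “3+GB/s Data Appliance” label and with the tested 40 Gb/s transfer speed, while the spinning disks alone top out near 8 Gb/s.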

Audacious Goal: Build a West Coast Science DMZ
Why Did We Think This Was Possible?
–ESnet Designed Science DMZs to be:
–Scalable and incrementally deployable
–Easily adaptable to incorporate emerging technologies such as 100 Gigabit Ethernet services, virtual circuits, and software-defined networking capabilities
–Many Campuses on the West Coast Created Science DMZs
–CENIC/Pacific Wave is Upgrading to 100G Services
–UCSD’s FIONAs Are Rapidly Deployable Inexpensive DTNs
So Can We Use CENIC/PW to Interconnect Many Science DMZs?

CENIC/Pacific Wave is the Optical Backplane of the Pacific Research Platform (PRP)
NSF Has Invested Over $9M in CC-NIE Campus Awards

The Pacific Wave Platform Creates a Regional Science DMZ
Source: John Hess, CENIC

Pacific Research Platform – Panel Discussion CENIC 2015 March 9, 2015

Thanks to: Caltech, CENIC / Pacific Wave, ESnet / LBNL, San Diego State University, SDSC, Stanford University, University of Washington, USC, UC Berkeley, UC Davis, UC Irvine, UC Los Angeles, UC Riverside, UC San Diego, UC Santa Cruz

Pacific Research Platform Strategic Arc
High performance network backplane for data-intensive science
o This is qualitatively different than the commodity Internet
o High performance data movement provides capabilities that are otherwise unavailable to scientists
o Linking the Science DMZs across the West Coast is building something new
o This capability is extensible, both regionally and nationally
Goal: scientists at CENIC institutions can get the data they need, where they need it, when they need it

What did we do?
Concentrated on the regional aspects of the problem. There are many parts to the research data movement problem; this experiment mostly looked at the inter-campus piece. If it looks a bit rough, this has all happened in about 10 weeks of work.
Collaborated among network and HPC staff at many sites to:
o Build a mesh of perfSONAR instances (a simplified throughput-mesh sketch appears after this list)
o Implement MaDDash – the Measurement and Debugging Dashboard
o Deploy Data Transfer Nodes (DTNs)
o Perform GridFTP file transfers to quantify throughput for reference data sets
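The sketch below (Python) illustrates, in greatly simplified form, what one row of such a throughput mesh measures: timed transfers from the local test node to each remote test node. It is not the PRP’s actual tooling; the real deployment uses perfSONAR (bwctl/iperf3) with results published to MaDDash. The host names are hypothetical placeholders, and it assumes iperf3 is installed and an iperf3 server (`iperf3 -s`) is already listening on each remote node.

```python
# Simplified throughput "mesh row": measure from this node to a set of
# remote test nodes with iperf3 and print the results. Host names are
# hypothetical placeholders for DTN/perfSONAR test nodes.
import json
import subprocess

REMOTE_HOSTS = [
    "dtn-site-a.example.edu",
    "dtn-site-b.example.edu",
    "dtn-site-c.example.edu",
]

def measure_gbps(host: str, seconds: int = 10) -> float:
    """Run one iperf3 client test to `host`; return sender-side Gb/s."""
    result = subprocess.run(
        ["iperf3", "-c", host, "-t", str(seconds), "-J"],  # -J: JSON output
        capture_output=True, text=True, check=True,
    )
    report = json.loads(result.stdout)
    return report["end"]["sum_sent"]["bits_per_second"] / 1e9

if __name__ == "__main__":
    for host in REMOTE_HOSTS:
        try:
            print(f"{host}: {measure_gbps(host):.2f} Gb/s")
        except (subprocess.CalledProcessError, KeyError) as err:
            print(f"{host}: test failed ({err})")
```

In the real mesh, every node runs such tests toward every other node on a schedule, and MaDDash renders the archived results as a grid of source/destination pairs.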

What did we do? (continued)
o Constructed a temporary network using 100G links to demonstrate the potential of networks with burst capacity greater than that of a single DTN
o Built a partial, ad-hoc BGP peering mesh between some test points to make use of the 100G paths
o Identified some specific optimizations needed
o Fixed a few problems in pursuit of gathering illustrative data for this presentation
o Identified anomalies for further investigation

o Test nodes ordered by geographic latitude
o Performance for nodes that are close is better than for nodes that are far away
o Network problems that manifest over a distance may not manifest locally

o DTNs loaded with the Globus Connect Server suite to obtain GridFTP tools
o cron-scheduled transfers using globus-url-copy
o An ESnet-contributed script parses the GridFTP transfer log and loads the results into an esmond measurement archive
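Below is a minimal Python sketch of the log-parsing step, not the ESnet-contributed script itself: it assumes the common space-separated KEY=VALUE GridFTP transfer-log format with START, DATE (end timestamp), and NBYTES fields, and it only prints per-transfer throughput rather than loading anything into esmond.

```python
# Minimal sketch: compute per-transfer throughput from a GridFTP
# transfer log read on stdin. Assumes KEY=VALUE fields including
# START, DATE (end timestamp), and NBYTES; esmond upload is omitted.
import sys
from datetime import datetime

TS_FORMAT = "%Y%m%d%H%M%S.%f"   # assumed timestamp shape, e.g. 20150309063022.926804

def parse_line(line: str) -> dict:
    """Split one KEY=VALUE transfer-log line into a dict."""
    return dict(field.split("=", 1) for field in line.split() if "=" in field)

def throughput_gbps(entry: dict) -> float:
    start = datetime.strptime(entry["START"], TS_FORMAT)
    end = datetime.strptime(entry["DATE"], TS_FORMAT)
    seconds = max((end - start).total_seconds(), 1e-6)
    return int(entry["NBYTES"]) * 8 / seconds / 1e9

if __name__ == "__main__":
    for line in sys.stdin:
        entry = parse_line(line)
        if {"START", "DATE", "NBYTES"} <= entry.keys():
            print(f"{entry['DATE']}  {entry.get('DEST', '?')}  "
                  f"{int(entry['NBYTES']) / 1e9:.1f} GB  "
                  f"{throughput_gbps(entry):.2f} Gb/s")
```

It could be run as, e.g., `python gridftp_throughput.py < gridftp-transfer.log` (file name hypothetical).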

bost-pt1.es.net -- ps10g-asm2.tools.ucla.net
● Something changed
○ Consistent performance decrease
○ Both directions

ps10g.sdsc.edu -- lbl-pt1.es.net
● Something changed
○ Consistent performance decrease
○ Only in one direction

perf-scidmz.cac.washington.edu -- dps10.ucsc.edu
● Something changed
○ Consistent performance improvement
○ Both directions
○ Path changed 2/20

o Coordinating this effort was quite a bit of work, and there’s still a lot to do
o Traffic doesn’t always go where you think it does
o Familiarity with measurement toolkits such as perfSONAR (bwctl / iperf3, owamp) and MaDDash is essential
o We need people’s time to continue the effort

● Future of CENIC High Performance Research Network (HPR)
o Migrate to 100 Gbps Layer 3 on HPR
o Evolve into persistent infrastructure
● Enhance and maintain perfSONAR test infrastructure across R&E sites
● Engagement with scientists to map their research to the Pacific Research Platform

Links
–ESnet fasterdata knowledge base
–Science DMZ paper
–Science DMZ list (to subscribe, send a message with subject “subscribe esnet-sciencedmz”)
–perfSONAR
–perfSONAR dashboard