TeraGrid and I-WIRE: Models for the Future?
Rick Stevens and Charlie Catlett, Argonne National Laboratory / The University of Chicago

TeraGrid Interconnect Objectives
Traditional: interconnect sites/clusters using a WAN. WAN bandwidth balances cost and utilization; the objective is to keep utilization high to justify the high cost of WAN bandwidth.
TeraGrid: build a wide area "machine room" network. The TeraGrid WAN objective is to handle peak machine-to-machine (M2M) traffic. Partnering with Qwest to begin with 40 Gb/s and grow to ≥80 Gb/s within 2 years.
Long-term TeraGrid objective: build a Petaflops-capable distributed system, requiring Petabytes of storage and a Terabit/second network. The current objective is a step toward this goal. A Terabit/second network will require many lambdas operating at minimum OC-768, and its architecture is not yet clear.
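To put the Terabit/second goal in perspective, here is a quick back-of-the-envelope count (not from the original slides) of how many lambdas of each line rate it takes to reach 1 Tb/s, using the nominal OC-192 (10 Gb/s) and OC-768 (40 Gb/s) rates discussed in this talk.

```python
# Back-of-the-envelope: wavelengths needed for a 1 Tb/s TeraGrid backplane.
# Illustrative only; rates are nominal line rates rounded to whole Gb/s.

import math

TARGET_GBPS = 1000  # 1 Terabit/second

LINE_RATES_GBPS = {
    "OC-192 / 10 GbE": 10,   # today's TeraGrid lambdas
    "OC-768": 40,            # "minimum OC-768" per the slide
}

for name, rate in LINE_RATES_GBPS.items():
    lambdas = math.ceil(TARGET_GBPS / rate)
    print(f"{name}: {lambdas} lambdas to reach {TARGET_GBPS} Gb/s")

# Output:
#   OC-192 / 10 GbE: 100 lambdas to reach 1000 Gb/s
#   OC-768: 25 lambdas to reach 1000 Gb/s
```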

Outline and Major Issues
Trends in national cyberinfrastructure development
TeraGrid as a model for advanced grid infrastructure
I-WIRE as a model for advanced regional fiber infrastructure
What is needed for these models to succeed
Recommendations

Trends: Cyberinfrastructure
Advent of regional dark fiber infrastructure: community owned and managed (via 20-year IRUs), typically supported by state or local resources.
Lambda services (IRUs) are viable replacements for bandwidth service contracts, but need to be structured with built-in capability escalation (bit-rate independence, BRI) and a strong operating capability to exploit them.
Regional (NGO) groups are moving faster (much faster!) than national network providers and agencies: a viable path to putting bandwidth on a Moore's law curve, and a source of new ideas for national infrastructure architecture.

Traditional Cluster Network Access
[Diagram: clusters with 64 GB, 1024 MB, and 1 TB of memory connected through an OC-48 cloud via GbE and OC-12 links; the annotated external rates of 0.5 GB/s and 78 MB/s correspond to roughly 2,000 s (33 min) and 13,000 s (3.6 h) to move 1 TB of memory.]
Traditionally, high-performance computers have been islands of capability separated by wide area networks that provide a fraction of a percent of the internal cluster network bandwidth. (The times shown are the time to move the entire contents of memory.)
A high-performance cluster system interconnect uses Myrinet with very high bisection bandwidth (hundreds of GB/s), but the external connection is n x GbE, where n is a small integer.
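The transfer times in the figure follow directly from memory size divided by external bandwidth. A minimal sketch reproducing them, assuming the 0.5 GB/s figure is the aggregate n x GbE rate and that 78 MB/s corresponds to OC-12:

```python
# Time to move the entire contents of memory over an external link,
# reproducing the figures on the "Traditional Cluster Network Access" slide.

MEMORY_BYTES = 1 * 1024**4  # 1 TB of aggregate cluster memory

EXTERNAL_RATES = {
    "0.5 GB/s (n x GbE aggregate)": 0.5 * 1024**3,
    "78 MB/s (~OC-12)": 78 * 1024**2,
}

for label, bytes_per_sec in EXTERNAL_RATES.items():
    seconds = MEMORY_BYTES / bytes_per_sec
    print(f"{label}: {seconds:,.0f} s (~{seconds / 60:.0f} min)")

# Output:
#   0.5 GB/s (n x GbE aggregate): 2,048 s (~34 min)
#   78 MB/s (~OC-12): 13,443 s (~224 min)
```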

To Build a Distributed Terascale Cluster…
[Diagram: clusters (64 GB and 4096 GB of memory, 10 TB of disk, each node with an external GbE link) joined by a big, fast interconnect; the annotated 5 GB/s and 200 s (3.3 min) correspond to moving roughly 1 TB, and the aggregate 5 GB/s = 200 nodes x 25 MB/s, i.e. 20% of GbE per node.]
TeraGrid is building a "machine room" network across the country while increasing external cluster bandwidth to many GbE. This requires edge systems that handle n x 10 GbE and hubs that handle a minimum of 10 x 10 GbE. (The time shown is the time to move the entire contents of memory and application state on rotating disk.)
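A quick check of the slide's arithmetic: the 200-node count, the 25 MB/s per-node share, and the 5 GB/s aggregate are taken from the slide, and the ~1 TB payload is inferred from the 200 s figure.

```python
# Check the slide's aggregate-bandwidth arithmetic for the distributed
# terascale cluster: 200 nodes each contributing 25 MB/s of their GbE link.

NODES = 200
PER_NODE_MBPS = 25          # MB/s per node, = 20% of a GbE link (~125 MB/s)
GBE_MBPS = 125              # ~1 Gb/s expressed in MB/s

aggregate_mbps = NODES * PER_NODE_MBPS          # 5000 MB/s = 5 GB/s
share_of_gbe = PER_NODE_MBPS / GBE_MBPS         # 0.2 -> 20% of GbE per node

data_gb = 1000                                  # ~1 TB of memory/application state
transfer_s = data_gb * 1000 / aggregate_mbps    # seconds at the aggregate rate

print(f"aggregate: {aggregate_mbps / 1000:.1f} GB/s "
      f"({share_of_gbe:.0%} of GbE per node)")
print(f"time to move ~{data_gb} GB: {transfer_s:.0f} s ({transfer_s / 60:.1f} min)")

# Output:
#   aggregate: 5.0 GB/s (20% of GbE per node)
#   time to move ~1000 GB: 200 s (3.3 min)
```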

13.6 TF Linux TeraGrid
[Diagram: the four-site configuration, with per-site clusters of quad-processor McKinley (Itanium) servers, Myrinet Clos spine interconnects, Fibre Channel storage, and existing site resources (HPSS, Chiba City, SGI Origin, IBM SP Blue Horizon, Sun E10K, HP X-Class/V2500), tied together by Juniper M40/M160 routers and Extreme Black Diamond switches over OC-12/OC-48/OC-192 and 10 GbE links to Starlight, vBNS, Abilene, MREN, ESnet, HSCC, NTON, and CalREN.]
Per-site compute resources:
NCSA: 500 nodes, 8 TF, 4 TB memory, 240 TB disk
SDSC: 256 nodes, 4.1 TF, 2 TB memory, 225 TB disk
Caltech: 32 nodes, 0.5 TF, 0.4 TB memory, 86 TB disk
Argonne: 64 nodes, 1 TF, 0.25 TB memory, 25 TB disk
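As a sanity check on the 13.6 TF headline, this small sketch (not part of the original slides) totals the per-site figures listed above.

```python
# Total the per-site TeraGrid resources listed on the slide.
# Values are taken directly from the slide; totals are computed here.

sites = {
    # name: (nodes, teraflops, memory_tb, disk_tb)
    "NCSA":    (500, 8.0, 4.00, 240),
    "SDSC":    (256, 4.1, 2.00, 225),
    "Caltech": (32,  0.5, 0.40,  86),
    "Argonne": (64,  1.0, 0.25,  25),
}

nodes, tflops, mem_tb, disk_tb = (sum(col) for col in zip(*sites.values()))
print(f"{nodes} nodes, {tflops:.1f} TF, {mem_tb:.2f} TB memory, {disk_tb} TB disk")

# Output: 852 nodes, 13.6 TF, 6.65 TB memory, 576 TB disk
```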

TeraGrid Network Architecture
Cluster interconnect: a multi-stage switch/router tree with multiple 10 GbE external links. Separation of the cluster aggregation and site border routers is necessary for operational reasons.
Phase 1: four routers or switch/routers, each with three OC-192 or 10 GbE WAN PHY interfaces; MPLS to allow for >10 Gb/s between any two sites.
Phase 2: add core routers or switch/routers, each with ten OC-192 or 10 GbE WAN PHY interfaces; ideally expandable with additional 10 Gb/s interfaces.
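The phrase "MPLS to allow for >10 Gb/s between any two sites" implies striping site-to-site traffic across parallel 10 Gb/s paths. The sketch below illustrates the general idea, hashing flows onto parallel label-switched paths so the site-to-site aggregate is bounded by the number of paths rather than a single interface; it is a generic illustration with placeholder documentation addresses and a hypothetical lsp_for_flow helper, not the actual TeraGrid router configuration.

```python
# Generic illustration of flow-hash load balancing across parallel MPLS LSPs.
# Each flow stays on one path (preserving packet order); the aggregate between
# two sites can then exceed the 10 Gb/s limit of any single interface.

import hashlib

PARALLEL_LSPS = 3        # Phase 1: three OC-192 / 10 GbE WAN PHYs per router
LINK_GBPS = 10

def lsp_for_flow(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Pick an LSP by hashing the flow key, keeping a given flow on one path."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return int(hashlib.sha256(key).hexdigest(), 16) % PARALLEL_LSPS

# Placeholder (RFC 5737 documentation) addresses, not real TeraGrid hosts.
flows = [("192.0.2.10", "198.51.100.5", 40000 + i, 5001) for i in range(12)]
for f in flows:
    print(f"flow port {f[2]} -> {f[3]}: LSP {lsp_for_flow(*f)}")

print(f"max site-to-site aggregate: {PARALLEL_LSPS * LINK_GBPS} Gb/s")
```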

Option 1: Full Mesh with MPLS
[Diagram: each site (Caltech, SDSC, NCSA, ANL) has a cluster aggregation switch/router and a site border router or switch/router connecting other site resources and an IP router. Sites reach carrier collocation facilities over metro fiber (One Wilshire in Los Angeles and the Qwest San Diego POP in the west; 710 N. Lakeshore/Starlight and 455 N. Cityfront Plaza/Qwest, about 1 mile apart, in Chicago), with DWDM (Ciena CoreStream and TBD) carrying the OC-192/10 GbE lambdas. Approximate distances: 2200 mi Los Angeles-Chicago, plus 140 mi, 115 mi, 25 mi, and 20 mi regional segments.]

Expansion Capability: "Starlights"
[Diagram: the same four-site layout as Option 1, with the carrier collocation facilities (One Wilshire and the Qwest San Diego POP in the west, 710 N. Lakeshore/Starlight and 455 N. Cityfront Plaza/Qwest in Chicago) acting as regional fiber aggregation points where additional sites and networks can attach. Each hub can house an IP router (packets) or a lambda router (circuits) alongside the DWDM gear (Ciena CoreStream and TBD).]

Partnership: Toward Terabit/s Networks
Aggressive current-generation TeraGrid backplane: 3 x 10 GbE per site today with 40 Gb/s in the core, growing to an 80 Gb/s or higher core within months. This requires hundreds of Gb/s of capacity in core/hub devices.
Architecture evaluation for the next-generation backplane: higher lambda counts, alternative topologies, OC-768 lambdas.
Parallel persistent testbed: use of one or more Qwest 10 Gb/s lambdas to keep next-generation technology and architecture testbeds going at all times; partnership with Qwest and local fiber/transport infrastructure to test OC-768 and additional lambdas. I-WIRE can provide multiple additional dedicated regional 10 Gb/s lambdas and dark fiber for OC-768 testing beginning 2Q 2002.
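The "hundreds of Gb/s in core/hub devices" figure follows from adding up what a hub must terminate. Below is an illustrative tally using only this slide's numbers (3 x 10 GbE per site, a 40 Gb/s core growing to 80 Gb/s); the assumption that a single hub aggregates all four sites is mine.

```python
# Illustrative tally of the switching capacity a TeraGrid core/hub device
# would need, using the per-site and core numbers from the slide.
# Assumption (not from the slide): one hub aggregates all four sites.

SITES = 4
SITE_LINKS = 3            # 3 x 10 GbE per site today
LINK_GBPS = 10
CORE_GBPS = (40, 80)      # current core, then "80 Gb/s or higher"

site_facing = SITES * SITE_LINKS * LINK_GBPS   # 120 Gb/s toward the clusters

for core in CORE_GBPS:
    total = site_facing + core
    print(f"hub terminates {site_facing} Gb/s site-facing + {core} Gb/s core "
          f"= {total} Gb/s")

# Output:
#   hub terminates 120 Gb/s site-facing + 40 Gb/s core = 160 Gb/s
#   hub terminates 120 Gb/s site-facing + 80 Gb/s core = 200 Gb/s
```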

I-WIRE Logical and Transport Topology
[Diagram: I-WIRE fiber linking UIUC/NCSA, Starlight (NU-Chicago), Argonne, UChicago, IIT, UIC, the Illinois Century Network, the James R. Thompson Center / City Hall / State of IL Building, Level(3) at 111 N. Canal, McLeodUSA at 151/155 N. Michigan (Doral Plaza), Qwest at 455 N. Cityfront, and the UC Gleacher Center at 450 N. Cityfront.]
Next steps:
Fiber to Fermilab and other sites
Additional fiber to ANL and UIC
DWDM terminals at the Level(3) and McLeodUSA locations
Experiments with OC-768 and optical switching/routing

Gigapops → Terapops (OIX)
[Figure: gigapop data from Internet2, shown with the Pacific Lightrail and the TeraGrid interconnect.]

Leverage Regional/Community Fiber: Experimental Interconnects

Recommendations
The ANIR program should support:
Interconnection of fiber islands via bit-rate independent or advanced lambdas (BRI lambdas)
Hardware to light up community fibers and build out advanced testbeds
People resources to run these research-community-driven infrastructures
A next-generation connection program will not help advance the state of the art; lambda services need to be BRI.