SoCal Infrastructure
OptIPuter Southern California Network Infrastructure
Philip Papadopoulos, OptIPuter Co-PI
University of California, San Diego
Program Director, Grids and Clusters, San Diego Supercomputer Center
January 2004

Year 1 – Mod-0, UCSD

Building an Experimental Apparatus
Mod-0 OptIPuter: Ethernet (Packet) Based
–Focused as an Immediately Usable High-Bandwidth Distributed Platform
–Multiple Sites on Campus (a Few Fiber Miles)
–Next-Generation, Highly Scalable Optical Chiaro Router at the Center of the Network
Hardware Balancing Act
–Experiments Really Require Large Data Generators and Consumers
–Science Drivers Require Significant Bandwidth to Storage
–OptIPuter Predicated on Price/Performance Curves of >1 GE Networks
System Issues
–How Does One Build and Manage a Reconfigurable Distributed Instrument?

Year 2 – Mod-0, UCSD

Southern Cal Metro Extension, Year 2

Aggregates
Year 1 (Network Build)
–Chiaro Router Purchased, Installed, Working (Feb)
–5 Sites on Campus, Each with 4 GigE Uplinks to Chiaro
–Private Fiber, UCSD-Only
–~40 Individual Nodes, Most Shared with Other Projects
–Endpoint Resource Poor, Network Rich
Year 2 (Endpoint Enhancements)
–Chiaro Router: Additional Line Cards, IPv6, Starting 10GigE Deployment
–8 Sites on Campus + 3 Metro Sites
–Multiple Virtual Routers for Connection to Campus, CENIC HPR, Others
–>200 Nodes, Most Donated (Sun and IBM), Most Dedicated to OptIPuter
–InfiniBand Test Network on 16 Nodes + Direct IB Switch to GigE
–Enough Resource to Support Data-Intensive Activity, Slightly Network Poor
Year 3+ (Balanced Expansion Driven by Research Requirements)
–Expand 10GigE Deployments
–Bring Network, Endpoint, and DWDM (Mod-1) Forward Together
–Aggregate at Least a Terabit (Both Network and Endpoints) by Year 5 (see the back-of-envelope sketch below)
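As a sanity check on the build-out numbers above, the arithmetic below (an illustrative Python sketch, not project software) compares the Year 1 aggregate uplink capacity with the stated Year 5 terabit target.

```python
# Back-of-envelope check of the network build-out figures quoted above.
# The Year 1 numbers come from the slide; 1 Tb/s is the stated Year 5
# aggregate target. Purely illustrative arithmetic.

GIGE = 1.0  # Gb/s per GigE link

year1_sites = 5
uplinks_per_site = 4
year1_aggregate = year1_sites * uplinks_per_site * GIGE   # 20 Gb/s

year5_target = 1000.0  # Gb/s (one terabit)

print(f"Year 1 aggregate uplink capacity: {year1_aggregate:.0f} Gb/s")
print(f"Year 5 target: {year5_target:.0f} Gb/s "
      f"({year5_target / year1_aggregate:.0f}x the Year 1 build)")
```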

Web Information on the SD OptIPuter
We need folks to start using resources and to feed back how things can work better
The intention is to give full control of resources to experiments
Experiments themselves should be of defined timeframe
–This is a shared instrument
The resource endpoints are experimental
–NO BACKUPS
–DO NOT EXPECT 24/7 SUPPORT (we don't have the staff)
–THINGS WILL BREAK

High-Level Program Bullets
UCSD will complete deployment of a 150+ node distributed test bed that consists of compute, storage, visualization, and instrument endpoints. Management policies will be put in place so that experiments can be assigned physical hardware and then have specialized (experiment-specific) software loaded on assigned nodes. This forms the core Southern California OptIPuter test bed.
In year 2, we won an IBM SURS grant which allowed us to define a larger storewidth evaluation platform than in the original program plan. This cluster, deployed in January 2004, has 48 nodes with 6 spindles per node. It enables middleware and applications to understand how LambdaGrids enable applications.
As part of the SURS grant, a small IB test fabric was purchased. The Topspin switch includes 4 GigE uplinks to allow investigation of IB-to-Ethernet communication.

Revised Program Plan Bullets
UCSD will complete deployment of a 150+ node distributed test bed that consists of compute, storage, visualization, and instrument endpoints. Management policies will be put in place so that experiments can be assigned physical hardware and then have specialized (experiment-specific) software loaded on assigned nodes. This forms the core Southern California OptIPuter test bed.
–Account creation, cataloging of resources, MRTG-based network monitoring, and Ganglia resource monitoring are already available as a web page at web.optiputer.net
The UCSD-deployed test bed structure will have a best-case bisection bandwidth of 5:1 and a worst case of 8:1. Monitoring will be put in place to measure when these links are saturated.
UCSD, UCI, and USC/ISI connections of OptIPuter nodes are expected to be made through CENIC CalREN HPR 1 gigabit/s circuits. This will form a Southern California OptIPuter test bed. At UCSD, the connection from the campus border router will be made to the existing Chiaro OptIPuter router.
IPv6 software will be made a standard part of the base OptIPuter software stack by the end of year 2.

Revised Program Plan Bullets, Part 2
In year 2, we won an IBM SURS grant which allowed us to define a larger storewidth evaluation platform than originally proposed. This cluster, deployed in January 2004, has 48 nodes with 6 spindles per node. It enables middleware and applications to understand how LambdaGrids enable applications.
The 48-node storage cluster will employ a 48-port gigabit switch with a single 10 GigE uplink to the Chiaro Enstara router. This will achieve a 5:1 external (bisection) ratio for this particular OptIPuter node (see the sketch below).
As part of the SURS grant, a small IB test fabric was purchased. The Topspin switch includes 4 GigE uplinks to allow investigation of IB-to-Ethernet communication.
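The 5:1 figure above follows directly from the port counts. The short Python sketch below (illustrative only, not part of the OptIPuter software stack) shows the calculation, and applies the same formula to the 32-node, 10 GigE case mentioned later under Cluster Hardware.

```python
# A minimal sketch of how the oversubscription (external bisection) ratios
# quoted in the program plan can be derived: total node-facing GigE capacity
# divided by uplink capacity toward the Chiaro router.

def bisection_ratio(node_ports, port_gbps, uplink_gbps):
    """Ratio of aggregate edge bandwidth to uplink bandwidth."""
    return (node_ports * port_gbps) / uplink_gbps

# 48-node storage cluster: 48 x 1 GigE behind one 10 GigE uplink -> ~5:1
print(f"Storage cluster: {bisection_ratio(48, 1, 10):.1f}:1")

# 32-node cluster on a 10 GigE uplink (Cluster Hardware slide) -> ~3:1
print(f"32-node cluster: {bisection_ratio(32, 1, 10):.1f}:1")
```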

Some Extracts from the Program Plan
USC, UCI and SDSU, in addition to UCSD, will install OptIPuter nodes and link them to the CENIC dark fiber experimental network.
–UCSD, UCI, and USC/ISI connections of OptIPuter nodes are expected to be made through CENIC CalREN HPR 1 gigabit/s circuits. This will form a Southern California OptIPuter test bed. At UCSD, the connection from the campus border router will be made to the existing Chiaro OptIPuter router.
UIC will work with NU and UCSD to measure whether a 4:1 local bisection bandwidth can be achieved.
–UCSD will deploy a test bed structure where the best bisection bandwidth is 5:1 and the worst is 8:1. Monitoring will be put in place to measure when these links are saturated.
UCSD will continue to work with its campus and CENIC to connect all OptIPuter sites in Southern California. UIC will continue to build up the facilities at StarLight to accept several new 10 Gb links, notably from Asia, the UK and CERN.
–See above

IPv6: Tom Hutton (UCSD/SDSC) is being given one month of funding to develop an IPv6 OptIPuter testbed
–IPv6 software will be made a standard part of the base OptIPuter software stack by the end of year 2. IPv6 experiments will include moving data from electron microscopes at NCMIR to OptIPuter resources over IPv6 and measuring performance (see the sketch below).
Extend the Chiaro network to SDSU, UCI and USC
Work towards 10GigE integration (whether point-to-point, switched, or routed needs to be determined).
–The 48-node storage cluster will employ a 48-port gigabit switch with a single 10 GigE uplink to the Chiaro Enstara router. This will achieve a 5:1 external (bisection) ratio for this particular OptIPuter node.
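To make the planned IPv6 measurement concrete, here is a minimal, hypothetical sketch of such a transfer test: push a fixed amount of data to an OptIPuter endpoint over an AF_INET6 TCP socket and report the achieved rate. The host name and port are placeholders, and real experiments would use the project's own tools.

```python
# A minimal sketch (not project code) of the kind of IPv6 transfer test the
# plan describes: push a block of data to an OptIPuter endpoint over an
# AF_INET6 socket and report throughput. Host name and port are hypothetical.
import socket, time

DEST = ("optiputer-node.example.net", 5001)   # hypothetical IPv6-reachable host
PAYLOAD = b"\0" * (64 * 1024)                 # 64 KB send buffer
TOTAL_BYTES = 1 * 1024**3                     # move 1 GB for the measurement

with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as s:
    s.connect(DEST)
    sent, start = 0, time.time()
    while sent < TOTAL_BYTES:
        s.sendall(PAYLOAD)
        sent += len(PAYLOAD)
    elapsed = time.time() - start

print(f"Moved {sent / 1e9:.2f} GB in {elapsed:.1f} s "
      f"({8 * sent / elapsed / 1e9:.2f} Gb/s)")
```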

UCSD (Papadopoulos) is developing a storewidth evaluation platform. Some clusters will have 16 nodes with 4-6 drives each. UCSD is planning on 25% of cluster platforms being high-bandwidth storage nodes.
–Refinement: In year 2, we won an IBM SURS grant which allowed us to define a larger storewidth evaluation platform than originally proposed. This cluster, deployed in January 2004, has 48 nodes with 6 spindles per node. It enables middleware and applications to understand how LambdaGrids enable applications (see the rough arithmetic below).
UCSD's (Papadopoulos) dynamic credit scheme design will be dependent on the availability of InfiniBand hardware.
–As part of the SURS grant, a small IB test fabric was purchased. The Topspin switch includes 4 GigE uplinks to allow investigation of IB-to-Ethernet communication. Working with Alan Benner at IBM, initial investigations into the integration of IB- and GigE-connected nodes will be explored.
UCSD will complete deployment of a 150+ node distributed testbed that consists of compute, storage, and visualization endpoints. Management software will be put in place so that experiments can be assigned physical hardware and then have specialized (experiment-specific) software loaded on assigned nodes. This forms the core Southern California OptIPuter testbed.
–Account creation, cataloging of resources, MRTG-based network monitoring, and Ganglia resource monitoring are already available as a web page at web.optiputer.net
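For a rough sense of why the spindle count matters for storewidth, the back-of-envelope sketch below multiplies the 48 nodes and 6 spindles per node from the slide by an assumed per-spindle rate (the 50 MB/s figure is purely an assumption for illustration) and compares the result with the cluster's single 10 GigE uplink.

```python
# Rough storewidth arithmetic for the 48-node evaluation cluster described
# above. The per-spindle figure is an assumption for illustration only; the
# node and spindle counts come from the slides.

nodes = 48
spindles_per_node = 6
assumed_mb_per_spindle = 50          # MB/s, assumed sequential rate per disk

spindles = nodes * spindles_per_node                        # 288 spindles
disk_gbps = spindles * assumed_mb_per_spindle * 8 / 1000    # aggregate, Gb/s

print(f"{spindles} spindles -> ~{disk_gbps:.0f} Gb/s aggregate disk bandwidth")
print("Compare with the single 10 GigE uplink toward the Chiaro router.")
```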

Cluster Hardware
Complete installation of a visualization cluster at UCSD/SIO
Purchase and install four general-purpose IA-32 clusters at key campus sites (compute and storage); clusters will all have PCI-X (100 MHz, 1 Gbps). Note: use of InfiniBand is dependent on emerging company positions/products and on the result of the IBM equipment proposal.
Expand an IA-32 system to 16- or 32-way
Connect one or two campus sites with 32-node clusters at 10GigE (3:1 campus bisection ratio)
Integrate the SIO/SDSU Panoram systems into the OptIPuter network
Install and configure a variety of OptIPuter management workstations (work has already begun): Dell IA-32 and HP IA-64 workstations, integrated with the Chiaro network and equipped with NSF Middleware Initiative (NMI) software capabilities, OptIPuter accounts, shared storage, revision control and other general support services

Other Areas Where the OptIPuter Platform Needs to Support Research Activities
Software Architecture Activities
–USC will begin designing lambda overlay protocols and the interface between the cluster communication fabric and the wide-area optical network. Design of a transparent (MAN/WAN) lambda to OS-bypass (local cluster communication) messaging transport framework will be undertaken.
–UCSD will collaborate with UCI to begin exploring security models using packet-level redundancy and multi-pathing.
–The first-generation OptIPuter System Software will be released to partner sites for application development and evaluation.
–UCI will begin design of the Time-Triggered Message-Triggered Object (TMO) programming framework for the OptIPuter.
–TAMU will include network protocols and middleware in its performance analysis work.
–USC (Bannister, et al.) will document XCP's design, develop a protocol specification, and publish the XCP header format and other protocol parameters as Internet Drafts.
–Work on Quanta/RBUDP and SABUL will continue.

System Architecture (Chien)
Prototype Distributed Virtual Computer alternatives on OptIPuter prototypes
Prototype novel Group Transport Protocols (GTP) and demonstrate them on OptIPuter prototypes
Do detailed studies of resource management in OptIPuter systems using the MicroGrid

Data Visualization Activities
UIC, UCSD and SDSU will deploy data management, visualization and collaboration software systems developed in Year 1 to OptIPuter application teams, both for testing and for feedback.
UIC will continue to enhance computing and visualization cluster hardware/software to achieve very-large-field real-time volume rendering.
UIC, UCSD and UCI will integrate the data middleware and high-level data services into an OptIPuter application and begin using the data middleware to develop data-mining algorithms for clustering data streaming over the OptIPuter. In addition, high-level software abstractions for remote data retrieval and for accessing bundled computing resources as high-level computational units or peripherals will be studied.
Multi-site synchronous and near-synchronous visualization will be demonstrated.

From the Site Review Response
Focused questions that we said we would address in Year 2
–How do we control lambdas, and how do protocols influence their utility?
–How is a LambdaGrid different from a Grid in terms of middleware?
–How can lambdas enhance collaboration?
–How are applications quantitatively helped by LambdaGrids?
What we said would happen at the AHM
–Distribute research focus topics as soon as NSF approves the Year 2 Program Plan
–Have individuals refine Year 2 deliverables (described in the Cooperative Agreement and PPP)
–Have individuals send refined deliverables to Team Leaders by the end of December
–Have Team Leaders and their teams review and summarize at the AHM
–Finalize at the AHM and submit to NSF shortly thereafter

Goals for this Group
Experiments to be run – many overlaps with other teams
Construction and use plans of the SD OptIPuter infrastructure
Software integration activities
–The Program Plan says we will make software available to all OptIPuter participants
Details and proposed timelines
–Exact dates are not essential
–Dependency ordering of activities IS essential
–Who? For centralized infrastructure, we need to know whom to help
–6-, 12- and 18-month plans are useful
Shortened timeframe
–The 2005 Program Plan is due at the end of June 2004, so 5 months is the key horizon
–A Site Review/Program Plan Review is likely in July; allow time for vacations in August and September

Group Comments
Questions for other groups
–What needs to be "persistent" software capability, even when researchers are changing kernels, middleware and other software pieces?
–How are people going to use storage?
–How to get data in and out of the OptIPuter
–Serving out blocks, file systems
–Federated storage (not in 2004)
–A 48-node (roughly 300-spindle) storage cluster is available
Root access to nodes needed by experiments (expected)
What about security constrictions (firewalls) and high performance?
–A question for OptIPuter security researchers
How much variety in performance of endpoints?
–Line rate, near line rate
–Are basic benchmarks useful? If so, can we get basic benchmarks to run?
–Listing of capabilities of nodes: storage, memory, speed, connectivity (see the sketch below)
What sort of performance monitoring do groups need?
Experiment priority
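For the "listing of capabilities of nodes" item above, a minimal sketch of the kind of inventory script one might run on a Linux node is shown below. It is illustrative only, not part of any OptIPuter software, and it assumes the iproute2 `ip` tool is installed.

```python
# A minimal sketch (not part of the OptIPuter software stack) of a node
# capability listing: storage, memory, CPU count, and network interfaces,
# gathered from a Linux node's /proc and standard tools.
import os, platform, shutil, subprocess

def node_capabilities():
    caps = {"host": platform.node(), "arch": platform.machine()}
    # Storage: total space on the root filesystem
    usage = shutil.disk_usage("/")
    caps["disk_total_gb"] = round(usage.total / 1e9, 1)
    # Memory: first line of /proc/meminfo (Linux-specific)
    with open("/proc/meminfo") as f:
        caps["mem_total"] = f.readline().split(":")[1].strip()
    # CPU count as a rough "speed" indicator
    caps["cpus"] = os.cpu_count()
    # Connectivity: interface list from `ip -o link` (iproute2 assumed present)
    links = subprocess.run(["ip", "-o", "link"], capture_output=True, text=True)
    caps["interfaces"] = [line.split(":")[1].strip()
                          for line in links.stdout.splitlines()]
    return caps

if __name__ == "__main__":
    for key, value in node_capabilities().items():
        print(f"{key}: {value}")
```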

Experiments
Replica management of very large data sets (NASA requirements)
–100 million grid points – terabyte-sized chunks; how to really move this (see the arithmetic below)
–How to compute in one place and store on a different OptIPuter node
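To give a feel for the "how to really move this" question, the arithmetic below estimates the transfer time for a terabyte-sized chunk at a few of the line rates discussed in this deck, assuming the network link is the only bottleneck.

```python
# Back-of-envelope arithmetic for the replica-management question above:
# how long does a terabyte-sized chunk take to move at different line rates?
# Assumes the link is the only bottleneck (no disk or protocol overhead).

terabyte_bits = 1e12 * 8  # one terabyte, in bits

for gbps in (1, 4, 10):   # single GigE, 4 bonded GigE uplinks, 10 GigE
    seconds = terabyte_bits / (gbps * 1e9)
    print(f"{gbps:>2} Gb/s: {seconds / 3600:.1f} hours per terabyte chunk")
```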