SAN over WAN - a new way of solving the GRID data access bottleneck
Silicon Graphics, Inc.
Cracow ‘03 Grid Workshop
Presented by: Dr. Wolfgang Mertz, Business Development Manager for Storage in EMEA
Page 2: Data Growth Trends
[Chart: storage shipped per year, in terabytes]
–From 1998 to 2000, storage shipped grew at 78% CAGR
–From 2001 to 2005, it is projected to grow at 83% CAGR
–Data under management in an HPC environment is currently growing at over 100% per year
Source: Lyman, Peter and Hal R. Varian, "How Much Information", retrieved on 12/19/2002.
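To make the compound growth rates quoted on this slide concrete, here is a minimal Python sketch. The function names and the starting value of 1.0 are illustrative assumptions; the chart's absolute terabyte figures are not used.

```python
import math


def compound_growth(start, annual_rate, years):
    """Value after `years` of compounding at `annual_rate` (0.78 means 78%)."""
    return start * (1.0 + annual_rate) ** years


def doubling_time_years(annual_rate):
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


if __name__ == "__main__":
    # Relative growth only: 1.0 is a placeholder starting volume.
    print(compound_growth(1.0, 0.78, 2))   # two years at 78% CAGR
    print(doubling_time_years(1.00))       # data growing at 100% per year
```

At 78% CAGR the shipped volume roughly triples in two years, and growth of 100% per year corresponds to a doubling every year.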
Page 3: 2 Buzzwords in IT Industry
Server Consolidation
–maybe in a commercial environment
–usually not in a technical environment
  a hammer is a hammer, a screwdriver is a screwdriver
  an HPC system cannot be used as an HPV system
Storage Consolidation
–DAS -> NAS -> SAN
Page 4: History of Storage Architectures
DAS - Direct Attached Storage
pro
–appropriate performance
con
–distributed, expensive administration
–data may not be where it is needed
–multiple copies of data stored
Page 5: History of Storage Architectures
NAS - Network Attached Storage
pro
–centralized, less expensive administration
–one copy of data
–access from every system
con
–network performance is the bottleneck
Page 6: History of Storage Architectures
SAN - Storage Area Network
[Diagram label: Switch]
pro
–centralized administration
–performance equivalent to DAS
con
–NO FILE SHARING
–multiple copies of data stored
Page 7: How does that translate to a GRID Environment?
Storage Consolidation
–useful in a local environment (GRID node)
–does not work between remote GRID nodes
Current Data Access between GRID Nodes
–Data has to be copied before/after the execution of a job
–Problems
  the copy process has to be done manually or included in the job script
  copying can take a long time
  multiple copies of data
  –additional disk space needed
  –revision problem
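The copy-before and copy-after pattern described on this slide can be sketched as follows. This is a hypothetical illustration only: `run_with_staging` and its parameters are invented names, and plain file-system paths stand in for storage on a remote GRID node.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(remote_input, remote_output_dir, job):
    """Stage input to local scratch, run the job, stage the result back.

    `remote_input` and `remote_output_dir` are assumed to be paths that
    represent storage at another site; `job` is any callable that takes the
    local input path and returns the path of the output file it wrote.
    """
    with tempfile.TemporaryDirectory() as scratch:
        local_input = Path(scratch) / Path(remote_input).name
        # Stage-in: this creates a second copy of the data on local disk.
        shutil.copy2(remote_input, local_input)
        # The job only ever sees the local copy.
        local_output = Path(job(local_input))
        # Stage-out: the result is copied back before scratch is deleted.
        target = Path(remote_output_dir) / local_output.name
        shutil.copy2(local_output, target)
        return target
```

Both copy steps add transfer time and temporary disk usage, and the original input can change while the job runs on its stale local copy, which is the revision problem named above.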
Page 8: What if ...
–a SAN had the same file sharing capability as a NAS?
–one could build a SAN between different buildings/sites/cities and not lose performance?
Page 9: Storage Area Networks (SAN): The High Performance Solution
[Diagram labels: LAN, SAN]
A first step: each host owns a dedicated volume consolidated on a RAID array.
–Storage management is centralized
–Offers a certain level of flexibility
Page 10: SGI InfiniteStorage Shared FileSystem (CXFS)
[Diagram labels: LAN, SAN; hosts: Solaris, AIX, HP-UX; Windows NT, 2000 and XP; Linux, Mac OS; IRIX]
A unique high-performance solution: each host shares one or more volumes consolidated in one or more RAID arrays.
–Centralized storage management
–High modularity
–True high-performance data sharing
–Heterogeneous environment
Page 11: Fibre Channel over SONET/SDH: The High Efficiency, Long Distance Alternative
Data re-transmission due to IP packet loss limits actual IP throughput over distance.
[Chart: hours versus distance (kilometers); marked cities: New York, Boston, Chicago, Denver]
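One common way to reason about why packet loss hurts IP throughput more as distance grows is the Mathis approximation for a single TCP flow, rate ≈ (MSS / RTT) · C / √p. The slide does not name this model; it is used here only as an illustrative sketch. The fibre propagation speed, the segment size, and the loss rate in the example are all assumptions, and queueing and equipment delays are ignored.

```python
import math

# Assumption: light travels at about 200,000 km/s in optical fibre.
FIBRE_KM_PER_S = 200_000.0


def round_trip_time_s(distance_km):
    """Propagation-only round-trip time for a link of the given length."""
    return 2.0 * distance_km / FIBRE_KM_PER_S


def tcp_throughput_bound_mbps(distance_km, loss_rate, mss_bytes=1460):
    """Mathis-style upper bound on single-flow TCP throughput, in Mb/s.

    loss_rate is the packet loss probability (for example 1e-4);
    mss_bytes is an assumed TCP segment size.
    """
    rtt = round_trip_time_s(distance_km)
    c = math.sqrt(1.5)  # constant of the simple Mathis model, about 1.22
    bits_per_s = (mss_bytes * 8 / rtt) * c / math.sqrt(loss_rate)
    return bits_per_s / 1e6


if __name__ == "__main__":
    for km in (100, 500, 1000, 3000):
        print(km, "km:", round(tcp_throughput_bound_mbps(km, 1e-4), 1), "Mb/s")
```

Under this model the bound is inversely proportional to distance for a fixed loss rate, so doubling the distance halves the achievable single-flow throughput.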
Page 12: LightSand Solution for building a Global-SAN
[Diagram: two sites, each with SAN, Tape System, Storage, Servers, Client LAN, IP Router and Fibre Channel Switch, connected across the WAN; WAN options shown: DWDM, Dedicated Fiber, SDH/SONET; link labels: SONET, FC, IP]
Page 13: LightSand Products
S-600
–2 ports FC and/or IP, 1 Gb/s
–Point-to-point SAN interconnect over SONET/SDH OC-12c (622 Mb/s bandwidth)
–Low latency (approximately 50 µs)
S-2500
–3 ports FC and/or IP, 1 Gb/s
–Point-to-point SAN interconnect over SONET/SDH OC-48c (2.5 Gb/s bandwidth)
–Point-to-multipoint SAN interconnect over SONET/SDH (up to 5 SAN islands, 622 Mb/s per link)
–Low latency (approximately 50 µs)
Page 14: Data Movement Today – A Recent Case Study
[Diagram: Sandia National Laboratories (SNL) and Los Alamos National Laboratory (LANL), each with a Server and a Fibre Channel Storage Area Network, connected by an IP Network]
Scientists at LANL currently dump 100 GB of supercomputing data to tape and FedEx it to SNL because it is faster than trying to use the existing 155 Mb/s IP WAN connection.
–Actual measured throughput of 16 Mb/s! (10% bandwidth utilization)
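The arithmetic behind this case study can be checked with a short Python sketch. It assumes decimal units (1 GB = 8,000 Mb) and a constant transfer rate; the function name is illustrative.

```python
def transfer_time_hours(size_gigabytes, rate_megabits_per_s):
    """Hours needed to move `size_gigabytes` at a constant rate in Mb/s."""
    size_megabits = size_gigabytes * 1000 * 8  # decimal units: 1 GB = 8000 Mb
    return size_megabits / rate_megabits_per_s / 3600.0


if __name__ == "__main__":
    measured = transfer_time_hours(100, 16)    # measured WAN throughput
    nominal = transfer_time_hours(100, 155)    # nominal WAN line rate
    print(f"at 16 Mb/s:  {measured:.1f} hours")
    print(f"at 155 Mb/s: {nominal:.1f} hours")
    print(f"utilization: {16 / 155:.0%}")
```

At the measured 16 Mb/s the 100 GB transfer takes close to 14 hours, against roughly 1.4 hours if the full 155 Mb/s were usable.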
Page 15: The Better Way – Directly Between Storage Systems
[Diagram: Local Data Center and Remote Data Center, each with Server, FC SAN, IP Network and LightSand Gateway, connected through Telco SONET/SDH Infrastructure]
Using LightSand gateways, the same data could be transferred in a few minutes!
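As a rough plausibility check of the "few minutes" figure, the sketch below computes idealized transfer times for the 100 GB data set at the SONET/SDH rates listed for the LightSand products. It assumes the full line rate is available as payload, which ignores framing and protocol overhead; the dictionary and function names are illustrative.

```python
# Line rates in Mb/s as quoted on the LightSand product slide.
LINK_RATES_MBPS = {
    "OC-12c": 622.0,
    "OC-48c": 2500.0,
}


def transfer_time_minutes(size_gigabytes, rate_megabits_per_s):
    """Idealized minutes to move the data at the full line rate."""
    size_megabits = size_gigabytes * 1000 * 8  # decimal units
    return size_megabits / rate_megabits_per_s / 60.0


def compare_links(size_gigabytes):
    """Return {link name: minutes} for every rate in LINK_RATES_MBPS."""
    return {
        name: transfer_time_minutes(size_gigabytes, rate)
        for name, rate in LINK_RATES_MBPS.items()
    }


if __name__ == "__main__":
    for name, minutes in compare_links(100).items():
        print(f"{name}: {minutes:.1f} minutes")
```

Under these assumptions 100 GB takes about 21 minutes over OC-12c and a little over 5 minutes over OC-48c.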
Page 16: What does that mean for a GRID Environment?
[Map: GDAŃSK, POZNAŃ, ŁÓDŹ, KRAKÓW, WROCŁAW, WARSZAWA]
–Full Bandwidth Data Access across the GRID
–No Multiple Copies of Data
  avoid the revision problem
  do not waste disk space
–Make GRID Computing more efficient
Page 17: Highly Integrated, Massively Scalable Systems
–Storage
–Advanced Graphics
–High-Performance Computing
Page 18: SGI InfiniteStorage Product Line
Choose only the integrated capabilities you need
Storage Hardware: TP900, TP9100, TP9300, TP9400, TP9500, HDS 99x0, STK Tape Libraries, ADIC Libraries, Brocade Switches, NAS 2000, SAN 2000, SAN 3000
Storage Architectures: DAS, NAS, SAN
High Availability: Redundant Hardware and FailSafe™ XVM
Data Protection: Legato NetWorker, XFS™ Dump, OpenVault™
HSM: SGI Data Migration Facility (DMF), TMF, OpenVault™
Data Sharing: XFS, CIFS/NFS, Samba, ClusteredXFS (CXFS™), SAN over WAN