DAS-3 and StarPlane have Landed: Architecture, Status... Freek Dijkstra
DAS history
A project to prove that distributed clusters are as effective as supercomputers: a simple Computer Science grid that works.
- DAS-1 (1997-2002): 4 sites; 200 MHz Pentium Pro; Myrinet; BSD/Linux; WAN: 6 Mb/s ATM (full mesh)
- DAS-2 (2002-2006): 5 sites; 1 GHz Pentium III, 1 GB memory; Myrinet; Red Hat Linux; WAN: 1 Gb/s routed SURFnet
- DAS-3 (2006-future): 4 sites, 5 clusters; 2.2+ GHz Opteron, 4 GB memory (not uniform!); Myrinet + WAN; WAN: 8×10 Gb/s dedicated
Parallel to Distributed Computing
- Cluster computing: parallel languages (Orca, Spar); parallel applications
- Distributed computing: parallel processing on multiple clusters; study non-trivially parallel applications; exploit the hierarchical structure for locality optimizations
- Grid computing
DAS-2 Usage
200 users; 25 Ph.D. theses. A simple, clean, laboratory-like system.
Example applications:
- Solving Awari (a 3500-year-old game)
- HIRLAM: weather forecasting
- GRAPE: simulation hardware for astrophysics
- Manta: distributed supercomputing in Java
- Ensflow: stochastic ocean flow model
http://www.cs.vu.nl/das2/
Grid Computing
- Ibis: Java-centric grid computing
- Satin: divide-and-conquer on grids (see the sketch below)
- Zorilla: P2P distributed supercomputing
- KOALA: co-allocation of grid resources
- CrossGrid: interactive simulation and visualization of a biomedical system
- VL-e: scientific collaboration using the grid (e-Science)
- LambdaRAM: share memory among cluster nodes
Layered view: applications, on top of grid middleware, on top of computing clusters + network.
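To make the divide-and-conquer bullet concrete, here is a rough sketch in plain Java using the standard fork/join framework. It only illustrates the spawn-and-sync programming style that systems like Satin target on grids; it is not the Satin API, and the class and method names are ours.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Plain-Java fork/join sketch of the divide-and-conquer style that Satin
// targets; this uses java.util.concurrent, not the Satin API itself.
public class FibTask extends RecursiveTask<Long> {
    private final int n;

    FibTask(int n) { this.n = n; }

    @Override
    protected Long compute() {
        if (n < 2) {
            return (long) n;            // trivial leaf: solve directly
        }
        FibTask left = new FibTask(n - 1);
        left.fork();                    // "spawn": may run on another worker
        FibTask right = new FibTask(n - 2);
        long r = right.compute();       // compute one half locally
        return left.join() + r;         // "sync": wait for the spawned half
    }

    public static void main(String[] args) {
        long result = new ForkJoinPool().invoke(new FibTask(30));
        System.out.println("fib(30) = " + result);
    }
}
```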
Colourful Future: DAS-3 Timeline
- 2004: DAS-3 proposal initiated (autumn)
- 2005: proposal accepted (summer); European tender preparation (September); tender call (December)
- 2006: five proposals received (February); ClusterVision chosen (April); pilot cluster at the VU (June); intended installation (August); official ending of DAS-2 (end of year)
Funding: NWO, NCF, VL-e (UvA, Delft, part of VU), MultimediaN (UvA), Universiteit Leiden.
DAS-2 Cluster (schematic)
- 32-72 compute nodes per cluster
- Fast interconnect: Myrinet (2 Gbit/s)
- Local interconnect: 100 Mb/s Ethernet
- Head node, connected at 1 Gbit/s Ethernet to the local university and the wide-area interconnect
DAS-3 Cluster (schematic)
- 32-85 compute nodes per cluster
- Fast interconnect: Myrinet (10 Gbit/s)
- Local interconnect: 1 Gbit/s Ethernet into a Nortel switch
- Head node, connected at 10 Gbit/s Ethernet to SURFnet and to the local university
Heterogeneous Clusters (per-site values in the order LU / TUD / UvA-VLe / UvA-MN / VU; DC = dual core)
Head nodes:
- Storage: 10 TB / 5 TB / 2 TB / 2 TB / 10 TB; 29 TB total
- CPU: 2x2.2 GHz DC or 2x2.4 GHz DC
- Memory: 8 GB or 16 GB per head node; 64 GB total
- Myri-10G interface: 1 each (all sites except TUD); 10GE interface: 1 at every site
Compute nodes:
- Node count: 32 / 68 / 40 (1) / 46 / 85; 271 total
- Storage per node: 400 GB / 250 GB / 250 GB / 2x250 GB / 250 GB; 84 TB total
- CPU per node: 2x2.6 GHz / 2x2.4 GHz / 2x2.2 GHz DC / 2x2.4 GHz / 2x2.4 GHz DC; 1.9 THz total
- Memory per node: 4 GB; 1048 GB total
- Myri-10G interface: 1 each (all sites except TUD)
Myrinet switches:
- 10G ports: 33 (7) / 41 / 47 / 86 (2) at LU / UvA-VLe / UvA-MN / VU
- 10GE ports: 8 each; 320 Gb/s total
Nortel switches:
- 1GE ports: 32 (16) / 136 (8) / 40 (8) / 46 (2) / 85 (11); 339 Gb/s total
- 10GE ports: 1 (1) / 9 (3) / 2 / 2 / 1 (1)
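As a quick arithmetic check of the aggregates above, the snippet below recomputes the total node count and the aggregate clock rate from the per-cluster figures; the core counts assume dual-socket nodes with DC meaning dual-core CPUs, which is our reading of the table.

```java
// Quick sanity check of the DAS-3 aggregate figures, using the per-cluster
// numbers from the table above (order: LU, TUD, UvA-VLe, UvA-MN, VU).
public class Das3Totals {
    public static void main(String[] args) {
        int[] nodes  = { 32, 68, 40, 46, 85 };
        double[] ghz = { 2.6, 2.4, 2.2, 2.4, 2.4 };  // clock rate per core
        int[] cores  = { 2, 2, 4, 2, 4 };            // dual socket; DC = dual core

        int totalNodes = 0;
        double totalGhz = 0;
        for (int i = 0; i < nodes.length; i++) {
            totalNodes += nodes[i];
            totalGhz   += nodes[i] * cores[i] * ghz[i];
        }
        System.out.printf("compute nodes: %d%n", totalNodes);                // 271
        System.out.printf("aggregate clock: %.1f THz%n", totalGhz / 1000.0); // ~1.9 THz
    }
}
```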
Problem space (diagram): CPU, data, and network; DAS-2 versus DAS-3 & StarPlane.
SURFnet6 in the Netherlands
SURFnet connects about 180 institutions: universities, academic hospitals, most polytechnics, and research centers, with a user base of roughly 750,000 users. Its ~6000 km of fiber is comparable in extent to the Dutch railway system.
Common Photonic Layer (CPL)
- 5 rings
- Initially 36 lambdas (4x9), later 72 lambdas (8x9)
- Throughput per lambda: up to 10 Gb/s now, later up to 40 Gb/s
Quality of Service (QoS) by providing wavelengths
Old QoS: one fiber carrying a single lambda; part of it is set aside on request, and the rest gets less service.
New QoS: one fiber carrying multiple lambdas (separate colours); requests are moved onto other lambdas as needed, so the remaining traffic is better off as well.
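A toy sketch of that idea, under our own naming (LambdaAllocator is illustrative, not a SURFnet6 or Nortel interface): a reservation takes a whole wavelength for itself instead of carving bandwidth out of a shared channel, and the remaining wavelengths stay available for best-effort traffic.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of wavelength-based QoS: a request gets a dedicated lambda,
// and best-effort traffic keeps the remaining lambdas. Names and numbers are
// illustrative only.
public class LambdaAllocator {
    private final List<String> freeLambdas = new ArrayList<>();

    public LambdaAllocator(int count) {
        for (int i = 0; i < count; i++) {
            freeLambdas.add("lambda-" + i);   // each colour carries e.g. 10 Gb/s
        }
    }

    /** Reserve a whole wavelength for one request, if any is still free. */
    public String reserve(String owner) {
        if (freeLambdas.isEmpty()) {
            return null;                      // no dedicated colour left
        }
        String l = freeLambdas.remove(freeLambdas.size() - 1);
        System.out.println(owner + " gets dedicated " + l);
        return l;
    }

    /** Return a wavelength to the shared pool when the request is done. */
    public void release(String lambda) {
        freeLambdas.add(lambda);
    }

    public static void main(String[] args) {
        LambdaAllocator fiber = new LambdaAllocator(4);
        String l = fiber.reserve("DAS-3 file transfer");
        System.out.println(fiber.freeLambdas.size() + " lambdas left for other traffic");
        fiber.release(l);
    }
}
```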
StarPlane Topology
- 4 DAS-3 sites, with 5 clusters
- Interconnected with 4 to 8 dedicated lambdas of 10 Gb/s each, over the same fiber as the regular Internet traffic
External connectivity:
- Grid 5000
- GridLab
- Media archives in Hilversum
StarPlane Project
StarPlane will use the SURFnet6 infrastructure to interconnect the DAS-3 sites. The novelty is giving flexibility directly to the applications, by allowing them to choose the logical topology in real time; ultimately, reconfiguration should take less than a second.
People and timeline: 1 postdoc, 1 PhD student (AIO), 1 scientific programmer (Jason Maassen, VU; Li Xu, UvA; JP Velders, UvA); February 2006 to February 2010.
Funding: NWO, with major contributions from SURFnet and Nortel.
Application - Network Interaction
The application sends a configuration request (e.g. star, ring, or full mesh) to the network control plane; once the control plane has configured the network, the application uses it.
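A minimal sketch of what such a configuration request might look like from the application side; the TopologyRequestDemo class and its request format are purely illustrative, not the actual StarPlane or control-plane interface.

```java
// Hypothetical sketch of the application-to-control-plane request shown above;
// names and the request format are illustrative only.
public class TopologyRequestDemo {
    enum Topology { STAR, RING, FULL_MESH }

    static String buildRequest(Topology topology, String[] sites, int gbps) {
        // Assemble a simple textual request a control plane could act on.
        StringBuilder sb = new StringBuilder();
        sb.append("topology=").append(topology)
          .append(" bandwidth=").append(gbps).append("Gb/s")
          .append(" sites=").append(String.join(",", sites));
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] sites = { "VU", "UvA", "LU", "TUD" };
        // The application asks for a full mesh between DAS-3 sites,
        // then uses the network once the control plane confirms.
        System.out.println(buildRequest(Topology.FULL_MESH, sites, 10));
    }
}
```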
Application - Network Interaction
Two models over time for App1, App2, App3: in application-initiated network configuration, each application requests its own network configuration; in workflow-initiated network configuration, a workflow manager requests the configuration for each application on its behalf.
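The sketch below illustrates the workflow-initiated variant under assumed names (NetworkPlanner is hypothetical, not a StarPlane API): because the workflow manager knows the schedule, it can already request the network configuration for the next stage while the current one runs.

```java
import java.util.List;

// Sketch of workflow-initiated network configuration: the workflow manager,
// not the applications, asks the control plane for each stage's network.
// The NetworkPlanner interface is hypothetical.
public class WorkflowManagerDemo {

    interface NetworkPlanner {
        void configureFor(String app);    // request lambdas/topology for an app
        void teardownFor(String app);     // release them when the stage is done
    }

    static void run(List<String> stages, NetworkPlanner net) {
        if (!stages.isEmpty()) {
            net.configureFor(stages.get(0));          // set up the first stage
        }
        for (int i = 0; i < stages.size(); i++) {
            String current = stages.get(i);
            if (i + 1 < stages.size()) {
                net.configureFor(stages.get(i + 1));  // overlap setup of the next stage
            }
            System.out.println("running " + current);
            net.teardownFor(current);
        }
    }

    public static void main(String[] args) {
        NetworkPlanner printer = new NetworkPlanner() {
            public void configureFor(String app) { System.out.println("configure network for " + app); }
            public void teardownFor(String app)  { System.out.println("release network of " + app); }
        };
        run(List.of("App1", "App2", "App3"), printer);
    }
}
```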
StarPlane Applications
Large stand-alone file transfers:
- User-driven file transfers
- Nightly backups
- Transfer of medical data files (MRI)
Large-file (speedier) stage-in/stage-out:
- MEG modeling (magnetoencephalography)
- Analysis of video data
Applications with static bandwidth requirements:
- Distributed game-tree search
- Remote data access for analysis of video data
- Remote visualization
Applications with dynamic bandwidth requirements:
- Remote data access for MEG modeling
- SCARI
Conclusions
- This fall, DAS-3 will be available at a university near you.
- StarPlane allows applications to configure the network; we aim for fast (subsecond) lambda switching.
- Workflow systems and/or applications need to become network-aware.
- For details: see the StarPlane poster this evening!
DAS-3 and StarPlane have Landed: Architecture, Status... and Application Research
Network Memory
The LambdaRAM software uses memory in the local cluster as a cache. This is faster than caching on disk (access time ~1 ms over the network versus ~10 ms for disk).
Demo: a (very) high-resolution remote image; the blue box marks the active (visualized) zoom region, and the green area is cached on other cluster nodes.
http://www.evl.uic.edu/cavern/optiputer/lambdaram.html
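A minimal sketch of the caching idea, with made-up names (NetworkMemoryCache, fetchFromRemote) rather than the real LambdaRAM interface: blocks of a large remote dataset are kept in RAM and refetched over the fast network only on a miss, which is still cheaper than going to local disk.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the LambdaRAM idea: keep recently used blocks of a large remote
// dataset in cluster RAM, because fetching over a fast network (~1 ms) beats
// reading from local disk (~10 ms). fetchFromRemote() stands in for whatever
// transport the real system uses.
public class NetworkMemoryCache {
    private final Map<Long, byte[]> ramCache = new HashMap<>();

    public byte[] read(long blockId) {
        byte[] block = ramCache.get(blockId);
        if (block == null) {
            block = fetchFromRemote(blockId);   // ~1 ms over the lambda
            ramCache.put(blockId, block);       // cache in RAM for reuse
        }
        return block;
    }

    private byte[] fetchFromRemote(long blockId) {
        // Placeholder: in a real system this would pull the block from a
        // remote data server or a peer node's memory.
        return new byte[64 * 1024];
    }

    public static void main(String[] args) {
        NetworkMemoryCache cache = new NetworkMemoryCache();
        cache.read(42);          // first access: fetched over the network
        cache.read(42);          // second access: served from local RAM
        System.out.println("block 42 cached after first read");
    }
}
```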