Clusterix:National IPv6 Computing Facility in Poland Artur Binczewski Radosław Krzywania Maciej Stroiński


1 Clusterix: National IPv6 Computing Facility in Poland
Artur Binczewski (artur@man.poznan.pl)
Radosław Krzywania (sfrog@man.poznan.pl)
Maciej Stroiński (stroins@man.poznan.pl)
Jan Węglarz (weglarz@man.poznan.pl)

2 Agenda
– Clusterix Project
– PIONIER Network
– Clusterix Network Architecture
– Network as a resource
– Dynamic Computing Resources

3 Clusterix Project

4 Initiated in 2003 by 12 Polish computing centers
Objectives:
– To build a productive and efficient GRID environment
– To provide enhanced security for the created GRID infrastructure
– To introduce IPv6-based communication to GRID applications
– To create a scalable computing infrastructure with dynamic resource attachment

5 Clusterix Project
– 64-bit Intel computing nodes
– Over 800 processors with a computing power of 4.4 TFLOPS
– Linux operating system (Debian distribution)
– IPv6 as the primary protocol (coexisting with IPv4)
– Communication over dedicated channels within the PIONIER network

6 PIONIER network

7 Polish Optical Internet – PIONIER
– Modern fiber-based network
– Connects 21 academic and research centres
– Over 5500 km of fiber planned (over 3500 km already in place)
– Built on DWDM infrastructure
– 10 Gbps capacity currently available

8 PIONIER network
[Figure: map of the PIONIER network]
Nodes: GDAŃSK, POZNAŃ, ZIELONA GÓRA, KATOWICE, KRAKÓW, LUBLIN, WARSZAWA, BYDGOSZCZ, TORUŃ, CZĘSTOCHOWA, BIAŁYSTOK, OLSZTYN, RZESZÓW, BIELSKO-BIAŁA, KOSZALIN, SZCZECIN, WROCŁAW, ŁÓDŹ, KIELCE, PUŁAWY, OPOLE, RADOM
External links as labelled: TELIA 2 x 2.5 Gb/s; GTS 1.2 Gb/s; BASNET 34 Mb/s; CESNET, SANET 10 Gb/s; GÉANT 10+10 Gb/s; DFN 10 Gb/s
Legend: Metropolitan Area Networks; PIONIER's fibers; 10 Gb/s (1 lambda); 2 x 10 Gb/s (2 lambdas); 1 Gb/s; CBDF 10GE

9 Clusterix Network Architecture

10 Communication with every cluster passes through a router/firewall
– Routing is based on IPv6, with IPv4 kept for backward compatibility
– Applications and the Clusterix middleware are adapted to use IPv6
– For security reasons, only outgoing connections to the Internet are permitted
– Two 1 Gbps VLANs are used to improve management of network traffic
  – The Communication VLAN is dedicated to message exchange between nodes
  – The NFS VLAN is dedicated to file transfer
[Diagram labels: PIONIER Core Switch; Clusterix Storage Element; Local Cluster Switch; Computing Nodes; Access Node; Router; Firewall; Internet; Internet Network Access; Communication & NFS VLANs; Internet Network; Backbone Traffic; 1 Gbps]
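The "IPv6 as primary protocol, IPv4 for coexistence" rule can be illustrated on the client side. The following is only a minimal sketch, not the Clusterix middleware: the helper names prefer_ipv6 and connect_dual_stack are hypothetical, and it assumes an application that resolves a host name and simply tries IPv6 addresses before IPv4 ones.

```python
import socket


def prefer_ipv6(addrinfos):
    """Order getaddrinfo()-style tuples so IPv6 entries come before IPv4 ones."""
    # sorted() is stable, so the resolver's order is kept within each family.
    return sorted(addrinfos, key=lambda info: info[0] != socket.AF_INET6)


def connect_dual_stack(host, port, timeout=5.0):
    """Try every resolved address, IPv6 first, and return the first connected socket.

    Hypothetical helper: the source does not describe how applications
    choose between the two protocols.
    """
    last_error = None
    candidates = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    for family, socktype, proto, _canonname, sockaddr in prefer_ipv6(candidates):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # Remember the failure and fall back to the next address (possibly IPv4).
            last_error = exc
            sock.close()
    raise last_error or OSError("no usable address for %s" % host)
```

Because the fallback happens per address, an IPv4-only peer stays reachable while IPv6 remains the first choice.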

11 Network as a resource

12 Network management application – Objectives and features
– Tracking and monitoring network status
– Performing measurements
– Locating failures
– Providing network statistics for GRID services
– Layer 3 QoS management
– Automatic measurement session configuration
– Failure resistance

13 Network as a resource – Measurements
[Diagram labels: Network Manager; Measurement Reports; Computing Cluster; Local Cluster Measurements; PIONIER Backbone Measurements; SNMP Monitoring]
Measurement architecture
– Distributed two-level mesh of measurement agents (backbone/cluster)
– Centralized control manager (multiple redundant instances)
– Switches are monitored via SNMP
– Reports are stored by the manager (and forwarded to a database)
– The IPv6 protocol and addressing scheme are used for measurements
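One way to picture the two-level agent mesh is as a list of measurement sessions between agent pairs. The sketch below is an illustration under stated assumptions, not the project's session planner: plan_sessions is a hypothetical name, the first agent listed for each cluster is assumed to be its backbone-facing agent, and "mesh" is read as "every pair at the same level". The addresses come from the IPv6 documentation prefix 2001:db8::/32.

```python
from itertools import combinations


def plan_sessions(clusters):
    """Return (level, source, destination) tuples for a two-level agent mesh.

    clusters maps a cluster name to a list of agent IPv6 addresses.
    Assumption: the first address in each list is the backbone agent.
    """
    sessions = []
    ordered = sorted(clusters.items())

    # Backbone level: one session per pair of clusters, between backbone agents.
    backbone_agents = [agents[0] for _name, agents in ordered if agents]
    for src, dst in combinations(backbone_agents, 2):
        sessions.append(("backbone", src, dst))

    # Cluster level: one session per pair of agents inside the same cluster.
    for name, agents in ordered:
        for src, dst in combinations(agents, 2):
            sessions.append((name, src, dst))
    return sessions


if __name__ == "__main__":
    example = {
        "poznan": ["2001:db8:1::1", "2001:db8:1::2"],
        "gdansk": ["2001:db8:2::1"],
    }
    for session in plan_sessions(example):
        print(session)
```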

14 Network as a resource – Architecture
[Diagram labels: Database; External Clients; GUI; Backup Manager; Controller; External Interfaces; Redundancy Controller; System Logic; Measurement Agents Manager; Device Manager; Devices; Backbone measurements; Local Cluster measurements; System Manager; External Entities; System Resources]
Manager architecture
– Statistics are stored in an external database (a short-term backup is kept in the manager)
– The GUI shows network status and configures the manager
– Backup managers improve failure recovery (active manager switching)
– External applications can retrieve various network statistics
– Device and agent management modules collect network data

15 Network as a resource – Protocol
Active Measurement Protocol
– All agent types use the same communication protocol
– The first implementation was based on OWAMP
– One-way measurements were abandoned in favour of a round-trip measurement approach
– Further modifications were made because of variable message length and extra requirements
– The protocol supports both IPv6 and IPv4
– The measurement traffic pattern can be specified for more detailed network examination
– Network metrics:
  – RTT
  – Duplicated packets
  – Jitter
  – Packets out of order
  – Packet loss
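The listed metrics can all be derived from the replies to one round-trip probe session. The following is a minimal sketch, not the Clusterix protocol: the Reply record and summarize function are hypothetical, replies are assumed to be listed in arrival order, and jitter is assumed to mean the average absolute difference between consecutive RTT samples (the source does not define it).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reply:
    seq: int       # sequence number echoed back by the responder
    rtt_ms: float  # measured round-trip time in milliseconds


def summarize(sent, replies):
    """Compute loss, duplicates, reordering, mean RTT and jitter for one session.

    sent is the number of probes transmitted; replies is in arrival order.
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    seen = set()
    duplicates = 0
    out_of_order = 0
    highest_seq = -1
    rtts = []
    for reply in replies:
        if reply.seq in seen:
            duplicates += 1          # same probe answered more than once
            continue
        seen.add(reply.seq)
        if reply.seq < highest_seq:
            out_of_order += 1        # arrived after a later-numbered reply
        else:
            highest_seq = reply.seq
        rtts.append(reply.rtt_ms)
    # Assumed jitter definition: mean absolute difference of consecutive RTTs.
    diffs = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    return {
        "loss": (sent - len(seen)) / sent,
        "duplicates": duplicates,
        "out_of_order": out_of_order,
        "mean_rtt_ms": mean(rtts) if rtts else None,
        "jitter_ms": mean(diffs) if diffs else 0.0,
    }
```

A caller would run one such summary per measurement session and forward the resulting dictionary to the manager as a report.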

16 Network as a resource – Monitoring
Monitoring
– Core switches are monitored via SNMP to track:
  – Interface status
  – Maximum available capacity
  – Current link utilization
– SNMP views are used to improve device security
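Current link utilization is typically derived from two readings of an interface octet counter. The sketch below shows only the arithmetic and makes several assumptions: link_utilization is a hypothetical helper, the counter is assumed to be a 64-bit IF-MIB counter such as ifHCInOctets, and the source does not say which objects or polling interval the system actually used.

```python
COUNTER64_MODULUS = 2 ** 64  # wrap-around point of a 64-bit SNMP counter


def link_utilization(octets_before, octets_after, interval_s, capacity_bps,
                     modulus=COUNTER64_MODULUS):
    """Return the fraction of link capacity used between two counter readings.

    octets_before / octets_after: successive readings of an octet counter.
    interval_s: seconds between the two readings.
    capacity_bps: link capacity in bits per second (for example from ifHighSpeed).
    """
    if interval_s <= 0 or capacity_bps <= 0:
        raise ValueError("interval and capacity must be positive")
    # The modulo handles a single counter wrap between the two readings.
    delta_octets = (octets_after - octets_before) % modulus
    return (delta_octets * 8) / (interval_s * capacity_bps)


if __name__ == "__main__":
    # 37.5 GB transferred in 5 minutes on a 10 Gbps link.
    print(link_utilization(0, 37_500_000_000, 300, 10_000_000_000))
```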

17 Network as a resource – Fail Safe
[Diagram labels: Manager; Backup Manager; Synchronization Data; Measurement Network]
Regular operation
– Only one active manager is allowed (the selection algorithm is based on the Bully algorithm)
– Required data are exchanged between the active and backup managers
– Measurement agents register with the active manager only
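The outcome of a Bully-style selection can be sketched in a few lines. This is a simplified, single-process simulation and not the project's implementation: bully_election is a hypothetical name, integer manager IDs are assumed to act as priorities, and message passing and timeouts are replaced by a set of managers known to be alive.

```python
def bully_election(initiator, alive):
    """Return the ID of the manager that becomes active.

    initiator: ID of the manager that noticed the failure and starts the election.
    alive: set of IDs of managers that currently respond.
    Assumption: a higher ID means a higher priority.
    """
    if initiator not in alive:
        raise ValueError("the initiator must itself be alive")
    candidate = initiator
    while True:
        # The candidate challenges every manager with a higher ID.
        higher = [manager for manager in alive if manager > candidate]
        if not higher:
            # Nobody with a higher ID answered: the candidate wins.
            return candidate
        # A live higher-ID manager answers and takes over the election.
        candidate = min(higher)


if __name__ == "__main__":
    print(bully_election(1, {1, 2, 4}))  # manager 4 becomes active
    print(bully_election(1, {1, 2}))     # after 4 fails, manager 2 takes over
```

Whoever starts the election, the live manager with the highest ID ends up active, which gives the "only one active manager" property.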

18 Network as a resource – Fail Safe
[Diagram labels: Manager Failure; New Manager]
Failure event
– In case of failure, a new active manager is selected
– Agents do not register until a new active manager is elected
– Measurements are still performed, and results are temporarily stored on the agent side
– The newly elected manager recovers the system state and accepts agent registrations
– The system is ready to serve information
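The agent-side behaviour during a manager failure amounts to buffering results until registration succeeds again. A minimal sketch under stated assumptions: the Manager and MeasurementAgent classes and their method names are hypothetical, and the real registration and transfer messages are not described in the source.

```python
class Manager:
    """Stand-in for an active manager that stores measurement reports."""

    def __init__(self, name):
        self.name = name
        self.reports = []

    def store(self, result):
        self.reports.append(result)


class MeasurementAgent:
    """Agent that keeps measuring and buffers results while no manager is active."""

    def __init__(self):
        self.active_manager = None
        self.buffer = []

    def on_manager_failure(self):
        # The active manager is gone: stay unregistered until an election finishes.
        self.active_manager = None

    def register(self, manager):
        # A new active manager was elected: register and hand over buffered results.
        self.active_manager = manager
        while self.buffer:
            manager.store(self.buffer.pop(0))

    def record(self, result):
        if self.active_manager is None:
            self.buffer.append(result)   # temporary storage on the agent side
        else:
            self.active_manager.store(result)
```

Results recorded during the outage reach the new manager in their original order once the agent registers.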

19 Network as a resource – GUI
GUI
– Provides a view of network status
– Displays statistics
– Simplifies network troubleshooting
– Allows measurement sessions to be configured
– Useful for topology browsing

20 Dynamic Computing Resources

21 Dynamic Computing Resources – Motivation
External clusters can be easily attached to the Clusterix infrastructure in order to:
– Increase computing power with new clusters
– Utilize external clusters during nights or other idle periods
– Make the Clusterix infrastructure scalable

22 Dynamic Computing Resources – Architecture
Dynamic cluster attachment:
– Requirements must be checked for each new cluster:
  – Installed software
  – SSL certificates
– Communication passes through the router/firewall
– The Network Management System automatically discovers new resources
– A new cluster can then provide computing power on a regular basis
[Diagram labels: PIONIER Backbone Switch; Local Switch; Router; Firewall; Regular Cluster; Dynamic Resources; Internet]
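The requirement check for a new cluster can be pictured as a small admission test. This is an illustrative sketch only: admission_problems is a hypothetical function, the package names are placeholders, and checking certificate expiry is an assumption, since the source only says that installed software and SSL certificates are verified.

```python
from datetime import datetime, timezone


def admission_problems(installed, required, cert_not_after, now=None):
    """Return a list of reasons why a cluster cannot be attached (empty if none).

    installed / required: iterables of software package names (placeholders here).
    cert_not_after: timezone-aware expiry time of the cluster's SSL certificate.
    """
    now = now or datetime.now(timezone.utc)
    missing = sorted(set(required) - set(installed))
    problems = ["missing software: %s" % name for name in missing]
    if cert_not_after <= now:
        problems.append("SSL certificate expired")
    return problems


if __name__ == "__main__":
    check_time = datetime(2005, 6, 1, tzinfo=timezone.utc)
    print(admission_problems(
        installed={"example-middleware"},
        required={"example-middleware", "example-agent"},
        cert_not_after=datetime(2005, 1, 1, tzinfo=timezone.utc),
        now=check_time,
    ))
```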

23 Summary
– Fast interconnection of computing centers through PIONIER
– IPv6 is introduced to the GRID environment
– Failure-resistant network monitoring system
– The network is used as a regular GRID resource
– The dynamic architecture allows easy computing power upgrades

24 Thank you for your attention!
Visit http://www.clusterix.pcz.pl

