Published by Jemima Randall. Modified over 9 years ago.
1
Production Grid Challenges in Hungary. Péter Stefán, Ferenc Szalai, Gábor Vitéz (NIIF/HUNGARNET)
2
Agenda: brief introduction; grid initiatives - ClusterGrid; challenges in a production environment; generic ClusterGrid operation model; management issues; user support; monitoring; ClusterGrid future challenges; conclusions.
3
Brief NIIF introduction. Hungarian NREN: 10G backbone, ~600,000 users, ~750 institutions. Infrastructure layers: computing, networking, middleware, collaborative. Services shown: grid, supercomputing; IP, IPv6, MPLS, lambda, etc.; videoconference, central HA cluster; VPNs, VoIP, directory service.
4
Supercomputers: 2 SUN E15Ks and 2 SUN 10Ks located at two universities, with 276 CPUs and 300 GB of memory in total. Used to be in the top 500. In production since 2001. Serves more than 200 users and 100 scientific projects.
5
Hungarian grid initiatives, MGKK: Hungarian grid initiatives can be classified into grid infrastructure projects and grid system development projects. The key players have formed a grid collaboration, the Hungarian Grid Competence Center (MGKK), involving BUTE, ELUB, MTA-SZTAKI, NIIF/HUNGARNET, KFKI and the University of Veszprém. Intensive participation in many national and European grid initiatives: EGEE, NorduGrid, SEE-GRID, etc.
6
ClusterGrid initiative: a pool of 1400 PC nodes throughout the country, involving more than 26 clusters. Production infrastructure since July 2002. Supercomputer clusters are planned to be involved too. A rough estimate of the total compute capacity is about 600 Gflops. Even though it is much smaller than regional or continental grids, its complexity is in the same range.
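As a rough sanity check on these figures, the average capacity per node and the average cluster size can be estimated from the numbers quoted on the slide. The sketch below is only back-of-envelope arithmetic: the names are illustrative, the even spread of capacity across nodes is an assumption, and since the slide says "more than 26 clusters", the nodes-per-cluster value is an upper estimate.

```python
# Figures quoted on the slide.
TOTAL_GFLOPS = 600.0   # rough total compute capacity
NODE_COUNT = 1400      # PC nodes in the pool
CLUSTER_COUNT = 26     # "more than 26" on the slide, so a lower bound


def average_per_node(total_gflops: float, nodes: int) -> float:
    """Average capacity per node, assuming an even spread (an assumption)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_gflops / nodes


per_node = average_per_node(TOTAL_GFLOPS, NODE_COUNT)
nodes_per_cluster = NODE_COUNT / CLUSTER_COUNT

print(f"~{per_node:.2f} Gflops per node, ~{nodes_per_cluster:.0f} nodes per cluster")
```

With the slide's numbers this gives roughly 0.43 Gflops per node and at most about 54 nodes per cluster on average.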
7
Challenges in a production environment: grid definition - set clear objectives for what to build; simplicity - keep the system transparent and usable; completeness - cover more than the application level; security - use computer networking methods (MPLS, VLAN technologies); compatibility - with other grids (X.509, LDAP); manageability - easy to maintain; robustness - fault-tolerant behavior; usability - cover many job classes, provide user support; platform independence - to be able to execute on MS
8
ClusterGrid architecture
9
Some new ideas: MPLS- and VLAN-connected resources; web-transaction-based resource broker; dynamic, separated run-time environment.
10
Generic production service model
11
Challenges in production (cont'd). Management: physical compute resources (supercomputers, clusters), virtual resources (virtual clusters), storage nodes, users, services. User support. Grid architecture monitoring.
12
Compute-cluster management
13
Virtual compute-cluster management
15
Storage management: low-level management of disks, volumes and file systems (cost-efficient storage using ATA over Ethernet, AoE); medium-level access management (GridFTP, FTPS); high-level data brokering (extended SRM model).
16
User management: users' personal data is kept in an LDAP-based directory service, separately from authentication data. Registration is aided by a web interface. Authentication: X.509 certificates, LDAP-based authentication. No authorization yet.
17
Service management (experimental): a relatively new direction. It is a special service, based on well-established authorization. It helps to start, stop and (re)configure grid services.
18
User support: the grid service provider gives user support covering consultation about the benefits of grid usage, code porting and optimization, partial aid in code implementation, job formation and execution, and generic grid usage. Not yet covered: model creation, formal description, algorithm creation.
19
ClusterGrid monitoring: grid cluster resources fluctuate between day-shift and night-shift operation. Blue line: total; green area: occupied. Two-layer hierarchical monitoring system.
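One way to picture a two-layer hierarchical monitor is that each cluster reports its own totals (lower layer) and a grid-level collector sums them (upper layer), producing the "total" and "occupied" series plotted on the slide. The sketch below is a hypothetical illustration, not the ClusterGrid implementation: the class names, fields and sample numbers are all invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ClusterSample:
    """One report from a cluster-level monitor (lower layer); hypothetical schema."""
    cluster: str
    total_nodes: int      # nodes currently offered to the grid
    occupied_nodes: int   # nodes currently running jobs


@dataclass(frozen=True)
class GridSample:
    """Grid-wide view (upper layer): the 'total' line and the 'occupied' area."""
    total_nodes: int
    occupied_nodes: int

    @property
    def utilization(self) -> float:
        # Avoid dividing by zero when no nodes are offered at all.
        if self.total_nodes == 0:
            return 0.0
        return self.occupied_nodes / self.total_nodes


def aggregate(samples: Iterable[ClusterSample]) -> GridSample:
    """Sum per-cluster reports into one grid-level sample."""
    total = 0
    occupied = 0
    for s in samples:
        if not 0 <= s.occupied_nodes <= s.total_nodes:
            raise ValueError(f"inconsistent report from {s.cluster}")
        total += s.total_nodes
        occupied += s.occupied_nodes
    return GridSample(total, occupied)


# Invented numbers: more nodes are offered at night than during the day.
night = aggregate([ClusterSample("cluster-a", 60, 45),
                   ClusterSample("cluster-b", 40, 30)])
day = aggregate([ClusterSample("cluster-a", 10, 5),
                 ClusterSample("cluster-b", 0, 0)])

print(night.total_nodes, night.occupied_nodes, night.utilization)
print(day.total_nodes, day.occupied_nodes, day.utilization)
```

Keeping the per-cluster reports separate from the aggregate lets the upper layer stay simple: it only needs to sum and validate, while cluster-specific probing stays in the lower layer.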
20
ClusterGrid monitoring
22
Future ClusterGrid (?) challenges: continuously growing demand for reliable compute and data storage infrastructure. Grid systems should conform to international standards and MUST interoperate with one another. Platform independence is not an issue yet, but will be. LEGO-based principles are of increasing importance. Threats: solutions that prevent development; erosion of the belief in the power of “grid”.
23
Conclusions: one of the first production-level grids has been shown in a nutshell, with special emphasis on operation, management and user support issues. Management generally covers grid resource management, grid user management and monitoring. Some remarks regarding future development were also made.
24
Thanks for your attention! www.clustergrid.hu www.mgkk.hu grid-tech@niif.hu